Principal Component Analysis decomposes face images into a small set of characteristic feature images called eigenfaces, which are the principal components of the initial training set of face images. Recognition is done by projecting a new image into the subspace spanned by the eigenfaces and then classifying the face by comparing its position in face space with the positions of the known individuals.
Poor within-class separation power and high estimation cost are well-known problems of the PCA method. These limitations are overcome by Linear Discriminant Analysis. LDA is among the most prominent algorithms for feature selection in appearance-based methods, but many LDA-based systems first use PCA to reduce dimensionality and then apply LDA to maximize the discriminating power of the selected features.
Under different lighting variations, PCA performs better than LDA and ICA, and LDA performs better than ICA. Under partial occlusion, LDA is more sensitive than PCA and ICA, while PCA is the least sensitive of the three. PCA is also used as a dimension-reduction technique and for modeling expression deformations.
A recursive algorithm has been introduced for calculating the discriminant features of the PCA-LDA procedure. This method focuses on the challenging problem of computing the discriminant vectors from an incrementally arriving high-dimensional data stream without computing the corresponding covariance matrix.
This incremental PCA-LDA algorithm is very efficient in memory usage and in the computation of the first basis vectors, and it achieves a reasonable face recognition success rate compared with other algorithms such as PCA and LDA.
A modified PCA algorithm for face recognition has also been proposed. It is based on reducing the effect of the eigenvectors associated with large eigenvalues by normalizing each feature-vector element by its corresponding standard deviation. The reported results show that the proposed method performs better than conventional PCA and LDA, while its computational cost remains the same as that of PCA and much lower than that of LDA.
Another face recognition method has been introduced based on PCA, LDA, and a neural network.
This method consists of the following steps:
- Dimension reduction using PCA
- Feature extraction using LDA
- Classification using a neural network
The combination of PCA and LDA improves the capability of LDA when only a few sample images per class are available, and the neural classifier reduces the number of misclassifications caused by non-linearly separable classes. The method was tested on the Yale face database; the experimental results verified its efficiency, with fewer misclassifications than previous methods.
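The PCA-then-LDA pipeline can be sketched with plain NumPy on synthetic data (a stand-in for real face vectors; the final neural-network classifier is replaced here by a simple nearest-class-mean rule to keep the sketch short):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for face vectors: two classes in 50 dimensions.
X = np.vstack([rng.normal(0, 1, (40, 50)) + 2.0,   # class 0
               rng.normal(0, 1, (40, 50)) - 2.0])  # class 1
y = np.array([0] * 40 + [1] * 40)

# Step 1 - dimension reduction with PCA (keep k components).
k = 10
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
P = (X - mean) @ Vt[:k].T

# Step 2 - feature extraction with LDA (two classes -> one direction).
m0, m1 = P[y == 0].mean(axis=0), P[y == 1].mean(axis=0)
Sw = ((P[y == 0] - m0).T @ (P[y == 0] - m0)
      + (P[y == 1] - m1).T @ (P[y == 1] - m1))   # within-class scatter
w = np.linalg.solve(Sw, m1 - m0)                 # Fisher direction
F = P @ w                                        # 1-D discriminative feature

# Step 3 - classification (nearest class mean stands in for the network).
c0, c1 = F[y == 0].mean(), F[y == 1].mean()
pred = np.where(np.abs(F - c0) < np.abs(F - c1), 0, 1)
accuracy = (pred == y).mean()
```

PCA keeps the scatter matrix small and invertible before LDA is applied, which is exactly why many systems chain the two.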
|Advantages|Disadvantages|
|---|---|
|Recognition is simple and effective compared to other matching approaches.|PCA relies on linearity assumptions.|
|PCA completely decorrelates the data in the transform domain.|PCA relies on orthogonal transformations of the original variables.|
|Data compression is achieved by the low-dimensional subspace representation.|PCA is based only on the mean vector and the covariance matrix; some classes are well characterized by these statistics, but not all.|
|Raw data are used directly for learning and recognition, without any low- or mid-level processing.|PCA is very sensitive to scale, so low-level pre-processing is still necessary for scale normalization.|
|No knowledge of the geometry or reflectance of faces is required.|Considerable computational effort is needed to compute the eigenvalues and eigenvectors of the covariance matrix.|
|It reduces the total entropy of the data.|Its recognition rate decreases under varying pose and illumination.|
This algorithm extracts the relevant information in a face image and encodes it efficiently. For this purpose, a collection of images of each person is gathered in order to capture the variations. Each image in the set contributes to an eigenvector; these vectors describe the variations between the images. Displayed as images, these eigenvectors are called eigenfaces. Every face can be represented as a linear combination of the eigenfaces, and the set can be reduced to the eigenfaces with the largest associated eigenvalues, which makes the representation more efficient.
The basic idea of the algorithm is as follows:
1. Acquire a database of face images, calculate the eigenfaces, and use them to define the face space.
2. When a new image is encountered, calculate its set of weights.
3. Check whether the image is a face, i.e. whether it lies close enough to the face space.
4. Finally, determine whether the image matches a known face in the database.
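A minimal NumPy sketch of these four steps, with random vectors standing in for real face images (the database `images`, the probe `new_image`, and the face-space threshold are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

# Step 1: database of M face images, flattened to vectors (here random).
M, D = 8, 100
images = rng.normal(size=(M, D))
mean_face = images.mean(axis=0)
A = images - mean_face
# Eigenfaces = principal components of the training set (via SVD).
_, _, Vt = np.linalg.svd(A, full_matrices=False)
eigenfaces = Vt[:6]                      # keep the 6 leading components
weights_db = A @ eigenfaces.T            # known faces in face space

# Step 2: project a new image to get its weight vector.
new_image = images[3] + 0.01 * rng.normal(size=D)   # noisy copy of face 3
w_new = (new_image - mean_face) @ eigenfaces.T

# Step 3: distance to face space tells us whether it is a face at all.
reconstruction = mean_face + w_new @ eigenfaces
dist_to_space = np.linalg.norm(new_image - reconstruction)
is_face = dist_to_space < 5.0            # illustrative threshold

# Step 4: nearest stored weight vector identifies the individual.
match = int(np.argmin(np.linalg.norm(weights_db - w_new, axis=1)))
```

The noisy probe projects back to the weights of face 3, so the nearest-neighbor step recovers the right identity.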
|Advantages|Disadvantages|
|---|---|
|Raw data are used directly for learning and recognition without any processing.|The eigenface recognition rate decreases under varying pose and lighting.|
|No knowledge of the reflectance or geometry of faces is required.|The method is very sensitive to scale, so low-level pre-processing is vital for scale normalization.|
|Recognition is simple and effective.|Face images tested in the experiments were taken against a uniform background, a condition that does not hold in natural scenes.|
LDA is used to classify unknown classes based on training samples of known classes. The algorithm seeks a projection that maximizes the between-class variance while minimizing the within-class variance. In this technique each block represents a class: the variance between classes is large, while the variance within each class is small.
It searches for the vectors that best discriminate between classes rather than those that best describe the data. Given a set of independent features describing the data, the algorithm creates a linear combination of these features that yields the largest mean difference between the desired classes. It also uses the mean of all the classes. The basic goal is to maximize the between-class measure while minimizing the within-class measure.
LDA is closely related to PCA and factor analysis, since all three find linear combinations of variables that explain the data. LDA models the differences between classes of data; factor analysis builds features based on differences; PCA does not take any class difference into account.
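The between- and within-class scatter measures described above can be computed directly. A minimal NumPy sketch on two synthetic 2-D classes (the data are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

# Two classes with different means and small within-class spread.
A = rng.normal(0, 0.5, (30, 2)) + np.array([0.0, 0.0])
B = rng.normal(0, 0.5, (30, 2)) + np.array([3.0, 1.0])
classes = [A, B]
mu = np.vstack(classes).mean(axis=0)           # overall mean

Sw = np.zeros((2, 2))                          # within-class scatter
Sb = np.zeros((2, 2))                          # between-class scatter
for C in classes:
    m = C.mean(axis=0)
    Sw += (C - m).T @ (C - m)
    d = (m - mu).reshape(-1, 1)
    Sb += len(C) * (d @ d.T)

# Fisher direction: maximizes between-class over within-class variance.
w = np.linalg.solve(Sw, classes[1].mean(axis=0) - classes[0].mean(axis=0))
sep = (w @ Sb @ w) / (w @ Sw @ w)              # ratio achieved along w
```

Projecting the samples onto `w` gives a 1-D feature along which the two class means are far apart relative to the spread within each class.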
|Advantages|Disadvantages|
|---|---|
|LDA maximizes the ratio of between-class scatter to within-class scatter, which helps with the illumination problem.|LDA fails when all the scatter matrices are singular.|
|LDA optimizes the low-dimensional representation by focusing on discriminating features, whereas PCA simply achieves object reconstruction.|LDA faces the small-sample-size problem: a large number of pixels is available, but the number of training samples is smaller than the dimension of the feature space.|
|LDA gives better accuracy for facial expressions.|It can only classify a face which is "known" to the database.|
The Viola-Jones framework was proposed by Paul Viola and Michael Jones in 2001 for real-time object detection, and it was the first framework to detect objects in real time. It can be trained to detect a variety of objects; for example, it can detect faces in real time, and it is included in OpenCV.
It is among the best systems for real-time face detection. The system combines the basic ingredients of fast and accurate detection: the integral image is used for feature computation, AdaBoost is used for feature selection, and an attentional cascade is used for the efficient allocation of computational resources. The main purpose of a face detector is to tell whether an arbitrary image contains a human face and, if so, to locate it. In the natural binary-classification framework, the classifier minimizes the risk of misclassification; since there is no prior object distribution that tells whether an image contains a face, the detector has to minimize both the false-negative and the false-positive rates.
The task requires a description of the object region in which a face may be contained, and that description must be accurate and distinguish the face from other parts of the human body. If the face region can be separated from the rest of the body, it can be recognized. AdaBoost is used to select the features: it combines weak classifiers into a strong one through a weighted voting mechanism.
The integral image and the cascade make this algorithm efficient enough for real-time detection. It can be used with a webcam on a desktop computer and can detect faces in live webcam images.
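The integral image mentioned above can be computed with cumulative sums; the sum of any rectangle then needs only four lookups, which is what makes Haar-feature evaluation so fast. A minimal NumPy sketch (zero-padded on the top and left so the corner arithmetic stays simple):

```python
import numpy as np

def integral_image(img):
    """Cumulative row and column sums, padded with a zero row/column."""
    ii = img.cumsum(axis=0).cumsum(axis=1)
    return np.pad(ii, ((1, 0), (1, 0)))

def rect_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1, c0:c1] using four lookups in the integral image."""
    return ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]

img = np.arange(16, dtype=float).reshape(4, 4)
ii = integral_image(img)
total = rect_sum(ii, 0, 0, 4, 4)      # whole image: 0 + 1 + ... + 15
patch = rect_sum(ii, 1, 1, 3, 3)      # img[1:3, 1:3]: 5 + 6 + 9 + 10
```

A Haar-like feature is just the difference of two or three such rectangle sums, so each feature costs a handful of additions regardless of its size.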
Like any technology, the Viola-Jones algorithm has advantages as well as limitations:
|Advantages|Disadvantages|
|---|---|
|It selects features efficiently.|It works brilliantly on frontal faces but not on side poses.|
|The detector is invariant to scale and location.|It can hardly be used at a 45-degree angle.|
|It scales the features rather than the image itself.|Lighting affects the algorithm.|
|The system can be trained on other objects, such as hands.|Multiple detections of the same face are a problem.|
|It uses cascades to detect edges such as the nose and the ears.|It does not behave well when a group of people have their faces close together, as edges tend to overlap in a crowd.|
The Kanade-Lucas-Tomasi (KLT) feature tracker is used in computer vision for feature extraction. It was originally proposed for image registration, which is generally very costly with traditional techniques; KLT makes it cheaper. KLT uses spatial intensity information to direct the search toward the position that gives the best match, and it produces more accurate results than traditional techniques.
The KLT is used for feature tracking. It was developed by Lucas and Kanade, and Tomasi and Kanade later extended it. The algorithm mainly detects scattered features whose texture gives them something to track.
Here, the Kanade-Lucas-Tomasi algorithm is used to track a human face continuously across video frames. The face is tracked by finding the parameters that minimize the dissimilarity between the feature points under a translational model. The algorithm first calculates the displacement of the tracked points from the first video frame to the next, and once the displacements are known it is easy to follow the head and its movement through the video. An optical-flow tracker of this kind is simple because the algorithm only has to detect the face in the first frame and then re-locate it in the succeeding frames.
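The displacement calculation at the heart of KLT can be sketched in a few lines: for a small shift, solving the 2×2 linear system built from image gradients recovers the translation between two frames. This is one Lucas-Kanade step over a single window under a pure-translation assumption (no pyramid, no iteration); the synthetic frames are illustrative:

```python
import numpy as np

def lk_step(frame0, frame1):
    """One Lucas-Kanade step: least-squares translation (dx, dy)."""
    # Spatial gradients and temporal difference.
    Iy, Ix = np.gradient(frame0)
    It = frame1 - frame0
    # Normal equations of  Ix*dx + Iy*dy = -It  over the whole window.
    A = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    b = -np.array([np.sum(Ix * It), np.sum(Iy * It)])
    return np.linalg.solve(A, b)          # (dx, dy) in pixels

# Smooth synthetic frame and a copy shifted by one pixel along x.
x = np.linspace(0, 2 * np.pi, 64)
frame0 = np.sin(x)[None, :] + np.cos(x)[:, None]
frame1 = np.roll(frame0, 1, axis=1)       # shift right by 1 pixel
dx, dy = lk_step(frame0, frame1)
```

The recovered `(dx, dy)` is close to `(1, 0)`; for larger motions a pyramidal, iterated version of this step is used, as the table below notes.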
|Advantages|Disadvantages|
|---|---|
|KLT finds good points to track from frame to frame.|The brightness-constancy assumption does not always hold.|
|KLT uses the intensity second-moment matrix and the difference across frames to find the displacement.|KLT gives errors when the motion is large; key points must be matched to fix this.|
|KLT iterates and uses coarse-to-fine search to deal with larger movements.|The basic algorithm can handle only small pixel displacements; a pyramidal implementation is used to overcome this.|
|KLT works better on textured pixels.|Window-size issues: a small window is more sensitive to noise and may miss larger motions, while a large window may cross an occlusion boundary.|
|KLT is much quicker than other methods because it checks far fewer potential matches between images.|Small errors accumulate when the appearance model is updated.|
This algorithm defines a technique based on line edge maps (LEM) to perform face recognition, together with a line-matching technique that makes the task possible. In contrast with other algorithms, LEM uses physiological features of the human face, mainly the mouth, nose, and eyes.
To check the similarity of human faces, the face images are first converted into gray-level pictures and then encoded into binary edge maps using the Sobel edge-detection algorithm. This approach is very similar to the way human beings distinguish other people's faces, as stated in many psychological studies. The main advantage of line edge maps is their low sensitivity to illumination changes, because the LEM is an intermediate-level image representation derived from the low-level edge-map representation. A further important advantage is the low memory requirement due to the compact data representation.
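The Sobel step can be sketched directly: convolve with the two 3×3 Sobel kernels, take the gradient magnitude, and threshold it into a binary edge map (the threshold and the tiny test image are illustrative assumptions):

```python
import numpy as np

KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
KY = KX.T

def conv3(img, k):
    """Valid 3x3 cross-correlation without SciPy."""
    H, W = img.shape
    out = np.zeros((H - 2, W - 2))
    for i in range(3):
        for j in range(3):
            out += k[i, j] * img[i:i + H - 2, j:j + W - 2]
    return out

def sobel_edge_map(img, thresh):
    gx, gy = conv3(img, KX), conv3(img, KY)
    return (np.hypot(gx, gy) > thresh).astype(np.uint8)

# Gray-level test image: dark left half, bright right half.
img = np.zeros((8, 8))
img[:, 4:] = 255.0
edges = sobel_edge_map(img, thresh=100.0)   # 1s along the vertical boundary
```

In the full LEM pipeline the edge pixels are then thinned and grouped into line segments before matching.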
|Advantages|Disadvantages|
|---|---|
|It is less time-consuming, occupies less memory, is less sensitive to illumination changes, and is faster.|It can mix up lines and fail to detect similarities.|
|It is more robust to varying facial expression and pose.|It must be combined with a Hausdorff distance measure.|
The Gabor wavelet is used for face recognition, and Gabor features are among the best ways to recognize a face. Researchers mostly use the Gabor wavelet because its kernel is similar to the 2-D receptive-field profiles of cortical simple cells, and in the past Gabor wavelets have given impressive face recognition results.
Typically there are four methods: dynamic link architecture, the Gabor-Fisher classifier, elastic bunch graph matching, and the AdaBoost Gabor-Fisher classifier. Gabor features are also used for gait recognition and for gender recognition; the gender of a face image can be recognized using Gabor features. Gabor phases are sensitive to local variations, but they can be used to discriminate between patterns with similar magnitudes and provide more information about the local image; the magnitudes, however, are more robust, so the sensitivity of the phases to misalignment and local variation has to be compensated for.
Face recognition is one of the most important applications of the Gabor wavelet. The face image is convolved with a set of Gabor filters, and the resulting images are further processed for recognition. "Gabor filter" is simply another name for the Gabor wavelet in this application domain.
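A Gabor kernel is a complex sinusoid under a Gaussian envelope; a filter bank samples several orientations (and, in practice, several scales). A minimal NumPy construction, with illustrative parameter values:

```python
import numpy as np

def gabor_kernel(size, sigma, theta, lam):
    """Complex Gabor kernel: Gaussian envelope times a plane wave."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate coordinates to orientation theta.
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2))
    carrier = np.exp(1j * 2 * np.pi * xr / lam)   # wavelength lam
    return envelope * carrier

k = gabor_kernel(size=15, sigma=3.0, theta=0.0, lam=6.0)
# A filter bank is typically a grid of orientations (and wavelengths).
bank = [gabor_kernel(15, 3.0, t, 6.0)
        for t in np.linspace(0, np.pi, 4, endpoint=False)]
```

Convolving a face image with each kernel in the bank and concatenating the magnitude responses yields the (very high-dimensional) Gabor feature vector mentioned in the table below.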
|Advantages|Disadvantages|
|---|---|
|Gabor features are also used for gait recognition and gender recognition.|The Gabor wavelet is a general image-processing tool; it is not specifically designed for face recognition.|
|Gabor phases are sensitive to local variations and can discriminate between patterns with similar magnitudes, i.e. they provide more detailed information about local image features.|Gabor features do not contain face-specific information learned from face training data.|
|The Gabor wavelet transform has both multi-resolution and multi-orientation properties and is optimal for measuring local spatial frequencies.|Gabor filters are not optimal when the objective is broad spectral information with maximum spatial localization.|
|The Gabor wavelet has many applications, such as facial expression classification, Gabor networks for face reconstruction, fingerprint recognition, facial landmark location, and iris recognition.|The feature vector obtained from a Gabor filter bank is very high-dimensional, so Gabor feature extraction takes a long time.|
An Artificial Neural Network (ANN) is a combination of simple but highly interconnected elements that process information through their response to external inputs. An ANN is a machine designed like the human brain; its model mimics the brain. In this technique the network first learns to acquire knowledge and then stores that knowledge for further use.
Because the ANN is based on a model of the human brain, it is designed to solve problems the way the human brain does: it first acquires knowledge and then stores it, as our brain does. The ANN needs information about the system under study and about the training method used. It has the ability to discard unnecessary data during training, but a large training set is needed to get good results.
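The learn-then-store cycle described above can be sketched as a tiny two-layer network trained by gradient descent on XOR, a classic non-linearly-separable toy problem (the architecture, learning rate, and iteration count are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)

# XOR: a small problem that is not linearly separable.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0.0], [1.0], [1.0], [0.0]])

W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)   # hidden layer
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)   # output layer
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
forward = lambda X: sigmoid(np.tanh(X @ W1 + b1) @ W2 + b2)

mse_init = float(((forward(X) - y) ** 2).mean())

for _ in range(5000):                              # learning phase
    h = np.tanh(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    d_out = (out - y) * out * (1 - out)            # backprop squared error
    d_h = (d_out @ W2.T) * (1 - h ** 2)
    W2 -= 0.5 * h.T @ d_out; b2 -= 0.5 * d_out.sum(axis=0)
    W1 -= 0.5 * X.T @ d_h;   b1 -= 0.5 * d_h.sum(axis=0)

# The "stored knowledge" is the trained weights; recall is a forward pass.
mse_final = float(((forward(X) - y) ** 2).mean())
pred = (forward(X) > 0.5).astype(int).ravel()
```

Once trained, the weights are frozen and reused: classifying a new input is just the forward pass, which is what "storing knowledge for further use" means in practice.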
|Advantages|Disadvantages|
|---|---|
|An ANN can approximate any function.|It cannot be used everywhere; it works best on complex problems.|
|ANNs work well on complex problems such as image processing and face recognition.|It requires a lot of training.|
|It works the way our brain does.|Modifying an ANN is very complex.|
The SVM was proposed by Vapnik and his co-workers. It is a type of pattern classifier based on a novel statistical learning technique. Unlike traditional techniques such as artificial neural networks, which minimize the empirical training error, SVM aims at structural risk minimization. It works well in high-dimensional spaces with small training samples and gives better results than traditional methods.
SVM has been successfully applied to face detection, object detection, handwritten character recognition, image and information retrieval, speech recognition, and digit recognition. SVM can perform linear and non-linear classification; for non-linear classification it uses the kernel trick, which implicitly maps the inputs into a high-dimensional feature space.
Supervised learning requires labeled data. When the data are not labeled, unsupervised learning is required to find a natural clustering of the data into groups and to map new data to these groups. Support vector clustering is an algorithm that builds on support vector machines and is mostly used in industrial applications.
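A minimal sketch of the linear case: a maximum-margin separator trained by sub-gradient descent on the regularized hinge loss. The data and hyper-parameters are illustrative; a real system would use a library such as LIBSVM and, for the non-linear case, a kernel:

```python
import numpy as np

rng = np.random.default_rng(4)

# Two linearly separable clusters, labels in {-1, +1}.
X = np.vstack([rng.normal(0, 0.4, (30, 2)) + [2.0, 2.0],
               rng.normal(0, 0.4, (30, 2)) - [2.0, 2.0]])
y = np.array([1] * 30 + [-1] * 30)

w = np.zeros(2); b = 0.0
lam, lr = 0.01, 0.1                      # regularization and step size
for _ in range(200):
    viol = y * (X @ w + b) < 1           # points violating the margin
    # Sub-gradient of  lam * ||w||^2 + mean(hinge loss)
    gw = 2 * lam * w - (y[viol, None] * X[viol]).sum(axis=0) / len(X)
    gb = -y[viol].sum() / len(X)
    w -= lr * gw
    b -= lr * gb

train_acc = (np.sign(X @ w + b) == y).mean()
```

Only the margin violators contribute to the gradient: the points that end up on or inside the margin are the support vectors, and they alone determine the final separator.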
|Advantages|Disadvantages|
|---|---|
|The development of SVMs involved sound theory first, then implementation and experiments.|The performance of an SVM depends on the kernel chosen.|
|SVMs use structural risk minimization.|There is no theory on how to pick a good kernel.|
|SVMs have emerged as very solid machine-learning methods for supervised classification problems.|Size is a problem in both training and testing.|
|SVM training always finds a global minimum; convexity is an important and interesting property of nonlinear SVM classifiers.|Speed is also a problem in both training and testing; SVMs are slower than neural networks.|
|SVMs do not suffer from theoretical weaknesses.|Training on very large datasets (millions of samples) is an unsolved problem for SVMs.|
LBP is one of the best-performing texture descriptors and is widely used in various applications. It has proven to be highly discriminative, and its invariance to monotonic gray-level changes and its computational efficiency make it suitable for demanding image-analysis tasks. A face can be seen as a composition of micro-patterns which are well described by the LBP operator.
The LBP operator was originally designed for texture description. The operator assigns a label to every pixel of an image by thresholding the 3×3 neighborhood of each pixel against the center pixel value and interpreting the result as a binary number.
There are several parameters that can be chosen to optimize the performance of an LBP-based algorithm:
1. Choosing the type of LBP operator.
2. Dividing the images into regions.
3. Selecting the distance measure for the nearest-neighbor classifier.
4. Finding the weights wj for the weighted X^2 statistic.
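The 3×3 thresholding step can be written directly. A minimal NumPy sketch of the basic LBP operator, using one common convention (clockwise bit order starting at the top-left neighbor):

```python
import numpy as np

def lbp_basic(img):
    """Basic 3x3 LBP: threshold the 8 neighbors against the center pixel."""
    # Neighbor offsets, clockwise from the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    H, W = img.shape
    center = img[1:H - 1, 1:W - 1]
    codes = np.zeros_like(center, dtype=np.uint8)
    for bit, (dr, dc) in enumerate(offsets):
        neighbor = img[1 + dr:H - 1 + dr, 1 + dc:W - 1 + dc]
        codes |= (neighbor >= center).astype(np.uint8) << (7 - bit)
    return codes

# Top row and right neighbor are brighter than the center -> bits 11110000.
img = np.array([[9, 9, 9],
                [1, 5, 9],
                [1, 1, 1]], dtype=np.int32)
code = int(lbp_basic(img)[0, 0])   # binary 11110000 = 240
```

The face descriptor is then the concatenation of the code histograms computed over each image region, compared with the weighted X^2 statistic from the list above.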
|Advantages|Disadvantages|
|---|---|
|LBP is tolerant to monotonic gray-scale changes.|The recognition rate of local-region-based methods is lower than that of PCA.|
|The recognition rate of LBP remains high under localization errors.|LBP produces long histograms, which slow down recognition, especially on large-scale face databases.|
|An important property of LBP is its computational simplicity, which makes it possible to analyze images in challenging real-time settings.|The binary data produced by LBP are sensitive to noise.|
Independent component analysis (ICA) is a method for finding underlying factors or components in multi-dimensional statistical data. A face recognition system using ICA is needed for facial images with varying face directions and lighting conditions, where it gives better results than existing systems.
An evaluation of face recognition using PCA and ICA on the FERET database with different classifiers found that ICA had a superior recognition rate compared to PCA, both with statistically independent basis images and with statistically independent coefficients.
In ICA, each face image is converted into a vector before the independent components are calculated. ICA reduces the face recognition error, and the dimensionality of the recognition subspace becomes smaller.
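A minimal sketch of the idea on a two-source toy problem: mix two signals, whiten the mixture, and run fixed-point iterations (FastICA-style, with a tanh non-linearity) to find an unmixing matrix. The sources and mixing matrix are illustrative stand-ins for face-image vectors:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 2000
t = np.linspace(0, 8, n)

# Two independent, non-Gaussian sources and a random linear mixture.
S = np.vstack([np.sign(np.sin(3 * t)),        # square wave
               rng.uniform(-1, 1, n)])         # uniform noise
A = np.array([[1.0, 0.6], [0.4, 1.0]])
X = A @ S

# Whiten: zero mean, identity covariance.
X = X - X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(np.cov(X))
Xw = (E / np.sqrt(d)) @ E.T @ X

# Symmetric FastICA-style iterations with tanh non-linearity.
W = rng.normal(size=(2, 2))
for _ in range(100):
    G = np.tanh(W @ Xw)
    W_new = (G @ Xw.T) / n - np.diag((1 - G ** 2).mean(axis=1)) @ W
    # Symmetric decorrelation keeps the rows orthonormal.
    U, _, Vt = np.linalg.svd(W_new)
    W = U @ Vt

Y = W @ Xw                                     # recovered components
# Each recovered row should match one source (up to sign and order).
corr = np.abs(np.corrcoef(np.vstack([Y, S]))[:2, 2:])
```

The sign/order ambiguity visible here is exactly the limitation listed in the table below: ICA cannot determine the variances of the components or rank their order.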
|Advantages|Disadvantages|
|---|---|
|ICA provides a more powerful data representation than PCA.|In the ICA model, the variances of the independent components cannot be determined.|
|PCA-ICA attains a higher average success rate than the eigenface and Fisherface methods.|The order of the dominant components cannot be ranked.|
|ICA's goal is an independent rather than merely uncorrelated image decomposition and representation.|Independent components are extracted through an iterative optimization procedure, so the answer varies slightly from run to run.|
|Principal Component Analysis (PCA)|Linear Discriminant Analysis (LDA)|
|---|---|
|PCA does feature classification.|LDA does data classification.|
|PCA consumes more time for face recognition.|LDA consumes less time for face recognition.|
|PCA optimizes a low-dimensional representation of objects, achieving object reconstruction.|LDA focuses on features that discriminate between classes.|
|No distinction between inter- and intra-class variability.|Finds a sub-space which maximizes the ratio of inter-class to intra-class variability.|
|Problems with illumination, head pose, etc.|Also works under various illuminations.|
|Reduces the dimension of the data from N² to M.|Reduces the dimension of the data from N² to P−1.|
|Verifies whether the image is a face at all.|Can only classify a face which is "known" to the database.|
|PCA works well when the lighting variation is small.|LDA gives better accuracy for facial expressions.|
|PCA uses raw data directly for learning and recognition without any processing.|LDA seeks directions that are efficient for discrimination between the data before use.|
|In FAR and FRR tests, PCA shows poor results.|In FAR and FRR tests, LDA shows good results.|
|PCA does not pay attention to the underlying class structure.|LDA deals directly with discrimination between classes.|
|When the training set is small, PCA can outperform LDA.|When the number of samples per class is large, LDA outperforms PCA.|
|Support Vector Machine|Artificial Neural Network|
|---|---|
|The SVM classifier gives slightly higher prediction accuracy.|The ANN classifier gives slightly lower prediction accuracy.|
|Accuracy obtained by SVM is 93.45%.|Accuracy obtained by ANN is 92.33%.|
|The development of SVMs involved sound theory first, then implementation and experiments.|The development of ANNs followed a heuristic path, with applications and extensive experimentation preceding theory.|
|SVMs use structural risk minimization.|ANNs use empirical risk minimization.|
|The SVM approach does not attempt to control model complexity by keeping the number of features small.|The ANN approach attempts to control model complexity by keeping the number of features small.|
|SVMs do not suffer from theoretical weaknesses.|ANNs suffer from theoretical weaknesses.|
|SVMs select their model size automatically.|ANNs cannot select their model size automatically.|
|SVM training always finds a global minimum; convexity is an important and interesting property of nonlinear SVM classifiers.|ANNs suffer from the existence of multiple local-minima solutions.|
|The SVM most widely used in the classification of remotely sensed images is the radial-basis-function SVM.|The ANN most widely used in the classification of remotely sensed images is the multilayer perceptron (MLP).|
|SVMs have emerged as very solid machine-learning methods for supervised classification problems.|ANNs are machine-learning models inspired by the biological neural networks present in animal and human brains.|