Constellation Method
The methods described in the previous section rely heavily on heuristic information taken from face images modeled under fixed conditions 12. As a result, they may not work properly for locating faces of various poses against a complex background. More rigorous face detection methods reduce this problem by grouping facial features into face-like constellations using more robust modeling techniques such as statistical analysis 11, 12, 16. Objects are represented by a set of features and the spatial relations between them 80. For instance, a human face can be represented as a collection of eyes, ears, nose, mouth, hair, and skin, with the nose lying between the eyes and the mouth. The generic object is therefore modeled as a constellation of features, or constellation model. Each feature or part has its own appearance and shape properties, and the features are tied together, tightly or loosely, by class-specific spatial configuration constraints. Various types of face constellations have been presented 12, 1. Wei-Lu Chao 1 built a face detector based on a statistical shape model, using multi-orientation, multi-scale Gaussian derivative filters. Facial features such as the two eyes, the two nostrils, and the nose/lip junction are selected, and a prototype filter response is produced for each of them. A multivariate Gaussian distribution represents the distribution of the mutual distances among the facial features. The experiment achieved an 86% accuracy rate on cluttered images. Burl et al 81 proposed a face localization system that combines local detectors with a statistical model of the spatial arrangement of facial features to achieve good performance 12. As in Chao's detector, multi-orientation, multi-scale Gaussian derivative filters are used to identify candidate feature locations. The outputs of the local detectors are treated as candidate feature locations, and constellations are formed from them.
The constellations are classified based on the probability that the shape variables correspond to a face versus an alternative model. A constellation may be incomplete because some true features are missed. Translation, rotation, scale, and missed features are handled by mapping the feature locations to a set of shape variables 12, 81.
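The scoring of a candidate constellation can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of 1 or 81: the shape variables are taken to be the mutual distances between candidate feature locations (which are invariant to translation and rotation, but not to scale), a single multivariate Gaussian models the face class, and the alternative model and the handling of missing features are omitted. The function names and the small ridge term added to the covariance are hypothetical choices made for the example.

```python
import itertools
import numpy as np

def pairwise_distances(points):
    """Vector of mutual distances between every pair of feature points."""
    pts = np.asarray(points, dtype=float)
    return np.array([np.linalg.norm(pts[i] - pts[j])
                     for i, j in itertools.combinations(range(len(pts)), 2)])

def fit_gaussian(distance_vectors, ridge=1e-6):
    """Mean and covariance of the distance vectors of training faces.

    The ridge term (an assumption of this sketch) keeps the covariance
    invertible, since mutual distances are not independent of each other.
    """
    d = np.asarray(distance_vectors, dtype=float)
    mean = d.mean(axis=0)
    cov = np.cov(d, rowvar=False) + ridge * np.eye(d.shape[1])
    return mean, cov

def log_likelihood(d, mean, cov):
    """Log density of a multivariate Gaussian evaluated at d."""
    diff = np.asarray(d, dtype=float) - mean
    _, logdet = np.linalg.slogdet(cov)
    mahalanobis = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (mahalanobis + logdet + len(diff) * np.log(2 * np.pi))
```

A candidate constellation whose log-likelihood under the face model is high relative to an alternative model would be kept as a face hypothesis; the comparison against the alternative model is left out here.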
2.2.2 Model-based approaches
Active shape model (ASM)
Unlike the feature-based methods described in section 2.2.1, active shape models concentrate on complex non-rigid features, such as the actual physical and higher-level appearance of features 11, 12. The ASM is the most common model used to locate the landmark points that define the shape of a statistically modeled object in an image; the shape is iteratively deformed to fit the object in a new image 81, 82. Facial features such as the eyes, lips, nose, mouth, and eyebrows are demarcated by landmark points, which are distinguishable points present in most of the images under consideration. For instance, the nose and eyes can be represented as shown in figure 2.3.
Figure 2.3 ASM Facial Features 82
The set of landmarks forms a shape, and each shape is represented as a vector of (x, y) coordinate points. The shape of an object can thus be defined as a set of n points in d dimensions 84.
As figure 2.3 shows, the best landmark points are those that can be located at the same place in other images. These points lie on the object boundaries and capture the shape of the face most accurately. The ASM makes it possible to build a model that can create new shapes and synthesize shapes similar to those in a training set. The training set is usually derived from hand annotation, although some automatic landmarking systems exist 84. Amir Faiz (2008) 12 and Ghani Honi (2014) 11 describe three categories of ASM. The first category, the snake, was first introduced by Kass et al in 1987 56. Later, Yuille et al 66 proposed the deformable template method, which takes a priori knowledge of facial features into account to improve on the performance of snakes 11, 12. The third category is a more flexible model, named smart snakes and the point distribution model (PDM), introduced by Cootes et al 16, 12, 11. It provides a more efficient interpretation of the human face by using a set of labeled points that are only allowed to vary to certain shapes according to a training procedure 56.
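A point distribution model of the kind described above can be sketched as a principal component analysis over landmark vectors. The sketch assumes that the training shapes have already been aligned (for example by Procrustes analysis, which is not shown) and that each shape parameter is limited to three standard deviations of its mode, a commonly used convention rather than a value taken from the cited works. The function names `fit_pdm` and `synthesize` are illustrative.

```python
import numpy as np

def fit_pdm(shapes, variance_kept=0.95):
    """Fit a point distribution model.

    shapes: array of shape (n_samples, n_landmarks, 2), assumed pre-aligned.
    Returns the mean shape vector, the retained modes (as columns) and
    the variance (eigenvalue) of each retained mode.
    """
    X = np.asarray(shapes, dtype=float).reshape(len(shapes), -1)
    mean = X.mean(axis=0)
    # PCA through the SVD of the centred landmark vectors
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    eigvals = s ** 2 / (len(X) - 1)
    explained = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(explained, variance_kept)) + 1
    return mean, vt[:k].T, eigvals[:k]

def synthesize(mean, modes, eigvals, b, limit=3.0):
    """Generate the shape mean + modes @ b, clipping each parameter in b
    to +/- limit * sqrt(eigenvalue) so that only plausible shapes result."""
    bound = limit * np.sqrt(eigvals)
    b = np.clip(np.asarray(b, dtype=float), -bound, bound)
    return (mean + modes @ b).reshape(-1, 2)
```

Clipping the parameters is what restricts the labeled points to shapes similar to those seen during training; setting all parameters to zero reproduces the mean shape.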
Image-based approaches
As shown in the previous sections, feature-based approaches rely on explicit face features, so their accuracy can be affected by unpredictable face appearance and other environmental factors. Despite many attempts to improve the performance of feature-based methods, they are still limited to head-and-shoulder and quasi-frontal faces. Methods are therefore needed that can work in more difficult scenarios, such as detecting multiple faces against heavily cluttered backgrounds. These conditions have led researchers in this area to treat face detection as a pattern recognition problem. Unlike the feature-based approach, which relies on facial features, image-based methods use predefined face patterns or templates to determine whether the input image contains a face. The trained classifier assigns each image region to the face or the non-face class. Most image-based methods depend on multi-resolution window scanning to detect faces 30. The window scanning algorithm systematically searches the input image for possible face locations; however, its implementation varies across image-based systems 56. Each method may vary the size of the scanning window, the sub-sampling rate, the step size, and the number of iterations, depending on what a computationally efficient system requires. Image-based approaches are divided into linear subspace models, neural networks, and statistical approaches 12, 30.
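The multi-resolution window scan can be sketched as a generator that slides a fixed-size window over successively sub-sampled copies of the image. The window size, step size, and scale factor below are arbitrary defaults chosen for illustration, and nearest-neighbour sub-sampling is used only to keep the example dependency-free; a practical system would smooth the image before sub-sampling.

```python
import numpy as np

def sliding_windows(image, window=20, step=4, scale=1.25):
    """Yield (x, y, size, patch) for every window at every pyramid level.

    x, y and size are expressed in the coordinates of the original image;
    patch is always a window x window array ready for a face/non-face
    classifier.
    """
    original = np.asarray(image, dtype=float)
    level, factor = original, 1.0
    while min(level.shape[:2]) >= window:
        h, w = level.shape[:2]
        for y in range(0, h - window + 1, step):
            for x in range(0, w - window + 1, step):
                patch = level[y:y + window, x:x + window]
                yield int(x * factor), int(y * factor), int(window * factor), patch
        # Next pyramid level: sub-sample the original by the accumulated factor
        factor *= scale
        rows = (np.arange(int(original.shape[0] / factor)) * factor).astype(int)
        cols = (np.arange(int(original.shape[1] / factor)) * factor).astype(int)
        level = original[rows][:, cols]
```

Smaller steps and scale factors closer to 1 examine more windows and so raise the computational cost, which is the trade-off each image-based system resolves differently.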
Neural Network
Neural network methods are becoming increasingly popular for face recognition, face detection, and autonomous robot driving. Their popularity is due to adaptive learning, self-organization, real-time operation, and fault tolerance via redundant information coding 30. The basic idea of the neural network arose when researchers tried to understand how biological neural networks process information. A neural network consists of nodes connected together to form a network, which must be adapted (trained) by some method to generate the expected result. Early work on neural networks dates back to the early 1970s, when high-speed processing techniques were used to solve complex problems. Like other techniques, neural networks have limitations: they are slow on complex problems, susceptible to noise, and highly dependent on the training set 86. However, these limitations can be reduced through careful design.
The multilayer perceptron (MLP) is an early neural network method implemented for face detection, and it produced encouraging results on fairly simple datasets 86. Later, in 1998, Rowley et al 85 presented a more advanced neural network approach that operates on a large and visually complex dataset. The system is based on a retinally connected neural network (RCNN) that examines small windows of an image to decide whether each window contains a face 86. To minimize the number of false positives, they use several networks, each trained in the same manner but with randomly generated initial weights and different permutations of the order in which the scenery images are presented. The network outputs are combined using various arbitration methods to produce the final decision. However, training a neural network for face detection is a challenging task because of the difficulty of characterizing non-face images.
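The idea of arbitrating among several independently trained networks can be sketched as below. The three rules shown (logical AND, logical OR, and majority voting) are simple examples of arbitration; the source does not specify which methods are meant, so they should be read as assumptions. The sketch also assumes that each network returns one raw score per window and that a score above the threshold means "face".

```python
import numpy as np

def arbitrate(outputs, method="vote", threshold=0.0):
    """Combine the per-window outputs of several networks.

    outputs: array of shape (n_networks, n_windows) holding raw scores.
    Returns a boolean array with one final decision per window.
    """
    decisions = np.asarray(outputs) > threshold
    if method == "and":    # every network must agree: fewest false positives
        return decisions.all(axis=0)
    if method == "or":     # any network suffices: fewest missed faces
        return decisions.any(axis=0)
    if method == "vote":   # strict majority of the networks
        return decisions.sum(axis=0) * 2 > decisions.shape[0]
    raise ValueError(f"unknown arbitration method: {method}")
```

Because the networks start from different random weights, they tend to make different mistakes on non-face windows, so requiring agreement removes many false positives at the cost of some detections.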
Statistical Approaches
Beyond the linear subspace and neural network techniques, there are several statistical face detection techniques, including nonlinear subspace methods (kernel PCA and kernel LDA), the Support Vector Machine (SVM) 89, Bayes' decision rule 56, and techniques such as the DCT, HMM, and Fourier transform 89. Statistical approaches have attracted the interest of many researchers in face detection and recognition. They learn the statistical features of a face and use the acquired knowledge to classify faces 75. The statistical techniques outlined in this section are the SVM classifier and Bayes' decision rule. In a face detection system, learning a rule that separates a class of objects from the class of non-objects is a challenging task.
Schneiderman and Kanade 70 designed a face detection system based on a statistical model and probabilistic learning. They statistically represent both the appearance of the object and the appearance of the rest of the world. To construct the statistical model, they adopt multiple histograms, where each histogram represents the statistical behavior (attributes) of a different group of quantized wavelet coefficients. Because a histogram can only describe appearance with a relatively small number of discrete values, they extract multiple attributes from the image and compute their histograms separately. The overall model for Bayes' decision rule is expressed as a ratio of two probabilities (Eq. 3).
If the likelihood ratio on the left side of Eq. 3 is greater than the threshold on the right side, the decision is that an object (a face) is present; otherwise the object is judged to be absent. Using the Bayesian approach (maximum likelihood), one can choose the most likely class of a new data point 90. Writing Bayes' decision rule as a likelihood ratio (Eq. 3) has two advantages. First, it is simpler to collect statistics for the two probability functions, P(image | object) and P(image | non-object) 56, than to collect statistics directly for the posterior probability function. Second, with the rule expressed in this form, the prior probabilities can be factored out and combined into a scalar threshold, λ. The algorithm can then be applied to different settings by changing the threshold.
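The likelihood-ratio test can be sketched in code as follows, assuming the rule takes the usual form P(image | object) / P(image | non-object) > λ, with λ = P(non-object) / P(object). Two further assumptions are made for the example and are not taken from 70: the attributes are treated as independent, so the class-conditional probability is the product of per-attribute histogram values, and additive smoothing avoids zero counts. The function names are hypothetical.

```python
import numpy as np

def fit_histograms(samples, n_bins, smoothing=1.0):
    """Estimate one normalised histogram per attribute.

    samples: integer array (n_samples, n_attributes) of quantised
    attribute values in the range [0, n_bins).
    """
    samples = np.asarray(samples)
    hists = np.full((samples.shape[1], n_bins), float(smoothing))
    for a in range(samples.shape[1]):
        hists[a] += np.bincount(samples[:, a], minlength=n_bins)
    return hists / hists.sum(axis=1, keepdims=True)

def log_likelihood_ratio(x, face_hists, nonface_hists):
    """log P(x | face) - log P(x | non-face) under attribute independence."""
    idx = np.arange(len(x))
    return float(np.sum(np.log(face_hists[idx, x]) - np.log(nonface_hists[idx, x])))

def is_face(x, face_hists, nonface_hists, prior_face=0.5):
    """Bayes decision: compare the log ratio with log(lambda),
    where lambda = P(non-face) / P(face) is built from the priors."""
    log_lambda = np.log((1.0 - prior_face) / prior_face)
    return log_likelihood_ratio(x, face_hists, nonface_hists) > log_lambda
```

Lowering `prior_face` raises the threshold and makes the detector more conservative, which illustrates how a single scalar adapts the same learned statistics to different settings.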
Stanley Michael Bileschi (2003) describes two density-based classifiers and a Support Vector Machine (SVM) for face detection. Like Kanade et al 91, he uses two different density estimation classifiers. The probability of positive data (faces) is calculated by finding the mean and covariance matrix of a Gaussian distribution, whereas the probability of negative data is computed from a uniform distribution, under which every data point x is equally probable.
2.3 Machine based face recognition
According to 68, the human ability to recognize faces provides an implicit benchmark for designing an effective face recognition system. As Chellappa et al (2009) report, early work on pattern classification techniques dates back to the 1970s, when measurements between features in faces or face profiles were used. Ten years later, work on face recognition attracted more interest than before 66. In the early 1990s, research interest in face recognition grew significantly, and many new methods emerged to improve the performance of recognizers. The reasons include:
- The availability of high-quality hardware devices such as cameras
- The growing need for highly secure applications
- Studies on neural network classifiers with an emphasis on real-time computation and adaptation 66
Face recognition touches many research areas, such as computer vision, machine learning, and neural networks 11. To date, a number of methods have been introduced in the literature for matching faces acquired from an image or a video camera. Figure 2.4 shows three types of face recognition methods. The analytic approach uses facial features through geometric-based and template-based methods. The holistic approach uses information derived from the whole face pattern and includes PCA, LDA, and ICA. The hybrid approach combines both local and global features to produce a more complete representation of facial images 11, 30, 66.
Figure 2.4 Classification of face recognition techniques
2.4 Analytical approaches
In the analytic approaches presented in 66, various features, such as distances, the angles between facial points, and the shapes of facial features, are extracted for face recognition. The analytic approach is further divided into template and geometric feature methods, as shown in figure 2.4. The geometric feature method focuses on extracting local features such as the eyes, nose, mouth, and hair to identify an individual's face. In the template method, facial features are matched against templates of facial parts such as the eyes, nose, and mouth, and the similarity scores of the individual features are simply added into a global score for face recognition 89. The major benefit of this approach is that it allows a flexible system that works effectively under deformation of key feature points.
2.4.1 Geometric local feature method
The geometric method extracts or measures distinct facial features and computes the distances between those facial points. It differs from the feature-based method because it constructs a graph from the facial features of each subject 66. Standard statistical pattern recognition methods are then used to match faces using these measurements 14. Most early automated face recognition systems were geometry based. The first attempt at automatic face recognition using a vector of geometric features was made by Kanade et al (1973), where a set of 16 features was computed using a robust face detector. The Kanade system achieved a 75% correct recognition rate on a database of 20 different people, using 2 images per person, one for training and one for testing. Intra-class and inter-class variations are among the factors that affect the effectiveness of the system.
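Matching with a vector of geometric features can be sketched as follows. The landmark names, the particular distances, and the normalisation by inter-ocular distance are hypothetical choices made for this example; they are not the 16 features of the Kanade system. Matching uses the nearest neighbour under the Euclidean distance as a simple stand-in for the statistical pattern recognition methods mentioned above.

```python
import numpy as np

# Hypothetical landmark pairs whose distances form the feature vector
PAIRS = [("left_eye", "nose_tip"), ("right_eye", "nose_tip"),
         ("nose_tip", "mouth"), ("mouth", "chin"),
         ("left_eye", "mouth"), ("right_eye", "mouth")]

def geometric_features(landmarks):
    """Distances between landmark pairs, divided by the inter-ocular
    distance so that the vector does not depend on image scale."""
    p = {name: np.asarray(xy, dtype=float) for name, xy in landmarks.items()}
    inter_ocular = np.linalg.norm(p["left_eye"] - p["right_eye"])
    return np.array([np.linalg.norm(p[a] - p[b]) / inter_ocular for a, b in PAIRS])

def identify(probe_landmarks, gallery):
    """Return the gallery identity whose stored feature vector is closest
    to the probe. gallery maps a name to a feature vector."""
    f = geometric_features(probe_landmarks)
    return min(gallery, key=lambda name: np.linalg.norm(f - gallery[name]))
```

With only one training image per person, as in the experiment above, the gallery holds a single vector per identity, so the variation within a class is not modeled at all; this is one way to see why intra-class variation limits accuracy.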
Brunelli and Poggio 70 compared two different algorithms. The first experiment was based on a computed set of geometric features such as nose width and length, mouth position, and chin shape, while the second used template matching. The experiments achieved 90% and 100% recognition rates for geometric features and template matching, respectively.
Lades et al 69 proposed an object recognition system based on the dynamic link architecture, an extension of the classical artificial neural network. Objects were represented by sparse graphs whose vertices were labeled with a multi-resolution description in terms of a local power spectrum and whose edges were labeled with geometric distance vectors. In this technique, a set of feature vectors was formed over a dense grid of image points based on Gabor-type wavelets, and the magnitudes of the Gabor wavelet coefficients were used for matching and recognition. To recognize a face in an image, every graph in the model was matched against the image separately; the system obtained good results on 87 images with different facial expressions. However, the matching process was time-consuming, taking 25 seconds to compare an image with 87 stored images. Starovoitov et al 71 compared geometric, elastic matching, and neural network methods for face recognition. The geometric method was tested on 70 images of 12 persons from the Olivetti Research Laboratory (ORL) face database and achieved a 98.5% recognition rate, while the elastic matching and neural network methods achieved 92.5% and 94% recognition rates, respectively.
More recently, Sonu Dhall and Poonam Sethi 72 presented a survey on geometric and appearance feature analysis for facial expression recognition. They evaluated Local Binary Patterns on three orthogonal planes (LBP-TOP), the Pyramid of Histogram of Gradients, and a geometric feature based on the distances between fiducial points for person-independent facial expression recognition. A support vector machine (SVM) classifier was used for the face detection tasks. They trained and tested the classifier using 70 and 30 subjects, respectively, and obtained satisfactory results. Moreover, LBP-TOP achieved 45.45% accuracy on the Cohn-Kanade database, considerably higher than the geometric points (23.14%).
2.4.2 Template method
As Unsang Park 50 explained, the template method is based on measuring the similarity between the input image and template images stored in the database. It is related to holistic methods, which attempt to represent the whole face appearance, and it uses cross-correlation between the input image and stored templates of whole-face features to determine the presence of a face. The face features can be defined individually as eyes, face contour, nose, mouth, and edge models. These methods are well suited to frontal faces. Ion Marqués 78 presented a silhouette face template and the relation between bright and dark regions as ways to represent a face template. Anshita Agrawal 77 also describes a different form of template matching: a test image can be represented as a two-dimensional array of intensity values and compared, using a suitable metric such as the Euclidean distance, with a single template representing the whole face 66. Some template matching methods instead use more than one face template, taken from different viewpoints, to represent each face. The method is simple to implement and achieves good results, but it is computationally expensive.
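The two similarity measures mentioned above, cross-correlation and the Euclidean distance, can be sketched for images that have already been cropped to the same size as the templates. The zero-mean, normalised form of cross-correlation is used here as an assumption because it tolerates uniform brightness and contrast changes; the function names are illustrative.

```python
import numpy as np

def euclidean_distance(image, template):
    """Euclidean distance between two equally sized intensity arrays."""
    diff = np.asarray(image, dtype=float) - np.asarray(template, dtype=float)
    return float(np.linalg.norm(diff))

def ncc(image, template):
    """Zero-mean normalised cross-correlation in [-1, 1]; returns 0.0
    when either input has no intensity variation."""
    a = np.asarray(image, dtype=float).ravel()
    b = np.asarray(template, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def best_template(image, templates):
    """Return the name of the best-correlated template and all scores.
    templates maps a name to an array of the same shape as image."""
    scores = {name: ncc(image, t) for name, t in templates.items()}
    return max(scores, key=scores.get), scores
```

Every probe must be compared with every stored template, and with several templates per person when multiple viewpoints are kept, which is the source of the computational cost noted above.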
2.5 Holistic approaches
Holistic approaches use the entire face region in the detected image to perform face recognition. The most relevant information describing a face is derived from the whole face image 30 and is used to uniquely identify individuals. This approach is important because designing a model for the geometric method is difficult, owing to the large variability of face appearance under pose and lighting changes, while the template method has difficulty with non-frontal views. Several well-known holistic methods are available for face recognition, such as principal component analysis (PCA), linear discriminant analysis (LDA), and independent component analysis (ICA) 30, 60. Sirovich and Kirby 56 used PCA to develop the eigenface method for face recognition and representation. In the eigenface method, the whole face pattern is converted to a feature vector, and the eigenfaces are computed from a set of training images. PCA focuses on reducing the high dimensionality of the training set while preserving the key information contained in the dataset. This method has been implemented by many researchers, such as 11, 30, 56, 66, who obtained high recognition accuracy under varying facial expressions on the ORL, YALE, and FERET databases.
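The eigenface procedure can be sketched with a singular value decomposition of the centred training images. This is a minimal illustration rather than the implementation of any cited work: it assumes the images are already cropped and aligned to a common size, and it classifies a probe by the nearest training image in the reduced space, which is only one possible matching rule. The function names are hypothetical.

```python
import numpy as np

def fit_eigenfaces(images, n_components):
    """Compute the mean face and the leading eigenfaces.

    images: array (n_images, height, width). Each image is flattened
    into a feature vector; the rows of the returned matrix are the
    orthonormal eigenfaces.
    """
    X = np.asarray(images, dtype=float).reshape(len(images), -1)
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(image, mean, eigenfaces):
    """Low-dimensional weights of an image in eigenface space."""
    return eigenfaces @ (np.asarray(image, dtype=float).ravel() - mean)

def recognise(image, mean, eigenfaces, gallery_weights, labels):
    """Label of the training image whose weights are nearest to the probe."""
    w = project(image, mean, eigenfaces)
    nearest = int(np.argmin(np.linalg.norm(gallery_weights - w, axis=1)))
    return labels[nearest]
```

Each image is reduced from height x width pixel values to `n_components` weights, which is the dimensionality reduction described above.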