Theses and Dissertations

Permanent URI for this collection: http://ir.daiict.ac.in/handle/123456789/1

Search Results

Now showing 1 - 4 of 4
  • Item (Open Access)
    Facial expression recognition: feature based approaches to deep learning techniques
    (Dhirubhai Ambani Institute of Information and Communication Technology, 2020) Sujata; Mitra, Suman K.
    Facial expression recognition (FER) is a pattern recognition problem that has held the attention of computer vision researchers for the last three decades. The problem nevertheless remains open because of challenges such as blurring, illumination variation, pose variation, and face images captured in unconstrained environments. Early work studied hand-crafted features followed by a classical classifier, across a variety of features and classifiers. Hand-crafted features associated with changes in expression are hard to extract because of individual differences and variations in emotional state. With the introduction of deep neural networks (DNNs) and convolutional neural networks (CNNs), FER techniques have changed both in efficiency and in how they handle the challenges mentioned above. The modular approach presented here mimics the human ability to identify a person from a limited part of the face. Facial parts such as the eyes, nose, lips, and forehead contribute more to the expression recognition task. This thesis presents FER approaches ranging from classical feature-based methods to deep learning techniques. First, we propose a two-dimensional Taylor expansion for facial feature extraction that also handles local illumination. Most existing methods deal only with global illumination variation and therefore perform poorly under natural illumination variations, which are usually uncontrolled in the real world. To address illumination variation, we introduce the Laplace-Logarithmic (LL) domain to further improve performance. We apply the proposed 2D Taylor expansion theorem in the facial feature extraction phase and formulate the 2DTFP method.
In our second FER approach, we propose a histogram of second-order gradients (HSOG) for feature extraction. Most popular local image descriptors in the literature, such as SIFT, HOG, DAISY, LBP, and GLOH, use only first-order gradient information, which relates to the slope and elasticity (e.g., length and area) of a surface, and therefore only partly characterize the geometric properties of an image. We use a local image descriptor that extracts the histogram of second-order gradients, which captures local curvatures from differential geometry, i.e., cliffs, ridges, summits, valleys, basins, etc. From these curvatures we compute a shape index whose values correspond to different shapes, and different shapes correspond to different facial expressions. Much work in this field has extracted local texture features and used them for classification. Because this information is very local, the feature vector obtained for the full image has a very high dimension, which poses computational challenges for real-time expression recognition. Dimensionality reduction methods have recently been used successfully in image recognition tasks. Here we propose two dimensionality reduction methods: E-PCA (Euler Principal Component Analysis) and CS-ONPP (Orthogonal Neighborhood Preserving Projection with Class Similarity-based neighborhood). They substantially reduce the feature vector length while maintaining the same recognition accuracy. Classical FER methods do well in certain well-controlled cases. The fundamental issue with classification approaches based on hand-crafted features is that they require domain knowledge and do not generalize well to complex datasets. Deep learning is fast becoming a go-to tool for many artificial intelligence problems because it can outperform other approaches, and even humans, on many tasks.
A DNN has millions of parameters. Finding an optimal set of parameters requires a large amount of training data. Even when such data is available, training generally takes many iterations and is costly in computing resources. Fine-tuning instead adjusts the parameters of an already-trained network so that it adapts to the new task at hand. Here we propose two deep learning-based methods. The first is DNNFG (DNN based on Fourier transform followed by Gabor filtering), in which we use a pre-trained VGG16 model with fine-tuning to extract facial features. VGG16 is chosen for its effective performance in visual detection and its fast convergence. It has about 138 million parameters and contains 13 convolutional layers followed by 3 fully-connected (FC) layers. Since the VGG framework was not designed for FER tasks, we modified it to suit our requirements. The second method is 2DNN (Double-channel based Deep Neural Network), in which we use the VGGFace architecture. VGGFace is trained on 2.6M face images of 2.6k different people; its architecture is the same as VGG16, and only the training images differ. To adapt VGGFace to the FER problem, it is fine-tuned. It readily exploits both local and global information about the expressions. The DNN-based methods improve recognition accuracy compared to the classical approaches. Experiments are performed on four benchmark FER databases: JAFFE, VIDEO, CK+, and OULU-CASIA. In summary, the thesis examines classical facial expression recognition approaches and their shortcomings, then moves to deep learning-based approaches that address these shortcomings and perform well compared to hand-crafted methods. The thesis also shows experimentally that a modular approach performs better than a holistic approach.
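The figure of about 138 million parameters can be checked with a short calculation. The sketch below is not taken from the thesis: it assumes the standard published VGG16 layout (3x3 convolutions, a 224x224x3 input, 2x2 pooling after each block, and a 1000-way output layer), none of which is stated in the abstract, and the function name is purely illustrative.

```python
# Parameter count for the standard VGG16 layout. The layer widths below are the
# commonly published VGG16 configuration (an assumption, not from the abstract).
CONV_BLOCKS = [(64, 2), (128, 2), (256, 3), (512, 3), (512, 3)]  # (channels, layers)
FC_WIDTHS = [4096, 4096, 1000]


def vgg16_parameter_count(in_channels=3, image_size=224):
    """Return (conv_layers, fc_layers, total_parameters)."""
    total = 0
    conv_layers = 0
    channels = in_channels
    size = image_size
    for out_channels, repeats in CONV_BLOCKS:
        for _ in range(repeats):
            # 3x3 kernel weights plus one bias per output channel
            total += 3 * 3 * channels * out_channels + out_channels
            channels = out_channels
            conv_layers += 1
        size //= 2  # 2x2 max pooling halves the spatial size after every block
    features = channels * size * size  # flattened input to the first FC layer
    for width in FC_WIDTHS:
        # dense weights plus one bias per output unit
        total += features * width + width
        features = width
    return conv_layers, len(FC_WIDTHS), total


conv_layers, fc_layers, total = vgg16_parameter_count()
print(conv_layers, fc_layers, total)
```

Under these assumptions most of the parameters sit in the first fully-connected layer, which is one reason fine-tuning a pre-trained model is attractive when FER training data is limited.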
  • Item (Open Access)
    Sparse representation and fisher discriminant criterion based dimensionality reduction for face recognition
    (2020) Chavda, Parita; Mitra, Suman K.
    Dimensionality reduction (DR) is a very popular topic in the field of pattern recognition. Practical applications such as face recognition, object classification, and text categorization generally involve high-dimensional data. However, past research shows that high-dimensional images may reside on a low-dimensional manifold. Dimensionality reduction is therefore a necessary pre-processing step for handling high-dimensional data efficiently. Many linear, non-linear, neighborhood-based, and kernel-based DR techniques have been developed and have demonstrated good results in face recognition. All of these methods are less effective in real-time face recognition when facial expression, illumination, and pose vary widely. A few years ago, the sparse representation (SR) based classifier (SRC) showed strong results in classification. Obtaining a sparse representation requires more training samples than the dimension of the input image. In face recognition, the number of training samples is usually smaller than the input image dimension, so dimensionality reduction becomes mandatory before applying SRC. Recently, sparsity-based DR methods such as SPP, SRC-DP, and SRC-FDC have been developed and have shown very good results in real-world face recognition. SPP and SRC-DP use the sparse reconstruction residual, which is not very useful for classification. To overcome this, SRC-FDC uses the Fisher discriminant criterion for better class separation, but it initializes the projection matrix P0 randomly. A new DR technique with a proper initialization of P0, called Initialized SRC-FDC, is proposed. Experiments on the Extended Yale B, CMU PIE, and COIL-20 datasets show that Initialized SRC-FDC is more effective and efficient than the original SRC-FDC.
  • Item (Open Access)
    Variants of orthogonal neighborhood preserving projections for image recognition
    (Dhirubhai Ambani Institute of Information and Communication Technology, 2018) Koringa, Purvi Amrutlal; Mitra, Suman K.
    With the increase in the resolution of image capturing sensors and in data storage capacity, a huge increase in image data has been seen in past decades. This information upsurge has created a huge challenge for machines performing tasks such as image recognition and image reconstruction. In image data, each observation or pixel can be considered a feature or a dimension, so an image can be represented as a data point in a very high-dimensional space. Most of these high-dimensional images lie on or near a low-dimensional manifold. Running machine learning algorithms on this high-dimensional data is computationally expensive and usually generates undesired results because of the redundancy present in the image data. Dimensionality Reduction (DR) methods exploit this redundancy within the high-dimensional image space and explore the underlying low-dimensional manifold structure based on criteria or image properties such as correlation, similarity, pair-wise distances, or neighborhood structure. This study focuses on variants of one such DR technique, Orthogonal Neighborhood Preserving Projections (ONPP). ONPP searches for a low-dimensional representation that preserves the local neighborhood structure of the high-dimensional space. This thesis studies some of the issues with the existing method and provides solutions for them. ONPP is a three-step procedure: the first step defines a local neighborhood, the second step defines a locally linear neighborhood relationship in the high-dimensional space, and the third step seeks a lower-dimensional subspace that preserves the relationship found in the second step. The major issues with the existing ONPP technique are the local linearity assumption even with varying neighborhood size, the strict distance-based or class-membership-based neighborhood selection rule, non-normalized projections, and susceptibility to outliers in the data.
This study proposes variants of ONPP by suggesting modifications in each of these steps to tackle the above-mentioned problems and better suit image recognition applications. This thesis also proposes a 2-dimensional variant that overcomes the limitation of Neighborhood Preserving Projections (NPP) and Orthogonal Neighborhood Preserving Projections (ONPP) while performing image reconstruction. All the new proposals are tested on benchmark data-sets of face recognition and handwritten numerals recognition. In all cases, the new proposals outperform the conventional method in terms of recognition accuracy with reduced subspace dimensions.
Keywords: Dimensionality Reduction, manifold learning, embeddings, Neighborhood Preserving Projection (NPP), Orthogonal Neighborhood Preserving Projections (ONPP), image recognition, face recognition, text recognition, image reconstruction
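The three-step procedure described in this abstract can be sketched roughly as follows. This is a minimal illustration of conventional ONPP as it is commonly formulated, not the thesis's variants. The function name `onpp`, the regularisation term `reg`, and the random test data are assumptions introduced only for the example.

```python
import numpy as np


def onpp(X, n_neighbors=5, n_components=2, reg=1e-3):
    """Sketch of conventional ONPP. X has shape (n_samples, n_features).

    Returns (V, Y): an orthonormal projection matrix and the embedded data.
    """
    n = X.shape[0]

    # Step 1: local neighbourhood = k nearest neighbours (Euclidean distance)
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbour
    nbrs = np.argsort(d2, axis=1)[:, :n_neighbors]

    # Step 2: weights that best reconstruct each point from its neighbours,
    # constrained to sum to one (locally linear relationship)
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[nbrs[i]] - X[i]  # neighbours centred on x_i
        G = Z @ Z.T            # local Gram matrix, shape (k, k)
        # small ridge term (an assumption) keeps G invertible when k > n_features
        G = G + reg * max(np.trace(G), 1e-12) * np.eye(n_neighbors)
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, nbrs[i]] = w / w.sum()

    # Step 3: orthonormal basis V minimising the reconstruction error of the
    # projected points, i.e. eigenvectors of X^T (I - W)^T (I - W) X with the
    # smallest eigenvalues
    I_W = np.eye(n) - W
    M_tilde = X.T @ (I_W.T @ I_W) @ X
    _, eigvecs = np.linalg.eigh(M_tilde)  # eigenvalues in ascending order
    V = eigvecs[:, :n_components]
    return V, X @ V


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 6))  # synthetic data, for illustration only
V_demo, Y_demo = onpp(X_demo)
print(V_demo.shape, Y_demo.shape)
```

Because `V` comes from a symmetric eigendecomposition its columns are orthonormal, which is the property that distinguishes ONPP from NPP. The issues listed in the abstract map onto the steps of this sketch: neighbourhood selection in step 1, the linearity assumption in step 2, and the projection in step 3.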
  • Item (Open Access)
    Locality preserving projection: a study and applications
    (Dhirubhai Ambani Institute of Information and Communication Technology, 2012) Shikkenawis, Gitam; Mitra, Suman K.
    Locality Preserving Projection (LPP) is a recently proposed approach to dimensionality reduction that preserves neighbourhood information and obtains a subspace that best detects the essential manifold structure of the data. It is currently widely used for finding the intrinsic dimensionality of data, which is usually high-dimensional. This characteristic has made LPP popular among the available dimensionality reduction approaches, such as Principal Component Analysis (PCA). A study of LPP reveals that it tries to preserve information about the nearest neighbours of data points, and may therefore lead to misclassification in the overlapping regions of two or more classes during data analysis. It has also been observed that the dimension reducibility capacity of conventional LPP is much lower than that of PCA. A new proposal called Extended LPP (ELPP), which resolves both of these issues, is introduced. In particular, a new weighting scheme is designed that gives importance to data points at a moderate distance, in addition to the nearest points. This helps to resolve the ambiguity in overlapping regions and also increases the reducibility capacity. LPP is used to reduce dimensionality in a variety of applications, one of which is face recognition. Face recognition is one of the most widely used biometric technologies for person identification. Face images are represented as high-dimensional pixel arrays and, because of the high correlation between neighbouring pixel values, they often belong to an intrinsically low-dimensional manifold. The distribution of data in a high-dimensional space is non-uniform and is generally concentrated around some kind of low-dimensional structure. Hence, one way of performing face recognition is to reduce the dimensionality of the data and find the subspace of the manifold in which face images reside. Both LPP and ELPP are used for face and expression recognition tasks.
As the aim is to separate the clusters in the embedded space, class membership information may add more discriminating power. With this in mind, the proposal is further extended to a supervised version of LPP (SLPP) that uses the known class labels of data points to enhance the discriminating power while inheriting the properties of ELPP.
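For orientation, conventional LPP (the baseline that ELPP and SLPP extend) can be sketched as below. This is only an illustration under stated assumptions: it uses the commonly described heat-kernel weights on a symmetrised k-nearest-neighbour graph, and the function name `lpp`, the kernel width `t`, the `ridge` term, and the synthetic data are all hypothetical. The ELPP weighting scheme and the SLPP use of class labels are not specified in the abstract and are not implemented here.

```python
import numpy as np


def lpp(X, n_neighbors=5, n_components=2, t=1.0, ridge=1e-8):
    """Sketch of conventional LPP. X has shape (n_samples, n_features).

    Returns (P, Y, W): projection matrix, embedded data, and graph weights.
    """
    n, d = X.shape
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)

    # Nearest-neighbour graph: only neighbouring points receive a weight
    d2_off = d2.copy()
    np.fill_diagonal(d2_off, np.inf)  # exclude self-matches
    nbrs = np.argsort(d2_off, axis=1)[:, :n_neighbors]
    adj = np.zeros((n, n), dtype=bool)
    adj[np.repeat(np.arange(n), n_neighbors), nbrs.ravel()] = True
    adj |= adj.T  # symmetrise the graph

    # Heat-kernel weights: closer neighbours get larger weights
    W = np.where(adj, np.exp(-d2 / t), 0.0)
    D = np.diag(W.sum(axis=1))
    L = D - W  # graph Laplacian

    # Generalised eigenproblem  X^T L X a = lambda X^T D X a,
    # reduced to a symmetric one through a Cholesky factor of X^T D X
    A = X.T @ L @ X
    B = X.T @ D @ X + ridge * np.eye(d)  # ridge (an assumption) for stability
    C_inv = np.linalg.inv(np.linalg.cholesky(B))
    S = C_inv @ A @ C_inv.T
    _, U = np.linalg.eigh((S + S.T) / 2.0)  # eigenvalues in ascending order
    P = C_inv.T @ U[:, :n_components]       # smallest eigenvalues preserve locality
    return P, X @ P, W


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 5))  # synthetic data, for illustration only
P_demo, Y_demo, W_demo = lpp(X_demo)
print(P_demo.shape, Y_demo.shape)
```

In this baseline, points outside the nearest-neighbour graph get zero weight, which is the behaviour the abstract identifies as a source of ambiguity in overlapping class regions. For face images, where pixels usually outnumber samples, `X.T @ D @ X` is singular, so a PCA step is commonly applied before LPP.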