Title: | Deep learning for emotion recognition |
Authors: | Mohamed Amine Mahmoudi, Author ; Boufera Fatma, Thesis supervisor |
Document type: | printed text |
Publisher: | Mascara : Université Mustapha Stambouli, 2022 |
Format: | 153 p. / Fig. / 30 cm |
Accompanying material: | 1 CD |
General note: | Doctorate |
Languages: | English |
Decimal classification: | 00 (Computer science) |
Keywords: | Facial expression recognition ; Fine-grained recognition ; Kernel function ; Deep learning |
Abstract: | Facial expression recognition (FER) is a research area that consists of classifying human emotions, from the expressions on people's faces, into one of seven basic categories: happiness, sadness, fear, disgust, anger, surprise, and neutral. FER finds applications in different fields, including security, intelligent human-computer interaction, robotics, and clinical medicine for autism, depression, pain, and mental health problems. FER is a very challenging problem because of the subtle differences between its categories. Indeed, the difference between facial expression categories lies in small, subtle areas of the facial image, such as the mouth and eyebrows. This type of problem is known within the computer vision community as fine-grained recognition. It consists of discriminating between categories that were previously considered a single category and that have only small, subtle visual differences (e.g. bird species, car models). With the resurgence of deep learning techniques, the computer vision community has witnessed an era of blossoming results. Convolutional Neural Networks (CNNs) are among the most widely used deep learning techniques and have been extremely successful in that field. However, in fine-grained recognition, CNNs do not perform as well as they do in ordinary image classification. We believe this is due to the linear kernel function that CNNs are built on. Linear kernel functions are less discriminative and fail to fit the input data, especially when the data is not linearly separable. To overcome this issue, we have incorporated more complex functions into CNNs, instead of simple linear functions, at different levels. These non-linear kernel functions are able to fit more complex input data than the linear kernel function and are thus more discriminative. These methods also have the benefit of being less memory-consuming, even though they are harder to train. 
At the pooling level, we first proposed to use bilinear and improved bilinear pooling with CNNs for FER. This framework has been evaluated on FER datasets and has shown that the use of bilinear and improved bilinear pooling with CNNs can improve the overall accuracy by nearly 3% for FER and achieve state-of-the-art results. We have also introduced a more filter-distortion-aware pooling layer based on kernel functions. The proposed pooling reduces the feature map dimensions while keeping track of the majority of the information fed to the next layer instead of ignoring part of it. The experiments on FER databases demonstrate the benefits of such a layer and show that our model achieves competitive results with respect to state-of-the-art approaches. At the fully connected layer level, we proposed a Kernelized Dense Layer (KDL), which captures higher-order feature interactions instead of conventional linear relations. The experimental results demonstrate the benefits of such a layer and show that our model achieves competitive results with respect to state-of-the-art approaches on FER datasets. To further improve CNN performance, we investigated the use of kernel functions at the different layers of the network. We carried out extensive studies of their impact on convolutional, pooling, and fully connected layers. We notice that the linear kernel may not be sufficiently effective to fit the input data distributions, whereas high-order kernels are prone to over-fitting. This leads us to conclude that a trade-off between complexity and performance should be reached. We have used combinations of our previously proposed methods on several datasets. The experiments on conventional classification datasets, i.e. MNIST, Fashion-MNIST, and CIFAR-10, show that the proposed techniques improve the performance of the network compared to classical convolution, pooling, and fully connected layers. 
Moreover, experiments on fine-grained classification, i.e. FER databases, demonstrated that the discriminative power of the network is boosted, since the proposed techniques improve its awareness of slight visual details and allow the network to reach state-of-the-art results. The extensive study described above led us to conclude that neither linear nor non-linear kernels alone are sufficient to reach the best performance without over-fitting. Thus, a combination of these methods must be used to reach the best results. We therefore proposed a combination method based on a kernel-enhanced CNN model. Our method improves the performance of a CNN without increasing either its depth or its width. It consists of expanding the linear kernel function used at different levels of a CNN. The expansion is performed by combining multiple polynomial kernels of different degrees. By doing so, we allow the network to automatically learn the suitable kernel for the specific target task. The network can use either one specific kernel or a combination of multiple kernels. In the latter case, the kernel takes the form of a Taylor series kernel. This kernel function is more sensitive to subtle details than the linear one and is able to better fit the input data. Sensitivity to subtle visual details is a key factor for better facial expression recognition. Furthermore, this method uses the same number of parameters as a convolution layer or a dense layer. The experiments conducted on FER datasets show that our method allows the network to outperform ordinary CNNs. |
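The abstract describes replacing the linear kernel with a combination of polynomial kernels of different degrees, but it does not give the exact formulation. The following is only a minimal sketch of that idea for a single neuron, assuming a polynomial kernel of the form `(w.x + b)^d`; the function names, the `coeffs` argument, and the choice of `1/d!` weights are illustrative assumptions, not the thesis's actual implementation.

```python
import math


def linear_response(weights, inputs, bias=0.0):
    """Ordinary dense-neuron pre-activation: the linear kernel w.x + b."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def polynomial_mix_response(weights, inputs, coeffs, bias=0.0):
    """Weighted sum of polynomial kernels (w.x + b)^d for d = 1..len(coeffs).

    `coeffs` is a hypothetical list of per-degree mixing weights; whether
    such weights are learned or fixed is an assumption of this sketch.
    """
    z = linear_response(weights, inputs, bias)
    return sum(a * z ** d for d, a in enumerate(coeffs, start=1))


def taylor_coeffs(order):
    """Assumed Taylor-style weights 1/d! for degrees 1..order."""
    return [1.0 / math.factorial(d) for d in range(1, order + 1)]


if __name__ == "__main__":
    w, x = [0.5, -1.0], [2.0, 0.5]
    print(linear_response(w, x))
    print(polynomial_mix_response(w, x, taylor_coeffs(3)))
```

With `coeffs = [1.0]` the mixed response reduces to the ordinary linear response, which is one way to read the abstract's statement that the network can use either one specific kernel or a combination of several.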
Copies (4)
Barcode | Call number | Medium | Location | Section | Availability |
---|---|---|---|---|---|
bc1952th | 00TH 50 | Thesis | Central library | Science and technology stacks | Open access, available |
bc1953th | 00TH 50 | Thesis | Central library | Science and technology stacks | Open access, available |
bc1974th | 00TH 50 | Thesis | Central library | Science and technology stacks | Open access, available |
bc1975th | 00TH 50 | Thesis | Central library | Science and technology stacks | Open access, available |