Title: | Deep learning for emotion recognition |
Authors: | Mohamed Amine Mahmoudi, Author; Fatma Boufera, Thesis supervisor |
Document type: | manuscript text |
Publisher: | Université Mustapha Stambouli de Mascara: Faculté des sciences exactes, 2022 |
ISBN/ISSN/EAN: | SE02150T |
Format: | 153 p. / illustrated cover / 29m. |
Accompanying material: | digital optical disc (CD-ROM) |
Languages: | English |
Abstract: |
Facial expression recognition (FER) is a research area concerned with classifying human emotions from facial expressions into one of seven basic categories: happiness, sadness, fear, disgust, anger, surprise, and neutral. FER has applications in many fields, including security, intelligent human-computer interaction, robotics, and clinical medicine (autism, depression, pain, and mental health problems). FER is a very challenging problem because the differences between its categories are subtle: they depend on small areas of the facial image, such as the mouth and eyebrows. Within the computer vision community, this type of problem is known as fine-grained recognition. It consists of discriminating between categories that were previously treated as a single category and that differ only in small, subtle visual details (e.g. bird species, car models, etc.). With the resurgence of deep learning techniques, the computer vision community has witnessed an era of blossoming results. Convolutional Neural Networks (CNNs) are among the most widely used deep learning techniques and have been extremely successful in that field. However, CNNs do not perform as well in fine-grained recognition as in ordinary image classification. We believe this is due to the linear kernel function that CNNs are built on. Linear kernel functions are less discriminative and fail to fit the input data, especially when the data is not linearly separable. To overcome this issue, we have incorporated more complex functions into CNNs, in place of simple linear functions, at different levels. These non-linear kernel functions can fit more complex input data than the linear kernel function and are therefore more discriminative. These methods also have the benefit of being less memory-consuming, even though they are harder to train. At the pooling level, we first proposed to use bilinear and improved bilinear pooling with CNNs for FER.
This framework has been evaluated on FER datasets, and the results show that using bilinear and improved bilinear pooling with CNNs can improve overall accuracy by nearly 3% for FER and achieve state-of-the-art results. We have also introduced a more filter distortion-aware pooling layer based on kernel functions. The proposed pooling reduces the feature map dimensions while retaining most of the information fed to the next layer instead of discarding part of it. Experiments on FER databases demonstrate the benefits of such a layer and show that our model achieves competitive results with respect to state-of-the-art approaches. At the fully connected level, we proposed a Kernelized Dense Layer (KDL), which captures higher-order feature interactions instead of conventional linear relations. The experimental results demonstrate the benefits of such a layer and show that our model achieves competitive results with respect to state-of-the-art approaches on FER datasets. |
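The abstract names bilinear pooling without defining it. As an illustration only, the following NumPy sketch shows the general idea as it is commonly described in the fine-grained recognition literature: channel-wise outer products of a CNN feature map are averaged over all spatial positions, then normalized. The function name `bilinear_pool`, the `matrix_sqrt` switch (a stand-in for the "improved" variant), and the normalization steps are assumptions made for this example and are not taken from the thesis.

```python
import numpy as np

def bilinear_pool(features, matrix_sqrt=False, eps=1e-12):
    """Pool an (H, W, C) feature map into a vector of length C*C.

    Hypothetical helper: a generic bilinear pooling sketch, not the
    thesis implementation.
    """
    h, w, c = features.shape
    x = features.reshape(h * w, c)          # one row per spatial position
    gram = x.T @ x / (h * w)                # averaged outer products, (C, C)

    if matrix_sqrt:
        # Assumed "improved" variant: matrix square root of the
        # symmetric positive semi-definite pooled matrix.
        vals, vecs = np.linalg.eigh(gram)
        vals = np.clip(vals, 0.0, None)     # guard against tiny negatives
        gram = (vecs * np.sqrt(vals)) @ vecs.T

    z = gram.reshape(-1)
    z = np.sign(z) * np.sqrt(np.abs(z))     # element-wise signed square root
    return z / (np.linalg.norm(z) + eps)    # L2 normalization

# Example: a random 7x7 feature map with 8 channels gives a 64-dim descriptor.
feature_map = np.random.default_rng(0).standard_normal((7, 7, 8))
descriptor = bilinear_pool(feature_map)
```

Unlike average or max pooling, which keep one value per channel, this descriptor holds one value per pair of channels, which is how it encodes second-order interactions between features.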
Copies (1)
Barcode | Call number | Medium | Location | Section | Availability |
---|---|---|---|---|---|
SE02150T | INF813 | Audiobook | Bibliothèque des Sciences Exactes | 6-Doctoral theses | On-site consultation only, not for loan |