Please use this identifier to cite or link to this item:
https://accedacris.ulpgc.es/jspui/handle/10553/168027
| Title: | An Evaluation of a Visual Question Answering Strategy for Zero-shot Facial Expression Recognition in Still Images |
| Authors: | Salas Cáceres, José Ignacio; Castrillon-Santana, Modesto; Freire-Obregon, David; Santana, Oliverio J.; Hernández-Sosa, Daniel; Lorenzo-Navarro, Javier |
| UNESCO Classification: | 1203 Computer science |
| Keywords: | FER; VLMs; VQA; Zero-shot |
| Issue Date: | 2025 |
| Conference: | International Conference on Visual Communications and Image Processing (VCIP 2025) |
| Abstract: | Facial expression recognition (FER) is a key research area in computer vision and human-computer interaction. Despite recent advances, challenges persist, especially in generalizing to new scenarios. In fact, the performance of state-of-the-art FER models drops significantly in zero-shot settings. The community has recently started to explore the integration of knowledge from Large Language Models for visual tasks. In this work, we evaluate a broad collection of Visual Language Models (VLMs), working around their lack of task-specific knowledge by adopting a Visual Question Answering strategy. We compare the proposed pipeline with state-of-the-art FER models, both integrating and excluding VLMs, on well-known FER benchmarks: AffectNet, FERPlus, and RAF-DB. The results show state-of-the-art performance for some VLMs in zero-shot FER scenarios, suggesting a research line for further exploration to improve FER generalization. |
| URI: | https://accedacris.ulpgc.es/jspui/handle/10553/168027 |
| ISSN: | 2642-9357 |
| DOI: | 10.1109/VCIP67698.2025.11396850 |
| Source: | 2025 International Conference on Visual Communications and Image Processing, VCIP [ISSN 2642-9357], (2025) |
| Appears in Collections: | Conference proceedings |
Items in accedaCRIS are protected by copyright, with all rights reserved, unless otherwise indicated.