Please use this identifier to cite or link to this item: https://accedacris.ulpgc.es/jspui/handle/10553/168027
Title: An Evaluation of a Visual Question Answering Strategy for Zero-shot Facial Expression Recognition in Still Images
Authors: Salas Cáceres, José Ignacio 
Castrillon-Santana, Modesto 
Freire-Obregon, David 
Santana, Oliverio J. 
Hernández-Sosa, Daniel 
Lorenzo-Navarro, Javier 
UNESCO Classification: 1203 Computer science
Keywords: FER
VLMs
VQA
Zero-shot
Issue Date: 2025
Conference: International Conference on Visual Communications and Image Processing (VCIP 2025) 
Abstract: Facial expression recognition (FER) is a key research area in computer vision and human-computer interaction. Despite recent advances, challenges persist, especially in generalizing to new scenarios. In fact, the performance of state-of-the-art FER models drops significantly in zero-shot settings. The community has recently started to explore the integration of knowledge from Large Language Models for visual tasks. In this work, we evaluate a broad collection of Visual Language Models (VLMs), working around their lack of task-specific knowledge by adopting a Visual Question Answering (VQA) strategy. We compare the proposed pipeline with state-of-the-art FER models, both with and without VLMs, on well-known FER benchmarks: AffectNet, FERPlus, and RAF-DB. The results show state-of-the-art performance for some VLMs in zero-shot FER scenarios, suggesting a line of research worth exploring further to improve FER generalization.
URI: https://accedacris.ulpgc.es/jspui/handle/10553/168027
ISSN: 2642-9357
DOI: 10.1109/VCIP67698.2025.11396850
Source: 2025 International Conference on Visual Communications and Image Processing, VCIP [ISSN 2642-9357] (2025)
Appears in Collections: Conference proceedings
Items in accedaCRIS are protected by copyright, with all rights reserved, unless otherwise indicated.