Persistent identifier to cite or link to this item:
https://accedacris.ulpgc.es/jspui/handle/10553/154925
| Title: | Multimodal Emotion Recognition via Multilevel Fusion of Visual, Audio, and Textual Data |
| Authors: | Salas Cáceres, José Ignacio; Lorenzo Navarro, José Javier; Castrillón Santana, Modesto Fernando |
| UNESCO classification: | 120304 Artificial intelligence |
| Keywords: | Multimodal data fusion; Emotion recognition; Biometry; Human-Machine Interaction |
| Publication date: | 2026 |
| Conference: | 23rd International Conference on Image Analysis and Processing (ICIAP2025) |
| Abstract: | Human–machine interactions are becoming increasingly common in society, making it important to improve their user experience. In this regard, an accurate emotion recognition system could substantially benefit the experience. This work presents a novel framework for multimodal emotion recognition that performs fusion at multiple levels, feature and score, to effectively combine visual, audio, and textual information. Modality-specific embeddings are extracted using VGGFace for visual data, a Wav2Vec2-Large-Robust model for audio, and BERT for text. These representations are unified via three different feature-level fusion strategies: concatenation, Embrace, and cross-attention. A subsequent score-level fusion employs an adaptive weighted sum to produce the final class probabilities. On the four-emotion classification task of the IEMOCAP dataset, our approach achieves an unweighted accuracy of 73.53%, which represents solid results comparable with some state-of-the-art baselines and demonstrates the added value of visual cues. Our experiments also analyze the impact of fusion and pooling choices, providing insights for future multimodal systems. |
| URI: | https://accedacris.ulpgc.es/jspui/handle/10553/154925 |
| ISBN: | 978-3-032-10191-4 |
| ISSN: | 0302-9743 |
| DOI: | 10.1007/978-3-032-10192-1_45 |
| Collection: | Conference proceedings |
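
The abstract mentions a score-level fusion that combines class probabilities through an adaptive weighted sum. The record gives no implementation details, so the following is only a minimal sketch of that general idea under stated assumptions: each feature-level fusion branch is assumed to output a probability vector over the four emotion classes, and the "adaptive" weights are assumed to be free parameters normalized with a softmax. The function names `softmax` and `fuse_scores` and all numeric values are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Normalize arbitrary real-valued scores into weights that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(branch_probs, weight_logits):
    """Weighted sum of per-branch class probabilities.

    branch_probs: one probability vector per branch (assumed layout).
    weight_logits: one unnormalized weight per branch; in a trained system
        these would be learned, here they are supplied by hand (assumption).
    """
    weights = softmax(weight_logits)
    n_classes = len(branch_probs[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, branch_probs))
        for c in range(n_classes)
    ]


# Hypothetical outputs of three branches for a single utterance.
branch_probs = [
    [0.7, 0.1, 0.1, 0.1],
    [0.4, 0.3, 0.2, 0.1],
    [0.1, 0.6, 0.2, 0.1],
]
weight_logits = [0.0, 0.0, 0.0]  # equal logits give equal weights

fused = fuse_scores(branch_probs, weight_logits)
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

Because the weights are non-negative and sum to 1, the fused vector is itself a valid probability distribution whenever every branch output is one. With equal logits the fusion reduces to a plain average of the branches.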
Items in ULPGC accedaCRIS are protected by copyright, with all rights reserved, unless otherwise indicated.