Persistent identifier to cite or link to this item: https://accedacris.ulpgc.es/jspui/handle/10553/154925
DC Field | Value | Language
dc.contributor.author | Salas Cáceres, José Ignacio | en_US
dc.contributor.author | Lorenzo Navarro, José Javier | en_US
dc.contributor.author | Castrillón Santana, Modesto Fernando | en_US
dc.date.accessioned | 2026-01-13T09:59:17Z | -
dc.date.available | 2026-01-13T09:59:17Z | -
dc.date.issued | 2026 | en_US
dc.identifier.isbn | 978-3-032-10191-4 | en_US
dc.identifier.issn | 0302-9743 | en_US
dc.identifier.uri | https://accedacris.ulpgc.es/jspui/handle/10553/154925 | -
dc.description.abstract | Human–machine interactions are becoming increasingly common in society, making it important to improve their user experience. In this regard, an accurate emotion recognition system could substantially benefit the experience. This work presents a novel framework for multimodal emotion recognition that performs fusion at multiple levels, feature and score, to effectively combine visual, audio, and textual information. Modality-specific embeddings are extracted using VGGFace for visual data, a Wav2Vec2-Large-Robust model for audio, and BERT for text. These representations are unified via three different feature-level fusion strategies: concatenation, Embrace, and cross-attention. A subsequent score-level fusion employs an adaptive weighted sum to produce the final class probabilities. On the four-emotion classification task of the IEMOCAP dataset, our approach achieves an unweighted accuracy of 73.53%, which represents solid results comparable with some state-of-the-art baselines and demonstrates the added value of visual cues. Our experiments also analyze the impact of fusion and pooling choices, providing insights for future multimodal systems. | en_US
dc.language | eng | en_US
dc.subject | 120304 Artificial intelligence | en_US
dc.subject.other | Multimodal data fusion | en_US
dc.subject.other | Emotion recognition | en_US
dc.subject.other | Biometry | en_US
dc.subject.other | Human-Machine Interaction | en_US
dc.title | Multimodal Emotion Recognition via Multilevel Fusion of Visual, Audio, and Textual Data | en_US
dc.type | book_content | en_US
dc.relation.conference | 23rd International Conference on Image Analysis and Processing (ICIAP2025) | en_US
dc.identifier.doi | 10.1007/978-3-032-10192-1_45 | en_US
dc.investigacion | Engineering and Architecture | en_US
dc.type2 | Article | en_US
dc.utils.revision | | en_US
dc.identifier.ulpgc | | en_US
dc.contributor.buulpgc | BU-INF | en_US
item.fulltext | No full text | -
item.grantfulltext | none | -
crisitem.author.dept | GIR SIANI: Inteligencia Artificial, Robótica y Oceanografía Computacional | -
crisitem.author.dept | IU de Sistemas Inteligentes y Aplicaciones Numéricas en Ingeniería | -
crisitem.author.dept | GIR SIANI: Inteligencia Artificial, Robótica y Oceanografía Computacional | -
crisitem.author.dept | IU de Sistemas Inteligentes y Aplicaciones Numéricas en Ingeniería | -
crisitem.author.dept | Departamento de Informática y Sistemas | -
crisitem.author.dept | GIR SIANI: Inteligencia Artificial, Robótica y Oceanografía Computacional | -
crisitem.author.dept | IU de Sistemas Inteligentes y Aplicaciones Numéricas en Ingeniería | -
crisitem.author.dept | Departamento de Informática y Sistemas | -
crisitem.author.orcid | 0009-0004-7543-3385 | -
crisitem.author.orcid | 0000-0002-2834-2067 | -
crisitem.author.orcid | 0000-0002-8673-2725 | -
crisitem.author.parentorg | IU de Sistemas Inteligentes y Aplicaciones Numéricas en Ingeniería | -
crisitem.author.parentorg | IU de Sistemas Inteligentes y Aplicaciones Numéricas en Ingeniería | -
crisitem.author.parentorg | IU de Sistemas Inteligentes y Aplicaciones Numéricas en Ingeniería | -
crisitem.author.fullName | Salas Cáceres, José Ignacio | -
crisitem.author.fullName | Lorenzo Navarro, José Javier | -
crisitem.author.fullName | Castrillón Santana, Modesto Fernando | -
Collection: Conference proceedings
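
The abstract describes a score-level fusion that combines class probabilities through an adaptive weighted sum. The record does not say how the weights are obtained, so the following is only a minimal sketch under stated assumptions: each branch is assumed to output a probability vector over the four emotion classes, and the weights are assumed to be per-branch logits normalized with a softmax so that the fused output remains a probability distribution. The names `fuse_scores` and `weight_logits` and the numeric values are hypothetical and do not come from the paper.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(branch_probs, weight_logits):
    """Weighted sum of per-branch class probabilities.

    branch_probs: one probability vector per branch (assumed input shape).
    weight_logits: one scalar per branch; in a trained system these would be
    learned, here they are supplied by the caller (assumption).
    """
    if len(branch_probs) != len(weight_logits):
        raise ValueError("need exactly one weight per branch")
    weights = softmax(weight_logits)  # non-negative, sums to 1
    fused = [0.0] * len(branch_probs[0])
    for weight, probs in zip(weights, branch_probs):
        for cls, p in enumerate(probs):
            fused[cls] += weight * p
    return fused


# Illustrative numbers only: three branches, four emotion classes.
branch_probs = [
    [0.7, 0.1, 0.1, 0.1],
    [0.4, 0.3, 0.2, 0.1],
    [0.1, 0.6, 0.2, 0.1],
]
equal_fused = fuse_scores(branch_probs, [0.0, 0.0, 0.0])  # equal weights
skewed_fused = fuse_scores(branch_probs, [2.0, 0.0, 0.0])  # favors branch 0
```

With equal logits the fusion reduces to a plain average of the branch outputs; raising one logit shifts the fused distribution toward that branch while the result still sums to one.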
Items in ULPGC accedaCRIS are protected by copyright, with all rights reserved, unless otherwise indicated.