Persistent identifier to cite or link to this item: http://hdl.handle.net/10553/131973
DC Field | Value | Language
dc.contributor.author | Sánchez-Nielsen, Elena | en_US
dc.contributor.author | Castrillón-Santana, Modesto | en_US
dc.contributor.author | Freire Obregón, David Sebastián | en_US
dc.contributor.author | Santana Jaria, Oliverio Jesús | en_US
dc.contributor.author | Hernández-Sosa, Daniel | en_US
dc.contributor.author | Lorenzo-Navarro, Javier | en_US
dc.date.accessioned | 2024-07-01T08:46:44Z | -
dc.date.available | 2024-07-01T08:46:44Z | -
dc.date.issued | 2024 | en_US
dc.identifier.issn | 2661-8907 | en_US
dc.identifier.other | Scopus | -
dc.identifier.uri | http://hdl.handle.net/10553/131973 | -
dc.description.abstract | Pedestrian Attribute Recognition (PAR) poses a significant challenge in developing automatic systems that enhance visual surveillance and human interaction. In this study, we investigate using Visual Question Answering (VQA) models to address the zero-shot PAR problem. Inspired by the impressive results achieved by a zero-shot VQA strategy during the PAR Contest at the 20th International Conference on Computer Analysis of Images and Patterns in 2023, we conducted a comparative study across three state-of-the-art VQA models, two of them based on BLIP-2 and the third one based on the Plug-and-Play VQA framework. Our analysis focuses on performance, robustness, contextual question handling, processing time, and classification errors. Our findings demonstrate that both BLIP-2-based models are better suited for PAR, with nuances related to the adopted frozen Large Language Model. Specifically, the Open Pre-trained Transformers based model performs well in benchmark color estimation tasks, while FLANT5XL provides better results for the considered binary tasks. In summary, zero-shot PAR based on VQA models offers highly competitive results, with the advantage of avoiding training costs associated with multipurpose classifiers. | en_US
dc.language | eng | en_US
dc.relation.ispartof | SN Computer Science | en_US
dc.source | SN Computer Science [ISSN 2661-8907], v. 5, (680), (June 2024) | en_US
dc.subject | 120304 Artificial intelligence | en_US
dc.subject.other | Pedestrian attribute recognition | en_US
dc.subject.other | Biometrics | en_US
dc.subject.other | Vision language models | en_US
dc.subject.other | Visual question answering | en_US
dc.title | Visual Question Answering Models for Zero-Shot Pedestrian Attribute Recognition: A Comparative Study | en_US
dc.type | info:eu-repo/semantics/article | en_US
dc.type | Article | en_US
dc.identifier.doi | 10.1007/s42979-024-02985-0 | en_US
dc.identifier.scopus | 85197373236 | -
dc.contributor.orcid | 0000-0002-8673-2725 | -
dc.contributor.orcid | NO DATA | -
dc.contributor.orcid | NO DATA | -
dc.contributor.orcid | NO DATA | -
dc.contributor.orcid | NO DATA | -
dc.contributor.orcid | NO DATA | -
dc.contributor.authorscopusid | 57218418238 | -
dc.contributor.authorscopusid | 13105159100 | -
dc.contributor.authorscopusid | 23396618800 | -
dc.contributor.authorscopusid | 7003605046 | -
dc.contributor.authorscopusid | 6507124168 | -
dc.contributor.authorscopusid | 15042453800 | -
dc.identifier.eissn | 2661-8907 | -
dc.identifier.issue | 6 | -
dc.relation.volume | 5 | en_US
dc.investigacion | Engineering and Architecture | en_US
dc.type2 | Article | en_US
dc.description.numberofpages | 13 | en_US
dc.utils.revision |  | en_US
dc.date.coverdate | June 2024 | en_US
dc.identifier.ulpgc |  | en_US
dc.contributor.buulpgc | BU-INF | en_US
dc.description.sjr | 0.721 | -
dc.description.sjrq | Q2 | -
item.grantfulltext | open | -
item.fulltext | With full text | -
crisitem.author.dept | GIR SIANI: Inteligencia Artificial, Robótica y Oceanografía Computacional | -
crisitem.author.dept | IU Sistemas Inteligentes y Aplicaciones Numéricas | -
crisitem.author.dept | Departamento de Informática y Sistemas | -
crisitem.author.dept | GIR SIANI: Inteligencia Artificial, Robótica y Oceanografía Computacional | -
crisitem.author.dept | IU Sistemas Inteligentes y Aplicaciones Numéricas | -
crisitem.author.dept | Departamento de Informática y Sistemas | -
crisitem.author.dept | GIR SIANI: Inteligencia Artificial, Robótica y Oceanografía Computacional | -
crisitem.author.dept | IU Sistemas Inteligentes y Aplicaciones Numéricas | -
crisitem.author.dept | Departamento de Informática y Sistemas | -
crisitem.author.dept | GIR SIANI: Inteligencia Artificial, Robótica y Oceanografía Computacional | -
crisitem.author.dept | IU Sistemas Inteligentes y Aplicaciones Numéricas | -
crisitem.author.dept | Departamento de Informática y Sistemas | -
crisitem.author.dept | GIR SIANI: Inteligencia Artificial, Robótica y Oceanografía Computacional | -
crisitem.author.dept | IU Sistemas Inteligentes y Aplicaciones Numéricas | -
crisitem.author.dept | Departamento de Informática y Sistemas | -
crisitem.author.orcid | 0000-0002-8673-2725 | -
crisitem.author.orcid | 0000-0003-2378-4277 | -
crisitem.author.orcid | 0000-0001-7511-5783 | -
crisitem.author.orcid | 0000-0003-3022-7698 | -
crisitem.author.orcid | 0000-0002-2834-2067 | -
crisitem.author.parentorg | IU Sistemas Inteligentes y Aplicaciones Numéricas | -
crisitem.author.parentorg | IU Sistemas Inteligentes y Aplicaciones Numéricas | -
crisitem.author.parentorg | IU Sistemas Inteligentes y Aplicaciones Numéricas | -
crisitem.author.parentorg | IU Sistemas Inteligentes y Aplicaciones Numéricas | -
crisitem.author.parentorg | IU Sistemas Inteligentes y Aplicaciones Numéricas | -
crisitem.author.fullName | Castrillón Santana, Modesto Fernando | -
crisitem.author.fullName | Freire Obregón, David Sebastián | -
crisitem.author.fullName | Santana Jaria, Oliverio Jesús | -
crisitem.author.fullName | Hernández Sosa, José Daniel | -
crisitem.author.fullName | Lorenzo Navarro, José Javier | -
Collection: Articles
Adobe PDF (1.73 MB)
Items in ULPGC accedaCRIS are protected by copyright, with all rights reserved, unless otherwise indicated.