Persistent identifier to cite or link to this item: https://accedacris.ulpgc.es/jspui/handle/10553/149206
DC Field | Value | Language
dc.contributor.author | Salas Cáceres, José Ignacio | -
dc.date.accessioned | 2025-10-02T17:48:43Z | -
dc.date.available | 2025-10-02T17:48:43Z | -
dc.date.issued | 2025 | -
dc.identifier.isbn | 978-3-032-04967-4 | -
dc.identifier.issn | 0302-9743 | -
dc.identifier.other | Scopus | -
dc.identifier.uri | https://accedacris.ulpgc.es/jspui/handle/10553/149206 | -
dc.description.abstract | Pedestrian Attribute Recognition (PAR) plays a key role in surveillance scenarios where classical biometric traits, such as facial features, are often unavailable due to low image quality, occlusions, or variable conditions. By extracting soft biometric attributes, such as gender, clothing type, and carried objects, PAR provides essential contextual information that can support tasks like person re-identification and behavior analysis. In this work, a novel approach is proposed based on Visual Question Answering (VQA) models, which avoids the limitations of supervised learning methods by leveraging general-purpose models without the need for additional training. This extends the PAR2023-winning strategy by introducing two state-of-the-art models, PaliGemma 1 and PaliGemma 2, along with a refined set of attribute-specific questions and an innovative fusion mechanism that combines both models’ strengths. Experimental results on the PAR2025 dataset demonstrate that the proposed system surpasses previous methods, achieving a mean accuracy of 95.4% on the private set, outranking previous approaches on this task. | -
dc.language | eng | -
dc.publisher | Springer | -
dc.relation | Interacción y Re-Identificación de Personas Mediante Machine Learning, Deep Learning y Análisis de Datos Multimodal: Hacia Una Comunicación Más Natural en la Robótica Social | -
dc.relation.ispartof | Lecture Notes in Computer Science | -
dc.source | Computer Analysis of Images and Patterns. CAIP 2025. Lecture Notes in Computer Science, vol. 15621, p. 16–26. Springer, Cham. | -
dc.subject | 120304 Artificial intelligence | -
dc.subject.other | Contest | -
dc.subject.other | Pedestrian Attribute Recognition | -
dc.subject.other | Vision Language Model | -
dc.subject.other | Visual Question Answering | -
dc.title | Leveraging Generalist VQA Models to Improve Zero-Shot Pedestrian Attribute Recognition | -
dc.type | book_content | -
dc.relation.conference | 21st International Conference in Computer Analysis of Images and Patterns (CAIP 2025) | -
dc.identifier.doi | 10.1007/978-3-032-04968-1_2 | -
dc.identifier.scopus | 105017376735 | -
dc.contributor.orcid | 0009-0004-7543-3385 | -
dc.contributor.authorscopusid | 58745737800 | -
dc.identifier.eissn | 1611-3349 | -
dc.description.lastpage | 26 | -
dc.description.firstpage | 16 | -
dc.relation.volume | 15621 | -
dc.investigacion | Engineering and Architecture | -
dc.type2 | Conference proceedings | -
dc.identifier.eisbn | 978-3-032-04968-1 | -
dc.utils.revision | | -
dc.date.coverdate | September 2025 | -
dc.identifier.conferenceid | events156046 | -
dc.identifier.ulpgc | | -
dc.contributor.buulpgc | BU-INF | -
dc.description.sjr | 0.606 | -
dc.description.sjrq | Q2 | -
dc.description.miaricds | 10.0 | -
item.fulltext | No full text | -
item.grantfulltext | none | -
crisitem.author.dept | GIR SIANI: Inteligencia Artificial, Robótica y Oceanografía Computacional | -
crisitem.author.dept | IU Sistemas Inteligentes y Aplicaciones Numéricas | -
crisitem.author.orcid | 0009-0004-7543-3385 | -
crisitem.author.parentorg | IU Sistemas Inteligentes y Aplicaciones Numéricas | -
crisitem.author.fullName | Salas Cáceres, José Ignacio | -
crisitem.event.eventsstartdate | 22-09-2025 | -
crisitem.event.eventsenddate | 25-09-2025 | -
crisitem.project.principalinvestigator | Castrillón Santana, Modesto Fernando | -
Collection: Conference proceedings

Items in ULPGC accedaCRIS are protected by copyright, with all rights reserved, unless otherwise indicated.