Persistent identifier to cite or link to this item:
https://accedacris.ulpgc.es/jspui/handle/10553/143106
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | Hernández Tejera, Francisco Mario | - |
| dc.contributor.advisor | Hernández Cabrera, José Juan | - |
| dc.contributor.author | Cárdenes Pérez, Ricardo Juan | - |
| dc.date.accessioned | 2025-07-20T20:01:29Z | - |
| dc.date.available | 2025-07-20T20:01:29Z | - |
| dc.date.issued | 2025 | en_US |
| dc.identifier.other | Academic management | |
| dc.identifier.uri | https://accedacris.ulpgc.es/handle/10553/143106 | - |
| dc.description.abstract | In recent years, there has been significant interest in enabling deep learning models to jointly process and interpret multiple data modalities, such as text and images. To achieve this, many modern architectures project these data into a shared latent space, where representations of equivalent concepts, although expressed in different forms, are expected to be semantically aligned. This cross-modal alignment is fundamental for tasks such as cross-modal generation and for obtaining a more comprehensive model of the world around us. This work specifically addresses the problem of aligning latent representations between the text and image modalities, proposing an approach based on jointly trained variational autoencoders. To this end, a reformulation of sequence-to-sequence models is proposed that allows them to be integrated within a variational framework by incorporating latent variables capable of modeling the inherent uncertainty of natural language. In particular, this is done for three recurrent architectures widely used in text modeling: LSTM, GRU, and xLSTM. These architectures are integrated into a joint model with image autoencoders, sharing a common latent space where representations from both modalities converge. Inspired by the theoretical framework of the MVAE, a similar notation and formulation are adopted to ensure conceptual compatibility, although an alternative architecture better suited to sequential tasks is proposed. As part of the work, an extensible experimentation framework has been developed, designed to facilitate scientific collaboration and a more practical, formal, and reproducible approach. This framework allows experiments to be defined and executed in a structured manner, accelerating the incorporation of new architectures and multimodal configurations. Experiments are conducted on the MNIST, FashionMNIST, and CelebA datasets. The source code of the experiments, as well as the framework, is available at: https://github.com/ricardocardn/Re-MVAE. | en_US |
| dc.language | spa | en_US |
| dc.subject | 120317 Computer science | en_US |
| dc.title | Estudio acerca del entrenamiento con alineamiento de representaciones multimodales en el espacio latente | en_US |
| dc.type | info:eu-repo/semantics/bachelorThesis | en_US |
| dc.type | BachelorThesis | en_US |
| dc.contributor.departamento | Departamento de Informática y Sistemas | en_US |
| dc.contributor.facultad | Escuela de Ingeniería Informática | en_US |
| dc.investigacion | Engineering and Architecture | en_US |
| dc.type2 | Bachelor's thesis | en_US |
| dc.utils.revision | Yes | en_US |
| dc.identifier.matricula | TFT-36361 | |
| dc.identifier.ulpgc | Yes | en_US |
| dc.contributor.buulpgc | BU-INF | en_US |
| dc.contributor.titulacion | Grado en Ciencia e Ingeniería de Datos | |
| item.fulltext | No full text | - |
| item.grantfulltext | none | - |
| crisitem.advisor.dept | GIR SIANI: Inteligencia Artificial, Redes Neuronales, Aprendizaje Automático e Ingeniería de Datos | - |
| crisitem.advisor.dept | IU de Sistemas Inteligentes y Aplicaciones Numéricas en Ingeniería | - |
| crisitem.advisor.dept | Departamento de Informática y Sistemas | - |
| crisitem.advisor.dept | GIR SIANI: Inteligencia Artificial, Redes Neuronales, Aprendizaje Automático e Ingeniería de Datos | - |
| crisitem.advisor.dept | IU de Sistemas Inteligentes y Aplicaciones Numéricas en Ingeniería | - |
| crisitem.advisor.dept | Departamento de Informática y Sistemas | - |
| Collection: | Bachelor's thesis | |
Items in ULPGC accedaCRIS are protected by copyright, with all rights reserved, unless otherwise indicated.