Persistent identifier for citing or linking to this item: https://accedacris.ulpgc.es/jspui/handle/10553/150726
DC Field | Value | Language
dc.contributor.author | Estupiñán Ojeda, Cristian David | en_US
dc.contributor.author | Sandomingo-Freire, Raul J. | en_US
dc.contributor.author | Padró, Lluís | en_US
dc.contributor.author | Turmo, Jordi | en_US
dc.date.accessioned | 2025-10-28T19:55:25Z | -
dc.date.available | 2025-10-28T19:55:25Z | -
dc.date.issued | 2025 | en_US
dc.identifier.issn | 2574-2531 | en_US
dc.identifier.other | WoS | -
dc.identifier.uri | https://accedacris.ulpgc.es/jspui/handle/10553/150726 | -
dc.description.abstract | Objectives: Joint recognition and ICD-10 linking of diagnoses in bilingual, non-standard Spanish and Catalan primary care notes is challenging. We evaluate parameter-efficient fine-tuning (PEFT) techniques as a resource-conscious alternative to full fine-tuning (FFT) for multi-label clinical text classification. Materials and Methods: On a corpus of 21 812 Catalan and Spanish clinical notes from Catalonia, we compared the PEFT techniques LoRA, DoRA, LoHA, LoKR, and QLoRA applied to multilingual transformers (BERT, RoBERTa, DistilBERT, and mDeBERTa). Results: FFT delivered the best strict Micro-F1 (63.0), but BERT-QLoRA scored 62.2, only 0.8 points lower, while reducing trainable parameters by 67.5% and memory by 33.7%. Training on combined bilingual data consistently improved generalization across individual languages. Discussion: The small FFT margin was confined to rare labels, indicating limited benefit from updating all parameters. Among PEFT techniques, QLoRA offered the strongest accuracy-efficiency balance; LoRA and DoRA were competitive, whereas LoHA and LoKR incurred larger losses. Adapter rank mattered: ranks below 128 sharply degraded Micro-F1. The substantial memory savings enable deployment on commodity GPUs while delivering performance very close to FFT. Conclusion: PEFT, particularly QLoRA, supports accurate and memory-efficient joint entity recognition and ICD-10 linking in multilingual, low-resource clinical settings. Primary care providers often rely on Non-Standard Clinical Notes, which are written in free text and may combine multiple languages such as Spanish and Catalan. These notes capture important details about patients but are difficult for computers to interpret. Automatically linking them to diagnostic codes such as the International Classification of Diseases, 10th Revision (ICD-10), could help clinicians document care more efficiently and consistently. Traditional approaches for this task use large models that must be fully retrained. This process is accurate but requires powerful computers and significant memory, which are rarely available in smaller clinics. In this study, we explored lighter training strategies that adjust only small parts of the models instead of all their internal weights. We tested these approaches on a realistic bilingual dataset of Non-Standard Clinical Notes. Our results show that these lighter methods achieve accuracy close to full model training while using far less computing power and memory. Training with bilingual notes further improved performance. These findings suggest that accurate automatic coding of Non-Standard Clinical Notes is possible even in low-resource primary care settings, opening the way for practical and affordable use of artificial intelligence tools in everyday healthcare. | en_US
dc.language | eng | en_US
dc.relation.ispartof | JAMIA Open | en_US
dc.source | JAMIA Open, v. 8 (5), (October 2025) | en_US
dc.subject | 120317 Computer science | en_US
dc.subject.other | Natural Language Processing | en_US
dc.subject.other | Joint Entity Recognition And Linking | en_US
dc.subject.other | ICD-10 Codes | en_US
dc.subject.other | Parameter-Efficient Fine-Tuning | en_US
dc.title | High-fidelity parameter-efficient fine-tuning for joint recognition and linking of diagnoses to ICD-10 in non-standard primary care notes | en_US
dc.type | info:eu-repo/semantics/Article | en_US
dc.type | Article | en_US
dc.identifier.doi | 10.1093/jamiaopen/ooaf120 | en_US
dc.identifier.isi | 001594748200001 | -
dc.identifier.eissn | 2574-2531 | -
dc.identifier.issue | 5 | -
dc.relation.volume | 8 | en_US
dc.investigacion | Engineering and Architecture | en_US
dc.type2 | Article | en_US
dc.contributor.daisngid | No ID | -
dc.contributor.daisngid | No ID | -
dc.contributor.daisngid | No ID | -
dc.contributor.daisngid | No ID | -
dc.description.numberofpages | 9 | en_US
dc.utils.revision |  | en_US
dc.contributor.wosstandard | WOS:Estupiñán-Ojeda, C | -
dc.contributor.wosstandard | WOS:Sandomingo-Freire, RJ | -
dc.contributor.wosstandard | WOS:Padró, L | -
dc.contributor.wosstandard | WOS:Turmo, J | -
dc.date.coverdate | October 2025 | en_US
dc.identifier.ulpgc |  | en_US
dc.contributor.buulpgc | BU-INF | en_US
dc.description.sjr | 0.824 |
dc.description.sjrq | Q2 |
item.fulltext | With full text | -
item.grantfulltext | open | -
crisitem.author.dept | GIR SIANI: Inteligencia Artificial, Redes Neuronales, Aprendizaje Automático e Ingeniería de Datos | -
crisitem.author.dept | IU Sistemas Inteligentes y Aplicaciones Numéricas | -
crisitem.author.dept | Departamento de Informática y Sistemas | -
crisitem.author.parentorg | IU Sistemas Inteligentes y Aplicaciones Numéricas | -
crisitem.author.fullName | Estupiñán Ojeda, Cristian David | -
Collection: Articles
Adobe PDF (978.23 kB)
Items in ULPGC accedaCRIS are protected by copyright, with all rights reserved, unless otherwise indicated.