Persistent identifier for citing or linking this item: https://accedacris.ulpgc.es/jspui/handle/10553/163430
Title: Small Language Models for Legislative Summarization: An Empirical Evaluation of Performance and Suitability
Authors: Medina Ramírez, Miguel Ángel
Estupiñán Ojeda, Cristian David 
Torres Rodríguez, Victoria 
Sánchez-Nielsen, Elena
Guerra Artal, Cayetano 
Hernández Tejera, Francisco Mario 
UNESCO classification: 33 Technological sciences
Keywords: Small language models
long document summarization
normative text summarization
parliamentary debate summarization
legislative natural language processing
Publication date: 2026
Journal: IEEE Access
Abstract: Parliamentary institutions generate extensive, domain-specific legislative documents, including normative texts and parliamentary debate transcripts. These documents differ in content and linguistic complexity, making automatic summarization essential for producing coherent summaries aligned with institutional standards. While large language models (LLMs) achieve high summarization quality, their computational requirements limit deployment in parliamentary and public-sector environments. In contrast, small language models (SLMs) offer a more resource-efficient alternative, but their capabilities and performance relative to LLMs, extractive methods, and other SLMs remain underexplored. In this work, we present the first comprehensive evaluation of SLMs for legislative summarization, assessing their effectiveness across document types and languages. We use two complementary datasets: EUR-LexSum, a multilingual corpus of normative texts covering six European languages, and ParcanDeb-Sum, a Spanish dataset of parliamentary debate records aligned with expert-written summaries. Summary quality is evaluated through a three-tier framework combining automatic metrics (ROUGE and BERTScore), LLM-based qualitative assessment, and expert-guided evaluation formalizing parliamentary debate summarization criteria. Our results show that: 1) instruction-tuned SLMs consistently outperform extractive baselines and, in several settings, rival LLMs with seven to eight billion parameters; 2) performance differs by document type, with fine-tuning being critical for debate transcripts, whereas instruction tuning alone suffices for normative texts; and 3) for normative texts, SLMs establish a new benchmark for multilingual performance, while for parliamentary debates, fine-tuned SLMs achieve performance comparable to domain experts.
These findings provide empirical evidence that high-quality legislative summarization can be achieved with SLMs, offering actionable guidance for selecting models that balance performance with computational constraints.
URI: https://accedacris.ulpgc.es/jspui/handle/10553/163430
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2026.3679718
Collection: Articles
Adobe PDF (4.8 MB)
Items in ULPGC accedaCRIS are protected by copyright, with all rights reserved, unless otherwise indicated.