Persistent identifier to cite or link this item:
https://accedacris.ulpgc.es/jspui/handle/10553/163430
| Title: | Small Language Models for Legislative Summarization: An Empirical Evaluation of Performance and Suitability |
| Authors: | Medina Ramírez, Miguel Ángel; Estupiñán Ojeda, Cristian David; Torres Rodríguez, Victoria; Sánchez-Nielsen, Elena; Guerra Artal, Cayetano; Hernández Tejera, Francisco Mario |
| UNESCO classification: | 33 Technological sciences |
| Keywords: | Small language models; long document summarization; normative text summarization; parliamentary debate summarization; legislative natural language processing |
| Publication date: | 2026 |
| Journal: | IEEE Access |
| Abstract: | Parliamentary institutions generate extensive, domain-specific legislative documents, including normative texts and parliamentary debate transcripts. These documents differ in content and linguistic complexity, making automatic summarization essential for producing coherent summaries aligned with institutional standards. While large language models (LLMs) achieve high summarization quality, their computational requirements limit deployment in parliamentary and public-sector environments. In contrast, small language models (SLMs) offer a more resource-efficient alternative, but their capabilities and performance relative to LLMs, extractive methods, and other SLMs remain underexplored. In this work, we present the first comprehensive evaluation of SLMs for legislative summarization, assessing their effectiveness across document types and languages. We use two complementary datasets: EUR-LexSum, a multilingual corpus of normative texts covering six European languages, and ParcanDeb-Sum, a Spanish dataset of parliamentary debate records aligned with expert-written summaries. Summary quality is evaluated through a three-tier framework combining automatic metrics (ROUGE and BERTScore), LLM-based qualitative assessment, and expert-guided evaluation formalizing parliamentary debate summarization criteria. Our results show that: 1) instruction-tuned SLMs consistently outperform extractive baselines and, in several settings, rival LLMs with seven to eight billion parameters; 2) performance differs by document type, with fine-tuning being critical for debate transcripts, whereas instruction-tuning alone suffices for normative texts; and 3) for normative texts, SLMs establish a new benchmark for multilingual performance, while for parliamentary debates, fine-tuned SLMs achieve performance comparable to domain experts.
These findings provide empirical evidence that high-quality legislative summarization can be achieved with SLMs, offering actionable guidance for selecting models that balance performance with computational constraints. |
| URI: | https://accedacris.ulpgc.es/jspui/handle/10553/163430 |
| ISSN: | 2169-3536 |
| DOI: | 10.1109/ACCESS.2026.3679718 |
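The first tier of the evaluation framework described in the abstract relies on automatic overlap metrics such as ROUGE. As a minimal illustration of what such a metric computes (a simplified sketch, not the paper's actual evaluation pipeline, which would use standard implementations of ROUGE and BERTScore), the following computes a unigram-overlap ROUGE-1 F1 score between a reference summary and a candidate summary:

```python
from collections import Counter

def rouge1_f1(reference: str, candidate: str) -> float:
    """ROUGE-1 F1: harmonic mean of unigram precision and recall,
    using clipped counts (each reference token matches at most once)."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Intersection of multisets clips each token's count to the minimum.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    recall = overlap / sum(ref_counts.values())
    precision = overlap / sum(cand_counts.values())
    return 2 * precision * recall / (precision + recall)

# Illustrative (invented) example sentences:
ref = "the committee approved the amendment to the budget law"
cand = "the committee approved the budget amendment"
score = rouge1_f1(ref, cand)  # 0.8
```

In practice, ROUGE variants also consider bigrams (ROUGE-2) and longest common subsequences (ROUGE-L), and BERTScore replaces exact token matching with contextual-embedding similarity, which is why the paper reports both families of metrics.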
| Collection: | Articles |
Items in ULPGC accedaCRIS are protected by copyright, with all rights reserved, unless otherwise indicated.