Please use this identifier to cite or link to this item: http://hdl.handle.net/10553/135723
Title: Estilometría TIP: enhanced text analysis tool with customisable metrics for Spanish texts
Authors: Carreras-Riudavets, Francisco J. 
Hernández-Figueroa, Zenón 
UNESCO Classification: 57 Linguistics
Keywords: Computational Linguistics
Computer Science (General)
Information Technology
Language & Linguistics
Literature, et al
Issue Date: 2025
Journal: Cogent Arts and Humanities 
Abstract: Stylometric analysis is a tool used across the social sciences and humanities, aiding disciplines such as education, psychology, history, anthropology, and linguistics. However, most tools are developed for English, limiting their effectiveness for Spanish texts, which involve complex inflections. This paper addresses this gap by introducing Estilometría TIP, a web-based tool specifically designed for the stylometric analysis of Spanish texts. Estilometría TIP overcomes the challenges posed by Spanish’s inflected forms through two primary functionalities. First, it offers customizable metrics: researchers can define and compute their own metrics using a configuration file, allowing them to tailor their analyses to specific research needs across different fields. This feature dynamically adjusts the user interface, adding or modifying menus to facilitate seamless exploration of customized results. Second, Estilometría TIP incorporates Lexicon TIP, a lexical recognition service for Spanish with an accuracy of over 99.8%. Lexicon TIP draws on a comprehensive database of more than 320,000 lemmas and 8 million inflected forms, accounting for variations in number, gender, superlatives, diminutives, augmentatives, derogatory terms, and verb conjugations. Two key algorithms enhance this functionality: prefix detection, which accurately identifies prefixed words (e.g. ‘predeterminar’), and enclitic pronoun identification, which handles verb forms combined with enclitic pronouns (e.g. ‘comiéndotelas’).
URI: http://hdl.handle.net/10553/135723
ISSN: 2331-1983
DOI: 10.1080/23311983.2025.2451513
Source: Cogent Arts and Humanities [EISSN 2331-1983], v. 12 (1), (January 2025)
Appears in Collections: Articles

Items in accedaCRIS are protected by copyright, with all rights reserved, unless otherwise indicated.