Persistent identifier for citing or linking to this item: http://hdl.handle.net/10553/43489
Title: Integration of an XML electronic dictionary with linguistic tools for natural language processing
Authors: Santana Suárez, Octavio
Carreras Riudavets, Francisco J. 
Hernandez Figueroa, Zenon 
González Cabrera, Antonio C.
UNESCO classification: 570104 Computational linguistics
Keywords: Encoding
Dictionary
XML
Computational linguistics
Publication date: 2007
Journal: Information Processing and Management
Abstract: This study proposes the codification of lexical information in electronic dictionaries, in accordance with a generic and extendable XML scheme model, and its conjunction with linguistic tools for the processing of natural language. Our approach is different from other similar studies in that we propose XML coding of those items from a dictionary of meanings that are less related to the lexical units. Linguistic information, such as morphology, syllables, phonology, etc., will be included by means of specific linguistic tools. The use of XML as a container for the information allows the use of other XML tools for carrying out searches or for enabling presentation of the information in different resources. This model is particularly important as it combines two parallel paradigms (extendable labelling of documents and computational linguistics), and it is also applicable to other languages. We have included a comparison with the labelling proposal of printed dictionaries carried out by the Text Encoding Initiative (TEI). The proposed design has been validated with a dictionary of more than 145 000 accepted meanings.
URI: http://hdl.handle.net/10553/43489
ISSN: 0306-4573
DOI: 10.1016/j.ipm.2006.08.005
Source: Information Processing and Management [ISSN 0306-4573], v. 43, p. 946-957
Collection: Articles
Items in ULPGC accedaCRIS are protected by copyright, with all rights reserved, unless otherwise indicated.