Persistent identifier for citing or linking to this item: http://hdl.handle.net/10553/121839
DC Field | Value | Language
dc.contributor.author | Sánchez Pérez, Javier | en_US
dc.contributor.author | Salgado de la Nuez, Agustín Javier | en_US
dc.contributor.author | García, Alejandro | en_US
dc.date.accessioned | 2023-04-12T19:30:24Z | -
dc.date.available | 2023-04-12T19:30:24Z | -
dc.date.issued | 2022 | en_US
dc.identifier | https://zenodo.org/record/7386980 | -
dc.identifier | 10.5281/zenodo.7386980 | -
dc.identifier | oai:zenodo.org:7386980 | -
dc.identifier.uri | http://hdl.handle.net/10553/121839 | -
dc.description | Code for the IDSEM dataset, first version. IDSEM is an acronym for "an Invoices Database of the Spanish Electricity Market". This database contains electricity bills related to energy consumption in Spanish households. The contents of the bills are automatically generated using this code. The main purpose of the dataset is training machine learning algorithms, especially for designing new methods for extracting information from invoices. There are 86 different labels, which are related to several topics, such as the customer and marketer, the contract, energy consumption, or billing. The code relies on a set of dictionaries and template documents for generating many training and test samples. The file format of invoices is PDF and the labels are stored in JSON files. More information can be found at https://idsem.ulpgc.es/ and in the following article: [1] Javier Sánchez, Agustín Salgado, Alejandro García, and Nelson Monzón, "IDSEM, an invoices database of the Spanish electricity market", Sci. Data, (2022). Full Changelog: https://github.com/jsanchezperez/idsem/commits/v1.0.0 | -
dc.language | eng | en_US
dc.rights | info:eu-repo/semantics/openAccess | -
dc.rights | https://fedoraproject.org/wiki/Licensing:BSD#Modification_Variant | -
dc.subject.other | Machine learning | en
dc.subject.other | Natural language processing | en
dc.subject.other | Information extraction | en
dc.subject.other | Electricity invoice | en
dc.title | IDSEM dataset: Source code – v1.0.0. | en_US
dc.type | info:eu-repo/semantics/other | en_US
dc.type | software | en_US
dc.identifier.doi | 10.5281/zenodo.7386979 | -
dc.type2 | software | en_US
dc.identifier.supplement | https://zenodo.org/record/7386980 | -
dc.identifier.supplement | 10.5281/zenodo.7386980 | -
dc.identifier.supplement | oai:zenodo.org:7386980 | -
dc.identifier.supplement | 7386980 | -
dc.identifier.zenodo | 7386980 | en
dc.utils.zenodofile | https://zenodo.org/record/7386980/files/jsanchezperez/idsem-v1.0.0.zip?download=1;jsanchezperez/idsem-v1.0.0.zip;4M;ZIP | -
item.grantfulltext | none | -
item.fulltext | No full text | -
crisitem.author.dept | GIR IUCES: Centro de Tecnologías de la Imagen | -
crisitem.author.dept | IU de Cibernética, Empresa y Sociedad (IUCES) | -
crisitem.author.dept | Departamento de Informática y Sistemas | -
crisitem.author.dept | GIR IUCES: Centro de Tecnologías de la Imagen | -
crisitem.author.dept | IU de Cibernética, Empresa y Sociedad (IUCES) | -
crisitem.author.dept | Departamento de Informática y Sistemas | -
crisitem.author.orcid | 0000-0001-8514-4350 | -
crisitem.author.orcid | 0000-0002-6142-3432 | -
crisitem.author.parentorg | IU de Cibernética, Empresa y Sociedad (IUCES) | -
crisitem.author.parentorg | IU de Cibernética, Empresa y Sociedad (IUCES) | -
crisitem.author.fullName | Sánchez Pérez, Javier | -
crisitem.author.fullName | Salgado De La Nuez, Agustín | -
Collection: Datasets ULPGC
ZIP (4M)
Items in ULPGC accedaCRIS are protected by copyright, with all rights reserved, unless otherwise indicated.