Please use this identifier to cite or link to this item:
https://accedacris.ulpgc.es/handle/10553/137037
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Javier Sánchez | en |
dc.contributor.author | Agustín Salgado | en |
dc.contributor.author | Alejandro García | en |
dc.date.accessioned | 2025-04-07T10:58:18Z | - |
dc.date.available | 2025-04-07T10:58:18Z | - |
dc.date.issued | 2022 | en |
dc.identifier | https://doi.org/10.5281/zenodo.7386980 | - |
dc.identifier | oai:zenodo.org:7386980 | - |
dc.identifier.uri | https://accedacris.ulpgc.es/handle/10553/137037 | - |
dc.description | Code for the first version of the IDSEM dataset. IDSEM stands for "an Invoices Database of the Spanish Electricity Market". This database contains electricity bills related to energy consumption in Spanish households. The contents of the bills are generated automatically by this code. The dataset is mainly intended for training machine learning algorithms, especially for designing new methods for extracting information from invoices. There are 86 different labels, covering topics such as the customer and marketer, the contract, energy consumption, and billing. The code relies on a set of dictionaries and template documents to generate many training and test samples. Invoices are stored as PDF files and the labels as JSON files. More information can be found at https://idsem.ulpgc.es/ and in the following article: [1] Javier Sánchez, Agustín Salgado, Alejandro García, and Nelson Monzón, "IDSEM, an invoices database of the Spanish electricity market", Sci. Data, (2022). Full Changelog: https://github.com/jsanchezperez/idsem/commits/v1.0.0 | - |
dc.language | eng | - |
dc.publisher | Zenodo | - |
dc.rights | info:eu-repo/semantics/openAccess | - |
dc.subject.other | Machine learning | en |
dc.subject.other | Natural language processing | en |
dc.subject.other | Information extraction | en |
dc.subject.other | Electricity invoice | en |
dc.title | IDSEM dataset: Source code – v1.0.0. | - |
dc.type | info:eu-repo/semantics/other | - |
item.fulltext | No full text | - |
item.grantfulltext | none | - |
Appears in Collections: | Datasets ULPGC |
Items in accedaCRIS are protected by copyright, with all rights reserved, unless otherwise indicated.