Persistent identifier to cite or link to this item: http://hdl.handle.net/10553/120402
Title: IDSEM, an invoices database of the Spanish electricity market
Authors: Sánchez, Javier
Salgado De La Nuez, Agustín
García, Alejandro
Monzón, Nelson
UNESCO classification: 1203 Computer science
120302 Algorithmic languages
Publication date: 2022
Projects: Análisis de Vídeo y Mejora de la Calidad de Imagen
Journal: Scientific Data
Abstract: This article describes a new database of electricity bills related to energy consumption in Spanish households. The dataset includes individual invoices containing information about the consumption and billing of each supply point. These documents include additional data about the customer, the contract, and the electricity company. We propose a pipeline for the creation of bill contents through a simulation process based on regulations and statistics from official bodies and electricity companies. This makes it possible to generate many documents with synthetic data. The simulation is based on 86 different labels, which are necessary to create realistic invoices. The dataset has 75 000 documents in PDF format with their corresponding labels in JSON files. It is useful for training machine learning algorithms and, in particular, for developing methods to automatically extract information from the bills. It can also support the design of new algorithms for analyzing the behavior of electricity markets from different perspectives.
URI: http://hdl.handle.net/10553/120402
ISSN: 2052-4463
DOI: 10.1038/s41597-022-01885-3
Source: Scientific Data [EISSN 2052-4463], v. 9 (1), 786 (December 2022)
Collection: Articles
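
The abstract states that each PDF invoice comes with its labels in a JSON file. As a minimal sketch of how such pairs might be loaded for training or evaluation, the Python snippet below matches every PDF with a JSON file of the same base name. The directory layout, the same-name convention, and the function name `pair_invoices` are assumptions made for illustration only; this record does not describe how the dataset is organized on disk.

```python
import json
from pathlib import Path


def pair_invoices(root):
    """Return (pdf_path, labels) pairs for PDFs that have a matching JSON file.

    Assumption (not confirmed by the record): the labels for `x.pdf`
    are stored next to it in `x.json`.
    """
    pairs = []
    for pdf_path in sorted(Path(root).rglob("*.pdf")):
        json_path = pdf_path.with_suffix(".json")
        if not json_path.is_file():
            continue  # skip PDFs that have no label file
        with json_path.open(encoding="utf-8") as fh:
            labels = json.load(fh)
        pairs.append((pdf_path, labels))
    return pairs
```

Calling `pair_invoices("path/to/dataset")` would return a list of tuples, each holding the path to one invoice and the parsed contents of its label file.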
Items in ULPGC accedaCRIS are protected by copyright, with all rights reserved, unless otherwise indicated.