Towards robust voice pathology detection: Investigation of supervised deep learning, gradient boosting, and anomaly detection approaches across four databases

Harar, Pavol; Galaz, Zoltan; Alonso Hernández, Jesús Bernardino; Mekyska, Jiri; Burget, Radim; Smekal, Zdenek

Persistent identifier to cite or link to this item: http://hdl.handle.net/10553/55723

DC Field	Value	Language
dc.contributor.author	Harar, Pavol	en_US
dc.contributor.author	Galaz, Zoltan	en_US
dc.contributor.author	Alonso Hernández, Jesús Bernardino	en_US
dc.contributor.author	Mekyska, Jiri	en_US
dc.contributor.author	Burget, Radim	en_US
dc.contributor.author	Smekal, Zdenek	en_US
dc.date.accessioned	2019-06-10T12:53:06Z	-
dc.date.available	2019-06-10T12:53:06Z	-
dc.date.issued	2020	en_US
dc.identifier.issn	0941-0643	en_US
dc.identifier.uri	http://hdl.handle.net/10553/55723	-
dc.description.abstract	Automatic, objective, non-invasive detection of pathological voice based on computerized analysis of acoustic signals can play an important role in early diagnosis, progression tracking, and even effective treatment of pathological voices. In search of such a robust voice pathology detection system, we investigated three distinct classifiers within supervised learning and anomaly detection paradigms. We conducted a set of experiments using a variety of input data such as raw waveforms, spectrograms, mel-frequency cepstral coefficients (MFCC), and conventional acoustic (dysphonic) features (AF). In comparison with previously published works, this article is the first to utilize a combination of four different databases comprising normophonic and pathological recordings of sustained phonation of the vowel /a/, unrestricted to a subset of vocal pathologies. Furthermore, to the best of our knowledge, this article is the first to explore gradient-boosted trees and deep learning for this application. The following best classification performances, measured by F1 score on a dedicated test set, were achieved: XGBoost (0.733) using AF and MFCC, DenseNet (0.621) using MFCC, and Isolation Forest (0.610) using AF. Even though these results are of an exploratory character, the conducted experiments do show the promising potential of gradient boosting and deep learning methods to robustly detect voice pathologies.	en_US
dc.language	eng	en_US
dc.relation.ispartof	Neural Computing and Applications	en_US
dc.source	Neural Computing and Applications [ISSN 0941-0643], n. 32(20), p. 15747–15757, (2020)	en_US
dc.subject	3307 Electronic technology	en_US
dc.subject.other	Voice pathology detection	en_US
dc.subject.other	Deep learning	en_US
dc.subject.other	Gradient boosting	en_US
dc.subject.other	Anomaly detection	en_US
dc.title	Towards robust voice pathology detection: Investigation of supervised deep learning, gradient boosting, and anomaly detection approaches across four databases	en_US
dc.type	info:eu-repo/semantics/Article	en_US
dc.type	Article	en_US
dc.identifier.doi	10.1007/s00521-018-3464-7	en_US
dc.identifier.scopus	85044933261	-
dc.contributor.orcid	#NODATA#	-
dc.contributor.orcid	#NODATA#	-
dc.contributor.orcid	#NODATA#	-
dc.contributor.orcid	#NODATA#	-
dc.contributor.orcid	#NODATA#	-
dc.contributor.orcid	#NODATA#	-
dc.contributor.authorscopusid	57192572816	-
dc.contributor.authorscopusid	56888706700	-
dc.contributor.authorscopusid	57195466969	-
dc.contributor.authorscopusid	35746344400	-
dc.contributor.authorscopusid	23011250200	-
dc.contributor.authorscopusid	36855362600	-
dc.description.lastpage	11	en_US
dc.identifier.issue	20	-
dc.description.firstpage	1	en_US
dc.investigacion	Engineering and Architecture	en_US
dc.type2	Article	en_US
dc.utils.revision	Yes	en_US
dc.date.coverdate	April 2018	en_US
dc.identifier.ulpgc	Yes	en_US
dc.contributor.buulpgc	BU-ING	en_US
dc.description.sjr	0,713
dc.description.jcr	5,606
dc.description.sjrq	Q1
dc.description.jcrq	Q1
dc.description.scie	SCIE
item.grantfulltext	none	-
item.fulltext	No full text	-
crisitem.author.dept	GIR IDeTIC: División de Procesado Digital de Señales	-
crisitem.author.dept	IU para el Desarrollo Tecnológico y la Innovación	-
crisitem.author.dept	Departamento de Señales y Comunicaciones	-
crisitem.author.orcid	0000-0002-7866-585X	-
crisitem.author.parentorg	IU para el Desarrollo Tecnológico y la Innovación	-
crisitem.author.fullName	Alonso Hernández, Jesús Bernardino	-
Collection:	Articles
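
The abstract in this record reports classifier performance as an F1 score. As a reminder of what that metric measures, here is a minimal sketch in plain Python. The function name, the label encoding (1 = pathological, 0 = normophonic), and the sample labels are illustrative assumptions and are not taken from the article; in particular, the record does not say which class the authors treated as positive or how they averaged the score.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1 score (harmonic mean of precision and recall)
    for the class given by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined),
        # so the score is reported as 0.0 by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for eight recordings (assumed encoding:
# 1 = pathological, 0 = normophonic). Not data from the article.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.3f}")
```

With these made-up labels there are four true positives, one false positive, and one false negative, so precision and recall are both 0.8 and the F1 score is also 0.8.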

Scopus citations: 44 (updated 30 Mar 2025)
Web of Science citations: 22 (updated 30 Mar 2025)
Visits: 201 (updated 22 Feb 2025)
