Please use this identifier to cite or link to this item: http://hdl.handle.net/10553/43982
Title: Special issue on machine learning for nonlinear processing
Authors: Travieso, Carlos M. 
Alonso, Jesús B. 
UNESCO Classification: 3307 Electronic technology
Issue Date: 2014
Publisher: Elsevier
Journal: Neurocomputing 
Abstract: This special issue aims to cover some problems related to machine learning for non-linear processing and non-conventional speech processing. The origin of this volume is the ISCA Tutorial and Research Workshop on Non-Linear Speech Processing, NOLISP'11, held at the Universidad de Las Palmas de Gran Canaria (Spain) on November 7–8, 2011. The series of NOLISP workshops, started in 2003, has become a biennial event whose aim is to discuss alternative techniques for speech processing that, in a sense, do not fit into mainstream approaches. A selection of papers, based on the review process and the presentations delivered at NOLISP'11, has given rise to this issue of Neurocomputing. The topics of the special issue form an interesting and active field: 12 papers were submitted, selected from up to 40 papers. After at least two rounds of reviews, 7 papers were selected for publication. The papers can be categorized into four clusters based on machine learning for nonlinear processing: voice analysis and characterization, voice synthesis, pathology detector, and the development of kernel. We summarize the papers as follows. (1) Voice analysis and characterization. Wöllmer and Schumer showed that speech recognition in challenging scenarios can be improved by applying bidirectional Long Short-Term Memory (BLSTM) modeling within the recognizer front-end. BLSTM networks are able to incorporate a flexible, self-learned amount of contextual information in the feature extraction process, which was shown to result in enhanced probabilistic features, prevailing over conventional RNN or MLP features. Results showed that this concept prevails over recently published architectures for feature-level context modeling. Drugman and Dutoit proposed an approach for speech polarity determination based on the calculation of higher-order statistical moments. 
These moments oscillate at the local fundamental frequency with a phase shift that depends on the speech polarity. In addition, a comparative evaluation of the proposed method against three other state-of-the-art techniques is carried out. Henriquez et al. proposed the application of complexity measures based on nonlinear dynamics for emotional speech characterization. Measures such as mutual information, correlation dimension, correlation entropy, Shannon entropy, Lempel-Ziv complexity and Hurst exponent are extracted from the samples of three databases of emotional speech. A feature selection procedure based on an affinity analysis of the features is proposed, and good accuracy was reached for emotional speech detection. Khanagha et al. showed that a very compact representation of the speech signal can be achieved through the study of geometrical scaling exponents, which relate non-linearity to the multiscale structure of complex signals by speed increments, energy dissipation, etc., giving rise to the same MSMs and reconstruction properties. (2) Voice synthesis. Picart et al. aimed at analyzing the adaptation process, and the resulting speech quality, of a neutral speech synthesizer to generate hypo- and hyperarticulated speech. The goal was to gain a better understanding of the factors leading to high-quality HMM-based speech synthesis with various degrees of articulation (neutral, hypo- and hyperarticulated). For this reason, adaptation was subdivided into four effects: cepstrum, prosody, phonetic transcription adaptation, and the complete adaptation. (3) Pathology detector. Gómez et al. investigated the usefulness of two non-uniform state space reconstruction techniques for pattern recognition tasks, comparing their performance with the classical uniform embedding in several pathology detection experiments. 
In addition, a novel and computationally feasible non-uniform procedure was presented to obtain the time lag vector needed to reconstruct the state space. The results also suggest that the method is able to improve the quality of the reconstructions and therefore contribute to better performance in pattern recognition tasks. (4) Development of kernel. Travieso et al. presented an approach based on the transformation of the Cepstral domain on Hidden Markov Model (HMM), which is employed for the automatic diagnosis of the Obstructive Sleep Apnea syndrome. This proposal improves the accuracy rate versus the typical use of the Cepstral domain classified by HMM. We wish to thank all the people who enabled the publication of this special issue. First of all, we wish to thank Prof. Dr. Tom Heskes, Editor-in-chief for the special issues of this journal, for accepting the idea and for his support, patience and motivation. Our gratitude also goes to the Journal Manager and to all the staff from Elsevier, in particular to Ms. Vera Kamphuis (Editorial assistant Neurocomputing), for the impeccable and timely logistical support. The papers in this issue were reviewed in two or three rounds of reviews. We wish to equally thank the authors and the reviewers for all their hard work and their contribution to the excellence of this special issue.
URI: http://hdl.handle.net/10553/43982
ISSN: 0925-2312
DOI: 10.1016/j.neucom.2013.07.046
Source: Neurocomputing[ISSN 0925-2312],v. 132, p. 111-112
Appears in Collections: Comentario
Items in accedaCRIS are protected by copyright, with all rights reserved, unless otherwise indicated.