University of Surrey


Efficient feature extraction based on two-dimensional cepstrum analysis for speech recognition.

Marvi, Hossein. (2004) Efficient feature extraction based on two-dimensional cepstrum analysis for speech recognition. Doctoral thesis, University of Surrey (United Kingdom).

10148176.pdf
Available under License Creative Commons Attribution Non-commercial Share Alike.


Abstract

Solving speech recognition problems requires an adequate feature extraction technique that transforms the raw speech signal into a set of feature vectors preserving most of the information in the speech signal. The features should ideally be compact, distinct and representative of the speech signal. If the feature vectors do not represent the important content of the speech, the system will perform poorly regardless of the pattern recognition techniques applied. Many different feature representations of the speech signal have been suggested and tried for speech recognition. The most popular features currently in use are Mel-frequency cepstral coefficients (MFCC) and perceptual linear prediction (PLP), which are based on one-dimensional cepstrum analysis. The two-dimensional cepstrum (TDC) is an alternative approach to time-frequency representation of a speech signal that can preserve both the instantaneous and the transitional information of the signal. The principal aim of this thesis is to study two-dimensional cepstrum analysis as a feature extraction technique for speech recognition. A novel feature extraction technique, the two-dimensional root cepstrum (TDRC), is also introduced. It has the advantage of an adjustable γ parameter, which can be used to optimise the feature extraction process, reducing the dimensions of the feature matrix and simplifying computation. In addition, the Mel TDRC is proposed as a modification of the original TDRC to improve accuracy. It is shown that both the TDC and the TDRC outperform the conventional cepstrum. To preserve both magnitude and phase details of the speech signal simultaneously in a feature matrix, the Hartley transform (HT) is suggested as a substitute for the Fourier transform (FT) in two-dimensional cepstrum analysis.
Experimental results demonstrate the enhanced capability of the HT in two-dimensional root cepstral analysis to improve recognition accuracy. An experimental comparative study of 9 feature extraction methods based on cepstral analysis is also carried out.
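The ideas named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not give the exact formulation, so the code below assumes one common textbook form, in which each frame's magnitude spectrum is compressed (logarithm for the TDC, a root exponent for the TDRC) and then inverse-transformed along both the frequency and the frame axis. All names (`dft`, `dht`, `compress`, `two_d_cepstrum`, `gamma`, `eps`) are hypothetical, windowing and Mel warping are omitted, and a naive O(N²) transform is used for clarity where a real system would use an FFT library.

```python
import cmath
import math


def dft(seq, inverse=False):
    """Naive O(N^2) discrete Fourier transform (the inverse includes the 1/N factor)."""
    n = len(seq)
    sign = 1 if inverse else -1
    out = [
        sum(seq[t] * cmath.exp(sign * 2j * math.pi * t * k / n) for t in range(n))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out


def dht(seq):
    """Discrete Hartley transform: H[k] = sum_t x[t] * (cos + sin)(2*pi*t*k/N).

    Real-valued for real input; applying it twice returns N times the input.
    """
    n = len(seq)
    return [
        sum(
            seq[t] * (math.cos(2 * math.pi * t * k / n) + math.sin(2 * math.pi * t * k / n))
            for t in range(n)
        )
        for k in range(n)
    ]


def compress(mag, gamma=None, eps=1e-10):
    """Log compression when gamma is None, otherwise root compression mag**gamma."""
    return math.log(mag + eps) if gamma is None else mag ** gamma


def two_d_cepstrum(frames, gamma=None):
    """Assumed 2-D (root) cepstrum of equal-length real frames (rows are time frames).

    Returns a complex matrix indexed [frame-axis index][quefrency index].
    """
    n_frames, frame_len = len(frames), len(frames[0])
    # 1) Compressed magnitude spectrum of every frame.
    spec = [[compress(abs(v), gamma) for v in dft(frame)] for frame in frames]
    # 2) Inverse transform along the frequency axis: one cepstrum per frame.
    ceps = [dft(row, inverse=True) for row in spec]
    # 3) Inverse transform along the frame axis, one quefrency column at a time.
    cols = [
        dft([ceps[i][j] for i in range(n_frames)], inverse=True)
        for j in range(frame_len)
    ]
    # Transpose back to [frame-axis index][quefrency index].
    return [[cols[j][i] for j in range(frame_len)] for i in range(n_frames)]
```

Under this assumed formulation, the first row of the result is the time average of the per-frame cepstra, while the remaining rows describe how the cepstrum changes from frame to frame; for a perfectly stationary input those remaining rows vanish. The `dht` helper only shows the transform itself; how the thesis substitutes it for the Fourier transform is not specified in the abstract.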

Item Type: Thesis (Doctoral)
Divisions : Theses
Authors : Marvi, Hossein
Date : 2004
Depositing User : EPrints Services
Date Deposited : 09 Nov 2017 12:16
Last Modified : 20 Jun 2018 11:08
URI: http://epubs.surrey.ac.uk/id/eprint/843940

