University of Surrey

Advanced pre-and-post processing techniques for speech coding.

Farsi, Hassan. (2003) Advanced pre-and-post processing techniques for speech coding. Doctoral thesis, University of Surrey (United Kingdom).

10148973.pdf
Available under License Creative Commons Attribution Non-commercial Share Alike.

Abstract

Advances in digital technology over the last decade have motivated the development of very efficient, high-quality speech compression algorithms. In early low bit rate coding systems the main target was the production of intelligible speech at low bit rates, but the expansion of new applications such as mobile satellite systems increased the demand for both reduced transmission bandwidth and higher speech quality. This resulted in the development of efficient parametric models of the speech production system. These models were the basis of powerful speech compression algorithms such as CELP, MBE, MELP and WI. The performance of a speech coder depends not only on the speech production model employed but also on the accurate estimation of the speech parameters. Periodicity, also known as pitch, is one of the speech parameters that greatly affect the synthesised speech quality. Thus, the subject of pitch determination has attracted much research in the area of low bit rate coding. These studies assume that, for a short segment of speech called a frame, the pitch is fixed or smoothly evolving. Pitch estimation algorithms generally fail to determine irregular variations, which can occur at onset and offset speech segments. To overcome this problem, a novel preprocessing method is proposed, which detects irregular pitch variations and modifies the speech signal so as to improve the accuracy of the pitch estimation. This method results in more regular speech while maintaining perceptual speech quality. The perceptual quality of the synthesised speech may also be improved using postfiltering techniques. Conventional postfiltering methods generally consider the enhancement of the whole speech spectrum. This may result in the broadening of the first formant, which leads to an increase in quantisation noise for this formant. A new postfiltering technique, based on factorising the linear prediction synthesis filter, is proposed. This provides more control over the formant bandwidth and the attenuation of speech spectral valleys.

Key words: Pitch smoothing, speech pre-processor, postfiltering.
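For readers unfamiliar with frame-based pitch estimation, the following is a minimal sketch of a textbook autocorrelation estimator. It is not the method proposed in the thesis. It only illustrates the assumption described in the abstract, namely that the pitch is fixed within one frame. The function name `estimate_pitch_hz`, the 60 to 400 Hz search range, the 8 kHz sampling rate and the synthetic test frame are all assumptions chosen for illustration.

```python
import math


def estimate_pitch_hz(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Estimate the pitch of one frame from the peak of its autocorrelation.

    Assumes the pitch is constant over the frame. Returns None when the
    frame has no energy or no positive autocorrelation peak is found.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]  # remove any DC offset
    energy = sum(s * s for s in x)
    if energy == 0.0:
        return None

    # Search only lags that correspond to the assumed pitch range.
    lag_min = max(1, int(sample_rate / f_max))
    lag_max = min(int(sample_rate / f_min), n - 1)

    best_lag, best_score = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Autocorrelation at this lag, normalised by the frame energy.
        score = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        if score > best_score:
            best_lag, best_score = lag, score

    return sample_rate / best_lag if best_lag is not None else None


# Illustrative input: a 40 ms frame of a pure 100 Hz tone sampled at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 100.0 * t / sample_rate) for t in range(320)]
pitch = estimate_pitch_hz(frame, sample_rate)
print(pitch)
```

An estimator of this kind returns a single value per frame, so it cannot represent a pitch period that changes irregularly inside the frame. The abstract identifies that situation, typical of speech onsets and offsets, as the motivation for the proposed preprocessing.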

Item Type: Thesis (Doctoral)
Divisions : Theses
Authors : Farsi, Hassan
Date : 2003
Depositing User : EPrints Services
Date Deposited : 09 Nov 2017 12:18
Last Modified : 20 Jun 2018 11:48
URI: http://epubs.surrey.ac.uk/id/eprint/844491
