University of Surrey


Analysis-by-synthesis coding of narrowband and wideband speech at medium bit rates.

Black, Alastair William. (1997) Analysis-by-synthesis coding of narrowband and wideband speech at medium bit rates. Doctoral thesis, University of Surrey (United Kingdom).

Full text is not currently available.


The last few years have seen a rapid expansion in the development of efficient speech compression algorithms, fuelled primarily by the proliferation of digital mobile communication systems. Low bit rate speech coding algorithms estimate, quantise and efficiently encode the parameters of a speech production model using the original speech waveform. The most popular of these models is based on the technique of Linear Prediction, which has resulted in a class of speech coding algorithms known as Analysis-by-Synthesis Linear Prediction Coding (AbS-LPC). In an AbS-LPC coding system, a closed-loop optimisation procedure is used to determine the excitation signal for the Linear Prediction filter. This methodology has been the foundation of many algorithms operating at medium to low bit rates. In particular, the Codebook Excited Linear Prediction (CELP) algorithm has received much attention in the past few years, culminating in numerous standards based on this principle. CELP achieves its coding efficiency and high quality by representing the excitation signal as a vector. However, in the original implementation of this algorithm the excitation search was very computationally intensive due to the structure of the codebook. In order to reduce this computational complexity and improve the quality of the synthetic speech, this thesis explores various structures of secondary excitations based on sparsely populated pulsed vectors. A variable rate implementation of the CELP algorithm is also presented, in which techniques typically found in vocoders are used to provide an accurate classification of the different types of speech. This classification is then used to vary the speech segment size and coding rate to take advantage of the differing regions of speech.

Narrowband speech is defined to be band-limited between 300 Hz and 3.4 kHz and is sampled at 8 kHz. Wideband speech, by contrast, lies between 50 Hz and 7 kHz and is consequently sampled at a higher rate of 16 kHz. Wideband speech exhibits characteristics which are not normally embodied within the narrowband signal. These characteristics contribute to the superior perceived quality, so it is imperative that a coding scheme preserves this information. This thesis formulates various strategies for the coding of wideband speech using the CELP coding structure. Particular attention is paid to preserving the information in the higher frequencies so that the overall quality is maintained in the synthetic signal. A low delay variant of the wideband coder is also presented, in which the effects of backward LPC prediction over the full bandwidth of the signal are investigated. This results in a split band architecture which is capable of producing high quality wideband speech.
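
As a rough illustration of the closed-loop excitation search described in the abstract, the following Python sketch passes each candidate excitation vector through an all-pole Linear Prediction synthesis filter and keeps the gain-scaled candidate that best matches a target signal. It is not taken from the thesis. The function names, the filter coefficients, the codebook size, the frame length and the pulse count are all assumptions chosen for the example, and perceptual weighting and the adaptive (pitch) codebook are omitted.

```python
import random


def synthesize(excitation, lpc):
    """All-pole synthesis filter 1/A(z): s[n] = e[n] - sum_k a[k] * s[n-k]."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out


def sparse_pulse_codebook(size, length, pulses, seed=0):
    """Hypothetical codebook: each vector holds a few +/-1 pulses, zeros elsewhere."""
    rng = random.Random(seed)
    book = []
    for _ in range(size):
        vec = [0.0] * length
        for pos in rng.sample(range(length), pulses):
            vec[pos] = rng.choice((-1.0, 1.0))
        book.append(vec)
    return book


def search_codebook(target, codebook, lpc):
    """Analysis-by-synthesis search: synthesise every candidate and
    return (index, gain, squared_error) for the closest match."""
    best = None
    for index, code in enumerate(codebook):
        y = synthesize(code, lpc)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue
        corr = sum(t * v for t, v in zip(target, y))
        gain = corr / energy  # least-squares optimal gain for this candidate
        err = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if best is None or err < best[2]:
            best = (index, gain, err)
    return best


# Assumed example values: a stable second-order filter and a 40-sample frame.
lpc = [-1.2, 0.5]  # A(z) = 1 - 1.2 z^-1 + 0.5 z^-2
codebook = sparse_pulse_codebook(size=32, length=40, pulses=4)

# Build a target from a known entry so the search has an exact answer to find.
target = [0.5 * v for v in synthesize(codebook[7], lpc)]
result = search_codebook(target, codebook, lpc)
print(result)
```

The exhaustive loop over synthesised candidates is what makes this kind of search expensive. With sparse pulse vectors, each filtered candidate is a sum of a few shifted and signed copies of the filter's impulse response, which is one reason such structures can lower the search cost.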

Item Type: Thesis (Doctoral)
Divisions : Theses
Authors : Black, Alastair William
Date : 1997
Contributors :
Depositing User : EPrints Services
Date Deposited : 09 Nov 2017 12:16
Last Modified : 09 Nov 2017 14:44
