University of Surrey


Advanced Low Bit-Rate Speech Coding Below 2.4 Kbps.

Unver, Emre. (2010) Advanced Low Bit-Rate Speech Coding Below 2.4 Kbps. Doctoral thesis, University of Surrey (United Kingdom).

Available under License Creative Commons Attribution Non-commercial Share Alike.



The telecommunications industry has grown rapidly over the past decades. With increasing demand for the transmission of speech over bandwidth-limited media, such as mobile or satellite communication links, and for the storage of spoken information in bit-rate-limited media, such as silicon memory, efficient compression of speech has become an important issue. Although existing speech coding standards produce high-quality speech above 4 kbps, there is still room for improvement at lower bit rates, especially at 2.4 kbps and below. Achieving good speech quality and intelligibility at very low bit rates is particularly important for military wireless communications, where some of the bandwidth is required for error correction, and for applications where speech is embedded into other speech or non-speech data. Parametric coders, such as sinusoidal coders, are used extensively at low bit rates. This work relaxes the delay, memory and complexity constraints and discusses strategies for lowering the bit rates of sinusoidal coders while maintaining good speech quality. These strategies include an extension of previous work in the literature on combining several frames within a metaframe and on variable bit-allocation schemes, as well as a new algorithm for estimating voicing from the spectral envelope. Moreover, the use of phonemes in speech coding is investigated for further bit reductions. A method for producing highly intelligible speech of modest quality at a very low bit rate is presented, and the coding of any extra information needed to achieve high quality is also discussed. These strategies have been implemented in the SB-LPC vocoder in order to perform parameter quantisation at several bit rates. Listening tests show that the proposed techniques are effective in lowering the bit rate from 2.4 kbps to 1.2 kbps, from 1.2 kbps to 0.8 kbps, and from 4.0 kbps to 1.8 kbps while maintaining speech quality. In addition, a coding scheme operating at 309 bps is designed that produces speech whose intelligibility is similar to that of MELP operating at 600 bps. Finally, the performance of the strategies proposed in this thesis and possibilities for improvement are discussed.
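To put the bit rates quoted in the abstract into perspective, the following minimal sketch computes the relative saving for each pair of rates and the bit budget that each rate leaves per frame. The rate pairs are taken from the abstract; the 20 ms frame duration and the function names are assumptions made purely for illustration, since the abstract does not state the frame length used in the thesis.

```python
# Bit-rate pairs (reference kbps, reduced kbps) quoted in the abstract.
# The last pair compares MELP at 600 bps with the 309 bps scheme; the
# abstract claims similar intelligibility for that pair, not equal quality.
RATE_PAIRS_KBPS = [(2.4, 1.2), (1.2, 0.8), (4.0, 1.8), (0.6, 0.309)]

# ASSUMPTION: a 20 ms frame is used only for illustration. The abstract
# does not give the frame duration of the coder.
ASSUMED_FRAME_MS = 20.0


def reduction_percent(reference_kbps, reduced_kbps):
    """Relative bit-rate saving, as a percentage of the reference rate."""
    return 100.0 * (reference_kbps - reduced_kbps) / reference_kbps


def bits_per_frame(rate_kbps, frame_ms=ASSUMED_FRAME_MS):
    """Bits available per frame: (kbit/s) * (ms) = bits."""
    return rate_kbps * frame_ms


if __name__ == "__main__":
    for reference, reduced in RATE_PAIRS_KBPS:
        print(
            f"{reference:g} -> {reduced:g} kbps: "
            f"{reduction_percent(reference, reduced):.1f}% saving, "
            f"{bits_per_frame(reduced):.2f} bits per assumed "
            f"{ASSUMED_FRAME_MS:g} ms frame"
        )
```

Under the assumed frame length, the lowest rates leave only a handful of bits per frame, which may help explain why the abstract mentions grouping several frames into a metaframe so that the bit budget can be shared across frames.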

Item Type: Thesis (Doctoral)
Divisions : Theses
Authors : Unver, Emre.
Date : 2010
Additional Information : Thesis (Ph.D.)--University of Surrey (United Kingdom), 2010.
Depositing User : EPrints Services
Date Deposited : 14 May 2020 14:56
Last Modified : 14 May 2020 15:08


© The University of Surrey, Guildford, Surrey, GU2 7XH, United Kingdom.
+44 (0)1483 300800