University of Surrey

Test tubes in the lab Research in the ATI Dance Research

Sound visualisation as an aid for the deaf, a new approach.

Soltani-Farani, A. A. (1998) Sound visualisation as an aid for the deaf, a new approach. Doctoral thesis, University of Surrey (United Kingdom)..

Available under License Creative Commons Attribution Non-commercial Share Alike.

Download (3MB) | Preview


Visual translation of speech as an aid for the deaf has long been a subject of electronic research and development. This thesis is concerned with a technique of sound visualisation based upon the theory of the primacy of dynamic, rather than static, information in the perception of speech sounds. The goal is design and evaluation of a system to display the perceptually important features of an input sound in a dynamic format as similar as possible to the auditory representation of that sound. The human auditory system, as the most effective system of sound representation, is first studied. Then, based on the latest theories of hearing and techniques of auditory modelling, a simplified model of the human ear is developed. In this model, the outer and middle ears together are simulated by a high-pass filter, and the inner ear is modelled by a bank of band-pass filters the outputs of which, after rectification and compression, are applied to a visualiser block. To design an appropriate visualiser block, theories of sound and speech perception are reviewed. Then the perceptually important properties of sound, and their relations to the physical attributes of the sound pressure wave, are considered to map the outputs from the auditory model onto an informative and recognisable running image-like the one known as cochleagram. This conveyor-like image is then sampled by a window of 20 milliseconds duration at a rate of 50 samples per second, so that a sequence of phase-locked, rectangular images is produced. Animation of these images results in a novel method of spectrography displaying both the time-varying and the time-independent information of the underlying sound with a high resolution in real time. The resulting system translates a spoken word into a visual gesture, and displays a still picture when the input is a steady state sound. Finally the implementation of this visualiser system is evaluated through several experiments undertaken by normal-hearing subjects. In these experiments, recognition of the gestures of a number of spoken words, is examined through a set of two-word and multi-word forced-choice tests. The results of these preliminary experiments show a high recognition score (40-90 percent, where zero represents chance expectation) after only 10 learning trials. General conclusions from the results suggest: a potential quick learning of the gestures, language independence of the system, fidelity of the system in translating the auditory information, and persistence of the learned gestures in the long-term memory. The results are very promising and motivate further investigations.

Item Type: Thesis (Doctoral)
Divisions : Theses
Authors :
Soltani-Farani, A. A.
Date : 1998
Contributors :
Depositing User : EPrints Services
Date Deposited : 09 Nov 2017 12:16
Last Modified : 20 Jun 2018 11:21

Actions (login required)

View Item View Item


Downloads per month over past year

Information about this web site

© The University of Surrey, Guildford, Surrey, GU2 7XH, United Kingdom.
+44 (0)1483 300800