Spectrum restoration from multiscale auditory phase singularities by generalized projections

Tai-Shih Chi*, Shihab A. Shamma

*Corresponding author for this work

Research output: Contribution to journal › Article

3 Scopus citations

Abstract

We examine the encoding of acoustic spectra by parameters derived from singularities found in their multiscale auditory representations. The multiscale representation is a wavelet transform of an auditory version of the spectrum, formulated based on findings of perceptual experiments and physiological research in the auditory cortex. The multiscale representation of a spectral pattern usually contains well-defined singularities in its phase function that reflect prominent features of the underlying spectrum, such as its relative peak locations and amplitudes. Properties (locations and strength) of these singularities are examined and employed to reconstruct the original spectrum using an iterative projection algorithm. Although the singularities form a nonconvex set, simulations demonstrate that a well-chosen initial pattern usually converges to a good approximation of the input spectrum. Perceptually intelligible speech can be resynthesized from the reconstructed auditory spectrograms, and hence these singularities can potentially serve as efficient features in speech compression. In addition, the singularities are highly robust to noise, which makes them useful features in various applications such as vowel recognition and speaker identification.
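The iterative projection algorithm referenced in the abstract belongs to the family of generalized projections (POCS-style methods): starting from an initial pattern, the estimate is alternately projected onto each constraint set until it settles in their intersection. The paper's singularity-based constraint set is nonconvex, which is why a good initial pattern matters; the sketch below illustrates the iteration itself on a simpler, fully convex analogue (a band-limited signal recovered from a subset of its samples, in the Papoulis-Gerchberg style). All names and parameters here are illustrative assumptions, not the paper's actual feature sets.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth band-limited "spectrum" (analogy only: the paper's two sets
# are the measured singularity features and the valid auditory patterns).
N = 256
K = 8  # number of retained low-frequency bins
spec = np.zeros(N, dtype=complex)
spec[:K] = rng.standard_normal(K) + 1j * rng.standard_normal(K)
x_true = np.fft.ifft(spec).real  # real part keeps support in bins 0..K-1, N-K+1..N-1

# Simulated "measurements": the signal's values at half the sample points.
known = rng.choice(N, size=N // 2, replace=False)

def project_bandlimited(x):
    """Project onto the set of K-band-limited signals (zero out-of-band bins)."""
    X = np.fft.fft(x)
    X[K:N - K + 1] = 0.0
    return np.fft.ifft(X).real

def project_samples(x):
    """Project onto the set of signals agreeing with the known samples."""
    y = x.copy()
    y[known] = x_true[known]
    return y

# Alternating projections from a neutral initial pattern.
x = np.zeros(N)
for _ in range(500):
    x = project_samples(project_bandlimited(x))

err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
```

Both sets here are convex (a subspace and an affine set), so the iteration converges to their intersection regardless of initialization; with the paper's nonconvex singularity set, the same loop structure applies but convergence to the true spectrum depends on the choice of initial pattern.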

Original language: English
Pages (from-to): 1179-1192
Number of pages: 14
Journal: IEEE Transactions on Audio, Speech and Language Processing
Volume: 14
Issue number: 4
DOIs
State: Published - 1 Jul 2006

Keywords

  • Auditory model
  • Convex projection
  • Phase singularity
  • Spectrum restoration

