Binary mask estimation based on frequency modulations

Chung Chien Hsu*, Jen-Tzung Chien, Tai-Shih Chi

*Corresponding author for this work

Research output: Contribution to journal › Conference article › peer-review


This paper proposes a binary mask estimation algorithm based on the modulations of speech. A multi-resolution spectro-temporal analytical auditory model extracts modulation features, from which the binary mask, often used in speech segregation applications, is estimated. To further enhance the modulation features, the proposed method estimates the noise from the beginning of each test sentence, an approach common in conventional speech enhancement algorithms. Experimental results demonstrate that the proposed method outperforms the AMS-GMM system in terms of the HIT-FA rate when estimating the binary mask.
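
For readers unfamiliar with the HIT-FA rate, the sketch below shows how the metric is commonly computed: HIT is the fraction of target-dominant time-frequency units (1s in the ideal binary mask) that the estimated mask also labels 1, FA is the fraction of noise-dominant units (0s in the ideal mask) that the estimated mask wrongly labels 1, and the score is their difference. The function name `hit_fa`, the nested-list mask layout, and the toy masks are assumptions made for illustration; they are not taken from the paper.

```python
def hit_fa(ideal_mask, estimated_mask):
    """Return (hit, fa, hit - fa) as fractions in [0, 1].

    Both masks are equally sized nested lists of 0/1 values,
    indexed as mask[channel][frame] (an assumed layout).
    """
    hits = false_alarms = n_target = n_noise = 0
    for ideal_row, est_row in zip(ideal_mask, estimated_mask):
        for ideal, est in zip(ideal_row, est_row):
            if ideal == 1:
                # Target-dominant unit: count it, and count a hit if estimated as 1.
                n_target += 1
                hits += int(est == 1)
            else:
                # Noise-dominant unit: count it, and count a false alarm if estimated as 1.
                n_noise += 1
                false_alarms += int(est == 1)
    hit = hits / n_target if n_target else 0.0
    fa = false_alarms / n_noise if n_noise else 0.0
    return hit, fa, hit - fa


# Toy masks, purely illustrative (not data from the paper).
ideal = [[1, 1, 0, 0],
         [1, 0, 0, 1]]
estimated = [[1, 0, 0, 1],
             [1, 0, 0, 1]]

hit, fa, score = hit_fa(ideal, estimated)
print(f"HIT = {hit:.2f}, FA = {fa:.2f}, HIT-FA = {score:.2f}")
```

A higher HIT-FA value indicates that the estimated mask retains more target-dominant units while admitting fewer noise-dominant ones.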


  • Frequency modulation
  • Mask estimation
  • Spectro-temporal modulation
