Sparse coding based music genre classification using spectro-temporal modulations

Kai Chun Hsu, Chih Shan Lin, Tai-Shih Chi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Scopus citations

Abstract

Spectro-temporal modulations (STMs) of sound convey timbre and rhythm information, making them intuitively useful for automatic music genre classification. STMs are usually extracted from a time-frequency representation of the acoustic signal. In this paper, we investigate the efficacy of two kinds of STM features, the Gabor features and the rate-scale (RS) features, selectively extracted from various time-frequency representations, including the short-time Fourier transform (STFT) spectrogram, the constant-Q transform (CQT) spectrogram, and the auditory (AUD) spectrogram, in recognizing music genre. In our system, dictionary learning and sparse coding techniques are adopted for training the support vector machine (SVM) classifier. Both spectral-type features and modulation-type features are used to test the system. Experimental results show that the RS features extracted from the log-magnitude CQT spectrogram produce the highest recognition rate in classifying music genre.
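For illustration, the sketch below outlines a pipeline of this general shape in Python, assuming librosa and scikit-learn are available. It is not the paper's implementation: the rate-scale features of the auditory model are approximated here by the 2-D FFT magnitude of log-magnitude CQT patches, and the file names, labels, and parameter values (dictionary size, sparsity, patch length) are hypothetical placeholders.

```python
# Minimal sketch (not the authors' exact pipeline): log-magnitude CQT
# -> crude 2-D modulation features -> sparse coding over a learned
# dictionary -> linear SVM. All paths, labels, and parameters below
# are illustrative assumptions.
import numpy as np
import librosa
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.svm import LinearSVC


def modulation_features(path, sr=22050, patch_frames=32):
    """Log-magnitude CQT, then |2-D FFT| of short patches as a rough
    stand-in for rate-scale (spectro-temporal modulation) features."""
    y, _ = librosa.load(path, sr=sr, mono=True)
    log_cqt = np.log1p(np.abs(librosa.cqt(y, sr=sr)))  # log-magnitude CQT spectrogram
    patches = []
    for start in range(0, log_cqt.shape[1] - patch_frames + 1, patch_frames):
        patch = log_cqt[:, start:start + patch_frames]
        patches.append(np.abs(np.fft.fft2(patch)).ravel())  # joint modulation energy
    return np.array(patches)


def clip_code(dico, feats):
    """Sparse-code every patch, then mean-pool into one clip-level vector."""
    return dico.transform(feats).mean(axis=0)


# Hypothetical file lists and labels; replace with a real genre dataset.
train_files = ["blues_001.wav", "metal_001.wav"]
train_labels = ["blues", "metal"]
test_files = ["unknown_042.wav"]

# Learn a sparse dictionary on modulation patches pooled from the training clips.
train_feats = [modulation_features(f) for f in train_files]
dico = MiniBatchDictionaryLearning(n_components=64, alpha=1.0,
                                   transform_algorithm="omp",
                                   transform_n_nonzero_coefs=5)
dico.fit(np.vstack(train_feats))

# Encode each clip as pooled sparse codes and train a linear SVM classifier.
X_train = np.vstack([clip_code(dico, f) for f in train_feats])
clf = LinearSVC().fit(X_train, train_labels)

X_test = np.vstack([clip_code(dico, modulation_features(f)) for f in test_files])
print(clf.predict(X_test))
```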

Original language: English
Title of host publication: Proceedings of the 17th International Society for Music Information Retrieval Conference, ISMIR 2016
Editors: Johanna Devaney, Douglas Turnbull, Michael I. Mandel, George Tzanetakis
Publisher: International Society for Music Information Retrieval
Pages: 744-750
Number of pages: 7
ISBN (Electronic): 9780692755068
DOIs: 10.5281/zenodo.1418099
State: Published - Aug 2016
Event: 17th International Society for Music Information Retrieval Conference, ISMIR 2016 - New York, United States
Duration: 7 Aug 2016 – 11 Aug 2016

Publication series

Name: Proceedings of the 17th International Society for Music Information Retrieval Conference, ISMIR 2016

Conference

Conference: 17th International Society for Music Information Retrieval Conference, ISMIR 2016
Country: United States
City: New York
Period: 7/08/16 – 11/08/16


Cite this

Hsu, K. C., Lin, C. S., & Chi, T.-S. (2016). Sparse coding based music genre classification using spectro-temporal modulations. In J. Devaney, D. Turnbull, M. I. Mandel, & G. Tzanetakis (Eds.), Proceedings of the 17th International Society for Music Information Retrieval Conference, ISMIR 2016 (pp. 744-750). International Society for Music Information Retrieval. https://doi.org/10.5281/zenodo.1418099