A Multi-Dilation and Multi-Resolution Fully Convolutional Network for Singing Melody Extraction

Ping Gao, Cheng You You, Tai-Shih Chi

Research output: Conference contribution, peer-reviewed

Abstract

Each human cognitive function involves bottom-up and top-down processes. Several methods have been proposed for singing melody extraction by emphasizing either the bottom-up or top-down processes. For hearing, the bottom-up processes include spectral and spectro-temporal decomposition of the sound by the cochlea and the auditory cortex. In this paper, we propose a neural network, which includes spectro-temporal multi-resolution decomposition of the log-spectrogram of the sound and a semantic segmentation model to respectively address the bottom-up and top-down processing of hearing, for singing melody extraction. Simulation results show the proposed model outperforms all previously proposed methods, emphasizing either bottom-up or top-down processing, in almost all objective evaluation metrics.

Original language: English
Title of host publication: 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 551-555
Number of pages: 5
ISBN (Electronic): 9781509066315
Publication status: Published - May 2020
Event: 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Barcelona, Spain
Duration: 4 May 2020 to 8 May 2020

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
2020-May
ISSN (Print): 1520-6149

Conference

Conference: 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020
Country: Spain
City: Barcelona
Period: 4/05/20 to 8/05/20
