Machine learning-based efficient audio production separation method

Abstract

Audio production separation, extracting individual sound sources from a mixture signal, has numerous applications in audio processing, audio remixing, and hearing aids. However, most existing methods only utilize spectral information and neglect spatial cues available in multi-microphone setups, limiting their performance. This paper proposes a novel audio production separation algorithm combining hyperdirectional beamforming and a long short-term memory (LSTM) network to exploit spatial and spectral information for efficient multi-speaker audio production separation. The hyperdirectional beamforming enhances target audio signals from desired directions while suppressing interference. The enhanced signals are processed by an LSTM network that predicts time-frequency masks for separating individual sources using a multi-task learning objective. Extensive experiments on simulated and real-world datasets demonstrate the superiority of the proposed algorithm over benchmark algorithms in terms of objective metrics across various acoustic conditions. Subjective listening tests with human participants further validate the proposed algorithm's improved perceptual quality and intelligibility. An ablation study highlights the importance of both hyperdirectional beamforming and LSTM components, as well as their synergistic effect. The proposed algorithm offers a practical approach for exploiting spatial and spectral information in multi-speaker audio production separation, with potential applications in teleconferencing, hearing aids, and audio signal processing.
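The pipeline described in the abstract (spatial enhancement followed by time-frequency masking) can be illustrated with a minimal sketch. This is not the authors' implementation: a plain frequency-domain delay-and-sum beamformer stands in for the paper's hyperdirectional beamformer, and the masks are taken as given rather than predicted by an LSTM. The function names, array shapes, and the far-field delay model are all assumptions made for illustration.

```python
import numpy as np


def delay_and_sum(stft_channels, delays, freqs):
    """Steer a microphone array toward one direction (stand-in beamformer).

    stft_channels: complex array, shape (mics, freqs, frames)
    delays:        arrival delay of the target at each mic, in seconds
    freqs:         centre frequency of each STFT bin, in Hz
    """
    # A delay of tau seconds multiplies bin f by exp(-2j*pi*f*tau);
    # the conjugate phase undoes it so the target adds coherently.
    steering = np.exp(2j * np.pi * freqs[None, :] * delays[:, None])
    return np.mean(steering[:, :, None] * stft_channels, axis=0)


def apply_masks(beamformed, masks):
    """Split the beamformed STFT into per-source STFTs.

    masks: real array in [0, 1], shape (sources, freqs, frames).
    In the paper these would come from the LSTM; here they are inputs.
    """
    return masks * beamformed[None, :, :]
```

Each masked STFT would then be inverted back to a waveform with an inverse short-time Fourier transform; that step is omitted here.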

Keywords

Audio production separation; Hyperdirectional beamforming; Long short-term memory network; Short-time Fourier transform
Title
Machine learning-based efficient audio production separation method
Authors
Zhang, Wenzhu; Kim, Byung-Gyu
DOI
10.1007/s13042-025-02542-y
Publication date
2025-11
Type
Article; Early Access
Journal
International Journal of Machine Learning and Cybernetics, vol. 16, no. 11
Pages
8603-8616