"Maximum a Posteriori Speech Enhancement Based on Double Spectrum"
Pejman Mowlaee, Daniel Scheran, Johannes Stahl, Sean Wood, Bastiaan Kleijn
- Audio samples -
Below, we present audio samples that compare the proposed maximum a posteriori (MAP) speech estimator in the double spectrum (DS-MAP) with benchmark methods. The benchmarks are the MMSE-STSA method in the acoustic frequency domain [1], modulation spectral subtraction (ModSpecSub) in the short-time spectral modulation (STSM) domain [2], and DS-Wiener [3].
[1] Y. Ephraim and D. Malah, “Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator,” IEEE Trans. Acoust., Speech, Signal Process., vol. 32, no. 6, pp. 1109–1121, Dec. 1984.
[2] K. Paliwal, K. Wojcicki, and B. Schwerin, “Single-channel speech enhancement using spectral subtraction in the short-time modulation domain,” Speech Communication, vol. 52, no. 5, pp. 450–475, May 2010.
[3] P. Mowlaee, M. Blass, and B. Kleijn, “New results in modulation-domain single-channel speech enhancement,” IEEE/ACM Trans. Audio, Speech, Lang. Process., vol. 25, no. 11, pp. 2125–2137, Nov. 2017.
Female speech in babble noise at SNR = 5 dB:
Clean | Noisy | ModSpecSub | MMSE-STSA | DS-Wiener | DS-MAP
Male speech in babble noise at SNR = 5 dB:
Clean | Noisy | ModSpecSub | MMSE-STSA | DS-Wiener | DS-MAP
Clean | Noisy | ModSpecSub | MMSE-STSA | DS-Wiener | DS-MAP
Clean | Noisy | ModSpecSub | MMSE-STSA | DS-Wiener | DS-MAP