"Binaural Codebook-based Speech Enhancement with Atomic Speech Presence Probability"

Sean Wood, Johannes Stahl, Pejman Mowlaee

- Audio samples -



Below, we present audio samples that demonstrate the proposed binaural speech enhancement approach based on atomic speech presence probability (ASPP). The benchmark methods are Model-based EM Source Separation and Localization (MESSL) [1] and RANdom SAmple Consensus (RANSAC) [2]. Each example lists the clean and noisy signals, followed by the outputs of the proposed method, MESSL, and RANSAC.

  • [1] M. I. Mandel, R. J. Weiss, and D. P. Ellis, “Model-based expectation-maximization source separation and localization,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, no. 2, pp. 382–394, 2010.

  • [2] J. Traa and P. Smaragdis, “Multichannel source separation and tracking with RANSAC and directional statistics,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 22, no. 12, pp. 2233–2243, 2014.
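
The samples on this page are noisy mixtures at SNRs of 0 dB and -5 dB. The page does not state how the mixtures were generated, so the following is only a minimal sketch of one common way to mix a binaural speech signal with binaural noise at a target SNR. The function name `mix_at_snr`, the array layout (samples by channels), and the choice of a single gain shared by both ears are assumptions made for illustration, not the authors' procedure.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech` so that the overall SNR equals `snr_db`.

    Both inputs are arrays of shape (samples,) or (samples, channels).
    One gain is applied to all noise channels, which leaves the
    interaural level and phase differences of the noise unchanged.
    Returns the noisy mixture and the gain applied to the noise.
    """
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)
    if noise.shape[0] < speech.shape[0]:
        raise ValueError("noise must be at least as long as speech")
    noise = noise[: speech.shape[0]]

    # Average power over all samples and channels.
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)

    # Solve 10*log10(speech_power / (gain**2 * noise_power)) = snr_db for gain.
    gain = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise, gain


if __name__ == "__main__":
    # Synthetic stand-ins for a two-channel utterance and noise recording.
    rng = np.random.default_rng(0)
    speech = rng.standard_normal((16000, 2))
    noise = rng.standard_normal((20000, 2))
    noisy, gain = mix_at_snr(speech, noise, snr_db=-5.0)
    print(noisy.shape, gain)
```

Measuring power over the whole signal is the simplest convention; other conventions, such as measuring speech power only over active speech segments, would give a different gain for the same nominal SNR.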



Female speech (dr6_msds0_si1707) in ventilation noise at SNR = 0 dB:

IPD-ILD-ICM weighting factors: [0.1508, 0.1902, 0.6591]
sigma = 2.2361, kappa = 3.1623, alpha = 31.6228, beta = 1

Clean
Noisy
Proposed
MESSL
RANSAC

Female speech (dr7_fjsk0_si1682) in babble noise at SNR = 0 dB:

IPD-ILD-ICM weighting factors: [0.2960, 0.3317, 0.3723]
sigma = 2.2361, kappa = 3.1623, alpha = 31.6228, beta = 1

Clean
Noisy
Proposed
MESSL
RANSAC

Male speech (dr8_mmea0_si758_Babble_-5 dB) in babble noise at SNR = -5 dB:

IPD-ILD-ICM weighting factors: [0.1253, 0.1759, 0.6988]
sigma = 2.2361, kappa = 3.1623, alpha = 31.6228, beta = 1

Clean
Noisy
Proposed
MESSL
RANSAC

Male speech (dr8_mmea0_si758) in ventilation noise at SNR = -5 dB:

IPD-ILD-ICM weighting factors: [0.0376, 0.3849, 0.5774]
sigma = 2.2361, kappa = 3.1623, alpha = 31.6228, beta = 1

Clean
Noisy
Proposed
MESSL
RANSAC

Male speech (dr8_mmea0_si1388) in babble noise at SNR = 0 dB:

IPD-ILD-ICM weighting factors: [0.3263, 0.1967, 0.4770]
sigma = 2.2361, kappa = 3.1623, alpha = 31.6228, beta = 1

Clean
Noisy
Proposed
MESSL
RANSAC