"Binaural Codebook-based Speech Enhancement with Atomic Speech Presence Probability"

Sean Wood, Johannes Stahl, Pejman Mowlaee

- Audio samples -

Below, we present audio samples demonstrating the proposed binaural speech enhancement approach based on atomic speech presence probability (ASPP). The benchmark methods are Model-based EM Source Separation and Localization (MESSL) [1] and RANdom SAmple Consensus (RANSAC) [2].

[1] M. I. Mandel, R. J. Weiss, and D. P. Ellis, “Model-based expectation-maximization source separation and localization,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, no. 2, pp. 382–394, 2010.
[2] J. Traa and P. Smaragdis, “Multichannel source separation and tracking with RANSAC and directional statistics,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 22, no. 12, pp. 2233–2243, 2014.
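
Each sample below is labelled with a signal-to-noise ratio (SNR) in dB. This page does not state how the mixtures were generated, so the following is only a minimal sketch of one common way to reach a target SNR: scale the noise so that the ratio of speech power to noise power matches the requested value. The function names and the single-channel toy signals are illustrative assumptions, not the authors' procedure or data.

```python
import math
import random


def scale_noise_to_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power / noise power equals `snr_db` (in dB)."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if speech_power == 0 or noise_power == 0:
        raise ValueError("signals must have non-zero power")
    # Required noise power is speech_power / 10^(snr_db / 10).
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [gain * n for n in noise]


def measured_snr_db(speech, noise):
    """Return the SNR in dB of `speech` relative to `noise`."""
    speech_energy = sum(s * s for s in speech)
    noise_energy = sum(n * n for n in noise)
    return 10 * math.log10(speech_energy / noise_energy)


# Toy stand-ins (assumption): a sine tone as "speech", white Gaussian "noise".
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

for target_db in (0.0, -5.0):
    scaled = scale_noise_to_snr(speech, noise, target_db)
    noisy = [s + n for s, n in zip(speech, scaled)]  # the mixture one would listen to
    print(target_db, round(measured_snr_db(speech, scaled), 6))
```

For binaural recordings, the SNR additionally has to be defined with respect to a channel or a channel average; that choice is not specified here and is left out of the sketch.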

Female speech (dr6_msds0_si1707) in ventilation noise at SNR = 0 (dB):

IPD-ILD-ICM weighting factors: [0.1508,0.1902,0.6591]
sigma=2.2361,kappa=3.1623,alpha=31.6228,beta=1

Clean -->	Noisy -->
Proposed -->	MESSL -->
RANSAC -->

Female speech (dr7_fjsk0_si1682) in babble noise at SNR = 0 (dB):

IPD-ILD-ICM weighting factors: [0.2960,0.3317,0.3723]
sigma=2.2361,kappa=3.1623,alpha=31.6228,beta=1

Clean -->	Noisy -->
Proposed -->	MESSL -->
RANSAC -->

Male speech (dr8_mmea0_si758) in babble noise at SNR = -5 (dB):

IPD-ILD-ICM weighting factors: [0.1253,0.1759,0.6988]
sigma=2.2361,kappa=3.1623,alpha=31.6228,beta=1

Clean -->	Noisy -->
Proposed -->	MESSL -->
RANSAC -->

Male speech (dr8_mmea0_si758) in ventilation noise at SNR = -5 (dB):

IPD-ILD-ICM weighting factors: [0.0376,0.3849,0.5774]
sigma=2.2361,kappa=3.1623,alpha=31.6228,beta=1

Clean -->	Noisy -->
Proposed -->	MESSL -->
RANSAC -->

Male speech (dr8_mmea0_si1388) in babble noise at SNR = 0 (dB):

IPD-ILD-ICM weighting factors: [0.3263,0.1967,0.4770]
sigma=2.2361,kappa=3.1623,alpha=31.6228,beta=1

Clean -->	Noisy -->
Proposed -->	MESSL -->
RANSAC -->
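
The IPD-ILD-ICM weighting factors listed for the five samples can be checked for basic consistency. The short sketch below copies the values from this page and reports, per sample, the sum of the three weights and which cue receives the largest weight. How the weights enter the ASPP estimate is defined in the paper, not here; the dictionary labels are shorthand introduced for this sketch.

```python
CUES = ("IPD", "ILD", "ICM")

# Weighting factors as listed on this page, in IPD-ILD-ICM order.
weights = {
    "dr6_msds0_si1707, ventilation, 0 dB": (0.1508, 0.1902, 0.6591),
    "dr7_fjsk0_si1682, babble, 0 dB": (0.2960, 0.3317, 0.3723),
    "dr8_mmea0_si758, babble, -5 dB": (0.1253, 0.1759, 0.6988),
    "dr8_mmea0_si758, ventilation, -5 dB": (0.0376, 0.3849, 0.5774),
    "dr8_mmea0_si1388, babble, 0 dB": (0.3263, 0.1967, 0.4770),
}


def dominant_cue(w):
    """Return the name of the cue with the largest weight."""
    return CUES[w.index(max(w))]


for label, w in weights.items():
    print(f"{label}: sum = {sum(w):.4f}, largest weight = {dominant_cue(w)}")
```

With the values as printed, each triple sums to one up to four-decimal rounding, and the ICM weight is the largest in all five samples.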