Welcome to Phase Processing Lab conducted by Pejman Mowlaee

Iterative Joint Maximum A Posteriori Single-Channel
Speech Enhancement Given Non-uniform Phase Prior

Pejman Mowlaee, Johannes Stahl, Josef Kulmer


Below, we present audio samples that demonstrate the impact of incorporating phase information in phase-aware speech enhancement. All results are for the fully blind scenario, in which both the phase and the amplitude are estimated directly from the noisy observation. For comparison, we also include the results obtained with the phase-blind enhancement method (SG-jMAP) [1] and with the benchmark minimum mean square error (MMSE) phase-aware speech enhancement method (CUP) [2]. Results are given for female and male speakers in modulated pink noise and in babble noise.
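The difference between phase-blind and phase-aware reconstruction can be illustrated with a minimal sketch. It assumes the conventional setup in which a phase-blind method pairs its amplitude estimate with the noisy phase, while a phase-aware method pairs it with an estimated clean phase. All values, the helper `reconstruct`, and the 0.1 rad phase error are hypothetical toy choices, not the estimators of [1], [2], or the proposed method.

```python
import numpy as np

def reconstruct(amplitude_est, phase):
    """Combine an amplitude estimate with a phase into complex spectral coefficients."""
    return amplitude_est * np.exp(1j * phase)

# Toy clean-speech and noise coefficients for three time-frequency bins (hypothetical).
clean = np.array([1.0 * np.exp(1j * 0.3),
                  0.5 * np.exp(-1j * 1.2),
                  0.8 * np.exp(1j * 2.0)])
noise = np.array([0.4 * np.exp(1j * 1.5),
                  0.3 * np.exp(1j * 0.4),
                  0.5 * np.exp(-1j * 0.7)])
noisy = clean + noise

# Assumption: the amplitude estimator is perfect, so only the phase differs.
amplitude_est = np.abs(clean)
# Assumption: the phase estimator is off by a small constant error of 0.1 rad.
phase_est = np.angle(clean) + 0.1

phase_blind = reconstruct(amplitude_est, np.angle(noisy))  # reuses the noisy phase
phase_aware = reconstruct(amplitude_est, phase_est)        # uses the estimated phase

# Mean squared error of each reconstruction against the clean coefficients.
err_blind = np.mean(np.abs(clean - phase_blind) ** 2)
err_aware = np.mean(np.abs(clean - phase_aware) ** 2)
print(f"phase-blind MSE: {err_blind:.4f}, phase-aware MSE: {err_aware:.4f}")
```

In this toy case the reconstruction with the estimated phase is closer to the clean coefficients only because the assumed phase error is smaller than the phase distortion caused by the noise; a poor phase estimate would reverse the outcome.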

[1] T. Lotter and P. Vary, “Speech enhancement by MAP spectral amplitude estimation using a super-Gaussian speech model,” EURASIP J. on Advances in Signal Processing, vol. 2005, no. 7, pp. 1110–1126, 2005.

[2] T. Gerkmann, “Bayesian estimation of clean speech spectral coefficients given a priori knowledge of the phase,” IEEE Trans. Signal Process., vol. 62, no. 16, pp. 4199–4208, Aug. 2014.

- Audio samples -

Female speech: ''The library has open shelves even in the unbound periodical stockroom'' in modulated pink noise at SNR = 5 dB:

For each sample, we report instrumental predictions of perceived quality and speech intelligibility using PESQ and STOI, respectively:


PESQ: Noisy = 2.16, phase-blind = 2.37, CUP = 2.49, Proposed = 2.67.

STOI: Noisy = 0.804, phase-blind = 0.810, CUP = 0.810, Proposed = 0.810.

Male speech: ''This will prevent flat falls and toe injuries'' in modulated pink noise at SNR = 0 dB:

PESQ: Noisy = 2.08, phase-blind = 2.23, CUP = 2.26, Proposed = 2.32.

STOI: Noisy = 0.680, phase-blind = 0.702, CUP = 0.699, Proposed = 0.706.

Male speech: ''They'll move around that rock all day, following the shade'' in babble noise at SNR = 5 dB:

PESQ: Noisy = 1.97, phase-blind = 2.12, CUP = 2.33, Proposed = 2.38.

STOI: Noisy = 0.607, phase-blind = 0.609, CUP = 0.617, Proposed = 0.601.
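
To compare the methods more directly, the scores listed above can be turned into improvements over the noisy input. The following is a small sketch of that arithmetic; the numbers are copied from the listings above, while the condition labels and the helper name `improvement_over_noisy` are chosen here for illustration only.

```python
# PESQ and STOI scores copied from the listings above.
# The condition labels are shorthand introduced for this sketch.
scores = {
    "female, modulated pink, 5 dB": {
        "PESQ": {"Noisy": 2.16, "phase-blind": 2.37, "CUP": 2.49, "Proposed": 2.67},
        "STOI": {"Noisy": 0.804, "phase-blind": 0.810, "CUP": 0.810, "Proposed": 0.810},
    },
    "male, modulated pink, 0 dB": {
        "PESQ": {"Noisy": 2.08, "phase-blind": 2.23, "CUP": 2.26, "Proposed": 2.32},
        "STOI": {"Noisy": 0.680, "phase-blind": 0.702, "CUP": 0.699, "Proposed": 0.706},
    },
    "male, babble, 5 dB": {
        "PESQ": {"Noisy": 1.97, "phase-blind": 2.12, "CUP": 2.33, "Proposed": 2.38},
        "STOI": {"Noisy": 0.607, "phase-blind": 0.609, "CUP": 0.617, "Proposed": 0.601},
    },
}

def improvement_over_noisy(scores):
    """Return score minus the noisy baseline for every method, condition, and measure."""
    deltas = {}
    for condition, measures in scores.items():
        deltas[condition] = {}
        for measure, values in measures.items():
            baseline = values["Noisy"]
            deltas[condition][measure] = {
                method: round(value - baseline, 3)
                for method, value in values.items()
                if method != "Noisy"
            }
    return deltas

deltas = improvement_over_noisy(scores)
for condition, measures in deltas.items():
    for measure, values in measures.items():
        line = ", ".join(f"{method} = {delta:+.3f}" for method, delta in values.items())
        print(f"{condition} | {measure} improvement: {line}")
```

A positive value means the method scores higher than the unprocessed noisy signal on that measure; a negative value means it scores lower.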