"Exploiting Temporal Correlation in Pitch-Adaptive Speech Enhancement"

Johannes Stahl, Pejman Mowlaee


- Audio samples -


Below, we present audio samples demonstrating the impact of the proposed pitch-adaptive decision-directed (PADDi) method and the pitch-adaptive complex-valued Kalman filter (PACO). The samples consist of utterances spoken by male and female speakers, corrupted by different noise types.

Female speaker: ''Objects made of pewter are beautiful.'' in white noise, SNR = 10 dB:

Male speaker: ''She wore warm, fleecy, woolen overalls.'' in pink modulated noise, SNR = 0 dB:

Male speaker: ''Where were you while we were away?'' in factory noise, SNR = 5 dB:

Male speaker: ''Why buy oil when you always use mine?'' in babble noise, SNR = 5 dB:

Female speaker: ''Why yell or worry over silly items?'' in rain noise, SNR = 10 dB:

Listening example: ''He will allow a rare lie.'' in babble noise, SNR = 5 dB:

Listening example: ''Please dig up my potatoes before frost.'' in babble noise, SNR = 10 dB:

Listening example: ''The clumsy customer spills some expensive perfume.'' in factory noise, SNR = 5 dB:

Listening example: ''Pretty soon a woman came along carrying a folded umbrella as a walking stick.'' in factory noise, SNR = 10 dB:

Female speaker in a bus as an example of a real-world scenario. The recording was part of the CHiME-4 challenge (Emmanuel Vincent, Shinji Watanabe, Aditya Arie Nugraha, Jon Barker, and Ricard Marxer, "An analysis of environment, microphone and data simulation mismatches in robust speech recognition", Computer Speech and Language, 2016):

Female speaker in a cafe as an example of a real-world scenario. The recording was also part of the CHiME-4 challenge (Vincent et al., Computer Speech and Language, 2016):