Speech Enhancement based on DNN (Spectral-Mapping, TF-Masking), DNN-NMF, NMF
- | Time domain | Frequency domain |
---|---|---|
Clean | ![]() | ![]() |
Noisy | ![]() | ![]() |
NMF | ![]() | ![]() |
NMF-DNN | ![]() | ![]() |
DNN (TF-Masking) | ![]() | ![]() |
DNN (Spectral-Mapping) | ![]() | ![]() |
Clean speech data was taken from the TIMIT database. We selected 62 utterances for the training set and 10 utterances for the test set. PESQ scores would likely improve further with the full TIMIT dataset. The factory, babble, and machinegun noises from the NOISEX-92 database were used for both training and testing.
Ten utterances from 5 male and 5 female speakers were used for performance evaluation. All models were trained on the 3 noise types (factory, babble, machinegun) pooled together, to examine whether the proposed algorithm can learn the characteristics of several source types simultaneously.
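
As a rough illustration of what pooling noise types means in practice, the sketch below mixes each clean utterance with each noise type and collects the results in one set. It is not the code used in this repository: the helper names `mix_at_snr` and `build_pooled_set`, the 5 dB SNR, and the random arrays standing in for TIMIT and NOISEX-92 audio are all assumptions made for the example.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so the clean-to-noise power ratio equals `snr_db`, then add it."""
    # Tile or trim the noise so it covers the whole utterance.
    reps = int(np.ceil(len(clean) / len(noise)))
    noise = np.tile(noise, reps)[: len(clean)]
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    # Gain that brings the noise to the power required by the target SNR.
    scale = np.sqrt(clean_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return clean + scale * noise


def build_pooled_set(utterances, noises, snr_db):
    """Mix every utterance with every noise type and pool the results in one list."""
    pooled = []
    for clean in utterances:
        for name, noise in noises.items():
            pooled.append((name, clean, mix_at_snr(clean, noise, snr_db)))
    return pooled


# Placeholder signals (assumption): random arrays instead of real recordings.
rng = np.random.default_rng(0)
utterances = [rng.standard_normal(16000) for _ in range(2)]
noises = {
    name: rng.standard_normal(8000)
    for name in ("factory", "babble", "machinegun")
}

# The SNR value is an assumption; the text above does not state one.
pooled = build_pooled_set(utterances, noises, snr_db=5.0)
```

Each entry of `pooled` holds the noise name, the clean target, and the noisy input, so a single model sees all three noise types during training rather than one model per noise.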