ENGINEERING AND APPLIED SCIENCE RESEARCH
Volume 46, No. 01, January 2019, Pages 56-63
Bangla dataset and MMFCC in text-dependent speaker identification
Md Atiqul Islam, An-Nazmus Sakib
Abstract
Automatic Speaker Identification (SID) is a challenging research topic that is typically based on either text-dependent or text-independent speech materials. Automatic SID systems are generally designed around English speech. The main goal of this study is to present a text-dependent dataset based on Bangla speech. We explored three feature extractors as front-end processors: the Mel-frequency Cepstral Coefficient (MFCC), the Gammatone Frequency Cepstral Coefficient (GFCC), and a newly developed feature, the Modified MFCC (MMFCC). SID accuracies were simulated under clean and noisy conditions. Four types of noise were added to the clean signals to generate noisy signals over a range of signal-to-noise ratios (SNRs) from -5 dB to 15 dB. A standard dataset based on English speech is also presented so that its SID accuracies can be compared with those of the Bangla dataset. The second goal of this study is to examine MMFCC and introduce it as a novel feature for text-dependent SID systems. The results show that the MMFCC-based method significantly outperforms the MFCC- and GFCC-based methods under noisy conditions and produces comparable results in a clean environment.
Keywords: Bangla dataset, UM dataset, SID system, MMFCC, GFCC, MFCC, Robust performance
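
The abstract describes generating noisy signals by adding noise to clean speech at SNRs from -5 dB to 15 dB. The sketch below illustrates one common way to do that with NumPy; it is not taken from the paper. The function name `mix_at_snr`, the 5 dB step between SNR levels, the synthetic test tone, and the use of white Gaussian noise are all assumptions made for illustration, since the abstract does not specify the mixing procedure, the step size, or the noise types.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db):
    """Return clean + scaled noise so the mixture has the requested SNR (dB).

    Hypothetical helper for illustration; not the authors' implementation.
    """
    clean = np.asarray(clean, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Repeat the noise if it is shorter than the speech, then trim to length.
    if len(noise) < len(clean):
        reps = int(np.ceil(len(clean) / len(noise)))
        noise = np.tile(noise, reps)
    noise = noise[: len(clean)]

    # Average power of each signal.
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)

    # Solve SNR_dB = 10 * log10(p_clean / (gain^2 * p_noise)) for gain.
    gain = np.sqrt(p_clean / (p_noise * 10.0 ** (snr_db / 10.0)))
    return clean + gain * noise


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(8000) / 8000.0
    clean = np.sin(2 * np.pi * 220.0 * t)   # stand-in for a speech signal
    noise = rng.standard_normal(3000)       # stand-in for one noise type

    # Assumed 5 dB steps across the range stated in the abstract.
    for snr_db in (-5, 0, 5, 10, 15):
        noisy = mix_at_snr(clean, noise, snr_db)
        achieved = 10 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
        print(f"target {snr_db:>3} dB, achieved {achieved:.2f} dB")
```

Scaling the noise rather than the speech keeps the clean signal level fixed across conditions, so differences in SID accuracy can be attributed to the noise level alone.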