JAPED Home · Issue Contents

A Signal Processing Framework for Hindi Digit Recognition with Kaldi ASR
Prashant Upadhyay, Anupam Vyas, Atul Makraiya and Nishtha Vyas

This paper presents a performance analysis of a Hindi digit recognition model using MFCC and PLP features. Two approaches are compared. In the first, results were computed with the LDA classifier in MATLAB; in the second, results were computed with the Kaldi ASR toolkit. In the first approach, the MFCC and PLP features achieved their best accuracies, 78.53% and 77.53% respectively, with the quadratic classifier. With Kaldi ASR, the best accuracies were 99.17% for MFCC and 98.75% for PLP features, obtained using the bigram language model. These results show that MFCC features provide better sensitivity to the speech signal. In addition, robust feature processing techniques such as CMVN and better handling of the acoustic and language models in Kaldi ASR yielded higher recognition accuracy. This shows why the Kaldi ASR toolkit has become the state of the art for researchers: it is effective at extracting acoustic features and helps in developing more accurate ASR systems.

Keywords: Hindi digit, Kaldi toolkit, MFCC, PLP, LDA, acoustic model
