ZHENG Fang, ZHANG Guoliang, SONG Zhanjiang. Comparison of Different Implementations of MFCC[J]. Journal of Computer Science and Technology, 2001, 16(6).

Comparison of Different Implementations of MFCC

  • The performance of the Mel-Frequency Cepstrum Coefficients (MFCC) may be affected by (1) the number of filters, (2) the shape of the filters, (3) the way in which the filters are spaced, and (4) the way in which the power spectrum is warped. In this paper, several comparison experiments are carried out to find the best implementation. The traditional MFCC calculation excludes the 0th coefficient because it is regarded as somewhat unreliable. Based on analysis and experiments, the authors find that it can be regarded as the generalized frequency band energy (FBE) and is hence useful, which results in the FBE-MFCC. The authors also propose a better analysis of the frame energy, namely the auto-regressive analysis, which outperforms its 1st- and/or 2nd-order differential derivatives. Experiments with the "863" Speech Database show that, compared with the traditional MFCC with its corresponding auto-regressive analysis coefficients, the FBE-MFCC and the frame energy with their corresponding auto-regressive analysis coefficients form the best combination, reducing the Chinese syllable error rate (CSER) by about 10%, while the FBE-MFCC with its corresponding auto-regressive analysis coefficients alone reduces the CSER by 2.5%. Comparison experiments are also done with a rather casual Chinese speech database, the Chinese Annotated Spontaneous Speech (CASS) corpus, where the FBE-MFCC reduces the error rate by about 2.9% on average.
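As a rough illustration of the design choices the abstract lists, the following minimal sketch computes MFCCs for a single frame with a configurable number of triangular filters spaced evenly on a mel scale, and optionally keeps the 0th coefficient. It is not the authors' implementation: the function names, the mel formula `2595 * log10(1 + f / 700)`, the Hamming window, the default of 24 filters and 12 cepstral coefficients, and the unnormalized DCT-II are all assumptions chosen for illustration. Under these assumptions the 0th coefficient equals the sum of the log filter-bank energies, which is one way to see why it can be read as a band-energy term.

```python
import numpy as np

def hz_to_mel(f):
    # Assumed mel warping; the paper compares several warpings.
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    edges_hz = mel_to_hz(
        np.linspace(0.0, hz_to_mel(sample_rate / 2.0), num_filters + 2)
    )
    freqs = np.arange(n_fft // 2 + 1) * sample_rate / n_fft
    bank = np.zeros((num_filters, freqs.size))
    for m in range(num_filters):
        lo, mid, hi = edges_hz[m], edges_hz[m + 1], edges_hz[m + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        # Triangle: min of the two slopes, clipped at zero outside [lo, hi].
        bank[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def mfcc_frame(frame, sample_rate, num_filters=24, num_ceps=12, keep_c0=True):
    """Return (cepstral coefficients, log filter-bank energies) for one frame."""
    n_fft = len(frame)
    power = np.abs(np.fft.rfft(frame * np.hamming(n_fft))) ** 2
    bank = mel_filterbank(num_filters, n_fft, sample_rate)
    log_energy = np.log(bank @ power + 1e-10)  # small floor avoids log(0)

    # Unnormalized DCT-II; row k = 0 is all ones, so c0 = sum(log_energy).
    m = np.arange(num_filters)
    k = np.arange(num_ceps + 1)[:, None]
    basis = np.cos(np.pi * k * (m + 0.5) / num_filters)
    ceps = basis @ log_energy

    if not keep_c0:
        ceps = ceps[1:]  # traditional variant: drop the 0th coefficient
    return ceps, log_energy
```

Calling `mfcc_frame(frame, 16000)` on a 400-sample frame returns 13 coefficients (c0 through c12); passing `keep_c0=False` returns the traditional 12. Changing `num_filters` or swapping out `hz_to_mel` is how one would vary factors (1) and (4) in this sketch.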
