Publication: Identifying perceptually similar languages using Teager energy based cepstrum
Abstract
Language Identification (LID) refers to the task of identifying the unknown language of a test utterance. In this paper, a new feature set, viz., T-MFCC, is developed by amalgamating the Teager Energy Operator (TEO) and the well-known Mel frequency cepstral coefficients (MFCC). The effectiveness of the newly derived feature set is demonstrated for identifying perceptually similar Indian languages such as Hindi and Urdu. A modified polynomial classifier structure with 2nd- and 3rd-order approximation is used for the LID problem. The results are compared with the state-of-the-art feature set, viz., MFCC, and T-MFCC is found to be effective (an average jump of 21.66%) in the majority of cases. This may be because T-MFCC represents the combined effect of airflow properties in the vocal tract (which are known to be language and speaker dependent) and the human perception process for hearing.
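For readers unfamiliar with the TEO, the following is a minimal sketch of the standard discrete-time Teager energy operator, psi[n] = x[n]^2 - x[n-1] * x[n+1]. It is an illustration only, not the authors' implementation: the abstract does not say at which stage of the MFCC pipeline the operator is applied, and the function name `teager_energy` and the test-tone parameters are assumptions made for this example.

```python
import numpy as np

def teager_energy(x):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    x = np.asarray(x, dtype=float)
    if x.size < 3:
        raise ValueError("need at least 3 samples")
    # The operator needs both neighbours of each sample, so it is defined
    # for interior samples only and the output is 2 samples shorter.
    return x[1:-1] ** 2 - x[:-2] * x[2:]

# Sanity check on a pure tone A*cos(omega*n) (hypothetical values):
# the operator should return the constant A^2 * sin(omega)^2.
amplitude, omega = 2.0, 0.3
n = np.arange(200)
tone = amplitude * np.cos(omega * n)
psi = teager_energy(tone)
```

For a pure tone the output is constant and depends on both amplitude and frequency, which is why the operator is often described as tracking the energy of the source that produced the signal rather than the signal's squared amplitude alone.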