Language identification using parallel syllable-like unit recognition

Abstract

Automatic spoken language identification (LID) is the task of identifying the language of a short spoken utterance. The most successful approach to LID runs phone recognizers for several languages in parallel. Building such a parallel phone recognition (PPR) system requires annotated corpora. We propose a novel approach to LID that uses parallel syllable-like unit recognizers in a framework similar to the PPR approach in the literature. The difference is that the syllable models are built from the training data without supervision. The data is first segmented into syllable-like units, and the syllable segments are then clustered using an incremental approach. This yields a set of syllable models for each language. Our initial results on the OGI MLTS corpora show a performance of 69.5%. We further show that if only the subset of syllable models that are unique (in some sense) is considered, the performance improves to 75.9%.
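
The parallel-recognizer decision described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each syllable-like segment has already been reduced to a fixed-length feature vector, and it stands in a unit-variance Gaussian for each syllable model, whereas the abstract does not specify the model type. The names `log_gauss`, `language_score`, `identify_language`, and the toy data are hypothetical.

```python
import math

def log_gauss(x, mean, var=1.0):
    """Log-density of x under an isotropic Gaussian (a stand-in for one syllable model)."""
    d = len(x)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, mean))
    return -0.5 * (d * math.log(2 * math.pi * var) + sq_dist / var)

def language_score(segments, models):
    """Score an utterance under one language: for each segment take the
    best-matching syllable model, then sum the log-scores over segments."""
    return sum(max(log_gauss(seg, m) for m in models) for seg in segments)

def identify_language(segments, models_by_language):
    """Run every language's recognizer on the same segments and pick the
    language whose model set gives the highest total score."""
    scores = {
        lang: language_score(segments, models)
        for lang, models in models_by_language.items()
    }
    return max(scores, key=scores.get), scores

# Toy data (assumed): two languages, each with two syllable-model means.
toy_models = {
    "lang_a": [[0.0, 0.0], [1.0, 1.0]],
    "lang_b": [[5.0, 5.0], [6.0, 4.0]],
}
# Two syllable-like segments from a test utterance, as 2-D feature vectors.
toy_segments = [[0.1, -0.2], [0.9, 1.1]]

best, scores = identify_language(toy_segments, toy_models)
```

In this sketch the recognizers for all languages see the same segmented utterance and compete on total score. Restricting each language to a subset of its models, as the abstract suggests for the unique syllable models, would amount to passing a shorter model list per language.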
