Spoken Language Recognition: From Fundamentals to Practice

TLDR

Spoken language recognition automatically identifies the language of a speech sample, and recent decades have seen significant progress driven by advances in signal processing, pattern recognition, cognitive science, and machine learning. The paper presents a computational framework for making that decision quantitatively, offers an introductory tutorial on the theory and state-of-the-art solutions, and reviews current trends and future research directions using NIST's language recognition evaluation (LRE) as case studies.

Abstract

Spoken language recognition refers to the automatic process through which we determine or verify the identity of the language spoken in a speech sample. We study a computational framework that allows such a decision to be made in a quantitative manner. In recent decades, we have made tremendous progress in spoken language recognition, which has benefited from technological breakthroughs in related areas, such as signal processing, pattern recognition, cognitive science, and machine learning. In this paper, we attempt to provide an introductory tutorial on the fundamentals of the theory and the state-of-the-art solutions, from both phonological and computational aspects. We also give a comprehensive review of current trends and future research directions using the language recognition evaluation (LRE) formulated by the National Institute of Standards and Technology (NIST) as case studies.
