Publication | Closed Access
A 1000-word vocabulary, speaker-independent, continuous live-mode speech recognizer implemented in a single FPGA
Citations: 58
References: 11
Year: 2007
Venue: Unknown
Keywords: Engineering, Machine Learning, Silico Vox Project, Speech Recognition, Natural Language Processing, Phonetics, Computational Linguistics, 1000-Word Vocabulary, Robust Speech Recognition, Voice Recognition, Language Studies, Real-time Language, Computer Engineering, Computer Science, Single FPGA, Text-to-speech, Speech Communication, Speech Technology, Speech Processing, Speech Input, Speech Perception, Graphics Chips, Speech Interface, Carnegie Mellon
The Carnegie Mellon In Silico Vox project seeks to move best-quality speech recognition technology from its current software-only form into a range of efficient all-hardware implementations. The central thesis is that, as with graphics chips, the application is simply too performance-hungry and too power-sensitive to remain a large software application. As a first step in this direction, we describe the design and implementation of a fully functional speech-to-text recognizer on a single Xilinx XUP platform. The design recognizes a 1000-word vocabulary, is speaker-independent, recognizes continuous (connected) speech, and is a "live mode" engine, in which recognition can start as soon as speech input appears. To the best of our knowledge, this is the most complex recognizer architecture ever fully committed to a hardware-only form. The implementation is extraordinarily small and achieves the same accuracy as state-of-the-art software recognizers while running at a fraction of the clock speed.