MITalk-79: The 1979 MIT text-to-speech system

Abstract

To mark the completion of a ten-year effort to develop a high-performance text-to-speech algorithm, we have established a benchmark system called “MITalk.” Components of the computer-simulated benchmark include: (1) conversion of abbreviations and special text symbols, (2) a lexicon consisting of about 11 000 morphs with pronunciation and parts of speech, (3) morpheme analysis, (4) letter-to-sound rules, (5) syntactic analysis, (6) rules for stress assignment, boundary placement, and phonological recoding, (7) fundamental frequency and segmental duration prediction, (8) phonetic-to-parametric conversion, and (9) digital formant synthesis. The MITalk-79 system is being extensively documented and its performance is being evaluated. The presentation will summarize aspects of system organization and performance. (A more complete description will be given in a one-week course to be offered June 25–29, 1979.) The oral presentation will include a five-minute demonstration of synthetic speech generated from English text with absolutely no human intervention. While currently simulated on a large digital computer, MITalk-79 is amenable to practical IC technology. Implementation issues will be briefly discussed. [We gratefully acknowledge the synthesis-by-rule programs and advice provided by Dennis Klatt.]
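
As an illustration only, the nine components listed above can be pictured as a chain of stages that passes a representation of the text from one component to the next. The sketch below is not the MITalk-79 implementation: the function names, the placeholder stages, and the assumption that data flows strictly in the order the abstract lists the components are all hypothetical. Each stage here does nothing except record that it ran.

```python
from typing import Callable, List, Tuple

# Component names paraphrased from the abstract's numbered list.
# Treating the listing order as the processing order is an assumption.
COMPONENTS: List[str] = [
    "abbreviation and symbol conversion",
    "morph lexicon lookup",
    "morpheme analysis",
    "letter-to-sound rules",
    "syntactic analysis",
    "stress, boundary, and phonological rules",
    "fundamental frequency and duration prediction",
    "phonetic-to-parametric conversion",
    "digital formant synthesis",
]

Stage = Callable[[object], object]


def placeholder(name: str, trace: List[str]) -> Stage:
    """Build a hypothetical stage that logs its name and returns its input unchanged."""
    def run(data: object) -> object:
        trace.append(name)
        return data
    return run


def run_pipeline(text: str) -> Tuple[object, List[str]]:
    """Pass the text through every placeholder stage in listed order."""
    trace: List[str] = []
    data: object = text
    for name in COMPONENTS:
        data = placeholder(name, trace)(data)
    return data, trace


if __name__ == "__main__":
    _, stages_run = run_pipeline("Dr. Smith lives on Main St.")
    for index, name in enumerate(stages_run, start=1):
        print(f"({index}) {name}")
```

In a real system each placeholder would be replaced by a component that transforms its input (for example, text to morphs, or phonetic symbols to synthesizer parameters); the abstract does not specify those interfaces, so they are left out here.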