Publication | Closed Access
TBALL data collection: the making of a young children's speech corpus
Citations: 49
References: 9
Year: 2005
Unknown Venue
Engineering, Speech Corpus, Multilingualism, Spoken Language Processing, Communication, Language Learning, Corpus Linguistics, Speech Recognition, Applied Linguistics, Natural Language Processing, Language Documentation, Data Collection, Computational Linguistics, Language Testing, Language Acquisition, Language Studies, TBALL Project, Machine Translation, Language Technology, TBALL Data Collection, Data Collection Methodology, Speech Communication, Speech Technology, Speech Analysis, Language Corpus, Speech Processing, Young Children, Data-driven Learning, Computer-assisted Language Learning, Linguistics
In this paper we describe the data collection for the TBALL project (Technology-Based Assessment of Language and Literacy) and report the results of our efforts. We focus on the aspects of our corpus that distinguish it from currently available corpora: the speakers are children in grades K-4 who are learning to read, most of them are nonnative speakers of English, and they come from diverse socio-economic backgrounds. We describe how we adapted our recording setup, data collection methodology, and transcription scheme to accommodate these differences. Finally, we discuss the task this corpus was designed to serve and our research approach.