Publication | Closed Access
Exploring big educational learner corpora for SLA research
Citations: 48
References: 22
Year: 2015
Second Language Learning, Relative Clauses, Education, Language Learning, Corpus Linguistics, Language Processing, Natural Language Processing, Second Language Acquisition, Syntax, Collaborative Learning, Computational Linguistics, Language Acquisition, Grammar, Language Studies, Open Access Database, SLA Research, Natural Language, NLP Task, Language Technology, Student-centered Learning, Learning Analytics, Language Corpus, Data-driven Learning, Developmental Trajectory, Linguistics
We consider the opportunities presented by big educational learner corpora for Second Language Acquisition (SLA). In particular, we focus on the EF Cambridge Open Language Database (EFCAMDAT), an open access database of student writings submitted to Englishtown, the online school of EF Education First. EFCAMDAT stands out for its size (33 million words, 85 thousand learners) and its range of 128 writing tasks covering all CEFR levels, with data from learners of various nationalities. We discuss methodological issues arising from analyzing big data resources generated in educational contexts and argue that Natural Language Processing (NLP) is essential for the automated processing of such datasets. As a case study, we follow the developmental trajectory of relative clauses, a construction that necessitates deeper syntactic analysis. We consider specific issues that can affect the developmental trajectory, including task effects, formulaic language and national language effects.