Publication | Closed Access
Comparing Two Basic Methods for Discriminating Between Similar Languages and Varieties
Citations: 19
References: 16
Year: 2016
Naive Bayes Classifiers, Engineering, Cross-lingual Representation, Multilingualism, Variety (Linguistics), Language Variation, Comparative Method, Semantics, Corpus Linguistics, Text Mining, Applied Linguistics, Natural Language Processing, Language Documentation, Information Retrieval, Data Science, Data Mining, Computational Linguistics, Language Engineering, Document Classification, Linguistic Diversity, Linguistic Typology, Between Similar Languages, Language Studies, Basic Methods, Machine Translation, NLP Task, Knowledge Discovery, Cross-language Retrieval, Ranked Dictionaries, Citius_Ixa_Imaxin Team, Linguistics
This article describes the systems submitted by the Citius_Ixa_Imaxin team to the Discriminating Similar Languages Shared Task 2016. The systems are based on two different strategies: classification with ranked dictionaries and Naive Bayes classifiers. The evaluation results show that ranked dictionaries are sounder and more stable across different domains, whereas basic Bayesian models perform reasonably well on in-domain datasets but lose performance when applied to out-of-domain texts.
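
The abstract names the two strategies without giving details, so the following is only a minimal sketch of what each could look like, not the team's actual systems. It assumes whitespace tokenization, a ranked dictionary built from word frequencies in which unseen words receive a fixed penalty equal to the dictionary size, and a unigram Naive Bayes model with add-one smoothing. All function names, parameters, and the toy training sentences are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for a toy example.
    return text.lower().split()


def build_ranked_dictionary(texts, size=1000):
    # Map each of the `size` most frequent words to its rank (0 = most frequent).
    counts = Counter(tok for t in texts for tok in tokenize(t))
    return {word: rank for rank, (word, _) in enumerate(counts.most_common(size))}


def classify_ranked(text, dictionaries, penalty=1000):
    # Score = sum of the ranks of the text's words in each language dictionary.
    # Words missing from a dictionary cost `penalty`; the lowest total wins.
    def distance(lang):
        ranked = dictionaries[lang]
        return sum(ranked.get(tok, penalty) for tok in tokenize(text))
    return min(dictionaries, key=distance)


def train_naive_bayes(samples):
    # samples: {language: [training texts]}
    counts = {lang: Counter(tok for t in texts for tok in tokenize(t))
              for lang, texts in samples.items()}
    vocab_size = len(set().union(*counts.values()))
    n_docs = sum(len(texts) for texts in samples.values())
    priors = {lang: len(texts) / n_docs for lang, texts in samples.items()}
    return counts, vocab_size, priors


def classify_naive_bayes(text, model):
    counts, vocab_size, priors = model

    def log_prob(lang):
        total = sum(counts[lang].values())
        score = math.log(priors[lang])
        for tok in tokenize(text):
            # Add-one (Laplace) smoothing so unseen words do not zero the score.
            score += math.log((counts[lang][tok] + 1) / (total + vocab_size))
        return score

    return max(counts, key=log_prob)


# Toy, made-up training data for two similar languages.
samples = {
    "es": ["el perro come en la casa", "la casa es grande y el perro es pequeño"],
    "pt": ["o cachorro come na casa", "a casa é grande e o cachorro é pequeno"],
}
dictionaries = {lang: build_ranked_dictionary(texts) for lang, texts in samples.items()}
model = train_naive_bayes(samples)

ranked_guess = classify_ranked("el perro es grande", dictionaries)
bayes_guess = classify_naive_bayes("o cachorro é pequeno", model)
```

In this sketch the ranked-dictionary score depends only on word ranks and a fixed penalty, while the Naive Bayes score depends on probabilities estimated from the training data. That difference is one plausible reading of why the two approaches could behave differently on out-of-domain text, though the abstract itself does not give the reason.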