Publication | Open Access
But dictionaries are data too
Citations: 60
References: 3
Year: 1993
Venue: Unknown
Structured Vocabulary, Engineering, Multilingualism, Empiricist Approaches, Multilingual Pretraining, Semantic Web, Semantics, Corpus Linguistics, Applied Linguistics, Natural Language Processing, Language Documentation, Data Science, Bilingual Dictionaries, Computational Linguistics, Lexicography, Data Integration, Language Studies, Data Management, Machine-readable Dictionary, Machine Translation, Computer-assisted Translation, Neural Machine Translation, Linguistics
Although empiricist approaches to machine translation depend vitally on data in the form of large bilingual corpora, bilingual dictionaries are also a source of information. We show how to model at least a part of the information contained in a bilingual dictionary so that we can treat a bilingual dictionary and a bilingual corpus as two facets of a unified collection of data from which to extract values for the parameters of a probabilistic machine translation system. We give an algorithm for obtaining maximum likelihood estimates of the parameters of a probabilistic model from this combined data and we show how these parameters are affected by inclusion of the dictionary for some sample words.
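The idea of pooling a dictionary with a corpus can be illustrated with a small sketch. The abstract does not name the probabilistic model, so the code below substitutes a simple word-translation model in the style of IBM Model 1, trained by expectation-maximization, and makes the simplifying assumption that each dictionary entry is just one more very short sentence pair. The function name `train_model1`, the toy French-English data, and the iteration count are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def train_model1(pairs, iterations=5):
    """EM estimates of word-translation probabilities t[(f, e)] = P(f | e).

    `pairs` is a list of (source_words, target_words) tuples. Dictionary
    entries are passed in the same form as corpus sentence pairs
    (an assumption made for this sketch).
    """
    src_vocab = {f for fs, _ in pairs for f in fs}
    uniform = 1.0 / len(src_vocab)
    # Start from a uniform distribution over source words.
    t = defaultdict(lambda: uniform)

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of f aligned to e
        total = defaultdict(float)  # expected count of e being aligned at all

        # E-step: distribute each source word over the target words
        # of its own pair, in proportion to the current t.
        for fs, es in pairs:
            for f in fs:
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c

        # M-step: renormalize expected counts into probabilities.
        t = defaultdict(
            float, {(f, e): count[(f, e)] / total[e] for (f, e) in count}
        )
    return t


# Toy bilingual corpus (hypothetical data).
corpus = [
    (["la", "maison"], ["the", "house"]),
    (["la", "fleur"], ["the", "flower"]),
]
# Toy dictionary entry, written as a one-word sentence pair.
dictionary = [(["maison"], ["house"])]

t_corpus = train_model1(corpus)
t_combined = train_model1(corpus + dictionary)

print("corpus only:        ", t_corpus[("maison", "house")])
print("corpus + dictionary:", t_combined[("maison", "house")])
```

With this toy data, adding the dictionary entry shifts probability mass toward the pairing of "maison" with "house" after the same number of iterations, which is the kind of per-word effect the abstract describes. Appending entries with unit weight is only one possible way to combine the two sources; it is a convenience of this sketch rather than a claim about how the paper weights dictionary evidence.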