Publication | Closed Access
On the estimation of 'small' probabilities by leaving-one-out
Citations: 77 | References: 12 | Year: 1995
Engineering, Machine Learning, Algorithmic Learning, Statistical Foundation, Training Samples, Mathematical Statistic, Large Language Model, Corpus Linguistics, Natural Language Processing, Uncertainty Quantification, Computational Linguistics, Language Engineering, Language Studies, Statistics, Machine Translation, Computational Learning Theory, Linear Discounting Model, Absolute Discounting Model, Probability Theory, Computer Science, Grammar Induction, Imprecise Probability, Statistical Inference, Linguistics
We apply the leaving-one-out concept to the estimation of 'small' probabilities, i.e., the case where the number of training samples is much smaller than the number of possible classes. After deriving the Turing-Good formula in this framework, we introduce several specific models in order to avoid the problems of the original Turing-Good formula. These models are the constrained model, the absolute discounting model and the linear discounting model. These models are then applied to the problem of bigram-based stochastic language modeling. Experimental results are presented for a German and an English corpus.
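The discounting ideas named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation, and it rests on several assumptions: events are treated as a flat joint distribution (a bigram would be one `(w1, w2)` tuple), freed probability mass is spread uniformly over unseen events, the discount defaults to the commonly cited approximation b ≈ n1 / (n1 + 2·n2), and the linear discount is set to λ = n1 / N. The function names and the `vocab_size` parameter are hypothetical.

```python
from collections import Counter


def unseen_mass(counts):
    """Turing-Good style estimate of the total probability of unseen events: n1 / N."""
    n_total = sum(counts.values())
    n1 = sum(1 for c in counts.values() if c == 1)
    return n1 / n_total


def absolute_discounting(counts, vocab_size, b=None):
    """Subtract a constant b from every nonzero count (illustrative sketch)."""
    n_total = sum(counts.values())
    unseen = vocab_size - len(counts)
    if unseen <= 0:
        raise ValueError("vocab_size must exceed the number of seen events")
    if b is None:
        # Assumed default: b = n1 / (n1 + 2 * n2), from the counts-of-counts.
        n_r = Counter(counts.values())
        denom = n_r[1] + 2 * n_r[2]
        if denom == 0:
            raise ValueError("no singletons or doubletons; pass b explicitly")
        b = n_r[1] / denom
    # Each seen event gives up b counts; that mass goes to the unseen events.
    freed = b * len(counts) / n_total

    def prob(event):
        c = counts.get(event, 0)
        return (c - b) / n_total if c > 0 else freed / unseen

    return prob


def linear_discounting(counts, vocab_size):
    """Scale every relative frequency by (1 - lam), with lam = n1 / N (assumed)."""
    n_total = sum(counts.values())
    unseen = vocab_size - len(counts)
    if unseen <= 0:
        raise ValueError("vocab_size must exceed the number of seen events")
    lam = unseen_mass(counts)

    def prob(event):
        c = counts.get(event, 0)
        return (1 - lam) * c / n_total if c > 0 else lam / unseen

    return prob
```

In this sketch both models reserve some probability for events never seen in training. Absolute discounting takes the same amount from every seen event, whereas linear discounting takes an amount proportional to each event's count, so frequent events lose more under the linear model.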