Publication | Closed Access
Sequence modeling with mixtures of conditional maximum entropy distributions
Citations: 20
References: 24
Year: 2004
Venue: Unknown
Topics: Structured Prediction, Engineering, Machine Learning, Corpus Linguistics, Text Mining, Statistical Relational Learning, Word Embeddings, Natural Language Processing, Conditional Maxent Models, Data Science, Mixture Analysis, Computational Linguistics, Language Studies, Sequence Modelling, NLP Task, Knowledge Discovery, Conditional Maximum Entropy, Computer Science, Maxent Framework, Mixture Distribution, Entropy, Statistical Inference, Linguistics
We present a novel approach to modeling sequences using mixtures of conditional maximum entropy (maxent) distributions. Our method generalizes the mixture of first-order Markov models by including "long-term" dependencies in the model components. These dependencies are represented by probabilistic triggers or rules, frequently used in the natural language processing (NLP) domain, such as "A occurred k positions back → the current symbol is B" with probability P. The maxent framework is then used to combine all selected triggers into a coherent global probabilistic model. We enhance this formalism by using probabilistic mixtures with maxent models as components, thus representing hidden or unobserved effects in the data. We demonstrate how our mixture of conditional maxent models can be learned from data using a generalized EM algorithm that scales linearly in the dimensionality of the data and the number of mixture components. We present empirical results on simulated and real-world data sets, demonstrating that the proposed approach yields better models than mixtures of first-order Markov models while resisting the overfitting and curse of dimensionality that would inevitably afflict higher-order Markov models.
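The paper itself is closed access; the following is a minimal sketch, not the authors' implementation, of how a mixture of conditional maxent models over trigger features might be trained with a generalized EM algorithm as the abstract describes. The alphabet size `V`, trigger window `K`, component count `M`, learning rate `LR`, and the specific feature set are all illustrative assumptions.

```python
# Minimal sketch (not the authors' code): a mixture of conditional
# maxent models with trigger features of the form "symbol a occurred
# k positions back AND the current symbol is b", trained by
# generalized EM. V, K, M, and LR are illustrative assumptions.
import numpy as np
from scipy.special import logsumexp

V, K, M = 4, 2, 3   # alphabet size, trigger window, mixture components
LR = 0.1            # step size for the generalized (gradient) M-step

rng = np.random.default_rng(0)
lam = 0.01 * rng.standard_normal((M, V, K, V))  # one weight per trigger (a, k, b)
log_pi = np.log(np.full(M, 1.0 / M))            # log mixing proportions

def trigger_scores(seq, t):
    """Unnormalized maxent scores over the next symbol at position t:
    the sum of the weights of all triggers active given the history."""
    scores = np.zeros((M, V))
    for k in range(1, K + 1):
        if t - k >= 0:
            scores += lam[:, seq[t - k], k - 1, :]
    return scores

def component_loglik(seq):
    """log P_m(seq) for each component m: the product over positions
    of that component's conditional maxent distribution."""
    total = np.zeros(M)
    for t in range(len(seq)):
        s = trigger_scores(seq, t)
        total += s[:, seq[t]] - logsumexp(s, axis=1)
    return total

def gem_step(sequences):
    """One generalized EM iteration: an exact E-step (responsibilities),
    then a single gradient step on each component's responsibility-
    weighted conditional log-likelihood. Each iteration is linear in
    the total data size and in the number of components M."""
    global lam, log_pi
    grad = np.zeros_like(lam)
    resp_sum = np.zeros(M)
    for seq in sequences:
        log_joint = log_pi + component_loglik(seq)
        r = np.exp(log_joint - logsumexp(log_joint))  # E-step
        resp_sum += r
        for t in range(len(seq)):
            s = trigger_scores(seq, t)
            probs = np.exp(s - logsumexp(s, axis=1, keepdims=True))
            # Maxent gradient: observed minus expected feature counts,
            # weighted by each component's responsibility r_m.
            for k in range(1, K + 1):
                if t - k >= 0:
                    a = seq[t - k]
                    grad[:, a, k - 1, seq[t]] += r
                    grad[:, a, k - 1, :] -= r[:, None] * probs
    lam += LR * grad                              # generalized M-step
    log_pi = np.log(resp_sum / resp_sum.sum())    # closed-form update for pi

# Toy usage: 50 random sequences over the 4-symbol alphabet.
seqs = [rng.integers(0, V, size=20) for _ in range(50)]
for _ in range(10):
    gem_step(seqs)
```

Taking a single gradient step per M-step, instead of fully re-fitting each maxent component (e.g., by iterative scaling), is what makes each iteration linear in the data size and in M, consistent with the scaling claim in the abstract.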