Publication | Open Access
Learning mixtures of structured distributions over discrete domains
57
Citations
34
References
2012
Year
Density EstimationMixture DistributionEngineeringMachine LearningMonotone Hazard RateData MiningPattern RecognitionEntropyDiscrete DomainMixture AnalysisKnowledge DiscoveryStructured DistributionsStatistical InferenceComputer ScienceProbability TheoryVariable-width HistogramUnsupervised Machine Learning
Let $\mathfrak{C}$ be a class of probability distributions over the discrete domain $[n] = \{1,...,n\}.$ We show that if $\mathfrak{C}$ satisfies a rather general condition -- essentially, that each distribution in $\mathfrak{C}$ can be well-approximated by a variable-width histogram with few bins -- then there is a highly efficient (both in terms of running time and sample complexity) algorithm that can learn any mixture of $k$ unknown distributions from $\mathfrak{C}.$ We analyze several natural types of distributions over $[n]$, including log-concave, monotone hazard rate and unimodal distributions, and show that they have the required structural property of being well-approximated by a histogram with few bins. Applying our general algorithm, we obtain near-optimally efficient algorithms for all these mixture learning problems.
| Year | Citations | |
|---|---|---|
Page 1
Page 1