Publication | Closed Access
Dissimilarity Measures for Histogram-valued Observations
Citations: 33
References: 13
Year: 2012
Density Estimation, Engineering, Data Science, Data Mining, Similarity Measure, Symbolic Data Analysis, Histogram Data, Cumulative Distribution Measure, Knowledge Discovery, Multidimensional Analysis, Fuzzy Clustering, Statistical Inference, Dissimilarity Measures, Contemporary Datasets, Mathematical Statistic, Functional Data Analysis, Statistics, Statistical Analysis
Contemporary datasets can be immense and complex. Summarizing them and extracting information therefore frequently precedes any analysis. The summarizing techniques are many and varied, and are driven by the underlying scientific questions of interest. One type of resulting dataset contains so-called histogram-valued observations. Although such datasets are becoming increasingly pervasive, methodologies for analysing them remain inadequate. One area of interest is cluster analysis. To date, however, no readily computable dissimilarity, similarity, or distance measures exist for multivariate histogram-valued data. To redress that problem, the present article introduces several dissimilarity measures for histogram data. In particular, it extends the Gowda-Diday and Ichino-Yaguchi measures for interval data to histograms, along with some DeCarvalho measures. In addition, a cumulative distribution measure is developed for histograms. The new measures are illustrated on the Fisher iris data and applied to a U.S. temperature dataset.
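For orientation, the interval-data starting point named in the abstract can be sketched in a few lines. The code below is a minimal illustration of the classical Ichino-Yaguchi dissimilarity for a single interval-valued variable, not the article's histogram extension, whose definition is not given on this page. The function name `ichino_yaguchi` and the tuple representation of intervals are assumptions made for the example.

```python
def ichino_yaguchi(a, b, gamma=0.5):
    """Classical Ichino-Yaguchi dissimilarity between two closed intervals.

    a, b  : (lower, upper) tuples (hypothetical representation for this sketch)
    gamma : weight in [0, 0.5] controlling the correction term
    """
    if not 0.0 <= gamma <= 0.5:
        raise ValueError("gamma must lie in [0, 0.5]")
    a_lo, a_hi = a
    b_lo, b_hi = b
    if a_lo > a_hi or b_lo > b_hi:
        raise ValueError("each interval must satisfy lower <= upper")

    # Join: the smallest interval that covers both a and b.
    join_len = max(a_hi, b_hi) - min(a_lo, b_lo)
    # Meet: the intersection of a and b (length 0 if they are disjoint).
    meet_len = max(0.0, min(a_hi, b_hi) - max(a_lo, b_lo))

    len_a = a_hi - a_lo
    len_b = b_hi - b_lo
    return join_len - meet_len + gamma * (2 * meet_len - len_a - len_b)


if __name__ == "__main__":
    print(ichino_yaguchi((0.0, 2.0), (1.0, 3.0)))  # overlapping intervals
    print(ichino_yaguchi((0.0, 1.0), (2.0, 3.0)))  # disjoint intervals
```

Identical intervals give a dissimilarity of zero, and the value grows as the intervals overlap less or lie further apart. A multivariate version would typically combine the per-variable values, for example through a normalized Minkowski-type sum.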