Publication | Open Access
The t-digest: Efficient estimates of distributions
Year: 2020 | Citations: 44 | References: 2
The t-digest is an on-line algorithm for building small sketches of data that can be used to approximate rank-based statistics with high accuracy, particularly near the tails. This new kind of sketch is robust with respect to skewed distributions, repeated samples and ordered datasets. Separately computed sketches can be combined with little or no loss in accuracy.

An open-source Java implementation of this algorithm with no external dependencies is available as a free-standing library. Independent implementations in Go, C++ and Python are available. The t-digest is in widespread internal use in major companies and is also available in popular software such as Postgres, ElasticSearch, Apache Kylin and Apache Druid.

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
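To make the sketch-and-merge idea in the abstract concrete, the following is a minimal, simplified Python sketch of a merging-style digest. It is not the reference Java implementation or any of the libraries named above. The class name `TDigestSketch`, the parameters `delta` and `buffer_size`, and the plain linear interpolation in `quantile` are assumptions made for illustration. The scale function k(q) = δ/(2π)·asin(2q − 1) is taken from the full paper rather than from this abstract, and details such as min/max tracking are omitted.

```python
import math


class TDigestSketch:
    """Simplified, illustrative merging digest (hypothetical, not a library API)."""

    def __init__(self, delta=100.0, buffer_size=500):
        self.delta = delta              # compression: larger keeps more centroids
        self.buffer_size = buffer_size  # raw points held before compressing
        self.centroids = []             # sorted list of (mean, weight)
        self.buffer = []                # unmerged (value, weight) pairs

    def _k(self, q):
        # Scale function: steep near q=0 and q=1, so tail clusters stay small.
        q = min(max(q, 0.0), 1.0)
        return self.delta / (2 * math.pi) * math.asin(2 * q - 1)

    def _k_inv(self, k):
        k = min(k, self.delta / 4)      # clamp so the limit never exceeds q=1
        return (math.sin(2 * math.pi * k / self.delta) + 1) / 2

    def add(self, x, w=1.0):
        self.buffer.append((x, w))
        if len(self.buffer) >= self.buffer_size:
            self._compress()

    def merge(self, other):
        # Combining sketches: treat the other digest's centroids as weighted points.
        self.buffer.extend(other.centroids)
        self.buffer.extend(other.buffer)
        self._compress()

    def _compress(self):
        if not self.buffer:
            return
        items = sorted(self.centroids + self.buffer)
        self.buffer = []
        total = sum(w for _, w in items)
        merged = []
        q0 = 0.0                                    # quantile at start of current cluster
        q_limit = self._k_inv(self._k(q0) + 1)      # cluster may span one unit of k
        mean, weight = items[0]
        for m, w in items[1:]:
            if q0 + (weight + w) / total <= q_limit:
                weight += w
                mean += (m - mean) * w / weight     # incremental weighted mean
            else:
                merged.append((mean, weight))
                q0 += weight / total
                q_limit = self._k_inv(self._k(q0) + 1)
                mean, weight = m, w
        merged.append((mean, weight))
        self.centroids = merged

    def quantile(self, q):
        self._compress()
        if not self.centroids:
            raise ValueError("digest is empty")
        total = sum(w for _, w in self.centroids)
        target = q * total
        cum = 0.0
        prev_mean = prev_pos = None
        for mean, w in self.centroids:
            pos = cum + w / 2                       # rank at the centroid's centre
            if target <= pos:
                if prev_mean is None:
                    return mean                     # below the first centre: clamp
                frac = (target - prev_pos) / (pos - prev_pos)
                return prev_mean + frac * (mean - prev_mean)
            prev_mean, prev_pos = mean, pos
            cum += w
        return self.centroids[-1][0]                # above the last centre: clamp


# Usage: two separately built sketches combined into one.
left, right = TDigestSketch(), TDigestSketch()
for i in range(10000):
    (left if i % 2 == 0 else right).add(i / 9999)
left.merge(right)
print(len(left.centroids), left.quantile(0.5), left.quantile(0.99))
```

In this sketch, the size limit on each cluster is what keeps resolution high near the tails: clusters near q = 0 and q = 1 hold few points, while clusters near the median hold many. Because `merge` only feeds one digest's centroids into the other as weighted points, sketches built on separate partitions can be combined without revisiting the raw data.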