Concepedia

Publication | Closed Access

A Distributed Decision Tree Algorithm and Its Implementation on Big Data Platforms

11

Citations

16

References

2016

Year

Abstract

Decision tree algorithms are very popular in the field of data mining. This paper proposes a distributed decision tree algorithm and shows examples of its implementation on big data platforms. The major contribution of this paper is the novel KS-Tree algorithm which builds a decision tree in a distributed environment. KS-Tree is applied to some real world data mining problems and compared with state-of-the-art decision tree techniques that are implemented in R and Apache Spark. The results show that KS-Tree can achieve better results, especially with large data sets. Furthermore, we demonstrate that KS-Tree can be applied to various data mining tasks, such as variable selection.

References

YearCitations

Page 1