Publication | Open Access
LightGBM: A Highly Efficient Gradient Boosting Decision Tree
9.5K
Citations
20
References
2017
Year
Artificial IntelligenceEngineeringMachine LearningSmall GradientsMachine Learning ToolData ScienceData MiningPattern RecognitionDecision TreeDecision Tree LearningSupervised LearningMachine VisionFeature EngineeringPredictive AnalyticsKnowledge DiscoveryComputer EngineeringLarger GradientsComputer ScienceDeep LearningFeature ConstructionOptimal BundlingClassifier System
Gradient Boosting Decision Trees are widely used, yet existing implementations struggle with high‑dimensional, large‑scale data because they must scan all instances for each feature to compute split gains. This work introduces Gradient‑based One‑Side Sampling and Exclusive Feature Bundling to address this inefficiency. GOSS discards many low‑gradient samples and uses the remaining ones to estimate split gains, while EFB groups mutually exclusive features into bundles to reduce the feature space. Theoretical proofs and experiments show that LightGBM, which incorporates both techniques, can accelerate training by over 20× while preserving accuracy.
Gradient Boosting Decision Tree (GBDT) is a popular machine learning algorithm, and has quite a few effective implementations such as XGBoost and pGBRT. Although many engineering optimizations have been adopted in these implementations, the efficiency and scalability are still unsatisfactory when the feature dimension is high and data size is large. A major reason is that for each feature, they need to scan all the data instances to estimate the information gain of all possible split points, which is very time consuming. To tackle this problem, we propose two novel techniques: Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB). With GOSS, we exclude a significant proportion of data instances with small gradients, and only use the rest to estimate the information gain. We prove that, since the data instances with larger gradients play a more important role in the computation of information gain, GOSS can obtain quite accurate estimation of the information gain with a much smaller data size. With EFB, we bundle mutually exclusive features (i.e., they rarely take nonzero values simultaneously), to reduce the number of features. We prove that finding the optimal bundling of exclusive features is NP-hard, but a greedy algorithm can achieve quite good approximation ratio (and thus can effectively reduce the number of features without hurting the accuracy of split point determination by much). We call our new GBDT implementation with GOSS and EFB LightGBM. Our experiments on multiple public datasets show that, LightGBM speeds up the training process of conventional GBDT by up to over 20 times while achieving almost the same accuracy.
| Year | Citations | |
|---|---|---|
2001 | 27.3K | |
2002 | 6.9K | |
2000 | 6.9K | |
2007 | 1K | |
1996 | 781 | |
2006 | 769 | |
2007 | 434 | |
2017 | 318 | |
1999 | 281 | |
2013 | 199 |
Page 1
Page 1