Publication | Open Access

LightGBM: A Leading Force in Breast Cancer Diagnosis Through Machine Learning and Image Processing

Citations: 34

References: 35

Year: 2024

Abstract

The early diagnosis of breast cancer (BC), a prominent global cause of mortality, necessitates the development of innovative diagnostic strategies. This study leverages machine learning (ML) and advanced image processing techniques to analyze histopathology images, thereby augmenting the capabilities for BC diagnosis. A robust feature extraction (FE) pipeline is developed, integrating techniques such as color histogram analysis, contour FE, Hu moments, and Haralick texture features. Ten ML algorithms, including LightGBM (LGBM), CatBoost, and XGBoost, are systematically evaluated across the magnification levels of the BreakHis dataset to assess their diagnostic performance. The research introduces a novel approach by combining distinct FE techniques, enhancing the model’s ability to distinguish between benign and malignant tissues with exceptional accuracy. These integrated techniques significantly elevate BC diagnostic accuracy and reliability, holding the potential to positively impact patient outcomes and healthcare systems.
Notably, the combination of the FE pipeline and LGBM achieves the highest accuracy, reported in two forms: accuracies before augmentation (0.9598 for 40×, 0.9516 for 100×, 0.9652 for 200×, 0.9535 for 400×, and 0.9570 for all magnifications combined) and accuracies after augmentation (0.9949 for 40×, 0.9870 for 100×, 0.9987 for 200×, and 0.9918 for 400×) for the classification of histopathological images at each magnification. Moreover, the study highlights the crucial role of augmentation in further refining classification accuracy.
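To make the effect of augmentation easier to read, the reported LGBM accuracies can be compared per magnification. The short sketch below uses only the figures quoted in the abstract; the `error_reduction` helper and the idea of expressing the gain as a relative drop in error rate are illustrative assumptions, not metrics reported by the study.

```python
# Accuracies quoted in the abstract for FE pipeline + LGBM on BreakHis.
before = {"40x": 0.9598, "100x": 0.9516, "200x": 0.9652, "400x": 0.9535}
after = {"40x": 0.9949, "100x": 0.9870, "200x": 0.9987, "400x": 0.9918}


def error_reduction(acc_before, acc_after):
    """Relative reduction in error rate, where error = 1 - accuracy.

    Hypothetical helper for illustration; not a metric from the paper.
    """
    return 1 - (1 - acc_after) / (1 - acc_before)


# Absolute accuracy gain per magnification.
gains = {m: round(after[m] - before[m], 4) for m in before}

# Relative error-rate reduction per magnification.
reductions = {m: error_reduction(before[m], after[m]) for m in before}

for m in before:
    print(f"{m}: gain={gains[m]:.4f}, error reduction={reductions[m]:.1%}")
```

Read this way, the absolute gains are a few percentage points at every magnification, while most of the remaining error is removed after augmentation.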
Extending its applicability, the proposed method is also successfully applied to the classification of lung and colon cancer images (LC25000 dataset), achieving an accuracy of 0.9983. The model demonstrates its effectiveness and adaptability as a compelling method for histopathological image classification. This research contributes to the evolving field of BC diagnostics, offering a framework for robust and accurate ML-based diagnostic tools that may revolutionize cancer diagnosis and enhance patient care.
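The abstract names the ingredients of the FE pipeline but not how they are implemented. The following is a minimal NumPy-only sketch of how several such handcrafted features could be concatenated into one vector, under stated assumptions: the function names, the bin and gray-level counts, and the single horizontal co-occurrence offset are all hypothetical choices, the texture part computes only three Haralick-style statistics rather than the full set, and contour features are omitted because they would normally rely on an image library such as OpenCV. It is not the authors' implementation.

```python
import numpy as np


def color_histogram(img, bins=8):
    """Per-channel intensity histogram, each channel normalized to sum to 1."""
    feats = []
    for c in range(img.shape[2]):
        hist, _ = np.histogram(img[..., c], bins=bins, range=(0, 256))
        feats.append(hist / hist.sum())
    return np.concatenate(feats)


def hu_moments(gray):
    """Seven Hu invariant moments built from normalized central moments."""
    g = gray.astype(np.float64)
    y, x = np.mgrid[: g.shape[0], : g.shape[1]]
    m00 = g.sum()
    xc, yc = (x * g).sum() / m00, (y * g).sum() / m00

    def eta(p, q):
        mu = ((x - xc) ** p * (y - yc) ** q * g).sum()
        return mu / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    a, b = n30 + n12, n21 + n03  # sums that recur in h4..h7
    h1 = n20 + n02
    h2 = (n20 - n02) ** 2 + 4 * n11 ** 2
    h3 = (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2
    h4 = a ** 2 + b ** 2
    h5 = ((n30 - 3 * n12) * a * (a ** 2 - 3 * b ** 2)
          + (3 * n21 - n03) * b * (3 * a ** 2 - b ** 2))
    h6 = (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b
    h7 = ((3 * n21 - n03) * a * (a ** 2 - 3 * b ** 2)
          - (n30 - 3 * n12) * b * (3 * a ** 2 - b ** 2))
    return np.array([h1, h2, h3, h4, h5, h6, h7])


def texture_features(gray, levels=8):
    """Contrast, energy and homogeneity from a gray-level co-occurrence matrix.

    Simplified stand-in for Haralick features: one horizontal offset only.
    Assumes gray values lie in [0, 255].
    """
    q = (gray.astype(np.int64) * levels) // 256  # quantize to [0, levels)
    left, right = q[:, :-1].ravel(), q[:, 1:].ravel()
    glcm = np.zeros((levels, levels))
    np.add.at(glcm, (left, right), 1)  # count horizontally adjacent pairs
    glcm = glcm + glcm.T               # make the matrix symmetric
    glcm /= glcm.sum()                 # convert counts to probabilities
    i, j = np.indices(glcm.shape)
    contrast = (glcm * (i - j) ** 2).sum()
    energy = (glcm ** 2).sum()
    homogeneity = (glcm / (1 + (i - j) ** 2)).sum()
    return np.array([contrast, energy, homogeneity])


def extract_features(img):
    """Concatenate color, shape and texture features for one RGB image."""
    gray = img.mean(axis=2)
    return np.concatenate(
        [color_histogram(img), hu_moments(gray), texture_features(gray)]
    )
```

With the default settings each image yields a 34-dimensional vector (24 histogram bins, 7 Hu moments, 3 texture statistics). Stacking these vectors row-wise gives a feature matrix that could then be passed to a gradient-boosting classifier such as `lightgbm.LGBMClassifier`.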
