Advanced automated machine learning framework for photovoltaic power output prediction using environmental parameters and SHAP interpretability

Abstract

Accurate prediction of the power output of a photovoltaic (PV) system is crucial for operational efficiency. This study addresses the challenge of predicting plant-scale PV power output by integrating automated machine learning (Auto-ML) with explainable modeling techniques. The integrated approach enhances predictive accuracy and supports well-informed, data-driven decision-making in power systems. Real PV power data from a plant at Universiti Tun Hussein Onn Malaysia (UTHM) and five key weather parameters were used in the experiment. Auto-ML was employed to automatically identify the best-performing models for the dataset. The four models with the highest predictive accuracy, Extra Tree (91% accuracy), Random Forest (85%), XGBoost (75%), and Decision Tree (68%), were selected for further analysis. Their performance was then validated against commonly used artificial neural networks (ANN) and support vector machines (SVM) using multiple evaluation metrics covering prediction accuracy, error rates, and interpretability. The results demonstrate the superiority of the proposed approach across all performance metrics. For practical applications, a novel data mining method based on bivariate data analysis is also proposed to identify the primary environmental drivers of PV performance. Additionally, the role of each parameter within each machine learning (ML) model is assessed using additive feature attributions (SHAP values) to uncover the model's underlying predictive mechanism. This study establishes an advanced framework combining Auto-ML and explainable AI for predictive modeling of PV power output. It sets new standards for improved operational decisions and for broader integration of AI in renewable energy forecasting and data-driven optimization of power systems.
• This study addresses challenges in solar energy forecasting, emphasizing the need for reliable prediction frameworks due to solar irradiance complexity and model variability.
• It aims to enhance plant-scale photovoltaic (PV) power output prediction by integrating automated machine learning (Auto-ML) with explainable modeling techniques.
• Real PV data from Universiti Tun Hussein Onn Malaysia (UTHM) and five weather parameters are utilized for analysis.
• Auto-ML identifies the best-performing models for predicting PV power output, selecting Extra Tree (91% accuracy), Random Forest (85%), XGBoost (75%), and Decision Tree (68%) for further evaluation.
• Selected machine learning (ML) models are compared with artificial neural networks (ANN) and support vector machines (SVM) using six evaluation metrics.
• Bivariate analysis reveals key environmental drivers of PV performance for the first time in PV power forecasting, providing insights into ML models' mechanics.
• SHAP value analysis is conducted to offer deeper insights into model predictions, highlighting the influence of individual features on models' output.
• A robust framework combining Auto-ML and explainable AI for PV power output prediction is established, setting new standards for operational decisions in renewable energy forecasting.
• Findings provide actionable insights for optimizing PV system performance, addressing inconsistencies in solar energy output, and improving power supply reliability.
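
The additivity property that SHAP-style attributions rely on can be illustrated with a small, self-contained sketch. It is not the study's implementation: the model `toy_pv_model`, the three feature names, the sample, and the baseline are hypothetical stand-ins, since the five weather parameters and the trained models are not specified here. The sketch computes exact Shapley values by enumerating feature coalitions, replacing absent features with baseline values, and checks that the baseline prediction plus the per-feature contributions reproduces the model output.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample, by enumerating all coalitions.

    Features outside a coalition are set to their baseline value.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        mixed = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return predict(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_pv_model(f):
    # Hypothetical stand-in for a trained regressor (assumption, not the
    # study's model): irradiance scaled by a temperature derating factor,
    # plus a small additive wind term.
    derate = 1.0 - 0.004 * (f["module_temp_c"] - 25.0)
    return 0.2 * f["irradiance_w_m2"] * derate + 0.5 * f["wind_speed_m_s"]


# Hypothetical sample and reference point (assumed values for illustration).
sample = {"irradiance_w_m2": 800.0, "module_temp_c": 45.0, "wind_speed_m_s": 3.0}
baseline = {"irradiance_w_m2": 500.0, "module_temp_c": 35.0, "wind_speed_m_s": 2.0}

phi = shapley_values(toy_pv_model, sample, baseline)
base_value = toy_pv_model(baseline)
prediction = toy_pv_model(sample)

for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")
print(f"baseline + contributions = {base_value + sum(phi.values()):.3f}")
print(f"model prediction         = {prediction:.3f}")
```

Exhaustive enumeration is only practical for a handful of features. For tree ensembles such as those listed above, a dedicated SHAP implementation would normally be used instead, and a background dataset rather than a single baseline point would typically define the reference value.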
