Abstract:
Upward lightning (UL), because of its abrupt onset and destructive potential, poses a critical challenge for the lightning protection of power systems and high-rise structures. Traditional identification methods often suffer from high false alarm rates and poor recognition accuracy under complex meteorological conditions. This study proposes a UL identification approach based on thunderstorm environmental features and the random forest algorithm. Six key meteorological variables, including vertical velocity, wind shear, convective available potential energy (CAPE), and precipitable water vapor (PWV), are integrated into a multivariate classification model. A Leave-One-Day-Out cross-validation strategy, in which all samples from one day are held out in each fold, is adopted to assess and improve temporal generalization. The model outputs probabilistic predictions, and its decision threshold is tuned through ROC curve analysis and multi-metric evaluation. Experiments on 247 UL samples from a selected region of eastern China show an accuracy of 91%, recall of 87%, precision of 84%, F1 score of 85%, and AUC of 0.93. Variable importance analysis indicates that vertical velocity and wind shear are the dominant factors influencing UL occurrence. The approach provides an efficient and interpretable pathway for UL early warning modeling and has practical value for engineering applications.
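The two evaluation steps named above, Leave-One-Day-Out splitting and threshold tuning on probabilistic outputs, can be illustrated with a minimal sketch. It is not the study's implementation: the function names, the dates, the labels, and the probabilities are all hypothetical, and the probabilities stand in for the output of a trained classifier such as a random forest. Only the F1 criterion is shown here; the study describes ROC analysis and several metrics.

```python
from collections import defaultdict


def leave_one_day_out(days):
    """Yield (held_out_day, train_idx, test_idx), holding out one day per fold."""
    by_day = defaultdict(list)
    for i, day in enumerate(days):
        by_day[day].append(i)
    for held_out in sorted(by_day):
        test_idx = by_day[held_out]
        train_idx = [i for i, d in enumerate(days) if d != held_out]
        yield held_out, train_idx, test_idx


def precision_recall_f1(y_true, probs, threshold):
    """Binarize probabilities at `threshold` and return (precision, recall, F1)."""
    tp = fp = fn = 0
    for y, p in zip(y_true, probs):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def best_f1_threshold(y_true, probs, candidates):
    """Return the candidate threshold with the highest F1 (first one wins ties)."""
    return max(candidates, key=lambda t: precision_recall_f1(y_true, probs, t)[2])


# Hypothetical toy data: one entry per sample (1 = UL, 0 = non-UL).
days = ["2021-06-01", "2021-06-01", "2021-06-02",
        "2021-06-03", "2021-06-03", "2021-06-03"]
labels = [1, 0, 1, 0, 1, 0]
probs = [0.9, 0.4, 0.6, 0.2, 0.7, 0.55]  # stand-in for classifier probabilities

for day, train_idx, test_idx in leave_one_day_out(days):
    print(day, "train:", train_idx, "test:", test_idx)

print("chosen threshold:", best_f1_threshold(labels, probs, [0.3, 0.5, 0.6, 0.8]))
```

Grouping by day keeps samples from the same thunderstorm day out of both the training and test sets in any fold, which is the point of the strategy. In a fuller pipeline, scikit-learn's `LeaveOneGroupOut` and `RandomForestClassifier` could plausibly fill the splitting and modeling roles.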