Volume 40, Issue 3 (2025)                   GeoRes 2025, 40(3): 243-252
Article Type: Original Research
How to cite this article
Babaeian A. Role of Surface and Upper-Air Data in Predicting Daily Minimum Temperature. GeoRes 2025; 40 (3) :243-252
URL: http://georesearch.ir/article-1-1857-en.html
Authors A.H. Babaeian *
Department of Industrial Engineering, Faculty of Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
* Corresponding Author Address: Department of Industrial Engineering, Faculty of Engineering, Ferdowsi University of Mashhad, Azadi Square, Mashhad, Iran. Postal Code: 917794895 (babaeian.am@gmail.com)
Background
Global warming, despite increasing the average minimum temperature, intensifies the intrusion of cold air and the occurrence of sudden frosts, making accurate prediction of minimum temperature essential for regions such as Iran. The limitations of numerical models and the need for precise local forecasts have directed attention toward machine learning approaches; however, the coherent application of these methods and the optimal selection of input data for minimum temperature prediction in Iran have not yet been thoroughly investigated.
Previous Studies
Multiple studies have demonstrated that machine learning approaches perform well in temperature prediction. Fan et al. (2018) reported high accuracy of SVM and ELM models in Beijing, while Ustaoglu et al. (2008) showed that the RBF model achieved the best performance in Turkey. Jose et al. (2022), using CMIP5 and CMIP6 datasets, identified the LSTM model as the most accurate method for minimum temperature prediction. Additionally, hybrid recurrent models have produced notable results in China (Huang et al., 2019). Other studies have reported superior performance of algorithms such as HBA-ANN (Zhou et al., 2023), SVR (Oduro et al., 2026), and M5P (Elbeltagi et al., 2025). In Iran, the effectiveness of statistical and machine learning models, including AR, ANN, SDSM, SVR, and Random Forest, has been confirmed for temperature forecasting (Aghalpoor & Nadi, 2018; Asakereh & Matlabizad, 2017; Shahabi & Izadi, 2024; Karimi et al., 2020). Recent studies have also shown improvements in numerical model performance using SVM and hybrid LSTM approaches (Shokouhi et al., 2024; Hosseini & Alamatiyan, 2022; Ghaffari et al., 2024).
Aim(s)
This study aimed to evaluate the performance of machine learning algorithms in predicting daily minimum temperature.
Research Type
The present study was of an applied nature.
Research Society, Place and Time
The study population consisted of daily minimum temperature records from the Mashhad synoptic station. Surface-level observations and upper-atmosphere data (extracted from the ERA5 database) were collected for the geographical coordinates of this station in the city of Mashhad. The data cover the period from 2000 to 2023, and the study was carried out in 2025.
Sampling Method and Number
In this study, the sampling method was census sampling, meaning that all available daily minimum temperature data from the Mashhad synoptic station for the entire study period were included in the analysis. The total number of samples corresponds to all days between 2000 and 2023. After collection, the dataset was divided into two subsets for modeling: the training period, comprising data from 2000 to 2017, and the testing period, comprising data from 2018 to 2023.
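The chronological split described above can be sketched as follows. This is a minimal illustration, not the author's code: the record layout (a list of `(date, value)` pairs), the helper names `daily_dates` and `split_by_year`, and the `None` placeholder values are assumptions made for the example. Only the year boundaries come from the text.

```python
from datetime import date, timedelta

TRAIN_END_YEAR = 2017  # last training year stated in the text


def daily_dates(start: date, end: date) -> list[date]:
    """Return every calendar day from start to end, inclusive."""
    n_days = (end - start).days + 1
    return [start + timedelta(days=i) for i in range(n_days)]


def split_by_year(records, train_end_year=TRAIN_END_YEAR):
    """Split (date, value) records chronologically, without shuffling."""
    train = [r for r in records if r[0].year <= train_end_year]
    test = [r for r in records if r[0].year > train_end_year]
    return train, test


# Placeholder values (None) stand in for the real station observations.
records = [(d, None) for d in daily_dates(date(2000, 1, 1), date(2023, 12, 31))]
train, test = split_by_year(records)
print(len(train), len(test))  # 6575 2191
```

If no days are missing, this gives roughly a 75/25 split. Splitting by date rather than at random keeps every test day strictly later than every training day, which matches how a forecast model would be used in practice.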
Used Devices & Materials
The materials and data used in this study included daily minimum temperature records from the Mashhad synoptic station, as well as upper-atmosphere data obtained from the ERA5 reanalysis database. These datasets served as the primary inputs for the modeling process. The tools and software employed encompassed data processing and numerical modeling methods, which were used to train and test the models over the specified periods (2000–2017 for training and 2018–2023 for testing). Additionally, meteorological data archive systems were utilized to extract surface-level observations and ERA5 reanalysis data.

Findings
In this study, the performance of five machine learning models (Random Forest, AdaBoost, XGBoost, CatBoost, and LightGBM) was evaluated for one-day-ahead minimum temperature prediction. Model hyperparameters were tuned using GridSearchCV with five-fold cross-validation to maximize R² and limit overfitting. Model accuracy was then assessed using MAE, RMSE, R², Adjusted R², and the Kling-Gupta efficiency (KGE). The entire modeling process was repeated for three data scenarios: surface-level data (S), upper-atmosphere data (U), and combined data (S & U) (Table 1).
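For readers less familiar with these scores, the sketch below shows one common way to compute them. It is illustrative only: the function names are hypothetical, and the KGE shown is the original 2009 formulation (correlation, variability ratio, bias ratio), which is an assumption because the summary does not state which variant was used.

```python
import math
from statistics import mean, pstdev


def mae(obs, pred):
    """Mean absolute error."""
    return mean(abs(o - p) for o, p in zip(obs, pred))


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(mean((o - p) ** 2 for o, p in zip(obs, pred)))


def r2(obs, pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    obs_mean = mean(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - obs_mean) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def adjusted_r2(obs, pred, n_features):
    """R^2 penalized for the number of predictors used by the model."""
    n = len(obs)
    return 1.0 - (1.0 - r2(obs, pred)) * (n - 1) / (n - n_features - 1)


def kge(obs, pred):
    """Kling-Gupta efficiency (assumed 2009 form); 1.0 is a perfect score."""
    mo, mp = mean(obs), mean(pred)
    so, sp = pstdev(obs), pstdev(pred)
    cov = mean((o - mo) * (p - mp) for o, p in zip(obs, pred))
    r = cov / (so * sp)   # linear correlation
    alpha = sp / so       # variability ratio
    beta = mp / mo        # bias ratio (unstable if the observed mean is near 0)
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Toy values, not data from the study.
obs = [2.0, 4.0, 6.0, 8.0]
pred = [3.0, 4.0, 5.0, 8.0]
print(mae(obs, pred), rmse(obs, pred), r2(obs, pred))
print(adjusted_r2(obs, pred, n_features=1), kge(obs, pred))
```

Because the bias ratio divides by the observed mean, KGE on temperatures in °C can behave poorly when that mean is close to zero; computing it in kelvin is one possible workaround.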

Table 1. Performance of five machine learning models in predicting the next-day minimum temperature using the three approaches: S, U, and S & U


In the U scenario, multicollinearity was addressed by calculating the variance inflation factor (VIF), and the variables GH500, GH300, Temp500, and Temp700 were removed. After model training, CatBoost achieved the best performance with R²=0.9376, MAE=1.65, and RMSE=2.13 (Table 1).
In the S scenario, VIF analysis led to the removal of Tm, Um, and Tmax due to high multicollinearity. According to Table 1, LightGBM outperformed the other models, with R²=0.9260, MAE=1.79, and RMSE=2.32.
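A VIF-based screening of this kind can be sketched as below. This is a generic illustration under stated assumptions rather than the study's procedure: the threshold of 10 is a common rule of thumb (the summary does not give the cutoff used), the iterative drop-the-worst strategy is one of several possible schemes, and the helper names and synthetic columns are hypothetical.

```python
import numpy as np


def vif(X):
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing column j on the others."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out


def drop_high_vif(X, names, threshold=10.0):
    """Repeatedly drop the column with the largest VIF until all are <= threshold."""
    X = np.asarray(X, dtype=float)
    names = list(names)
    while X.shape[1] > 1:
        scores = vif(X)
        worst = int(np.argmax(scores))
        if scores[worst] <= threshold:
            break
        X = np.delete(X, worst, axis=1)
        names.pop(worst)
    return X, names


# Synthetic demo (not the study's data): "a_noisy" nearly duplicates "a".
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
a_noisy = a + 0.05 * rng.normal(size=500)
X_demo = np.column_stack([a, b, a_noisy])
X_kept, kept_names = drop_high_vif(X_demo, ["a", "b", "a_noisy"])
print(kept_names)
```

Recomputing the VIFs after each removal matters, because dropping one of two nearly collinear columns usually brings the other back to an acceptable value.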
For the S & U scenario, variables with high VIF were eliminated, and the BSS method was applied to select an optimal five-variable feature set: Tmin, Rel-Hum700, Spe-Hum700, Umin, and season-Summer (Figure 1). LightGBM achieved the highest performance in this scenario with R²=0.9390, MAE=1.63, and RMSE=2.10 (Table 1).
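Assuming BSS here denotes best subset selection, the idea can be sketched as an exhaustive search over feature subsets scored by a linear fit. The code below is a hypothetical illustration: the selection criterion (lowest residual sum of squares within each subset size, then adjusted R² to compare sizes), the function names, and the synthetic data are assumptions and are not taken from the study.

```python
from itertools import combinations

import numpy as np


def rss(X, y):
    """Residual sum of squares of an ordinary least squares fit with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)


def best_subsets(X, y, names, max_size):
    """For each size k, return the lowest-RSS subset and its adjusted R^2."""
    n = len(y)
    tss = float(np.sum((y - y.mean()) ** 2))
    results = {}
    for k in range(1, max_size + 1):
        best_idx = min(
            combinations(range(X.shape[1]), k),
            key=lambda idx: rss(X[:, list(idx)], y),
        )
        best_rss = rss(X[:, list(best_idx)], y)
        adj_r2 = 1.0 - (best_rss / (n - k - 1)) / (tss / (n - 1))
        results[k] = (tuple(names[i] for i in best_idx), adj_r2)
    return results


# Synthetic demo: only x0 and x2 actually drive the target.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.1 * rng.normal(size=300)
names = [f"x{i}" for i in range(5)]
results = best_subsets(X, y, names, max_size=3)
for k, (subset, score) in results.items():
    print(k, subset, round(score, 4))
```

The search evaluates every combination, so its cost grows quickly with the number of candidate variables; filtering by VIF first, as described above, keeps the candidate pool small enough for this to remain practical.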


Figure 1. Feature selection using the BSS method

Finally, LightGBM in the S & U scenario accurately reproduced the daily minimum temperature patterns, showing strong alignment between observed and predicted values (Figure 2). Overall, the results indicate that combining surface and upper-atmosphere data with LightGBM provides the most accurate and stable one-day-ahead minimum temperature predictions.


Figure 2. Comparison of observed and predicted daily minimum temperature at the Mashhad station using the LightGBM model based on S & U data

Main Comparisons to Similar Studies
Compared with previous research, the findings of this study indicated that although classical linear models and single-source neural networks are capable of minimum temperature prediction, the multi-source approach employed here is more advanced and physically interpretable. In Aghalpoor and Nadi (2018), AR(5) and MA(5) time series models achieved acceptable performance with R²≈0.94 and RMSE≈1.7; however, their reliance solely on temperature data limited model generalizability. Similarly, in Ustaoglu et al. (2008), the RBF network achieved R≈0.98 and RMSE=1.64, with the inclusion of neighboring station data reducing prediction errors. In contrast, the present study, by integrating surface and upper-atmosphere data, more effectively captured the physical structures underlying nocturnal cooling. The LightGBM model in the combined scenario achieved competitive performance (R²≈0.94, RMSE≈2.10) along with higher temporal stability (KGE=0.93). These results demonstrated that leveraging multi-layer vertical data provides a more efficient and robust alternative to previous single-source approaches.
Suggestions
For future research, the use of sequential models such as LSTM and GRU is recommended, along with incorporating data from multiple meteorological stations and integrating numerical weather prediction outputs with machine learning (i.e., hybrid models). Additionally, employing interpretability methods such as SHAP could provide clearer insights into the roles of surface and upper-atmosphere variables in controlling minimum temperature.

Conclusion
The integration of surface and upper-atmosphere data significantly enhances model accuracy and stability. Within the combined-data scenario, the LightGBM algorithm provides an efficient and balanced framework in terms of simplicity, robustness, and the ability to capture nonlinear patterns.


Acknowledgments: The authors express their gratitude to the Iran Meteorological Organization for providing the data from the Mashhad meteorological station.
Ethical Permission: None reported by the authors.
Conflict of Interest: None reported by the authors.
Authors’ Contributions: Babaeian AH (First author), Main Researcher/Introduction Writer/Discussion Writer/Methodologist/Statistical Analyst (100%)
Funding: None reported by the authors.
References
1. Aghalpoor P, Nadi M (2018). Comparison of the performance of autoregressive and moving average models in predicting daily maximum and minimum temperature. Proceedings of the National Conference on Water Resources Management Strategies and Environmental Challenges. Sari: CIVILICA. [Persian] [Link]
2. Ahadi M, Zeynali B, Salahi B, Shoja F, Fazl Kazemi A, Babaeian I, et al (2025). Projection of future drought trends in Iran using the CMIP6 multi-model ensemble. Journal of Natural Environmental Hazards. 14(46):43-74. [Persian] [Link]
3. Asakereh H, Matlabizad S (2017). Comparison of the performance of SDSM and artificial neural network models in predicting minimum temperature variations (case study: Urmia Station). The Journal of Spatial Planning and Geomatics. 21(4):140-160. [Persian] [Link]
4. Chai T, Draxler RR (2014). Root mean square error (RMSE) or mean absolute error (MAE)? - Arguments against avoiding RMSE in the literature. Geoscientific Model Development. 7(3):1247-1250. [Link] [DOI:10.5194/gmd-7-1247-2014]
5. Cohen J, Zhang X, Francis J, Jung T, Kwok R, Overland J, et al (2020). Divergent consensuses on arctic amplification influence on midlatitude severe winter weather. Nature Climate Change. 10(1):20-29. [Link] [DOI:10.1038/s41558-019-0662-y]
6. Elbeltagi A, Vishwakarma DK, Katipoğlu OM, Sushanth K, Heddam S, Singh BP, et al (2025). Air temperature estimation and modeling using data driven techniques based on best subset regression model in Egypt. Scientific Reports. 15(1):20200. [Link] [DOI:10.1038/s41598-025-06277-2]
7. Fan J, Wu L, Zhang F, Cai H, Wang X, Lu X, et al (2018). Evaluating the effect of air pollution on global and diffuse solar radiation prediction using support vector machine modeling based on sunshine duration and air temperature. Renewable and Sustainable Energy Reviews. 94:732-747. [Link] [DOI:10.1016/j.rser.2018.06.029]
8. Ghaffari HR, Shahraki S, Malbousi S (2024). Weather analysis with deep learning based on feature selection with crow learning algorithm. Journal of Climate Research. 1402(55):177-193. [Persian] [Link]
9. Hanoon MS, Najah Ahmed A, Zaini N, Razzaq A, Kumar P, Sherif M, et al (2021). Developing machine learning algorithms for meteorological temperature and humidity forecasting at Terengganu state in Malaysia. Scientific Reports. 11(1):18935. [Link] [DOI:10.1038/s41598-021-96872-w]
10. Hosseini AA, Alamatiyan E (2022). Using machine learning methods to predict air temperature (case study: Weather stations in Mashhad County). Proceedings of the 15th International Conference on Science and Technology Advances. Mashhad: CIVILICA. [Persian] [Link]
11. Huang Y, Zhao H, Huang X (2019). A prediction scheme for daily maximum and minimum temperature forecasts using recurrent neural network and rough set. IOP Conference Series: Earth and Environmental Science. 237(2):22005. [Link] [DOI:10.1088/1755-1315/237/2/022005]
12. IPCC (2023). Climate change 2021 - The physical science basis: Working group I contribution to the sixth assessment report of the intergovernmental panel on climate change. Cambridge: Cambridge University Press. [Link]
13. Jose DM, Vincent AM, Dwarakish GS (2022). Improving multiple model ensemble predictions of daily precipitation and temperature through machine learning techniques. Scientific Reports. 12(1):4678. [Link] [DOI:10.1038/s41598-022-08786-w]
14. Karimi SM, Kisi O, Porrajabali M, Rouhani-Nia F, Shiri J (2020). Evaluation of the support vector machine, random forest and geo-statistical methodologies for predicting long-term air temperature. ISH Journal of Hydraulic Engineering. 26(4):376-386. [Link] [DOI:10.1080/09715010.2018.1495583]
15. Kisi O, Heddam S, Parmar KS, Petroselli A, Külls C, Zounemat-Kermani M (2025). Integration of Gaussian process regression and K means clustering for enhanced short term rainfall runoff modeling. Scientific Reports. 15(1):7444. [Link] [DOI:10.1038/s41598-025-91339-8]
16. Kretschmer M, Cohen J, Matthias V, Runge J, Coumou D (2018). The different stratospheric influence on cold-extremes in Eurasia and North America. NPJ Climate and Atmospheric Science. 1(1):44. [Link] [DOI:10.1038/s41612-018-0054-4]
17. Lee J (2025). Estimating near-surface air temperature from satellite-derived land surface temperature using temporal deep learning: A comparative analysis. IEEE Access. 13:28935-28945. [Link] [DOI:10.1109/ACCESS.2025.3539581]
18. Oduro C, Osibo BK, Amankwah SOY, Khan S, Gyamfi Kedjanyi EA, Darteh OF, et al (2026). Leveraging machine learning for accurate near-surface air temperature prediction to enhance climate adaptation in Ghana. Journal of African Earth Sciences. 233:105877. [Link] [DOI:10.1016/j.jafrearsci.2025.105877]
19. Pande CB, Sidek LM, Varade AM, Elkhrachy I, Radwan N, et al (2024). Forecasting of meteorological drought using ensemble and machine learning models. Environmental Sciences Europe. 36(1):160. [Link] [DOI:10.1186/s12302-024-00975-w]
20. Roshani, Haroon S, Tamal Kanti S, Md Hibjur R, Md M, Yatendra S, et al (2023). Analyzing trend and forecast of rainfall and temperature in Valmiki Tiger Reserve, India, using non-parametric test and random forest machine learning algorithm. Acta Geophysica. 71(1):531-552. [Link] [DOI:10.1007/s11600-022-00978-2]
21. Sattari MT, Bagheri R, Shirini K, Allahverdipour P (2024). Modeling daily and monthly rainfall in Tabriz using ensemble learning models and decision tree regression. Journal of Climate Change Research. 5(18):31-48. [Persian] [Link]
22. Shahabi R, Izadi A (2024). Evaluation of different regression modeling approaches for temperature prediction using international meteorological conditions. Proceedings of the 24th International Conference on Information Technology, Computer and Telecommunication. Tehran: CIVILICA. [Persian] [Link]
23. Shokouhi M, Mesrizadeh M, Asadi Oskouei E (2024). Bias correction of short-term minimum and maximum temperature forecasts of the WRF model by using the pursuit machine. Journal of the Earth and Space Physics. 50(2):465-479. [Persian] [Link]
24. Tabatabaei S, Nazeri Tahroudi M, Dastourani M (2018). Performance comparison of GP, ANN, BCSD and SVM models for temperature simulation. Journal of Meteorology and Atmospheric Science. 1(1):53-64. [Persian] [Link]
25. Ustaoglu B, Cigizoglu HK, Karaca M (2008). Forecast of daily mean, maximum and minimum temperature time series by three artificial neural network methods. Meteorological Applications. 15(4):431-445. [Link] [DOI:10.1002/met.83]
26. Zhou J, Wang D, Band SS, Mirzania E, Roshni T (2023). Atmosphere air temperature forecasting using the honey badger optimization algorithm: on the warmest and coldest areas of the world. Engineering Applications of Computational Fluid Mechanics. 17(1):2174189. [Link] [DOI:10.1080/19942060.2023.2174189]