Using the Copernicus datasets, statistics and artificial intelligence to beat climatology
Because atmospheric systems behave chaotically, accurate seasonal forecasting remains an open and far from trivial problem. Indeed, one of the industry’s most common approaches to estimating atmospheric variables for the coming months is still a simple ‘climatology’ guess: i.e., the mean of that calendar month’s values over the last 30 years. Yet some weather forecasting centers are leveraging ensemble forecasts to make skillful predictions using numerical models that account for real physical behavior.
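As an illustration of the climatology guess described above, here is a minimal sketch in Python. The function name `monthly_climatology`, its parameters and the synthetic series are assumptions made for this example; they are not taken from any particular product or dataset.

```python
import numpy as np

def monthly_climatology(values, years, months, target_month, last_year, n_years=30):
    """Mean of the target calendar month over the n_years ending at last_year."""
    values = np.asarray(values, dtype=float)
    years = np.asarray(years)
    months = np.asarray(months)
    # Keep only the target month within the reference window
    mask = (months == target_month) & (years > last_year - n_years) & (years <= last_year)
    if not mask.any():
        raise ValueError("no samples for the requested month and period")
    return values[mask].mean()

# Synthetic monthly series (hypothetical data): 40 years,
# value = month number plus a small yearly drift
years = np.repeat(np.arange(1981, 2021), 12)
months = np.tile(np.arange(1, 13), 40)
values = months + 0.01 * (years - 1981)

# Climatology guess for a January, based on 1991-2020
january_guess = monthly_climatology(values, years, months, target_month=1, last_year=2020)
```

Any forecast that claims skill has to do better than this single number for the same site and month.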
Slightly different initial and boundary conditions may yield completely different results. To account for this sensitivity, an ensemble of numerical model forecasts, each started from similar but slightly different estimates of the initial conditions of ocean, atmosphere and land, can be used to calculate the probabilities of different weather outcomes in the months ahead. Because of small inaccuracies in a model or a different representation of atmospheric dynamics, different prediction systems can also produce divergent predictions. In this regard, the ensemble output can be regarded as the forecast probability distribution, and even more reliability can be obtained by merging these different models into a multi-model ensemble.
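To make the idea of reading an ensemble as a probability distribution concrete, the sketch below counts the fraction of members that fall below, within and above the climatological terciles. The helper names (`tercile_probabilities`, `pool_multimodel`), the equal-weight pooling of members and all numbers are assumptions chosen for illustration, not a description of how any forecasting center builds its multi-model.

```python
import numpy as np

def tercile_probabilities(members, climatology_sample):
    """Fraction of members below, within and above the climatological terciles."""
    members = np.asarray(members, dtype=float)
    lower, upper = np.quantile(climatology_sample, [1 / 3, 2 / 3])
    below = np.mean(members < lower)
    above = np.mean(members > upper)
    return below, 1.0 - below - above, above

def pool_multimodel(*ensembles):
    """Simplest possible multi-model: pool all members with equal weight."""
    return np.concatenate([np.asarray(e, dtype=float) for e in ensembles])

# Hypothetical data: 30 past values for the target month and two small ensembles
climatology_sample = np.arange(1, 31)
model_a = [5, 12, 15, 22, 25]
model_b = [8, 9, 14, 21, 30]

pooled = pool_multimodel(model_a, model_b)
p_below, p_normal, p_above = tercile_probabilities(pooled, climatology_sample)
```

With these made-up numbers, 3 of the 10 pooled members fall in the lower tercile and 4 in the upper one, so the forecast leans slightly towards above-normal conditions.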
Nevertheless, could these distributions be collapsed into a deterministic prediction? Certainly, the wind and meteorological industries would make far more use of a single skillful value than of a range of values (e.g., to adjust budgets according to the outlook). However, several subtleties make this task challenging.
For instance, simple techniques for collapsing ensembles (e.g., the ensemble mean) can be used, but for some sites and months they may yield poorer guesses than climatology itself. This lack of skill can be mitigated by:
- Postprocessing the ensemble mean, transforming it into a more accurate prediction with machine learning models.
- Clustering the ensembles using artificial intelligence.
- Merging different models and weighting them according to different statistical procedures.
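Whether a collapsed forecast actually beats climatology can be checked with a skill score. The sketch below uses a mean-squared-error skill score and, as one possible weighting procedure among many, weights proportional to the inverse MSE of each model. Both choices, the function names and the numbers are assumptions for this example; the weights are also fitted and evaluated on the same four samples purely for brevity, whereas a real validation would use separate periods.

```python
import numpy as np

def mse_skill_score(forecast, observed, reference):
    """1 - MSE(forecast) / MSE(reference); positive means better than the reference."""
    forecast, observed, reference = (
        np.asarray(a, dtype=float) for a in (forecast, observed, reference)
    )
    mse_forecast = np.mean((forecast - observed) ** 2)
    mse_reference = np.mean((reference - observed) ** 2)
    return 1.0 - mse_forecast / mse_reference

def inverse_mse_weights(forecasts, observed):
    """Weights proportional to 1 / MSE of each model (rows of `forecasts`)."""
    forecasts = np.asarray(forecasts, dtype=float)  # shape (n_models, n_times)
    mse = np.mean((forecasts - np.asarray(observed, dtype=float)) ** 2, axis=1)
    weights = 1.0 / mse  # assumes no model has exactly zero error
    return weights / weights.sum()

# Hypothetical observations and ensemble means from two models
observed = np.array([6.0, 8.0, 7.0, 9.0])
climatology = np.full(4, 7.5)
ensemble_means = np.array([
    [6.5, 7.5, 7.5, 8.5],  # model A: small errors
    [8.0, 6.0, 9.0, 7.0],  # model B: large errors
])

skill_a = mse_skill_score(ensemble_means[0], observed, climatology)
skill_b = mse_skill_score(ensemble_means[1], observed, climatology)

weights = inverse_mse_weights(ensemble_means, observed)
merged = weights @ ensemble_means
skill_merged = mse_skill_score(merged, observed, climatology)
```

In this toy case model A beats climatology while model B is worse than it, which is the situation described above: a plain ensemble mean is not guaranteed to be skillful, and the weighting simply shifts most of the trust towards the better model.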
Yet, despite being an active field of academic research, ensemble postprocessing is not straightforward. Indeed, the scarcity of monthly samples (and even of significantly correlated predictors) can, in some cases, lead these AI models to learn poorly or not at all and, consequently, to degrade the ensemble results.
Vortex’s Seasonal forecasting product takes advantage of these ensemble models (ECMWF, NCEP, Met Office, DWD, etc.), both in their raw and post-processed versions (using KNN, SVR, etc.), and extends them to 12 months using established time-series analysis procedures (SARIMA, Prophet, etc.).
For more detailed information about the technical details and validation results, please download the document here.