Friday, 25 May 2018

Weather Prediction Refinements in MATLAB Machine Learning Models

Update 30 May 2018: the models presented here have now been deployed online, as described in the next post.

In this post, I pick up where I left off in my previous post, in which I developed some preliminary Machine Learning (ML) models for weather prediction using MATLAB. Here I explore some further refinements to those models. You will need to read the previous post for context, as I do not repeat any of it here.

Revised Error Curves

All the results of the current refinements are presented in the following set of Error Curves, which are revised versions of those presented in the previous post. Each of the updates to the curves is described later in this post.

The Previous Models

The results from the previous models are reproduced in the Revised Error Curves exactly as they were before, but with slight label changes summarised as follows:

  • The solid blue curves are the "LSTM alone" curves from before. They are now labelled "LSTM"; the "(dash, Multivar)" part of the label can be ignored for now.
  • The solid red curves are the "Multi-regression plus LSTM" curves from before. They are now labelled "Multi-reg plus LSTM"; the "(+ single period)" part of the label can be ignored for now.
  • The solid orange curves are the "Multi-regression alone" curves from before. They are now labelled "Multi-reg"; the "(+ single period)" part of the label can be ignored for now.
  • The black dotted line labelled "sdev obs" is the same as before.

The New Models 

Zero Order Hold

The first of the new models, and the most naïve of all, is represented by the solid black curves labelled "ZOH" (for Zero Order Hold). It simply uses the last known observed values as the forecast values for all future times, which is about the simplest forecasting method there is. As the curves show, it works well in the leftmost portions of the graphs, for short forecast periods (half an hour, one hour, etc.), but unsurprisingly the performance drops off (i.e., the curves rise steeply) as the forecast period increases. That said, the (frankly disappointing) realisation is that this naïve model actually performs better than all the previously presented models for forecast periods up to approximately 5 hours, which probably says more about the poor quality of those previous models than about the merits of ZOH. Note: the "dips" in the ZOH error at 24, 48, and 72 hours ahead arise because the local time of the forecast is then exactly the same as the local time of the observation used in the ZOH, so the diurnal variation is effectively nulled, leading to a stronger correlation.
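
To make the ZOH idea concrete, here is a minimal sketch in Python (not the MATLAB code behind the curves). The function name zoh_rmse and the synthetic series are assumptions made purely for illustration: the series is an idealised 24-hour cycle sampled every half hour, not real METAR data.

```python
import math

def zoh_rmse(series, horizon_steps):
    """RMSE of a zero-order-hold forecast: predict series[t + h] with series[t]."""
    errors = [series[t + horizon_steps] - series[t]
              for t in range(len(series) - horizon_steps)]
    if not errors:
        raise ValueError("series is shorter than the forecast horizon")
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Synthetic (assumed) half-hourly signal: a pure 24-hour cycle, 48 samples per day.
samples_per_day = 48
series = [15.0 + 5.0 * math.sin(2.0 * math.pi * t / samples_per_day)
          for t in range(10 * samples_per_day)]

for hours in (0.5, 1, 6, 12, 24):
    steps = int(hours * 2)  # two samples per hour
    print(f"{hours:>4} h ahead: RMSE = {zoh_rmse(series, steps):.3f}")
```

On this idealised signal the error grows with the forecast period up to 12 hours and then collapses to essentially zero at 24 hours, because the held value and the target sit at the same point of the daily cycle. Real observations also carry non-periodic variation, so there the 24, 48, and 72-hour points would show as dips rather than zeros.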

Multivariable LSTM

The second of the new models, represented by the blue dot-dash curves labelled "(dash, Multivar)" (alongside the previous "LSTM" labels), is a variation of the previous LSTM modelling. Recall from the previous post that, when building the LSTM models, I opted to treat each variable individually, training a single-variable LSTM model on the time-history of observations for each variable. My reasoning (albeit nothing more than instinct) was that trying to fit all 8 variables together could be expecting "too much from the model". However, I did remark back then that it might be worth trying a multivariable LSTM, i.e., fitting all 8 variables simultaneously in a single LSTM neural network, just in case there were useful internal correlations that might help. MATLAB makes this extension to multiple variables straightforward, and the results are now in. Unfortunately, with the exception of a few sporadic "dips" (i.e., regions of lower error), the multivariable LSTM model generally under-performs the previous individual LSTM models. Moreover, it almost always under-performs the naïve ZOH model, for every variable and across almost the entire range of forecast periods, which validates my original instincts.

Multivariable Regressions for Individual Forecast Periods

Recall from the previous post that, when fitting the multivariable regression models (the solid red and orange curves), I opted to fit for all forecast periods simultaneously, mostly to minimise the number of models required. However, as noted back then, the performance of the resulting models was relatively poor at small forecast periods, where the regressions were expected to perform better. I reasoned that the single error being minimised by the backpropagation training algorithm was hampering the short forecast periods, because it could not get below the value for the long forecast periods (where the error is always going to be larger). So my proposal for a future enhancement was to fit a separate regression for each forecast period of interest. The results of those regressions are now in (specifically for the 0.5, 1, 3, 6, 12, 24, 36, 48, and 72-hour forecast periods) and are represented by the red and orange 'plus signs' in the error curves, labelled "(+ single period)" alongside the corresponding "Multi-reg" labels. As described previously, the red curves and markers (solid lines and now 'plus signs') correspond to the case where the outputs from the (solid blue) LSTM estimates are used as further inputs (regressors) in the regressions; the orange ones correspond to the case where they are not.

As can be observed from the Revised Error Curves, the Multivariable Regressions without LSTM, computed individually for each forecast period of interest (the orange 'plus signs'), are generally the best of all models across almost the entire range of forecast periods, except for the short periods where ZOH prevails. There is one notable exception, namely Sea-Level Pressure, where ZOH prevails throughout, indicating that this particular variable is very difficult to predict from previous time-histories.
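
The "one regression per forecast period" idea can be sketched as follows. This is a Python illustration under loud assumptions, not the MATLAB implementation: an ordinary linear least-squares fit stands in for the neural network regression, the helper names (solve_linear_system, fit_for_horizon, predict) are hypothetical, and the two-variable observations are synthetic. What it shows is the data arrangement: for each horizon, the inputs are the observations at time t and the target is one variable at time t + horizon, so each horizon gets its own independently minimised error.

```python
import math

def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting.
    Assumes the system is non-singular (a zero pivot raises ZeroDivisionError)."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def fit_for_horizon(observations, target_index, horizon_steps):
    """Least-squares fit of observations[t + h][target_index] on [1, *observations[t]]."""
    rows = [[1.0] + list(observations[t])
            for t in range(len(observations) - horizon_steps)]
    targets = [observations[t + horizon_steps][target_index]
               for t in range(len(rows))]
    k = len(rows[0])
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict(weights, latest_observation):
    """Apply fitted weights (intercept first) to the most recent observation."""
    features = [1.0] + list(latest_observation)
    return sum(w * f for w, f in zip(weights, features))

# Synthetic (assumed) half-hourly observations of two variables on a 24-hour cycle.
observations = [(math.sin(2 * math.pi * t / 48), math.cos(2 * math.pi * t / 48))
                for t in range(240)]

# One independent model per forecast period (in hours), all predicting variable 0.
models = {hours: fit_for_horizon(observations, 0, int(hours * 2))
          for hours in (0.5, 1, 3, 6)}

print(predict(models[3], observations[100]), observations[106][0])
```

Because each horizon has its own weights, the fit for the half-hour model is never traded off against the inevitably larger errors at long forecast periods, which is the effect that a single fit across all periods was suspected of suffering from.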

Revised Recipe for an Online Weather Forecaster

Given the above results, the recipe for an online weather predictor for a given location is simplified, as follows:

  • Capture and persist the METAR briefings every half-hour for the location of interest.
  • Every few months or so, use the above-mentioned METAR time histories to re-train a set of multivariable-input neural network regression models, with individual target responses per variable, and one model per forecast period of interest.
  • Every half hour, update the forecasts using the above-mentioned trained regression models with the most recent set of METAR observations as inputs (and desired forecasts as outputs). In those specific cases (combinations of variable and forecast period) where ZOH prevails, use that instead of the neural net.

Over time, one can expect the predictive capabilities of the Deep Learning networks used in the above-mentioned regressions to improve as the training datasets grow. Moreover, once sufficient data has been gathered to span at least a year, the day-of-year variable (currently omitted, see the previous post for the reason why) can be included as a further input (regressor). This should improve performance by capturing the seasonal effects of the weather (i.e., in addition to the intra-day effects already captured).
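
The "use ZOH where it prevails" rule from the recipe can be sketched as a simple lookup built from validation errors. Again this is Python rather than MATLAB, the function names choose_models and forecast are hypothetical, "temperature" is an assumed variable name, and the error numbers are made-up placeholders chosen only to exercise the logic; they are not results from the Error Curves.

```python
def choose_models(zoh_errors, regression_errors):
    """For each (variable, forecast_hours) key, pick the model with the lower
    validation error; ties go to ZOH because it is the simpler model."""
    choice = {}
    for key, zoh_err in zoh_errors.items():
        choice[key] = "zoh" if zoh_err <= regression_errors[key] else "regression"
    return choice

def forecast(choice, key, latest_value, regression_predict):
    """Return the held last observation or the regression output, per the lookup."""
    if choice[key] == "zoh":
        return latest_value
    return regression_predict(key)

# Placeholder validation errors (illustrative only, NOT measured results).
zoh_errors = {
    ("temperature", 0.5): 0.4,
    ("temperature", 24): 2.5,
    ("sea_level_pressure", 24): 3.0,
}
regression_errors = {
    ("temperature", 0.5): 0.9,
    ("temperature", 24): 1.8,
    ("sea_level_pressure", 24): 3.6,
}

choice = choose_models(zoh_errors, regression_errors)
for key in sorted(choice):
    print(key, "->", choice[key])
```

The lookup would be rebuilt whenever the regressions are re-trained, so a combination of variable and forecast period can move from ZOH to the regression model if the latter overtakes it as the training data grows.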

Production Deployment

This is the next important step and will be explored in a future post.