LSTM for Time Series Predictions
Continuing from last week's blog on using Facebook Prophet for time series forecasting, I want to show how the same forecast is built with TensorFlow, in particular its LSTM layers.
We begin with the usual imports.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

import warnings
warnings.filterwarnings('ignore')

import tensorflow as tf
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import LSTM
We read the passengers CSV into a pandas dataframe. Note that we set the date column as a datetime index (the standard copy of this dataset names it Month), so the frame holds only the numeric passenger counts; the scaler below would otherwise choke on the date strings.
df=pd.read_csv('/content/drive/My Drive/airline_passengers.csv',
               index_col='Month',parse_dates=True)
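Before any preprocessing, a quick sanity check on the dataframe never hurts. The classic airline passengers set has 144 monthly rows, which is why the split below leaves exactly 12 months for testing.
print(df.shape)
df.head()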
We split this dataframe into a training set and a testing set, then apply the usual MinMaxScaler transformation. Do note that we use only the training data in the fit step; this makes sure we are not accidentally leaking any test data into the preprocessing.
train=df[:132]
test=df[132:]

scaler=MinMaxScaler()
scaled_train=scaler.fit_transform(train)
scaled_test=scaler.transform(test)
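As a quick check that the scaling behaved as expected, we can look at the ranges. The training values land exactly on [0, 1] by construction; the test values may drift slightly outside that range, which is fine and is exactly the sign that no test data leaked into the fit.
print(scaled_train.min(), scaled_train.max())
print(scaled_test.min(), scaled_test.max())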
With the data preprocessing steps done, let's dive into the deep learning part.
Before we do, two quick things need to be clear. First, since this is seasonal monthly data, we want to use the last 12 months of data to predict the next month. Second, we are not providing any features other than the number of passengers; if we were predicting housing prices, say, we might have extra features like the number of bedrooms, bathrooms, and the total square footage. For this data we have no features except the series itself, hence we declare the following two variables.
n_input=12
n_features=1
TimeseriesGenerator
The next step is to create a TimeseriesGenerator that hands us the data in the 3D format we need. Here is a quick hint on what that format looks like: (#batch_size, #inputs, #features), which in our case is (1, 12, 1). I am emphasizing this because it was the trickiest part of the whole process, and it is the part that really varies across different time series datasets.
This is how we create the training generator:
train_generator=TimeseriesGenerator(scaled_train,
                                    scaled_train,
                                    length=n_input,
                                    batch_size=1)
Please note that both the data and the target for this generator are "scaled_train".
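To see what the generator actually hands the model, pull out the first batch and check the shapes (a quick check using only the names defined above):
X, y = train_generator[0]
print(X.shape)  # (1, 12, 1) -> (#batch_size, #inputs, #features)
print(y.shape)  # (1, 1)     -> the single value that follows the 12-month window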
Model
I created a deep model with three LSTM layers followed by a Dense layer for the final output. Here is the full model.
model=Sequential()
model.add(LSTM(100,activation='relu',input_shape=(n_input,n_features),return_sequences=True))
model.add(LSTM(50,activation='relu',return_sequences=True))
model.add(LSTM(10,activation='relu'))
model.add(Dense(1))
The most important thing to note is return_sequences=True on the first two layers. It makes those layers emit their full output sequence rather than only the final hidden state, which is what lets us pile LSTM layers on top of LSTM layers.
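To make this concrete, here is how the tensor shapes flow through the stack (inferred from the layer definitions above):
# input:                              (batch, 12, 1)
# LSTM(100, return_sequences=True) -> (batch, 12, 100)  full sequence out
# LSTM(50, return_sequences=True)  -> (batch, 12, 50)   full sequence out
# LSTM(10)                         -> (batch, 10)       last timestep only
# Dense(1)                         -> (batch, 1)        the forecast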
Here is how we compile the model and a quick model summary.
model.compile(optimizer='adam',loss='mse')
model.summary()
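If you want to double-check the summary's parameter counts by hand, each Keras LSTM layer holds 4 * ((input_dim + units) * units + units) weights (kernel, recurrent kernel, and bias). A small illustrative check, not part of the original post:
def lstm_params(input_dim, units):
    # kernel (input_dim x 4*units) + recurrent (units x 4*units) + bias (4*units)
    return 4 * ((input_dim + units) * units + units)

print(lstm_params(1, 100))   # first LSTM layer  -> 40800
print(lstm_params(100, 50))  # second LSTM layer -> 30200
print(lstm_params(50, 10))   # third LSTM layer  -> 2440
print(10 * 1 + 1)            # Dense output      -> 11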
Once the model is built, let's fit it.
model.fit(train_generator,epochs=30)
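One optional sanity check: Keras keeps the per-epoch loss on the model after fitting, so we can plot it to confirm training actually converged before moving on to predictions.
losses = pd.DataFrame(model.history.history)
losses.plot(figsize=(12,8));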
Predictions
One big difference between regular regression models and time series models is how we run predictions. The first prediction should be pretty obvious: we take the last 12 months of training data and predict on them to get the first test value.
How do we predict the next one?
This is where it is easy to cheat, especially if you take a shortcut and feed in the actual first test value in place of your prediction. That way you are handing the model the correct values for the prior steps, helping it produce better results than it would otherwise give.
What needs to happen instead is that the first prediction gets appended to the last 11 training points to form a new window of 12 data points for predicting the next value. This way we are not cheating at all; the test data is really test data and is never seen by the model.
Here is a quick for loop to do this.
test_predictions = []

# Select the last n_input values from the train data
first_eval_batch = scaled_train[-n_input:]

# Reshape the data into the LSTM-required (#batch, #timesteps, #features)
current_batch = first_eval_batch.reshape((1, n_input, n_features))

for i in range(len(test)):
    # Get the prediction; grab the exact value using the [0]
    current_pred = model.predict(current_batch)[0]

    # Add this prediction to the list
    test_predictions.append(current_pred)

    # The most critical part: update the (#batch, #timesteps, #features)
    # window using np.append.
    # current_batch[:, 1:, :] ---------> read this as
    # current_batch[no_change, 1:end, no_change]
    # (do note that the second axis holds the timesteps)
    # [[current_pred]] needs the double brackets because current_batch is a 3D array
    # axis=1, because we append along the timesteps axis
    current_batch = np.append(current_batch[:, 1:, :],
                              [[current_pred]],
                              axis=1)
This is the most important part of the code. As long as you remember that the input data needs to be in the 3D format (#batch, #timesteps, #features), it should be straightforward. If the append still isn't clear, please do spend some time reading the comments above; hopefully they make it click.
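If the slice-plus-append still feels abstract, here is the same window update on a tiny toy array, purely illustrative and not part of the forecasting code:
window = np.arange(3).reshape((1, 3, 1))                # timesteps [0, 1, 2]
window = np.append(window[:, 1:, :], [[[99]]], axis=1)  # drop oldest, append 99
print(window.flatten())                                 # [ 1  2 99]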
These are my test predictions based on the above.
test_predictions
These values are still on the scaled [0, 1] range, so we need to transform them back before comparing the final results.
actual_predictions = scaler.inverse_transform(test_predictions)
actual_predictions
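To put a single number on the forecast quality, we can compute the RMSE between the actual test values and the predictions. This is an optional addition; mean_squared_error comes from the same sklearn we already use for scaling.
from sklearn.metrics import mean_squared_error
rmse = np.sqrt(mean_squared_error(test.values, actual_predictions))
print(rmse)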
Let's combine these with the test results to get a full chart comparison.
test['Predictions'] = actual_predictions
test.plot(figsize=(12,8));
As you can see, we now have predictions that track the test data quite well.
Hope you enjoyed this, and happy reading!