Time-Series Forecasting using Deep Neural Networks
In this blog I am going to revisit time series, especially forecasting, but with deep learning methods. Please see my previous blog about standard statistical forecasting methods if you are interested.
Before we dive in, there are a few things I would like to bring to your attention. As anyone who has done some machine learning will tell you, it is very important to do a lot of exploratory data analysis (EDA) and a bit of data cleaning before running any algorithms. This is especially true if you are forecasting: you need to identify the lags and make the data stationary using differences and shifts before you run anything. The problem with this approach is that it is not scalable.
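For reference, here is roughly what that classical differencing step looks like in pandas (a minimal sketch with made-up values, just to show the manual work involved for each series):
import pandas as pd
# A toy series with an upward trend (illustrative values only)
yields = pd.Series([2.1, 2.3, 2.2, 2.5, 2.4, 2.7])
# First difference removes the trend; you would then test for
# stationarity (e.g. an ADF test) before fitting an ARIMA-style model
diffed = yields.diff().dropna()
print(diffed)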
To explain this better, imagine you are trying to forecast the expected sales volume of a particular coffee bean across various stores in different locations. Yes, you can get all the historical time-series data, but then you have to go over each individual series to figure out how to make it stationary.
Wouldn't it be better if a system, framework, or algorithm could do this on its own, so you don't have to do the heavy lifting for each series? Luckily, this is where deep learning models come in handy. You still need to do some simple data cleaning, like forward-filling missing data, but at least you are not tweaking the p's, q's, or any other model-specific parameters to get this done.
So without further ado, let's jump into how to do this using Keras and TensorFlow 2.0. We follow the usual five-step process that we have been using:
1. Dataset: Load the dataset and do some feature engineering if needed.
2. Build Model: Build a TensorFlow model with various layers.
3. Compile Model: Here we compile the model and select the loss and optimizer functions.
4. Fit Model: Here we finally train the model using the training data and get some metrics.
5. Evaluate Model: We check our model performance on the validation data.
Dataset:
I have been thinking about which dataset to use, and instead of something with little noise and cleanly decomposable seasonality and trend, I decided to use a rather noisy one. I went to FRED (by the way, it's a great website with tons of economic data) and downloaded the daily 10-year Treasury yields (the DGS10 series).
Here is what the time series looks like:
Create Dataset:
Unfortunately, we do need this additional step to get our data into a shape our models can understand. The idea is to look at the last nine 10-year yield values and try to predict the tenth. Think of this as a kind of momentum/technical indicator: where the model sees the 10-year yield going based on the last nine closing values.
Here is the code to do that:
# Usual imports
import tensorflow as tf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

# Load the FRED DGS10 series and forward-fill missing values
df = pd.read_csv(r'DGS10.csv', parse_dates=True)
df['DGS10'] = pd.to_numeric(df['DGS10'], errors='coerce')
df = df.fillna(method='ffill')

# Split the data into training and test sets
n_split = 1200
df_train = df['DGS10'][:n_split]
df_test = df['DGS10'][n_split:]

# Create the training dataset: windows of 10 values,
# where the first 9 are the features and the 10th is the label
dataset = tf.data.Dataset.from_tensor_slices(df_train)
dataset = dataset.window(10, shift=1, drop_remainder=True)
dataset = dataset.flat_map(lambda window: window.batch(10))
dataset = dataset.map(lambda window: (window[:-1], window[-1]))
dataset = dataset.shuffle(buffer_size=100)
dataset = dataset.batch(1).prefetch(1)

# Create the testing dataset the same way
dataset_test = tf.data.Dataset.from_tensor_slices(df_test)
dataset_test = dataset_test.window(10, shift=1, drop_remainder=True)
dataset_test = dataset_test.flat_map(lambda window: window.batch(10))
dataset_test = dataset_test.map(lambda window: (window[:-1], window[-1]))
dataset_test = dataset_test.shuffle(buffer_size=100)
dataset_test = dataset_test.batch(1).prefetch(1)
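Since the training and test pipelines are identical, the windowing logic can also be wrapped in a small helper. This is just a refactoring sketch (the name windowed_dataset is my own); it behaves the same as the code above:
def windowed_dataset(series, window_size=9, batch_size=1, shuffle_buffer=100):
    # Build (features, label) pairs: `window_size` inputs, one target
    ds = tf.data.Dataset.from_tensor_slices(series)
    ds = ds.window(window_size + 1, shift=1, drop_remainder=True)
    ds = ds.flat_map(lambda w: w.batch(window_size + 1))
    ds = ds.map(lambda w: (w[:-1], w[-1]))
    ds = ds.shuffle(buffer_size=shuffle_buffer)
    return ds.batch(batch_size).prefetch(1)

dataset = windowed_dataset(df_train)
dataset_test = windowed_dataset(df_test)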
This creates the datasets we need. We can check the (x, y) pairs in the dataset using the following code:
for x, y in dataset:
    print(f'x = {x.numpy()}, y = {y.numpy()}')
Build Model:
Now we are at the core of the project. I wanted to try two different models and see how they perform. The first is a basic Dense-based Sequential model; I wanted to see how a simple DNN performs on a time series with very little EDA.
To my surprise, this worked really well: the mean squared error was below 0.5% when training finished.
Here is the model:
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=[9]),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1)
])
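If you want to sanity-check the architecture before training, Keras can print each layer's output shape and parameter count:
model.summary()  # prints layer output shapes and trainable parameter counts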
Here is the second model, based on a SimpleRNN. To be clear, this is not necessarily a simple model; the layer is just called SimpleRNN (whose idea was it to call this simple, anyway?). RNNs expect a three-dimensional input of shape (batch_size, time_steps, features), so each window of nine values needs an extra trailing dimension. That dimension is added with a Lambda layer wrapping tf.expand_dims. This handy reshaping trick is explained in the wonderful course "Sequences, Time Series and Prediction" by Laurence Moroney on Coursera.
model_Rnn = tf.keras.models.Sequential([
    tf.keras.layers.Lambda(lambda x: tf.expand_dims(x, axis=-1),
                           input_shape=[None]),
    tf.keras.layers.SimpleRNN(40, return_sequences=True),
    tf.keras.layers.SimpleRNN(40),
    tf.keras.layers.Dense(1)
])
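To see what the Lambda layer is doing, here is a quick standalone check of the shape change (illustrative values, not part of the model):
batch = tf.constant([[2.1, 2.3, 2.2, 2.5, 2.4, 2.7, 2.6, 2.8, 2.9]])
print(batch.shape)                           # (1, 9): (batch, time_steps)
print(tf.expand_dims(batch, axis=-1).shape)  # (1, 9, 1): (batch, time_steps, features)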
Compile Model:
Since I am training these models for many epochs, I wanted to print the metrics only every tenth epoch, so I wrote a simple custom callback that does exactly that.
class PrintCallBack(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        if epoch % 10 == 0:
            print(f'Epoch = {epoch}, MSE = {logs.get("mse"):.2%}, '
                  f'loss = {logs.get("loss"):.2%}')

print10 = PrintCallBack()
Here are the compile statements for both models.
model.compile(optimizer='adam', loss='mse', metrics=['mse'])
model_Rnn.compile(optimizer='adam', loss='mse', metrics=['mse'])
Fit Model:
model.fit(dataset, epochs=100, callbacks=[print10], verbose=0)
model_Rnn.fit(dataset, epochs=30, callbacks=[print10], verbose=0)
Evaluate Model:
Here are the vanilla evaluation metrics:
model.evaluate(dataset_test)
model_Rnn.evaluate(dataset_test)
One thing to note: for time-series forecasting we cannot just evaluate the data like this, because at any point in time we would not actually have all of this future data. So instead I wrote a one-step-ahead (walk-forward) evaluation: slide a nine-value window along the actual series and predict the next value at each step.
Using this method, we get the following results.
window_size = 9
series = df['DGS10'][n_split - window_size:].values
Y = df['DGS10'][n_split:].values

# Walk-forward, one-step-ahead forecasts from the Dense model
forecast = []
for time in range(len(series) - window_size):
    forecast.append(model.predict(
        series[time:time + window_size][np.newaxis]))

# Walk-forward, one-step-ahead forecasts from the RNN model
forecast1 = []
for time in range(len(series) - window_size):
    forecast1.append(model_Rnn.predict(
        series[time:time + window_size][np.newaxis]))

results = np.array(forecast)[:, 0, 0]
results1 = np.array(forecast1)[:, 0, 0]

# Collect actuals and predictions side by side
final = pd.DataFrame(np.vstack((Y, results))).T
final.columns = ['Y', 'Y_hat_Dense']
final1 = pd.DataFrame(np.vstack((Y, results1))).T
final1.columns = ['Y', 'Y_hat_RNN']
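To put a number on the comparison, we can compute an error metric for each model against the actuals (a quick sketch using the two DataFrames built above; mean absolute error is my choice here):
mae_dense = (final['Y'] - final['Y_hat_Dense']).abs().mean()
mae_rnn = (final1['Y'] - final1['Y_hat_RNN']).abs().mean()
print(f'Dense MAE: {mae_dense:.4f}, RNN MAE: {mae_rnn:.4f}')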
The results are as follows:
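Here is one way to plot the actuals against both forecasts (a minimal matplotlib sketch):
plt.figure(figsize=(10, 5))
plt.plot(final['Y'], label='Actual')
plt.plot(final['Y_hat_Dense'], label='Dense forecast')
plt.plot(final1['Y_hat_RNN'], label='RNN forecast')
plt.legend()
plt.show()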
As expected, the simple RNN model did much better than the Dense model.
I would like to end this blog with a thought on how easy time-series forecasting has become with deep learning. We can improve model performance further with better architectures, MC dropout, hyperparameter tuning, and a ton of other techniques. This is just the beginning.
Hope you enjoyed this blog.
Good luck!!!
References:
Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow by Aurélien Géron