Autoencoders in TensorFlow — part 2

Sailaja Karra
4 min read · Dec 22, 2020


In this blog I am going to show how to use convolutional neural networks as part of autoencoders in TensorFlow. Please see my part 1 blog here for a general introduction to autoencoders.

As we did last time, I am going to use the Fashion-MNIST data to show how we can use convolutional layers to reconstruct an image. I believe this is more useful than Dense layers, as convolutional layers are better suited to image data and are used with it far more often.

I also want to take a second to give a huge shout out to Aurelien Geron for his amazing book Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow. Most of the code is inspired by this book; I made a few tweaks here and there, but I cannot emphasize enough how good this book is.

So without further ado, here is the code on how to do this.

# General imports
import tensorflow as tf
import matplotlib.pyplot as plt
%matplotlib inline

# Load the Fashion-MNIST dataset
mnist = tf.keras.datasets.fashion_mnist
(training_images, training_labels), (test_images, test_labels) = mnist.load_data()

# Scale pixel values to [0, 1]
X_train = training_images / 255.0
X_valid = test_images / 255.0
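As a quick sanity check, Fashion-MNIST gives us 60,000 training images and 10,000 test images, each 28x28 greyscale:

print(X_train.shape)  # (60000, 28, 28)
print(X_valid.shape)  # (10000, 28, 28)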

Here are the encoder's convolutional layers.

encoder_conv = tf.keras.models.Sequential([
    tf.keras.layers.Reshape([28, 28, 1], input_shape=[28, 28]),
    tf.keras.layers.Conv2D(filters=16, kernel_size=3,
                           padding='same', activation='relu'),
    tf.keras.layers.MaxPool2D(pool_size=2),
    tf.keras.layers.Conv2D(filters=32, kernel_size=3,
                           padding='same', activation='relu'),
    tf.keras.layers.MaxPool2D(pool_size=2),
    tf.keras.layers.Conv2D(filters=64, kernel_size=3,
                           padding='same', activation='relu'),
    tf.keras.layers.MaxPool2D(pool_size=2)
])

The code should be very familiar to anyone who has used Conv2D layers for image-related tasks. We first tell the model the input shape of the data, which in our case is 28x28, and explicitly reshape it to 28x28x1 (1 channel for greyscale, versus 3 for RGB).

Then we run the first convolutional layer with 16 filters, which creates a 28x28x16 tensor (one feature map per filter); passing this through a MaxPool2D layer shrinks it to 14x14x16 (as our pool_size=2). The remaining Conv2D/MaxPool2D pairs repeat this pattern with 32 and 64 filters.

Here is the encoder_conv summary, showing the full model and its output shapes.

encoder_conv.summary()
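The summary output isn't reproduced here, but from the architecture above the output shapes should progress as follows (each 'same'-padded Conv2D preserves the spatial size, and each MaxPool2D halves it):

Reshape          -> (None, 28, 28, 1)
Conv2D (16)      -> (None, 28, 28, 16)
MaxPool2D        -> (None, 14, 14, 16)
Conv2D (32)      -> (None, 14, 14, 32)
MaxPool2D        -> (None, 7, 7, 32)
Conv2D (64)      -> (None, 7, 7, 64)
MaxPool2D        -> (None, 3, 3, 64)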

As you can see, the final output shape of the encoder is 3x3x64: we have shrunk an already small 28x28 image down to 3x3 spatially, but spread the information across 64 filter channels.

Now we need to reconstruct the image using the decoder.

decoder_conv = tf.keras.models.Sequential([
    tf.keras.layers.Convolution2DTranspose(filters=32, kernel_size=3,
        strides=2, padding='valid', activation='relu', input_shape=[3, 3, 64]),
    tf.keras.layers.Convolution2DTranspose(filters=16, kernel_size=3,
        strides=2, padding='same', activation='relu'),
    tf.keras.layers.Convolution2DTranspose(filters=1, kernel_size=3,
        strides=2, padding='same', activation='sigmoid'),
    tf.keras.layers.Reshape([28, 28])
])

Here we take the 3x3x64 input and progressively build it back up into a 28x28 image. See the decoder_conv summary below for how the output image is constructed layer by layer.

decoder_conv.summary()
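Again, the summary output isn't shown here, but the transpose convolutions should upsample along these lines (with 'valid' padding the first layer produces (3-1)*2 + 3 = 7, and the 'same'-padded layers simply double the spatial size):

Conv2DTranspose (32) -> (None, 7, 7, 32)
Conv2DTranspose (16) -> (None, 14, 14, 16)
Conv2DTranspose (1)  -> (None, 28, 28, 1)
Reshape              -> (None, 28, 28)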

With this we are ready to build our full model and run our test. Here are the compile and fit steps.

AE_Conv = tf.keras.models.Sequential([encoder_conv, decoder_conv])
AE_Conv.compile(loss='binary_crossentropy', optimizer='adam')
history = AE_Conv.fit(X_train, X_train, epochs=10, verbose=2,
                      validation_data=(X_valid, X_valid))
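The comparison images below were produced with a small matplotlib helper along these lines (a minimal sketch; the plot_reconstructions name and the figure sizing are my own assumptions, not from the original post):

def plot_reconstructions(model, images, n_images=5):
    # Hypothetical helper: run the autoencoder on a few validation
    # images and show originals (top row) vs reconstructions (bottom row)
    reconstructions = model.predict(images[:n_images])
    plt.figure(figsize=(n_images * 1.5, 3))
    for i in range(n_images):
        # original image in the first row
        plt.subplot(2, n_images, 1 + i)
        plt.imshow(images[i], cmap='binary')
        plt.axis('off')
        # reconstructed image in the second row
        plt.subplot(2, n_images, 1 + n_images + i)
        plt.imshow(reconstructions[i], cmap='binary')
        plt.axis('off')
    plt.show()

plot_reconstructions(AE_Conv, X_valid)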

Here are our final images.

[Figure: original images in the first row, AutoEncoder_Conv generated images in the second row]

As you can see, we are able to get back decent images, though we obviously lose some detail, like shading and patterns.

As we discussed in the part 1 blog, the biggest strength of autoencoders is being able to handle noise. Let's see how the model performs with 50% dropout applied to the input.

encoder_conv_50p = tf.keras.models.Sequential([
    tf.keras.layers.Reshape([28, 28, 1], input_shape=[28, 28]),
    tf.keras.layers.Dropout(0.5),  # 50% dropout of input data
    tf.keras.layers.Conv2D(filters=16, kernel_size=3,
                           padding='same', activation='relu'),
    tf.keras.layers.MaxPool2D(pool_size=2),
    tf.keras.layers.Conv2D(filters=32, kernel_size=3,
                           padding='same', activation='relu'),
    tf.keras.layers.MaxPool2D(pool_size=2),
    tf.keras.layers.Conv2D(filters=64, kernel_size=3,
                           padding='same', activation='relu'),
    tf.keras.layers.MaxPool2D(pool_size=2)
])

We use the same decoder as before.

decoder_conv = tf.keras.models.Sequential([
    tf.keras.layers.Convolution2DTranspose(filters=32, kernel_size=3,
        strides=2, padding='valid', activation='relu', input_shape=[3, 3, 64]),
    tf.keras.layers.Convolution2DTranspose(filters=16, kernel_size=3,
        strides=2, padding='same', activation='relu'),
    tf.keras.layers.Convolution2DTranspose(filters=1, kernel_size=3,
        strides=2, padding='same', activation='sigmoid'),
    tf.keras.layers.Reshape([28, 28])
])

Finally, we create the model and repeat the remaining steps.

AE_Conv2 = tf.keras.models.Sequential([encoder_conv_50p, decoder_conv])
AE_Conv2.compile(loss='binary_crossentropy', optimizer='adam')
history2 = AE_Conv2.fit(X_train, X_train, epochs=10, verbose=2,
                        validation_data=(X_valid, X_valid))
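The same hypothetical plot_reconstructions helper sketched above can be reused to generate the comparison:

plot_reconstructions(AE_Conv2, X_valid)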

Here are the images reconstructed by the model trained with 50% dropout.

[Figure: AutoEncoder_Conv2 generated images in the second row, trained with 50% dropout]

These images are a bit noisier, as expected, with more loss of shading and detail, but the model has certainly learned the shapes and items well.

We will look into RNN- and LSTM-based autoencoders in the next blog.

Happy reading !!!

References:
Book: Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow by Aurelien Geron
