Day 8: Move away from MNIST datasets

The MNIST dataset is too easy: most pairs of digits can be told apart reasonably well by a single pixel.

Ian Goodfellow wants people to move away from MNIST.

François Chollet, the author of Keras, also encourages people to use CIFAR10 instead of MNIST.

No problem! I will rerun my experiments on FashionMNIST, EMNIST, CIFAR10, and STL. I will skip the SVHN dataset because the original CVAE paper already shows that CVAE works on SVHN. In this post, I will only cover FashionMNIST and EMNIST.

CVAE model

FashionMNIST:

CVAE_FashionMNIST.png

CVAE: 16 images sampled from each category using the same latent vectors. Each row is a category, and each column is one 64-dimensional latent vector.

On FashionMNIST, CVAE has a hard time distinguishing a pullover from a coat. Sneakers and sandals are also barely distinguishable. This suggests that CVAE is not good at capturing fine-grained detail.
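The sampling grid in the figure above (one row per category, one shared latent vector per column) can be sketched roughly as follows. This is a minimal NumPy sketch, not the code used for the experiments: the function name `build_decoder_inputs`, the one-hot label encoding, and the standard normal prior are assumptions, and the trained decoder itself is left out.

```python
import numpy as np

def build_decoder_inputs(num_classes=10, num_samples=16, latent_dim=64, seed=0):
    """Return (z_grid, y_grid) for a category-by-latent sampling grid.

    Hypothetical helper: assumes one-hot labels and a standard normal prior.
    """
    rng = np.random.default_rng(seed)
    # One latent vector per column of the figure.
    z = rng.standard_normal((num_samples, latent_dim))
    # Reuse the same block of latent vectors for every category (row).
    z_grid = np.tile(z, (num_classes, 1))
    # Every image in a given row shares one category label.
    labels = np.repeat(np.arange(num_classes), num_samples)
    y_grid = np.eye(num_classes)[labels]
    return z_grid, y_grid

z_grid, y_grid = build_decoder_inputs()
# A trained decoder (not shown) would take each (z, y) pair and return one image.
```

Because row `i * 16 + j` always pairs category `i` with latent vector `j`, any difference between two images in the same column comes from the label alone.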

EMNIST:

EMNIST_results.png

I can hardly recognize anything here. CVAE does quite poorly on this dataset.

If CVAE does not do well, how do GANs perform on FashionMNIST and EMNIST?

DCGAN

Here are 20 images sampled from the generator after training for 500 epochs on each dataset.

FashionMNIST:

DCGAN_FashionMNIST.png

The generated images for this dataset are okay but not great. I can still tell the category of each image. Some images are incomplete, and some clothes look like they have a hole in them. The generated images are quite different from the VAE's. The VAE preserves the overall structure of the images, while the GAN seems to take better care of the small details. A combination of the two could be interesting.

EMNIST:

DCGAN_EMNIST.png

For the EMNIST dataset, I can hardly recognize anything. However, EMNIST itself is already difficult: even I have trouble telling some of the letters apart. I can either add more capacity to the model or work on stabilizing the GAN. Currently, the discriminator’s loss goes to 0, which looks like a failure mode.
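One way to catch a discriminator loss that collapses toward 0 early, rather than after 500 epochs, is to track a moving average during training. The sketch below is only an illustration: the class name `DiscriminatorCollapseMonitor`, the window size, and the threshold are all assumptions that would need tuning for a real run.

```python
from collections import deque

class DiscriminatorCollapseMonitor:
    """Flag when the recent average discriminator loss is close to zero.

    Hypothetical helper; window and threshold are illustrative defaults.
    """

    def __init__(self, window=100, threshold=0.01):
        # Keep only the most recent `window` loss values.
        self.losses = deque(maxlen=window)
        self.threshold = threshold

    def update(self, d_loss):
        """Record one discriminator loss; return True if it looks collapsed."""
        self.losses.append(float(d_loss))
        # Wait for a full window so a single small value does not trigger.
        window_full = len(self.losses) == self.losses.maxlen
        mean_loss = sum(self.losses) / len(self.losses)
        return window_full and mean_loss < self.threshold
```

Calling `monitor.update(d_loss)` once per training step and stopping (or logging a warning) when it returns `True` would make this failure mode visible without waiting for the sampled images.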

What is next:

I want to move on to color images and see how each model performs on datasets such as CIFAR10, STL, or SVHN.