Day 8: Move away from MNIST datasets

The MNIST dataset is too easy: most pairs of digits can be distinguished fairly well by a single pixel.
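To make the one-pixel claim concrete, here is a minimal sketch of how one could search for the single pixel (and threshold) that best separates two classes. It runs on synthetic data standing in for two digit classes; the function name `best_single_pixel` and the toy arrays are assumptions for illustration, not something taken from an MNIST experiment.

```python
import numpy as np

def best_single_pixel(images_a, images_b):
    """Find the pixel and threshold that best separate two classes.

    images_a, images_b: arrays of shape (n, num_pixels), values in [0, 1].
    Returns (pixel_index, threshold, accuracy).
    """
    x = np.concatenate([images_a, images_b])
    y = np.concatenate([np.zeros(len(images_a)), np.ones(len(images_b))])
    best = (0, 0.0, 0.0)
    for pixel in range(x.shape[1]):
        for threshold in np.linspace(0.0, 1.0, 11):
            pred = (x[:, pixel] > threshold).astype(float)
            acc = np.mean(pred == y)
            acc = max(acc, 1.0 - acc)  # allow either polarity of the rule
            if acc > best[2]:
                best = (pixel, float(threshold), float(acc))
    return best

# Synthetic stand-in for two classes (hypothetical data, not MNIST):
# both classes are dim everywhere, but class b is bright at pixel 5.
rng = np.random.default_rng(0)
class_a = rng.uniform(0.0, 0.3, size=(200, 16))
class_b = rng.uniform(0.0, 0.3, size=(200, 16))
class_b[:, 5] += 0.6

pixel, threshold, accuracy = best_single_pixel(class_a, class_b)
print(pixel, threshold, accuracy)
```

On real data, `images_a` and `images_b` would be the flattened images of two digit classes, and the reported accuracy indicates how far a single pixel gets for that pair.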

Ian Goodfellow wants people to move away from MNIST.

Instead of moving on to harder datasets than MNIST, the ML community is studying it more than ever. Even proportional to other datasets https://t.co/TAo52VC1Fg

— Ian Goodfellow (@goodfellow_ian) April 13, 2017

François Chollet, the author of Keras, also recommends using CIFAR10 instead of MNIST:

Many good ideas will not work well on MNIST (e.g. batch norm). Inversely many bad ideas may work on MNIST and no transfer to real CV.

— François Chollet (@fchollet) April 13, 2017

No problem! I will rerun my experiments on FashionMNIST, EMNIST, CIFAR10, and STL. I will skip the SVHN dataset because the original CVAE paper already shows that CVAE works on SVHN. For this post, I will just work on FashionMNIST and EMNIST.

CVAE model

FashionMNIST:

CVAE: 16 images sampled from each category using the same latent vectors. Each row is a category, and each column is one unique 64-dimensional latent vector.
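The grid layout described above can be sketched as follows. This only builds the decoder inputs; it assumes the decoder takes a latent vector concatenated with a one-hot label, which is a common CVAE setup but not necessarily the exact one used for these images. The helper name `conditional_grid_inputs` is hypothetical.

```python
import numpy as np

def conditional_grid_inputs(num_classes, num_columns, latent_dim, seed=0):
    """Build decoder inputs for a (num_classes x num_columns) sample grid.

    The same `num_columns` latent vectors are reused for every class,
    so each column shares one latent vector across all rows.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((num_columns, latent_dim))  # one latent per column
    z_grid = np.tile(z, (num_classes, 1))               # repeat for every row
    labels = np.repeat(np.arange(num_classes), num_columns)
    one_hot = np.eye(num_classes)[labels]                # class label per cell
    return z_grid, one_hot

# 10 categories, 16 latent vectors of 64 dimensions each
z_grid, one_hot = conditional_grid_inputs(num_classes=10, num_columns=16, latent_dim=64)

# Assumed decoder input: latent vector concatenated with the one-hot label
decoder_input = np.concatenate([z_grid, one_hot], axis=1)
print(decoder_input.shape)
```

Feeding `decoder_input` to a trained decoder and reshaping the outputs to 10 rows by 16 columns would give a grid where differences along a row come only from the latent vector, and differences down a column come only from the label.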

On FashionMNIST, CVAE has a hard time distinguishing a pullover from a coat. The sneaker and the sandal are also barely distinguishable. This suggests that CVAE is not good at capturing fine-grained detail.

EMNIST:

I can hardly recognize anything here. CVAE does quite poorly on EMNIST.

If CVAE does not do well, how do GANs perform on FashionMNIST and EMNIST?

DCGAN

Here are 20 images sampled from the generator after training for 500 epochs on each dataset.

FashionMNIST:

The generated images from this dataset are okay but not great. I can still tell the category of each image. Some images are incomplete, and some clothes look like they have a hole. The generated images are completely different from the VAE's. VAE is able to preserve the overall structure of the images, while GANs seem to take care of all the little details. A combination of the two could be interesting.

EMNIST:

For the EMNIST dataset, I can hardly recognize anything. However, the EMNIST dataset itself is already difficult: even I have trouble distinguishing some of the letters. I can either add more capacity to the model or work on stabilizing the GAN. Currently, the discriminator's loss goes to 0, which looks like a failure mode.
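As one illustration of what "stabilizing the GAN" could mean, here is a minimal sketch of one-sided label smoothing, a commonly used trick in which the discriminator's targets for real images are set to 0.9 instead of 1.0. This is an assumption about a possible next step, not what was run for the results above; the helper names `bce` and `discriminator_loss` are hypothetical. The point of the sketch is that with smoothed targets, an overconfident discriminator can no longer drive its loss to 0.

```python
import numpy as np

def bce(probs, targets, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and targets."""
    probs = np.clip(probs, eps, 1.0 - eps)
    losses = -(targets * np.log(probs) + (1.0 - targets) * np.log(1.0 - probs))
    return float(np.mean(losses))

def discriminator_loss(d_real, d_fake, real_label=0.9):
    """Discriminator loss with one-sided label smoothing.

    d_real, d_fake: discriminator outputs (probabilities) on real/fake batches.
    real_label: target for real images; 1.0 disables smoothing.
    """
    real_targets = np.full_like(d_real, real_label)
    fake_targets = np.zeros_like(d_fake)  # fake targets are left at 0
    return bce(d_real, real_targets) + bce(d_fake, fake_targets)

# A discriminator that is almost perfectly confident on both batches
d_real = np.full(8, 0.999)
d_fake = np.full(8, 0.001)

hard = discriminator_loss(d_real, d_fake, real_label=1.0)
smooth = discriminator_loss(d_real, d_fake, real_label=0.9)
print(hard, smooth)
```

With hard labels the loss of the overconfident discriminator is close to 0, while with smoothed labels it stays well above 0, so extreme confidence on real images is penalized rather than rewarded.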

What is next:

I want to work on color images and see how each model performs on datasets such as CIFAR10, STL, or SVHN.


[Sage]Blog

A personal research blog.
