Generating Beetles From Public Domain Images

Ever since [Ian Goodfellow] and his colleagues invented the generative adversarial network (GAN) in 2014, hundreds of projects, from style transfers to poetry generators, have been built on the concept of two neural networks competing against each other. Unlike networks that only classify or predict, GANs can generate new data that statistically resembles the training set.
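For readers who want the "competing networks" idea in symbols, the objective usually quoted from the original 2014 GAN paper is sketched below. The notation is not from this article: G is the generator, D is the discriminator, p_data is the distribution of real training images and p_z is the noise distribution fed to the generator.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator tries to push this value up by telling real beetles from generated ones, while the generator tries to push it down by producing beetles the discriminator accepts as real.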

[Bernat Cuni], the one-man design team behind [cunicode], came up with the idea to generate beetles using this technique. Inspired by material published on Machine Learning for Artists, he decided to run some visual experiments with zoological illustrations. The training data came from a public domain book hosted at archive.org, which he found through the Biodiversity Heritage Library. A combination of OpenCV and ImageMagick helped extract each illustration into its own square image.
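The article does not say how the square crops were computed, so the following is only a minimal sketch of one plausible step, not [Cuni]'s actual pipeline. It assumes a bounding box (x, y, w, h) for each illustration has already been found, for example with OpenCV's `cv2.boundingRect` on a detected contour. The function name `square_crop_box` and the `margin` parameter are hypothetical.

```python
def square_crop_box(x, y, w, h, page_w, page_h, margin=0):
    """Return (left, top, side) of a square region centred on a bounding box.

    (x, y, w, h) is the illustration's bounding box in pixels and
    (page_w, page_h) is the size of the scanned page. Hypothetical helper:
    the original article does not describe this step in detail.
    """
    # The square must cover the longer edge of the box, plus optional padding,
    # but it can never be larger than the page itself.
    side = min(max(w, h) + 2 * margin, page_w, page_h)

    # Centre the square on the box using integer arithmetic.
    left = x + (w - side) // 2
    top = y + (h - side) // 2

    # Shift the square back inside the page if it spills over an edge.
    left = max(0, min(left, page_w - side))
    top = max(0, min(top, page_h - side))
    return left, top, side


if __name__ == "__main__":
    # A tall 40x80 px beetle on a 1000x800 px page.
    print(square_crop_box(100, 50, 40, 80, 1000, 800))
```

With the page loaded as an array, the region could then be cut out with a slice such as `page[top:top + side, left:left + side]` and scaled to the training resolution with `cv2.resize`.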

[Cuni] then trained a DCGAN on the data set, generating the first set of quasi-beetles after some tinkering with epochs and settings. After that failed first experiment, he switched to StyleGAN, setting up a machine at Paperspace with one GPU and running the training for more than three days on 128 px images. The results were much better, but the images were fairly small and the machine was expensive to run (>€125).

Given the success of the previous experiment, he decided to move over to Google Colab, using its free 12 hours of K80 GPU time per run to generate some more beetles. Intent on producing higher-resolution beetles, he used Runway to train on 1024 px beetle images, finding much better results after 3000 steps. The model was then moved over to Google Colab to produce HD outputs.

He has since continued to experiment with the beetles, producing some confusing generated images and fun collectibles.


10 thoughts on “Generating Beetles From Public Domain Images”

  1. Interesting that the initial NNs did not pick up on some fundamental properties of symmetry, i.e. right half/left half. Also, might it speed up training to work with just half images?

    1. Correct, the reality is that GANs are really crappy at doing symmetry unless the dataset is very constrained, as the beetle set is. It works because the NN vector space is on a scale similar to the genetic one that controls beetle morphology. Try it with anything more complex and you rapidly see how useless GANs are for reliable high quality image generation that requires a complex encoding of symmetry at different scales. This is because one patch of the image has very little influence over remote parts. The NN crowd find it hard to admit this weakness. I know because I have been pointing it out to them for about a year. They can’t even generate a convincing snowflake ❄️ or spiral 🐚 despite such geometries being ubiquitous in the universe at almost all scales up to the size of galaxies.

  2. Not so fast. Artificial edible beetles used for grossing out your friends are in very high demand among the huge and lucrative 6-11 year-old demographic. But fake tasty beetles don’t auto-reproduce, and even if they did they would all look the same – how boring. At least with [Ian Goodfellow’s] ML beetle-style generator no two edible fake beetles will ever look the same. IMO that is work worthy of a Nobel.
