You might have heard about NVIDIA’s AI projects called GauGAN and Image Inpainting that use deep learning to edit images and create highly realistic scenes. If you haven’t, here’s a little overview, since today you can finally play around with both of these tools!
As the NVIDIA Research team says, GauGAN “leverages generative adversarial networks, or GANs, to convert segmentation maps into lifelike images”. The tool offers a handful of simple brush tools that correspond to natural and man-made elements such as sky, grass, sea, road, tree, bush, and more. With them, you can simply doodle on the canvas and get a matching realistic scene. Below you can find a short video about the project:
The project can help with creating virtual worlds and landscape designs, and with quickly prototyping ideas. Out-of-the-box or custom filters applied to the image can vary the look and contribute to brainstorming even more.
Bryan Catanzaro, vice president of applied deep learning research at NVIDIA, describes GauGAN as a “smart paintbrush” that can quickly fill in the details inside rough segmentation maps. Users can “draw their own segmentation maps and manipulate the scene, labeling each segment with labels like sand, sky, sea or snow”.
The whole system acts as a cooperating pair of networks: a generator and a discriminator. The generator creates images and presents them to the discriminator. The discriminator, trained on real images, gives its “partner” pixel-by-pixel feedback on how to improve the realism of the synthetic output. For example, the discriminator knows that water and snow are reflective and guides the generator accordingly. As a result, the neural network creates a believable image.
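The generator-versus-discriminator dynamic described above can be illustrated with a toy sketch (this is purely illustrative and has nothing to do with NVIDIA's actual GauGAN code): here the “images” are just 1-D numbers, the generator is an affine map from noise, and the discriminator is a logistic regression, both trained with hand-written gradient steps.

```python
import numpy as np

rng = np.random.default_rng(0)

# Real data the generator must imitate: samples from N(4.0, 0.5).
def real_batch(n):
    return rng.normal(4.0, 0.5, size=n)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Generator G(z) = wg * z + bg; discriminator D(x) = sigmoid(wd * x + bd).
wg, bg = 1.0, 0.0
wd, bd = 0.1, 0.0

lr, n = 0.05, 64
for step in range(2000):
    z = rng.normal(0.0, 1.0, size=n)
    fake = wg * z + bg
    real = real_batch(n)

    # Discriminator update: push D(real) -> 1 and D(fake) -> 0.
    d_real = sigmoid(wd * real + bd)
    d_fake = sigmoid(wd * fake + bd)
    g_real = d_real - 1.0          # BCE gradient w.r.t. logit, label 1
    g_fake = d_fake - 0.0          # BCE gradient w.r.t. logit, label 0
    wd -= lr * np.mean(g_real * real + g_fake * fake)
    bd -= lr * np.mean(g_real + g_fake)

    # Generator update: use the discriminator's feedback to push
    # D(fake) -> 1 (the non-saturating generator loss).
    d_fake = sigmoid(wd * fake + bd)
    g_logit = d_fake - 1.0         # label 1 from the generator's viewpoint
    wg -= lr * np.mean(g_logit * wd * z)   # chain rule through fake = wg*z + bg
    bg -= lr * np.mean(g_logit * wd)

print(f"generator output ~ mean {bg:.2f}, std {abs(wg):.2f}")
```

After training, the generator's output distribution drifts toward the real one (mean near 4.0), precisely because the discriminator's gradient tells it how to look more “real” — the same feedback loop GauGAN runs at image scale.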
If you are interested in GauGAN, read more here. You can also try out the demo version and see the possibilities for yourself (head here). Note that in the demo version, you won’t be able to use the real-time feature and will have to wait a few moments to see the image converted. That didn’t prevent us from having some fun, though. Looks a bit blurry – just like our last vacation photos:
NVIDIA’s other project, Image Inpainting, fills in missing or corrupted regions of a photo. The NVIDIA research team says: “Our model can robustly handle holes of any shape, size, location, or distance from the image borders. Previous deep learning approaches have focused on rectangular regions located around the center of the image, and often rely on expensive post-processing. Further, our model gracefully handles holes of increasing size”.
To train the network, the researchers generated thousands of masks of various holes, streaks, and other shapes, applied them to images from the databases, and then let the network recover the input. What’s more, the network uses “a partial convolution layer that renormalizes each output depending on the validity of its corresponding receptive field”. Previously, the accuracy of the recovery depended strongly on the placeholder values fed to the network for the missing pixels, which resulted in numerous artifacts. Thanks to the renormalization mentioned above, this dependency is eliminated, which lets the network outperform the previous methods.
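A single partial-convolution step can be sketched as follows (an assumption-laden illustration of the renormalization rule from the paper, not NVIDIA's released code): each output is computed only from valid input pixels and rescaled by the fraction of the window that was valid, so responses near holes don't shrink, and the mask is updated so holes close up layer by layer.

```python
import numpy as np

def partial_conv2d(x, mask, w, b=0.0):
    """Partial convolution over a single-channel image (sketch).

    x, mask : (H, W) arrays; valid pixels have mask == 1, holes 0.
    w       : (k, k) convolution kernel, applied without padding.
    Returns the renormalized output and the updated mask.
    """
    k = w.shape[0]
    oh, ow = x.shape[0] - k + 1, x.shape[1] - k + 1
    out = np.zeros((oh, ow))
    new_mask = np.zeros((oh, ow))
    window_size = float(k * k)            # sum(1) over the receptive field
    for i in range(oh):
        for j in range(ow):
            m = mask[i:i + k, j:j + k]
            valid = m.sum()               # sum(M): how many inputs are valid
            if valid > 0:
                patch = x[i:i + k, j:j + k] * m
                # Renormalize by sum(1)/sum(M): the fewer valid pixels in
                # the window, the more the partial sum is scaled back up.
                out[i, j] = (w * patch).sum() * (window_size / valid) + b
                new_mask[i, j] = 1.0      # window saw >= 1 valid pixel
    return out, new_mask

# Usage: a 3x3 mean filter over a constant image with a 2x2 hole.
img = np.ones((5, 5))
hole_mask = np.ones((5, 5))
hole_mask[1:3, 1:3] = 0.0                 # mark a 2x2 region as invalid
w = np.full((3, 3), 1.0 / 9.0)
out, new_mask = partial_conv2d(img * hole_mask, hole_mask, w)
```

In this example every output stays exactly 1.0 even for windows overlapping the hole: without the `sum(1)/sum(M)` factor, those outputs would be dragged toward zero by the missing pixels, which is exactly the kind of artifact the renormalization removes.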