Novice Perspective: An Overview of the Deep Learning Landscape

2020-10-29

Let me start with a disclaimer: I'm writing this article because I think it'll be a good way to deal with a feeling of overwhelm around deep learning - there's just so much that has happened and is happening in the field. I feel the need to have at least an approximate mental model of interesting developments. Because I'm a novice, there might be important things missing, and there's a good chance that something's not quite right. If you spot one of those - let me know on Twitter @vsupalov :)

Alright, with that out of the way let's dive in!

What's The Goal?

I've written about it before: I feel like useful applications of deep learning have snuck up on me. Very impressive results and really cool applications started popping up all around. I'd like to go through some of them and connect the dots to underlying deep learning techniques.

I hope that this will help me form a rough mental model of the field, at least naming the most visible novice-friendly developments of the past few years.

I think that having this map will help guide my future learning and will provide a sound foundation to categorize new things I stumble upon.

The Simplistic Approach Used

I'll list cool stuff I can think of or discover while researching recent developments in deep learning applications, in no particular order.

The (highly) subjective criteria are: enough visibility to get on my radar, and creating some degree of excitement. Things which cause some kind of "whoa dude" reaction. For each entry, I'll try to connect it to the relevant high-level techniques used.

I think it's a good place to start - applications are what's visible from the place I'm at right now. I hope they will provide a good window into what's been happening and help me understand what's cool when it comes to applied deep learning techniques.

01: Videogame Domination

In 2019, DeepMind created "AlphaStar", a StarCraft II AI which "plays better than 99.8% of human players". Also in 2019, OpenAI's "OpenAI Five" defeated the Dota 2 world champion team 2-0 (this Two Minute Papers video is pretty cool, or see this or this official blog post by OpenAI for more juicy details).

For me, reinforcement learning is the main term of interest in these cases. "OpenAI Five plays 180 years worth of games against itself every day, learning via self-play."

180 years worth of games. Per day. Counting each of the five heroes separately, that's about 900 years of experience per day. I don't even.
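To get a feel for these numbers, here's a back-of-the-envelope calculation. It's only a sketch: the 180 years per day is the figure quoted above, while the 365.25-day year and reading the 900 figure as "180 years times five heroes on a team" are my own assumptions.

```python
# Back-of-the-envelope: how much faster than real time is "180 years per day"?
YEARS_PER_DAY = 180      # figure quoted from OpenAI above
DAYS_PER_YEAR = 365.25   # assumption: average length of a year
HEROES_PER_TEAM = 5      # assumption: a Dota 2 team fields five heroes


def speedup_vs_real_time(years_per_day: float) -> float:
    """Days of game time experienced per day of wall-clock time."""
    return years_per_day * DAYS_PER_YEAR


def hero_years_per_day(years_per_day: float, heroes: int) -> float:
    """Experience per day when each hero's playtime is counted separately."""
    return years_per_day * heroes


if __name__ == "__main__":
    print(speedup_vs_real_time(YEARS_PER_DAY))                 # 65745.0
    print(hero_years_per_day(YEARS_PER_DAY, HEROES_PER_TEAM))  # 900
```

Put differently: a human playing around the clock would need roughly 65,745 days to match a single day of this kind of self-play.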

02: Believable Text Generation

OpenAI's GPT-2 model seemed too scary to publish at first, and caused a lot of discussion. GPT-3 is the next iteration, and is only accessible through an API.

The text produced by these models is hard to distinguish from human-written text if the person reading it isn't paying attention. Apart from potentially scary applications, it's good enough to co-create dream-like and imaginative text adventures.

My understanding of the technical details is VERY limited (maybe a careful read through this cool explanation would help), but the main terms of interest here seem to be transformers (which seem to be the bee's knees right now) and BERT. (Here's a good explanation of how they relate to each other.)

03: Learning On Images

Not quite as spectacular, but impressive to me: the way CNNs (convolutional neural networks) incorporate convolutions and make it possible to learn which features matter.
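To make "convolution" a bit more concrete, here's a minimal sketch in plain Python. The toy image and the kernel are made up by me for illustration. The important difference to a real CNN is that I picked the kernel values by hand, while a CNN learns them from data during training.

```python
# A minimal 2D convolution (strictly speaking a cross-correlation, which is
# what most CNN libraries compute) with no padding and a stride of 1.

def conv2d(image, kernel):
    """Slide `kernel` over `image` and return the resulting feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Toy 4x4 "image": dark (0) on the left, bright (1) on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hypothetical hand-written kernel which responds to dark-to-bright vertical edges.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, vertical_edge)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The feature map lights up exactly where the edge is. In a CNN, many such kernels are stacked in layers, and training figures out which kernel values (that is, which features) are worth looking for.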

That said, it seems like transformers are on the rise in this area as well (as quickly described in this video), at least when it comes to cutting-edge and large-scale scenarios.

Terms of interest: CNNs, transformers (again).

04: This X Doesn't Exist

Generating similar, never-before-seen images. Of people, for example, but there are a lot more pages like these. They are powered by GANs (generative adversarial networks).

05: Good Artists Borrow...

Human: "I want this image, but make it look like it was painted by Monet." CycleGAN: "Ok done."

GANs strike again! Check out the GitHub repo for examples and (if you haven't seen it already) prepare to be blown away.

Apart from GAN, image-to-image translation is a term of interest.

06: Artbreeder

The previous two entries, turned up to 11.

You can generate images and mix & match existing ones. Take a look here. Don't plan on leaving too quickly. Once again, I don't even.

Once again, GAN is the term of interest. You can read what approach powers the site here.

Intermission: Curious to read more about GAN architectures? This is a very solid looking article providing lots of interesting details.

07: Deep Fakes

Sending ripples through the social fabric of the internet. Can you even trust images or videos anymore? Deep fakes are easy to make (or so I've heard) and their potential for harmful applications seems to be pretty darn high.

It's GANs with a twirly mustache again (though, as far as I can tell, many face-swapping tools rely on autoencoders instead)!

Terms of interest: GAN

08: Learn From Anyone

GPT-3 strikes again. This demo was just too cool to forget. You can find the site here but it doesn't look like it's open to the public to me.

As far as I understand, the application primes GPT-3 with some output from the person in question and tries to generate plausible-looking text.

You type in the name of a (semi) famous person, and GPT-3 tries very hard to generate text which fits your prompt and sounds like that person.
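Here's a rough sketch of what "priming" could look like. I don't know how the actual site builds its prompts, so the function name, the prompt layout and the placeholder strings below are all hypothetical. The sketch only assembles the text, it doesn't call any API.

```python
from typing import List


def build_prompt(person: str, samples: List[str], question: str) -> str:
    """Assemble a text prompt which 'primes' a language model with example lines.

    Hypothetical layout: a short header, a few lines attributed to the person,
    the user's question, and finally the person's name as an open speaker label.
    """
    header = f"The following is a conversation with {person}."
    examples = "\n".join(f"{person}: {line}" for line in samples)
    # Ending on the speaker label nudges the model to continue in that voice.
    return f"{header}\n{examples}\nStudent: {question}\n{person}:"


prompt = build_prompt(
    "Famous Person",                                      # placeholder name
    ["<a sentence this person has said or written>"],     # placeholder sample
    "How should I get started with deep learning?",
)
print(prompt)
```

The model would then be asked to continue this text, and whatever it produces after the last line is shown as the "answer".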

It's silly. It's entertaining. It's fun. It's awfully close to useful.

Terms of interest: the same as above would do, but OpenAI API is a better fit.

09: Learning Go From Scratch

Blast from the past! This can be seen as a precursor to the videogame entry above. I couldn't help but mention it: AlphaGo Zero by DeepMind.

AlphaGo achieved great performance. AlphaGo Zero "learnt by playing against itself, starting from completely random play".

Terms of interest: reinforcement learning.

10: We Need More FPS

The latest piece of tech which blew me away: DLSS (deep learning super sampling) by Nvidia. It takes a low-resolution image and upscales it in real time, achieving more than acceptable quality and impressive framerates.

Here's a video about the tech.

A deep learning model dreams up details to make your video game run smoother and look fancy. I still have to process this. That's what happens when you have consumer hardware which can run AI models.

Terms of interest: CNN, super sampling

In Conclusion

So, that's about it. My humble view onto the landscape of deep learning powered applications. I hope seeing this list has helped you get one step closer to a more complete (but still simple) overview of the field.

I know that writing it has helped me a lot. There are quite a few interesting trails to follow, and I'm looking forward to adding more entries to this collection as I stumble upon them.

If you think something is missing, or spot a major flaw - let me know via Twitter @vsupalov.

Cheers and thanks for reading,
Vladislav
