Jeff Dean on machine learning, part 1: surveying the landscape

Jeff Dean describes the current landscape of machine learning: its past, present and future.

Whenever you search for pictures of your new puppy in Google Photos, or translate text on the go to order your breakfast correctly on an international trip, you benefit from machine learning. And inside businesses across multiple industries, machine-learning use cases ranging from fraud detection to social-media sentiment analysis to smart facilities management are at play to get the most value from enterprise data.

In this new series of posts devoted to machine learning, Jeff Dean, Google Senior Fellow and lead of the Google Brain research team, will answer questions about the history and future of that field. I’ll complement them with some supporting facts and pointers to other interesting resources.

In this installment, with Jeff serving as our pilot, we’ll fly over the landscape of machine learning from a high altitude. In subsequent installments, we’ll dive into more detail.

The re-emergence of deep learning

Why has machine learning become so popular in recent years?

JD: One of the things that’s really happened in the last 5 or 6 years, that has caused machine learning to really take off, is that we now have enough computational power, and large enough and interesting real-world datasets, to solve problems that previously we weren’t able to solve in any other way: problems in computer vision, speech recognition and language understanding.

P.S.: As examples supporting Jeff’s point, modern deep-learning frameworks like TensorFlow (which Google open sourced in 2015) can take advantage of today’s powerful cloud-based CPUs and GPUs to do the intensive matrix calculations needed by neural networks. Also, with the parallelization of computation via cloud computing and the vast amounts of data that can be stored, it's now possible to scale neural networks to large numbers of neurons and layers of neurons, and to train more complex models.
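To make "intensive matrix calculations" a little more concrete, here is a minimal sketch of what a single neural-network layer computes: a matrix product, a bias addition, and a simple nonlinearity. It is written in plain Python for readability and is only an illustration. The helper names (`matmul`, `relu`, `dense_forward`) and the numbers are invented for this example and are not TensorFlow APIs; a real framework runs the same kind of arithmetic on much larger matrices, using optimized CPU and GPU kernels.

```python
# Illustrative only: one dense (fully connected) layer as a matrix product.
# All helper names and values here are hypothetical, not TensorFlow APIs.

def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def relu(matrix):
    """Replace every negative entry with zero."""
    return [[max(0.0, v) for v in row] for row in matrix]

def dense_forward(inputs, weights, biases):
    """Compute relu(inputs @ weights + biases) for a batch of input rows."""
    product = matmul(inputs, weights)
    shifted = [[v + b for v, b in zip(row, biases)] for row in product]
    return relu(shifted)

# Two examples with two features each, feeding a layer of three neurons.
inputs = [[1.0, 2.0], [0.5, -1.0]]
weights = [[0.5, -1.0, 0.0], [0.25, 0.5, 1.0]]
biases = [0.0, 0.5, -1.0]

outputs = dense_forward(inputs, weights, biases)
print(outputs)
```

Scaling a network to more neurons and more layers mostly means making these matrices bigger and chaining more of these steps, which is why parallel hardware matters so much.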

Beyond expectations

What has surprised you the most about machine learning?

JD: There have been a number of surprising results in the last 5 years or so in machine learning: things that I didn’t think computers could really do that all of a sudden they can now do. The progress in computer vision has been really, really dramatic in the last 5 years. But one of the most exciting things that I’ve seen done by researchers in our group, and also contemporaneously by other researchers around the world, is work on image captioning: essentially models that take the raw pixels of an image and then can generate a sentence describing that image.

So they’ll take an image, and then the model can generate “a closeup of a young girl holding a teddy bear,” or “a young girl asleep on the couch curled with her bear.” And these kinds of models, if you’d asked me a few years ago whether computers could do this, I would’ve said, “Oh, I don’t think they’re going to do that any time soon.” But the fact that we can actually generate quite plausible-sounding sentences from the raw pixels of an image really shows that models are actually understanding what’s in those images to the level that they can write simple sentences about them. That’s pretty amazing.

P.S.: Finding appropriate captions for images is interesting in two ways: not only is the system able to understand what the image is, but it can also create a natural-sounding sentence that describes what it has recognized, in plain English. See the research paper by Oriol Vinyals and team, and the presentation by Jeff at the ACM showing an example, for more on this work. (Note that the TensorFlow-based model for this has also been open sourced.)

Overcoming machine-learning limitations

Which machine learning problems are yet to be solved?

JD: Recently, advances in machine learning have mostly come in areas where we have large labeled datasets: problems where we have the inputs and then the desired outputs for those problems; things like large labeled datasets of images and then what’s in those images or speech recognition where we have audio tracks of speech and actual transcriptions of what people said in those things.

Those supervised-learning problems work extremely well today using the machine-learning techniques that we have. What tends not to work very well today, and is a very active area of research, is areas where we don’t have very much example data of what we want the system to do, where you have very few labeled examples. So being able to solve machine-learning problems where you have just a handful of demonstrations for the system of what it should do is an area where we know we need to use lots of unsupervised data: just lots and lots of images from the world, where we don’t necessarily know what’s in those images but we know that they reflect what the world looks like. And then you need to be able to use a handful of examples to augment the unsupervised data to get the model to do what we want.

Unsupervised learning is a very active area of research and it doesn’t quite work yet for most real production problems, but in the next 5 to 10 years, I expect it to be a really active theme that is a complement to the kinds of supervised learning that we do today.

P.S.: Even with recent advances, supervised machine learning still needs lots of labeled data so that what the network predicts can be compared with the correct expected answer, and collecting that much labeled data is often unrealistic. As a proof of concept of unsupervised learning, in 2012 the Google Brain team created a large-scale brain simulation to analyze unlabeled videos from YouTube, and the system managed to learn all by itself to recognize cats. (Which, as everyone knows, is why the internet was invented!) We also recently demonstrated "zero-shot" learning in Google Translate, where it could translate language pairs it had never trained on.
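The combination Jeff describes, lots of unlabeled data plus a handful of labeled examples, can be illustrated with a deliberately tiny sketch. It first groups unlabeled points without looking at any labels, then uses just two labeled examples to give each group a name. This is a toy "cluster, then label" approach chosen for brevity, not the method used by Google Brain or any production system. The one-dimensional data, the labels, and every function name below are made up for the example.

```python
# Illustrative sketch: group unlabeled points, then name the groups
# using a handful of labeled examples. Data and names are hypothetical.

def two_means(points, iterations=10):
    """Tiny 1-D k-means with k=2, initialized at the min and max of the data."""
    centers = [min(points), max(points)]
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            nearest = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            groups[nearest].append(p)
        # Keep the old center if a group ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers

def name_clusters(centers, labeled):
    """Give each cluster the label of the labeled example closest to its center."""
    return [min(labeled, key=lambda ex: abs(ex[0] - c))[1] for c in centers]

def predict(x, centers, names):
    """Label a new point with the name of its nearest cluster center."""
    nearest = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
    return names[nearest]

# Plenty of unlabeled measurements, but only two labeled ones.
unlabeled = [0.9, 1.1, 1.0, 1.2, 0.8, 4.9, 5.1, 5.0, 5.2, 4.8]
labeled = [(1.05, "cat"), (4.95, "dog")]

centers = two_means(unlabeled)          # structure learned without labels
names = name_clusters(centers, labeled)  # two labels are enough to name it
print(predict(1.3, centers, names), predict(4.4, centers, names))
```

The unlabeled data does most of the work of discovering structure, and the few labels only attach meaning to it. Real problems such as images are far harder than this toy, which is part of why the area remains active research.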

Machine learning everywhere

What will the machine learning landscape look like in 5 to 10 years?

JD: One of the things I think is going to be true in machine learning in the next 5 to 10 years is that it’s going to be everywhere. It’s going to be spread throughout lots and lots of organizations, solving lots and lots of problems in those organizations that are real business needs that can be tackled with machine learning. And if you look back at that graph that Google has, we started out applying machine learning in a handful of different areas, and then over time our usage really exploded because we found all the different places where we could use machine learning, and got more and more of our developers comfortable with how machine learning could be used to solve problems.

We think that same kind of transition is going to happen in every organization. There’s going to be dabbling in a bit of machine learning in some cases to solve some small problems, and then over time, people are going to become more and more comfortable with it, and they’re going to be using it to tackle more and more ambitious problems. It’s going to spread throughout the organization.

We think there’s going to be hundreds of thousands of organizations impacted by machine learning in the next 5 or 10 years.

P.S.: Beyond the obvious examples of powering consumer-facing products like Google Photos, one interesting internal Google use case for machine learning is optimizing datacenter cooling, which has contributed to Google’s objective of reaching 100% renewable energy in its operations. There are thousands more examples inside Google, and as other organizations adopt machine learning, we expect to see a similar rate of growth in their own applications.

Next steps

In upcoming articles in this series, we’ll dive deeper into TensorFlow, explore how Google applies machine learning internally, and explain how you can get started on your own smart apps, so please stay tuned!

At Google Cloud NEXT '17 in San Francisco (March 8-10), you’ll be able to dive into many facets of machine learning in person. Here are a few sessions that you might consider attending: