First Microsoft. Then Facebook. And now Google. As is so often the case, the giants of the Internet are chasing the same sparkly vision of the future: chatbots.
In the coming months and years, these companies promise, you'll chat with Internet services in much the same way you now chat with friends and family. Bots will instantly answer questions, respond to requests, and even anticipate your needs. While chatting with some old college pals about an upcoming reunion, you'll ask an OpenTable bot for restaurant recommendations. Without opening a separate app, you'll book a hotel through Travelocity.
But a major challenge remains: building chatbots that can actually chat. Machines can mimic conversation in some ways, but they're still a long way from really grasping the way humans talk. Late last month, in an effort to advance such AI, and to score PR points against its rivals, Google open sourced one of the tools it uses for natural language understanding. (If you share, you get more people pushing the state of the art.) And today, not to be outdone, Facebook unveiled an important part of its own underlying technology, a natural language engine it calls DeepText.
Facebook is not yet open sourcing this technology. And the company is only beginning to use DeepText with its own services. But as described by Facebook, DeepText shows how the giants of the Internet hope to accelerate the progress of natural language understanding in the months and years to come. In building these systems, they aim to rely far less on humans and far more on data—enormous troves of online data.
Learning to Understand
Both Google and Facebook are now using deep neural networks to advance their natural language ambitions. Deep neural nets have already proven effective for many other online tasks, such as recognizing faces in photos or identifying commands spoken into smartphones. The hope is that these networks of software and hardware, which learn discrete tasks by analyzing vast amounts of data, will prove just as effective in learning to understand and respond to language in a natural way.
Google's newly open sourced system, called SyntaxNet, uses neural nets to understand the grammatical logic of a given sentence. Much as a neural net can learn to recognize a cat by analyzing millions of cat photos, it can learn to understand grammar—nouns, verbs, how a verb relates to the object, and more—by analyzing millions of sentences. This approach, called syntactic parsing, is effective, but it's not without limitations. Humans must carefully tag those millions of example sentences, identifying each part of speech and how it relates to the rest before SyntaxNet can learn from the data. And even if a machine learns to understand the grammar of a sentence, it must go significantly further to understand the complete meaning of a conversation.
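To make the tagging burden concrete, here is a minimal sketch in Python of what one hand-labeled sentence might look like. The `Token` fields, the label names, and the example sentence are all assumptions made for illustration; this is not SyntaxNet's actual data format or API.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    pos: str       # part of speech, e.g. "NOUN" or "VERB"
    head: int      # index of the word this token attaches to; -1 marks the root
    relation: str  # how the token relates to its head


# One hypothetical hand-labeled example: "Alice booked a hotel"
SENTENCE = [
    Token("Alice", "NOUN", 1, "subject"),
    Token("booked", "VERB", -1, "root"),
    Token("a", "DET", 3, "determiner"),
    Token("hotel", "NOUN", 1, "object"),
]


def root_index(sentence):
    """Return the position of the token that has no head (the main verb here)."""
    for i, token in enumerate(sentence):
        if token.head == -1:
            return i
    raise ValueError("sentence has no root token")


def dependents(sentence, head_index, relation):
    """Return the words attached to head_index with the given relation."""
    return [t.text for t in sentence
            if t.head == head_index and t.relation == relation]


verb = root_index(SENTENCE)
print(SENTENCE[verb].text, "->", dependents(SENTENCE, verb, "object"))
```

Every field in every token has to be filled in by a person, for millions of sentences, before a parser can learn from the data.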
But Facebook researchers say they're already pushing the state of the art into new territory. "[DeepText] helps us compensate for the lack of labeled data sets," says Facebook director of engineering Hussein Mehanna. "It comes with a massive amount of structure. It can learn in an unsupervised manner." In other words, Facebook's system relies more on math than grammatical exactitude.
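One way to picture "math rather than grammar" is to turn raw text into a list of numbers without consulting any linguistic rules. The sketch below hashes three-character fragments into a fixed-size vector and compares two vectors by cosine similarity. It is an illustration only: the function names and the vector size are assumptions, nothing here is learned from data, and it is not a description of how DeepText works.

```python
import hashlib
import math

DIM = 256  # length of every vector; an arbitrary choice for this sketch


def trigrams(text):
    """Split text into overlapping three-character fragments."""
    padded = f"  {text.lower()} "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def embed(text):
    """Count hashed trigrams into a fixed-length vector. No grammar involved."""
    vec = [0.0] * DIM
    for gram in trigrams(text):
        digest = hashlib.md5(gram.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % DIM
        vec[bucket] += 1.0
    return vec


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


english = embed("I need a taxi")
french = embed("J'ai besoin d'un taxi")
print(cosine(english, french))
```

The same arithmetic runs on English, French, or any other text, because the code never needs to know what a noun or a verb is.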
"What they're saying is they didn't teach the neural network anything about the structure of language," Chris Nicholson, founder of deep learning startup Skymind, says of Facebook's work, which was previously discussed in a handful of public research papers. This is important, he adds, because it can make for a more flexible system, one that can readily expand to many different scenarios. Facebook's system can learn French or Spanish the same way it could English: by breaking it down to mere math. According to Mehanna, DeepText already works with 20 different languages.
So Much Chatter
In the past, researchers built natural language engines using carefully coded rules—an approach that's difficult and time-consuming. That's how Apple built Siri. By building systems that learn on their own, companies like Google and Facebook are seeking to build systems that can grow and get smarter without as much human intervention. But we're not quite there yet. Facebook's methods are still in the early stages, and not everyone is convinced they're as effective as Facebook says they are.
Noah Smith, a University of Washington computer scientist who specializes in natural language understanding, says Facebook's system is far from the only effort to reach such understanding through unlabeled data. Based on a recent Facebook research paper, Smith says, he doesn't find the company's approach especially exciting. But this is certainly an area where he and many others believe research will go.
Mehanna says Facebook will publish newer research related to DeepText this summer. And he says the company is beginning to test the tool as a way of powering chatbots inside Facebook Messenger. As he explains it, the system can help recognize, during an ordinary chat with friends and family, when you're looking for a taxi ride. And there's good reason to believe Facebook might have an edge here: data.
In order to learn from natural language, you need enormous amounts of natural language in digital form. Not so long ago, that was hard to come by. But Facebook has it in droves: millions of real conversations that play out on its social network day after day. According to Mehanna, people create 400,000 new posts on the site with each passing minute, and each day, they post about 80 million comments on those posts.
Yes, since Facebook is training DeepText on data it culls from its own site, it's hard for outside researchers to verify the company's claims of natural language proficiency. But this data is also a uniquely powerful thing. Right now, almost all the chatter on Facebook is humans talking to humans. But the machines are listening and learning, and one day, we may be talking with them too.