In February 2011, IBM’s supercomputer project Watson won a special series of the American game show Jeopardy!, trouncing the two best-ever human players.
Even though it signed off one show by guessing that Toronto was an American city, the ambitious supercomputer project demonstrated just how far, with a great deal of time and money, natural language processing (NLP) could go.
Watson was mocked on Twitter for the Toronto gaffe but, to place it in context – which is only right since we’re talking about NLP – the fact is that Watson was way ahead of its opponents and could not lose. And it knew it.
That data, for example, could be generated from innumerable contact points with customers – perhaps hours and hours of call centre conversations, written communications, email, website submissions and even Facebook.
This unstructured data – in human language, not computer language – represents about 80 percent of the data hoarded by the enterprise, according to Rhinehart.
And there is valuable information contained within it that NLP can unlock, revealing trends and insights to be acted upon.
“There is a tremendous amount of innovation around this notion of unstructured information and we think of Watson as a moment in time, as a grand challenge breakthrough,” he said.
Context is everything in natural language – Jeopardy is full of puns and wordplay, and this fluid interaction between words is challenging for a computer that relies on precise instructions.
“Watson processes raw information, text and natural language so we can understand what’s in there. With Watson we then put that info into a knowledge base – for want of a better word, not a knowledge base per se, but something that manages and stores unstructured info so it’s retrievable in the right context.
“Once you have this built up set of knowledge you can ask questions and get back answers with varying degrees of confidence.”
Watson’s knowledge base for Jeopardy contained around 200 million pages of information, including Wikipedia and the complete works of Shakespeare. Hundreds of algorithms scour its knowledge base for possible answers.
Then, hundreds more algorithms search for supporting evidence, and yet more algorithms score each answer based on that evidence, resulting in a confidence rating.
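To make those three stages concrete, here is a minimal Python sketch of the general idea: propose candidate answers, collect supporting passages, then turn the evidence into a confidence rating. It is not IBM's implementation. The two-entry knowledge base, the clue, the word-overlap scoring and every function name are assumptions made up for illustration, standing in for the hundreds of algorithms described above.

```python
import re

# Hypothetical toy knowledge base: candidate answer -> passages mentioning it.
# Watson's real store held around 200 million pages; this is illustrative only.
KNOWLEDGE_BASE = {
    "Chicago": [
        "Chicago is a city in the United States.",
        "The largest airport in Chicago is named for a World War II hero.",
    ],
    "Toronto": [
        "Toronto is a city in Canada.",
        "The largest airport in Toronto is named for a prime minister.",
    ],
}


def tokens(text):
    """Lower-case the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def generate_candidates(clue):
    """Stage 1: propose every entry with a passage sharing a word with the clue."""
    clue_words = tokens(clue)
    return [
        name
        for name, passages in KNOWLEDGE_BASE.items()
        if any(clue_words & tokens(p) for p in passages)
    ]


def gather_evidence(clue, candidate):
    """Stage 2: collect the candidate's passages that overlap with the clue."""
    clue_words = tokens(clue)
    return [p for p in KNOWLEDGE_BASE[candidate] if clue_words & tokens(p)]


def score(clue, evidence):
    """Stage 3: score a candidate by total word overlap (an assumed toy metric)."""
    clue_words = tokens(clue)
    return sum(len(clue_words & tokens(p)) for p in evidence)


def answer(clue):
    """Return (candidate, confidence) pairs, highest confidence first."""
    raw = {c: score(clue, gather_evidence(clue, c)) for c in generate_candidates(clue)}
    total = sum(raw.values())
    if total == 0:
        return []
    # Normalise raw scores so the confidences sum to 1.
    return sorted(((c, s / total) for c, s in raw.items()),
                  key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, confidence in answer("Its largest airport is named for a World War II hero"):
        print(f"{name}: {confidence:.2f}")
```

Even in this toy version, the wrong candidate still receives some confidence because it shares generic words with the clue, which hints at why a system built this way can rank a poor answer alongside a good one.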