Watson Reigns Supreme In Jeopardy Showdown
IBM’s Watson computer proved it is more than a number-cruncher by beating humans at Jeopardy
The humans redeemed themselves in the second game of the man v machine Jeopardy tournament, after the previous game’s debacle, but it was not enough. The best Jeopardy player is now a computer named Watson.
Watson did not dominate the February 16 game as thoroughly as it had on the previous night, with Ken Jennings and Brad Rutter repeatedly beating it to the buzzer. Although Watson started out strong, Jennings rallied and gained momentum, taking the lead away from Watson. At the end of the first round, Jennings led the game with $8,600 to Watson’s $4,800 and Rutter’s $2,400. Jennings continued to perform well in Double Jeopardy, with $17,000 to Watson’s $15,073 late in the game.
Watson Gets Its Second Wind
And then Watson found both Daily Doubles to close the gap and finish the round with a $5,240 lead over Jennings. After Final Jeopardy, which they all answered correctly, the final score was $41,413 for Watson, $19,200 for Jennings, and $11,200 for Rutter.
Game one was not even close, with Watson finding both Daily Doubles and running away with $35,734, while Rutter had $10,400 and Jennings $4,800. The total score for the tournament cemented Watson’s victory: Watson finished with $77,147, Jennings with $24,000 and Rutter with $21,600.
“What have I learned over the past two days? One, Watson is fast, knows a lot of stuff, and can really dominate a match,” Jeopardy host Alex Trebek said at the beginning of the game.
However, some of the categories clearly gave Watson some difficulties. It did not even buzz in on some of them. Watson totally misunderstood the “Also on Your Computer Keys” category and while it figured out the answers in the “Actors Who Direct” category, it was not fast enough to buzz in ahead of the humans.
Throughout the match, Watson’s confidence levels were often very low, in sharp contrast to game one. On several of the clues, especially the computer keys category in round one and the clothing category in Double Jeopardy, Watson’s confidence levels hovered around 20 percent or less. While its buzz threshold can change depending on the game state, Watson by default does not buzz in unless it is at least 50 percent confident in the answer, according to IBM researcher Jon Lenchner.
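The default rule Lenchner describes can be illustrated with a short Python sketch. This is not IBM's code: the `Candidate` class, the `should_buzz` function and the sample answers are hypothetical names invented for illustration, and the only detail taken from the article is the 50 percent default threshold.

```python
from dataclasses import dataclass

# Default reported by Lenchner; the real threshold shifts with the game state.
DEFAULT_BUZZ_THRESHOLD = 0.50


@dataclass
class Candidate:
    """A hypothetical candidate answer with a confidence between 0.0 and 1.0."""
    answer: str
    confidence: float


def should_buzz(candidates, threshold=DEFAULT_BUZZ_THRESHOLD):
    """Return the top-ranked candidate if it clears the threshold, else None."""
    if not candidates:
        return None
    best = max(candidates, key=lambda c: c.confidence)
    return best if best.confidence >= threshold else None


# Made-up example: a top answer at 20 percent stays below the default bar.
low = [Candidate("answer A", 0.20), Candidate("answer B", 0.12)]
print(should_buzz(low))  # no buzz at the default threshold
```

Under this sketch, a clue on which the best candidate sits at about 20 percent never triggers a buzz, which would be consistent with Watson staying silent on some categories. Passing a lower `threshold` stands in, very roughly, for the game-state adjustment the article mentions.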
There were also several instances when the correct answer happened to be the third item on its list of possible answers. For the Daily Double clue about a 1959 book, Watson inexplicably answered, “Dorothy Parker,” in which it had 14 percent confidence, while the correct answer, “Elements of Style,” had a mere 10 percent confidence.
“Watson is capable of some weird wagers,” Trebek said.
On its two Daily Doubles, Watson wagered $2,127 and $367, respectively. Watson can precisely calculate its confidence level for a category based on what it has figured out about that category, and it uses a “game state evaluator” model to estimate its chances of winning based on the other players’ scores, the number of remaining clues, and the value of those clues, according to IBM researcher Gerald Tesauro.
Watson departed from its usual conservative bets in Final Jeopardy, wagering $17,973 on the “19th Century novelists” category. And Jennings, who had joked about needing to either unplug Watson or bet it all during a Daily Double earlier in the game, wagered only $1,000.
This is not the first time an IBM supercomputer has beaten a human being in a game. IBM’s Deep Blue supercomputer defeated chess grandmaster Garry Kasparov in a match in 1997. While many of the most powerful supercomputers in the world are “sons and grandsons” of Deep Blue, the technology remained in very specialised applications, Katharine Frase, vice-president of industry solutions and emerging business at IBM Research, told eWEEK. Watson is different in this regard, as the company has been looking at more common applications of the deep Q&A technology.
IBM has been pushing health care applications of Watson in various interviews. The company will announce on February 17 a collaboration with Columbia University and the University of Maryland to create a physician’s assistant service in which doctors will be able to query a cybernetic assistant, according to the New York Times. IBM will also work with Nuance Communications to add voice recognition to the service, which may be available in as little as 18 months, the Times said.
IBM has already begun working with computer scientists from around the country on Watson’s successor, known as Racr. It stands for either “reading and contextual reasoning” or “reading and contextual reasoner” (researchers have not settled on which name to use), according to the Wall Street Journal. Racr goes a step beyond what Watson can do, as it would use its database of information to come up with reasoned responses, Eduard Hovy, director of the Information Sciences Institute at the University of Southern California, told the Wall Street Journal. A Racr model with the ability to learn background knowledge on given topics and then do “reasoning about that” could be a reality in five years, Hovy said in the article.
Watson can speedily match parts of speech with information it finds, and then pick the most likely answer. Racr, on the other hand, would pull together interrelated facts to understand the context and derive the most likely answer based on background knowledge. Researchers at IBM and various universities will begin developing machines that can begin to teach themselves, Hovy told the Journal.
The company is also in discussions with a “major consumer electronics retailer” for a version of Watson that could interact with consumers to help make buying decisions or offer technical support, IBM executives told the Times.
As Jennings wrote on his Final Jeopardy slate, “I for one welcome our new computer overlords.”