David Ferrucci Speaks About Jeopardy!-Playing Watson

In the realm of societal problems that computers could potentially solve, answers on the television game show Jeopardy! might not seem high on the list.

But programming computers to understand natural human language is a major computing hurdle, and one that IBM wanted to tackle.

So that is exactly what it spent six years doing, said David Ferrucci, the principal investigator on IBM’s DeepQA Project, when he spoke to a packed room of faculty and students May 18 as part of the McCormick Dean’s Seminar Series.

IBM had gotten a lot of value from Deep Blue, the chess-playing computer that defeated world champion Garry Kasparov in 1997. Researchers discovered, however, that a computer was quite comfortable in that space: figuring out a well-defined math problem quickly.

Computers were far less successful with natural language, which is implicit, highly contextual, ambiguous, and requires a large cache of background knowledge.

As the story goes, IBM researchers were having dinner at a bar in 2004 when they noticed everyone crowded around the television to watch Ken Jennings compete in his historic 74-game Jeopardy! run. Why not try to solve the human language problem by creating a system that can play Jeopardy!?

“It’s a game written by humans for humans,” Ferrucci said. “It would advance the science…and it would capture the imagination of a broader audience.”

The team began by sampling 20,000 Jeopardy! questions and found 2,500 distinct types. After feeding the computer system, dubbed “Watson,” with terabytes of information from encyclopedias, dictionaries, and reference materials, they began to program Watson to analyze the text for plausible answers by parsing each sentence into subject, verb, and object. The research team tried out many different algorithms to teach it to “learn” what different words mean. For example, Watson has to know whether a fluid is a liquid, and vice versa. The team used a million lines of new code and hundreds of algorithms to program the computer, which is actually a cluster of ninety IBM Power 750 servers with a total of 2,880 POWER7 processor cores and 16 terabytes of RAM.

The result was IBM’s DeepQA technology. To be ready for Jeopardy!, however, the team knew Watson had to be able to ring in quickly and answer with high precision. Ken Jennings, for example, rang in first 61 percent of the time and had 92 percent precision. In 2007, Watson had only 47 percent precision. The team continued to try new algorithms and report to their bosses; at some point, Ferrucci said, they knew they were either going to be fired or finally win a Jeopardy! game.

In 2010, researchers thought Watson was ready.

“There was always the risk of losing,” Ferrucci said. “There was always a risk of looking stupid.”

Watson, of course, ended up winning the two-game match with $77,147. (Jennings scored $24,000 and Brad Rutter scored $21,600.) Watson did get a Final Jeopardy! clue wrong; the category was U.S. Cities, and the clue was, “Its largest airport was named for a World War II hero; its second largest, for a World War II battle.”

Watson guessed Toronto. Ferrucci explained that Watson had learned that the title of the category wasn’t essential to the answer (a category called “Authors,” for example, might include answers that weren’t authors’ names), and Toronto has an American League baseball team. If the clue had begun, “What U.S. city’s largest airport…” Watson would have answered correctly (“Chicago”).

Watson’s technology has already proved successful in medical diagnoses, and next Ferrucci hopes to focus on dialogue, so users could one day have conversations with their computers.

“I’m always driven by Star Trek,” he said, referring to the computer that would regularly converse with and provide answers for the crew. “I’d love to get there.”