Watson Will Win. Here’s Why…

by Andrew McAfee on February 9, 2011

Watson, IBM’s astonishing Jeopardy!-playing supercomputer, will take on two top human players in head-to-head competition televised on February 14-16. I predict Watson will win, and I believe the competition would be closer if the machine played against only one person rather than two. Let me explain…

I had the chance to watch Watson in action last week at Lotusphere (my keynote at the conference is here). In addition to showing the game show’s normal displays (the board showing all possible topics and dollar amounts, and the text of the question being asked), the Lotusphere screens also presented Watson’s ‘thinking’ during each question. This consisted of the top three responses it was mulling, along with its estimate of the probability that each response was the right one.

In the (low-quality) photo below, which I took during a rehearsal session at Lotusphere, Watson is considering three answers to the question being asked: “Thomas Hardy,” “Oscar Wilde,” and “Tolstoy” (sorry, I forget what the question was). It’s not very confident, though, that any of these are correct: it thinks there’s only a 20% chance that Hardy is the right answer, and the other two are even lower.

A rehearsal at Lotusphere 2011 for man-vs-machine Jeopardy! The topmost rectangle onscreen shows Watson's three candidate responses, and its estimated probability that each is correct.

In this circumstance, Watson is not going to try to be the first to buzz in (to signal to the host that it wants to answer the question). In Jeopardy!, the first player to buzz in once the host is done reading the question wins the right to answer it. If the answer is correct, the player gets the $$ associated with the question; if the answer is wrong, the player loses that $$ amount and the other two players get a chance to answer. So the three critical factors for winning the game are speed, smarts, and accurate estimates about how smart one is.
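To make that scoring rule concrete, here is a minimal sketch of the buzz-or-pass decision it implies. This is my own simplification, not a description of how Watson actually decides: the function names and the $1,000 question value are assumptions for illustration, and the sketch ignores what the other players might do.

```python
def expected_gain(confidence, clue_value):
    """Expected dollar change from buzzing in, given the chance of being right."""
    win = confidence * clue_value          # right answer: gain the question's value
    loss = (1 - confidence) * clue_value   # wrong answer: lose the same amount
    return win - loss


def should_buzz(confidence, clue_value):
    """Hypothetical rule: buzz only when the expected gain is positive."""
    return expected_gain(confidence, clue_value) > 0


# The Hardy example from the rehearsal: about 20% confidence.
# The $1,000 question value is assumed for illustration.
print(should_buzz(0.20, 1000))  # False
print(should_buzz(0.90, 1000))  # True
```

Under this simple rule the break-even point is 50% confidence. A real player would probably want a different threshold depending on the score and on what the opponents are likely to know.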

As I was watching Watson rehearse, I noticed two things and concluded two others. First, I noticed the supercomputer knew a lot, and also knew what it knew. In other words, it very often came up with the right answer — not always, but very often — and when it did it also typically assigned that answer a high probability of being correct. The rehearsal only lasted a few minutes and so didn’t provide a big sample, but I saw no cases where Watson was falsely confident or falsely unconfident.

The second thing I noticed was that the supercomputer was (duh) fast. It generated its answers and associated probabilities very quickly after being given the text of the question. Watson’s answers and probabilities appeared onscreen almost immediately after the question was revealed, and well before the human host was done reading the question aloud.

This matters a great deal in Jeopardy! because players (human or machine) can’t buzz in until the host is done reading the question aloud. If my casual observations are at all accurate, Watson is ready well before the host is done reading – it knows if it wants to buzz in, and it knows what it wants to say. And (again, if my observations are accurate) it’s rarely too confident, or not confident enough.

Now, there’s no way a human is going to buzz in more quickly than a confident Watson. When the players get the signal that it’s OK for them to buzz in, silicon and fiber optics are just always going to be faster than muscle and nerve fiber. So a confident Watson is always going to get to answer first.

Given this permanent speed advantage, the outcome of the game basically depends on only one thing: is Watson confident/smart enough on enough questions? It doesn’t matter how smart or fast its human opponents are; it only matters how smart Watson is, since it’s always fastest. And my first conclusion is that it’s smart enough. As I wrote earlier, the Watson team has done an astonishingly good job at building a Jeopardy!-playing supercomputer, one that can answer natural language questions on a huge and unspecified range of topics.

This is a true milestone, and I’ll have a lot more to say about it later. For now, let’s stick to the upcoming competition. Watson will buzz in first on all the questions where it has high confidence, and will answer them correctly. The two human players will be left to fight it out on only the questions Watson couldn’t answer. They’ll be fighting over the machine’s table scraps.

If we assume the two human players are of roughly equal ability, they’ll each get about half of Watson’s leftovers. This is great news for the computer; it lowers the percentage of questions it has to answer in order to have a reasonable shot of winning, from somewhere around 50% to somewhere just above 33%. To oversimplify a bunch (and to ignore wrinkles like the Daily Double and Final Jeopardy!), the computer needs to get only a bit more than a third of the total $$ at play if it can be confident that the humans will split the other 2/3.
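The arithmetic behind those two percentages can be sketched in a few lines. This assumes the same oversimplified model as the paragraph above: Watson takes some share of the total $$, equally matched humans split the rest evenly, and Daily Doubles and Final Jeopardy! are ignored. The function names are mine, for illustration only.

```python
from fractions import Fraction


def winning_share_threshold(num_humans):
    """Share of the total $$ Watson must exceed to finish ahead of every human.

    Watson leads when s > (1 - s) / n, which rearranges to s > 1 / (n + 1).
    """
    return Fraction(1, num_humans + 1)


def watson_wins(watson_share, num_humans):
    """True if Watson's share beats each human's equal slice of the remainder."""
    each_human = (1 - watson_share) / num_humans
    return watson_share > each_human


print(winning_share_threshold(1))       # 1/2
print(winning_share_threshold(2))       # 1/3
print(watson_wins(Fraction(2, 5), 2))   # True
print(watson_wins(Fraction(2, 5), 1))   # False
```

With 40% of the $$, Watson beats two evenly matched humans (who get 30% each) but loses to a single human who collects the remaining 60%.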

So my second conclusion is that the competition would be much closer if Watson were only playing against a single human champion. The supercomputer would still get to answer first whenever it was confident, but the lone human would get his chance every time Watson wasn’t confident.

Over time, of course, even this won’t be much of a competition. Watson is only going to get smarter over time, and it’s not going to get slower, so it will someday (probably soon) become the world’s best Jeopardy! player, just as computers are now easily the world’s best standalone chess players.

As I’ll write later, I find more good news than bad news in this. For now, though, I’m just going to pop some popcorn, plop myself on my couch, and watch the competition on TV to see if my predictions hold up.

What do you think? Do you agree with my observations and conclusions, or am I missing anything important here? And who do you think is going to win? Leave a comment, please, and let us know…

  • http://twitter.com/cariocacardinal frank deal

    What % confidence level does Watson need in order to answer? At 60%, he will still be wrong 40% of the time and that will significantly reduce his winnings. Even at 80%, there will be significant erosion.

    The situation you are referring to in the post does not apply to Final Jeopardy, so in that all-important round, Watson’s advantage (of 2 vs 1 player) is eliminated.

    It should be fun!

  • Straite1

    Understanding a language completely, any language, is huge. The Unsinkable Titanic??? I think the possibility that there is something unknown out there that will get in the way, and that it will strike the iceberg, is something that is put there intentionally if man begins to think that his head is the biggest thing in the universe. I have news for those who think this: it comes from a computer that is invisible and speaks in a voice that members only can hear, and believe me, no one wins against it!

  • Amiran

    It would be more interesting to have Watson not only make strategic decisions on the basis of its individual capacity to answer the question but also analyze how the other players play and include their behaviour in his strategic decisions. This would increase his chances significantly. He would need info on the time it took for his opponents to hit the buzzer and how often they answered wrong or right. I bet that he’s smart enough to distinguish a pattern in his opponents’ play somewhere halfway and significantly increase his chances from there onwards.

  • http://andrewmcafee.org/blog amcafee

    Frank, good question. I don’t know what Watson’s confidence threshold is. As I watched (again, only for a short time) it seemed that it was either VERY confident or not at all confident. I didn’t see many answers in which it had something like 60% confidence…

  • http://andrewmcafee.org/2011/03/mcafee-watson-ibm-healthcare-verghese/ Dr. Watson, Please Report to the Health Care System

    [...] to the news that IBM is working with the universities of Columbia and Maryland to adapt Watson, the Jeopardy! champion supercomputer, to the task of medical diagnosis. Verghese is perceptive and eloquent about the power of a [...]

