So: How Well Did Watson Do?
Because I enjoy the statistical side of tennis and the understanding that it can provide, I decided to take some time to see just how well Watson did in predicting match winners. Gathering the data is a bit time-consuming, so I did this exercise with only the men’s singles draw. (I may go back and do the women’s draw and ask the question: who is more predictable, men or women? – Haha!) It’s really not possible to compare Watson with human guesses because the results of the online Bracket Challenge have not been posted by the US Open site. But, judging from other, smaller pools of bracket players, my best guess is that picking 85/127 matches correctly (about 67%) would have won the Bracket Challenge.
(Now here is a bit of outright deception: The USTA, through the US Open site, promotes the Bracket Challenge as a “One Million Dollar Bracket Challenge.” If you read the fine print, you will see that anyone who enters the Bracket pool must get the draw 100% correct to win a million dollars. Just think back to your NCAA March Madness brackets – imagine that you have to get every game right to win. No one would play … Thank you, USTA, for treating your fans as dolts.)
Watson, it turns out, chose 90/127 matches correctly (about 71%), but the comparison with Bracket Challenge participants is not apples to apples. Watson ALWAYS gets to choose between the actual participants in a match. For example, if I chose Borna Coric to go far (and I did!), in the Bracket Challenge I would not only get the match he lost wrong, but I would necessarily also get wrong every subsequent match in which I had Coric advancing. Watson does not pay any penalty for an incorrect choice, as he (?? – what is Watson’s gender – an important question in today’s world?) is always updated to choose only between the players who are actually playing.
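To make the difference concrete, here is a minimal sketch in Python of the two ways of keeping score. Everything in it is invented for illustration: the four-player draw, the player labels A through D, the results, and the helper name score. It shows nothing about how Watson or the Bracket Challenge is actually implemented, only how one early miss cascades in a locked bracket.

```python
# Hypothetical four-player draw: A plays B and C plays D in the semifinals.
# All names, picks, and results below are made up for illustration.
actual_winners = {
    "SF1": "B",  # B upsets A
    "SF2": "C",
    "F": "C",    # the final is B vs. C, and C wins
}

# Bracket style: every pick is locked in before the first ball is struck.
bracket_picks = {"SF1": "A", "SF2": "C", "F": "A"}

# Watson style: each pick is made between the players actually in the match.
per_match_picks = {"SF1": "A", "SF2": "C", "F": "C"}


def score(picks, results):
    """Count the matches where the pick equals the actual winner."""
    return sum(picks[match] == winner for match, winner in results.items())


print("Bracket-style score:", score(bracket_picks, actual_winners))
print("Per-match score:", score(per_match_picks, actual_winners))
```

Both pickers miss the same semifinal, but the bracket picker also loses the final, because A is no longer in the tournament. The per-match picker gets a fresh choice between B and C.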
There is no scientific way to compare these two specific ways of playing a bracket, but I think you would all agree that, seen in the light of always getting a fresh start for each round, Watson’s performance is pretty dismal. I can think of two reasons for this.
First, Watson (incorrectly!) uses reasoning that “momentum” is on the side of a particular player. I have discussed this issue at length before, but statements like “his win in the last round gives him momentum going into this match” are complete hogwash. That Watson is not even smart enough to “learn” that momentum is meaningless raises serious concerns about what AI is capable of, not only in tennis, but in general. My best guess is that Watson cannot learn BECAUSE HE IS PROGRAMMED BY A FOOL who thinks “momentum” exists. (There is even a section under each match recap labeled “Momentum” – Yikes!)
My concern about AI, in general, is based upon my personal experience with a security which I purchased some years ago, for which Watson makes all of the stock trading and positioning (go long or short) decisions. It is underperforming the market! Maybe there is a field where AI works, but with both tennis and the stock market, it is a dismal failure!
Second, Watson incorporates press and media reports into its predictions. For example, Watson will report that “social media posts suggest a better level of conditioning for Player X.” If we’ve learned anything, it’s that social media is an enormous fact-manipulating machine that can rarely be trusted. Once again, I suspect that Watson isn’t permitted to “learn” this because it has been programmed by some dummkopf to treat these things as if they were relevant.
There is another dimension, though, to Watson’s predictions: Watson doesn’t just predict the winner of a match, but attaches a win percentage to each player. For example, in the Alcaraz-Tiafoe semifinal, Watson had Tiafoe favored by 52-48. So, it got the result wrong, but was somewhat prescient in predicting a close match. We can use these numbers to answer other questions, like what were the biggest upsets in the tournament?
Here are a couple of examples. Taylor Fritz was favored by 76-24 against Brandon Holt in their first-round match, which Fritz lost in four sets. The next-largest surprise was Miomir Kecmanovic’s loss to Richard Gasquet in the second round despite being favored by 71-29. Gasquet won in four sets, but was easily dismissed by Nadal in the following round. Watson-like thinking would ask: will these players have “scar tissue”, i.e. negative momentum going forward? You know my answer: the question is hogwash!
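For anyone who wants to rank upsets this way, here is a minimal sketch in Python. It uses only the three matches quoted above, and it treats an “upset” as any match the Watson favorite lost, ranked by how heavily he was favored. That definition, along with the Prediction class and its field names, is my own assumption for illustration, not anything Watson publishes.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    favorite: str       # player Watson gave the higher percentage
    underdog: str
    favorite_pct: int   # Watson's win percentage for the favorite
    winner: str         # who actually won the match


# Only the three matches quoted in this post; the full draw has 127.
predictions = [
    Prediction("Tiafoe", "Alcaraz", 52, "Alcaraz"),
    Prediction("Fritz", "Holt", 76, "Holt"),
    Prediction("Kecmanovic", "Gasquet", 71, "Gasquet"),
]

# Keep the matches the favorite lost, with the heaviest favorites first.
upsets = sorted(
    (p for p in predictions if p.winner != p.favorite),
    key=lambda p: p.favorite_pct,
    reverse=True,
)

for p in upsets:
    print(f"{p.winner} d. {p.favorite} (favorite at {p.favorite_pct}%)")
```

With all 127 matches entered, the same few lines would list every upset in the draw from most to least surprising.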
AI sounds good (just like a Million Dollar Bracket Challenge) but there’s nothing “Intelligent” about it. When we look under the hood, there’s a lot that’s “Artificial” and very little that smacks of “Intelligence.”