Here’s a profound question about beards: is the number of acrobats with beards the same as the number of bearded people who are acrobats? Go with your gut instinct. It’s not a trick question. If you answer “yes,” then you’ve understood the central idea behind Bayes’s theorem. If you’re one of those people who likes to titter about how bad you are at mathematics, stop it. Retake your exams, learn how to pin this obviousness down in symbols, and you can produce artificial intelligence, forecast stock market collapses and understand this:
P(A|B) = (P(B|A)∙P(A))/(P(B))
This is Bayes’s equation, the formula which, as Tom Chivers insists in his remarkable, bold book — Everything is Predictable — “explains the world.”
Rich mathematics begins by spotting an oddity in the obvious. Although the number of bearded people who are acrobats is the same as the number of acrobats who are bearded people, the two ways of saying the same thing reach the answer by different routes. They are like two paths circling up a mountain to arrive at the identical view. One path starts in the land of bearded people (P(B), in the previous equation), then meanders off to find out how many are acrobats, P(A|B). The other starts alongside the circus tent of tumblers and balancers, P(A), then heads on to how many are bearded, P(B|A).
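To see the two paths meet, here is a minimal sketch in Python. The town and all four head-counts are invented for illustration; only the symbols P(A), P(B), P(A|B) and P(B|A) come from the equation above.

```python
from fractions import Fraction

# Hypothetical head-count of a small town; every number here is invented.
population = 1000
bearded = 200            # people with beards
acrobats = 50            # people who are acrobats
bearded_acrobats = 10    # people who are both

p_b = Fraction(bearded, population)                 # P(B)
p_a = Fraction(acrobats, population)                # P(A)
p_a_given_b = Fraction(bearded_acrobats, bearded)   # P(A|B)
p_b_given_a = Fraction(bearded_acrobats, acrobats)  # P(B|A)

# Path one: start with the bearded, then ask how many are acrobats.
via_beards = p_a_given_b * p_b * population
# Path two: start at the circus tent, then ask how many are bearded.
via_circus = p_b_given_a * p_a * population

print(via_beards, via_circus)

# Dividing both routes by P(B) gives Bayes's equation.
bayes_rhs = p_b_given_a * p_a / p_b
print(p_a_given_b == bayes_rhs)
```

Exact fractions are used rather than floating-point numbers so that the two routes can be compared for strict equality.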
Imagine you’re told someone is mild-mannered, loves books and wears glasses. Is this person more likely to be a librarian or a farmer? If you (you Sherlock!) diagnosed librarian, you’re wrong, because you’ve forgotten that there are vastly more farmers in the world than librarians. Even if only a tiny fraction of farmers are bespectacled wallflowers, that’s still a lot more farmers than librarians.
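A rough sketch of the head-counting behind this, in Python. Every figure is made up: the article gives no numbers, and the sketch pretends, for simplicity, that farmers and librarians are the only two candidates.

```python
# All figures below are invented for illustration only.
farmers = 1_000_000
librarians = 10_000

# Assumed share of each group fitting the bookish, bespectacled description.
percent_of_farmers_matching = 5
percent_of_librarians_matching = 90

matching_farmers = farmers * percent_of_farmers_matching // 100
matching_librarians = librarians * percent_of_librarians_matching // 100

# Chance that a person fitting the description is a librarian,
# assuming (hypothetically) these are the only two professions in play.
p_librarian = matching_librarians / (matching_librarians + matching_farmers)

print(matching_farmers, matching_librarians)
print(f"P(librarian | description) = {p_librarian:.2f}")
```

Even with nine in ten of these imaginary librarians fitting the description, the sheer number of farmers swamps them.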
The same statistical fumble in medicine can be life-threatening. Imagine you wander, for no particular reason, into a hospital and decide to take a test to see if you have a certain rare, fatal disease. The result comes back and it’s positive. The first thing you’ll want to know is how often this horrible test makes mistakes. The technician admits that ten percent of the time it gives a false positive, i.e. it incorrectly tells you that you have the disease when you don’t. Does this mean there’s barely a one in ten chance of surviving to Christmas?
Your positive test result is the medical equivalent of the description “mild-mannered, loves books and wears glasses”; the rare disease is the counterpart of “librarian.” The vitally important fact that you forgot to take into account in your gloom is that there are vastly more people without this illness than with it (and so, for a random population, considerably more well people getting a false positive than sick ones): that’s your farmer.
Plug these details into the formula, add a dose of false negatives and you discover that there’s a 93 percent chance of survival. You began thinking you had only a one in ten chance of making Christmas. Now, thanks to the insight of an obscure eighteenth-century non-conformist minister and amateur mathematician from Tunbridge Wells called Thomas Bayes, it’s better than nine in ten. Bayesian statistics is happy statistics.
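The article does not say how rare the disease is or how often the test misses it, so the following Python sketch has to assume both. The ten percent false-positive rate is the technician's; the one percent prevalence and the eighty percent detection rate are hypothetical values chosen only to show how the formula is used.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(disease | positive test), computed with Bayes's theorem."""
    true_positives = sensitivity * prior                  # P(B|A) * P(A)
    false_positives = false_positive_rate * (1 - prior)   # well people flagged
    return true_positives / (true_positives + false_positives)  # divide by P(B)

p_sick = posterior(
    prior=0.01,                # ASSUMED: 1 person in 100 has the disease
    sensitivity=0.80,          # ASSUMED: the test misses 20% of real cases
    false_positive_rate=0.10,  # from the article: 10% false positives
)

print(f"P(disease | positive) = {p_sick:.3f}")
print(f"P(healthy | positive) = {1 - p_sick:.3f}")
```

Under these assumed inputs the chance of being healthy despite the positive result comes out at about 92.5 percent, in the same neighbourhood as the figure quoted above. Different assumptions about prevalence or false negatives would move it.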
At the heart of Bayes’s idea about probability is that you are not arriving at an issue in a state of blank idiocy: you have a standpoint. Try as you might to congratulate yourself on your even-handedness and objective concern for facts in everything you do and think, you come loaded with what Bayesians call “subjective” beliefs. In the case of the disease test, your chances of seeing Christmas changed because your “subjective” choice of what to include as relevant shifted: you remembered that only a small percentage of the population have the illness.
Chivers spends a lot of time discussing the importance of subjectivity in Bayesian statistics and, to my mind, he never quite pins the issue down. His metaphors are a little off the mark and his examples occasionally confusing. His discussion of the battle between Frequentists (who hate statistical “subjectivity”) and Bayesians left me muddled and cross; by the end I wanted to give them both a slap. I was far happier with his excellent explanation of the mysterious Monty Hall problem. For the first time in my life, I understood what was going on in this statistical sleight of hand:
Suppose you’re on a game show, hosted by Monty Hall, and you’re given the choice of three doors. Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and Monty, who knows what’s behind the doors, opens another door, say No. 3, which has a goat. He then says to you: ‘Do you want to pick door No. 2?’ Is it to your advantage to switch your choice?
The correct answer is yes: switching will raise your chance of winning from one third to two thirds.
How can that possibly be? How can Monty opening one of the other doors mean the door you didn’t choose suddenly has a greater chance of joy behind it? The answer is you’re looking at it from the wrong side.
From Monty’s side — taking him, rather than you, as the source of our information — it’s a different story. He knows from the start which door is hiding that juicy Cadillac. The door he generously opens to help you will therefore always be a dud. That is valuable extra information. Imagine, instead of three closed doors, there are a million. You pick one: you have one chance in a million of being right. Monty now opens all but one of the remaining doors, each time exposing a goat. Of course he does. He’s not a fool. He’s never going to show you the Cadillac, is he? If that Cadillac is somewhere behind one of those doors (and there’s a 999,999/1,000,000 chance that it is) then he’s squirming about, doing everything he can not to open it. By the time you get to the point where he’s got only one door left to open and he finally gives you your choice — stick or switch? — he’s run out of options. The door he’s left with is almost certain to have the Cadillac behind it. Forget that this seems more about cheating than about statistics: switch!
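If the argument still feels like a conjuring trick, it can be checked by brute force. The Python sketch below plays the game many times under the rules described: Monty knows where the car is, opens every door except your pick and one other, and never reveals the car. The function name, trial count and random seed are arbitrary choices for illustration.

```python
import random

def win_rate(switch, n_doors=3, trials=100_000, seed=0):
    """Estimate by simulation how often the player wins the car."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        car = rng.randrange(n_doors)
        pick = rng.randrange(n_doors)
        # Monty leaves exactly one other door shut and never opens the car.
        # If the player missed the car, that shut door must hide it;
        # otherwise it is an arbitrary goat door.
        if pick == car:
            left_shut = rng.choice([d for d in range(n_doors) if d != pick])
        else:
            left_shut = car
        final = left_shut if switch else pick
        wins += (final == car)
    return wins / trials

stick = win_rate(switch=False)
swap = win_rate(switch=True)
print(f"stick:  {stick:.3f}")
print(f"switch: {swap:.3f}")
```

With three doors, sticking should win about a third of the time and switching about two thirds. Calling `win_rate(switch=True, n_doors=1_000_000, trials=100)` gives a feel for the million-door version, where switching almost never loses.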
This article was originally published in The Spectator’s UK magazine.