What makes statistics seem so daunting to the average student over, say, discre...

vharuck · on June 16, 2023

As a person with a bachelor's in math who dislikes statistics, it's the one field I know where "gotcha" problems are more common in reality than in class. And they are nasty problems. It takes a lot of education until you can confidently say what simplifications are justified for your situation. And a lot of professional statisticians like to flex on newcomers for simple mistakes.

You're right that some education would help people with everyday situations. It'd be nice if schools had intro to probability and statistics as the only required math course for all students. Even just covering fractions for probabilities, mean/median/mode, and then memorizing basic tricks or terms to look up on Wikipedia/Wolfram throughout life.

kqr · on June 16, 2023

The very idea underlying statistics (that we can learn something useful from a class of events by ignoring specifically that which makes them different from each other) is a very recent invention -- only a few hundred years old -- and it's still counter-intuitive to a lot of people.

Ask people to predict whether the Joneses have a dog and they'll start asking questions about the specifics of the Joneses, like do they have children, do they live in the suburbs, have they always wanted a dog, etc. Then they try to construct a narrative based on "logical" conclusions.

When the right approach might be to ask "what's the background rate of dog ownership in the Joneses' country of residence?" But that requires ignoring the specifics, and that doesn't come easily to people. It took several millennia for anyone to realise it could be useful.

smallnamespace · on June 16, 2023

> Ask people to predict whether the Joneses have a dog and they'll start asking questions about the specifics of the Joneses, like do they have children, do they live in the suburbs, have they always wanted a dog, etc. Then they try to construct a narrative based on "logical" conclusions.

This is the core idea of Bayesian statistics.

You fall back to the background rate only if you have no ability to access more pertinent facts.

kqr · on June 17, 2023

When people ask whether the Joneses have children, they do not (most of the time) have an accurate conditional probability of dog ownership given a certain number of children. They are trying to construct a story that seems plausible.

The difference between Bayesian reasoning and narrative fallacy is a coherent evaluation of joint probabilities -- the bit most people skip over.
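To make "coherent evaluation of joint probabilities" concrete, here is a minimal sketch of the update for the dog example. Every number in it is invented for illustration (the 40% base rate and both conditional probabilities are assumptions, not real survey figures), and `posterior` is just a hypothetical helper name.

```python
def posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Bayes' rule for a yes/no hypothesis H after seeing one piece of evidence E."""
    joint_h = prior * p_evidence_given_h                # P(H and E)
    joint_not_h = (1 - prior) * p_evidence_given_not_h  # P(not H and E)
    return joint_h / (joint_h + joint_not_h)            # P(H | E)

# All three inputs are made-up assumptions, not measured rates.
base_rate = 0.40                # P(dog) in the population
p_children_given_dog = 0.50     # P(children | dog)
p_children_given_no_dog = 0.30  # P(children | no dog)

p_dog_given_children = posterior(
    base_rate, p_children_given_dog, p_children_given_no_dog
)
print(round(p_dog_given_children, 3))  # 0.526
```

Under these assumed numbers, learning that the Joneses have children moves the estimate from 40% to about 53%. The base rate is still where the calculation starts, and the size of the shift depends entirely on conditional probabilities that most people never actually look up.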

smallnamespace · on June 17, 2023

It's doubtful people have access to population-level base rates either, in which case they would also arrive at the wrong answer, no narrative needed.

But why do you point the finger at narrative instead of a simple lack of data? In the dog example, those questions are absolutely good ones to ask, so long as the data can be found.

One reason to reject this mental opposition between 'Bayesian reasoning' and 'narrative fallacy' is that narratives are essential to coming up with good conditions on which to split the population in the first place.

'I would ask whether they have children, because children like dogs and therefore it's plausible that families with children have a higher rate of dog ownership' is a reasonable narrative that justifies collecting an extra column in your dataset.

Granted, one shouldn't accept that story uncritically without checking against the actual statistics. But narratives themselves are unavoidable even when you're doing statistics correctly!
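As a rough sketch of what "checking against the actual statistics" could look like once that extra column exists: the ten households below are fabricated toy data, and `dog_rate` is a hypothetical helper, so the rates illustrate the mechanics only.

```python
# Toy data, invented for illustration: (has_children, has_dog) per household.
households = [
    (True, True), (True, True), (True, True), (True, False),
    (False, True), (False, True), (False, False), (False, False),
    (False, False), (False, False),
]

def dog_rate(rows):
    """Fraction of households in `rows` that own a dog."""
    return sum(has_dog for _, has_dog in rows) / len(rows)

with_children = [h for h in households if h[0]]
without_children = [h for h in households if not h[0]]

overall = dog_rate(households)              # base rate, ignoring the split
rate_with = dog_rate(with_children)         # rate among families with children
rate_without = dog_rate(without_children)   # rate among families without

print(overall, rate_with, round(rate_without, 3))
```

In this toy table the "children like dogs" story happens to hold up (75% versus roughly 33%, against a 50% base rate), but with real data the split could just as easily show no difference, which is the point of checking.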

kqr · on June 17, 2023

This is one of those things where it's impossible to prove one approach better or worse than the other. But history has shown (starting with merchant ship insurance in the 1700s going up to Tetlock's superforecaster research more recently) that starting from general base rates gives, on average, more realistic odds than starting from specifics.

You see this all the time in sports betting, too. People overcorrect from the base rate when receiving news. People love a narrative that makes sense to them, but it rarely holds up against statistical reasoning with a very small set of variables. (For more examples, see Meehl's research around clinical vs actuarial judgment.)

Yes, you're correct that a hypothesis is one of those narratives, but if you're operating at that level of scientific abstraction you can ignore my comment. Most people aren't; they just roll with the story and forget about the probabilities.

Foobar8568 · on June 16, 2023

It's easy to get things wrong from a wrong intuition or a misunderstanding of the problem. It also cuts across mathematical disciplines, etc. But I feel that most applications just need high school math.