The p-value: finally understanding it (in an unusual way)

Pedro Prado
5 min read · Mar 13, 2023


This is one of the most controversial topics in statistics. Not because it is not accepted, but because people don't get what the p-value means. Even people who teach statistics don't get it. Even the person who invent… just kidding. He is probably the only person in the world who thought this made sense at first. But it will, hopefully, make sense to you by the end of this article.

When I was learning statistics, years ago, I didn't get what the p-value represented. It only clicked while I was watching a mineral processing class (shout out to Mineral Engineering at University of Toronto). The following analogy brought some light to my (ADHD) mind during that class.

The (crazy) analogy

Imagine that you are a university dropout. You want an easy life; studying doesn't get you anywhere these days, right?

He concurs.

You heard about gold deposits below NYC. Yeah, under all those buildings, there are gold nuggets! That's what someone told you at a geology enthusiasts' convention. Seems legit (said no one). Might be worth trying your luck… testing that hypothesis.

You rush to pack your stuff, grab your pickaxe and off you go to the subway system of NYC, picking at the walls, looking for gold nuggets.

Gold, not golden dude.

Now back to the statistics.

We have an initial hypothesis called the null hypothesis. It states that there is no gold under NYC. Everybody accepts it, because nobody has ever found any relevant evidence to refute it.

We also have the alternative hypothesis. As it is called, it is an alternative to the null hypothesis. This hypothesis states that there is gold under NYC.

Alright. After so many nights in the subway system, digging into those walls, you actually found something. It is something shiny and yellow… you found GOLD! 🤩

… and people building gold mining plants in the middle of nowhere! Ha!

All those fools saying there is no gold below NYC… you found enough evidence that there is gold right there! What does that mean to our hypothesis test?

This means we are rejecting the null hypothesis. We have found enough evidence to say that the null hypothesis (“There is no gold under NYC”) does not hold.

Moreover, and just as important: the probability of finding what we found (gold nuggets) in a world where “There is no gold under NYC” is taken as true is very, very, very low. That probability is the p-value, and here it is low.

This is where most people get lost. Think about what you just read.

You have a scenario. You found something that breaks that scenario badly. The chance of finding such a sample, if that scenario were actually true, is low. Pure logic.

This means our sample is statistically significant enough to reject the initial, null hypothesis that stated “There is no gold under NYC”.

Now I get it!

But what if we hadn’t found any gold under NYC?

That would mean we wouldn’t have had enough evidence to reject “There is no gold under NYC”. That’s what we call “failing to reject the null hypothesis”.

Importantly, it doesn't mean the null hypothesis is true; it only means that we cannot reject it yet. Who knows, maybe some other crazy dude decided to mine gold under Central Park or Times Square and actually found it? That would debunk the null hypothesis, just as in the example we developed above.

How does it work in the real world?

When performing a hypothesis test, the p-value tells you whether the sample you got is statistically significant enough to reject the null hypothesis, once you compare it to the significance level* (conveniently called alpha).

Maybe finding a single gold nugget wouldn't be considered statistically significant enough to reject the null hypothesis… after all, maybe somebody just buried that gold nugget there?

Why would you do that? Trolling?

Maybe finding tons of gold nuggets would be stronger evidence for rejecting the null hypothesis. That's when picking the right significance level comes into the picture! Depending on the value you choose for your significance level, the outcome could be totally different: either rejecting or failing to reject the null hypothesis, as the little sketch below shows.
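To make that concrete, here is a tiny sketch with a made-up p-value (purely for illustration) showing how the same test result can lead to opposite conclusions under two different alphas:

```python
# Toy illustration with a made-up p-value: the same test result leads to
# different decisions depending on the significance level you chose.
p_value = 0.03  # hypothetical p-value returned by some test

for alpha in (0.05, 0.01):
    decision = "reject" if p_value < alpha else "fail to reject"
    print(f"alpha = {alpha}: {decision} the null hypothesis")

# alpha = 0.05: reject the null hypothesis
# alpha = 0.01: fail to reject the null hypothesis
```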

In our example we assumed the p-value was very low, below the significance level. In the real world, the p-value is returned by the calculations done by your statistical package of choice (for Pythoners, the SciPy package). It will calculate the p-value based on the distribution and the sample you have. For the old-schoolers, you can use tables and graphs to calculate the area under the distribution that represents the p-value, but that's for another article.
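As a rough sketch of what that looks like in practice (the measurements below are made up, and the one-sample t-test is just one possible choice of test), SciPy computes the p-value for you and you only have to compare it to alpha:

```python
# A minimal sketch: testing whether the average gold concentration in our
# (hypothetical) rock samples is greater than zero.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Made-up measurements of gold concentration (grams per tonne) from the
# rock samples we chipped off the subway walls.
samples = rng.normal(loc=0.8, scale=0.5, size=30)

alpha = 0.05  # significance level, chosen before running the test

# One-sample t-test against the null hypothesis "mean concentration is 0"
# (i.e. "there is no gold under NYC").
t_stat, p_value = stats.ttest_1samp(samples, popmean=0.0, alternative="greater")

print(f"p-value: {p_value:.4f}")
if p_value < alpha:
    print("Reject the null hypothesis: the samples suggest there IS gold.")
else:
    print("Fail to reject the null hypothesis: not enough evidence of gold.")
```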

*The significance level (alpha) is chosen by whoever performs the test and can hence be prone to all sorts of mischief (not for this article).

Conclusion

I hope it all became clear with this explanation. If not, just ping me in the comments section. Suggestions and corrections, also leave a comment :)
