Demystifying Stats for non-statisticians: P-Values

By Ben We.

If you’ve ever taken a statistics course, you’ve experienced the strange, slightly opaque world of statistical jargon, where everyday words take on highly specific meanings that are easily abused. One of the most famous and most abused statistical terms is the “P-value.” In almost every field of science there’s an ongoing debate over P-values, and over whether the common P-value threshold of 0.05 is even reasonable. So, what is a P-value, and why is 0.05 such a contentious number?

What is a P-Value?

Before I give you the book definition of a P-value, let me recap how the statistical court of significance works. All data is initially assumed to be ordinary, terribly boring, and totally within expected boundaries unless proven otherwise. This default assumption is what statisticians call the null hypothesis. To decide that some observations are worthy of note, statisticians need a quantitative method.

That’s where the P-value comes in. Formally, the P-value is the probability of obtaining results at least as extreme as the observed data, assuming that the null hypothesis is correct. That definition is rather stuffy and opaque, so I hope a more concrete example will help. Let’s say you have a friend named Alvin who loves to play tricks on you. One day, he comes up to you with a coin and tells you to guess heads or tails.

You guess, and he flips heads.

Then, he flips heads again.

And, heads again,

And, heads again,

And, once more, to be sure, he flips the coin and it lands heads.

How many flips would it take before you suspect that he is using a trick coin? For me, it would be between four and five heads in a row - four is suspicious, and five, far too many heads in a row. This gut feeling is your intuition’s P-value threshold. Unfortunately, while intuition is acceptable with your friend Alvin, it doesn’t fly for statistics. So, let’s actually see what the probability of flipping 4 or 5 heads in a row is.

4 Heads: ½ x ½ x ½ x ½ = 0.0625

5 Heads: ½ x ½ x ½ x ½ x ½ = 0.03125

To translate those numbers into English: assuming the coin is an ordinary coin, there’s a 6.25% chance of flipping 4 heads in a row, and a 3.125% chance of flipping 5 heads in a row. A 6.25% chance is pretty low, so I would want to examine Alvin’s coin closely. However, your threshold may vary depending on how rare a 6% event feels to you.

That 6.25% chance is the P(robability)-value, and the point at which you feel uncomfortable with the odds is your P-value threshold.
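If you would rather let a computer do the multiplying, here is a minimal Python sketch of the coin arithmetic above. The function names `prob_all_heads` and `flips_until_suspicious` are made up for this illustration, and the sketch assumes a fair coin with independent flips. It only covers the simple "every flip is heads" case from the Alvin example, not P-values in general.

```python
from fractions import Fraction

FAIR = Fraction(1, 2)  # assumed chance of heads for an ordinary coin


def prob_all_heads(n_flips, p_heads=FAIR):
    """Probability that every one of n_flips independent flips lands heads."""
    return p_heads ** n_flips


def flips_until_suspicious(threshold, p_heads=FAIR):
    """Smallest run of heads whose probability drops below the threshold."""
    n = 1
    while prob_all_heads(n, p_heads) >= threshold:
        n += 1
    return n


# Chance of an unbroken run of heads, for runs of 1 to 6 flips.
for n in range(1, 7):
    print(n, "heads in a row:", float(prob_all_heads(n)))

# With a threshold of 0.05, how long a run before we get suspicious?
print(flips_until_suspicious(Fraction(5, 100)))
```

For a threshold of 0.05 the last line prints 5: four heads in a row (0.0625) stays just above the threshold, while five in a row (0.03125) falls below it. Try plugging in your own gut-feeling threshold to see how many heads Alvin gets before you call him out.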

So, how did we get 0.05, and why all the fuss?

0.05, or 5%, is a common threshold that most statisticians use to separate “statistically significant” from “statistically insignificant” results. Unfortunately, this threshold was not carefully calculated. It was a convenient convention, popularized by the statistician Ronald Fisher in the 1920s, and it stuck.

The problem is that so many people are taught that standard that they just look for the magic number, throwing out results with a P-value of 0.051 while confidently wielding results with a P-value of 0.049 like the Ten Commandments. The truth, however, is that the “insignificant” 0.051 is almost exactly as rare as the “significant” 0.049. So, why the hang-up on 0.05, and why don’t we use other numbers?

For one, it is a good baseline metric – 1 in 20 is a pretty rare event, but not so rare that it is an insurmountable bar. For example, during early March, there is approximately a 4-5% chance of experiencing a mixed rain/snow storm in Boston. As any local can attest, a mixed storm in March isn’t extremely rare, but it isn’t expected (or wanted for that matter). As a result, a 5% threshold is strict enough to keep the most unlikely events out, but not so strict as to disallow relatively unlikely events.

More importantly, since 5% is the accepted standard, using that value indicates that you play by the rules and that you’re not trying to pull some trickery. As a result, many statisticians won’t bat an eye at a P-value threshold of 0.05.

However, there are actual quantifiable issues with the 0.05 threshold, and times when even the most rule-following statistician would gladly move it. Those are outside the scope of this post and will be covered another time. Until then, good luck, and I hope the data gods reward you with some good P-values!
