Demystifying Stats for non-statisticians: P-Values

By Ben We.

If you’ve ever taken a statistics course, you’ve experienced the strange, slightly opaque world of statistical jargon, where colloquial language has highly specific meanings that are easily abused. One of the most famous and most abused statistical terms is the “P-value.” In almost every field of science there’s an ongoing discussion about P-values and whether the common P-value threshold of 0.05 is even reasonable. So, what is a P-value, and why is 0.05 such a contentious number?

What is a P-Value?

Before I give you the book definition of a P-value, let me recap how the statistical court of significance works. All data is initially assumed to be ordinary, terribly boring, and totally within expected boundaries unless proven otherwise. This default assumption is called the null hypothesis. To decide that some observations are worthy of note, statisticians need a quantitative method.

That’s where the P-value comes in. Academically, the P-value is the probability of obtaining results at least as extreme as the observed data, assuming that the null hypothesis is correct [1]. While that definition is rather stuffy and opaque, I hope a more concrete example will make it clear. Let’s say you have a friend named Alvin who loves to play tricks on you. One day, he comes up to you with a coin and tells you to guess heads or tails.

You guess, and he flips heads.

Then, he flips heads again.

And, heads again,

And, heads again,

And, once more, to be sure, he flips the coin and it lands heads.

How many flips would it take before you suspect that he is using a trick coin? For me, it would be between four and five heads in a row: four is suspicious, and five is far too many. This gut feeling is your intuition’s P-value threshold. Unfortunately, while intuition is acceptable with your friend Alvin, it doesn’t fly in statistics. So, let’s actually see what the probability of flipping 4 or 5 heads in a row is.

4 Heads: ½ x ½ x ½ x ½ = 0.0625

5 Heads: ½ x ½ x ½ x ½ x ½ = 0.03125

To translate those numbers into English: there’s a 6.25% chance of flipping 4 heads in a row, and a 3.125% chance of flipping 5 heads in a row, assuming the coin is an ordinary coin. A 6.25% chance is pretty low, so I would want to examine Alvin’s coin closely. However, your threshold may vary depending on how rare a 6% event feels to you.

That 6% chance is the P(robability)-value, and the point at which you become uncomfortable with the odds is your P-value threshold.
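For readers who like to check arithmetic with code, here is a minimal Python sketch of the coin example. The function names `p_value_heads` and `is_suspicious` are made up for illustration, and the sketch assumes a one-sided test (only an excess of heads counts as "extreme") against a fair coin as the null hypothesis.

```python
from fractions import Fraction
from math import comb


def p_value_heads(heads, flips):
    """One-sided P-value: the chance of seeing at least `heads` heads
    in `flips` flips, assuming the coin is fair (the null hypothesis)."""
    # Count every outcome with `heads` or more heads, out of 2**flips
    # equally likely sequences.
    favorable = sum(comb(flips, k) for k in range(heads, flips + 1))
    return Fraction(favorable, 2 ** flips)


def is_suspicious(heads, flips, threshold=Fraction(5, 100)):
    """True when the P-value falls below the chosen threshold."""
    return p_value_heads(heads, flips) < threshold


for n in (4, 5):
    p = p_value_heads(n, n)
    print(f"{n} heads in a row: P-value = {float(p)}, "
          f"below 0.05? {is_suspicious(n, n)}")
```

With a run of all heads there is nothing more extreme to count, so the P-value is just ½ multiplied by itself once per flip: 0.0625 for four flips and 0.03125 for five. Passing a different `threshold` stands in for a different gut feeling about how rare is too rare.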

So, how did we get 0.05, and why all the fuss?

0.05, or 5%, is a common threshold that most statisticians use to separate “statistically significant” from “statistically insignificant” results. Unfortunately, this threshold was not carefully calculated; it is usually traced to the statistician Ronald Fisher, who proposed it in the 1920s as a convenient convention rather than a derived value.

The problem is that so many people are taught that standard that they just look for the magic number, throwing out results with a P-value of 0.051 while confidently wielding results with a P-value of 0.049 like the Ten Commandments. However, the truth is that the “insignificant” 0.051 is almost exactly as rare as the “significant” 0.049. So, why the hang-up on 0.05, and why don’t we use other numbers?

For one, it is a good baseline metric – 1 in 20 is a pretty rare event, but not so rare that it is an insurmountable bar. For example, during early March, there is approximately a 4-5% chance of experiencing a mixed rain/snow storm in Boston. As any local can attest, a mixed storm in March isn’t extremely rare, but it isn’t expected (or wanted for that matter). As a result, a 5% threshold is strict enough to keep the most unlikely events out, but not so strict as to disallow relatively unlikely events.
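To get a feel for how often a "rare" result turns up by pure chance, here is a small simulation sketch in Python. It reuses the coin example from earlier: a perfectly fair coin is flipped five times, over and over, and we count how often it lands heads every time (a result that falls below the 5% threshold). The function name `false_alarm_rate`, the number of trials, and the seed are all arbitrary choices made for illustration.

```python
import random


def false_alarm_rate(flips=5, trials=100_000, seed=0):
    """Fraction of fair-coin experiments in which every flip is heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    all_heads = 0
    for _ in range(trials):
        # rng.random() < 0.5 plays the role of a fair coin landing heads.
        if all(rng.random() < 0.5 for _ in range(flips)):
            all_heads += 1
    return all_heads / trials


print(false_alarm_rate())
```

The printed fraction should land close to 1/32 (about 3%), which means that roughly one honest five-flip experiment in thirty would still look "significant". Rare, but far from impossible.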

More importantly, since 5% is the accepted standard, using that value indicates that you play by the rules and that you’re not trying to pull some trickery. So, many statisticians won’t bat an eye at a P-value threshold of 0.05.

That said, there are real, quantifiable issues with the 0.05 threshold, and times when even the most rule-following statistician would gladly move it. Those are outside the scope of this post and will be covered another time. Until then, good luck, and I hope the data gods reward you with some good P-values!

