Day 13 — Stratified Sampling: The Smart Way to Sample
Divide and conquer your sampling strategy for maximum precision.
Stratified sampling guarantees coverage of important subgroups and, when the strata differ from one another, can cut variance substantially compared to simple random sampling (about 60% in the worked example below).
The Random Sampling Trap
Imagine you're conducting a health survey in a company of 1,000 employees:
- 900 office workers (90%)
- 100 executives (10%)
You randomly sample 100 people. Here's what can go wrong:
Unlucky Sample #1:
Office workers: 95 people
Executives: 5 people
Problem: Only 5 executives - can't say much about this group!
Unlucky Sample #2:
Office workers: 87 people
Executives: 13 people
Different from reality (90/10 split)!
Unlucky Sample #3:
Office workers: 100 people
Executives: 0 people
Complete miss on executive health!
The problem: Simple Random Sampling (SRS) is... well, random!
The solution: Stratified Sampling - sample smartly within groups!
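To put rough numbers on these scenarios: the count of executives in a simple random sample drawn without replacement follows a hypergeometric distribution. The sketch below uses only the standard library; the helper name `hypergeom_pmf` is ours, not something from the survey setup itself.

```python
from math import comb

def hypergeom_pmf(k, population, group_size, sample_size):
    """P(exactly k group members in an SRS drawn without replacement)."""
    return (comb(group_size, k)
            * comb(population - group_size, sample_size - k)
            / comb(population, sample_size))

N, EXECUTIVES, n = 1000, 100, 100  # company size, executives, sample size

p_none = hypergeom_pmf(0, N, EXECUTIVES, n)
p_at_most_5 = sum(hypergeom_pmf(k, N, EXECUTIVES, n) for k in range(6))
p_exactly_10 = hypergeom_pmf(10, N, EXECUTIVES, n)

print(f"P(0 executives)    = {p_none:.2e}")
print(f"P(<= 5 executives) = {p_at_most_5:.3f}")
print(f"P(exactly 10)      = {p_exactly_10:.3f}")
```

A complete miss is possible but very unlikely at this sample size; samples with only a handful of executives are the more realistic risk.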
What is Stratified Sampling?
Stratified Sampling means:
- Divide population into non-overlapping groups (strata)
- Sample from each stratum separately
- Combine results with proper weighting
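The three steps can be sketched in a few lines of Python. This is a minimal illustration, not a production sampler: the function name `stratified_estimate`, the normally distributed scores, and the 90/10 allocation are all assumptions made for the example.

```python
import random
from statistics import mean

def stratified_estimate(strata, allocation, rng):
    """Estimate the population mean from a stratified sample.

    strata:     dict mapping stratum name -> list of values (the full stratum)
    allocation: dict mapping stratum name -> number of units to draw
    """
    total = sum(len(values) for values in strata.values())
    estimate = 0.0
    for name, values in strata.items():
        sample = rng.sample(values, allocation[name])  # step 2: sample within the stratum
        weight = len(values) / total                   # W_h = N_h / N
        estimate += weight * mean(sample)              # step 3: weighted combination
    return estimate

rng = random.Random(0)
# Step 1: divide the population into non-overlapping strata (made-up scores).
strata = {
    "office": [rng.gauss(50, 10) for _ in range(900)],
    "executive": [rng.gauss(90, 8) for _ in range(100)],
}
estimate = stratified_estimate(strata, {"office": 90, "executive": 10}, rng)
true_mean = mean(strata["office"] + strata["executive"])
print(f"stratified estimate: {estimate:.1f}, true mean: {true_mean:.1f}")
```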
Visual Comparison
Simple Random Sampling (SRS):
Population of 100: 90 office workers, 10 executives
Random sample of 10:
Picked: 10 office workers, 0 executives
Result: All office workers!
Stratified Sampling:
Population:
Stratum 1: 90 office workers
Stratum 2: 10 executives
Stratified sample of 10:
From Stratum 1: 9 people
From Stratum 2: 1 person
Result: Proper representation!
Why Stratify? Three Big Reasons
1. Guaranteed Coverage
Problem with SRS: Might miss rare but important groups
Example:
City population:
- Urban: 70%
- Suburban: 20%
- Rural: 10%
SRS of 100 might give:
Urban: 65, Suburban: 25, Rural: 10
OR
Urban: 75, Suburban: 22, Rural: 3 ← Rural underrepresented!
Stratified solution:
Explicitly sample from each:
Urban: 70 people (guaranteed)
Suburban: 20 people (guaranteed)
Rural: 10 people (guaranteed)
Coverage ensured!
2. Variance Reduction
The Math Intuition:
Variance comes from differences:
- Between-stratum variance: How different are the groups?
- Within-stratum variance: How different are people within each group?
The core idea: If strata are homogeneous (similar within), stratified sampling has lower variance than SRS!
Visual:
POPULATION (high variance):
Health scores: 45, 48, 50, 52, 85, 87, 88, 90, 91, 92
Office (lower): 45, 48, 50, 52
Executives (higher): 85, 87, 88, 90, 91, 92
Within-stratum variance:
Office: σ² ≈ 6.7 (people similar)
Executive: σ² ≈ 5.8 (people similar)
But between-stratum difference is HUGE (means of about 49 vs 89)!
SRS estimates affected by this big gap.
Stratified sampling accounts for it separately!
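The split into within-stratum and between-stratum variance can be checked directly on the ten scores above. This sketch uses population variances (divisor N) and weights each stratum by its share of the ten values, which is a simplification for illustration.

```python
from statistics import fmean, pvariance

office = [45, 48, 50, 52]
executives = [85, 87, 88, 90, 91, 92]
population = office + executives

grand_mean = fmean(population)
strata = [office, executives]
weights = [len(s) / len(population) for s in strata]

# Weighted average of the variances inside each stratum.
within = sum(w * pvariance(s) for w, s in zip(weights, strata))
# Weighted spread of the stratum means around the overall mean.
between = sum(w * (fmean(s) - grand_mean) ** 2 for w, s in zip(weights, strata))
total = pvariance(population)

print(f"within  = {within:.1f}")
print(f"between = {between:.1f}")
print(f"total   = {total:.1f}")
```

Almost all of the spread in this toy population sits between the two groups rather than inside them.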
3. Domain Insights
SRS result:
"Average health score: 56"
Okay... but tells us nothing about groups!
Stratified result:
"Average health scores:
Office workers: 52 (95% CI: 50-54)
Executives: 88 (95% CI: 86-90)"
Rich insights about each segment!
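A per-stratum report like this could be produced as follows. The sample values are made up for illustration, and the interval uses a plain normal approximation (1.96 standard errors, no finite population correction), which is a rough choice for samples this small.

```python
from math import sqrt
from statistics import fmean, stdev

def mean_ci95(sample):
    """Mean and normal-approximation 95% CI (ignores the finite population correction)."""
    m = fmean(sample)
    half_width = 1.96 * stdev(sample) / sqrt(len(sample))
    return m, m - half_width, m + half_width

# Made-up stratum samples, for illustration only.
samples = {
    "Office workers": [48, 55, 51, 49, 53, 50, 56, 52, 47, 54],
    "Executives": [86, 90, 88, 85, 91, 89, 87, 92, 84, 88],
}
for name, sample in samples.items():
    m, lo, hi = mean_ci95(sample)
    print(f"{name}: {m:.1f} (95% CI: {lo:.1f}-{hi:.1f})")
```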
The Math: How Much Better Is It?
Variance Formula
Simple Random Sampling variance:
Var(ȳ_SRS) = σ²/n × (N-n)/N
Where:
- σ² = overall population variance
- n = sample size
- N = population size
- (N-n)/N = finite population correction
Stratified Sampling variance:
Var(ȳ_strat) = Σ(Wₕ² × σₕ²/nₕ × (Nₕ-nₕ)/Nₕ)
Where:
- Wₕ = stratum h weight (Nₕ/N)
- σₕ² = variance within stratum h
- nₕ = sample size in stratum h
- Nₕ = population size in stratum h
The Variance Reduction:
Under proportional allocation:
Var(ȳ_SRS) - Var(ȳ_strat) = (1/n) × (N-n)/N × Σ Wₕ(μₕ - μ)²
The sum Σ Wₕ(μₕ - μ)² is the between-stratum variance!
Translation: The more different your strata are, the bigger the variance reduction!
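Both formulas are short enough to code up directly. The sketch below uses this post's two-stratum numbers (office: N = 900, μ = 50, σ² = 100; executives: N = 100, μ = 90, σ² = 64) with proportional allocation, and checks that the gap between the two variances equals the between-stratum variance scaled by (1/n) × (N-n)/N. Function names are ours.

```python
def var_srs(sigma2, n, N):
    """Variance of the sample mean under SRS without replacement."""
    return sigma2 / n * (N - n) / N

def var_stratified(strata, N):
    """Variance of the stratified mean.

    strata: list of (N_h, n_h, sigma2_h) tuples, one per stratum.
    """
    return sum((N_h / N) ** 2 * sigma2_h / n_h * (N_h - n_h) / N_h
               for N_h, n_h, sigma2_h in strata)

# Two strata: (N_h, mu_h, sigma2_h), sampled proportionally with n = 100.
N, n = 1000, 100
pop = [(900, 50.0, 100.0), (100, 90.0, 64.0)]

mu = sum(N_h / N * mu_h for N_h, mu_h, _ in pop)
between = sum(N_h / N * (mu_h - mu) ** 2 for N_h, mu_h, _ in pop)
within = sum(N_h / N * s2 for N_h, _, s2 in pop)
sigma2 = within + between                    # overall population variance

v_srs = var_srs(sigma2, n, N)
v_strat = var_stratified([(N_h, n * N_h // N, s2) for N_h, _, s2 in pop], N)
gap = between / n * (N - n) / N              # between-stratum share of Var(SRS)

print(f"Var SRS = {v_srs:.3f}, Var stratified = {v_strat:.3f}, gap = {gap:.3f}")
```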
Example Calculation
Population:
- Stratum 1 (Office): N₁ = 900, μ₁ = 50, σ₁² = 100
- Stratum 2 (Executive): N₂ = 100, μ₂ = 90, σ₂² = 64
- Total: N = 1000
Sample: n = 100
Proportional allocation:
- n₁ = 90 (90% of sample)
- n₂ = 10 (10% of sample)
SRS Variance:
First, calculate overall variance:
μ = 0.9(50) + 0.1(90) = 45 + 9 = 54
σ² = 0.9(100 + (50-54)²) + 0.1(64 + (90-54)²)
= 0.9(100 + 16) + 0.1(64 + 1296)
= 0.9(116) + 0.1(1360)
= 104.4 + 136
= 240.4
Var(ȳ_SRS) = 240.4/100 × (1000-100)/1000
= 2.404 × 0.9
= 2.16
Standard error: √2.16 = 1.47
Stratified Variance:
W₁ = 900/1000 = 0.9
W₂ = 100/1000 = 0.1
Var(ȳ_strat) = 0.9² × (100/90) × (900-90)/900
+ 0.1² × (64/10) × (100-10)/100
= 0.81 × 1.11 × 0.9
+ 0.01 × 6.4 × 0.9
= 0.81 + 0.058
= 0.87
Standard error: √0.87 = 0.93
The Improvement:
Variance reduction: 2.16 - 0.87 = 1.29 (about a 60% reduction!)
Standard error:
SRS: 1.47
Stratified: 0.93
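The stratified standard error is about 37% smaller (equivalently, the SRS standard error is about 58% larger: 1.47 / 0.93 ≈ 1.58).
Translation: To get the same precision with SRS, you'd need roughly 2.5× the sample size! (The variance ratio is 2.16 / 0.87 ≈ 2.5, and variance scales as 1/n when the finite population correction is ignored.)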
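The formula-based numbers can be sanity-checked by simulation. The sketch below builds a synthetic population that loosely matches the example (normally distributed scores are an assumption), repeats both sampling schemes many times, and compares the empirical variance of the two estimators. Exact figures depend on the random seed, so treat the output as approximate.

```python
import random
from statistics import fmean, pvariance

rng = random.Random(42)
# Synthetic population loosely matching the example (normal scores are an assumption).
office = [rng.gauss(50, 10) for _ in range(900)]
executives = [rng.gauss(90, 8) for _ in range(100)]
population = office + executives

def srs_mean():
    """Mean of one simple random sample of 100 people."""
    return fmean(rng.sample(population, 100))

def stratified_mean():
    """Weighted mean under proportional allocation: 90 office workers, 10 executives."""
    return 0.9 * fmean(rng.sample(office, 90)) + 0.1 * fmean(rng.sample(executives, 10))

REPS = 2000
srs_var = pvariance([srs_mean() for _ in range(REPS)])
strat_var = pvariance([stratified_mean() for _ in range(REPS)])
print(f"empirical Var(SRS) = {srs_var:.2f}, Var(stratified) = {strat_var:.2f}")
```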
Allocation Strategies: How Many Per Stratum?
Once you decide to stratify, how do you divide your sample across strata?
1. Proportional Allocation (Most Common)
Rule: Sample proportionally to stratum size
nₕ = n × (Nₕ/N)
Example:
Population: 900 office, 100 executive (1000 total)
Sample size: n = 100
Office sample: 100 × (900/1000) = 90
Executive sample: 100 × (100/1000) = 10
Pros:
- Simple, intuitive
- Self-weighting (no complex weights needed)
- Represents population structure
Cons:
- Small strata get small samples (might be imprecise)
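In practice n × (Nₕ/N) is rarely a whole number, so some rounding rule is needed. The sketch below uses largest-remainder rounding so the allocations always sum to n; that rounding rule is our choice here, not part of the definition of proportional allocation.

```python
def proportional_allocation(n, stratum_sizes):
    """Split n across strata in proportion to size, using largest-remainder rounding."""
    total = sum(stratum_sizes)
    exact = [n * size / total for size in stratum_sizes]
    alloc = [int(x) for x in exact]                      # round down first
    # Hand the leftover units to the strata with the largest fractional parts.
    by_remainder = sorted(range(len(exact)), key=lambda i: exact[i] - alloc[i], reverse=True)
    for i in by_remainder[: n - sum(alloc)]:
        alloc[i] += 1
    return alloc

print(proportional_allocation(100, [900, 100]))       # [90, 10]
print(proportional_allocation(100, [700, 200, 100]))  # [70, 20, 10]
print(proportional_allocation(10, [333, 333, 334]))   # [3, 3, 4]
```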
2. Equal Allocation
Rule: Same sample size for each stratum
nₕ = n / H
Where H = number of strata
Example:
Population: 900 office, 100 executive
Sample size: n = 100
Strata: H = 2
Office sample: 100/2 = 50
Executive sample: 100/2 = 50
Pros:
- Good for comparing strata (equal precision)
- Ensures small strata have enough data
Cons:
- Oversamples small strata (need complex weights)
- Less efficient for overall mean estimation
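The weighting issue is easy to see with a deliberately simple example. Here every sampled office worker scores 50 and every sampled executive scores 90 (placeholder values chosen for clarity), so the only thing that changes between the two estimates is the weighting.

```python
from statistics import fmean

# Hypothetical equal-allocation sample: 50 office workers and 50 executives.
office_sample = [50.0] * 50        # placeholder scores, all equal for clarity
executive_sample = [90.0] * 50

N_office, N_exec = 900, 100

# Design weight = how many population members each sampled person stands for.
w_office = N_office / len(office_sample)   # 18.0
w_exec = N_exec / len(executive_sample)    # 2.0

naive_mean = fmean(office_sample + executive_sample)
weighted_mean = (
    (w_office * sum(office_sample) + w_exec * sum(executive_sample))
    / (w_office * len(office_sample) + w_exec * len(executive_sample))
)
print(f"naive mean = {naive_mean:.1f}")        # 70.0, biased toward executives
print(f"weighted mean = {weighted_mean:.1f}")  # 54.0, matches the 90/10 population mix
```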
3. Neyman Allocation (Optimal)
Rule: Allocate proportional to stratum size AND standard deviation
nₕ = n × (Nₕ × σₕ) / Σ(Nₖ × σₖ)
Intuition: Sample more from:
- Large strata (more people → more important)
- High-variance strata (more diverse → need more samples)
Example:
Stratum 1: N₁ = 900, σ₁ = 10
Stratum 2: N₂ = 100, σ₂ = 8
Stratum 1 weight: 900 × 10 = 9,000
Stratum 2 weight: 100 × 8 = 800
Total weight: 9,800
Office sample: 100 × (9000/9800) = 91.8 ≈ 92
Executive sample: 100 × (800/9800) = 8.2 ≈ 8
Pros:
- Mathematically optimal (minimizes variance!)
- Accounts for both size and heterogeneity
Cons:
- Requires knowing σₕ in advance (often unknown!)
- Might still undersample important small strata
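The Neyman rule translates into a few lines of code. This sketch returns unrounded allocations and leaves rounding to the caller; the function name is ours, and the inputs are the stratum sizes and standard deviations from the example above.

```python
def neyman_allocation(n, sizes, sigmas):
    """Allocate n across strata proportionally to N_h * sigma_h (unrounded)."""
    products = [N_h * s_h for N_h, s_h in zip(sizes, sigmas)]
    total = sum(products)
    return [n * p / total for p in products]

raw = neyman_allocation(100, sizes=[900, 100], sigmas=[10, 8])
print([round(x, 1) for x in raw])   # [91.8, 8.2]
print([round(x) for x in raw])      # [92, 8]
```

When the σₕ are unknown, estimates from a pilot study or an earlier survey are commonly substituted. Note that independently rounded allocations are not guaranteed to sum to n in general.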
4. Optimal Allocation with Cost
Rule: Account for different sampling costs per stratum
nₕ = n × (Nₕ × σₕ / √cₕ) / Σ(Nₖ × σₖ / √cₖ)
Where cₕ = cost to sample one unit from stratum h
Example:
Executives cost 5× more to survey (busy, need incentives)
c₁ = $10 (office worker)
c₂ = $50 (executive)
This would reduce executive sample further!
Use when: Budget constrained, different costs per stratum
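Plugging the example costs into the rule shows how far the executive sample shrinks. This sketch reuses the stratum sizes and standard deviations from the Neyman example (σ₁ = 10, σ₂ = 8) and keeps the total sample size fixed at 100, which is an assumption; with a fixed budget instead, n itself would also change.

```python
from math import sqrt

def cost_allocation(n, sizes, sigmas, costs):
    """Allocate n proportionally to N_h * sigma_h / sqrt(c_h) (unrounded)."""
    scores = [N_h * s_h / sqrt(c_h) for N_h, s_h, c_h in zip(sizes, sigmas, costs)]
    total = sum(scores)
    return [n * s / total for s in scores]

raw = cost_allocation(100, sizes=[900, 100], sigmas=[10, 8], costs=[10, 50])
print([round(x, 1) for x in raw])   # [96.2, 3.8]
```

That is roughly 96 office workers and 4 executives, compared with 92 and 8 under Neyman allocation, which illustrates the earlier caveat about undersampling small strata.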
Visual: Variance vs Allocation
Let's see how variance changes with different allocations:
[Figure: variance (SE²) under each strategy, from highest to lowest: SRS, Equal, Proportional, Neyman. Lower is better.]
Takeaway: Neyman allocation gives the lowest variance for the overall mean at a fixed sample size (if you know the stratum standard deviations)!
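The ranking can be reproduced with the stratified variance formula and this post's running example (N₁ = 900, σ₁² = 100; N₂ = 100, σ₂² = 64; n = 100). The allocations are the ones computed in the sections above; the helper name is ours.

```python
def var_stratified(strata, N):
    """strata: list of (N_h, n_h, sigma2_h); returns Var of the stratified mean."""
    return sum((N_h / N) ** 2 * s2 / n_h * (N_h - n_h) / N_h for N_h, n_h, s2 in strata)

N, n = 1000, 100
sizes, variances = [900, 100], [100.0, 64.0]
allocations = {
    "Equal": [50, 50],
    "Proportional": [90, 10],
    "Neyman": [92, 8],
}
results = {"SRS": 240.4 / n * (N - n) / N}   # overall variance 240.4 from the example
for name, alloc in allocations.items():
    results[name] = var_stratified(list(zip(sizes, alloc, variances)), N)

# Print from highest variance to lowest.
for name, value in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<13}{value:.3f}")
```

With these numbers Neyman and proportional allocation are nearly tied, because the two stratum standard deviations (10 and 8) are similar; the gap widens when the σₕ differ more.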
Defining Strata: The Art and Science
Good strata are:
1. Mutually Exclusive
Each unit belongs to exactly one stratum
Bad: "Young", "Students"
(Young students counted twice!)
Good: "Student", "Non-Student"
2. Exhaustive
Every unit belongs to some stratum
Bad: "<30", "40-60", ">60"
(Missing ages 30-39!)
Good: "<30", "30-39", "40-60", ">60"
3. Homogeneous Within
Units within stratum are similar
Bad stratum: "People" (too diverse!)
Good stratum: "Female doctors aged 40-50"
4. Heterogeneous Between
Strata are different from each other
Bad: "Age 30-40", "Age 31-41"
(Too much overlap, not distinct!)
Good: "Age 18-30", "Age 31-50", "Age 51+"
5. Meaningful
Based on domain knowledge, not arbitrary
Bad: "First 500 rows", "Last 500 rows"
(Arbitrary split!)
Good: "Urban", "Suburban", "Rural"
(Meaningful demographic divisions)
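The first two properties (mutually exclusive and exhaustive) are mechanical enough to test in code. In this sketch each stratum is a predicate on integer ages, and a valid stratification assigns every unit to exactly one stratum. The age range and the 30-39 bracket are illustrative assumptions, chosen so that no age falls into two strata.

```python
def stratum_counts(units, strata):
    """For each unit, count how many strata claim it.

    strata: dict mapping stratum name -> predicate (unit -> bool)
    """
    return {u: sum(1 for belongs in strata.values() if belongs(u)) for u in units}

ages = range(18, 91)

gappy = {
    "<30": lambda a: a < 30,
    "40-60": lambda a: 40 <= a <= 60,
    ">60": lambda a: a > 60,
}
clean = {
    "<30": lambda a: a < 30,
    "30-39": lambda a: 30 <= a <= 39,
    "40-60": lambda a: 40 <= a <= 60,
    ">60": lambda a: a > 60,
}

missed = [a for a, c in stratum_counts(ages, gappy).items() if c == 0]
print("uncovered ages:", missed)   # 30 through 39
is_partition = all(c == 1 for c in stratum_counts(ages, clean).values())
print("clean strata form a partition:", is_partition)   # True
```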
Common Stratification Variables:
Demographics:
- Age groups
- Gender
- Education level
- Income brackets
- Geographic region
Business:
- Customer segments (high/medium/low value)
- Product categories
- Time periods (Q1, Q2, Q3, Q4)
Medical:
- Disease severity (mild/moderate/severe)
- Treatment type
- Risk factors present/absent
Wrapping Up
Stratified sampling is the "divide and conquer" of sampling:
Key Concepts:
Strata = non-overlapping, exhaustive groups
Proportional allocation = sample proportionally (simple, self-weighting)
Neyman allocation = optimal (proportional to Nₕ × σₕ)
Variance reduction = largest when stratum means differ a lot (about 60% lower than SRS in the worked example)
Coverage guarantee = ensures rare groups included
Domain insights = separate estimates per stratum
The Math Win:
Variance reduction = (1/n) × (N-n)/N × Σ Wₕ(μₕ - μ)²
(under proportional allocation)
Translation: The more different your strata,
the bigger the improvement!
Allocation Decision Tree:
Do you know σₕ for each stratum?
  Yes → Use Neyman (optimal!)
  No → Do you need equal precision per stratum?
    Yes → Use Equal allocation
    No → Use Proportional (simplest)
Real Impact:
In our exercise, stratified sampling gave 16× more precision than SRS with the same sample size. That's like getting 129 samples for the price of 8!
Where This Shows Up in Practice
- Data Pipelines: Ensuring high-quality filtering and robust statistical metrics before feeding downstream ML models.
- Production Anomaly Detection: Tracking system logs, performance latencies, or transaction volumes under heavy skew.
- A/B Testing & Evaluation: Correctly partitioning user cohorts or comparing treatment outcomes without normal distribution assumptions.




