Standard Deviation and Covariance - How are these related

Background: Mean, Median, and Mode Explained! 📊

These are three different ways to find the "middle" or "typical" value in a group of numbers. Each one tells us something different!

The Mean (Average) ➗

What it is: Add everything up, then divide by how many things you have.

Example: Test Scores 📝

Your last 5 math test scores: 85, 92, 78, 88, 82

Finding the mean:

Add them up: 85 + 92 + 78 + 88 + 82 = 425
Divide by how many: 425 ÷ 5 = 85

Your average score is 85!

Real-Life Example: Weekly Allowance 💵

Your friends' weekly allowances: $10, $15, $12, $8, $20

Sum: $65
Mean: $65 ÷ 5 = $13

The average allowance is $13 (even though nobody actually gets exactly $13!)

⚠️ When Mean Can Be Tricky!

Class Pizza Party: 5 kids ate: 2, 2, 3, 2, 11 slices

Mean: 20 ÷ 5 = 4 slices

But wait! Only one kid (who was super hungry) ate more than 4! The mean got pulled up by that one hungry kid!

The Median (The Middle One) 🎯

What it is: Line everything up from smallest to largest, then pick the one in the middle.

Example: Heights of Basketball Players 🏀

Team heights in inches: 68, 70, 65, 74, 66

Finding the median:

Sort them: 65, 66, 68, 70, 74
Find the middle: 68 inches

Half the team is shorter than 68", half is taller!

Even Number Example: Video Game High Scores 🎮

Your scores: 1200, 1500, 1800, 2000

Already sorted!
With 4 numbers, take the middle two: 1500 and 1800
Average them: (1500 + 1800) ÷ 2 = 1650

Real-Life Example: House Prices 🏠

Houses on your street sold for: $200,000, $210,000, $205,000, $195,000, $1,500,000

Mean: $462,000 (Whoa! That mansion pulled it way up!)
Median: $205,000 (More realistic for a typical house)

The median is barely affected by extreme values!
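Here is a small sketch of the house-price comparison using Python's built-in statistics module. The variable names and the $1,000,000 cutoff used to drop the mansion are just choices made for this illustration.

```python
from statistics import mean, median

# House prices from the example above (in dollars)
prices = [200_000, 210_000, 205_000, 195_000, 1_500_000]

# Same street with the mansion left out (cutoff chosen only for this demo)
without_mansion = [p for p in prices if p < 1_000_000]

print(mean(prices), median(prices))                    # with the mansion
print(mean(without_mansion), median(without_mansion))  # without the mansion
```

Removing one house moves the mean by hundreds of thousands of dollars, while the median shifts only slightly.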

The Mode (The Popular One) 🌟

What it is: The number that shows up the most often.

Example: Shoe Sizes 👟

Your soccer team's shoe sizes: 7, 8, 7, 9, 7, 8, 10, 7, 8

Count them:

Size 7: appears 4 times ✅ WINNER!
Size 8: appears 3 times
Size 9: appears 1 time
Size 10: appears 1 time

Mode = Size 7 (most popular size)

Multiple Modes Example: Favorite Pizza Toppings 🍕

Class votes: Pepperoni, Cheese, Pepperoni, Mushroom, Cheese, Hawaiian

Pepperoni: 2 votes
Cheese: 2 votes
Others: 1 vote each

Two modes! Pepperoni AND Cheese (this is called "bimodal")

No Mode Example: Unique Test Scores 📚

Scores: 72, 85, 91, 88, 79

Every score appears just once
No mode! (Nobody got the same score)
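If you want to check modes on a computer, Python's statistics.multimode (available in Python 3.8 and later) returns every value tied for most frequent. A minimal sketch using the two examples above:

```python
from statistics import multimode

toppings = ["Pepperoni", "Cheese", "Pepperoni", "Mushroom", "Cheese", "Hawaiian"]
scores = [72, 85, 91, 88, 79]

print(multimode(toppings))  # the values tied for most votes
print(multimode(scores))    # every score comes back, since each appears once
```

When multimode hands back the whole list, every value is tied, which is the "no mode" situation described above.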

Comparing All Three with the Same Data! 🎯

Example: Hours of Sleep Last Week

Monday:    8 hours
Tuesday:   7 hours
Wednesday: 7 hours
Thursday:  9 hours
Friday:    6 hours
Saturday:  10 hours
Sunday:    7 hours

Finding Each One:

Mean (Average):

Sum: 8+7+7+9+6+10+7 = 54
Mean: 54 ÷ 7 ≈ 7.7 hours

Median (Middle):

Sorted: 6, 7, 7, 7, 8, 9, 10
Middle value: 7 hours

Mode (Most Common):

7 appears three times
Mode: 7 hours
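The same three answers can be checked with a few lines of Python. This is only a sketch using the standard statistics module; the list name is made up for the example.

```python
from statistics import mean, median, mode

sleep_hours = [8, 7, 7, 9, 6, 10, 7]  # Monday through Sunday

print(round(mean(sleep_hours), 1))  # mean, rounded to one decimal place
print(median(sleep_hours))          # middle value after sorting
print(mode(sleep_hours))            # most common value
```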

When to Use Each One? 🤔

Use MEAN When:

You want the mathematical average
Data doesn't have extreme values
Example: "What's the average temperature this week?"

Use MEDIAN When:

You have extreme values that would mess up the average
You want to know the "typical" middle value
Example: "What's a typical salary at this company?" (ignores CEO's huge salary)

Use MODE When:

You want to know what's most common
You're dealing with categories (not just numbers)
Example: "What's the most popular ice cream flavor?"

Fun Practice Examples! 🎮

Example 1: Gaming Session Minutes

Data: 30, 45, 45, 60, 45, 120, 45

Mean: 390 ÷ 7 ≈ 55.7 minutes
Median: 30, 45, 45, 45, 45, 60, 120 → 45 minutes
Mode: 45 appears 4 times → 45 minutes

Example 2: Class Birthday Months

Data: Jan, March, Jan, July, Jan, Sept, Dec, March, Jan

Mean: Can't calculate (not numbers!)
Median: Can't really find (not numbers!)
Mode: January appears 4 times → January

Example 3: Bowling Scores 🎳

Data: 85, 92, 185, 88, 90

Mean: 540 ÷ 5 = 108 (pulled up by that amazing 185!)
Median: 85, 88, 90, 92, 185 → 90
Mode: No mode (all different)

That 185 game makes the mean misleading - median (90) better represents typical performance!

Memory Tricks! 🧠

Mean = Average

"Mean" and "Average" both have 'a' in them
It's the MEAN-ingful total divided up

Median = Middle

"Median" sounds like "medium" (middle size)
Highway median = middle of the road

Mode = Most

Both start with "Mo"
Mode = Most Often

Real-World Applications 🌍

Weather:

Mean temperature: Planning what to plant
Median rainfall: Typical weather
Mode wind direction: Where to build airport runways

School:

Mean grade: Your GPA
Median score: The middle score in the class (shows whether you are in the top or bottom half)
Mode answer: Which question everyone got wrong (teacher should review!)

Sports:

Mean points per game: Overall performance
Median time: Typical race finish
Mode jersey number: Most popular number

Quick Quiz! 🎯

Data Set: 10, 15, 15, 20, 100

What's the mean?
- Answer: 160 ÷ 5 = 32
What's the median?
- Answer: 15 (middle value)
What's the mode?
- Answer: 15 (appears twice)
Which best represents "typical"?
- Answer: Median or Mode (15) - the mean (32) got pulled up by that 100!

Remember: All three are useful, but for different reasons!

Standard Deviation Explained

Imagine your class takes a spelling test, and you want to understand how spread out everyone's scores are!

The Story of Two Classrooms 🏫

Classroom A: Everyone scores between 78-82 points

Sarah: 80
Jake: 78
Emma: 82
Luis: 79
Mia: 81

Classroom B: Scores are all over the place!

Tom: 95
Amy: 65
Ben: 100
Lisa: 55
Ryan: 85

Both classrooms have the same average (80), but they're very different!
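To put a number on "very different", here is a short sketch that computes the average and the standard deviation for both classrooms. It uses statistics.pstdev, which divides by the number of students (the population version); that choice is an assumption made for this example.

```python
from statistics import mean, pstdev

classroom_a = [80, 78, 82, 79, 81]
classroom_b = [95, 65, 100, 55, 85]

for name, scores in [("A", classroom_a), ("B", classroom_b)]:
    # Same average, very different spread
    print(name, mean(scores), round(pstdev(scores), 2))
```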

What Standard Deviation Tells Us 📏

Standard deviation is like a "spread-out meter" that measures how far apart things are from the middle:

Small standard deviation = Everyone is bunched together (like Classroom A)
Large standard deviation = Everyone is spread far apart (like Classroom B)

Real-Life Examples You Know!

🍕 Pizza Slices: If you and your friends each get 2 slices, that's a small standard deviation (everyone gets nearly the same). But if one friend gets 5 slices and another gets just 1, that's a large standard deviation!

🎯 Dart Game:

Good player: All darts land close together near the bullseye (small standard deviation)
Beginner: Darts land all over the board (large standard deviation)

🏃 Race Times: If all runners finish within 10 seconds of each other, that's a small spread. If the fastest runner finishes 2 minutes before the slowest, that's a big spread!

The Ice Cream Shop Example 🍦

A shop tracks how many scoops each customer orders:

Boring days: Everyone orders 2 scoops (no spread at all!)
Normal days: Most order 1-3 scoops (small spread)
Crazy days: Some order 1 scoop, others order 10! (huge spread)

Standard deviation helps the shop know if customers are predictable or surprising!

Why It Matters 🎯

Teachers use it to see if:

Everyone understands the lesson (small spread in test scores)
Some kids need extra help (big spread means some are struggling while others ace it)

The Simple Rule 📐

Think of standard deviation as answering: "Do most things stay close to normal, or do they jump around a lot?"

Close together = Small number = Predictable
Far apart = Big number = Unpredictable

It's like measuring how "same" or "different" things are in a group!

Standard Deviation and Covariance Explained

Standard Deviation measures how spread out data points are from their mean. It quantifies the typical distance between individual values and the average. Calculated as the square root of variance, standard deviation has the same units as the original data, making it interpretable.

For example, if test scores have a mean of 75 with a standard deviation of 10 and are roughly bell-shaped, about two-thirds of students scored between 65 and 85 (within one standard deviation). A small standard deviation means data clusters tightly around the mean; a large one indicates wide spread. In a normal distribution, roughly 68% of values fall within ±1 standard deviation, 95% within ±2, and 99.7% within ±3.

Covariance measures how two variables change together. Positive covariance means variables tend to increase or decrease together (height and weight). Negative covariance means when one increases, the other typically decreases (price and demand). Zero covariance suggests no linear relationship.

The formula is: Cov(X,Y) = E[(X - μₓ)(Y - μᵧ)]

However, covariance is hard to interpret because it depends on the variables' scales. Dividing covariance by the product of standard deviations gives correlation (-1 to +1), which is scale-independent and more interpretable.

Key difference: Standard deviation describes one variable's spread; covariance describes the relationship between two variables.
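The scale problem is easy to see in code. The sketch below uses made-up study data and two hypothetical helper functions (pcov and corr, written here just for illustration) to show that switching from hours to minutes changes the covariance but leaves the correlation alone.

```python
def pcov(xs, ys):
    """Population covariance: average product of deviations from the means."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def corr(xs, ys):
    """Correlation: covariance divided by the product of standard deviations."""
    return pcov(xs, ys) / (pcov(xs, xs) ** 0.5 * pcov(ys, ys) ** 0.5)

hours = [2, 3, 5, 4, 1]            # hypothetical study hours
scores = [70, 75, 85, 80, 65]      # hypothetical test scores
minutes = [h * 60 for h in hours]  # same study time, different unit

print(pcov(hours, scores), pcov(minutes, scores))  # covariance grows 60-fold
print(corr(hours, scores), corr(minutes, scores))  # correlation stays the same
```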

Finding Standard Deviation - Step by Step! 📊

Let's say your study group tracks their test scores (out of 100):

Test Scores:

Alex: 80 points
Maya: 76 points
Jake: 84 points
Sofia: 72 points
Ryan: 88 points

Step 1: Find the Average (Mean) 🎯

Add all scores and divide by how many students:

Sum: 80 + 76 + 84 + 72 + 88 = 400
Average: 400 ÷ 5 = 80 points

Step 2: Find How Far Each Score is from Average 📏

This is like asking "how different is each student from typical?"

Student	Score	Distance from 80
Alex	80	80 - 80 = 0
Maya	76	76 - 80 = -4
Jake	84	84 - 80 = +4
Sofia	72	72 - 80 = -8
Ryan	88	88 - 80 = +8

Step 3: Square Each Distance 🔲

Why square? It makes negative numbers positive and makes big differences stand out more!

Student	Distance	Squared
Alex	0	0² = 0
Maya	-4	(-4)² = 16
Jake	+4	4² = 16
Sofia	-8	(-8)² = 64
Ryan	+8	8² = 64

Step 4: Find the Average of Squared Distances 🧮

Sum of squares: 0 + 16 + 16 + 64 + 64 = 160
Average: 160 ÷ 5 = 32
This is called the variance!

Step 5: Take the Square Root ✅

Standard deviation = √32 ≈ 5.66 points
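The five steps translate almost line for line into Python. This is a minimal sketch that divides by the number of students in Step 4, exactly as the walkthrough does; the variable names are illustrative.

```python
from math import sqrt

scores = {"Alex": 80, "Maya": 76, "Jake": 84, "Sofia": 72, "Ryan": 88}

values = list(scores.values())
mean = sum(values) / len(values)         # Step 1: the average
deviations = [v - mean for v in values]  # Step 2: distance from the average
squared = [d ** 2 for d in deviations]   # Step 3: square each distance
variance = sum(squared) / len(squared)   # Step 4: average of the squares
std_dev = sqrt(variance)                 # Step 5: square root

print(mean, variance, round(std_dev, 2))  # 80.0 32.0 5.66
```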

What Does This Mean? 🤔

The standard deviation of 5.66 tells us:

Most students score within 5.66 points of the average (80)
Typical range: between 74.34 and 85.66
The class is pretty consistent (not too spread out)

Let's Compare Two Different Classes! 📚

Class A (Everyone Similar):

Scores: 78, 80, 82, 79, 81
Average: 80
Standard deviation: 1.4 (tiny - everyone performs almost the same!)

Class B (Big Differences):

Scores: 65, 95, 70, 90, 80
Average: 80
Standard deviation: 11.4 (huge - some struggling, some acing it!)

Visual Way to Think About It 🎨

Class A (small std dev):     ●●●●●  <- all bunched around 80
                               ↑
                            Average (80)

Class B (large std dev):  ●    ●    ●    ●    ●  <- spread from 65 to 95!
                               ↑
                            Average (80)

Real Example: Your Quiz Scores 📝

Let's track your math quiz scores this month:

Quiz Scores: 85, 92, 78, 88, 82 (out of 100)

Step-by-step:

Average: (85+92+78+88+82) ÷ 5 = 85
Differences: 0, +7, -7, +3, -3
Squared: 0, 49, 49, 9, 9
Variance: (0+49+49+9+9) ÷ 5 = 23.2
Standard Deviation: √23.2 ≈ 4.82 points

What this tells you:

Your scores typically stay within 4.82 points of 85
Expected range: 80.18 to 89.82
You're pretty consistent!

Different Subject Comparisons 📖

Your Science Scores:

Tests: 90, 88, 91, 89, 92
Average: 90
Std Dev: 1.4 (super consistent! You've got science down!)

Your History Scores:

Tests: 75, 95, 70, 85, 100
Average: 85
Std Dev: 11.4 (very inconsistent - maybe depends on the topic?)

Your English Scores:

Tests: 82, 78, 85, 80, 75
Average: 80
Std Dev: 3.4 (moderate consistency)
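The subject comparison can be recomputed in one loop. This sketch assumes the same "divide by the number of tests" definition used in the step-by-step section, which is what statistics.pstdev does.

```python
from statistics import mean, pstdev

subjects = {
    "Science": [90, 88, 91, 89, 92],
    "History": [75, 95, 70, 85, 100],
    "English": [82, 78, 85, 80, 75],
}

for name, tests in subjects.items():
    # pstdev divides by n, matching the hand calculation
    print(f"{name}: average {mean(tests)}, std dev {pstdev(tests):.1f}")
```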

The Grade Distribution Rule 📐

For large classes whose scores follow a roughly bell-shaped (normal) curve, about:

68% score within 1 standard deviation
95% score within 2 standard deviations
99.7% score within 3 standard deviations

Example: Class average = 75, std dev = 10

68% score between 65-85
95% score between 55-95
99.7% score between 45-105 (capped at 100!)
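Where do 68, 95, and 99.7 come from? They are areas under the bell curve. A quick sketch with statistics.NormalDist (Python 3.8 and later), assuming the scores really are normally distributed:

```python
from statistics import NormalDist

grades = NormalDist(mu=75, sigma=10)  # assumes scores are roughly bell-shaped

for k in (1, 2, 3):
    low, high = 75 - k * 10, 75 + k * 10
    share = grades.cdf(high) - grades.cdf(low)  # area between low and high
    print(f"within {k} std dev ({low}-{high}): {share:.1%}")
```

The printed shares are slightly more precise than the rounded 68-95-99.7 figures.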

Quick Trick to Estimate! 💡

The Range Rule: Standard deviation ≈ (Highest - Lowest) ÷ 4

From our first example:

Highest: 88, Lowest: 72
Quick estimate: (88-72) ÷ 4 = 4
Actual: 5.66 (in the right ballpark - the rule is only a rough estimate, especially for small groups)
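A tiny sketch comparing the shortcut with the full calculation. The function name range_rule_estimate is made up for this example.

```python
from statistics import pstdev

def range_rule_estimate(values):
    """Rough standard deviation estimate: (highest - lowest) / 4."""
    return (max(values) - min(values)) / 4

scores = [80, 76, 84, 72, 88]
print(range_rule_estimate(scores), round(pstdev(scores), 2))  # 4.0 5.66
```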

Practice Problems 🎮

Problem 1: Basketball Free Throws (out of 100 attempts):

Success rates: 70, 75, 65, 80, 60
What's the standard deviation?

Problem 2: Spelling Test Scores:

Scores: 95, 100, 90, 85, 95
How consistent is this student?

Problem 3: Video Game Accuracy (%):

Games: 82, 78, 86, 74, 90
Find the spread!

Why This Matters for School 🎯

Track Your Progress: See if you're getting more consistent
Identify Strengths: Low std dev with a high average = mastered subject
Find Problem Areas: High std dev = inconsistent results, need more practice
Compare Subjects: Which classes are you most consistent in?
Set Goals: Aim to raise your average while reducing your standard deviation!

Teacher's Perspective 👩‍🏫

When teachers see:

Small std dev (like 3-5) with a high average: "Everyone understands! I can move on."
Medium std dev (like 8-10): "Normal spread. Keep current pace."
Large std dev (like 15-20): "Some kids need help. Time for review!"

Standard deviation helps you understand if you're consistently good or all over the place (too much variety)!

What is Covariance

Covariance Explained 🎯

Imagine you and your best friend are tracking two things every week: how many hours you study and what score you get on the weekly quiz.

The Big Question 🤔

Covariance answers: "When one thing goes up, does the other thing usually go up too, go down, or not care?"

Let's Look at Three Different Stories! 📚

Story 1: Study Time & Quiz Scores (Friends That Go Together) ⬆️⬆️

Week 1: Study 1 hour → Score 70
Week 2: Study 2 hours → Score 80  
Week 3: Study 3 hours → Score 90
Week 4: Study 0.5 hours → Score 65

What's happening?

More study = Higher score ✅
Less study = Lower score ✅
These are friends that move together!
This is POSITIVE covariance (+)

Story 2: Video Game Time & Homework Score (Opposites) ⬆️⬇️

Monday: Play 4 hours → Homework score 60
Tuesday: Play 1 hour → Homework score 95
Wednesday: Play 3 hours → Homework score 70
Thursday: Play 0.5 hours → Homework score 100

What's happening?

More gaming = Lower homework score 📉
Less gaming = Higher homework score 📈
These are opposites!
This is NEGATIVE covariance (-)

Story 3: Shoe Size & Math Grade (Don't Care About Each Other) ➡️❓

Student A: Shoe size 5 → Math grade 84
Student B: Shoe size 7 → Math grade 86
Student C: Shoe size 4 → Math grade 86
Student D: Shoe size 8 → Math grade 85

What's happening?

Big feet? Small feet? Math doesn't care! 🤷
These things ignore each other
This is ZERO covariance (0)

Real-Life Examples You Know! 🌟

Things That Go Together (Positive) ⬆️⬆️

Ice Cream Sales & Temperature
- Hot day = Lots of ice cream sold
- Cold day = Few ice cream sold
Practice & Getting Better
- More basketball practice = More baskets made
- Less practice = Fewer baskets made
Rain & Umbrella Sales
- Rainy day = Many umbrellas sold
- Sunny day = Few umbrellas sold

Things That Are Opposites (Negative) ⬆️⬇️

Speed & Time to School
- Walk faster = Get there quicker
- Walk slower = Takes longer
Absences & Grades
- Miss more school = Lower grades
- Come every day = Higher grades
Price & How Many Sold
- Expensive candy = Fewer kids buy it
- Cheap candy = More kids buy it

Things That Don't Care (Zero) ➡️❓

Hair Color & Favorite Pizza
- Blonde, brown, black hair... everyone likes different pizza!
Birthday Month & Height
- Born in January or July? Doesn't affect how tall you are!
Pet Type & Reading Speed
- Dog or cat owner? Doesn't change how fast you read!

Positive and Negative Covariance

Here are 5 real-life examples of covariance:

1. Height and Weight in Humans

Positive Covariance: Taller people tend to weigh more, while shorter people tend to weigh less

Example: In a population study, as height increases from 5'0" to 6'5", weight typically increases from around 100 lbs to 200+ lbs

Why it matters: Because height and weight move together, healthcare professionals judge weight relative to height (for example with BMI) rather than on its own

2. Study Time and Exam Scores

Positive Covariance: Students who study more hours typically score higher on exams

Example: A student studying 2 hours might average 70%, while one studying 6 hours might average 90%

Why it matters: Educational institutions use this to recommend study guidelines and predict academic performance

3. Ice Cream Sales and Temperature

Positive Covariance: Ice cream sales increase on hotter days and decrease on colder days

Example: A store might sell 50 ice creams on a 60°F day but 300 on a 95°F day

Why it matters: Businesses use this for inventory management and staffing decisions

4. Car Age and Resale Value

Negative Covariance: As a car gets older, its resale value typically decreases

Example: A new car worth $30,000 might be worth $20,000 after 3 years and $10,000 after 7 years

Why it matters: Used car dealers, insurance companies, and consumers use this relationship for pricing and depreciation calculations

5. Stock Prices in the Same Sector

Positive Covariance: Companies in the same industry often see their stock prices move together

Example: When oil prices rise, ExxonMobil, Chevron, and BP stocks often all increase together

Why it matters: Investors use covariance to diversify portfolios - holding stocks with negative or low covariance reduces overall risk

Key Point: Covariance tells us whether two variables move together (positive), move in opposite directions (negative), or have no relationship (near zero). However, it doesn't tell us the strength of the relationship - that's what correlation does!

The Playground Example 🛝

Let's track Temperature and Kids at the Playground:

Monday:    60°F → 5 kids playing
Tuesday:   75°F → 20 kids playing
Wednesday: 80°F → 25 kids playing  
Thursday:  55°F → 3 kids playing
Friday:    85°F → 30 kids playing

Pattern? When temperature goes UP ⬆️, kids playing goes UP ⬆️ too!

This is positive covariance - they're friends!

The Dance Partner Analogy 💃🕺

Think of covariance like dance partners:

Positive Covariance = Dancing the same moves
- Both go left together
- Both go right together
- They're in sync!
Negative Covariance = Mirror dancing
- One goes left, other goes right
- One goes up, other goes down
- They're opposites!
Zero Covariance = Dancing to different songs
- Each doing their own thing
- Not paying attention to each other

Quick Check Game! 🎮

What kind of covariance do these have?

Hours of sleep & How tired you feel
- Answer: Negative! (More sleep = Less tired)
Books read & Vocabulary size
- Answer: Positive! (More books = More words)
Favorite color & Math skills
- Answer: Zero! (They don't care about each other)

Why Should You Care? 🌈

Understanding covariance helps you:

Make Predictions: "If I study more, my grades will probably go up!"
See Patterns: "Every time it's sunny, the pool is crowded"
Make Decisions: "If I want better grades, I should reduce video game time"
Understand Connections: "These two things are related!"

The Lemonade Stand Example 🍋

You run a lemonade stand and notice:

Hot days: Sell 50 cups
Warm days: Sell 30 cups
Cool days: Sell 10 cups

Temperature and lemonade sales have positive covariance! Now you know to make more lemonade on hot days!

Remember: 🧠

Covariance = "Do these things like to move together, opposite, or ignore each other?"

Best Friends = Positive (both up or both down together)
Enemies = Negative (one up, other down)
Strangers = Zero (don't care about each other)

It's like finding out which things in life are buddies, which are rivals, and which are strangers!

Finding Covariance - Step by Step! 📊

Let's track Hours Studied and Test Scores for 5 students:

Our Data:

Student	Hours Studied (X)	Test Score (Y)
Emma	2 hours	70%
Liam	3 hours	75%
Sofia	5 hours	85%
Noah	4 hours	80%
Ava	1 hour	65%

Step 1: Find the Averages 📈

Average Hours Studied:

Sum: 2 + 3 + 5 + 4 + 1 = 15 hours
Average: 15 ÷ 5 = 3 hours

Average Test Score:

Sum: 70 + 75 + 85 + 80 + 65 = 375
Average: 375 ÷ 5 = 75%

Step 2: Find How Far Each is From Average 📏

Student	Hours - 3	Score - 75
Emma	2 - 3 = -1	70 - 75 = -5
Liam	3 - 3 = 0	75 - 75 = 0
Sofia	5 - 3 = +2	85 - 75 = +10
Noah	4 - 3 = +1	80 - 75 = +5
Ava	1 - 3 = -2	65 - 75 = -10

Step 3: Multiply the Differences for Each Student 🔢

This is the key step! We multiply how far hours are from average by how far scores are from average:

Student	(Hours - 3) × (Score - 75)	Result
Emma	(-1) × (-5)	+5
Liam	(0) × (0)	0
Sofia	(+2) × (+10)	+20
Noah	(+1) × (+5)	+5
Ava	(-2) × (-10)	+20

Step 4: Find the Average of These Products 🎯

Sum: 5 + 0 + 20 + 5 + 20 = 50
Covariance: 50 ÷ 5 = +10
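The four steps can be written out in Python as a sanity check. A minimal sketch, dividing by the number of students as in Step 4; the variable names are illustrative.

```python
hours = [2, 3, 5, 4, 1]        # Emma, Liam, Sofia, Noah, Ava
scores = [70, 75, 85, 80, 65]

mean_hours = sum(hours) / len(hours)             # Step 1: averages
mean_score = sum(scores) / len(scores)
products = [(h - mean_hours) * (s - mean_score)  # Steps 2-3: differences, multiplied
            for h, s in zip(hours, scores)]
covariance = sum(products) / len(products)       # Step 4: average of the products

print(products, covariance)  # [5.0, 0.0, 20.0, 5.0, 20.0] 10.0
```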

What Does +10 Mean? 🤔

Positive covariance (+10) tells us:

When study hours go UP ⬆️, test scores tend to go UP ⬆️
When study hours go DOWN ⬇️, test scores tend to go DOWN ⬇️
They're friends that move together!

Let's See the Pattern Visually! 📉

Score |
  85  |                   Sofia•(5,85)
  80  |               Noah•(4,80)
  75  |          Liam•(3,75)
  70  |     Emma•(2,70)
  65  | Ava•(1,65)
      |___________________________
           1    2    3    4    5  Hours

See the upward pattern? That's positive covariance!

Different Example: Screen Time vs. Grades 📱

Let's see what negative covariance looks like:

Data:

Student	Screen Time (hrs)	Grade
Alex	1	92
Blake	2	88
Casey	3	84
Dana	4	80
Eli	5	76

Quick Calculation:

Averages: Screen = 3 hrs, Grade = 84
Differences from average:
- Alex: -2 hrs, +8 grade → (-2)×(+8) = -16
- Blake: -1 hr, +4 grade → (-1)×(+4) = -4
- Casey: 0 hrs, 0 grade → (0)×(0) = 0
- Dana: +1 hr, -4 grade → (+1)×(-4) = -4
- Eli: +2 hrs, -8 grade → (+2)×(-8) = -16
Sum: -16 + -4 + 0 + -4 + -16 = -40
Covariance: -40 ÷ 5 = -8

Negative covariance (-8) means:

More screen time = Lower grades (opposites!)

Zero Covariance Example: Shoe Size vs. Math Score 👟

Data:

Student	Shoe Size	Math Score
Student A	6	80
Student B	7	75
Student C	5	85
Student D	8	70
Student E	9	90

After calculation, you'd get covariance = 0 (the positive and negative products cancel out)

Shoe size and math scores don't care about each other!

Understanding the Sign and Size 📏

The Sign (+ or -)

Positive (+): Things move together ⬆️⬆️
Negative (-): Things move opposite ⬆️⬇️
Zero (0): No relationship 🤷

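The Size (How Big the Number Is)

Be careful: the size of a covariance depends on the units you measure in, so a big number does not automatically mean a strong relationship.

Measure study time in minutes instead of hours and the covariance becomes 60 times bigger, even though nothing about the relationship changed
To judge strength, use correlation (always between -1 and +1)
Zero or near zero: No linear relationship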

Practice Problem! 🎮

Gaming Hours vs. Outdoor Activity Hours:

Day	Gaming	Outdoor
Mon	1	4
Tue	2	3
Wed	3	2
Thu	4	1
Fri	5	0

Try it yourself:

Find average gaming hours
Find average outdoor hours
Calculate differences
Multiply and sum
Divide by 5

Spoiler: You'll get negative covariance! (They're opposites)
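Once you have tried it by hand, you can check your answer with a short script. This is just one way to write it; population_covariance is a helper defined here for the example.

```python
def population_covariance(xs, ys):
    """Average of (x - mean_x) * (y - mean_y), dividing by the number of points."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

gaming = [1, 2, 3, 4, 5]   # Mon-Fri gaming hours
outdoor = [4, 3, 2, 1, 0]  # Mon-Fri outdoor hours

print(population_covariance(gaming, outdoor))  # negative, as the spoiler says
```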

Real-World Applications 🌍

Positive Covariance Examples:

Temperature & Ice cream sales
Study time & Grades
Exercise & Fitness level
Practice & Performance

Negative Covariance Examples:

Price & Sales quantity
Absences & Grades
TV time & Reading time
Fast food & Health

Zero Covariance Examples:

Birth month & Height
Hair color & IQ
Favorite color & Sports ability

The Formula (For Reference) 📝

Covariance = Σ[(X - X̄)(Y - Ȳ)] / n

Where:

X̄ = average of X values
Ȳ = average of Y values
n = number of data points
Σ = sum everything up
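One detail worth knowing: the formula above divides by n (the "population" version), while many tools divide by n - 1 by default (the "sample" version); NumPy's np.cov is one example. The sketch below shows both on the screen-time data; the sample flag is a parameter invented for this illustration.

```python
def covariance(xs, ys, sample=False):
    """Divide by n (population) or n - 1 (sample), depending on `sample`."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (n - 1 if sample else n)

screen_time = [1, 2, 3, 4, 5]
grades = [92, 88, 84, 80, 76]

print(covariance(screen_time, grades))               # -8.0
print(covariance(screen_time, grades, sample=True))  # -10.0
```

The sign is the same either way; only the size changes, and the gap shrinks as the number of data points grows.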

Why This Matters! 🎯

Make Predictions: "If I increase study time, my scores should improve"
Understand Relationships: "These two things are connected!"
Smart Decisions: Know what affects what
Data Analysis: Foundation for correlation and regression

Remember: Covariance shows if things are friends (positive), enemies (negative), or strangers (zero)!

The 68-95-99.7 Rule (Empirical Rule)

Python
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import norm

# Create a range of x values
x = np.linspace(-4, 4, 1000)
# Create a standard normal distribution (mean=0, std_dev=1)
y = norm.pdf(x, 0, 1)

# Create the plot
plt.figure(figsize=(12, 6))
plt.plot(x, y, color='black')

# --- Fill Regions ---

# 1 Standard Deviation (68%)
x_1_std = np.linspace(-1, 1, 100)
y_1_std = norm.pdf(x_1_std, 0, 1)
plt.fill_between(x_1_std, y_1_std, color='blue', alpha=0.3, label='68%')

# 2 Standard Deviations (95%)
x_2_std_left = np.linspace(-2, -1, 100)
y_2_std_left = norm.pdf(x_2_std_left, 0, 1)
plt.fill_between(x_2_std_left, y_2_std_left, color='green', alpha=0.3)

x_2_std_right = np.linspace(1, 2, 100)
y_2_std_right = norm.pdf(x_2_std_right, 0, 1)
plt.fill_between(x_2_std_right, y_2_std_right, color='green', alpha=0.3, label='95% (total)')

# 3 Standard Deviations (99.7%)
x_3_std_left = np.linspace(-3, -2, 100)
y_3_std_left = norm.pdf(x_3_std_left, 0, 1)
plt.fill_between(x_3_std_left, y_3_std_left, color='orange', alpha=0.3)

x_3_std_right = np.linspace(2, 3, 100)
y_3_std_right = norm.pdf(x_3_std_right, 0, 1)
plt.fill_between(x_3_std_right, y_3_std_right, color='orange', alpha=0.3, label='99.7% (total)')

# --- Add Text and Annotations ---

# 68%
plt.annotate('68%',
             xy=(0, 0.2),
             xytext=(0, 0.25),
             ha='center',
             fontsize=12,
             arrowprops=dict(facecolor='black', shrink=0.05, width=1, headwidth=8))

# 95%
plt.annotate('',
             xy=(-1.5, 0.05),
             xytext=(-2.5, 0.15),
             arrowprops=dict(arrowstyle='<->', color='black'))
plt.annotate('',
             xy=(1.5, 0.05),
             xytext=(2.5, 0.15),
             arrowprops=dict(arrowstyle='<->', color='black'))
plt.text(0, 0.13, '95%', ha='center', fontsize=12)

# 99.7%
plt.annotate('',
             xy=(-2.5, 0.01),
             xytext=(-3.5, 0.08),
             arrowprops=dict(arrowstyle='<->', color='black'))
plt.annotate('',
             xy=(2.5, 0.01),
             xytext=(3.5, 0.08),
             arrowprops=dict(arrowstyle='<->', color='black'))
plt.text(0, 0.06, '99.7%', ha='center', fontsize=12)

# --- Customize Ticks and Labels ---
plt.title('The 68-95-99.7 (Empirical) Rule', fontsize=16)
ticks = [-3, -2, -1, 0, 1, 2, 3]
labels = ['μ-3σ', 'μ-2σ', 'μ-1σ', 'μ', 'μ+1σ', 'μ+2σ', 'μ+3σ']
plt.xticks(ticks, labels, fontsize=12)
plt.yticks([]) # Remove y-axis ticks and labels

plt.ylim(bottom=0)
plt.grid(axis='x', linestyle='--', alpha=0.6)

# Save the plot
plt.savefig('empirical_rule_graph.png', dpi=300, bbox_inches='tight')

print("Graph 'empirical_rule_graph.png' has been generated.")

Code output
Graph 'empirical_rule_graph.png' has been generated.

https://lh3.googleusercontent.com/gg-dl/ABS2GSno11rPaN4-4PwT298DnWxy2xAgzKuXCbZXnXKay0jVfhd700_tO14Cs4BpXQcEJLZSbl9WnoK6wApQeWYb1K1GH26gVYBGjwIN-WipF0sRDZuS9-9P91-bCVR8qvfew8ZCJFGjVl9lRTTnHEz1xIFL2NAr3evOrle83MSrX5HMfZe0hA=s1600

Here is a graph illustrating the 68-95-99.7 (Empirical) Rule for a normal distribution.

This rule states that for a bell-shaped curve:

68% of the data falls within one standard deviation (sigma, σ) of the mean (μ).
95% of the data falls within two standard deviations.
99.7% of the data falls within three standard deviations.

The 68-95-99.7 rule is a shorthand way to remember how data is distributed in a normal distribution (bell curve). It tells you what percentage of data falls within certain distances from the mean.

The Rule States:

68% of data falls within 1 standard deviation of the mean
95% of data falls within 2 standard deviations of the mean
99.7% of data falls within 3 standard deviations of the mean

Real-World Example:

Let's say human heights follow a normal distribution with:

Mean (μ) = 170 cm
Standard deviation (σ) = 10 cm

Then:

68% of people are between 160-180 cm (170 ± 10)
95% of people are between 150-190 cm (170 ± 20)
99.7% of people are between 140-200 cm (170 ± 30)
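You can also test the rule by simulation. The sketch below draws 100,000 pretend heights from a normal distribution and counts how many land within 1, 2, and 3 standard deviations; the sample size and the seed are arbitrary choices for the demo.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable
mu, sigma = 170, 10
heights = [random.gauss(mu, sigma) for _ in range(100_000)]

for k in (1, 2, 3):
    # Count simulated people within k standard deviations of the mean
    inside = sum(abs(h - mu) <= k * sigma for h in heights)
    print(f"within {k} sigma: {inside / len(heights):.1%}")
```

The simulated shares should land close to 68%, 95%, and 99.7%, with small differences from run to run if the seed is changed.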

Practical Applications:

1. Quality Control

Products outside 3σ are considered defects (only 0.3% expected)
Six Sigma methodology aims for even tighter control

2. Test Scores

If test scores have mean = 75, standard deviation = 10:

68% of students score between 65-85
95% score between 55-95
99.7% score between 45-105 (capped at 100 in practice, a reminder that the normal curve is only an approximation)

3. Stock Market

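Daily returns are often assumed to be normal (real markets produce extreme moves more often than this assumption predicts). Under that assumption: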

Most days (68%) have small movements (±1σ)
Big moves beyond 2σ happen only 5% of the time
Extreme moves beyond 3σ are very rare (0.3%)

Why It's Important:

Quick estimates without complex calculations
Outlier detection - values beyond 3σ are suspicious
Setting thresholds and expectations
Understanding variability in any process

Key Points to Remember:

Only applies to normal (Gaussian) distributions
It's about standard deviations from the mean
Also called the "Empirical Rule" or "Three-Sigma Rule"
The remaining 0.3% (outside 3σ) are often treated as potential outliers

Quick Memory Trick:

Think of it as simple fractions:

1σ = about 2/3 of data (68%)
2σ = about 19/20 of data (95%)
3σ = almost everything (99.7%)

This rule is fundamental in statistics, quality control, and understanding how "normal" any measurement is!

How Standard Deviation and Covariance Are Related

Standard deviation and covariance are closely related through several important connections:

1. Variance Connection

Variance is the square of standard deviation: Var(X) = σ² [sigma squared]
Variance is actually covariance of a variable with itself: Var(X) = Cov(X,X)
Therefore: Standard deviation = √Cov(X,X)
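A quick numerical check of this connection, as a sketch. It uses the population (divide by n) definitions throughout, and population_covariance is a helper written for this example.

```python
from statistics import pstdev, pvariance

def population_covariance(xs, ys):
    """Average product of deviations from the means (divides by n)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

x = [80, 76, 84, 72, 88]
cov_xx = population_covariance(x, x)  # covariance of X with itself

print(cov_xx, pvariance(x))      # the two values agree
print(cov_xx ** 0.5, pstdev(x))  # so do the square root and the std dev
```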

2. Correlation Formula

The correlation coefficient directly connects them:

Correlation = Cov(X,Y) / (σₓ × σᵧ)
This normalizes covariance by both variables' standard deviations
Makes the relationship scale-independent (-1 to +1)

3. Mathematical Properties

Covariance can never exceed the product of standard deviations: |Cov(X,Y)| ≤ σₓ × σᵧ
This is why correlation is bounded between -1 and 1
When |correlation| = 1, the relationship is perfectly linear

4. Practical Example

If two variables have:

Standard deviations: σₓ = 10, σᵧ = 5
Covariance: Cov(X,Y) = 40
Then correlation = 40/(10×5) = 0.8 (strong positive relationship)
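The same arithmetic as a small function, with the bound from point 3 built in as a sanity check. The function name and the error message are illustrative choices, not a standard API.

```python
def correlation(cov_xy, std_x, std_y):
    """Correlation = Cov(X,Y) / (σx × σy), rejecting impossible inputs."""
    if abs(cov_xy) > std_x * std_y:
        raise ValueError("|Cov(X,Y)| cannot exceed σx × σy")
    return cov_xy / (std_x * std_y)

print(correlation(40, 10, 5))   # 0.8
print(correlation(-50, 10, 5))  # -1.0, a perfectly linear negative relationship
```

Passing a covariance of 60 with the same standard deviations would raise the error, because no pair of variables can produce it.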

5. Portfolio Theory Application

In finance, portfolio variance combines both concepts:

Portfolio variance = Σᵢ wᵢ²σᵢ² + 2 Σ(i<j) wᵢwⱼCov(i,j), where the second sum counts each pair of assets once
Uses individual standard deviations and pairwise covariances
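For two assets the formula is short enough to try directly. The sketch below uses hypothetical numbers (two stocks with 20% volatility each, split 50/50) to show how the covariance term changes portfolio risk; portfolio_std is a helper written for this example.

```python
from math import sqrt

def portfolio_std(w1, w2, std1, std2, cov12):
    """Two-asset portfolio volatility from weights, std devs, and covariance."""
    variance = w1**2 * std1**2 + w2**2 * std2**2 + 2 * w1 * w2 * cov12
    return sqrt(variance)

# Hypothetical inputs: equal weights, equal volatilities, varying correlation
for corr in (1.0, 0.0, -0.5):
    cov12 = corr * 0.20 * 0.20  # Cov = correlation × σ1 × σ2
    print(corr, round(portfolio_std(0.5, 0.5, 0.20, 0.20, cov12), 4))
```

The lower the covariance between the two stocks, the lower the portfolio's volatility, even though each stock is just as volatile on its own.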

In essence, standard deviation is a special case of covariance (with itself), and together they fully describe linear relationships between variables.

Understanding Data Spread Topics

Variance and standard deviation
What happens with multiple variables?

Measuring Relationships Between Variables

Covariance and correlation

Putting It Together: Portfolio Theory

Practical application using both concepts
We need both

Key Connecting Points to Emphasize

Mathematical Unity
- Variance is just covariance with itself
- Standard deviation appears in the correlation formula
Conceptual Flow
- Once you understand spread, relationships are the natural next step
- You need both to fully describe data
Practical Necessity
- You can't understand portfolio risk with just standard deviation
- You can't interpret covariance without knowing the standard deviations

Why Combine Them

Natural progression: Standard deviation (one variable) → Covariance (two variables)
Mathematical relationship: Variance = Cov(X,X), making them fundamentally connected
Reader efficiency: Learning both together reinforces understanding
Practical applications: Most real-world problems involve both concepts

From Spread to Relationships: Understanding Standard Deviation and Covariance

1. Introduction: The Journey from One to Many Variables

Start with a relatable problem: "How risky is this investment?"
Show need for measuring spread (std dev) and relationships (covariance)

2. Part I: Standard Deviation - Measuring Spread

Intuitive explanation with visual examples
Calculation walkthrough
Why square root of variance?
68-95-99.7 rule
Real example: Student test scores

3. Bridge Section: "What If We Have Two Variables?"

Transition question: "Does study time relate to test scores?"
Show limitations of just using individual standard deviations
Introduce need for measuring relationships

4. Part II: Covariance - Measuring Relationships

Build from variance formula: Var(X) = E[(X-μ)²]
Show variance IS covariance: Var(X) = Cov(X,X)
Extend to two variables: Cov(X,Y) = E[(X-μₓ)(Y-μᵧ)]
Positive vs negative covariance examples

5. The Connection: Correlation

Problem: Covariance depends on units
Solution: Normalize by standard deviations
ρ = Cov(X,Y)/(σₓ × σᵧ)
Now everything connects!
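
To make the units problem concrete, here is a small sketch with made-up heights and weights. Switching height from metres to centimetres multiplies the covariance by 100 but leaves the correlation unchanged. The helper names are illustrative, not from any library.

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Population covariance of paired values."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def std_dev(xs):
    """Standard deviation as the square root of Cov(X, X)."""
    return math.sqrt(covariance(xs, xs))

def correlation(xs, ys):
    """rho = Cov(X, Y) / (sigma_x * sigma_y)."""
    return covariance(xs, ys) / (std_dev(xs) * std_dev(ys))

# Hypothetical measurements for five people.
heights_m = [1.50, 1.60, 1.70, 1.80, 1.90]
weights_kg = [52, 58, 69, 74, 85]
heights_cm = [h * 100 for h in heights_m]   # same heights, different unit

cov_m = covariance(heights_m, weights_kg)
cov_cm = covariance(heights_cm, weights_kg)
corr_m = correlation(heights_m, weights_kg)
corr_cm = correlation(heights_cm, weights_kg)

print(f"Covariance (m, kg):  {cov_m:.2f}")
print(f"Covariance (cm, kg): {cov_cm:.2f}")   # 100 times larger
print(f"Correlation (m):  {corr_m:.4f}")
print(f"Correlation (cm): {corr_cm:.4f}")     # unchanged
```

Dividing by both standard deviations cancels the units, which is why the correlation always lands between -1 and 1.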

6. Practical Applications (Using Both Together)

Portfolio Risk Example:
- Individual stock volatilities (standard deviations)
- Stock relationships (covariances)
- Portfolio risk formula combines both:
  σ²portfolio = w₁²σ₁² + w₂²σ₂² + 2w₁w₂Cov(1,2)
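
A hedged sketch of the two-asset formula is below. The weights and volatilities are invented for illustration, and the covariance is derived from an assumed correlation via Cov(1,2) = ρσ₁σ₂.

```python
import math

def portfolio_variance(w1, w2, sigma1, sigma2, cov12):
    """Two-asset portfolio variance from weights, std devs, and covariance."""
    return w1**2 * sigma1**2 + w2**2 * sigma2**2 + 2 * w1 * w2 * cov12

# Hypothetical inputs: equal weights, 20% and 30% volatility.
w1, w2 = 0.5, 0.5
sigma1, sigma2 = 0.20, 0.30

portfolio_std = {}
for rho in (1.0, 0.0, -1.0):
    cov12 = rho * sigma1 * sigma2   # covariance implied by correlation rho
    var_p = portfolio_variance(w1, w2, sigma1, sigma2, cov12)
    portfolio_std[rho] = math.sqrt(var_p)
    print(f"rho = {rho:+.1f}: portfolio std dev = {portfolio_std[rho]:.4f}")

# rho = +1.0 gives 0.2500 (the weighted average of 0.20 and 0.30)
# rho = +0.0 gives 0.1803
# rho = -1.0 gives 0.0500
```

The individual standard deviations never change across the three cases; only the covariance term does, and it alone moves portfolio risk from 25% down to 5%.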

7. Interactive Visualizations

Scatter plots showing different covariance patterns
Slider to adjust correlation and see effect on both metrics
Portfolio risk calculator

8. Common Pitfalls

Why covariance ≠ causation
When standard deviation misleads (bimodal distributions)
Scale sensitivity of covariance
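
The bimodal pitfall can be illustrated with a toy example. The two made-up datasets below share the same mean and the same population standard deviation, yet one has no values near its mean and the other has most of its values exactly there.

```python
import statistics

# Hypothetical data: two clusters at -1 and +1, nothing in between.
two_clusters = [-1, 1, -1, 1, -1, 1, -1, 1]

# Hypothetical data: a single peak at 0 with two outlying values.
one_peak = [-2, 0, 0, 0, 0, 0, 0, 2]

def count_near_mean(values, radius=0.5):
    """Number of values within `radius` of the mean."""
    center = statistics.mean(values)
    return sum(1 for v in values if abs(v - center) <= radius)

for name, data in (("two clusters", two_clusters), ("one peak", one_peak)):
    print(
        f"{name}: mean = {statistics.mean(data)}, "
        f"std dev = {statistics.pstdev(data)}, "
        f"values near mean = {count_near_mean(data)}"
    )
```

Both datasets report a mean of 0 and a standard deviation of 1, so the summary numbers alone cannot tell them apart; a histogram would.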

Code Example

import numpy as np
import matplotlib.pyplot as plt

# Generate correlated data. Both variances are 1, so the off-diagonal
# 0.8 is the population covariance and also the population correlation.
mean = [0, 0]
cov_matrix = [[1, 0.8],
              [0.8, 1]]  # Covariance matrix
x, y = np.random.multivariate_normal(mean, cov_matrix, 1000).T

# Calculate metrics. ddof=1 makes np.std divide by n - 1, matching
# np.cov's default, so the correlation below is computed consistently.
std_x = np.std(x, ddof=1)
std_y = np.std(y, ddof=1)
covariance = np.cov(x, y)[0, 1]
correlation = covariance / (std_x * std_y)

print(f"Std Dev of X: {std_x:.3f}")
print(f"Std Dev of Y: {std_y:.3f}")
print(f"Covariance: {covariance:.3f}")
print(f"Correlation: {correlation:.3f}")

# Visualize
plt.scatter(x, y, alpha=0.5)
plt.title(f'Correlation = {correlation:.2f}')
plt.xlabel('X')
plt.ylabel('Y')
plt.show()

Essential Topics to Complete Your Statistics/Probability Blog

Since you're covering Bayes' theorem, distributions, standard deviation, and covariance, here are complementary topics to create a comprehensive blog:

Foundation Topics (Prerequisites)

Probability Basics
- Sample space, events, probability axioms
- Conditional probability and independence
- Law of total probability
Descriptive Statistics
- Mean, median, mode
- Variance and its properties
- Percentiles and quartiles

Core Statistical Concepts

Correlation vs Causation
- Pearson, Spearman correlations
- Simpson's paradox
- Common misconceptions
Central Limit Theorem
- Why it matters
- Real-world applications
- Connection to normal distribution
Hypothesis Testing
- Null/alternative hypotheses
- Type I and II errors
- p-values and significance levels
- Power analysis
Confidence Intervals
- Interpretation and construction
- Relationship to hypothesis testing
- Common misunderstandings

Intermediate Topics

Maximum Likelihood Estimation (MLE)
- Connection to Bayesian methods
- When to use MLE vs MAP
- Practical examples
Monte Carlo Methods
- Simulation basics
- Monte Carlo integration
- Bootstrap methods
Markov Chains
- Transition matrices
- Stationary distributions
- MCMC for Bayesian inference

Applied Topics

A/B Testing
- Frequentist vs Bayesian approaches
- Sample size calculation
- Multiple testing correction
Regression Analysis
- Linear regression basics
- Assumptions and diagnostics
- Logistic regression for classification
Time Series Basics
- Autocorrelation
- Stationarity
- Moving averages

Modern ML Connections

Information Theory Basics
- Entropy
- KL divergence
- Cross-entropy loss
Bayesian Networks and Graphical Models
- Directed vs undirected
- Inference algorithms
- Real applications
Expectation-Maximization (EM)
- Hidden variables
- Connection to MLE
- Gaussian mixture models

Practical Guides

Common Statistical Fallacies
- Base rate fallacy
- Prosecutor's fallacy
- Survivorship bias
- Data dredging
When to Use Which Test
- Decision tree for statistical tests
- Parametric vs non-parametric
- Sample size considerations
Visualization Best Practices
- Choosing the right plot
- Avoiding misleading visualizations
- Uncertainty visualization

Blog Structure Suggestion

Series 1: Foundations (Start here)

Probability basics → Descriptive stats → Distributions → Standard deviation/Covariance

Series 2: Inference

Bayes' theorem → Hypothesis testing → Confidence intervals → MLE

Series 3: Applications

Correlation/Causation → CLT → Regression → A/B testing

Series 4: Advanced

Markov chains → Monte Carlo → Bayesian networks → EM algorithm

Interactive Elements to Add

Code examples in Python/R for each concept
Interactive visualizations using JavaScript libraries
Real-world datasets for practice
Problem sets with solutions
Common mistakes sections
"When to use this" decision guides

Connecting Threads

Make sure to show how topics relate:

How CLT justifies using normal distribution in hypothesis testing
How MLE connects to Bayesian MAP estimation
How covariance appears in regression and portfolio theory
How Bayes' theorem underlies machine learning algorithms
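
As one illustration of these threads, the covariance-to-regression link can be sketched in a few lines: in simple linear regression the least-squares slope equals Cov(X,Y) / Var(X). The data is made up, and the helper names are illustrative only.

```python
def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Population covariance of paired values."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def fit_line(xs, ys):
    """Least-squares line using slope = Cov(X, Y) / Var(X)."""
    slope = covariance(xs, ys) / covariance(xs, xs)
    intercept = mean(ys) - slope * mean(xs)
    return slope, intercept

def sum_squared_error(xs, ys, slope, intercept):
    """Total squared vertical distance between the points and the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data: hours studied and test scores.
study_hours = [1, 2, 3, 4, 5]
test_scores = [60, 70, 75, 85, 90]

slope, intercept = fit_line(study_hours, test_scores)
print(f"slope = {slope}, intercept = {intercept}")   # slope = 7.5, intercept = 53.5

# Nudging the slope away from Cov/Var makes the fit worse.
best = sum_squared_error(study_hours, test_scores, slope, intercept)
nudged = sum_squared_error(study_hours, test_scores, slope + 0.5, intercept)
print(best < nudged)   # True
```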

Artificial Intelligence Theory and Application