Lesson 8: Statistics — Data & Probability

Cambridge A Level Mathematics 9709 — Statistics 1 (S1) | Paper 4 (AS) or Paper 5

Lesson 8 of 12
📋 Note: Statistics is a separate component — Paper 4 (AS Level) or Paper 5 (A Level). It does not build on Pure Mathematics directly. This lesson covers data representation, measures of location and spread, probability (including conditional probability and independence), and discrete random variables. Lesson 9 covers the Normal distribution and hypothesis testing.

1. Representation of Data S1

Types of Data: Qualitative (categorical — names, colours, labels) and Quantitative (numerical). Quantitative data is either discrete (countable, e.g. number of students) or continuous (measurable, any value in a range, e.g. height, time).

Stem-and-Leaf Diagrams

A stem-and-leaf diagram preserves the original data values while showing the distribution. A back-to-back stem-and-leaf compares two datasets.

Box-and-Whisker Plots

A box plot displays the five-number summary:
Minimum value | Lower Quartile (Q₁) | Median (Q₂) | Upper Quartile (Q₃) | Maximum value
Interquartile Range (IQR) = Q₃ − Q₁, which measures the spread of the middle 50% of the data.
Outliers: Values below Q₁ − 1.5×IQR or above Q₃ + 1.5×IQR.

Histograms for Grouped Data

Histogram: For continuous grouped data, the y-axis is frequency density, NOT frequency. This ensures the area of each bar equals the frequency.
Frequency density = Frequency ÷ Class width
Frequency = Frequency density × Class width

📐 Worked Example 1 — Histogram and Frequency Density

Data on journey times (minutes): 0–10: 12, 10–20: 18, 20–40: 22, 40–70: 15. Calculate the frequency densities needed to draw the histogram, and estimate the number of journeys under 30 minutes.

1
Calculate frequency densities (FD = freq ÷ class width):
0–10: 12/10=1.2   10–20: 18/10=1.8   20–40: 22/20=1.1   40–70: 15/30=0.5
2
Journeys under 30 min = all of 0–10, all of 10–20, and half of 20–40 (assuming journeys are spread evenly within the 20–40 class):
= 12 + 18 + 22/2 = 12 + 18 + 11 = 41 journeys
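
The two steps above can be sketched in Python. This is a minimal illustration, not part of the syllabus: the function names are made up for this example, and the estimate assumes values are spread evenly within each class.

```python
# Journey-time classes from Worked Example 1: (lower bound, upper bound, frequency).
classes = [(0, 10, 12), (10, 20, 18), (20, 40, 22), (40, 70, 15)]

def frequency_density(lower, upper, freq):
    """Height of the histogram bar: frequency divided by class width."""
    return freq / (upper - lower)

def estimate_below(classes, cutoff):
    """Estimate how many values lie below `cutoff`.

    Assumes values are spread evenly within each class, so a partly
    covered class contributes a proportional share of its frequency.
    """
    total = 0.0
    for lower, upper, freq in classes:
        if cutoff >= upper:
            total += freq  # whole class lies below the cutoff
        elif cutoff > lower:
            total += freq * (cutoff - lower) / (upper - lower)
    return total

densities = [frequency_density(*c) for c in classes]
journeys_under_30 = estimate_below(classes, 30)
print(densities)
print(journeys_under_30)
```

The proportional share is the same idea as reading an area off the histogram: area of a bar = frequency density × width covered.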

Cumulative Frequency Curves

Plot cumulative frequency against the upper class boundary. The curve (ogive) allows estimation of the median (at n/2), quartiles (at n/4 and 3n/4), and percentiles.

2. Measures of Location and Spread S1

Measures of Location

Mean: x̄ = Σx/n    or    x̄ = Σfx/Σf (frequency distribution)
Median: middle value when data is sorted; average of two middle values if n even
Mode: most frequent value(s)

Measures of Spread — Variance and Standard Deviation

Variance: σ² = Σ(x−x̄)²/n = Σx²/n − x̄²    (population)
Variance: s² = Σ(x−x̄)²/(n−1) = [Σx²−(Σx)²/n]/(n−1)    (sample)
Standard deviation: σ = √(variance)
Coding: if y=(x−a)/b then ȳ=(x̄−a)/b, σ_y = σ_x/|b|

📐 Worked Example 2 — Mean and Variance

For the data: 3, 7, 7, 8, 10, 12, 14. Find the mean, variance, and standard deviation.

1
n=7, Σx=3+7+7+8+10+12+14=61
x̄ = 61/7 ≈ 8.714
2
Σx²=9+49+49+64+100+144+196=611
σ² = 611/7 − (61/7)² = 87.286 − 75.939 = 11.347
3
σ = √11.347 = 3.37 (3 s.f.)
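
As a cross-check, the population formulas can be evaluated with Python's standard `statistics` module. This is a sketch for checking arithmetic only: `pvariance` and `pstdev` divide by n, matching the σ² formula used here, whereas `variance` and `stdev` would divide by n − 1.

```python
from statistics import mean, pvariance, pstdev

data = [3, 7, 7, 8, 10, 12, 14]
n = len(data)

x_bar = mean(data)

# Definition form: mean of squared deviations from the mean.
var_definition = pvariance(data)

# Shortcut form: mean of the squares minus the square of the mean.
var_shortcut = sum(x * x for x in data) / n - (sum(data) / n) ** 2

sd = pstdev(data)

print(round(x_bar, 3), round(var_definition, 3), round(var_shortcut, 3), round(sd, 2))
```

Both variance forms should agree up to floating-point rounding, which is a useful check when working by hand.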

📐 Worked Example 3 — Coding

Data has Σx=360, Σx²=15200, n=20. Using y=x−20, find the mean and standard deviation of x.

1
Σy=Σ(x−20)=360−400=−40. ȳ=−40/20=−2
x̄=ȳ+20=18
2
Σy²=Σ(x−20)²=Σx²−40Σx+400n=15200−14400+8000=8800
σ²_y=Σy²/n−ȳ²=8800/20−4=440−4=436
Since y=x−20, σ_x=σ_y=√436≈20.9
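
The coding steps can be sketched as a small function working only from the summary totals. The function name and signature are illustrative assumptions, and it handles the shift y = x − a only (no scaling by b).

```python
from math import sqrt

def summary_from_coded(n, sum_x, sum_x2, a):
    """Mean and standard deviation of x via the coding y = x - a."""
    sum_y = sum_x - n * a                         # Σ(x − a)
    sum_y2 = sum_x2 - 2 * a * sum_x + n * a * a   # Σ(x − a)²
    y_bar = sum_y / n
    var_y = sum_y2 / n - y_bar ** 2
    # Shifting by a constant moves the mean but leaves the spread unchanged.
    return y_bar + a, sqrt(var_y)

x_bar, sd_x = summary_from_coded(n=20, sum_x=360, sum_x2=15200, a=20)
print(x_bar, round(sd_x, 1))
```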

Choosing the Right Measure

Situation | Best Measure of Location | Best Measure of Spread
Symmetric distribution | Mean | Standard deviation
Skewed distribution | Median | IQR
Outliers present | Median | IQR
Categorical data | Mode | (not applicable)
Further calculations needed | Mean | Standard deviation

3. Probability S1

Fundamental Probability Rules

P(A) = (number of favourable outcomes)/(total outcomes)    for equally likely outcomes
0 ≤ P(A) ≤ 1    P(A) + P(A') = 1    P(∅) = 0    P(S) = 1
Addition rule: P(A∪B) = P(A) + P(B) − P(A∩B)
Mutually exclusive: P(A∩B)=0 → P(A∪B)=P(A)+P(B)
Conditional probability: P(A|B) = P(A∩B)/P(B)
Independent events: P(A∩B) = P(A)×P(B)    i.e. P(A|B)=P(A) when P(B)>0
Multiplication rule: P(A∩B) = P(A|B)×P(B) = P(B|A)×P(A)

📐 Worked Example 4 — Conditional Probability

In a class: P(pass Maths) = 0.7, P(pass Physics) = 0.6, P(pass both) = 0.45. Find: (a) P(pass at least one) (b) P(pass Maths | pass Physics) (c) Are the events independent?

1
(a) P(M∪P) = 0.7+0.6−0.45 = 0.85
2
(b) P(M|P) = P(M∩P)/P(P) = 0.45/0.6 = 0.75
3
(c) If independent: P(M∩P)=P(M)×P(P)=0.7×0.6=0.42≠0.45.
P(M|P)=0.75≠0.7=P(M). Therefore not independent.
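
The same three checks can be written with exact fractions, which avoids floating-point surprises when testing for equality. The variable names are chosen for this sketch only.

```python
from fractions import Fraction

p_maths = Fraction(7, 10)
p_physics = Fraction(6, 10)
p_both = Fraction(45, 100)

# (a) Addition rule.
p_at_least_one = p_maths + p_physics - p_both

# (b) Conditional probability P(M | P).
p_maths_given_physics = p_both / p_physics

# (c) Independence holds only if P(M ∩ P) equals P(M) × P(P).
independent = p_both == p_maths * p_physics

print(p_at_least_one, p_maths_given_physics, independent)
```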

Tree Diagrams and Venn Diagrams

📐 Worked Example 5 — Tree Diagram (Two-Stage)

A bag has 4 red and 6 blue balls. Two balls are drawn without replacement. Find the probability that both are the same colour.

1
P(RR) = (4/10)×(3/9) = 12/90 = 2/15
P(BB) = (6/10)×(5/9) = 30/90 = 1/3
2
P(same colour) = 2/15 + 1/3 = 2/15 + 5/15 = 7/15
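
A tree diagram is just a list of paths with a probability multiplied along each branch. The sketch below enumerates the four paths for this bag; the function name is a made-up helper, and it assumes exactly two draws without replacement.

```python
from fractions import Fraction

def draw_two_without_replacement(red, blue):
    """Return {(first, second): probability} for two draws without replacement."""
    total = red + blue
    counts = {"R": red, "B": blue}
    paths = {}
    for first in counts:
        p_first = Fraction(counts[first], total)
        remaining = dict(counts)
        remaining[first] -= 1  # one ball of that colour has been removed
        for second in remaining:
            p_second = Fraction(remaining[second], total - 1)
            paths[(first, second)] = p_first * p_second
    return paths

paths = draw_two_without_replacement(red=4, blue=6)
p_same = paths[("R", "R")] + paths[("B", "B")]
print(paths)
print(p_same)
```

Checking that all path probabilities sum to 1 is a quick way to catch a mislabelled branch.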

Permutations and Combinations

Counting Methods

Permutations (order matters): ⁿPᵣ = n!/(n−r)!
Combinations (order doesn't matter): ⁿCᵣ = n!/[r!(n−r)!]
Arrangements with repeats: n!/(p! q! r! ...) (where p, q, r, ... are the frequencies of the repeated items)
Circular arrangements: (n−1)!

📐 Worked Example 6 — Combinations in Probability

A committee of 4 is chosen from 6 men and 5 women. Find the probability that the committee has at least 2 women.

1
Total ways = ¹¹C₄ = 330
2
P(at least 2 women) = P(2W+2M) + P(3W+1M) + P(4W+0M)
= (⁵C₂×⁶C₂ + ⁵C₃×⁶C₁ + ⁵C₄×⁶C₀)/330
= (10×15 + 10×6 + 5×1)/330
= (150+60+5)/330 = 215/330 = 43/66
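
The count can be reproduced with `math.comb` (available from Python 3.8). The sketch also checks the answer against the complement, 1 − P(0 or 1 women), which is often the quicker route in an exam.

```python
from fractions import Fraction
from math import comb

men, women, size = 6, 5, 4
total = comb(men + women, size)

# Direct count: 2, 3 or 4 women, with the remaining places filled by men.
favourable = sum(comb(women, w) * comb(men, size - w) for w in range(2, size + 1))
p_at_least_two_women = Fraction(favourable, total)

# Complement: committees with 0 or 1 women.
unfavourable = sum(comb(women, w) * comb(men, size - w) for w in range(0, 2))
p_via_complement = 1 - Fraction(unfavourable, total)

print(total, favourable, p_at_least_two_women)
```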

4. Discrete Random Variables S1

Discrete Random Variable (DRV): A variable X that takes specific discrete values with defined probabilities. The probability distribution lists all possible values and their probabilities. Key requirement: ΣP(X=x) = 1.

Expectation and Variance of a DRV

E(X) = Σ x P(X=x)    (expected value / mean)
E(X²) = Σ x² P(X=x)
Var(X) = E(X²) − [E(X)]²
E(aX+b) = aE(X) + b    Var(aX+b) = a²Var(X)
E(X+Y) = E(X) + E(Y)    (always, even if not independent)
Var(X+Y) = Var(X) + Var(Y)    (only when X and Y are independent)

📐 Worked Example 7 — Probability Distribution Table

A DRV X has P(X=x) = k(x+1) for x = 0, 1, 2, 3. Find k, E(X), and Var(X).

1
ΣP=1: k(1)+k(2)+k(3)+k(4)=10k=1 → k=1/10
2
E(X)=0×(1/10)+1×(2/10)+2×(3/10)+3×(4/10)=0+2/10+6/10+12/10=20/10=2
3
E(X²)=0×(1/10)+1×(2/10)+4×(3/10)+9×(4/10)=0+2/10+12/10+36/10=50/10=5
Var(X)=E(X²)−[E(X)]²=5−4=1
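
The same calculation can be sketched with exact fractions. The `expectation` helper is an illustrative name, not a library function; it evaluates Σ g(x) P(X=x) for any function g.

```python
from fractions import Fraction

values = [0, 1, 2, 3]
weights = [x + 1 for x in values]   # P(X=x) is proportional to x + 1
k = Fraction(1, sum(weights))       # probabilities must sum to 1
pmf = {x: k * w for x, w in zip(values, weights)}

def expectation(pmf, g=lambda x: x):
    """E[g(X)] for a discrete distribution given as {value: probability}."""
    return sum(g(x) * p for x, p in pmf.items())

e_x = expectation(pmf)
e_x2 = expectation(pmf, lambda x: x * x)
var_x = e_x2 - e_x ** 2

print(k, e_x, e_x2, var_x)
```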

Geometric Distribution

Geometric Distribution: Models the number of trials up to and including the first success, where trials are independent and each has probability p of success.
X ~ Geo(p): P(X=r) = (1−p)^(r−1) p for r = 1, 2, 3, ...
E(X) = 1/p    Var(X) = (1−p)/p²

📐 Worked Example 8 — Geometric Distribution

A biased coin has P(head) = 0.3. The coin is tossed until a head appears. Find: (a) P(X=4) (b) P(X≤3) (c) E(X) and Var(X)

1
(a) P(X=4) = (0.7)³×(0.3) = 0.343×0.3 = 0.1029
2
(b) P(X≤3) = P(X=1)+P(X=2)+P(X=3)
= 0.3 + 0.7×0.3 + 0.7²×0.3 = 0.3+0.21+0.147 = 0.657
Alternative: P(X≤3) = 1−P(X>3) = 1−(0.7)³ = 1−0.343 = 0.657 ✓
3
(c) E(X) = 1/0.3 = 10/3 ≈ 3.33
Var(X) = 0.7/0.09 = 70/9 ≈ 7.78
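
A short sketch of these formulas in Python, with hypothetical function names. It uses the closed form P(X≤r) = 1 − (1−p)^r and confirms it against the term-by-term sum.

```python
def geometric_pmf(r, p):
    """P(X = r): r − 1 failures followed by one success."""
    return (1 - p) ** (r - 1) * p

def geometric_cdf(r, p):
    """P(X <= r): one minus the probability that the first r trials all fail."""
    return 1 - (1 - p) ** r

p = 0.3
p_x4 = geometric_pmf(4, p)
p_le3 = geometric_cdf(3, p)
p_le3_summed = sum(geometric_pmf(r, p) for r in range(1, 4))
expected_trials = 1 / p
variance = (1 - p) / p ** 2

print(round(p_x4, 4), round(p_le3, 3), round(expected_trials, 2), round(variance, 2))
```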

5. Binomial Distribution S1

Binomial Distribution: X ~ B(n, p) where n = number of trials, p = probability of success in each trial. Conditions: fixed number of trials, independent trials, constant probability, only two outcomes (success/failure).

Binomial Distribution Formulae

P(X=r) = ⁿCᵣ pʳ (1−p)ⁿ⁻ʳ    for r = 0, 1, 2, ..., n
E(X) = np    Var(X) = np(1−p) = npq    where q = 1−p
P(X≤r) = Σ ⁿCₖ pᵏ qⁿ⁻ᵏ for k=0 to r    (use tables or calculator)

📐 Worked Example 9 — Binomial Probabilities

X~B(10, 0.35). Find: (a) P(X=4) (b) P(X≥3) (c) E(X) and Var(X) (d) P(X=k) is maximum at k=?

1
(a) P(X=4) = ¹⁰C₄×(0.35)⁴×(0.65)⁶
= 210×0.01501×0.07542 ≈ 0.2377
2
(b) P(X≥3) = 1−P(X≤2) = 1−[P(0)+P(1)+P(2)]
P(0)=(0.65)¹⁰≈0.01346; P(1)=10×0.35×(0.65)⁹≈0.07249; P(2)=45×(0.35)²×(0.65)⁸≈0.17565
P(X≤2)≈0.2616. P(X≥3)≈0.7384
3
(c) E(X)=10×0.35=3.5. Var(X)=10×0.35×0.65=2.275
4
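(d) Modal value: find k such that P(X=k) ≥ P(X=k−1) and P(X=k) ≥ P(X=k+1). When (n+1)p is not an integer, k = floor((n+1)p); when it is an integer, k = (n+1)p and k − 1 are joint modes.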
(n+1)p=11×0.35=3.85 → floor=3. Check: P(3)≈0.2522>P(4)≈0.2377 ✓
Modal value = 3
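
For checking answers like these, the binomial probabilities can be tabulated directly. This is a minimal sketch with an illustrative function name; finding the mode by scanning the whole table avoids relying on the floor((n+1)p) rule.

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

n, p = 10, 0.35
pmf = [binomial_pmf(r, n, p) for r in range(n + 1)]

p_x4 = pmf[4]
p_ge3 = 1 - sum(pmf[:3])   # complement of P(X <= 2)
mode = max(range(n + 1), key=lambda r: pmf[r])

print(round(p_x4, 4), round(p_ge3, 4), mode)
```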

Normal Approximation to Binomial

When n is large and p is not too close to 0 or 1 (rule of thumb: np>5 and nq>5, where q = 1−p), the binomial B(n,p) can be approximated by N(np, npq). Apply a continuity correction: P(X≤k) becomes P(Y≤k+0.5) and P(X≥k) becomes P(Y≥k−0.5), where Y~N(np, npq).
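
To see how close the approximation is, the sketch below compares an exact binomial probability with its normal approximation. The parameters n = 50, p = 0.4 and k = 18 are hypothetical values chosen for illustration (they are not from the lesson); `NormalDist` is from Python's standard `statistics` module (3.8+).

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, r) * p ** r * (1 - p) ** (n - r) for r in range(k + 1))

def normal_approx_cdf(k, n, p):
    """Approximate P(X <= k) by P(Y <= k + 0.5) with Y ~ N(np, npq)."""
    q = 1 - p
    y = NormalDist(mu=n * p, sigma=sqrt(n * p * q))
    return y.cdf(k + 0.5)   # + 0.5 is the continuity correction

# Hypothetical parameters: np = 20 and nq = 30 both exceed 5.
n, p, k = 50, 0.4, 18
exact = binomial_cdf(k, n, p)
approx = normal_approx_cdf(k, n, p)
print(f"exact {exact:.4f}, normal approximation {approx:.4f}")
```

Note that `NormalDist` takes the standard deviation, not the variance, so √(npq) is passed as `sigma`.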

6. Poisson Distribution S1

Poisson Distribution: X ~ Po(λ) models the number of events occurring in a fixed interval of time or space, where events occur randomly, independently, at a constant average rate λ.

Poisson Distribution

P(X=r) = e^(−λ) λʳ / r!    for r = 0, 1, 2, ...
E(X) = Var(X) = λ    (mean equals variance — key property)
If X~Po(λ₁) and Y~Po(λ₂) are independent: X+Y~Po(λ₁+λ₂)
Poisson approximation to Binomial: when n large, p small, np=λ moderate

📐 Worked Example 10 — Poisson Distribution

Calls arrive at a helpdesk at a rate of 4 per hour. Find: (a) P(exactly 3 calls in an hour) (b) P(at most 2 calls in 30 minutes) (c) P(at least 1 call in 15 minutes)

1
(a) λ=4. P(X=3) = e⁻⁴×4³/3! = e⁻⁴×64/6 ≈ 0.01832×10.667 ≈ 0.1954
2
(b) 30 min → λ=2. P(X≤2) = P(0)+P(1)+P(2)
= e⁻²(1 + 2 + 2²/2!) = e⁻²(1+2+2) = 5e⁻² ≈ 5×0.1353 = 0.6767
3
(c) 15 min → λ=1. P(X≥1) = 1−P(X=0) = 1−e⁻¹ ≈ 1−0.3679 = 0.6321
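
The key step in each part is rescaling λ to the length of the interval. A minimal sketch, with illustrative function names:

```python
from math import exp, factorial

def poisson_pmf(r, lam):
    """P(X = r) for X ~ Po(lam)."""
    return exp(-lam) * lam ** r / factorial(r)

def poisson_cdf(r, lam):
    """P(X <= r), summing the individual probabilities."""
    return sum(poisson_pmf(k, lam) for k in range(r + 1))

rate_per_hour = 4

p_a = poisson_pmf(3, rate_per_hour * 1)          # exactly 3 calls in 1 hour
p_b = poisson_cdf(2, rate_per_hour * 0.5)        # at most 2 calls in 30 minutes
p_c = 1 - poisson_pmf(0, rate_per_hour * 0.25)   # at least 1 call in 15 minutes

print(round(p_a, 4), round(p_b, 4), round(p_c, 4))
```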

Choosing the Right Distribution:
Binomial B(n,p): Fixed n trials, constant p, independent, count successes.
Geometric Geo(p): Count trials until first success, independent, constant p.
Poisson Po(λ): Count events in interval, random/independent, constant rate λ, E(X)=Var(X)=λ.
Normal N(μ,σ²): Continuous data, symmetric bell curve (Lesson 9).

📝 Exam Practice Questions

Q1 [4 marks] — Data: 12, 15, 18, 18, 20, 22, 25, 28, 30, 35. Find the median, quartiles, IQR, and identify any outliers using the 1.5×IQR rule.

n=10. Median = (20+22)/2 = 21
Q₁ = median of lower half {12,15,18,18,20} = 18
Q₃ = median of upper half {22,25,28,30,35} = 28
IQR = 28−18 = 10
Lower fence: 18−15=3. Upper fence: 28+15=43. No values outside [3,43].
No outliers.
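
The method used in this solution (quartiles as medians of the lower and upper halves) can be sketched as follows. The function name is hypothetical. Library routines such as `statistics.quantiles` use different interpolation conventions and may give slightly different quartiles, so this sketch implements the halves method explicitly.

```python
from statistics import median

def quartiles_by_halves(data):
    """Q1, median and Q3, taking quartiles as medians of the two halves."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + (n % 2):]   # skip the middle value when n is odd
    return median(lower), median(ordered), median(upper)

data = [12, 15, 18, 18, 20, 22, 25, 28, 30, 35]
q1, q2, q3 = quartiles_by_halves(data)
iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]

print(q1, q2, q3, iqr, (low_fence, high_fence), outliers)
```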

Q2 [3 marks] — For 50 values: Σx=480, Σx²=5200. Find the mean, variance, and standard deviation.

x̄ = 480/50 = 9.6
σ² = 5200/50 − 9.6² = 104 − 92.16 = 11.84
σ = √11.84 ≈ 3.44

Q3 [4 marks] — Events A and B satisfy P(A)=0.4, P(B)=0.5, P(A∪B)=0.7. (a) Find P(A∩B). (b) Find P(A|B). (c) Are A and B independent? (d) Are A and B mutually exclusive?

(a) P(A∩B)=P(A)+P(B)−P(A∪B)=0.4+0.5−0.7=0.2
(b) P(A|B)=P(A∩B)/P(B)=0.2/0.5=0.4
(c) P(A|B)=0.4=P(A) ✓ → A and B are independent
(Also: P(A∩B)=0.2=P(A)×P(B)=0.4×0.5=0.2 ✓)
(d) P(A∩B)=0.2≠0 → not mutually exclusive
Exam Tip: Independence means P(A∩B)=P(A)P(B). Mutually exclusive means P(A∩B)=0. Two events with positive probability CANNOT be both independent and mutually exclusive simultaneously.

Q4 [4 marks] — A DRV X has the distribution: P(X=1)=0.1, P(X=2)=0.3, P(X=3)=k, P(X=4)=0.2, P(X=5)=0.1. Find k, E(X), Var(X), and E(3X−2).

k=1−(0.1+0.3+0.2+0.1)=0.3
E(X)=1(0.1)+2(0.3)+3(0.3)+4(0.2)+5(0.1)=0.1+0.6+0.9+0.8+0.5=2.9
E(X²)=1(0.1)+4(0.3)+9(0.3)+16(0.2)+25(0.1)=0.1+1.2+2.7+3.2+2.5=9.7
Var(X)=9.7−2.9²=9.7−8.41=1.29
E(3X−2)=3E(X)−2=8.7−2=6.7

Q5 [3 marks] — X~B(15, 0.4). Find P(X=6), P(4≤X≤7), and the most likely value of X.

P(X=6)=¹⁵C₆×(0.4)⁶×(0.6)⁹=5005×0.004096×0.010078≈0.2066
P(4≤X≤7)=P(4)+P(5)+P(6)+P(7)
≈0.1268+0.1859+0.2066+0.1771≈0.6964
Modal value: (n+1)p=16×0.4=6.4 → floor=6. Mode=6

Q6 [4 marks] — Flaws in a fabric occur at a rate of 2 per metre. (a) State the distribution of X = number of flaws in 3 metres. (b) Find P(X=5). (c) Find P(X>8). (d) In 0.5 m, find P(at least one flaw).

(a) X~Po(6) (λ=2×3=6)
(b) P(X=5)=e⁻⁶×6⁵/5!=e⁻⁶×7776/120≈0.002479×64.8≈0.1606
(c) P(X>8)=1−P(X≤8). Using cumulative Poisson tables or calculation:
P(X≤8)≈0.8472. P(X>8)≈0.1528
(d) 0.5m: λ=1. P(X≥1)=1−e⁻¹≈0.6321