Continuous Random Variables | 9231 Further Statistics P4

Probability Density Functions

A continuous random variable X takes values in an interval (or union of intervals). Unlike discrete variables, P(X = x) = 0 for any specific value. Instead, probability is defined over intervals through integration of the probability density function f(x).

📐 PDF — Two Fundamental Requirements

Non-negativity

f(x) ≥ 0 for all x. A PDF can never be negative.

Total area = 1

∫₋∞^∞ f(x) dx = 1. The total probability over all outcomes is 1.

Probability

P(a ≤ X ≤ b) = ∫ₐᵇ f(x) dx. Area under the PDF between a and b.

Point probability

P(X = a) = 0 for any specific value a. So P(a ≤ X ≤ b) = P(a < X < b) = P(a < X ≤ b) etc.

Cumulative Distribution Function (CDF)

Definition

F(x) = P(X ≤ x) = ∫₋∞ˣ f(t) dt
F(x) is the area under the PDF to the left of x. It is always non-decreasing, with F(x) → 0 as x → −∞ and F(x) → 1 as x → +∞.

PDF from CDF

f(x) = F'(x) = dF/dx
Differentiate the CDF to recover the PDF.

Interval probability from CDF

P(a ≤ X ≤ b) = F(b) − F(a)
Use CDF differences to avoid integrating again where possible.

Properties of the CDF

Property	Statement
Range	0 ≤ F(x) ≤ 1 for all x
Monotone	F is non-decreasing: x₁ < x₂ ⟹ F(x₁) ≤ F(x₂)
Limits	F(x) → 0 as x → −∞ and F(x) → 1 as x → +∞
Continuity	F is continuous for a CRV (unlike discrete RVs)
At boundaries	F(a) = 0 and F(b) = 1 at the endpoints of the support

💡 Cambridge Exam Pattern — PDF/CDF Questions

A common question gives f(x) = kx² (or similar) over [a,b] and asks you to find k using the total area = 1 condition.
Then asks you to find the CDF F(x) by integrating — always include the lower limit and write F(x) piecewise (0 for x < a, formula for a ≤ x ≤ b, 1 for x > b).
Then asks for a specific probability — use the CDF rather than integrating again.
Always state the support of the distribution (the range where f(x) > 0).

Finding k, CDF, and Probability

A CRV X has PDF f(x) = kx(2 − x) for 0 ≤ x ≤ 2, and f(x) = 0 otherwise.
(i) Find k. (ii) Find F(x). (iii) Find P(0.5 ≤ X ≤ 1.5).

Part (i)

∫₀² kx(2−x) dx = 1
k∫₀²(2x − x²) dx = k[x² − x³/3]₀² = k(4 − 8/3) = k · 4/3 = 1

k = 3/4

Part (ii)

For 0 ≤ x ≤ 2:
F(x) = ∫₀ˣ (3/4)t(2−t) dt = (3/4)[t² − t³/3]₀ˣ = (3/4)(x² − x³/3)

F(x) = 0 (x < 0); (3/4)(x² − x³/3) (0 ≤ x ≤ 2); 1 (x > 2)

Part (iii)

P(0.5 ≤ X ≤ 1.5) = F(1.5) − F(0.5)
F(1.5) = (3/4)(2.25 − 1.125) = (3/4)(1.125) = 0.84375
F(0.5) = (3/4)(0.25 − 0.125/3) = (3/4)(0.2083) = 0.15625

P = 0.84375 − 0.15625 = 0.6875 = 11/16
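The three parts above can be checked with exact rational arithmetic. The sketch below is illustrative only: the helper names (`integrate_poly`, `cdf`) and the choice to store the polynomial as a coefficient list are assumptions made for this check, not part of the exam method.

```python
from fractions import Fraction

def integrate_poly(coeffs, lo, hi):
    """Exact integral of sum(coeffs[n] * x**n) from lo to hi."""
    def antiderivative(x):
        return sum(Fraction(c) * Fraction(x) ** (n + 1) / (n + 1)
                   for n, c in enumerate(coeffs))
    return antiderivative(hi) - antiderivative(lo)

# x(2 - x) = 2x - x^2, stored as coefficients of x^0, x^1, x^2
shape = [0, 2, -1]

# Part (i): k is the reciprocal of the area under the unscaled curve
k = 1 / integrate_poly(shape, 0, 2)

def cdf(x):
    """F(x) for the worked example, written piecewise."""
    if x < 0:
        return Fraction(0)
    if x > 2:
        return Fraction(1)
    return k * integrate_poly(shape, 0, x)

# Part (iii): interval probability from a CDF difference
prob = cdf(Fraction(3, 2)) - cdf(Fraction(1, 2))

print(k, cdf(Fraction(1, 2)), cdf(Fraction(3, 2)), prob)
```

Working in fractions avoids rounding, so the printed values can be compared directly with k = 3/4 and P = 11/16.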

Median, Quartiles, and Percentiles

Median m

F(m) = 0.5 ⟺ ∫₋∞ᵐ f(x) dx = 0.5
Solve F(m) = 0.5 for m. For the lower quartile Q₁, solve F(Q₁) = 0.25; for the upper quartile Q₃, solve F(Q₃) = 0.75.
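When F(q) = p has no neat algebraic solution, a numerical search does the job. The following is a minimal sketch, assuming the CDF from the earlier example, F(x) = (3/4)(x² − x³/3) on [0, 2]; the function name `quantile` and the tolerance are choices made here for illustration.

```python
def cdf(x):
    """CDF of the earlier example: F(x) = (3/4)(x^2 - x^3/3) on [0, 2]."""
    if x < 0:
        return 0.0
    if x > 2:
        return 1.0
    return 0.75 * (x**2 - x**3 / 3)

def quantile(p, lo=0.0, hi=2.0, tol=1e-10):
    """Solve F(q) = p by bisection; relies on F being non-decreasing."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < p:
            lo = mid   # the root lies to the right of mid
        else:
            hi = mid   # the root lies at or to the left of mid
    return (lo + hi) / 2

q1, median, q3 = quantile(0.25), quantile(0.5), quantile(0.75)
print(round(q1, 4), round(median, 4), round(q3, 4))
```

Because this PDF is symmetric about x = 1, the median should come out as 1 and the two quartiles should be equidistant from it, which gives a quick sanity check on the search.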

Expectation & Variance

📐 Core Formulae — Expectation and Variance

Mean E(X)

E(X) = μ = ∫ x f(x) dx (over the support of X)

E(X²)

E(X²) = ∫ x² f(x) dx

Variance Var(X)

Var(X) = σ² = E(X²) − [E(X)]²
Also written Var(X) = E[(X−μ)²] = ∫(x−μ)² f(x) dx, but the computational form E(X²) − μ² is usually quicker to evaluate.

E[g(X)]

E[g(X)] = ∫ g(x) f(x) dx — replace x with g(x) in the integrand.

Linear Transformations

Expectation — linear

E(aX + b) = aE(X) + b
Multiplying by a scales the mean by a; adding b shifts it by b.

Variance — linear

Var(aX + b) = a²Var(X)
Adding b does not change the variance. Scaling by a multiplies the variance by a².

Sums of Independent Random Variables

E(X + Y)

E(X + Y) = E(X) + E(Y)
Always true; no independence required.

Var(X + Y), independent X, Y

Var(X + Y) = Var(X) + Var(Y)
Holds when X and Y are independent; without independence it can fail.

⛔ Common Error — Var(X − Y)

When X and Y are independent, Var(X − Y) = Var(X) + Var(Y): the variances add even though the variables are subtracted. This catches many students. Separately, Var(2X) is Var(aX) with a = 2, giving 4Var(X). This is NOT the same as Var(X₁ + X₂) for two independent copies X₁, X₂ of X, which equals 2Var(X).
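A quick simulation makes these variance rules concrete. This is only a sketch under stated assumptions: the samples are drawn from U(0, 1) (variance 1/12), the seed and sample size are arbitrary, and `variance` is a helper written here rather than a library routine.

```python
import random

def variance(values):
    """Population variance, computed directly from the definition."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

random.seed(0)  # fixed seed so the run is repeatable
n = 100_000

# Two independent samples from U(0, 1); each has variance 1/12
x = [random.random() for _ in range(n)]
y = [random.random() for _ in range(n)]

var_x = variance(x)
var_diff = variance([a - b for a, b in zip(x, y)])    # estimates Var(X - Y)
var_indep = variance([a + b for a, b in zip(x, y)])   # estimates Var(X + Y)
var_double = variance([2 * a for a in x])             # Var(2X): same X twice

print(f"Var(X)     {var_x:.4f}")
print(f"Var(X - Y) {var_diff:.4f}")
print(f"Var(X + Y) {var_indep:.4f}")
print(f"Var(2X)    {var_double:.4f}")
```

The estimates for Var(X − Y) and Var(X + Y) should both sit near 2/12, while Var(2X) should sit near 4/12, which is the distinction the warning above is drawing.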

Mean and Variance from PDF

X has PDF f(x) = (3/4)x(2−x) for 0 ≤ x ≤ 2 (from previous example). Find E(X), E(X²), and Var(X).

E(X)

E(X) = ∫₀² x · (3/4)x(2−x) dx = (3/4)∫₀²(2x²−x³) dx
= (3/4)[2x³/3 − x⁴/4]₀² = (3/4)(16/3 − 4) = (3/4)(4/3)

E(X) = 1

E(X²)

E(X²) = (3/4)∫₀²(2x³−x⁴) dx = (3/4)[x⁴/2 − x⁵/5]₀²
= (3/4)(8 − 32/5) = (3/4)(8/5)

E(X²) = 6/5

Var(X)

Var(X) = E(X²) − [E(X)]² = 6/5 − 1 = 1/5

Var(X) = 0.2, sd(X) = 1/√5 ≈ 0.447
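The same results can be cross-checked numerically using E[g(X)] = ∫ g(x) f(x) dx. The sketch below uses a simple midpoint rule; the function name `expect` and the step count are assumptions made for illustration, and any standard quadrature routine would serve equally well.

```python
def expect(g, pdf, lo, hi, steps=100_000):
    """Approximate E[g(X)] = integral of g(x) f(x) dx with the midpoint rule."""
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width   # midpoint of the i-th strip
        total += g(x) * pdf(x)
    return total * width

def pdf(x):
    return 0.75 * x * (2 - x)   # f(x) = (3/4) x (2 - x) on [0, 2]

mean = expect(lambda x: x, pdf, 0, 2)
mean_sq = expect(lambda x: x**2, pdf, 0, 2)
var = mean_sq - mean**2          # computational form of the variance

print(round(mean, 6), round(mean_sq, 6), round(var, 6))
```

Passing a different `g` gives other expectations of the same variable without changing the integration code.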

Mode and Median vs Mean

Measure	Definition	How to Find
Mean μ	∫ x f(x) dx	Integrate x·f(x) over support
Median m	F(m) = 0.5	Solve CDF = ½ for m
Mode	x where f(x) is maximum	Differentiate f(x), set f'(x)=0
Quartile Q₁	F(Q₁) = 0.25	Solve CDF = ¼
Quartile Q₃	F(Q₃) = 0.75	Solve CDF = ¾

Key Continuous Distributions

Cambridge 9231 P4 requires detailed knowledge of four continuous distributions. You must be able to state the PDF, CDF, mean, and variance for each — and recognise which applies from context.

Uniform Distribution

X ~ U(a, b)

f(x) = 1/(b−a) for a ≤ x ≤ b
f(x) = 0 otherwise
F(x) = (x−a)/(b−a) for a ≤ x ≤ b

Mean: (a+b)/2
Var: (b−a)²/12

Exponential Distribution

X ~ Exp(λ)

f(x) = λe^(−λx) for x ≥ 0
f(x) = 0 for x < 0
F(x) = 1 − e^(−λx) for x ≥ 0

Mean: 1/λ
Var: 1/λ²

Normal Distribution

X ~ N(μ, σ²)

f(x) = (1/(σ√(2π))) e^(−(x−μ)²/(2σ²))
No closed-form CDF
Standardise: Z = (X−μ)/σ ~ N(0,1)

Mean: μ
Var: σ²

Beta Distribution

X ~ Beta(α, β)

f(x) = x^(α−1)(1−x)^(β−1)/B(α,β)
Support: 0 ≤ x ≤ 1
B(α,β) = Γ(α)Γ(β)/Γ(α+β)

Mean: α/(α+β)
Var: αβ/[(α+β)²(α+β+1)]

The Exponential Distribution in Depth

The exponential distribution models the waiting time between events in a Poisson process with rate λ. It has the key memoryless property:

Memoryless property

P(X > s + t | X > s) = P(X > t) for all s, t ≥ 0
Given that you have already waited s units of time, the remaining wait is still distributed as Exp(λ). Among continuous distributions only the exponential has this property; the geometric distribution is its discrete counterpart.
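The memoryless property follows from the conditional probability formula, since P(X > s + t | X > s) = P(X > s + t)/P(X > s). A minimal numerical check is sketched below; the values of `rate`, `s` and `t` are arbitrary illustrations, not taken from the text.

```python
import math

def survival(t, rate):
    """P(X > t) for X ~ Exp(rate), i.e. 1 - F(t) = exp(-rate * t)."""
    return math.exp(-rate * t) if t >= 0 else 1.0

rate, s, t = 0.5, 2.0, 3.0   # illustrative values only

# P(X > s + t | X > s) = P(X > s + t) / P(X > s)
conditional = survival(s + t, rate) / survival(s, rate)
unconditional = survival(t, rate)

print(conditional, unconditional)
```

The two printed numbers agree because e^(−λ(s+t)) / e^(−λs) = e^(−λt), whatever the value of s.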

📌 Exponential–Poisson Link

If events occur as a Poisson process at rate λ per unit time, then the waiting time between consecutive events follows Exp(λ). This link is often tested in P4 — you may need to switch between Poisson (discrete, counting events) and Exponential (continuous, modelling time).
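The link can be seen by computing one probability two ways: "no events in a window of length t" under the Poisson model is the same event as "the first wait exceeds t" under the exponential model. The sketch below assumes a hypothetical rate of 3 events per unit time and a window of 0.4; both helper functions are written here for illustration.

```python
import math

def poisson_pmf(k, mean):
    """P(N = k) for N ~ Po(mean)."""
    return math.exp(-mean) * mean**k / math.factorial(k)

def exp_cdf(t, rate):
    """F(t) = 1 - exp(-rate * t) for T ~ Exp(rate), t >= 0."""
    return 1 - math.exp(-rate * t)

rate, t = 3.0, 0.4   # hypothetical rate and window length

# Counting view: the number of events in (0, t] is Po(rate * t)
p_no_events = poisson_pmf(0, rate * t)

# Waiting-time view: the first event arrives after time t
p_wait_longer = 1 - exp_cdf(t, rate)

print(p_no_events, p_wait_longer)
```

Note that the Poisson mean scales with the window length (rate × t), whereas the exponential parameter stays at the rate itself.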

The Normal Distribution — Standardisation

Standardisation — Z-score

If X ~ N(μ, σ²), then Z = (X − μ)/σ ~ N(0, 1)
Use the standard normal tables (Φ) to find probabilities. Cambridge provides Table 1 (Φ values) and Table 2 (critical values) in the MF19 formula booklet.

Symmetry of N(0,1)

P(Z < −z) = 1 − P(Z < z) = 1 − Φ(z)

Interval probability

P(a < X < b) = Φ((b−μ)/σ) − Φ((a−μ)/σ)
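In an exam Φ comes from the tables, but the standardisation steps are easy to mirror in code. The sketch below writes Φ in terms of the error function and uses a hypothetical example, X ~ N(50, 4²) with the interval (46, 58); these numbers are assumptions chosen for illustration.

```python
import math

def phi(z):
    """Standard normal CDF, written via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def normal_interval(a, b, mu, sigma):
    """P(a < X < b) for X ~ N(mu, sigma^2) by standardising both ends."""
    z_low = (a - mu) / sigma
    z_high = (b - mu) / sigma
    return phi(z_high) - phi(z_low)

# Hypothetical example: X ~ N(50, 4^2), find P(46 < X < 58)
p = normal_interval(46, 58, mu=50, sigma=4)
print(round(p, 4))   # Phi(2) - Phi(-1)
```

Here the endpoints standardise to z = −1 and z = 2, so the symmetry rule Φ(−1) = 1 − Φ(1) is what a table lookup would use for the lower end.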

Linear Combinations of Normal Variables

If X ~ N(μ₁, σ₁²) and Y ~ N(μ₂, σ₂²) are independent:

aX + bY

aX + bY ~ N(aμ₁ + bμ₂, a²σ₁² + b²σ₂²)
Any linear combination of independent normal variables, including sums and differences, is also normal. This is a P4 staple, used in inference and hypothesis testing.
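As a sketch of how the rule is applied, the block below combines two independent normals using Python's `statistics.NormalDist`. The distributions X ~ N(10, 4) and Y ~ N(6, 9), the combination 2X − Y, and the helper name `combine` are all hypothetical choices for illustration. Note that `NormalDist` takes the standard deviation, not the variance.

```python
from statistics import NormalDist

def combine(a, x, b, y):
    """Distribution of aX + bY for independent normal X and Y."""
    mean = a * x.mean + b * y.mean
    variance = a**2 * x.variance + b**2 * y.variance   # coefficients are squared
    return NormalDist(mean, variance ** 0.5)

# Hypothetical: X ~ N(10, 4) and Y ~ N(6, 9), given as (mean, sd)
x = NormalDist(10, 2)
y = NormalDist(6, 3)

w = combine(2, x, -1, y)   # W = 2X - Y
p = 1 - w.cdf(15)          # P(2X - Y > 15)

print(w.mean, w.variance, round(p, 4))
```

The variance of W is 4(4) + 1(9) = 25 even though Y is subtracted, because the coefficient −1 is squared.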

Functions of a Random Variable

If Y = g(X) where X is a CRV, we need the distribution of Y. There are two standard methods in Cambridge 9231.

Method 1 — CDF Method (Always Works)

Steps — CDF Method for Y = g(X)

Step 1

Find the CDF of Y: F_Y(y) = P(Y ≤ y) = P(g(X) ≤ y).
Rearrange the inequality to express in terms of X.

Step 2

Write this probability as an integral of f_X(x), or evaluate it directly from the known CDF of X.

Step 3

Differentiate F_Y(y) to get the PDF f_Y(y) = F_Y'(y). State the support of Y.

CDF Method — Y = X²

X ~ U(0, 2). Find the PDF of Y = X².

CDF of X

X ~ U(0,2): f_X(x) = 1/2 for 0 ≤ x ≤ 2, CDF F_X(x) = x/2.

Support of Y

X ∈ [0,2] ⟹ Y = X² ∈ [0,4].

CDF of Y

F_Y(y) = P(Y ≤ y) = P(X² ≤ y) = P(X ≤ √y) = F_X(√y) = √y/2
(valid for 0 ≤ y ≤ 4; the step X² ≤ y ⟺ X ≤ √y uses X ≥ 0)

PDF of Y

f_Y(y) = d/dy [√y/2] = 1/(4√y)

f_Y(y) = 1/(4√y) for 0 < y ≤ 4, 0 otherwise

Verify

∫₀⁴ 1/(4√y) dy = [√y/2]₀⁴ = 2/2 = 1 ✓
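A simulation offers an independent check on the derived CDF F_Y(y) = √y/2. The sketch below is illustrative: the seed, sample size and test points are arbitrary, and agreement is only expected to within sampling error.

```python
import random

random.seed(1)   # fixed seed so the run is repeatable
n = 100_000

# Sample X ~ U(0, 2) and transform each value to Y = X^2
ys = [random.uniform(0, 2) ** 2 for _ in range(n)]

def cdf_y(y):
    """Derived CDF of Y: sqrt(y)/2 on [0, 4], written piecewise."""
    if y < 0:
        return 0.0
    if y > 4:
        return 1.0
    return y ** 0.5 / 2

# Compare the proportion of simulated values <= y with the formula
for y in (0.25, 1.0, 2.25, 3.5):
    empirical = sum(v <= y for v in ys) / n
    print(y, round(empirical, 3), round(cdf_y(y), 3))
```

The large share of values below y = 1 (about half) reflects the 1/(4√y) density piling up near zero.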

Method 2 — Change of Variable Formula (Monotone g)

For strictly monotone g with inverse x = h(y)

f_Y(y) = f_X(h(y)) · |h'(y)|
The Jacobian factor |h'(y)| = |dx/dy| corrects for the change of scale. The absolute value means the same formula covers both increasing and decreasing g.

Change of Variable — Y = e^X

X ~ U(0,1). Find the PDF of Y = e^X.

Inverse

y = eˣ ⟹ x = h(y) = ln y. Support: y ∈ [e⁰, e¹] = [1, e].

Jacobian

|h'(y)| = |d/dy (ln y)| = 1/y

PDF of Y

f_Y(y) = f_X(ln y) · (1/y) = 1 · (1/y) = 1/y

f_Y(y) = 1/y for 1 ≤ y ≤ e, 0 otherwise
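The change of variable formula translates directly into a small function builder. This is a sketch, not a general-purpose tool: `transformed_pdf` is a name invented here, it assumes g is strictly monotone, and the caller must supply the inverse h and its derivative by hand.

```python
import math

def transformed_pdf(pdf_x, inverse, inverse_derivative):
    """Build f_Y(y) = f_X(h(y)) * |h'(y)| for a strictly monotone g."""
    def pdf_y(y):
        return pdf_x(inverse(y)) * abs(inverse_derivative(y))
    return pdf_y

def pdf_x(x):
    return 1.0 if 0 <= x <= 1 else 0.0   # X ~ U(0, 1)

# Y = e^X, so h(y) = ln y and h'(y) = 1/y (only defined for y > 0)
pdf_y = transformed_pdf(pdf_x, math.log, lambda y: 1 / y)

# Midpoint-rule check that f_Y integrates to 1 over its support [1, e]
steps = 100_000
width = (math.e - 1) / steps
area = sum(pdf_y(1 + (i + 0.5) * width) for i in range(steps)) * width

print(pdf_y(2.0), pdf_y(3.0), round(area, 6))
```

Evaluating at y = 3 returns zero because ln 3 > 1 lies outside the support of X, which is how the support of Y emerges from the formula.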

Worked Examples

1

Piecewise PDF — Full Treatment

X has PDF f(x) = ax for 0 ≤ x ≤ 1, f(x) = a(2−x) for 1 < x ≤ 2, and 0 otherwise.
(i) Find a. (ii) Find F(x). (iii) Find the median. (iv) Find E(X) and Var(X).

Part (i) — a

∫₀¹ ax dx + ∫₁² a(2−x) dx = 1
a[x²/2]₀¹ + a[2x − x²/2]₁² = a(1/2) + a[(4 − 2) − (2 − 1/2)] = a/2 + a/2 = a = 1

a = 1 (triangular distribution on [0,2])

Part (ii) — F(x)

For 0 ≤ x ≤ 1: F(x) = ∫₀ˣ t dt = x²/2
For 1 < x ≤ 2: F(x) = F(1) + ∫₁ˣ (2−t) dt = 1/2 + [2t−t²/2]₁ˣ = 1/2 + (2x−x²/2) − (2−1/2)
= 1/2 + 2x − x²/2 − 3/2 = 2x − x²/2 − 1

F(x) = 0 (x < 0); x²/2 (0 ≤ x ≤ 1); 2x − x²/2 − 1 (1 < x ≤ 2); 1 (x > 2)

Part (iii) — Median

By symmetry of the triangular distribution, median = 1. Verify: F(1) = 1/2 ✓

Part (iv) — E(X) and Var(X)

By symmetry about x=1: E(X) = 1.
E(X²) = ∫₀¹ x²·x dx + ∫₁² x²(2−x) dx = [x⁴/4]₀¹ + [2x³/3−x⁴/4]₁²
= 1/4 + (16/3−4) − (2/3−1/4) = 1/4 + 4/3 − 5/12 = 7/6
Var(X) = 7/6 − 1 = 1/6 ≈ 0.167

E(X) = 1, Var(X) = 1/6

2

Exponential Distribution — Poisson Link

Emails arrive at a rate of 5 per hour following a Poisson process. Let T be the waiting time (in hours) until the next email.
(i) State the distribution of T. (ii) Find P(T > 0.3). (iii) Given T > 0.1, find P(T > 0.4).

Part (i)

T ~ Exp(5) [rate λ=5 per hour, mean=1/5 hour=12 minutes]

Part (ii)

P(T > 0.3) = 1 − F(0.3) = e^(−5×0.3) = e^(−1.5)

= 0.2231

Part (iii) — Memoryless

P(T > 0.4 | T > 0.1) = P(T > 0.4 − 0.1) = P(T > 0.3) = e^(−1.5)

= 0.2231 [same answer — memoryless property]

Practice Questions

Question 1 — PDF, CDF, Median

[7 marks]

A CRV X has PDF f(x) = kx²(3−x) for 0 ≤ x ≤ 3, 0 otherwise.
(i) Show k = 4/27. (ii) Find F(x) for 0 ≤ x ≤ 3. (iii) Find the median, correct to 3 d.p.

For (i): ∫₀³ kx²(3−x) dx = k[x³ − x⁴/4]₀³ = k(27−81/4) = 27k/4 = 1. For (iii): solve F(m) = 0.5 numerically.

✓ Solution

(i) ∫₀³ kx²(3−x) dx = k∫₀³(3x²−x³)dx = k[x³−x⁴/4]₀³ = k(27−81/4) = 27k/4 = 1 ⟹ k = 4/27 ✓

(ii) F(x) = (4/27)∫₀ˣ t²(3−t) dt = (4/27)[t³−t⁴/4]₀ˣ = (4/27)(x³ − x⁴/4)

F(x) = (4x³/27)(1 − x/4) = 4x³/27 − x⁴/27 for 0 ≤ x ≤ 3

(iii) Solve F(m) = 0.5: 4m³/27 − m⁴/27 = 0.5, i.e. 4m³ − m⁴ = 13.5
Try m=2: F(2) = 32/27 − 16/27 = 16/27 ≈ 0.593 (too big)
Try m=1.8: F(1.8) = (23.328 − 10.498)/27 ≈ 0.475 (too small)
Try m=1.9: F(1.9) = (27.436 − 13.032)/27 ≈ 0.533 (too big)
The median lies between 1.8 and 1.9; bisecting further gives

m ≈ 1.843

Question 2 — Expectation and Variance

[5 marks]

X ~ U(2, 8). Without integration, state E(X) and Var(X). Find P(|X − E(X)| < 1).

For U(a,b): E(X) = (a+b)/2, Var(X) = (b−a)²/12. For the probability, note |X−5| < 1 means 4 < X < 6, which is an interval of length 2 out of the total range 6.

✓ Solution

E(X) = (2+8)/2 = 5
Var(X) = (8−2)²/12 = 36/12 = 3

P(|X−5| < 1) = P(4 < X < 6) = (6−4)/(8−2) = 2/6 = 1/3

Question 3 — Exponential Distribution

[5 marks]

The lifetime T (years) of a component follows Exp(0.2).
(i) Find the mean and standard deviation of T.
(ii) Find P(T > 8).
(iii) Find the value of t such that P(T < t) = 0.9.

For Exp(λ): mean = 1/λ, var = 1/λ². For (iii): F(t) = 1−e^(−λt) = 0.9 ⟹ e^(−0.2t) = 0.1 ⟹ t = −ln(0.1)/0.2.

✓ Solution

(i) Mean = 1/0.2 = 5 years. Var = 1/0.04 = 25. SD = 5 years.

(ii) P(T > 8) = e^(−0.2×8) = e^(−1.6) ≈ 0.2019

(iii) 1−e^(−0.2t) = 0.9 ⟹ e^(−0.2t) = 0.1 ⟹ t = ln(10)/0.2 ≈ 11.51 years
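Part (iii) is an instance of inverting the exponential CDF, which gives a general percentile formula t = −ln(1 − p)/λ. The sketch below wraps that in a small function; the name `exp_quantile` is an assumption for illustration, and the median line is an extra not asked for in the question.

```python
import math

def exp_quantile(p, rate):
    """Invert F(t) = 1 - exp(-rate * t) to get t = -ln(1 - p) / rate."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return -math.log(1 - p) / rate

rate = 0.2                          # T ~ Exp(0.2), lifetime in years

t90 = exp_quantile(0.9, rate)       # part (iii): 90th percentile
p_gt_8 = math.exp(-rate * 8)        # part (ii): survival beyond 8 years
median = exp_quantile(0.5, rate)    # equals ln(2) / rate

print(round(t90, 2), round(p_gt_8, 4), round(median, 3))
```

The median (about 3.47 years) falls well below the mean of 5 years, a reminder that the exponential distribution is strongly right-skewed.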

Question 4 — Function of a CRV

[6 marks]

X has PDF f(x) = 2x for 0 ≤ x ≤ 1. Find the PDF of Y = 1 − X².

Use the CDF method. For 0 ≤ x ≤ 1, Y = 1−X² ∈ [0,1]. F_Y(y) = P(Y ≤ y) = P(1−X² ≤ y) = P(X² ≥ 1−y) = P(X ≥ √(1−y)) = 1−F_X(√(1−y)).

✓ Solution

X ∈ [0,1] ⟹ Y = 1−X² ∈ [0,1] (Y decreases as X increases).
F_X(x) = x² (integrate f(x)=2x).

F_Y(y) = P(1−X² ≤ y) = P(X² ≥ 1−y) = P(X ≥ √(1−y)) = 1 − F_X(√(1−y)) = 1 − (1−y) = y

f_Y(y) = d/dy [y] = 1

Y ~ U(0,1)

(Interesting result, related to the probability integral transform: if X has a continuous CDF F, then F(X) ~ U(0,1). Here Y = 1 − F_X(X), and 1 − U is U(0,1) whenever U is.)

Formula Sheet — Continuous Random Variables

PDF Requirements

f(x) ≥ 0	Non-negativity
∫ f(x) dx = 1	Total area = 1
P(a ≤ X ≤ b) = ∫ₐᵇ f(x) dx
P(X = a) = 0	Always for a CRV

CDF

F(x) = P(X ≤ x)	= ∫₋∞ˣ f(t) dt
f(x) = F'(x)	Differentiate CDF
P(a ≤ X ≤ b)	= F(b) − F(a)
0 ≤ F(x) ≤ 1	Monotone non-decreasing

Expectation & Variance

E(X) = ∫ x f(x) dx
Var(X) = E(X²) − [E(X)]²	Computational form
E(aX + b)	= aE(X) + b
Var(aX + b)	= a²Var(X)
Var(X − Y), independent	= Var(X) + Var(Y)

Uniform U(a,b)

f(x) = 1/(b−a)	a ≤ x ≤ b
F(x) = (x−a)/(b−a)
Mean	(a+b)/2
Variance	(b−a)²/12

Exponential Exp(λ)

f(x) = λe^(−λx)	x ≥ 0
F(x) = 1 − e^(−λx)
Mean = 1/λ	Var = 1/λ²
Memoryless	P(X > s+t | X > s) = P(X > t)

Functions of X — CDF Method

F_Y(y) = P(g(X) ≤ y)	Rearrange
f_Y(y) = F_Y'(y)	Differentiate
Jacobian method	f_Y(y) = f_X(h(y))|h'(y)|
Check support of Y	Always state it

📋 Cambridge Exam Strategy — CRV

Finding k: Integrate f(x) over the stated support, set equal to 1, solve. Never integrate over the whole real line if the support is finite.
Writing F(x): Always give F(x) piecewise — F(x) = 0 below support, formula in between, F(x) = 1 above support. Cambridge deducts marks for missing the boundary cases.
Mode: Differentiate f(x) and set f'(x) = 0. Check it is a maximum (second derivative or sign check), and compare with the endpoints of the support, where the maximum can also occur. The mode is where the PDF is highest, not where F = 0.5.
Var(X−Y): This equals Var(X) + Var(Y) for independent X,Y. Never Var(X) − Var(Y).
Functions of X: Use the CDF method when in doubt, since it works for any g. The Jacobian method is quicker but applies only to strictly monotone transformations.
Exponential memoryless: P(T > s+t | T > s) = P(T > t). State this explicitly — Cambridge awards a mark for invoking the memoryless property by name.