CI - Paired Samples

2-sample problems - inference on parameters involving two populations

Population 1: 𝑋 ∼ f_𝑋(𝑥), 𝐄(𝑋) = 𝜇_𝑋
Population 2: 𝑌 ∼ f_𝑌(𝑦), 𝐄(𝑌) = 𝜇_𝑌

CI with Paired Samples - both the 𝑋 and 𝑌 samples come from the SAME subjects

the 𝑋 and 𝑌 samples have the SAME size 𝑛

Subject #	(𝑋, 𝑌)	𝐷 = 𝑋 - 𝑌
1	(𝑋₁, 𝑌₁)	𝐷₁ = 𝑋₁ - 𝑌₁
2	(𝑋₂, 𝑌₂)	𝐷₂ = 𝑋₂ - 𝑌₂
…	…	…
n	(𝑋_n, 𝑌_n)	𝐷_n = 𝑋_n - 𝑌_n
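
a minimal sketch of the table above in Python: each subject contributes one (𝑋, 𝑌) pair, and the difference is taken within the pair. The numbers in `pairs` are hypothetical and only illustrate the bookkeeping.

```python
# Hypothetical paired measurements: one (x, y) pair per subject.
pairs = [(12.1, 11.4), (9.8, 10.2), (11.5, 10.9), (10.7, 10.1), (13.0, 12.2)]

# D_i = X_i - Y_i, computed within each subject.
differences = [x - y for x, y in pairs]

# Print the same three columns as the table: subject, (X, Y), D.
for subject, ((x, y), d) in enumerate(zip(pairs, differences), start=1):
    print(f"{subject}\t({x}, {y})\t{d:+.1f}")
```

after this step the two samples are reduced to a single sample of differences, which is what the rest of the note works with.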

CI - General Formula

CI Definition

An interval [𝐴, 𝐵] is a (1 − 𝛼)100% confidence interval for the parameter 𝜃 if it contains the parameter with probability (1 − 𝛼):

𝐏{𝐴 ≤ 𝜃 ≤ 𝐵} = 1 − 𝛼

where:

𝛼 - significance level

(1 − 𝛼) - confidence level or coverage probability

CI Formula Intuition

Given a sample of data and a desired confidence level (1 − 𝛼), how can we construct a confidence interval [𝐴, 𝐵] that will satisfy the coverage condition

𝐏{𝐴 ≤ 𝜃 ≤ 𝐵} = 1 − 𝛼

first we need to estimate 𝜃

choose an unbiased estimator whose sampling distribution is (at least approximately) normal (e.g. an MLE when 𝑛 is large)

apply the estimator to the sample data to obtain the point estimate 𝜃ˆ

next we standardize 𝜃ˆ to get a standard normal variable 𝑧:

𝑧 = [𝜃ˆ - 𝐄(𝜃ˆ)] / 𝜎(𝜃ˆ)

since 𝜃ˆ was estimated with an unbiased estimator: 𝐄(𝜃ˆ) = 𝜃

𝑧 = (𝜃ˆ - 𝜃) / 𝜎(𝜃ˆ)

this variable 𝑧 falls between the standard normal quantiles 𝑞_{𝛼/2} and 𝑞_{1−𝛼/2}, denoted by

-𝑧_{𝛼/2} = 𝑞_{𝛼/2}

𝑧_{𝛼/2} = 𝑞_{1−𝛼/2}

with probability (1 − 𝛼), so:

𝐏{-𝑧_{𝛼/2} ≤ (𝜃ˆ - 𝜃) / 𝜎(𝜃ˆ) ≤ 𝑧_{𝛼/2}} = 1 − 𝛼

now rearrange for 𝜃:

𝐏{𝜃ˆ - 𝑧_{𝛼/2}·𝜎(𝜃ˆ) ≤ 𝜃 ≤ 𝜃ˆ + 𝑧_{𝛼/2}·𝜎(𝜃ˆ)} = 1 − 𝛼

we have obtained two numbers:

𝐴 = 𝜃ˆ - 𝑧_{𝛼/2}·𝜎(𝜃ˆ)

𝐵 = 𝜃ˆ + 𝑧_{𝛼/2}·𝜎(𝜃ˆ)

such that

𝐏{𝐴 ≤ 𝜃 ≤ 𝐵} = 1 − 𝛼
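
the derivation above can be checked with a small simulation. This is only a sketch under stated assumptions: the estimator is the sample mean of normal data with a known 𝜎, so 𝜎(𝜃ˆ) = 𝜎/√𝑛, and the names `z_interval` and `coverage` and all default values are made up for illustration.

```python
import random
from statistics import NormalDist, fmean


def z_interval(theta_hat, se, alpha=0.05):
    """Return (A, B) = theta_hat -/+ z_{alpha/2} * se."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2} = q_{1 - alpha/2}
    return theta_hat - z * se, theta_hat + z * se


def coverage(theta=5.0, sigma=2.0, n=25, alpha=0.05, trials=4000, seed=0):
    """Fraction of simulated intervals that contain the true mean theta."""
    rng = random.Random(seed)
    se = sigma / n ** 0.5  # known standard error of the sample mean
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(theta, sigma) for _ in range(n)]
        a, b = z_interval(fmean(sample), se, alpha)
        hits += a <= theta <= b  # True counts as 1
    return hits / trials


print(z_interval(10.0, 0.5))  # a 95% interval around a point estimate of 10
print(coverage())             # should land close to 1 - alpha = 0.95
```

the endpoints 𝐴 and 𝐵 are random (they depend on the sample) while 𝜃 is fixed, so the simulated fraction estimates 𝐏{𝐴 ≤ 𝜃 ≤ 𝐵}.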

CI Formulas

Large Sample Size (𝑛)	Normal Population	𝑆𝐸(𝜃ˆ) Known	Confidence Interval
FALSE	FALSE	EITHER	Bootstrap Method
FALSE	TRUE	FALSE	𝜃ˆ ± 𝑡_{𝛼/2}·𝑆𝐸ˆ(𝜃ˆ)
FALSE	TRUE	TRUE	𝜃ˆ ± 𝑧_{𝛼/2}·𝑆𝐸(𝜃ˆ)
TRUE	EITHER	FALSE	𝜃ˆ ± 𝑧_{𝛼/2}·𝑆𝐸ˆ(𝜃ˆ)
TRUE	EITHER	TRUE	𝜃ˆ ± 𝑧_{𝛼/2}·𝑆𝐸(𝜃ˆ)

where:

𝜃ˆ - point estimate/statistic or center of the interval

𝑧 - z-score, a type of confidence multiplier (critical value of the standard normal distribution)

𝑡 - t-score, a type of confidence multiplier (critical value of the t distribution)

𝑆𝐸(𝜃ˆ) or 𝜎(𝜃ˆ) or 𝑆𝑡𝑑(𝜃ˆ) - standard error of the point estimator/statistic

𝑆𝐸ˆ(𝜃ˆ) or 𝑠(𝜃ˆ) or 𝑆𝑡𝑑ˆ(𝜃ˆ) - estimated standard error of the point estimator/statistic

CI Annotated

CI Diagram

CI - Formula

100(1 - 𝛼)% CI for 𝜇_𝐷, assuming {𝐷₁, 𝐷₂, …, 𝐷_n} are independent and normal(𝜇_𝐷, 𝜎_𝐷²) distributed

from the sample differences {𝐷₁, 𝐷₂, …, 𝐷_n} we can compute the sample mean 𝐷̅ and the sample variance 𝑠_𝐷²

CI when 𝜎_𝐷 is KNOWN

𝐷̅ ± 𝑧*·(𝜎_𝐷/√𝑛)

CI when 𝜎_𝐷 is UNKNOWN

𝐷̅ ± 𝑡*·(𝑠_𝐷/√𝑛)

where:

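𝐷̅ is the sample mean of the differences (equal to 𝑋̅ - 𝑌̅)
𝑧* is the critical value 𝑧_{𝛼/2} of the standard normal distribution
𝑡* is the critical value 𝑡_{𝛼/2,𝑛-1} of the t distribution with 𝑛 - 1 degrees of freedom
𝜎_𝐷 is the population standard deviation of the differences
𝑠_𝐷 is the sample standard deviation of the differences
𝑛 is the number of pairs
(𝜎_𝐷/√𝑛) is the standard error of 𝐷̅
(𝑠_𝐷/√𝑛) is the estimated standard error of 𝐷̅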
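
both formulas can be sketched with the Python standard library. The function names and the data are hypothetical. The standard library has no t quantile, so in this sketch 𝑡* is passed in by the caller (for example from a t table); 𝑧* comes from `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def _differences(x, y):
    """Within-pair differences D_i = X_i - Y_i; both samples need the same size."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same size")
    return [xi - yi for xi, yi in zip(x, y)]


def paired_ci_known_sigma(x, y, sigma_d, alpha=0.05):
    """D-bar -/+ z* * sigma_D / sqrt(n), for a known sigma_D."""
    d = _differences(x, y)
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_star * sigma_d / sqrt(len(d))
    return mean(d) - margin, mean(d) + margin


def paired_ci_unknown_sigma(x, y, t_star):
    """D-bar -/+ t* * s_D / sqrt(n); t_star is supplied by the caller."""
    d = _differences(x, y)
    margin = t_star * stdev(d) / sqrt(len(d))  # stdev uses the n - 1 divisor
    return mean(d) - margin, mean(d) + margin


# Hypothetical before/after readings for n = 5 subjects.
before = [12.1, 9.8, 11.5, 10.7, 13.0]
after = [11.4, 10.2, 10.9, 10.1, 12.2]

# t_{0.025, 4} = 2.776 from a standard t table (95% level, n - 1 = 4 df).
print(paired_ci_unknown_sigma(before, after, t_star=2.776))

# sigma_D = 0.5 is an assumed known value, for illustration only.
print(paired_ci_known_sigma(before, after, sigma_d=0.5))
```

with only 5 pairs the t interval is noticeably wider than the z interval, which reflects the extra uncertainty from estimating 𝜎_𝐷 with 𝑠_𝐷.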

approximate 100(1 - 𝛼)% CI for 𝜇_𝐷 if 𝑛 is large

𝐷̅ ± 𝑧*·(𝜎_𝐷/√𝑛) ≈ 𝐷̅ ± 𝑡*·(𝑠_𝐷/√𝑛) ≈ 𝐷̅ ± 𝑧*·(𝑠_𝐷/√𝑛)
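
a rough numerical check of why the t and z forms agree for large 𝑛: the t critical value approaches the z critical value as the degrees of freedom grow. `t_quantile` below is a hypothetical helper built from Simpson's rule and bisection, written only so the sketch runs on the standard library; a statistics package would normally supply this quantile.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + t * t / df))


def t_cdf(t, df, steps=1000):
    """CDF for t >= 0: 0.5 plus a Simpson's-rule integral of the pdf on [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_quantile(p, df):
    """Quantile for p > 0.5, found by bisection on the CDF over [0, 50]."""
    lo, hi = 0.0, 50.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


alpha = 0.05
z_star = NormalDist().inv_cdf(1 - alpha / 2)

# Compare t* (with df = n - 1) against z* as the sample size grows.
for df in (5, 30, 100, 1000):
    print(df, round(t_quantile(1 - alpha / 2, df), 3), round(z_star, 3))
```

the remaining step in the approximation, replacing 𝜎_𝐷 with 𝑠_𝐷, relies on 𝑠_𝐷 being a close estimate of 𝜎_𝐷 when 𝑛 is large.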

CIs for Sample Mean

Large Sample Size (𝑛)	Normal Population	𝜎_𝐷 / 𝑆𝐸(𝐷̅) Known	Confidence Interval
FALSE	FALSE	EITHER	Bootstrap Method
FALSE	TRUE	FALSE	𝐷̅ ± 𝑡_{𝛼/2,𝑛-1}·(𝑠_𝐷/√𝑛)
FALSE	TRUE	TRUE	𝐷̅ ± 𝑧_{𝛼/2}·(𝜎_𝐷/√𝑛)
TRUE	EITHER	FALSE	𝐷̅ ± 𝑧_{𝛼/2}·(𝑠_𝐷/√𝑛)
TRUE	EITHER	TRUE	𝐷̅ ± 𝑧_{𝛼/2}·(𝜎_𝐷/√𝑛)
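
the table reads as a small decision rule. A sketch in Python, where the function name and the returned labels are hypothetical and only mirror the rows above:

```python
def choose_interval(large_n, normal_population, sigma_known):
    """Pick the interval type for the mean difference, following the table rows."""
    if not large_n and not normal_population:
        # Small sample from a non-normal population: neither z nor t is justified.
        return "bootstrap"
    if sigma_known:
        # Known sigma_D: z multiplier with the exact standard error.
        return "z with sigma_D"
    if large_n:
        # Large n, unknown sigma_D: z multiplier with the estimated standard error.
        return "z with s_D"
    # Small n, normal population, unknown sigma_D: t with n - 1 degrees of freedom.
    return "t with s_D"


print(choose_interval(large_n=False, normal_population=True, sigma_known=False))
```

the EITHER entries in the table correspond to arguments that the function ignores on that branch.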
