CI - 2 Independent Samples

2-sample problems - inference on parameters involving two populations

Population 1: 𝑋 ∼ 𝑓_𝑋(𝑥)
Population 2: 𝑌 ∼ 𝑓_𝑌(𝑦)

CI with 2 Independent Samples - the 𝑋 and 𝑌 samples come from two different sets of subjects (i.e. observations in one sample are independent of those in the other)

the sample sizes of 𝑋 and 𝑌 (𝑛 and 𝑚) may be different

𝑋	𝑌
𝑋₁	𝑌₁
𝑋₂	𝑌₂
…	…
𝑋_𝑛	𝑌_𝑚

CI - General Formula


CI Definition

An interval [𝐴, 𝐵] is a (1 − 𝛼)100% confidence interval for the parameter 𝜃 if it contains the parameter with probability (1 − 𝛼):

𝐏{𝐴 ≤ 𝜃 ≤ 𝐵} = 1 − 𝛼

where:

𝛼 - significance level

(1 − 𝛼) - confidence level or coverage probability
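The coverage condition can be checked empirically. The sketch below is only an illustration, not part of the original notes: it assumes a normal population with known 𝜎, uses the sample mean as the estimator, and builds each interval as 𝜃ˆ ± 𝑧_𝛼/2·𝜎/√𝑛. The function name `coverage` and all parameter values are hypothetical.

```python
import random
from statistics import NormalDist, fmean


def coverage(mu=5.0, sigma=2.0, n=30, alpha=0.05, trials=2000, seed=0):
    """Fraction of simulated intervals [A, B] that contain the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}, upper standard normal quantile
    se = sigma / n ** 0.5                    # known standard error of the sample mean
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        center = fmean(sample)               # point estimate for this sample
        a, b = center - z * se, center + z * se
        hits += a <= mu <= b                 # count intervals that cover mu
    return hits / trials


print(coverage())  # should land close to 1 - alpha = 0.95
```

Across many repeated samples, the share of intervals that contain 𝜃 should be close to the confidence level (1 − 𝛼); any single interval either contains 𝜃 or it does not.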

CI Formula Intuition


Given a sample of data and a desired confidence level (1 − 𝛼), how can we construct a confidence interval [𝐴, 𝐵] that will satisfy the coverage condition

𝐏{𝐴 ≤ 𝜃 ≤ 𝐵} = 1 − 𝛼

first we need to estimate 𝜃

choose an unbiased estimator whose sampling distribution is (at least approximately) normal (e.g. an MLE in large samples)

apply the estimator to the sample data to obtain the point estimate 𝜃ˆ

next we standardize 𝜃ˆ to get a standard normal variable 𝑧:

𝑧 = [𝜃ˆ - 𝐄(𝜃ˆ)] / 𝜎(𝜃ˆ)

since 𝜃ˆ was estimated with an unbiased estimator: 𝐄(𝜃ˆ) = 𝜃

𝑧 = (𝜃ˆ - 𝜃) / 𝜎(𝜃ˆ)

this variable 𝑧 falls between the standard normal quantiles 𝑞_𝛼/2 and 𝑞_1−𝛼/2, denoted by

-𝑧_𝛼/2 = 𝑞_𝛼/2

𝑧_𝛼/2 = 𝑞_1−𝛼/2

with probability (1 − 𝛼), then:

𝐏{-𝑧_𝛼/2 ≤ (𝜃ˆ - 𝜃) / 𝜎(𝜃ˆ) ≤ 𝑧_𝛼/2} = 1 − 𝛼

now rearrange for 𝜃:

𝐏{𝜃ˆ - 𝑧_𝛼/2·𝜎(𝜃ˆ) ≤ 𝜃 ≤ 𝜃ˆ + 𝑧_𝛼/2·𝜎(𝜃ˆ)} = 1 − 𝛼

we have obtained two numbers:

𝐴 = 𝜃ˆ - 𝑧_𝛼/2·𝜎(𝜃ˆ)

𝐵 = 𝜃ˆ + 𝑧_𝛼/2·𝜎(𝜃ˆ)

such that

𝐏{𝐴 ≤ 𝜃 ≤ 𝐵} = 1 − 𝛼
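The endpoints 𝐴 and 𝐵 derived above can be computed directly. This is a minimal sketch assuming the standard error 𝜎(𝜃ˆ) is known and 𝜃ˆ is normally distributed; the helper name `z_interval` and the input numbers are made up for illustration.

```python
from statistics import NormalDist


def z_interval(theta_hat, se, alpha=0.05):
    """Return (A, B) = theta_hat -/+ z_{alpha/2} * se."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2} = q_{1 - alpha/2}
    return theta_hat - z * se, theta_hat + z * se


# Hypothetical inputs: point estimate 10.0 with standard error 0.5.
a, b = z_interval(theta_hat=10.0, se=0.5, alpha=0.05)
print(round(a, 2), round(b, 2))  # roughly 9.02 and 10.98
```

By symmetry of the standard normal distribution, -𝑧_𝛼/2 equals the lower quantile 𝑞_𝛼/2, so only one call to the inverse CDF is needed.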

CI Formulas

Large Sample Size (𝑛)	Normal Population	𝑆𝐸(𝜃ˆ) Known	Confidence Interval
FALSE	FALSE	EITHER	Bootstrap Method
FALSE	TRUE	FALSE	𝜃ˆ ± 𝑡_𝛼/2·𝑆𝐸ˆ(𝜃ˆ)
FALSE	TRUE	TRUE	𝜃ˆ ± 𝑧_𝛼/2·𝑆𝐸(𝜃ˆ)
TRUE	EITHER	FALSE	𝜃ˆ ± 𝑧_𝛼/2·𝑆𝐸ˆ(𝜃ˆ)
TRUE	EITHER	TRUE	𝜃ˆ ± 𝑧_𝛼/2·𝑆𝐸(𝜃ˆ)

where:

𝜃ˆ - point estimate/statistic or center of the interval

𝑧 - z-score, a type of confidence multiplier

𝑡 - t-score, a type of confidence multiplier

𝑆𝐸(𝜃ˆ) or 𝜎(𝜃ˆ) or 𝑆𝑡𝑑(𝜃ˆ) - standard error of the point estimator/statistic

𝑆𝐸ˆ(𝜃ˆ) or 𝑠(𝜃ˆ) or 𝑆𝑡𝑑ˆ(𝜃ˆ) - estimated standard error of the point estimator/statistic
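As an illustration of two rows of the table in the two-sample setting, the sketch below takes 𝜃 to be the difference in population means, with 𝜃ˆ = 𝑋̄ − 𝑌̄ and estimated standard error √(𝑠²_𝑋/𝑛 + 𝑠²_𝑌/𝑚). These choices are assumptions, since the notes above do not spell out the two-sample formulas. The first function covers the large-sample case with unknown 𝑆𝐸; the second is a percentile bootstrap, which is only one of several bootstrap variants. Function names and the simulated data are hypothetical.

```python
import random
from statistics import NormalDist, fmean, variance


def diff_means_z_ci(x, y, alpha=0.05):
    """Large-sample CI for mu_X - mu_Y using an estimated standard error."""
    n, m = len(x), len(y)                      # sample sizes may differ
    theta_hat = fmean(x) - fmean(y)            # point estimate
    se_hat = (variance(x) / n + variance(y) / m) ** 0.5
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return theta_hat - z * se_hat, theta_hat + z * se_hat


def diff_means_bootstrap_ci(x, y, alpha=0.05, reps=2000, seed=0):
    """Percentile bootstrap CI for mu_X - mu_Y.

    Each sample is resampled with replacement on its own, because the
    two samples are independent.
    """
    rng = random.Random(seed)
    stats = sorted(
        fmean(rng.choices(x, k=len(x))) - fmean(rng.choices(y, k=len(y)))
        for _ in range(reps)
    )
    lo = stats[int((alpha / 2) * reps)]
    hi = stats[int((1 - alpha / 2) * reps) - 1]
    return lo, hi


# Simulated data, for illustration only.
rng = random.Random(1)
x = [rng.gauss(10.0, 2.0) for _ in range(200)]
y = [rng.gauss(9.0, 3.0) for _ in range(150)]
print(diff_means_z_ci(x, y))
print(diff_means_bootstrap_ci(x, y))
```

With samples this large the two intervals should nearly agree; the bootstrap is mainly of interest when the sample is small and the population is not normal, as in the first row of the table.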

CI Annotated

CI Diagram


／var／log marcus chiu
