Simple Sampling - Generalized

assume:

function ℎ(𝑋)
𝑋 has a probability distribution 𝐏
it is easy to generate a sample 𝑥_𝑖 from probability distribution 𝐏
computation of ℎ(𝑥_𝑖) is easy

we want to compute the expected value of ℎ(𝑋) (i.e. 𝐄_𝐏[ℎ(𝑋)])

𝐄_𝐏[ℎ(𝑋)] = ∫ℎ(𝑥)𝐏(𝑥)𝑑𝑥 # continuous case
𝐄_𝐏[ℎ(𝑋)] = 𝛴_{𝑥} ℎ(𝑥)𝐏(𝑥) # discrete case, summing over all possible values 𝑥 of 𝑋

is estimated with:

𝐄_𝐏[ℎ(𝑋)] ≈ (1/𝑛) 𝛴_{1≤𝑖≤𝑛}ℎ(𝑥_𝑖)

where:

ℎ(𝑥) - is some function
𝐏 - is the probability distribution
𝐄_𝐏[..] - expected value based on 𝐏
𝑛 - is the number of samples generated
{𝑥₁, …, 𝑥_𝑖, …, 𝑥_𝑛} - are i.i.d. samples generated from 𝐏
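
The estimator above can be sketched in a few lines of Python. This is only an illustrative sketch: the names estimate_expectation, h and draw_sample are hypothetical, and the choice of 𝑋 ~ Uniform(0, 1) with ℎ(𝑥) = 𝑥² (whose true expected value is 1/3) is an assumed example, not something taken from these notes.

```python
import random

def estimate_expectation(h, draw_sample, n):
    """Approximate E_P[h(X)] by averaging h over n i.i.d. samples from P."""
    total = 0.0
    for _ in range(n):
        x_i = draw_sample()  # one sample x_i from P (assumed easy to generate)
        total += h(x_i)      # h(x_i) is assumed cheap to compute
    return total / n

# Assumed example: X ~ Uniform(0, 1) and h(x) = x^2, so E[h(X)] = 1/3.
rng = random.Random(0)  # fixed seed so the run is reproducible
estimate = estimate_expectation(lambda x: x * x, rng.random, 100_000)
print(estimate)  # should land close to 1/3
```

The sampler and the function are passed in separately, which mirrors the two assumptions listed above: sampling from 𝐏 is easy, and evaluating ℎ is easy.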

Simple Sampling - Examples

Using Simple Sampling to Estimate the Average of a Distribution

see: sample mean

here we want to estimate the expected value of 𝑋 having probability distribution 𝐏. Therefore:

ℎ(𝑥) = 𝑥

and the expected value is estimated with:

𝐄_𝐏[𝑋] ≈ (1/𝑛) 𝛴_{1≤𝑖≤𝑛} [𝑥_𝑖]

where:

𝑛 - is the number of samples generated

{𝑥₁, …, 𝑥_𝑖, …, 𝑥_𝑛} - are i.i.d. samples generated from 𝐏
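
As a small illustration of the sample mean, the sketch below assumes 𝑋 is the outcome of a fair six-sided die, so the true expected value is 3.5. The die and the variable names are assumptions made for the example only.

```python
import random

rng = random.Random(1)  # fixed seed so the run is reproducible
n = 50_000              # number of samples generated

# i.i.d. samples x_1, ..., x_n from P (here: a fair six-sided die)
samples = [rng.randint(1, 6) for _ in range(n)]

# h(x) = x, so the estimator is just the average of the samples
sample_mean = sum(samples) / n
print(sample_mean)  # should land close to 3.5
```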

Using Simple Sampling to Estimate the Proportion of a Distribution

see: sample proportions

We would like to estimate the probability that a random variable 𝑋 takes a value in a given event.

More specifically, given:

variable 𝑋 has a domain of possible outcomes (i.e. sample space)

event 𝐴 is a subset of sample space

we would like to estimate the probability 𝑝 that variable 𝑋 results in an outcome that lies in event 𝐴. In other words, 𝑝 = 𝐏{𝑋 ∈ 𝐴}

This probability 𝑝 is estimated by generating a long run of experiments on 𝑋, where each run returns an outcome 𝑋_𝑖. Then we compute the proportion of times when our event 𝐴 occurred.

𝑝 = 𝐄_𝐏[𝐼(𝑋 ∈ 𝐴)]

𝑝̂ = (1/𝑛) 𝛴_{1≤𝑖≤𝑛} 𝐼(𝑥_𝑖∊𝐴)

where:

𝑛 - is the number of samples generated

{𝑥₁, …, 𝑥_𝑖, …, 𝑥_𝑛} - are i.i.d. samples generated from 𝐏

𝐴 - is the event (i.e. a set of outcomes) whose probability we are estimating

𝐼(𝑥_𝑖∊𝐴) - indicator function (equals 1 when 𝑥_𝑖∊𝐴, otherwise 0)

𝑝̂ - the estimator

now we have an estimated probability 𝑝̂
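
The proportion estimator can be sketched as follows. The example assumes 𝑋 is a fair six-sided die and 𝐴 = {5, 6}, so the true probability is 𝑝 = 1/3. The function name estimate_probability and its parameters are hypothetical, chosen for illustration.

```python
import random

def estimate_probability(draw_sample, in_event, n):
    """Estimate p = P{X in A} as the fraction of n samples that fall in A."""
    # in_event plays the role of the indicator I(x_i in A)
    hits = sum(1 for _ in range(n) if in_event(draw_sample()))
    return hits / n

# Assumed example: fair six-sided die, A = {5, 6}, so p = 1/3.
rng = random.Random(2)  # fixed seed so the run is reproducible
p_hat = estimate_probability(lambda: rng.randint(1, 6), lambda x: x >= 5, 60_000)
print(p_hat)  # should land close to 1/3
```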

How Accurate is This Method? Does 𝑝̂ = 𝑝?

To answer this question, compute 𝐄[𝑝̂] and 𝐒𝐭𝐝(𝑝̂)

The number of outcomes among 𝑋₁, …, 𝑋_𝑛 that fall within event 𝐴 has a Binomial(𝑛,𝑝) distribution with:

expectation = (𝑛𝑝)

variance = 𝑛𝑝(1−𝑝)

standard deviation = √[𝑛𝑝(1−𝑝)]

therefore, we obtain:

𝐄[𝑝̂] = (1/𝑛) 𝐄[number of 𝑋₁, …, 𝑋_𝑛 ∈ 𝐴]

𝐄[𝑝̂] = (1/𝑛) [𝐄[𝐼(𝑋₁∈𝐴)] + … + 𝐄[𝐼(𝑋_𝑛∈𝐴)]]

𝐄[𝑝̂] = (1/𝑛) [𝑝 + … + 𝑝]

𝐄[𝑝̂] = (1/𝑛) 𝑛𝑝

𝐄[𝑝̂] = 𝑝

𝐒𝐭𝐝(𝑝̂) = (1/𝑛) 𝐒𝐭𝐝(number of 𝑋₁, …, 𝑋_𝑛 ∈ 𝐴)

𝐒𝐭𝐝(𝑝̂) = (1/𝑛) √[𝑛𝑝(1−𝑝)]

𝐒𝐭𝐝(𝑝̂) = √[𝑝(1−𝑝) / 𝑛]

thus, we can conclude the following:

𝐄[𝑝̂] = 𝑝 shows that our Monte Carlo estimator of 𝑝 is unbiased: over a long run it will, on average, return the desired quantity 𝑝

𝐒𝐭𝐝(𝑝̂) = √[𝑝(1−𝑝) / 𝑛] indicates that the standard deviation of our estimator 𝑝̂ decreases with 𝑛 at the rate of 1/√𝑛. Larger Monte Carlo experiments produce more accurate results: a 100-fold increase in the number of generated samples reduces the standard deviation by a factor of 10, improving accuracy accordingly
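
The 1/√𝑛 behaviour can be checked empirically. The sketch below repeats the whole Monte Carlo experiment many times for two sample sizes that differ by a factor of 100 and compares the spread of the resulting estimates with the formula. The value 𝑝 = 0.3, the sample sizes, the number of repetitions and the function names are all assumptions chosen for illustration.

```python
import math
import random
import statistics

def theoretical_std(p, n):
    """Std(p_hat) = sqrt(p(1-p)/n), as derived above."""
    return math.sqrt(p * (1 - p) / n)

def empirical_std(p, n, repetitions, rng):
    """Run the n-sample experiment many times; return the std of the estimates."""
    estimates = []
    for _ in range(repetitions):
        # each draw falls in event A with probability p
        hits = sum(1 for _ in range(n) if rng.random() < p)
        estimates.append(hits / n)
    return statistics.stdev(estimates)

rng = random.Random(3)  # fixed seed so the run is reproducible
p = 0.3
small_n, large_n = 50, 5_000  # a 100-fold increase in n

std_small = empirical_std(p, small_n, 300, rng)
std_large = empirical_std(p, large_n, 300, rng)
ratio = std_small / std_large

print(std_small, theoretical_std(p, small_n))
print(std_large, theoretical_std(p, large_n))
print(ratio)  # should be roughly 10
```

With only 300 repetitions the empirical standard deviations are themselves noisy, so the ratio will be near 10 rather than exactly 10.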

／var／log marcus chiu
