Empirical/Sample Distribution
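- the value of its cumulative distribution function (the empirical CDF) at a given point is equal to the proportion of observations in the sample that are less than or equal to that point
- is NOT the same as the sampling distribution (also called the finite-sample distribution), which is the distribution of a statistic over repeated samples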
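The "proportion of observations less than or equal to a point" idea can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the function name ecdf and the sample values are made up for the example.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return a function F where F(t) is the fraction of observations <= t."""
    ordered = sorted(sample)
    n = len(ordered)
    if n == 0:
        raise ValueError("sample must be non-empty")

    def F(t):
        # bisect_right gives the number of sorted values that are <= t
        return bisect_right(ordered, t) / n

    return F


F = ecdf([3, 1, 4, 1, 5])   # hypothetical sample
print(F(1))  # 0.4  (two of five observations are <= 1)
print(F(4))  # 0.8  (four of five observations are <= 4)
print(F(0))  # 0.0  (no observation is <= 0)
```

Sorting once and using binary search keeps each evaluation cheap when the function is queried at many points.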
Mathematical Definition
let:
- 𝐷 = [𝑥_1, …, 𝑥_𝑛] be a set of 𝑛 samples taken from some true population distribution 𝐏
the empirical distribution 𝐏̂_𝐷 is defined as:
- 𝐏̂_𝐷(𝐴) = (1/|𝐷|) Σ_{𝑥_𝑖 ∈ 𝐷} 𝐼(𝑥_𝑖 ∈ 𝐴)
where:
- 𝐴 - an event
- 𝐼(𝑥_𝑖 ∈ 𝐴) - an indicator function that is equal to 1 if 𝑥_𝑖 is in 𝐴 and 0 otherwise
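As a small sketch of the definition above, the event 𝐴 can be represented as a predicate and the indicator sum computed directly. The names empirical_probability and is_even and the data in D are assumptions chosen for illustration only.

```python
def empirical_probability(sample, event):
    """Fraction of observations x in `sample` for which event(x) is True."""
    if len(sample) == 0:
        raise ValueError("sample must be non-empty")
    # Sum of the indicator values I(x in A), divided by |D|
    return sum(1 for x in sample if event(x)) / len(sample)


def is_even(x):
    """Hypothetical event A: the observation is an even number."""
    return x % 2 == 0


D = [2, 7, 4, 9, 6, 1]                      # hypothetical sample
p_hat = empirical_probability(D, is_even)   # 3 of 6 observations are even
print(p_hat)  # 0.5
```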
note:
- 𝐏̂_𝐷 is an estimate of the true distribution 𝐏
- 𝐏̂_𝐷(𝐴), the estimated probability of the event 𝐴, is simply the fraction of training examples that satisfy 𝐴
- as the number of training examples grows, the empirical distribution approaches the true population distribution 𝐏 (for independent samples drawn from 𝐏, this follows from the law of large numbers)
- lim_{𝑛→∞} 𝐏̂_𝐷(𝐴) = 𝐏(𝐴)
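The convergence claim can be checked informally with a simulation. The sketch below assumes a fair six-sided die as the true distribution 𝐏 and the event 𝐴 = "the roll is 1 or 2", so 𝐏(𝐴) = 1/3; the function name, the seed, and the sample sizes are arbitrary choices for the example.

```python
import random


def estimate_low_roll(n, rng):
    """Empirical probability that a fair die roll is <= 2, from n simulated rolls."""
    rolls = [rng.randint(1, 6) for _ in range(n)]
    return sum(1 for r in rolls if r <= 2) / n


rng = random.Random(0)   # fixed seed so the run is repeatable
true_p = 2 / 6           # P(A) under the assumed fair-die distribution

estimates = {n: estimate_low_roll(n, rng) for n in (10, 1_000, 100_000)}
for n, p_hat in estimates.items():
    # The absolute error typically shrinks as n grows
    print(n, p_hat, abs(p_hat - true_p))
```

The error is not guaranteed to decrease at every step for a particular run, but for large 𝑛 the estimate should sit close to 1/3.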