Distance Variance/Variation
- a type of Variation of Distances
- used to measure the dispersion of a variable 𝑋, based on the pairwise distances between its observations (for dispersion between 2 variables see Distance Covariance)
Sample Distance Variance
Calculate the doubly-centered distance matrix (𝐴) of a statistical sample {𝑋1, …, 𝑋𝑛}:
Let (𝑋𝑖) for 𝑖 = 1, 2, …, 𝑛 be a statistical sample from a real-valued or vector-valued random variable 𝑋
First, compute the 𝑛×𝑛 distance matrix (𝑎)
- 𝑎𝑖𝑗 = ||𝑋𝑖 - 𝑋𝑗||
where:
- ||⋅|| denotes Euclidean distance/Euclidean norm
Next, compute the doubly-centered distance matrix (𝐴)
- 𝐴𝑖𝑗 = 𝑎𝑖𝑗 - 𝑎̅𝑖· - 𝑎̅·𝑗 + 𝑎̅
where:
- 𝑎̅𝑖· is the 𝑖th row mean of 𝑎
- 𝑎̅·𝑗 is the 𝑗th column mean of 𝑎
- 𝑎̅ is the grand mean of 𝑎
All rows and all columns of 𝐴 sum to zero
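The two steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names `distance_matrix` and `double_center` and the three-point sample are assumptions made for the example, and scalars are treated as 1-dimensional vectors.

```python
import math

def distance_matrix(sample):
    """Pairwise Euclidean distances a[i][j] = ||X_i - X_j||."""
    # Wrap scalars as 1-tuples so scalars and vectors share one code path.
    points = [x if isinstance(x, (list, tuple)) else (x,) for x in sample]
    return [[math.dist(p, q) for q in points] for p in points]

def double_center(a):
    """A[i][j] = a[i][j] - row_mean[i] - col_mean[j] + grand_mean."""
    n = len(a)
    row_means = [sum(row) / n for row in a]
    col_means = [sum(a[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_means) / n
    return [[a[i][j] - row_means[i] - col_means[j] + grand_mean
             for j in range(n)] for i in range(n)]

# Hypothetical sample of three real-valued observations.
sample = [1.0, 2.0, 4.0]
A = double_center(distance_matrix(sample))

# Every row and every column of A should sum to zero (up to rounding).
row_sums = [sum(row) for row in A]
col_sums = [sum(A[i][j] for i in range(len(A))) for j in range(len(A))]
```

Because the distance matrix is symmetric, the row means and column means coincide; they are computed separately here only to mirror the formula term by term.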
The squared sample distance variance (a scalar) is simply the arithmetic average of the products 𝐴𝑖𝑗𝐴𝑖𝑗; the sample distance variance is its square root:
- 𝑑𝑉𝑎𝑟²(𝑋) = (1/𝑛²) 𝛴1≤𝑖≤𝑛 𝛴1≤𝑗≤𝑛 [𝐴𝑖𝑗𝐴𝑖𝑗]
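As a rough end-to-end sketch, the whole computation fits in one short function. The name `distance_variance` is hypothetical, and the code assumes the common convention in which the average of the squared entries of 𝐴 is the squared distance variance, so the function returns its square root; drop the `math.sqrt` call if the squared quantity is wanted instead.

```python
import math

def distance_variance(sample):
    """Sample distance variance of a list of scalars or equal-length vectors."""
    # Wrap scalars as 1-tuples so scalars and vectors share one code path.
    points = [x if isinstance(x, (list, tuple)) else (x,) for x in sample]
    n = len(points)
    # Pairwise Euclidean distance matrix a.
    a = [[math.dist(p, q) for q in points] for p in points]
    row_means = [sum(row) / n for row in a]
    grand_mean = sum(row_means) / n
    # a is symmetric, so the j-th column mean equals the j-th row mean.
    squared = sum(
        (a[i][j] - row_means[i] - row_means[j] + grand_mean) ** 2
        for i in range(n) for j in range(n)
    ) / n ** 2
    return math.sqrt(squared)

# Hypothetical sample of three real-valued observations.
dvar = distance_variance([1.0, 2.0, 4.0])
```

For the sample 1, 2, 4 the squared entries of 𝐴 sum to 32/3, so the average is 32/27 and the returned value is its square root. A constant sample gives exactly zero, since every pairwise distance is zero.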