Distance Covariance/Covariation
- a type of covariation between two variables 𝑋 and 𝑌 (for a variable paired with itself, see Distance Variance)
- is used to calculate distance correlation
Sample Distance Covariance Formula
Given a statistical sample {(𝑋1, 𝑌1), …, (𝑋𝑛, 𝑌𝑛)} compute their respective doubly-centered distance matrices 𝐴 and 𝐵:
Let (𝑋𝑖, 𝑌𝑖), 𝑖 = 1, 2, …, 𝑛 be a statistical sample from a pair of real-valued or vector-valued random variables (𝑋, 𝑌)
First, compute the 𝑛×𝑛 distance matrices 𝑎 and 𝑏:
- 𝑎𝑖𝑗 = ||𝑋𝑖 - 𝑋𝑗||
- 𝑏𝑖𝑗 = ||𝑌𝑖 - 𝑌𝑗||
where:
- ||⋅|| denotes the Euclidean norm
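The pairwise distance step can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation: the function name `distance_matrix` is hypothetical, the norm is assumed to be Euclidean, and each observation is assumed to be a tuple of floats (a scalar observation is written as a one-element tuple).

```python
import math


def distance_matrix(points):
    """Return the n x n matrix of pairwise Euclidean distances.

    `points` is a list of equal-length tuples of floats; a scalar
    observation is passed as a one-element tuple such as (3.0,).
    """
    return [[math.dist(p, q) for q in points] for p in points]


# Three scalar observations: X1 = 0, X2 = 3, X3 = 7
a = distance_matrix([(0.0,), (3.0,), (7.0,)])
for row in a:
    print(row)
```

The resulting matrix is symmetric with zeros on the diagonal, since ||𝑋𝑖 - 𝑋𝑖|| = 0 and ||𝑋𝑖 - 𝑋𝑗|| = ||𝑋𝑗 - 𝑋𝑖||.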
Next, compute the doubly-centered distance matrices 𝐴 and 𝐵:
- 𝐴𝑖𝑗 = 𝑎𝑖𝑗 - 𝑎̅𝑖· - 𝑎̅·𝑗 + 𝑎̅
- 𝐵𝑖𝑗 = 𝑏𝑖𝑗 - 𝑏̅𝑖· - 𝑏̅·𝑗 + 𝑏̅
where:
- 𝑎̅𝑖· is the 𝑖th row mean of 𝑎
- 𝑎̅·𝑗 is the 𝑗th column mean of 𝑎
- 𝑎̅ is the grand mean of 𝑎
- the notation is analogous for 𝑏 (𝑏̅𝑖·, 𝑏̅·𝑗 and 𝑏̅)
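The double-centering step can be sketched as below. The function name `double_center` is a hypothetical helper, and the input is assumed to be a square distance matrix given as a list of lists.

```python
def double_center(d):
    """Subtract row and column means from d and add back the grand mean."""
    n = len(d)
    row_means = [sum(row) / n for row in d]
    col_means = [sum(d[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_means) / n
    return [
        [d[i][j] - row_means[i] - col_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


# Distance matrix of the scalar observations 0, 3, 7
a = [[0.0, 3.0, 7.0], [3.0, 0.0, 4.0], [7.0, 4.0, 0.0]]
A = double_center(a)

# After double centering, every row and every column sums to
# (numerically) zero.
row_sums = [sum(row) for row in A]
col_sums = [sum(A[i][j] for i in range(3)) for j in range(3)]
```

A quick sanity check on any implementation is that all row sums and column sums of 𝐴 vanish up to floating-point error.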
The squared sample distance covariance (a scalar) is the arithmetic average of the products 𝐴𝑖𝑗𝐵𝑖𝑗:
- 𝑑𝐶𝑜𝑣²(𝑋, 𝑌) = (1/𝑛²) 𝛴1≤𝑖≤𝑛 𝛴1≤𝑗≤𝑛 𝐴𝑖𝑗𝐵𝑖𝑗
The sample distance covariance 𝑑𝐶𝑜𝑣(𝑋, 𝑌) is the non-negative square root of this value.
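Putting the three steps together, the whole computation might look like the sketch below. The names `dcov_squared` and `_centered_distances` are hypothetical, the norm is assumed to be Euclidean, and observations are assumed to be tuples of floats. The sketch uses O(𝑛²) time and memory, so it is suited to small samples only.

```python
import math


def _centered_distances(points):
    """Doubly-centered Euclidean distance matrix of a list of points."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]
    # d is symmetric, so row means and column means coincide.
    means = [sum(row) / n for row in d]
    grand = sum(means) / n
    return [
        [d[i][j] - means[i] - means[j] + grand for j in range(n)]
        for i in range(n)
    ]


def dcov_squared(xs, ys):
    """Squared sample distance covariance of paired observations."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same number of observations")
    n = len(xs)
    A = _centered_distances(xs)
    B = _centered_distances(ys)
    return sum(A[i][j] * B[i][j] for i in range(n) for j in range(n)) / n**2


# Scalar observations are written as one-element tuples.
xs = [(1.0,), (2.0,), (3.0,)]
ys = [(2.0,), (4.0,), (6.0,)]        # Y = 2X, perfectly dependent
constant = [(5.0,), (5.0,), (5.0,)]  # carries no information about X

linear_result = dcov_squared(xs, ys)          # 80/81 for this sample
constant_result = dcov_squared(xs, constant)  # 0.0: all distances are zero
print(linear_result, constant_result)
```

For this three-point sample the hand calculation gives 𝑑𝐶𝑜𝑣²(𝑋, 𝑌) = 80/81, and a constant 𝑌 yields exactly zero because its distance matrix is all zeros.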