Cook’s Distance
  • measures how much a single data point influences the fitted model, and flags particular points worth checking for validity
  • can also highlight regions of the 𝑋 space that need further investigation with more data

Formula

The distance 𝐷𝑖 for data point 𝑖 is defined as:

$$D_i = \frac{\sum_{j=1}^{n} \left( \hat{y}_j - \hat{y}_{j(i)} \right)^2}{p \cdot \mathrm{MSE}}$$

where:

  • 𝑦̂𝑗(𝑖) is the fitted value for data point 𝑗 when the model is refit with the 𝑖th observation removed
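
The leave-one-out definition can be turned into a short computation. The sketch below is illustrative only, not a reference implementation. It assumes an ordinary least squares fit and the usual form of Cook's distance, in which the summed squared change in fitted values is divided by p times MSE. Here p is taken to be the number of fitted coefficients (including the intercept) and MSE the residual sum of squares divided by n - p. The function name `cooks_distance` and the toy data are hypothetical.

```python
import numpy as np


def cooks_distance(X, y):
    """Cook's distance by explicit leave-one-out refitting (OLS assumed)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape  # assumption: p counts every column, including the intercept

    # Fit on all data and estimate the mean squared error.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    y_hat = X @ beta
    mse = np.sum((y - y_hat) ** 2) / (n - p)

    d = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # drop the i-th observation
        beta_i, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        y_hat_i = X @ beta_i  # fitted values for ALL points j, model refit without i
        d[i] = np.sum((y_hat - y_hat_i) ** 2) / (p * mse)
    return d


# Toy data (made up for illustration): a straight line with one shifted point.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 20)
y = 2.0 + 0.5 * x + rng.normal(scale=0.3, size=x.size)
y[-1] += 5.0  # perturb the last point, which also sits at the edge of the x range

X = np.column_stack([np.ones_like(x), x])  # intercept column plus slope column
d = cooks_distance(X, y)
most_influential = int(np.argmax(d))
print(most_influential, d[most_influential])
```

Refitting n times is deliberately literal so that it mirrors the definition. For larger data sets, the same values can be obtained from the residuals and leverages of a single fit.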