Adjusted R-Square - Adjusted Coefficient of Determination

  • As we add a new predictor 𝑋𝑖 to our model, it either explains an additional portion of 𝑆𝑆𝑇𝑂𝑇 or it doesn’t. Therefore, regular 𝑅² can only stay the same or go up when we add 𝑋𝑖. Thus we expect 𝑅² to increase going from simple (single-predictor) regression to multiple regression, whether or not the added predictors are useful
  • Adjusted 𝑅² is a modification of 𝑅² that penalizes the addition of a useless predictor variable 𝑋𝑖
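  • Adjusted 𝑅² is at most 1 and never exceeds regular 𝑅². Unlike regular 𝑅², it is not confined to [0,1]: it can be negative when the model explains very little variation relative to the number of predictors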
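
The claim that regular R-squared cannot go down when a predictor is added can be checked numerically. The sketch below is only an illustration: it assumes NumPy is available, the data are simulated, and `r_squared` is a hypothetical helper name, not something from these notes. The second predictor is pure noise, yet the fit with both predictors has an R-squared at least as large as the fit with one.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X (hypothetical helper).

    An intercept column is added here, so X holds only the predictors.
    """
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    ss_err = np.sum((y - design @ coef) ** 2)   # unexplained variation
    ss_tot = np.sum((y - y.mean()) ** 2)        # total variation
    return 1.0 - ss_err / ss_tot

rng = np.random.default_rng(0)       # fixed seed, simulated data only
n = 30
x1 = rng.normal(size=n)              # predictor that really drives y
noise_x = rng.normal(size=n)         # predictor unrelated to y
y = 2.0 + 1.5 * x1 + rng.normal(size=n)

r2_one = r_squared(x1.reshape(-1, 1), y)
r2_two = r_squared(np.column_stack([x1, noise_x]), y)
print(f"R^2 with x1 only:        {r2_one:.4f}")
print(f"R^2 with x1 and noise_x: {r2_two:.4f}")
```

Any gain in the second number comes from the noise column fitting chance patterns in the sample, which is the behaviour the adjustment is meant to penalize.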

original 𝑅² formula:

  • 𝑅² = 𝑆𝑆𝑅𝐸𝐺 / 𝑆𝑆𝑇𝑂𝑇
  • 𝑅² = (𝑆𝑆𝑇𝑂𝑇 - 𝑆𝑆𝐸𝑅𝑅) / 𝑆𝑆𝑇𝑂𝑇
  • 𝑅² = 1 - (𝑆𝑆𝐸𝑅𝑅 / 𝑆𝑆𝑇𝑂𝑇)
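
The three forms above agree because the total sum of squares splits into a regression part and an error part when the model includes an intercept. A minimal sketch in plain Python, using a small made-up data set and a one-predictor least-squares fit (all variable names are illustrative):

```python
# Made-up data for a one-predictor least-squares fit.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Closed-form least-squares slope and intercept.
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar
fitted = [intercept + slope * xi for xi in x]

ss_tot = sum((yi - y_bar) ** 2 for yi in y)                # total
ss_reg = sum((fi - y_bar) ** 2 for fi in fitted)           # explained
ss_err = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained

r2_a = ss_reg / ss_tot
r2_b = (ss_tot - ss_err) / ss_tot
r2_c = 1 - ss_err / ss_tot
print(r2_a, r2_b, r2_c)   # all three are about 0.6 for this data
```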

adjusted 𝑅² formula:

  • 𝑅²𝑎𝑑𝑗 = 1 - [(𝑆𝑆𝐸𝑅𝑅 / 𝑑𝑓𝐸𝑅𝑅) / (𝑆𝑆𝑇𝑂𝑇 / 𝑑𝑓𝑇𝑂𝑇)]
  • 𝑅²𝑎𝑑𝑗 = 1 - [(𝑆𝑆𝐸𝑅𝑅 / (𝑛 - 𝑘 - 1)) / (𝑆𝑆𝑇𝑂𝑇 / (𝑛 - 1))]
  • 𝑅²𝑎𝑑𝑗 = 1 - [(𝑆𝑆𝐸𝑅𝑅 / 𝑆𝑆𝑇𝑂𝑇) · ((𝑛 - 1) / (𝑛 - 𝑘 - 1))]
  • 𝑅²𝑎𝑑𝑗 = 1 - [(𝑆𝑆𝐸𝑅𝑅 / 𝑆𝑆𝑇𝑂𝑇) · (𝑑𝑓𝑇𝑂𝑇 / 𝑑𝑓𝐸𝑅𝑅)]

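where:

  • 𝑛 = number of observations
  • 𝑘 = number of predictor variables (estimated slopes), not counting the intercept
  • 𝑑𝑓𝐸𝑅𝑅 = 𝑛 - 𝑘 - 1 (error degrees of freedom)
  • 𝑑𝑓𝑇𝑂𝑇 = 𝑛 - 1 (total degrees of freedom)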
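
As a rough sketch, the adjusted formula can be written as two small Python functions, one starting from the sums of squares and one starting from regular R-squared (using 1 - R² = SS_ERR / SS_TOT). The function names and the example numbers are assumptions made for illustration, with n observations and k predictors.

```python
def adjusted_r2_from_ss(ss_err, ss_tot, n, k):
    """Adjusted R^2 from sums of squares, n observations, k predictors."""
    df_err = n - k - 1   # error degrees of freedom
    df_tot = n - 1       # total degrees of freedom
    if df_err <= 0:
        raise ValueError("need n > k + 1 so that df_err is positive")
    return 1.0 - (ss_err / df_err) / (ss_tot / df_tot)

def adjusted_r2_from_r2(r2, n, k):
    """Same quantity, starting from regular R^2 = 1 - ss_err / ss_tot."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# Illustrative numbers: SS_ERR = 2.4, SS_TOT = 6.0, n = 5, k = 1.
from_ss = adjusted_r2_from_ss(2.4, 6.0, n=5, k=1)
from_r2 = adjusted_r2_from_r2(1 - 2.4 / 6.0, n=5, k=1)
print(round(from_ss, 4), round(from_r2, 4))
```

Both calls give roughly 0.4667, below the regular R-squared of 0.6 for the same numbers.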

Imagine adding a non-significant predictor variable 𝑋𝑖. The number of estimated slopes 𝑘 increments by 1. However, if this variable is not able to explain any of the variation in the response 𝑌 (i.e. any of 𝑆𝑆𝑇𝑂𝑇), then the sums of squares 𝑆𝑆𝐸𝑅𝑅 and 𝑆𝑆𝑅𝐸𝐺 will remain the same. Because 𝑑𝑓𝐸𝑅𝑅 = 𝑛 - 𝑘 - 1 drops by 1 while 𝑆𝑆𝐸𝑅𝑅 is unchanged, (𝑆𝑆𝐸𝑅𝑅/𝑑𝑓𝐸𝑅𝑅) will increase. (𝑆𝑆𝑇𝑂𝑇/𝑑𝑓𝑇𝑂𝑇) does not depend on 𝑘, so 𝑅²𝑎𝑑𝑗 will decrease
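
The scenario can be traced with a few made-up numbers. In the sketch below the sums of squares are held fixed while k grows, which is the idealized case of predictors that explain nothing new. The values of n, SS_TOT and SS_ERR are assumptions chosen only for illustration.

```python
n = 20           # observations (made up)
ss_tot = 100.0   # total sum of squares (made up)
ss_err = 40.0    # held fixed: the added predictors explain nothing new

results = []
for k in (2, 3, 4):
    df_err = n - k - 1
    mean_sq_err = ss_err / df_err                      # grows as df_err shrinks
    adj_r2 = 1.0 - mean_sq_err / (ss_tot / (n - 1))    # so this falls
    results.append((k, mean_sq_err, adj_r2))
    print(f"k={k}  SS_ERR/df_ERR={mean_sq_err:.4f}  adjusted R^2={adj_r2:.4f}")
```

With these numbers the adjusted value falls from about 0.553 to 0.525 to 0.493, while regular R-squared stays at 0.6 throughout.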