read: Multiple Linear Regression Models

  • nested model - model 𝐴 is nested in model 𝐵 if the predictors of 𝐴 are a subset of the predictors of 𝐵

Comparing 2 Models

the 2 models to be compared:

  • 𝑀𝐿 - larger model - predictors {𝑥1, …, 𝑥𝑘}
  • 𝑀𝑆 - smaller model (nested in 𝑀𝐿) - predictors {𝑥1, …, 𝑥𝑗} with 𝑗 < 𝑘, i.e. it omits {𝑥𝑗+1, …, 𝑥𝑘}

Extra Sum of Squares

the extra sum of squares 𝑆𝑆𝐸𝑋 is the difference between the two models’ regression sums of squares 𝑆𝑆𝑅𝐸𝐺 or, equivalently (both models share the same total sum of squares), between their error sums of squares 𝑆𝑆𝐸𝑅𝑅:

  • 𝑆𝑆𝐸𝑋 = 𝑆𝑆𝑅𝐸𝐺(𝑀𝐿) - 𝑆𝑆𝑅𝐸𝐺(𝑀𝑆)
  • 𝑆𝑆𝐸𝑋 = 𝑆𝑆𝐸𝑅𝑅(𝑀𝑆) - 𝑆𝑆𝐸𝑅𝑅(𝑀𝐿)

degrees of freedom of 𝑆𝑆𝐸𝑋:

  • 𝑑𝑓𝐸𝑋 = 𝑑𝑓𝑅𝐸𝐺(𝑀𝐿) - 𝑑𝑓𝑅𝐸𝐺(𝑀𝑆)
  • 𝑑𝑓𝐸𝑋 = 𝑛𝑢𝑚-𝑝𝑟𝑒𝑑𝑖𝑐𝑡𝑜𝑟-𝑣𝑎𝑟𝑖𝑎𝑏𝑙𝑒𝑠(𝑀𝐿) - 𝑛𝑢𝑚-𝑝𝑟𝑒𝑑𝑖𝑐𝑡𝑜𝑟-𝑣𝑎𝑟𝑖𝑎𝑏𝑙𝑒𝑠(𝑀𝑆)
  • 𝑑𝑓𝐸𝑋 = 𝑘 - 𝑗
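
As a rough illustration of the two definitions of 𝑆𝑆𝐸𝑋 and of 𝑑𝑓𝐸𝑋, the sketch below fits a pair of nested models in R. The built-in mtcars data and the particular predictors (wt, hp, qsec) are assumptions made only for this example; they are not taken from these notes.

```r
# Nested models on the built-in mtcars data (an illustrative choice, not from the notes)
m_s <- lm(mpg ~ wt, data = mtcars)              # M_S: j = 1 predictor
m_l <- lm(mpg ~ wt + hp + qsec, data = mtcars)  # M_L: k = 3 predictors

y <- mtcars$mpg

# Regression sum of squares: variation of the fitted values around the mean of y
ss_reg_s <- sum((fitted(m_s) - mean(y))^2)
ss_reg_l <- sum((fitted(m_l) - mean(y))^2)

# Error sum of squares: sum of squared residuals
ss_err_s <- sum(resid(m_s)^2)
ss_err_l <- sum(resid(m_l)^2)

# The two definitions of the extra sum of squares
ss_ex_reg <- ss_reg_l - ss_reg_s
ss_ex_err <- ss_err_s - ss_err_l

# Degrees of freedom: difference in the number of predictors, k - j
j <- length(coef(m_s)) - 1   # subtract 1 for the intercept
k <- length(coef(m_l)) - 1
df_ex <- k - j

print(c(ss_ex_reg = ss_ex_reg, ss_ex_err = ss_ex_err, df_ex = df_ex))
```

The two values of 𝑆𝑆𝐸𝑋 should agree up to floating-point error, because both models include an intercept and share the same total sum of squares.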

Partial F-Test Statistic

the significance of the additional explained variation (measured by 𝑆𝑆𝐸𝑋) is tested with the partial F-test statistic, where 𝑛 is the number of observations:

  • 𝐹 = 𝑀𝑆𝐸𝑋 / 𝑀𝑆𝐸𝑅𝑅(𝑀𝐿)
  • 𝐹 = [𝑆𝑆𝐸𝑋/(𝑘-𝑗)] / [𝑆𝑆𝐸𝑅𝑅(𝑀𝐿)/(𝑛-𝑘-1)]
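
The statistic can be computed by hand and checked against R's anova(), which reports a partial F-test when given two nested lm fits. This is a minimal sketch; the mtcars data and the chosen predictors are assumptions for illustration only.

```r
# Illustrative nested models (dataset and predictors are assumed, not from the notes)
m_s <- lm(mpg ~ wt, data = mtcars)              # smaller model M_S
m_l <- lm(mpg ~ wt + hp + qsec, data = mtcars)  # larger model M_L

n <- nrow(mtcars)
j <- length(coef(m_s)) - 1   # number of predictors in M_S
k <- length(coef(m_l)) - 1   # number of predictors in M_L

# For an lm fit, deviance() returns the residual (error) sum of squares
ss_ex    <- deviance(m_s) - deviance(m_l)
ms_ex    <- ss_ex / (k - j)
ms_err_l <- deviance(m_l) / (n - k - 1)

f_obs <- ms_ex / ms_err_l

# Built-in comparison of the two nested fits; row 2 holds the test of M_L against M_S
f_anova <- anova(m_s, m_l)$F[2]

print(c(f_obs = f_obs, f_anova = f_anova))
```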

the predictors that are in 𝑀𝐿 but not in 𝑀𝑆, {𝑥𝑗+1, …, 𝑥𝑘}, affect the response 𝑌 if at least one of their slopes {𝜃𝑗+1, …, 𝜃𝑘} is nonzero in 𝑀𝐿. The partial F-test is a test of:

  • null hypothesis 𝐻0:
    • 𝜃𝑗+1 = … = 𝜃𝑘 = 0
    • the extra predictors in the larger (full) model 𝑀𝐿 explain no additional variation in 𝑌 beyond the smaller (reduced) model 𝑀𝑆
  • alternative hypothesis 𝐻𝐴:
    • at least one of {𝜃𝑗+1, …, 𝜃𝑘} is ≠ 0
    • the extra predictors in 𝑀𝐿 explain additional variation in 𝑌 beyond 𝑀𝑆

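distribution of the test statistic under the null hypothesis 𝐻0:

  • 𝐹 ~ 𝐹(𝑘-𝑗, 𝑛-𝑘-1), i.e. an F distribution with 𝑘-𝑗 numerator and 𝑛-𝑘-1 denominator degrees of freedom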

The partial F-test is used for sequential selection of predictors in multiple regression. In R, the test is carried out with:

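  • rejection region (at significance level 𝛼 = 0.05):
    • 𝐹𝑜𝑏𝑠 > qf(0.95, df1=k-j, df2=n-k-1)
  • p-value (upper-tail probability of the null distribution):
    • pf(𝐹𝑜𝑏𝑠, df1=k-j, df2=n-k-1, lower.tail=FALSE)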
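
A short sketch of the decision step follows. It uses the upper tail of the F distribution for the p-value, since large values of 𝐹 are evidence against 𝐻0. The mtcars data, the predictors, and the level alpha = 0.05 are assumptions chosen for illustration.

```r
# Illustrative nested models (dataset and predictors are assumed, not from the notes)
m_s <- lm(mpg ~ wt, data = mtcars)
m_l <- lm(mpg ~ wt + hp + qsec, data = mtcars)

n <- nrow(mtcars)
j <- length(coef(m_s)) - 1
k <- length(coef(m_l)) - 1

# Observed partial F statistic: (SS_EX / (k - j)) / (SS_ERR(M_L) / (n - k - 1))
f_obs <- ((deviance(m_s) - deviance(m_l)) / (k - j)) /
         (deviance(m_l) / (n - k - 1))

alpha <- 0.05

# Critical value: the (1 - alpha) quantile of the null F distribution
f_crit <- qf(1 - alpha, df1 = k - j, df2 = n - k - 1)

# p-value: probability of an F at least as large as f_obs under H0 (upper tail)
p_value <- pf(f_obs, df1 = k - j, df2 = n - k - 1, lower.tail = FALSE)

reject_h0 <- f_obs > f_crit

# Cross-check against the p-value reported by anova() for the nested comparison
p_anova <- anova(m_s, m_l)[2, "Pr(>F)"]

print(c(f_obs = f_obs, f_crit = f_crit, p_value = p_value, p_anova = p_anova))
print(reject_h0)
```

Rejecting when 𝐹𝑜𝑏𝑠 exceeds the critical value is equivalent to rejecting when the p-value is below alpha, so the two rules should always give the same decision.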

Example R Code