Standard Regression Assumptions
  • the observed responses 𝑦(𝑖) are independent Normal random variables with:
    • expectation = 𝐄[𝑌(𝑖)|𝑋1(𝑖), …, 𝑋𝑘(𝑖)] = 𝜃0 + 𝜃1𝑋1(𝑖) + … + 𝜃𝑘𝑋𝑘(𝑖)
    • variance = 𝑉𝑎𝑟(𝑌(𝑖)|𝑋1(𝑖), …, 𝑋𝑘(𝑖)) = 𝜎² (the same constant for all 𝑋1(𝑖), …, 𝑋𝑘(𝑖))
  • predictors {𝑋1(𝑖), …, 𝑋𝑘(𝑖)} are considered non-random because they are observed
  • as a consequence:

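Desired Properties and the Assumptions They Require
  • 𝜃̂_OLS is an unbiased estimate of 𝜃, which requires:
    • 𝐄[𝜖(𝑖)] = 0, ∀𝑖
  • 𝜃̂_OLS is an unbiased estimate of 𝜃 and is the BLUE (best linear unbiased estimator), which requires:
    • 𝐄[𝜖(𝑖)] = 0, ∀𝑖
    • 𝑉𝑎𝑟(𝜖(𝑖)) = constant < ∞, ∀𝑖
    • 𝐶𝑜𝑣(𝜖(𝑖), 𝜖(𝑗)) = 0, ∀𝑖≠𝑗
  • 𝜃̂_OLS is an unbiased estimate of 𝜃, is the BLUE, and is mathematically equivalent to the MLE, which requires:
    • the three conditions above
    • Normally distributed errors 𝜖(𝑖), as in the standard regression assumptions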
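The unbiasedness property can be checked by simulation. The following is a minimal sketch, assuming a single predictor (𝑘 = 1), fixed predictor values, and independent Normal errors with zero mean and constant variance. The true parameter values, sample size, and function names (`ols_fit`, `simulate_ols`) are illustrative choices, not part of the original notes.

```python
import random
import statistics


def ols_fit(x, y):
    """Closed-form OLS for one predictor; returns (theta0_hat, theta1_hat)."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    theta1_hat = sxy / sxx
    theta0_hat = y_bar - theta1_hat * x_bar
    return theta0_hat, theta1_hat


def simulate_ols(theta0=2.0, theta1=0.5, sigma=1.0, n=50, n_reps=2000, seed=0):
    """Average the OLS estimates over many datasets drawn under the assumptions."""
    rng = random.Random(seed)
    # Predictors are held fixed across replications (treated as non-random).
    x = [i / 10 for i in range(n)]
    estimates = []
    for _ in range(n_reps):
        # Errors are independent Normal(0, sigma^2): zero mean, constant variance.
        y = [theta0 + theta1 * xi + rng.gauss(0.0, sigma) for xi in x]
        estimates.append(ols_fit(x, y))
    mean_theta0 = statistics.fmean(e[0] for e in estimates)
    mean_theta1 = statistics.fmean(e[1] for e in estimates)
    return mean_theta0, mean_theta1


mean_theta0, mean_theta1 = simulate_ols()
print(f"average theta0_hat = {mean_theta0:.3f} (true value 2.0)")
print(f"average theta1_hat = {mean_theta1:.3f} (true value 0.5)")
```

If the estimator is unbiased, the averages should settle near the true values as `n_reps` grows. Replacing `rng.gauss(0.0, sigma)` with an error that has non-zero mean is one way to see the intercept estimate drift away from its true value.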
Assumptions in ANOVA
  • normality of the sampling distribution of means - the sample means are normally distributed
  • errors 𝑒(𝑖) and 𝑒(𝑗) are independent of each other for 𝑖 ≠ 𝑗 (where 𝑒(𝑖) = 𝑦̂(𝑖) - 𝑦(𝑖))
  • absence of outliers - outliers have been removed from the dataset
  • homogeneity of variance - population variances at different levels of each independent variable {𝑋1, …, 𝑋𝑘} are equal
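As a rough illustration of the homogeneity-of-variance assumption, the sketch below compares the sample variances of the response across the levels of one independent variable. The data values, group names, and helper names (`group_variances`, `variance_ratio`) are made up for illustration. The ratio is only an informal diagnostic, not a formal test.

```python
import statistics


def group_variances(groups):
    """Sample variance of the response within each level of the factor."""
    return {name: statistics.variance(values) for name, values in groups.items()}


def variance_ratio(groups):
    """Largest group variance divided by the smallest (1.0 means identical)."""
    variances = group_variances(groups)
    return max(variances.values()) / min(variances.values())


# Hypothetical responses at three levels of a single independent variable.
groups = {
    "low": [4.1, 5.0, 4.6, 5.3, 4.8],
    "medium": [5.9, 6.4, 6.1, 5.5, 6.6],
    "high": [7.2, 6.8, 7.9, 7.4, 7.0],
}

variances = group_variances(groups)
ratio = variance_ratio(groups)
for name, var in variances.items():
    print(f"{name}: sample variance = {var:.3f}")
print(f"max/min variance ratio = {ratio:.2f}")
```

A ratio close to 1 is consistent with equal population variances. A large ratio suggests the assumption deserves a closer look with a formal test.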