Standard Regression Assumptions
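- the observed responses $y^{(i)}$ are independent Normal random variables with:
  - expectation: $\mathbf{E}[Y^{(i)} \mid X_1^{(i)}, \dots, X_k^{(i)}] = \theta_0 + \theta_1 X_1^{(i)} + \dots + \theta_k X_k^{(i)}$
  - variance: $\mathrm{Var}(Y^{(i)} \mid X_1^{(i)}, \dots, X_k^{(i)}) = \sigma^2$ (the same constant for all values of $X_1^{(i)}, \dots, X_k^{(i)}$)
- predictors $\{X_1^{(i)}, \dots, X_k^{(i)}\}$ are treated as non-random (fixed) because they are observed
- as a consequence:
  - the least-squares estimates $\hat{\theta}_0, \dots, \hat{\theta}_k$ are Normally distributed, since each is a linear combination of the Normal responses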
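The consequence stated above can be checked with a small simulation. The sketch below assumes a single predictor ($k = 1$) and made-up values for $\theta_0$, $\theta_1$ and $\sigma$. None of these numbers come from the notes, and the function name `simulate_slope_estimates` is hypothetical. It keeps the predictor values fixed, redraws the Normal responses many times, and refits the closed-form least-squares slope each time.

```python
import random
import statistics

def simulate_slope_estimates(theta0=1.0, theta1=2.0, sigma=0.5,
                             n=50, n_repeats=2000, seed=0):
    """Redraw responses under the regression assumptions and refit the slope.

    All parameter values are illustrative assumptions, not values from the notes.
    """
    rng = random.Random(seed)
    # Fixed (non-random) predictor values, reused in every repetition.
    x = [i / (n - 1) for i in range(n)]
    x_bar = statistics.fmean(x)
    sxx = sum((xi - x_bar) ** 2 for xi in x)

    slopes = []
    for _ in range(n_repeats):
        # Independent Normal responses with mean theta0 + theta1 * x
        # and the same standard deviation sigma for every x.
        y = [rng.gauss(theta0 + theta1 * xi, sigma) for xi in x]
        y_bar = statistics.fmean(y)
        sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
        slopes.append(sxy / sxx)  # closed-form least-squares slope

    # Standard deviation of the slope estimate implied by the assumptions.
    theoretical_sd = sigma / sxx ** 0.5
    return slopes, theoretical_sd

slopes, theoretical_sd = simulate_slope_estimates()
print("mean of slope estimates:", statistics.fmean(slopes))
print("empirical sd of estimates:", statistics.stdev(slopes))
print("theoretical sd:", theoretical_sd)
```

If the assumptions hold, the slope estimates should centre on the true `theta1` and their spread should be close to $\sigma / \sqrt{\sum_i (x^{(i)} - \bar{x})^2}$. A histogram of `slopes` would look approximately bell-shaped.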
| Desired Properties | Required Assumptions |
|---|---|
Assumptions in ANOVA
- normality of sampling distribution of means - the distribution of sample means is normally distributed
- independence of errors - errors $e^{(i)}$ and $e^{(j)}$ are independent of each other for $i \neq j$ (where $e^{(i)} = y^{(i)} - \hat{y}^{(i)}$)
- absence of outliers - outliers have been removed from the dataset
- homogeneity of variance - population variances at different levels of each independent variable $\{X_1, \dots, X_k\}$ are equal
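A rough way to inspect some of these assumptions on a small dataset is sketched below. The data are hypothetical measurements for three levels of a single factor, and the helper names (`variance_ratio`, `residuals`, `f_statistic`) are illustrative rather than taken from any library. The sketch compares the group variances, computes residuals around each group mean, and evaluates the one-way ANOVA F statistic by hand.

```python
import statistics

# Hypothetical measurements for three levels of one factor (illustrative only).
groups = {
    "A": [4.1, 5.0, 4.8, 5.3, 4.6],
    "B": [5.9, 6.2, 5.5, 6.8, 6.1],
    "C": [5.0, 4.7, 5.6, 5.2, 4.9],
}

def variance_ratio(groups):
    """Largest sample variance divided by the smallest, across levels."""
    variances = [statistics.variance(values) for values in groups.values()]
    return max(variances) / min(variances)

def residuals(groups):
    """Residuals around each group mean (the fitted value in one-way ANOVA).

    The sign convention does not matter for the checks in this sketch.
    """
    return {
        name: [y - statistics.fmean(values) for y in values]
        for name, values in groups.items()
    }

def f_statistic(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    all_values = [y for values in groups.values() for y in values]
    grand_mean = statistics.fmean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(
        len(values) * (statistics.fmean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    ss_within = sum(
        (y - statistics.fmean(values)) ** 2
        for values in groups.values()
        for y in values
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))

print("variance ratio:", variance_ratio(groups))
print("F statistic:", f_statistic(groups))
```

A variance ratio near 1 is consistent with homogeneity of variance, while a large ratio suggests the assumption deserves a closer look with a formal test such as Levene's. Unusually large residuals flag candidate outliers, though whether to remove them remains a judgment call. Independence of errors cannot be verified from the numbers alone; it depends mainly on how the data were collected.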