Omitted variable bias occurs when a regression model leaves out relevant independent variables, known as confounding variables. The model is then forced to attribute the effects of the omitted variables to the variables it does include, which biases their coefficient estimates.
Conditions that Cause Omitted Variable Bias
- the omitted variable 𝑍 must correlate with the dependent variable 𝑌
- the omitted variable 𝑍 must correlate with at least one independent variable 𝑋 in the regression model
- in the observed data, that independent variable 𝑋 correlates with the dependent variable 𝑌 (part or all of this correlation may be due to 𝑍 rather than 𝑋)
[Figure: confounding variable and omitted variable bias]
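The first two conditions can be read off the standard omitted variable bias formula for a single included regressor. As a sketch, assume the true model is linear in 𝑋 and 𝑍 with an error term uncorrelated with both; the coefficient names β and α are notation introduced here, not taken from the notes above.

```latex
\begin{aligned}
\text{true model:}\quad & Y = \beta_0 + \beta_X X + \beta_Z Z + \varepsilon \\
\text{fitted model ($Z$ omitted):}\quad & Y = \alpha_0 + \alpha_X X + u \\
\text{large-sample estimate:}\quad & \hat{\alpha}_X \;\to\; \beta_X + \beta_Z \,\frac{\operatorname{Cov}(X, Z)}{\operatorname{Var}(X)}
\end{aligned}
```

Under these assumptions the bias term vanishes when β_Z = 0 (𝑍 does not affect 𝑌) or when Cov(𝑋, 𝑍) = 0 (𝑍 is unrelated to 𝑋), which is why both conditions are needed.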
Effects of Omitted Variable Bias
The estimated effect of 𝑋 can be:
- overestimated
- underestimated
- masked
- sign reversed
When the bias contributed by 𝑍, 𝐸𝑓𝑓𝑒𝑐𝑡(𝑍), acts in the same direction as the true effect of 𝑋, 𝐸𝑓𝑓𝑒𝑐𝑡(𝑋), with |𝐸𝑓𝑓𝑒𝑐𝑡(𝑍)| > 0 and |𝐸𝑓𝑓𝑒𝑐𝑡(𝑋)| > 0, the effect of 𝑋 is overestimated.
[Figures: effect of 𝑋 overestimated]
When the bias contributed by 𝑍, 𝐸𝑓𝑓𝑒𝑐𝑡(𝑍), opposes the true effect of 𝑋, 𝐸𝑓𝑓𝑒𝑐𝑡(𝑋), and |𝐸𝑓𝑓𝑒𝑐𝑡(𝑍)| < |𝐸𝑓𝑓𝑒𝑐𝑡(𝑋)|, the effect of 𝑋 is underestimated.
[Figures: effect of 𝑋 underestimated]
When the bias contributed by 𝑍, 𝐸𝑓𝑓𝑒𝑐𝑡(𝑍), opposes the true effect of 𝑋, 𝐸𝑓𝑓𝑒𝑐𝑡(𝑋), and |𝐸𝑓𝑓𝑒𝑐𝑡(𝑍)| = |𝐸𝑓𝑓𝑒𝑐𝑡(𝑋)|, the two cancel and the effect of 𝑋 is masked.
[Figures: effect of 𝑋 masked]
When the bias contributed by 𝑍, 𝐸𝑓𝑓𝑒𝑐𝑡(𝑍), opposes the true effect of 𝑋, 𝐸𝑓𝑓𝑒𝑐𝑡(𝑋), and |𝐸𝑓𝑓𝑒𝑐𝑡(𝑍)| > |𝐸𝑓𝑓𝑒𝑐𝑡(𝑋)|, the estimated effect of 𝑋 is sign reversed.
[Figures: effect of 𝑋 sign reversed]
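The four cases can be reproduced with a small simulation. This is a minimal sketch on synthetic data, not a general tool: the coefficients `beta_x`, `beta_z`, and `delta` (the strength of the link from 𝑋 to 𝑍), the helper names, and the sample size are all illustrative assumptions. Each case regresses 𝑌 on 𝑋 alone and compares the estimated slope with the true effect of 𝑋.

```python
import random


def simple_slope(x, y):
    """OLS slope of y on x (with intercept): cov(x, y) / var(x)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def short_regression_slope(beta_x, beta_z, delta, n=20000, seed=0):
    """Simulate Y = beta_x*X + beta_z*Z + noise with Z = delta*X + noise,
    then regress Y on X alone (Z omitted) and return the estimated slope."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    z = [delta * xi + rng.gauss(0, 1) for xi in x]
    y = [beta_x * xi + beta_z * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]
    return simple_slope(x, y)


# (beta_x, beta_z, delta); the bias passed through Z is beta_z * delta.
cases = {
    "overestimated": (1.0, 1.0, 0.5),    # bias +0.5: same direction as beta_x
    "underestimated": (1.0, -1.0, 0.5),  # bias -0.5: opposes, smaller magnitude
    "masked": (1.0, -2.0, 0.5),          # bias -1.0: opposes, equal magnitude
    "sign reversed": (1.0, -3.0, 0.5),   # bias -1.5: opposes, larger magnitude
}

results = {}
for name, (beta_x, beta_z, delta) in cases.items():
    estimate = short_regression_slope(beta_x, beta_z, delta)
    results[name] = estimate
    expected = beta_x + beta_z * delta  # true effect plus bias
    print(f"{name:15s} true={beta_x:+.2f} expected={expected:+.2f} estimated={estimate:+.2f}")
```

In each case the estimate should land close to the true effect plus the bias term, up to sampling noise. Only the sign and size of the bias relative to the true effect decide which case applies.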
How to Detect Omitted Variable Bias
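For omitted variable bias to exist, an independent variable must correlate with the model's error term. Consequently, we can plot the residuals against the variables in our model and against any candidate variables that were left out. A systematic pattern in a plot, rather than random scatter, tells us that there is a problem and points us towards the solution. Because least squares makes the residuals linearly uncorrelated with the included variables by construction, a plot against a candidate omitted variable is often the more revealing one.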
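A numeric stand-in for the residual plot is sketched here on synthetic data. It assumes that a candidate omitted variable `z` was actually measured, which is often not the case in practice; the data-generating coefficients and helper names are illustrative. The sketch fits 𝑌 on 𝑋 alone, then checks how strongly the residuals correlate with 𝑋 and with 𝑍.

```python
import random


def mean(values):
    return sum(values) / len(values)


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return my - slope * mx, slope


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    var_a = sum((p - ma) ** 2 for p in a)
    var_b = sum((q - mb) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


# Synthetic data: Z depends on X, and Y depends on both X and Z.
rng = random.Random(42)
n = 5000
x = [rng.gauss(0, 1) for _ in range(n)]
z = [0.6 * xi + rng.gauss(0, 1) for xi in x]
y = [1.0 * xi + 2.0 * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]

# Fit the short model (Z omitted) and compute its residuals.
intercept, slope = fit_line(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

corr_resid_x = correlation(residuals, x)  # zero by construction of least squares
corr_resid_z = correlation(residuals, z)  # clearly nonzero when Z is omitted

print(f"slope on X with Z omitted: {slope:.2f} (true effect of X is 1.00)")
print(f"corr(residuals, X) = {corr_resid_x:.3f}")
print(f"corr(residuals, Z) = {corr_resid_z:.3f}")
```

A scatter plot of `residuals` against `z` would show the same relationship visually. The residuals against `x` look like random scatter here because the omitted relationship is linear.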