Bias/Underfit vs Variance/Overfit - Linear Regression
Let’s say we are using Ordinary Least Squares (OLS) regression as our learning algorithm.
The data is given below:
| $x_1$ = size | $x_2$ = # of rooms | … | $y$ = price |
|---|---|---|---|
| 2104 | 3 | … | 400 |
| 1600 | 2 | … | 330 |
| 2400 | 4 | … | 369 |
| … | … | … | … |
| 1416 | 2 | … | 232 |
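We can always find a regression function $f(x)$ that passes through all the observed points without any error (i.e. $\sum_{1 \le i \le n} e_i^2 = 0$, where $e_i$ is the residual of the $i$-th observation), for example a polynomial of sufficiently high degree, provided no two observations have the same inputs but different prices. Such a function is likely to overfit.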
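The zero-error claim can be checked with a minimal sketch. It assumes only the four fully listed rows of the table and uses the size feature alone; standardizing the input is a choice made here for numerical stability, not something the notes prescribe. With 4 points, a polynomial of degree 3 has enough coefficients to hit every point, while a straight line does not.

```python
import numpy as np

# Sizes and prices from the four fully listed rows of the table.
size = np.array([2104.0, 1600.0, 2400.0, 1416.0])
price = np.array([400.0, 330.0, 369.0, 232.0])

# Standardize the input so that powers of x stay well behaved (an assumption
# of this sketch).
x = (size - size.mean()) / size.std()

sse_by_degree = {}
for degree in (1, len(x) - 1):
    # Least-squares polynomial fit of the given degree.
    coeffs = np.polyfit(x, price, deg=degree)
    residuals = price - np.polyval(coeffs, x)
    sse_by_degree[degree] = float(np.sum(residuals ** 2))

for degree, sse in sse_by_degree.items():
    print(f"degree {degree}: sum of squared errors = {sse:.3e}")
```

The degree-3 fit drives the training error to (numerically) zero, which says nothing about how well it would predict a house that is not in the table.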
We want to find a line equation 𝑓(𝑥) that neither:
- underfits (i.e. high bias)
- overfits (i.e. high variance)
Method 1 - Comparing Line Equations of Different Polynomial Degrees
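we choose a set of candidate line equations of increasing polynomial degree (the superscript on $y$ is the degree):
- $y^{(1)} = \theta_0 + \theta_1 x$
- $y^{(2)} = \theta_0 + \theta_1 x + \theta_2 x^2$
- …
- $y^{(d)} = \theta_0 + \theta_1 x + \theta_2 x^2 + \dots + \theta_d x^d$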
we split the data into 2 sets:
- training set
- test set / cross-validation set
we train each candidate line equation on the training set only. Each candidate gives us a fitted model, and we measure its performance by calculating its error over both data sets. Then we plot the two errors against the polynomial degree.
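Figure: bias-variance-polynomial-degree.drawio
we choose the model in the sweet spot region, where the cross-validation error is lowest. At lower degrees both errors are high (underfit); at higher degrees the training error keeps falling while the cross-validation error rises (overfit).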
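Method 1 can be sketched roughly as follows. The data here is hypothetical (a noisy cubic generated with a fixed seed stands in for the housing data), and the 40/20 split, the degree range 1 to 10, and the helper name `mse` are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: a noisy cubic, NOT the housing data from the table.
x = rng.uniform(-1.0, 1.0, size=60)
y = 1.0 + 2.0 * x - 3.0 * x**3 + rng.normal(scale=0.3, size=x.size)

# Split into a training set and a cross-validation set.
idx = rng.permutation(x.size)
train, cv = idx[:40], idx[40:]


def mse(coeffs, xs, ys):
    """Mean squared error of a fitted polynomial on the given points."""
    return float(np.mean((np.polyval(coeffs, xs) - ys) ** 2))


degrees = list(range(1, 11))
train_err, cv_err = [], []
for d in degrees:
    # Fit on the training set only, then score on both sets.
    coeffs = np.polyfit(x[train], y[train], deg=d)
    train_err.append(mse(coeffs, x[train], y[train]))
    cv_err.append(mse(coeffs, x[cv], y[cv]))

# The "sweet spot" is taken here as the degree with the lowest CV error.
best_degree = degrees[int(np.argmin(cv_err))]

for d, tr, va in zip(degrees, train_err, cv_err):
    print(f"degree {d:2d}: train={tr:.4f}  cv={va:.4f}")
print("chosen degree:", best_degree)
```

Plotting `train_err` and `cv_err` against `degrees` gives the kind of curve the figure describes: the training error can only go down as the degree grows, so the choice has to be made on the cross-validation error.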
Method 2 - Comparing Different Regularization Values
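- single model: $h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2 + \dots + \theta_d x^d$
- cost/error function: $J(\theta) = \frac{1}{2n}\sum_{1 \le i \le n}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \frac{\lambda}{2d}\sum_{1 \le j \le d}\theta_j^2$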
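The model and cost function above translate fairly directly into code. This is a minimal sketch: the function names `hypothesis` and `cost` are made up for illustration, and the penalty follows the formula as written here, summing over $\theta_1 \dots \theta_d$ and leaving $\theta_0$ unpenalized.

```python
import numpy as np


def hypothesis(theta, x):
    """Evaluate theta[0] + theta[1]*x + ... + theta[d]*x**d for each x."""
    powers = np.vander(x, N=len(theta), increasing=True)  # columns 1, x, x^2, ...
    return powers @ theta


def cost(theta, x, y, lam):
    """Regularized cost J(theta) as written above (hypothetical helper)."""
    n = len(y)
    d = len(theta) - 1
    fit_term = np.sum((hypothesis(theta, x) - y) ** 2) / (2 * n)
    penalty = lam / (2 * d) * np.sum(theta[1:] ** 2)  # theta_0 is excluded
    return float(fit_term + penalty)


theta = np.array([1.0, 2.0, 3.0])  # d = 2
x = np.array([0.0, 1.0])
y = np.array([1.0, 6.0])           # exactly h_theta(x), so the fit term is 0

print(cost(theta, x, y, lam=0.0))  # 0.0
print(cost(theta, x, y, lam=1.0))  # 3.25 = (1 / (2*2)) * (2**2 + 3**2)
```

With a perfect fit, the whole cost at $\lambda = 1$ comes from the penalty, which shows how the second term pushes the optimizer towards smaller coefficients.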
try different regularization values $\lambda$:
- $\lambda_1 = 0$
- $\lambda_2 = 0.01$
- $\lambda_3 = 0.02$
- $\lambda_4 = 0.04$
- $\lambda_5 = 0.08$
- …
- $\lambda_{12} = 10.24$
we split the data into 2 sets:
- training set
- test set / cross-validation set
we train the model on the training set once for each value of 𝜆. Each value gives us a fitted model, and we measure its performance by calculating its error over both data sets. Then we plot the two errors against 𝜆.
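Figure: bias-variance-as-a-function-of-regularization-parameter.drawio
we choose the model in the sweet spot region, where the cross-validation error is lowest. With 𝜆 too small the model overfits (high variance); with 𝜆 too large the coefficients are shrunk so much that it underfits (high bias).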
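Method 2 can be sketched along these lines. Several things are assumptions of this sketch rather than part of the notes: the data is hypothetical (a noisy cubic), the degree is fixed at `d = 8`, the 𝜆 grid is generated by doubling from 0.01 up to 10.24, the model is fitted with the closed-form solution of the regularized least-squares problem, and the reported errors are plain mean squared errors without the penalty term.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: a noisy cubic, NOT the housing data from the table.
x = rng.uniform(-1.0, 1.0, size=60)
y = 1.0 + 2.0 * x - 3.0 * x**3 + rng.normal(scale=0.3, size=x.size)
idx = rng.permutation(x.size)
train, cv = idx[:40], idx[40:]

d = 8  # fixed polynomial degree (assumed for this sketch)


def design(xs):
    """Design matrix with columns 1, x, x^2, ..., x^d."""
    return np.vander(xs, N=d + 1, increasing=True)


def fit(xs, ys, lam):
    """Minimize (1/2n)*sum(residual^2) + (lam/2d)*sum(theta_j^2, j >= 1)."""
    X = design(xs)
    D = np.eye(d + 1)
    D[0, 0] = 0.0  # theta_0 is not penalized
    # Setting the gradient to zero gives (X^T X + (n*lam/d) D) theta = X^T y.
    A = X.T @ X + (len(ys) * lam / d) * D
    return np.linalg.solve(A, X.T @ ys)


def mse(theta, xs, ys):
    """Plain mean squared error, without the regularization term."""
    return float(np.mean((design(xs) @ theta - ys) ** 2))


lambdas = [0.0] + [0.01 * 2**k for k in range(11)]  # 0, 0.01, 0.02, ..., 10.24
train_err, cv_err = [], []
for lam in lambdas:
    theta = fit(x[train], y[train], lam)
    train_err.append(mse(theta, x[train], y[train]))
    cv_err.append(mse(theta, x[cv], y[cv]))

best_lambda = lambdas[int(np.argmin(cv_err))]

for lam, tr, va in zip(lambdas, train_err, cv_err):
    print(f"lambda {lam:6.2f}: train={tr:.4f}  cv={va:.4f}")
print("chosen lambda:", best_lambda)
```

The training error is lowest at 𝜆 = 0 and grows as 𝜆 increases, so, as with the polynomial degree, the value of 𝜆 is picked from the cross-validation error.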