Bias/Underfit vs Variance/Overfit - Linear Regression

Suppose we use Ordinary Least Squares (OLS) regression as our learning algorithm.

The data is given below:

𝑥₁ = size    𝑥₂ = # of rooms    𝑦 = price
2104         3                  400
1600         2                  330
2400         4                  369
1416         2                  232
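
As a concrete starting point, an OLS fit to the four observations can be sketched as follows. This is a minimal illustration, assuming NumPy and a model with an intercept, 𝑓(𝑥) = b0 + b1·𝑥₁ + b2·𝑥₂; the variable names are chosen here and do not come from the notes.

```python
import numpy as np

# Observations from the table: size, number of rooms, price.
size = np.array([2104.0, 1600.0, 2400.0, 1416.0])
rooms = np.array([3.0, 2.0, 4.0, 2.0])
price = np.array([400.0, 330.0, 369.0, 232.0])

# Design matrix with an intercept column (assumed model: b0 + b1*size + b2*rooms).
X = np.column_stack([np.ones_like(size), size, rooms])

# OLS picks the coefficients that minimise the sum of squared residuals.
coef, *_ = np.linalg.lstsq(X, price, rcond=None)

fitted = X @ coef
residuals = price - fitted            # e_i = y_i - f(x_i)
rss = float(np.sum(residuals ** 2))   # sum of squared residuals

print("coefficients (b0, b1, b2):", coef)
print("residuals e_i:", residuals)
print("sum of squared residuals:", rss)
```

With three coefficients and four observations, the fit generally cannot pass through every point, so the sum of squared residuals is positive.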

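We can always find a regression function 𝑓(𝑥) that passes through all the observed points with zero error (i.e. $\sum_{i=1}^{n} e_i^2 = 0$, where $e_i = y_i - f(x_i)$ is the residual of the $i$-th observation), provided no two observations have identical inputs but different outputs. Such a function is likely to overfit, however: it reproduces the noise in the observed data and may predict poorly on new data.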
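
A zero-error fit is easy to demonstrate on this data. The sketch below uses only the size column (an assumption made to keep the example one-dimensional) and fits a cubic, which has four free coefficients and can therefore pass through all four points. It relies on NumPy's `Polynomial.fit`; the helper `sum_squared_error` is a name introduced here for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Size and price columns from the table.
size = np.array([2104.0, 1600.0, 2400.0, 1416.0])
price = np.array([400.0, 330.0, 369.0, 232.0])

# Polynomial.fit rescales x internally, which keeps the fit numerically stable.
line = Polynomial.fit(size, price, deg=1)   # two coefficients
cubic = Polynomial.fit(size, price, deg=3)  # four coefficients, four points

def sum_squared_error(model):
    """Sum of squared residuals of a fitted model on the observed points."""
    residuals = price - model(size)
    return float(np.sum(residuals ** 2))

print("line  SSE:", sum_squared_error(line))
print("cubic SSE:", sum_squared_error(cubic))
```

The cubic drives the error on the observed points to zero (up to rounding), but that says nothing about how it behaves on sizes it has not seen.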

We want to find a regression function 𝑓(𝑥) that neither:

  • underfits (i.e. high bias)
  • overfits (i.e. high variance)

Method 1 - Comparing Line Equations of Different Polynomial Degrees

Method 2 - Comparing Different Regularization Values
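
Both methods come down to fitting several candidate models and comparing them on data they were not fitted to. The sketch below is one possible way to do that, not necessarily the procedure these notes have in mind: it assumes leave-one-out cross-validation as the comparison criterion and an L2 (ridge) penalty as the regularizer, and it uses only the size column. The helper names (`design`, `fit`, `train_mse`, `loo_mse`) and the candidate degrees and penalty values are hypothetical. With four observations the numbers are noisy and purely illustrative.

```python
import numpy as np

# Size and price columns from the table (one feature keeps the polynomial terms simple).
size = np.array([2104.0, 1600.0, 2400.0, 1416.0])
price = np.array([400.0, 330.0, 369.0, 232.0])

# Change of units so that powers of x stay well conditioned.
x = (size - size.mean()) / (size.max() - size.min())

def design(x, degree):
    """Design matrix with columns 1, x, x^2, ..., x^degree."""
    return np.vander(x, degree + 1, increasing=True)

def fit(X, y, lam=0.0):
    """Minimise ||y - X b||^2 + lam * ||b[1:]||^2; lam = 0 is plain OLS.

    Solved as an augmented least-squares problem; the intercept b[0] is not penalised.
    """
    n_coef = X.shape[1]
    penalty = np.sqrt(lam) * np.eye(n_coef)
    penalty[0, 0] = 0.0
    A = np.vstack([X, penalty])
    b = np.concatenate([y, np.zeros(n_coef)])
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coef

def train_mse(degree, lam=0.0):
    """Mean squared error on the same points the model was fitted to."""
    X = design(x, degree)
    return float(np.mean((price - X @ fit(X, price, lam)) ** 2))

def loo_mse(degree, lam=0.0):
    """Leave-one-out error: fit on three points, score on the held-out fourth."""
    errors = []
    for i in range(len(x)):
        keep = np.arange(len(x)) != i
        coef = fit(design(x[keep], degree), price[keep], lam)
        prediction = design(x[i:i + 1], degree) @ coef
        errors.append((price[i] - prediction[0]) ** 2)
    return float(np.mean(errors))

# Method 1: vary the polynomial degree, no regularization.
for degree in (0, 1, 2):
    print(f"degree={degree}  train MSE={train_mse(degree):.2f}  LOO MSE={loo_mse(degree):.2f}")

# Method 2: fix a flexible model (degree 2) and vary the regularization strength.
for lam in (0.0, 0.01, 0.1, 1.0, 10.0):
    print(f"lambda={lam}  train MSE={train_mse(2, lam):.2f}  LOO MSE={loo_mse(2, lam):.2f}")
```

Training error alone always favours the most flexible candidate (highest degree, smallest penalty), which is why the comparison is made on held-out error instead. A candidate that is too rigid shows high error on both, while one that is too flexible shows low training error but high held-out error.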