Methods for estimating the unknown coefficients {𝜃0, …, 𝜃𝑘} of 𝐄[𝑌|𝑋1=𝑥1, …, 𝑋𝑘=𝑥𝑘] = ℎ(𝑥1, …, 𝑥𝑘) = 𝑦̂ = 𝜃0 + 𝜃1𝑓1(𝑥1, …, 𝑥𝑘) + … + 𝜃𝑘𝑓𝑘(𝑥1, …, 𝑥𝑘)

Method

Description

Method of Least Squares
(Gradient Descent)

  • idea: minimize the squared error via GRADIENT DESCENT
  • need to choose learning rate 𝛼
  • need many iterations
  • works well even when the number of features 𝑘 is large (each iteration costs 𝑂(𝑚𝑘) for 𝑚 training examples)
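
The gradient-descent bullets can be illustrated with a minimal sketch. It assumes the simplest feature choice 𝑓𝑗(𝑥) = 𝑥𝑗 and a cost of (1/2𝑚)·Σ(ℎ(𝑥) − 𝑦)²; the function names, the toy data, and the values of the learning rate and iteration count are illustrative assumptions, not part of these notes.

```python
def predict(theta, x):
    """h(x) = theta_0 + theta_1*x_1 + ... + theta_k*x_k (assumes f_j(x) = x_j)."""
    return theta[0] + sum(t * xj for t, xj in zip(theta[1:], x))


def gradient_descent(xs, ys, alpha=0.05, iterations=5000):
    """Batch gradient descent on the cost (1/2m) * sum((h(x) - y)^2).

    alpha (learning rate) and iterations are hand-picked for the toy data
    below; other data sets will usually need different values.
    """
    m, k = len(xs), len(xs[0])
    theta = [0.0] * (k + 1)
    for _ in range(iterations):
        errors = [predict(theta, x) - y for x, y in zip(xs, ys)]
        # Partial derivative of the cost with respect to each coefficient.
        grad = [sum(errors) / m]
        grad += [sum(e * x[j] for e, x in zip(errors, xs)) / m for j in range(k)]
        # Simultaneous update of all coefficients.
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta


# Toy data generated from y = 1 + 2x without noise (hypothetical example).
xs_gd = [[0.0], [1.0], [2.0], [3.0]]
ys_gd = [1.0, 3.0, 5.0, 7.0]
theta_gd = gradient_descent(xs_gd, ys_gd)
```

On this noise-free data the estimate should approach 𝜃0 = 1 and 𝜃1 = 2. A learning rate that is too large makes the iteration diverge, and one that is too small needs many more iterations, which is why 𝛼 has to be chosen.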

Method of Least Squares
(Projection Matrix - Normal Equation)

  • idea: minimize the squared error via the NORMAL EQUATIONS
  • no need to choose learning rate 𝛼
  • do not need to iterate
  • need to compute (𝑋ᵀ𝑋)⁻¹𝑋ᵀ, or the pseudoinverse 𝑉𝐷⁻¹𝑈ᵀ from the SVD 𝑋 = 𝑈𝐷𝑉ᵀ
  • slow if the number of features 𝑘 is large, because inverting the (𝑘+1)×(𝑘+1) matrix 𝑋ᵀ𝑋 costs 𝑂(𝑘³)
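
The normal-equation approach can be sketched in the same way. This is an illustration under assumptions, not a reference implementation: it again takes 𝑓𝑗(𝑥) = 𝑥𝑗, assumes 𝑋ᵀ𝑋 is invertible, and solves the linear system (𝑋ᵀ𝑋)𝜃 = 𝑋ᵀ𝑦 by Gaussian elimination instead of forming the inverse explicitly. The helper names `solve` and `normal_equation` are hypothetical.

```python
def solve(a, b):
    """Solve a*z = b by Gaussian elimination with partial pivoting.

    Assumes a is square and non-singular.
    """
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix [a | b]
    for col in range(n):
        # Move the row with the largest pivot candidate into place.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (aug[r][n] - s) / aug[r][r]
    return z


def normal_equation(xs, ys):
    """Least-squares coefficients from (X^T X) theta = X^T y, with no iteration."""
    design = [[1.0] + list(x) for x in xs]  # prepend the intercept column
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * y for row, y in zip(design, ys)) for i in range(p)]
    return solve(xtx, xty)


# Same hypothetical toy data as a noise-free line y = 1 + 2x.
xs_ne = [[0.0], [1.0], [2.0], [3.0]]
ys_ne = [1.0, 3.0, 5.0, 7.0]
theta_ne = normal_equation(xs_ne, ys_ne)
```

Building 𝑋ᵀ𝑋 takes one pass over the training examples, while the elimination step is cubic in the number of coefficients, so the cost is dominated by the feature count rather than by the number of examples.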

Maximum Likelihood Estimation

MAP (Bayesian Linear Regression)

Newton-Raphson (N-R) Technique

  • idea: TODO