Solving Linear Regression With Maximum Likelihood Estimation (MLE)

Instead of producing a single scalar prediction 𝑦̂, think of the model as producing a conditional distribution 𝐏(𝑦|𝒙).

With an infinitely large training set, we might see several training examples with the same input value 𝒙 but different values of 𝑦. The goal of the learning algorithm is now to fit the distribution 𝐏(𝑦|𝒙) to all of those different 𝑦 values that are compatible with 𝒙.

We define:

𝐏(𝑦|𝒙) = 𝒩(𝑦;𝑓(𝒙,𝜽),𝜎²)

where:

𝒩 - is the normal (Gaussian) distribution
𝑓(𝒙,𝜽) - is the model's prediction 𝑦̂, used as the mean of the Gaussian
𝜎² - is a constant variance, the same for every 𝒙
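
To make the definition concrete, here is a minimal Python sketch of 𝐏(𝑦|𝒙) = 𝒩(𝑦;𝑓(𝒙,𝜽),𝜎²). The linear form of `f`, the parameter values, and the variance are assumptions chosen only for illustration; nothing above fixes them.

```python
import math
import random

def f(x, theta):
    """Hypothetical linear model; theta = (intercept, slope) is an assumption."""
    b, w = theta
    return b + w * x

def normal_pdf(y, mean, var):
    """Density of the normal distribution N(y; mean, var)."""
    return math.exp(-(y - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def sample_y(x, theta, var, rng):
    """Draw one y from P(y|x) = N(y; f(x, theta), var)."""
    return rng.gauss(f(x, theta), math.sqrt(var))

rng = random.Random(0)   # fixed seed so the run is repeatable
theta = (1.0, 2.0)       # assumed parameters
var = 0.25               # assumed constant variance
x = 3.0

# The same input x produces several different y values,
# scattered around the mean f(x, theta).
ys = [sample_y(x, theta, var, rng) for _ in range(5)]
print("mean f(x, theta):", f(x, theta))
print("sampled y values:", ys)

# The density is highest at the mean and falls off away from it.
print("density at the mean:  ", normal_pdf(f(x, theta), f(x, theta), var))
print("density one unit away:", normal_pdf(f(x, theta) + 1.0, f(x, theta), var))
```

The model only controls where the Gaussian is centered; the spread around that center is the fixed 𝜎².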

Under this model, ordinary least squares (OLS) is mathematically equivalent to MLE, provided the errors are assumed to be normally distributed and independent and identically distributed (IID). The derivation below shows why.

Given 𝑛 training examples {(𝑦^(1),𝒙^(1)), …, (𝑦^(𝑛),𝒙^(𝑛))}, maximize the log-likelihood of the data w.r.t. the model parameters 𝜽:

𝜽ˆ_𝑀𝐿𝐸 = 𝑎𝑟𝑔𝑚𝑎𝑥_𝜽 𝛴_{1≤𝑖≤𝑛}[ 𝑙𝑜𝑔 [ 𝐏(𝑦^(𝑖)|𝒙^(𝑖);𝜽) ] ] # see derivation at Maximum Likelihood Estimation (MLE)
𝜽ˆ_𝑀𝐿𝐸 = 𝑎𝑟𝑔𝑚𝑎𝑥_𝜽 𝛴_{1≤𝑖≤𝑛}[ 𝑙𝑜𝑔 [ 𝒩(𝑦^(𝑖);𝑓(𝒙^(𝑖),𝜽),𝜎²) ] ] # by definition above
𝜽ˆ_𝑀𝐿𝐸 = 𝑎𝑟𝑔𝑚𝑎𝑥_𝜽 𝛴_{1≤𝑖≤𝑛}[ 𝑙𝑜𝑔 [ 𝒩(𝑦^(𝑖);𝑦̂^(𝑖),𝜎²) ] ] # equivalent syntax change: 𝑓(𝒙^(𝑖),𝜽) = 𝑦̂^(𝑖)
𝜽ˆ_𝑀𝐿𝐸 = 𝑎𝑟𝑔𝑚𝑎𝑥_𝜽 𝛴_{1≤𝑖≤𝑛}[ 𝑙𝑜𝑔 [ (1/[𝜎*𝑠𝑞𝑟𝑡(2𝜋)]) 𝑒^{-(𝑦^(𝑖)-𝑦̂^(𝑖))²/(2𝜎²)} ] ] # by normal distribution formula
𝜽ˆ_𝑀𝐿𝐸 = 𝑎𝑟𝑔𝑚𝑎𝑥_𝜽 𝛴_{1≤𝑖≤𝑛}[ 𝑙𝑜𝑔(1) - 𝑙𝑜𝑔(𝜎) - 𝑙𝑜𝑔((2𝜋)^(1/2)) + 𝑙𝑜𝑔(𝑒^{-(𝑦^(𝑖)-𝑦̂^(𝑖))²/(2𝜎²)}) ]
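𝜽ˆ_𝑀𝐿𝐸 = 𝑎𝑟𝑔𝑚𝑎𝑥_𝜽 𝛴_{1≤𝑖≤𝑛}[ 0 ] - 𝛴_{1≤𝑖≤𝑛}[ 𝑙𝑜𝑔(𝜎) ] - 𝛴_{1≤𝑖≤𝑛}[ (1/2)·𝑙𝑜𝑔(2𝜋) ] + 𝛴_{1≤𝑖≤𝑛}[ -(𝑦^(𝑖)-𝑦̂^(𝑖))²/(2𝜎²) · 𝑙𝑜𝑔(𝑒) ] # split the sum term by term
𝜽ˆ_𝑀𝐿𝐸 = 𝑎𝑟𝑔𝑚𝑎𝑥_𝜽 -𝑛·𝑙𝑜𝑔(𝜎) - (𝑛/2)·𝑙𝑜𝑔(2𝜋) - 𝛴_{1≤𝑖≤𝑛}[ (𝑦^(𝑖)-𝑦̂^(𝑖))²/(2𝜎²) ] # 𝑙𝑜𝑔 is the natural log, so 𝑙𝑜𝑔(𝑒) = 1
𝜽ˆ_𝑀𝐿𝐸 = 𝑎𝑟𝑔𝑚𝑎𝑥_𝜽 -𝛴_{1≤𝑖≤𝑛}[ (𝑦^(𝑖)-𝑦̂^(𝑖))²/(2𝜎²) ] # the first two terms do not depend on 𝜽
𝜽ˆ_𝑀𝐿𝐸 = 𝑎𝑟𝑔𝑚𝑎𝑥_𝜽 -𝛴_{1≤𝑖≤𝑛}[ (𝑦^(𝑖)-𝑦̂^(𝑖))² ] # 2𝜎² is a positive constant
𝜽ˆ_𝑀𝐿𝐸 = 𝑎𝑟𝑔𝑚𝑖𝑛_𝜽 𝛴_{1≤𝑖≤𝑛}[ (𝑦^(𝑖)-𝑦̂^(𝑖))² ] # maximizing -𝑔 is the same as minimizing 𝑔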
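
The algebra above can be checked numerically. The sketch below computes the Gaussian log-likelihood two ways, once directly from the density and once from the simplified form -𝑛·𝑙𝑜𝑔(𝜎) - (𝑛/2)·𝑙𝑜𝑔(2𝜋) - 𝛴(𝑦-𝑦̂)²/(2𝜎²), and then checks that the candidate 𝜽 with the highest log-likelihood is also the one with the smallest sum of squared errors. The data, the value of 𝜎, the linear model, and the candidate parameters are all made up for illustration.

```python
import math

def log_likelihood(ys, y_hats, sigma):
    """Sum of log N(y_i; y_hat_i, sigma^2), computed directly from the density."""
    total = 0.0
    for y, y_hat in zip(ys, y_hats):
        density = math.exp(-(y - y_hat) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))
        total += math.log(density)
    return total

def sum_squared_errors(ys, y_hats):
    """Sum of (y_i - y_hat_i)^2."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(ys, y_hats))

def log_likelihood_simplified(ys, y_hats, sigma):
    """Same quantity after the algebra: -n*log(sigma) - (n/2)*log(2*pi) - SSE/(2*sigma^2)."""
    n = len(ys)
    return (-n * math.log(sigma)
            - (n / 2) * math.log(2 * math.pi)
            - sum_squared_errors(ys, y_hats) / (2 * sigma ** 2))

# Made-up data and an assumed constant sigma.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]
sigma = 0.5

def predict(theta):
    """Hypothetical linear model y_hat = b + w * x."""
    b, w = theta
    return [b + w * x for x in xs]

# A few arbitrary candidate parameter settings (intercept, slope).
candidates = [(0.0, 1.0), (1.0, 2.0), (2.0, 1.5)]

best_by_likelihood = max(candidates, key=lambda t: log_likelihood(ys, predict(t), sigma))
best_by_sse = min(candidates, key=lambda t: sum_squared_errors(ys, predict(t)))

print("best by log-likelihood:", best_by_likelihood)
print("best by squared error: ", best_by_sse)
```

Because the two objectives differ only by terms and factors that do not depend on 𝜽, they rank any set of candidates in exactly opposite order, so the maximizer of one is the minimizer of the other.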

Recall that the least-squares (𝐿𝑆𝐸) estimator is defined as:

𝜽ˆ_𝐿𝑆𝐸 = 𝑎𝑟𝑔𝑚𝑖𝑛_𝜽 (1/𝑛) 𝛴_{1≤𝑖≤𝑛}[ (𝑦^(𝑖)-𝑦̂^(𝑖))² ]
𝜽ˆ_𝐿𝑆𝐸 = 𝑎𝑟𝑔𝑚𝑖𝑛_𝜽 𝛴_{1≤𝑖≤𝑛}[ (𝑦^(𝑖)-𝑦̂^(𝑖))² ] # 1/𝑛 is a positive constant

Therefore:

𝜽ˆ_𝑀𝐿𝐸 = 𝜽ˆ_𝐿𝑆𝐸
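
As a sanity check on this equality, the sketch below fits a one-feature linear model in two independent ways: with the textbook closed-form least-squares solution, and by gradient ascent on the Gaussian log-likelihood. The data, 𝜎, learning rate, and step count are assumptions picked so the example converges; they are not part of the derivation.

```python
xs = [0.0, 1.0, 2.0, 3.0]   # made-up inputs
ys = [1.1, 2.9, 5.2, 6.8]   # made-up targets
sigma = 0.5                 # assumed constant standard deviation

def fit_lse(xs, ys):
    """Closed-form least squares for y_hat = b + w * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    w = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    b = y_bar - w * x_bar
    return b, w

def fit_mle(xs, ys, sigma, lr=0.01, steps=5000):
    """Gradient ascent on the Gaussian log-likelihood with respect to (b, w)."""
    b, w = 0.0, 0.0
    for _ in range(steps):
        residuals = [y - (b + w * x) for x, y in zip(xs, ys)]
        # Partial derivatives of  -sum((y - b - w*x)^2) / (2 * sigma^2).
        grad_b = sum(residuals) / sigma ** 2
        grad_w = sum(r * x for r, x in zip(residuals, xs)) / sigma ** 2
        b += lr * grad_b
        w += lr * grad_w
    return b, w

b_lse, w_lse = fit_lse(xs, ys)
b_mle, w_mle = fit_mle(xs, ys, sigma)

print("LSE (intercept, slope):", (b_lse, w_lse))
print("MLE (intercept, slope):", (b_mle, w_mle))
```

Changing `sigma` rescales the gradient, and therefore how fast the ascent converges, but not the point it converges to, which matches the step in the derivation where 2𝜎² is dropped as a constant.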

／var／log marcus chiu
