Linear SVM (SVM Without Kernel)

Linear SVM - Representation

same as binomial logistic regression

  • 𝒙 - input attribute values vector (i.e. 𝒙 = [𝑥0, …, 𝑥𝑘]) # 𝑥0 is the bias
  • 𝜽 - weight/parameter vector (i.e. 𝜽 = [𝜃0, …, 𝜃𝑘])
  • 𝑦 - binary output value
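
a minimal sketch of this representation in Python with NumPy. the toy data and the names raw_X, X, y, theta are hypothetical, and the sketch assumes the usual convention that the bias input 𝑥0 is fixed to 1:

```python
import numpy as np

# Hypothetical toy data: 4 examples, k = 2 attributes each.
raw_X = np.array([[2.0, 1.0],
                  [1.5, 2.5],
                  [-1.0, -2.0],
                  [-2.5, -0.5]])
y = np.array([1, 1, 0, 0])            # binary output values

# Prepend x0 = 1 to every example (assumed convention) so that
# theta[0] acts as the bias weight.
X = np.hstack([np.ones((raw_X.shape[0], 1)), raw_X])

theta = np.zeros(X.shape[1])          # theta = [theta_0, ..., theta_k]
```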

Linear SVM - Cost Function With Regularization

recall the regularized cost function of binomial logistic regression:

  • 𝐽(𝜽) = -(1/𝑛)·𝛴1≤𝑖≤𝑛[𝑦(𝑖)·𝑙𝑜𝑔(ℎ𝜽(𝒙(𝑖))) + (1-𝑦(𝑖))·𝑙𝑜𝑔(1-ℎ𝜽(𝒙(𝑖)))] + (𝜆/2)·[𝛴1≤𝑗≤𝑘(𝜃𝑗)²]

let’s represent it differently:

  • 𝐽(𝜽) = -(1/𝑛)·𝛴1≤𝑖≤𝑛[𝑦(𝑖)·𝑙𝑜𝑔(ℎ𝜽(𝒙(𝑖))) + (1-𝑦(𝑖))·𝑙𝑜𝑔(1-ℎ𝜽(𝒙(𝑖)))] + (𝜆/2)·[𝛴1≤𝑗≤𝑘(𝜃𝑗)²]
  • 𝐽(𝜽) = (1/𝑛)·𝛴1≤𝑖≤𝑛[𝑦(𝑖)·(-𝑙𝑜𝑔(ℎ𝜽(𝒙(𝑖)))) + (1-𝑦(𝑖))·(-𝑙𝑜𝑔(1-ℎ𝜽(𝒙(𝑖))))] + (𝜆/2)·[𝛴1≤𝑗≤𝑘(𝜃𝑗)²] # move the minus sign inside the sum
  • 𝐽(𝜽) = (1/𝑛)·𝛴1≤𝑖≤𝑛[𝑦(𝑖)·𝑐𝑜𝑠𝑡1(𝜽𝑇𝒙(𝑖)) + (1-𝑦(𝑖))·𝑐𝑜𝑠𝑡0(𝜽𝑇𝒙(𝑖))] + (𝜆/2)·[𝛴1≤𝑗≤𝑘(𝜃𝑗)²] # swap the two -𝑙𝑜𝑔 terms for 𝑐𝑜𝑠𝑡1 and 𝑐𝑜𝑠𝑡0
  • 𝐽(𝜽) = 𝐶·𝛴1≤𝑖≤𝑛[𝑦(𝑖)·𝑐𝑜𝑠𝑡1(𝜽𝑇𝒙(𝑖)) + (1-𝑦(𝑖))·𝑐𝑜𝑠𝑡0(𝜽𝑇𝒙(𝑖))] + (1/2)·[𝛴1≤𝑗≤𝑘(𝜃𝑗)²] # 𝐶 = (1/𝜆) and remove constant (1/𝑛)
  • 𝐽(𝜽) = 𝐶·𝛴1≤𝑖≤𝑛[𝑦(𝑖)·𝑐𝑜𝑠𝑡1(𝜽𝑇𝒙(𝑖)) + (1-𝑦(𝑖))·𝑐𝑜𝑠𝑡0(𝜽𝑇𝒙(𝑖))] + (1/2)·(𝜽𝑇𝜽) # vector form; 𝜃0 is left out of 𝜽𝑇𝜽, matching the sum over 1≤𝑗≤𝑘

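where 𝑐𝑜𝑠𝑡1(𝑧) and 𝑐𝑜𝑠𝑡0(𝑧), with 𝑧 = 𝜽𝑇𝒙, are piecewise-linear stand-ins for -𝑙𝑜𝑔(ℎ𝜽(𝒙)) and -𝑙𝑜𝑔(1-ℎ𝜽(𝒙)); a common choice is:

  • 𝑐𝑜𝑠𝑡1(𝑧) = 𝑚𝑎𝑥(0, 1-𝑧) # zero once 𝑧 ≥ 1
  • 𝑐𝑜𝑠𝑡0(𝑧) = 𝑚𝑎𝑥(0, 1+𝑧) # zero once 𝑧 ≤ -1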
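
a minimal sketch of the final cost in Python with NumPy. it assumes the common hinge-style choices 𝑐𝑜𝑠𝑡1(𝑧) = 𝑚𝑎𝑥(0, 1-𝑧) and 𝑐𝑜𝑠𝑡0(𝑧) = 𝑚𝑎𝑥(0, 1+𝑧), and that the first column of X is the bias input 𝑥0 = 1; the function names are hypothetical:

```python
import numpy as np

def cost1(z):
    """Cost for an example with y = 1 (assumed hinge form): zero once z >= 1."""
    return np.maximum(0.0, 1.0 - z)

def cost0(z):
    """Cost for an example with y = 0 (assumed hinge form): zero once z <= -1."""
    return np.maximum(0.0, 1.0 + z)

def svm_cost(theta, X, y, C):
    """J(theta) = C * sum_i [y_i*cost1(z_i) + (1 - y_i)*cost0(z_i)]
                  + (1/2) * sum_{j>=1} theta_j^2,  with z_i = theta^T x_i."""
    z = X @ theta
    data_term = np.sum(y * cost1(z) + (1 - y) * cost0(z))
    reg_term = 0.5 * np.sum(theta[1:] ** 2)   # theta_0 (bias) is not regularized
    return C * data_term + reg_term
```

a larger 𝐶 puts more weight on fitting the training examples, and a smaller 𝐶 puts more weight on keeping the weights small, mirroring 𝐶 = (1/𝜆).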

Linear SVM - Learning 𝜽s

goal: find the values of 𝜽 that minimize the cost function 𝐽(𝜽)
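
these notes do not name an optimizer. one simple option, shown here only as an illustration, is batch subgradient descent. the sketch assumes hinge-style costs 𝑐𝑜𝑠𝑡1(𝑧) = 𝑚𝑎𝑥(0, 1-𝑧) and 𝑐𝑜𝑠𝑡0(𝑧) = 𝑚𝑎𝑥(0, 1+𝑧); the function name, learning rate, iteration count and toy data are all hypothetical:

```python
import numpy as np

def fit_linear_svm(X, y, C=1.0, lr=0.01, n_iters=1000):
    """Minimize J(theta) by batch subgradient descent (illustrative helper)."""
    theta = np.zeros(X.shape[1])
    for _ in range(n_iters):
        z = X @ theta
        # Subgradient of the assumed hinge-style costs with respect to z:
        #   y = 1: -1 where z < 1, else 0
        #   y = 0: +1 where z > -1, else 0
        dz = np.where(y == 1, -(z < 1).astype(float), (z > -1).astype(float))
        grad = C * (X.T @ dz)
        grad[1:] += theta[1:]      # gradient of (1/2)*sum_{j>=1} theta_j^2
        theta -= lr * grad
    return theta

# Hypothetical toy data; the first column is the bias input x0 = 1.
X = np.array([[1.0, 2.0, 1.0],
              [1.0, 1.5, 2.5],
              [1.0, -1.0, -2.0],
              [1.0, -2.5, -0.5]])
y = np.array([1, 1, 0, 0])
theta = fit_linear_svm(X, y)
```

with a fixed step size the iterates only hover near the minimizer; a decaying step size or a dedicated solver would be the usual choice in practice.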

Linear SVM - Hypothesis

given 𝒙 and the optimized values of 𝜽, the hypothesis (i.e. the assigned output value) is:

  • ℎ𝜽(𝒙) = 1, if 𝜽𝑇𝒙 ≥ 0
  • ℎ𝜽(𝒙) = 0, otherwise
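
the decision rule above can be sketched in Python with NumPy as follows. the function name predict and the example weights and inputs are hypothetical, and the first column of X is assumed to be the bias input 𝑥0 = 1:

```python
import numpy as np

def predict(theta, X):
    """Return the hypothesis for each row x of X: 1 if theta^T x >= 0, else 0."""
    return (X @ theta >= 0).astype(int)

# Hypothetical weights and inputs (first column is x0 = 1).
theta = np.array([-1.0, 2.0])
X = np.array([[1.0, 1.0],    # theta^T x =  1 -> 1
              [1.0, 0.5],    # theta^T x =  0 -> 1 (the boundary counts as 1)
              [1.0, 0.0]])   # theta^T x = -1 -> 0
labels = predict(theta, X)
```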