Bayesian Inference uses the posterior distribution to infer the values of parameters. Like the forms of Inferential Statistics, Bayesian Inference has 3 forms:
- Bayesian Point Estimation - e.g. MAP
- Bayesian Interval Estimation - e.g. Credible Intervals
- Bayesian Hypothesis Testing
related: Bayes’ Rule
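A minimal sketch of the three forms on a toy problem, assuming a coin with unknown heads probability 𝜃, a flat prior, and a hypothetical dataset of 7 heads in 10 flips. The grid approximation and the helper names (grid_posterior, map_estimate, credible_interval, prob_greater) are illustrative choices, not a standard API. The hypothesis test shown is simply the posterior probability that 𝜃 > 0.5, which is one simple option among several.

```python
def grid_posterior(heads, flips, grid_size=1001):
    """Posterior over theta on an evenly spaced grid, assuming a flat prior."""
    thetas = [i / (grid_size - 1) for i in range(grid_size)]
    # Unnormalised posterior = constant prior * binomial likelihood kernel.
    weights = [t**heads * (1 - t)**(flips - heads) for t in thetas]
    total = sum(weights)
    return thetas, [w / total for w in weights]


def map_estimate(thetas, post):
    """Point estimation: the grid value with the highest posterior mass."""
    best = max(range(len(post)), key=post.__getitem__)
    return thetas[best]


def credible_interval(thetas, post, mass=0.95):
    """Interval estimation: equal-tailed interval from the cumulative posterior."""
    lo_tail = (1 - mass) / 2
    hi_tail = 1 - lo_tail
    cum, lo, hi = 0.0, None, None
    for t, p in zip(thetas, post):
        cum += p
        if lo is None and cum >= lo_tail:
            lo = t
        if hi is None and cum >= hi_tail:
            hi = t
            break
    return lo, hi


def prob_greater(thetas, post, threshold):
    """Hypothesis testing (simple form): posterior probability that theta > threshold."""
    return sum(p for t, p in zip(thetas, post) if t > threshold)


# Hypothetical data: 7 heads in 10 flips.
thetas, post = grid_posterior(heads=7, flips=10)
theta_map = map_estimate(thetas, post)
interval = credible_interval(thetas, post)
p_biased = prob_greater(thetas, post, 0.5)
print(theta_map, interval, p_biased)
```

With a flat prior the MAP estimate coincides with the maximum likelihood estimate (heads / flips); an informative prior would pull it away from that value.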
Bayesian Priors
priors can be expressed in different forms, such as:
- explicitly, as probability distributions over the parameters of the model
- as beliefs that directly influence the function itself, acting on the parameters only indirectly through their effect on the function
- implicitly, by choosing algorithms that are biased toward some class of functions over another (e.g. the smoothness prior, also called the local constancy prior)
Bayesian Estimation/Inference Steps
- before observing the dataset, we represent our knowledge of 𝜃 with a prior probability distribution 𝐏(𝜃)
- usually, the selected prior distribution has “high entropy” to reflect a high degree of uncertainty in the value of 𝜃 (e.g. a uniform distribution, or a Gaussian distribution with large variance)
- next, after observing the data, we combine the prior with the likelihood via Bayes’ Rule to obtain the posterior, 𝐏(𝜃 | data) ∝ 𝐏(data | 𝜃) 𝐏(𝜃); the observation of data usually causes the posterior to lose entropy and concentrate around a few highly likely values of the parameters
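The steps above can be sketched numerically. This is only an illustration under stated assumptions: a coin-flip model with 𝜃 as the heads probability, a uniform prior on a 101-point grid, and two hypothetical datasets (7 heads in 10 flips, 70 heads in 100 flips). The function names entropy and update are made up for this example.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def update(prior, thetas, heads, tails):
    """Bayes' rule on a grid: posterior is proportional to likelihood * prior."""
    unnorm = [pr * t**heads * (1 - t)**tails for pr, t in zip(prior, thetas)]
    evidence = sum(unnorm)  # normalising constant
    return [u / evidence for u in unnorm]


grid_size = 101
thetas = [i / (grid_size - 1) for i in range(grid_size)]

# Step 1: a high-entropy prior (uniform is the maximum-entropy choice on this grid).
prior = [1 / grid_size] * grid_size

# Step 2: observe data and update; more data gives a more concentrated posterior.
post_10 = update(prior, thetas, heads=7, tails=3)
post_100 = update(prior, thetas, heads=70, tails=30)

h_prior = entropy(prior)
h_10 = entropy(post_10)
h_100 = entropy(post_100)
print(f"prior: {h_prior:.3f}  after 10 flips: {h_10:.3f}  after 100 flips: {h_100:.3f}")
```

The printed entropies should decrease from the prior to the 10-flip posterior to the 100-flip posterior, which is the "posterior loses entropy" behaviour described in the last step.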
Compared with point estimates such as maximum likelihood, Bayesian methods typically generalize much better when limited training data is available, but they typically suffer from high computational cost when the number of training examples is large