- posterior distribution 𝐏(𝜃|𝑋) refers to the distribution of parameter(s) 𝜃 given observed data 𝑋
- predictive posterior distribution 𝐏(𝑋ˆ|𝑋) refers to the distribution of new data 𝑋ˆ given observed data 𝑋
For example in Bayesian linear regression, you learn a posterior distribution over the 𝑤 parameter of the model 𝑦=𝑤𝑋 given some observed data 𝑋. Then when a new unseen data point 𝑥* comes in, you want to find the distribution over possible predictions 𝑦* given the posterior distribution for 𝑤 that you just learned. This distribution over possible 𝑦*‘s given the posterior for 𝑤 is the prediction distribution
- predictive posterior distribution
- is used to predict the label of new data values
- distribution for future predicted data based on the data you have already seen