Adjust the Points in the Reduced Space So That Their Probability Distribution Matches the Target Probability Distribution
KL divergence measures the “distance” between two probability distributions (it is not symmetric, so it is not a true metric)
$$KL(P \,\|\, Q) = \sum_{i,j} P_{ij} \log \frac{P_{ij}}{Q_{ij}}$$
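As a minimal sketch of the KL formula, the snippet below evaluates it for two small matrices of pairwise probabilities. The function name `kl_divergence`, the `eps` guard, and the example matrices `P` and `Q` are illustrative assumptions, not values from these notes.

```python
import math

def kl_divergence(P, Q, eps=1e-12):
    """KL(P || Q) for two same-shaped matrices of pairwise probabilities."""
    total = 0.0
    for p_row, q_row in zip(P, Q):
        for p, q in zip(p_row, q_row):
            if p > 0.0:  # terms with p == 0 contribute nothing by convention
                total += p * math.log(p / max(q, eps))  # eps avoids division by zero
    return total

# Hypothetical 3-point example: symmetric, zero diagonal, entries sum to 1.
P = [[0.0, 0.3, 0.1],
     [0.3, 0.0, 0.1],
     [0.1, 0.1, 0.0]]
Q = [[0.0, 0.1, 0.2],
     [0.1, 0.0, 0.2],
     [0.2, 0.2, 0.0]]

print(kl_divergence(P, Q))  # positive: the distributions differ
print(kl_divergence(Q, P))  # a different value: KL is not symmetric
print(kl_divergence(P, P))  # zero: identical distributions
```

The divergence is zero only when the two distributions agree, which is why it can serve as the cost to minimize.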
Use gradient descent to minimize the sum of the KL divergence over all the points
Take the partial derivative of the cost function with respect to every point. This partial derivative tells us how to move each point within the reduced-dimensional space: stepping in the opposite direction of the gradient lowers the cost.
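The gradient step can be sketched as follows. These notes do not say how the low-dimensional probabilities Q are computed, so this sketch assumes the Student-t kernel used by t-SNE, for which the gradient is 4 Σ_j (P_ij − Q_ij)(y_i − y_j) / (1 + ‖y_i − y_j‖²). All function names, the learning rate, the step count, and the example data are hypothetical.

```python
import math

def low_dim_affinities(Y):
    """Assumed Student-t similarities between embedded points.

    Returns (Q, W): W holds the unnormalised weights 1 / (1 + squared distance),
    Q is W normalised so that all entries sum to 1.
    """
    n = len(Y)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(Y[i], Y[j]))
                W[i][j] = 1.0 / (1.0 + d2)
    total = sum(map(sum, W))
    Q = [[w / total for w in row] for row in W]
    return Q, W

def kl_cost(P, Y):
    """KL(P || Q), where Q is derived from the current embedding Y."""
    Q, _ = low_dim_affinities(Y)
    n = len(Y)
    return sum(P[i][j] * math.log(P[i][j] / Q[i][j])
               for i in range(n) for j in range(n) if P[i][j] > 0.0)

def kl_gradient(P, Y):
    """Partial derivative of the cost with respect to every embedded point."""
    Q, W = low_dim_affinities(Y)
    n, dim = len(Y), len(Y[0])
    grad = [[0.0] * dim for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                coeff = 4.0 * (P[i][j] - Q[i][j]) * W[i][j]
                for k in range(dim):
                    grad[i][k] += coeff * (Y[i][k] - Y[j][k])
    return grad

def descend(P, Y, learning_rate=0.2, steps=500):
    """Plain gradient descent: move each point against its gradient."""
    for _ in range(steps):
        grad = kl_gradient(P, Y)
        Y = [[y - learning_rate * g for y, g in zip(y_i, g_i)]
             for y_i, g_i in zip(Y, grad)]
    return Y

# Hypothetical target distribution: points 0 and 1 should end up close together.
P = [[0.0, 0.3, 0.1],
     [0.3, 0.0, 0.1],
     [0.1, 0.1, 0.0]]
Y0 = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # arbitrary starting embedding
Y_final = descend(P, Y0)
print(kl_cost(P, Y0), kl_cost(P, Y_final))  # the cost should drop after descent
```

Practical t-SNE implementations add momentum and early exaggeration on top of this plain update; they are left out here to keep the core step visible.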