Comparing:
Logistic regression indeed does not assume any specific shapes of densities in the space of predictor variables, but LDA does. Here are some differences between the two analyses, briefly.
Binary logistic regression (BLR) vs. linear discriminant analysis (with 2 groups; also known as Fisher’s LDA):
- BLR: Based on maximum likelihood estimation. LDA: Based on least squares estimation; equivalent to linear regression with a binary predictand (the coefficients are proportional and R-square = 1 - Wilks’ lambda).
- BLR: Estimates the probability (of group membership) directly (the predictand is itself taken as the probability, the observed one) and conditionally. LDA: Estimates the probability indirectly (the predictand is viewed as a binned continuous variable, the discriminant) via a classificatory device (such as naive Bayes) which uses both conditional and marginal information.
- BLR: Not so demanding about the measurement level and the form of the distribution of the predictors. LDA: Predictors should preferably be interval level, with a multivariate normal distribution.
- BLR: No requirements about the within-group covariance matrices of the predictors. LDA: The within-group covariance matrices should be identical in population.
- BLR: The groups may have quite different 𝑛. LDA: The groups should have similar 𝑛.
- BLR: Not so sensitive to outliers. LDA: Quite sensitive to outliers.
- BLR: Younger method. LDA: Older method.
- BLR: Usually preferred, because it is less demanding / more robust. LDA: With all its requirements met, often classifies better than BLR (its asymptotic relative efficiency is then 3/2 times higher).
- BLR: Categorical variables can be used as independent variables when making predictions. LDA: Works when all the independent/predictor variables are continuous (not categorical) and follow a normal distribution.
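- When the classes of the response variable Y (e.g. default = “Yes”, default = “No”) are well-separated, the parameter estimates for the logistic regression model are surprisingly unstable: under perfect separation the maximum likelihood estimates do not exist, because the likelihood keeps increasing as the coefficients grow without bound. LDA & QDA do not suffer from this problem.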
- If n is small and the distribution of the predictors X is approximately normal in each of the classes, the LDA & QDA models are again more stable than the logistic regression model.
- LDA & QDA are often preferred over logistic regression when we have more than two non-ordinal response classes (e.g. stroke, drug overdose, and epileptic seizure).
- Both LDA and QDA assume that the predictor variables X are drawn from a multivariate Gaussian (aka normal) distribution.
- LDA assumes equality of covariances among the predictor variables X across all levels of Y. This assumption is relaxed with the QDA model.
- LDA and QDA require the number of predictor variables (p) to be less than the sample size (n). Furthermore, it is important to keep in mind that performance will severely decline as p approaches n. A simple rule of thumb is to use LDA & QDA on data sets where n ≥ 5 × p.
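
To make the least-squares point from the first bullet concrete, here is a minimal NumPy sketch. It assumes made-up toy data (two Gaussian groups with a shared covariance matrix; the group sizes, means, covariance, and seed are all arbitrary choices of mine, not from any real data set). It checks that the OLS slopes from regressing the 0/1 group indicator on the predictors are proportional to Fisher's discriminant direction, and that the regression R-square equals 1 - Wilks' lambda.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

# Hypothetical toy data: two Gaussian groups sharing one covariance matrix.
n0, n1, p = 60, 40, 3
cov = np.array([[1.0, 0.3, 0.1],
                [0.3, 1.0, 0.2],
                [0.1, 0.2, 1.0]])
X0 = rng.multivariate_normal(np.zeros(p), cov, size=n0)
X1 = rng.multivariate_normal(np.array([1.0, 0.5, -0.5]), cov, size=n1)
X = np.vstack([X0, X1])
y = np.concatenate([np.zeros(n0), np.ones(n1)])  # binary predictand

# Fisher's LDA direction: W^{-1} (m1 - m0), with W the pooled
# within-group sums-of-squares-and-cross-products (SSCP) matrix.
m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
W = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
lda_dir = np.linalg.solve(W, m1 - m0)

# Ordinary least squares of the 0/1 indicator on the predictors (with intercept).
A = np.column_stack([np.ones(len(y)), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
ols_slopes = beta[1:]

# If the two solutions are proportional, these ratios are all the same number.
ratios = ols_slopes / lda_dir

# R-square of the regression.
resid = y - A @ beta
yc = y - y.mean()
r_squared = 1.0 - (resid @ resid) / (yc @ yc)

# Wilks' lambda = det(within SSCP) / det(total SSCP).
Xc = X - X.mean(axis=0)
T = Xc.T @ Xc
wilks_lambda = np.linalg.det(W) / np.linalg.det(T)

print("OLS slope / LDA direction ratios:", ratios)
print("R-square:", r_squared, " 1 - Wilks' lambda:", 1.0 - wilks_lambda)
```

The equivalence holds for the direction of the coefficients and for R-square only; the fitted values of the linear regression are not probabilities, which is why LDA still needs a separate classification step to get from the discriminant to posterior probabilities.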