Parametric Regression (PR) Models
- is a category of regression models in which the predictor 𝑦̂ takes a parametric functional form with respect to the independent variables 𝑥
PR Analysis
see: Parametric Regression (PR) Analysis
PR Models - Types/Classes
- (Level/Linear/Log)-(Level/Linear/Log) Regression Models
- General Linear Models vs Generalized Linear Models
PR Models - Comparisons
PR Models - Dependent/Response Variable Type
Continuous Regression Models - take an input vector 𝑥 ∊ ℝⁿ and predict a scalar 𝑦 ∊ ℝ as output
Linear Regression Models
- takes an input vector 𝑥 ∊ ℝⁿ and predicts a scalar 𝑦 ∊ ℝ as output, using a function/estimator that is linear with respect to the regression coefficients {𝜃0, …, 𝜃𝑝}
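To illustrate what "linear with respect to the regression coefficients" means, here is a minimal Python sketch. The quadratic feature map and the coefficient values are hypothetical, chosen only for illustration: the predictor is curved in 𝑥 but still linear in 𝜃.

```python
def predict(theta, x):
    """Polynomial predictor: y_hat = theta0 + theta1*x + theta2*x**2."""
    features = [1.0, x, x ** 2]  # non-linear in x ...
    return sum(t * f for t, f in zip(theta, features))  # ... but linear in theta


# Hypothetical coefficient vectors and input value.
theta_a = [1.0, 2.0, 0.5]
theta_b = [-3.0, 0.0, 4.0]
x = 1.5

# Linearity in the coefficients: predicting with (theta_a + theta_b)
# gives the same value as adding the two separate predictions.
theta_sum = [a + b for a, b in zip(theta_a, theta_b)]
lhs = predict(theta_sum, x)
rhs = predict(theta_a, x) + predict(theta_b, x)
print(lhs, rhs)
```

This is why polynomial terms can still be fitted with linear regression: only the dependence on the coefficients has to be linear.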
Linear Model Type
Description
- has several weaknesses: it is sensitive to both outliers and multicollinearity, and it is prone to overfitting
- these automated methods can help identify candidate regressors early in the model specification process
- applicable in all cases where Ordinary Least Squares (OLS) Regression can be used
- applies re-weighting to reduce outlier influence
- addresses multicollinearity
- allows you to analyze data even when severe multicollinearity is present and helps prevent overfitting. This type of model reduces the large, problematic variance that multicollinearity causes by introducing a slight bias in the estimates. The procedure trades away much of the variance in exchange for a little bias, which produces more useful coefficient estimates when multicollinearity is present
Lasso Regression
(Least Absolute Shrinkage and Selection Operator)
- performs variable selection that aims to increase prediction accuracy by identifying a simpler model. It is similar to Ridge regression but with variable selection
Elastic Net Regression
- combines the regularizers (penalties) of Ridge regression and Lasso regression
Partial Least Squares (PLS) Regression
- is useful when you have very few observations compared to the number of independent variables, or when the independent variables are highly correlated. PLS reduces the independent variables to a smaller number of uncorrelated components, similar to Principal Component Analysis, and then performs linear regression on these components rather than on the original data. PLS emphasizes developing predictive models and is not used for screening variables. Unlike OLS, it can include multiple continuous dependent variables. PLS uses the correlation structure to identify smaller effects and to model multivariate patterns in the dependent variables
- models variables within the (0, 1) range
- models compositional data
- smooths time series
- approximates data that can only increase (typically cumulative data)
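The trade of variance for a little bias described above for multicollinear data can be sketched with the closed-form ridge solution. This is an illustrative sketch only: the toy data, the penalty value `lam=1.0`, and the helper name `fit_ridge_2d` are assumptions, and the model has two predictors and no intercept so that the 2×2 system can be solved directly.

```python
def fit_ridge_2d(x1, x2, y, lam):
    """Solve (X^T X + lam*I) theta = X^T y for two predictors, no intercept.

    lam = 0 reduces to ordinary least squares (OLS).
    """
    a = sum(u * u for u in x1) + lam          # (X^T X)[0][0] + lam
    b = sum(u * v for u, v in zip(x1, x2))    # (X^T X)[0][1]
    d = sum(v * v for v in x2) + lam          # (X^T X)[1][1] + lam
    r1 = sum(u * t for u, t in zip(x1, y))    # (X^T y)[0]
    r2 = sum(v * t for v, t in zip(x2, y))    # (X^T y)[1]
    det = a * d - b * b
    # Cramer's rule for the symmetric 2x2 system.
    return ((d * r1 - b * r2) / det, (a * r2 - b * r1) / det)


# Hypothetical toy data: x2 is almost a copy of x1 (severe multicollinearity),
# and y is roughly x1 + x2.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.1, 1.9, 3.1, 3.9]
y = [2.0, 4.1, 5.9, 8.2]

ols = fit_ridge_2d(x1, x2, y, lam=0.0)
ridge = fit_ridge_2d(x1, x2, y, lam=1.0)
print("OLS:  ", ols)
print("Ridge:", ridge)
```

On this toy data the OLS coefficients come out with opposite signs (about 2.9 and -0.9), while the ridge coefficients are both close to 1. The penalty shrinks the coefficient vector and stabilizes the estimates.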
Non-Linear Regression Models
- takes an input vector 𝑥 ∊ ℝⁿ and predicts a scalar 𝑦 ∊ ℝ as output, using a function/estimator that is non-linear with respect to the coefficients {𝜃0, …, 𝜃𝑝}
Non-Linear Regression Type
Function Form Example
power
𝑦̂ = 𝜃1·𝑥^𝜃2
weibull growth
𝑦̂ = 𝜃1 + (𝜃2 − 𝜃1)·exp(−𝜃3·𝑥^𝜃4)
𝑦̂ = 𝜃0 + 𝜃1·cos(𝑥 + 𝜃4) + 𝜃2·cos(2𝑥 + 𝜃4)
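The power and Weibull growth forms listed above can be written as plain Python functions. This is a sketch for evaluating the function forms only, not a fitting routine, and all parameter values are hypothetical.

```python
import math


def power_model(x, theta1, theta2):
    """Power form: y_hat = theta1 * x ** theta2."""
    return theta1 * x ** theta2


def weibull_growth(x, theta1, theta2, theta3, theta4):
    """Weibull growth form:
    y_hat = theta1 + (theta2 - theta1) * exp(-theta3 * x ** theta4).
    """
    return theta1 + (theta2 - theta1) * math.exp(-theta3 * x ** theta4)


# Non-linearity in the coefficients: doubling theta2 does not double y_hat.
base = power_model(9.0, 2.0, 0.5)     # 2 * 9**0.5
doubled = power_model(9.0, 2.0, 1.0)  # 2 * 9**1.0
print(base, doubled)

# The Weibull growth curve starts at theta2 (x = 0) and approaches theta1
# as x grows, assuming theta3 > 0 and theta4 > 0.
start = weibull_growth(0.0, 10.0, 2.0, 0.5, 1.5)
late = weibull_growth(100.0, 10.0, 2.0, 0.5, 1.5)
print(start, late)
```

Because the coefficients appear in exponents, these models have no closed-form least-squares solution and are usually fitted with iterative optimization.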
Categorical Regression Models - take an input vector 𝑥 ∊ ℝⁿ and predict a nominal/ordinal 𝑦 as output
Type
Description
- takes an input vector 𝑥 ∊ ℝⁿ and predicts a nominal 𝑦 as output
- models binary variables
- uses the cumulative distribution function of the logistic distribution (the sigmoid function)
- models binary variables
- uses the cumulative distribution function of the standard normal distribution
- models categorical variables with more than 2 levels
- models ordinal or rank variables
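The two binary models above differ only in which cumulative distribution function maps the linear predictor to a probability. A minimal sketch, using only the Python standard library; the coefficients `theta0` and `theta1` are hypothetical values, not fitted estimates.

```python
import math


def logistic_cdf(z):
    """CDF of the standard logistic distribution (the sigmoid function)."""
    return 1.0 / (1.0 + math.exp(-z))


def normal_cdf(z):
    """CDF of the standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


theta0, theta1 = -1.0, 0.8  # hypothetical coefficients

for x in (0.0, 1.25, 3.0):
    z = theta0 + theta1 * x  # linear predictor
    p_logit = logistic_cdf(z)
    p_probit = normal_cdf(z)
    print(f"x={x:.2f}  z={z:+.2f}  logit={p_logit:.3f}  probit={p_probit:.3f}")
```

Both functions return values strictly between 0 and 1 and equal 0.5 when the linear predictor is 0. The normal CDF has thinner tails, so it approaches 0 and 1 faster.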
Count Regression Models
- takes an input vector 𝑥 ∊ ℝⁿ and predicts a count 𝑦 ∊ {0, 1, 2, …} as output
- a type of parametric regression model whose dependent variable is a count of items, events, results, or activities
- counts are nonnegative integers (0, 1, 2, etc.)
- count data with:
  - higher means tend to be approximately normally distributed, so Linear Regression/OLS can often be used
  - smaller means can be skewed, and Linear Regression may fit them poorly; in these cases, use count regression models
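The contrast between small and large means can be checked numerically. The sketch below assumes the counts follow a Poisson distribution, which is an assumption made here for illustration (the notes above do not name a specific count distribution), and the helper names are hypothetical. It computes the skewness directly from the probability mass function.

```python
import math


def poisson_pmf(k, mean):
    """Poisson probability of observing k events, computed in log space
    to avoid overflow for large k."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))


def poisson_skewness(mean):
    """Skewness computed from the (truncated) Poisson pmf."""
    # Truncate far into the upper tail; the omitted mass is negligible.
    k_max = int(mean + 20 * math.sqrt(mean) + 20)
    probs = [poisson_pmf(k, mean) for k in range(k_max + 1)]
    mu = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mu) ** 2 * p for k, p in enumerate(probs))
    third = sum((k - mu) ** 3 * p for k, p in enumerate(probs))
    return third / var ** 1.5


for mean in (1.0, 5.0, 100.0):
    print(f"mean={mean:6.1f}  skewness={poisson_skewness(mean):.3f}")
```

For a Poisson distribution the skewness equals 1/√mean, so it shrinks toward 0 (the value for a normal distribution) as the mean grows. This matches the guidance above that OLS is often adequate for counts with high means.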