|
|
Parametric
|
Non-Parametric
|
|---|
|
simple
|
learns a function described by a fixed, finite number of parameters
|
learns a function whose number of parameters is not fixed in advance and can grow with the training data
|
|---|
|
description
|
Assumptions can greatly simplify the learning process, but they can also limit what can be learned. Algorithms that simplify the function to a known form are called parametric machine learning algorithms.
These algorithms involve two steps:
- select a form/model for the function
- learn the coefficients for the function from the training data
|
Algorithms that do not make strong assumptions about the form of the mapping function are called non-parametric machine learning algorithms. By not making such assumptions, they are free to learn any functional form from the training data.
Non-parametric models differ from parametric models in that the model structure is not specified a priori but is instead determined from the data. The term non-parametric does not mean that such models have no parameters at all, but that the number and nature of the parameters are flexible and not fixed in advance.
|
|---|
|
benefits
|
- Simpler: These methods are easier to understand, and their results are easier to interpret.
- Speed: Parametric models are typically fast to fit to data.
- Less Data: They do not require as much training data and can work well even if the fit to the data is not perfect.
|
- Flexibility: Capable of fitting a large number of functional forms.
- Power: They make no (or only weak) assumptions about the underlying function.
- Performance: They can yield more accurate predictive models.
|
|---|
|
limitations
|
- Constrained: Choosing a functional form restricts these methods to that form.
- Limited Complexity: The methods are better suited to simpler problems.
- Poor Fit: In practice, the assumed form is unlikely to match the underlying mapping function exactly.
|
- More data: They require much more training data to estimate the mapping function.
- Slower: Training is often much slower because there are usually far more parameters to fit.
- Overfitting: There is a greater risk of overfitting the training data, and it is harder to explain why specific predictions are made.
|
|---|
|
Bias-Variance Trade-Off
|
generally have:
- higher bias
- lower variance
|
generally have:
- lower bias
- higher variance
|
|---|
|
Probability
|
Parametric Probability Distribution Models
|
Non-Parametric Probability Distribution Models
|
|---|
|
Example Algorithms
|
|
|
|---|
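
The contrast in the table can be made concrete with a small sketch. The two methods used here, a least-squares straight line and a k-nearest-neighbours average, are illustrative choices and are not named in the table; the function names (`fit_line`, `predict_line`, `predict_knn`) and the toy data are likewise hypothetical. The line follows the two parametric steps (pick a form, then learn its coefficients), while the k-NN predictor assumes no form and keeps the training data itself.

```python
# Illustrative sketch only: the straight line and k-NN are example choices,
# not algorithms taken from the table above.

def fit_line(xs, ys):
    """Parametric: assume y = a*x + b, then learn the two coefficients."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b  # the whole model is these two numbers


def predict_line(params, x):
    """Evaluate the fitted line at x."""
    a, b = params
    return a * x + b


def predict_knn(xs, ys, x, k=3):
    """Non-parametric: no assumed form; average the k nearest training targets."""
    nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Toy data from a curved function (y = x^2), which a straight line cannot match.
xs = [float(i) for i in range(-5, 6)]
ys = [x * x for x in xs]

line = fit_line(xs, ys)
x_new = 4.0
line_pred = predict_line(line, x_new)
knn_pred = predict_knn(xs, ys, x_new, k=3)

print("true value:     ", x_new * x_new)
print("line prediction:", line_pred)
print("k-NN prediction:", knn_pred)
```

On this symmetric toy data the fitted line is flat (slope 0, intercept 10), so it predicts 10 at `x_new = 4.0`, far from the true value 16: the assumed form is too restrictive, which is the higher-bias side of the trade-off. The k-NN average lands much closer, but it must store every training point and its predictions shift as individual points change, which reflects the higher-variance, data-hungry side.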