Model Type/Class/Category

Description

Example Model

Regression Algorithms

models the relationship between variables; the model is refined iteratively using a measure of the error in its predictions
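As a minimal sketch of that iterative refinement, the snippet below fits a straight line by gradient descent on mean squared error. The function name `fit_line`, the learning rate, the epoch count, and the toy data are all illustrative assumptions, not part of the original table.

```python
def fit_line(xs, ys, lr=0.05, epochs=2000):
    """Fit y ~ w*x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction errors under the current parameters
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Step against the gradient to reduce the error
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # toy data generated from y = 2x + 1
w, b = fit_line(xs, ys)
```

On this noise-free toy data the fitted `w` and `b` should approach 2 and 1.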

Instance-Based Algorithms (also called Memory-Based Learning or Winner-Take-All methods)

uses training data as part of the model

such methods have two steps:

  • build a representation/database of the training data
  • define a similarity measure to compare new data against the database, find the best match, and make a prediction
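The two steps above can be sketched with a k-nearest-neighbours classifier, one common instance-based method. The names `knn_predict` and `database`, the choice of Euclidean distance, and the toy points are assumptions made for illustration.

```python
import math
from collections import Counter

# Step 1: the "model" is simply the stored training data
database = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"),
]


def knn_predict(database, query, k=3):
    """Step 2: rank stored instances by Euclidean distance and vote."""
    neighbours = sorted(database, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


prediction = knn_predict(database, (1.1, 1.0))
```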

Regularization Algorithms

extends another method (typically a regression method) by penalizing models based on their complexity, favoring simpler models that also tend to generalize better

  • Ridge Regression
  • Least Absolute Shrinkage and Selection Operator (LASSO)
  • Elastic Net
  • Least-Angle Regression (LARS)
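To illustrate the complexity penalty, here is a hedged sketch of ridge regression in its simplest form: one feature, no intercept, where the closed-form solution is sum(x*y) / (sum(x*x) + alpha). The function name `ridge_weight` and the toy data are assumptions for illustration only.

```python
def ridge_weight(xs, ys, alpha):
    """Closed-form ridge solution for y ~ w*x (single feature, no intercept).

    alpha is the penalty strength; alpha = 0 gives ordinary least squares.
    """
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # toy data generated from y = 2x

unpenalized = ridge_weight(xs, ys, alpha=0.0)
penalized = ridge_weight(xs, ys, alpha=14.0)
```

A larger `alpha` shrinks the weight toward zero, which is the sense in which the penalty favors simpler models.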

Decision Tree Algorithms

constructs a model of decisions made based on actual values of attributes in the data.

Decisions fork in tree structures until a prediction decision is made for a given record. Decision trees are trained on data for classification and regression problems.
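A decision tree with a single fork (often called a decision stump) is enough to sketch the idea of deciding on actual attribute values. The helper names `fit_stump` and `predict`, the error-count criterion, and the toy data are assumptions; practical tree learners typically use criteria such as Gini impurity or information gain and split recursively.

```python
def fit_stump(values, labels):
    """Pick the threshold on one attribute that misclassifies the fewest records."""
    best_threshold, best_errors = None, None
    for t in sorted(set(values)):
        # Rule under test: predict class 1 when value > t, otherwise class 0
        errors = sum((v > t) != bool(y) for v, y in zip(values, labels))
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold


def predict(threshold, value):
    """Follow the single fork to a prediction."""
    return 1 if value > threshold else 0


values = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
labels = [0, 0, 0, 1, 1, 1]
threshold = fit_stump(values, labels)
```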

Bayesian Algorithms

applies Bayes’ Theorem to problems such as classification and regression
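As a small worked example of Bayes' Theorem, P(A|B) = P(B|A) P(A) / P(B), the sketch below computes the posterior probability of a class after one positive observation. The function name `posterior_probability` and the three input probabilities are made-up illustrative values, not figures from any dataset.

```python
def posterior_probability(prior, likelihood, false_positive_rate):
    """P(class | evidence) via Bayes' Theorem for a two-class problem."""
    # Total probability of seeing the evidence under either class
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 1% prior, 90% true-positive rate, 5% false-positive rate
p = posterior_probability(prior=0.01, likelihood=0.9, false_positive_rate=0.05)
```

With these assumed inputs the posterior is 2/13, roughly 0.154, well below the 90% true-positive rate because the prior is low.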

Clustering Algorithms

are concerned with using the inherent structures in the data to best organize the data into groups of maximum commonality
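One way to picture grouping by commonality is k-means, sketched here in one dimension. The function name `kmeans_1d`, the fixed starting centroids, and the toy points are assumptions chosen to keep the example deterministic.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and re-averaging."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty)
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids


points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers = kmeans_1d(points, centroids=[0.0, 5.0])
```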

Association Rule Learning Algorithms

extract rules that best explain observed relationships/associations between variables in large multidimensional datasets

  • Apriori algorithm
  • Eclat algorithm
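Algorithms such as Apriori and Eclat search for itemsets and rules that score well on measures like support and confidence. The sketch below only computes those two measures; it does not implement either search algorithm. The names `support` and `confidence` and the toy transactions are assumptions for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

s = support(transactions, {"bread", "milk"})
c = confidence(transactions, {"bread"}, {"milk"})
```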

Artificial Neural Network Algorithms

a class of pattern-matching methods commonly used for regression and classification problems; in practice an enormous subfield comprising hundreds of algorithms and variations for all manner of problem types
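The simplest member of this family is a single artificial neuron. As a hedged sketch, the perceptron below learns the logical AND function; `train_perceptron`, `classify`, the learning rate, and the epoch count are illustrative assumptions.

```python
def classify(w, b, x1, x2):
    """Fire (1) when the weighted sum exceeds zero, otherwise output 0."""
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0


def train_perceptron(samples, epochs=20, lr=1.0):
    """Perceptron learning rule: nudge weights by the signed prediction error."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            error = target - classify(w, b, x1, x2)
            w[0] += lr * error * x1
            w[1] += lr * error * x2
            b += lr * error
    return w, b


# Truth table for logical AND, which is linearly separable
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(samples)
outputs = [classify(w, b, x1, x2) for (x1, x2), _ in samples]
```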

Deep Learning Algorithms

Deep Learning methods are a modern update to Artificial Neural Networks concerned with building much larger and more complex neural networks

Dimensionality Reduction Algorithms

like clustering methods, dimensionality reduction seeks and exploits the inherent structure in the data in an unsupervised manner, in this case to summarize or describe the data using less information.

This can be useful for visualizing high-dimensional data or for simplifying data that is then used in a supervised learning method
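As a sketch of describing data with less information, the snippet below reduces 2-D points to one coordinate each by projecting them onto their principal axis, which is the idea behind principal component analysis restricted to two dimensions. The function name `project_to_principal_axis` and the toy points are assumptions; the closed-form angle used here only applies to the 2-D case.

```python
import math


def project_to_principal_axis(points):
    """Reduce 2-D points to 1-D coordinates along the direction of greatest variance."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 covariance matrix
    cxx = sum(x * x for x, _ in centered) / n
    cyy = sum(y * y for _, y in centered) / n
    cxy = sum(x * y for x, y in centered) / n
    # Orientation of the leading eigenvector of a symmetric 2x2 matrix
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    ux, uy = math.cos(theta), math.sin(theta)
    return [x * ux + y * uy for x, y in centered]


points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]  # toy data lying on the line y = x
reduced = project_to_principal_axis(points)
```

Because the toy points lie exactly on a line, one coordinate per point preserves all of their spread.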

Ensemble Algorithms

are composed of multiple models that are trained separately (in parallel, as in bagging, or sequentially, as in boosting) and whose predictions are combined in some way to make the overall prediction

  • Boosting
  • Bootstrapped Aggregation (Bagging)
  • Random Forest
  • AdaBoost
  • Weighted Average (Blending)
  • Stacked Generalization (Stacking)
  • Gradient Boosting Machines (GBM)
  • Gradient Boosted Regression Trees (GBRT)
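The simplest way to combine predictions is a majority vote, sketched below. The three threshold rules standing in for trained models and the name `majority_vote` are assumptions; the methods listed above differ mainly in how the member models are trained and how their outputs are weighted.

```python
from collections import Counter


def majority_vote(models, x):
    """Return the class predicted by the largest number of member models."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Three hypothetical weak classifiers, each a simple threshold rule
models = [
    lambda x: int(x > 2),
    lambda x: int(x > 4),
    lambda x: int(x > 6),
]

high = majority_vote(models, 5)  # votes: 1, 1, 0
low = majority_vote(models, 3)   # votes: 1, 0, 0
```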