Types of Machine Learning Algorithms

Introduction:

Ken Hoffman
Analytics Vidhya

Machine learning is an application of AI that gives systems the ability to learn automatically from data and experience, without being explicitly programmed for each task.

Machine learning algorithms are commonly grouped by how they learn from data. Two of the main types are:

  • Supervised Learning
  • Unsupervised Learning

Supervised Learning:

Supervised Learning is the task of learning a mapping function from the input variable (x) to the output variable (y), using training data in which the correct output for each input is already known. It is called supervised learning because the process of an algorithm learning from the training dataset can be thought of as a teacher supervising the learning process.

Supervised Learning can be used for regression and classification problems:

Regression: when the output variable is a continuous numeric value such as “salary” or “height”.

Classification: when the output variable is a category such as “hot” vs. “cold” or “green” vs. “red” vs. “blue”.
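To make the regression case concrete, here is a minimal sketch of fitting a straight line to labeled data with ordinary least squares, using only the Python standard library. The data (years of experience and salary in thousands) and the names fit_line and predict_salary are made up for illustration; they are assumptions, not part of any library.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: input x = years of experience,
# output y = salary in thousands (a continuous value, so this is regression).
years = [1, 2, 3, 4, 5]
salary = [35, 40, 45, 50, 55]

slope, intercept = fit_line(years, salary)


def predict_salary(x):
    """Apply the learned mapping from x to y."""
    return slope * x + intercept


print(predict_salary(6))
```

The known salaries play the role of the "teacher": the fit is chosen to keep the predictions close to them. A classification problem would use the same kind of labeled data, but with a category as the output instead of a number.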

Some popular examples of supervised machine learning algorithms are:

  • Linear regression → a linear approach that models the relationship between a scalar response and one or more explanatory variables
  • Logistic regression → a statistical model that uses a logistic function to model a binary dependent variable
  • K-nearest neighbors → a type of lazy learning that classifies a data point based on the labeled points that are most similar to it
  • Decision tree → a model that is structured like a flow chart, where each question splits the data into smaller, more uniform groups
  • Random Forest → a model that consists of many decision trees, each providing its own classification. The Random Forest collects the classifications and returns the class with the most votes as the result
  • Gradient Boosting algorithms (e.g. XGBoost) → a technique for regression and classification problems which produces a prediction model in the form of an ensemble of weak prediction models (typically decision trees). It builds the model in a stage-wise fashion, as other boosting methods do, and generalizes them by allowing optimization of an arbitrary differentiable loss function.
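As an illustration of one algorithm from this list, the following is a small sketch of k-nearest neighbors classification in plain Python. The temperature and humidity readings, the labels, and the function name knn_predict are hypothetical and chosen only for this example. Note that there is no training step: the "lazy" learner simply stores the labeled points and does all the work at prediction time.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label,
    # then sort so the closest points come first.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    # The most common label among the k neighbors is the prediction.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical labeled data: (temperature in C, humidity in %) -> category.
points = [(30, 20), (32, 25), (35, 30), (5, 60), (8, 65), (2, 70)]
labels = ["hot", "hot", "hot", "cold", "cold", "cold"]

print(knn_predict(points, labels, (31, 22)))
print(knn_predict(points, labels, (4, 66)))
```

The majority vote used here is the same idea a Random Forest applies to the outputs of its individual trees.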

Unsupervised Learning:

Unsupervised Learning is used when you only have input data, but no corresponding output variables. It is called unsupervised learning because, unlike with supervised learning, there are no correct answers to learn from and no teacher. Algorithms are left on their own to discover patterns and structure in the data.

Unsupervised Learning can be used for clustering and association problems:

Clustering: discovering the inherent groupings in a dataset

Association: discovering rules that describe portions of the data (e.g. people who buy product X also tend to buy product Y)
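A rule such as "people who buy X also tend to buy Y" is usually scored with two standard measures that are not named above: support (how often the items appear together) and confidence (how often Y appears given that X does). Here is a minimal sketch of both, assuming a made-up list of shopping baskets; the function names are illustrative only.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated probability of `consequent` given `antecedent` in a basket."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical purchase data: each set is one customer's basket.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
    {"bread", "butter", "eggs"},
]

# Score the rule "people who buy bread also tend to buy butter".
print(support(baskets, {"bread", "butter"}))
print(confidence(baskets, {"bread"}, {"butter"}))
```

No labels are involved: the rule is found purely from which items occur together in the input data.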

Some popular examples of unsupervised machine learning algorithms are:

  • K-means → a method of vector quantization that aims to separate n observations into k clusters. Each observation belongs to the cluster with the nearest mean.
  • Hierarchical Clustering → an algorithm that seeks to build a hierarchy of clusters. The endpoint is a set of clusters, where each cluster is distinct from each other cluster, and the objects within each cluster are broadly similar to each other.
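The K-means description above can be sketched in a few lines of plain Python. This is a simplified version of the usual iterative procedure (often called Lloyd's algorithm), assuming the starting centroids are supplied by the caller; real implementations typically choose them randomly or with a smarter seeding scheme. The points and the function name kmeans are hypothetical.

```python
import math


def kmeans(points, initial_centroids, max_iters=100):
    """Alternate assignment and update steps until the centroids stop moving."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(max_iters):
        # Assignment step: each point joins the cluster with the nearest mean.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(p, centroids[i]),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster
            else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical unlabeled 2-D observations forming two visible groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]

centroids, clusters = kmeans(points, initial_centroids=[(1, 1), (8, 8)])
print(centroids)
print(clusters)
```

Here k is 2 because two starting centroids are given. The algorithm never sees a label; the grouping comes only from the distances between observations.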
