The PCA algorithm is one of the most important tools for dimensionality reduction, but really understanding how it works requires some mathematical preparation. I thought it best to address this in a series of articles, and this is part 1.

What you are expected to know for this topic:

- Foundations of multivariable calculus

The first tool we will focus on is Lagrange multipliers.

Lagrange multipliers address the following problem: how can we find a local maximum or minimum of a function subject to some constraints? To make it more concrete, let’s take an example:

Suppose…
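As a small illustration of the idea, here is a hypothetical problem chosen for this sketch (it is not the article's own example): minimize f(x, y) = x² + y² subject to x + y = 1. Setting the gradient of f equal to λ times the gradient of the constraint g(x, y) = x + y - 1 gives 2x = λ, 2y = λ and x + y = 1, so x = y = 0.5 and λ = 1. The plain-Python sketch below checks that condition and compares it with a brute-force search along the constraint. The names `f`, `grad_f` and `grad_g` are assumptions made for illustration.

```python
# Hypothetical example: minimize f(x, y) = x^2 + y^2 subject to x + y = 1.

def f(x, y):
    return x ** 2 + y ** 2

def grad_f(x, y):
    # Gradient of the objective.
    return (2 * x, 2 * y)

def grad_g(x, y):
    # Gradient of the constraint g(x, y) = x + y - 1.
    return (1.0, 1.0)

# Candidate obtained by hand from 2x = lam, 2y = lam, x + y = 1.
x_star, y_star, lam = 0.5, 0.5, 1.0

# Lagrange condition: grad f = lam * grad g at the candidate point.
gf = grad_f(x_star, y_star)
gg = grad_g(x_star, y_star)
stationary = all(abs(a - lam * b) < 1e-12 for a, b in zip(gf, gg))

# Brute-force check: walk along the constraint (y = 1 - x) on a grid.
grid = [i / 1000 for i in range(-2000, 3001)]
best_x = min(grid, key=lambda x: f(x, 1 - x))

print(stationary, best_x, 1 - best_x)
```

The grid search is only a sanity check for this toy problem; the point of the multiplier condition is that it finds the same candidate without searching.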

One of the simplest algorithms out there is linear regression, and it comes in handy when modeling a linear relationship between features and targets.

When implementing a learning algorithm, it is quite reasonable to define a cost (or risk) function that turns the problem of finding the right coefficients for the regression hyperplane into an optimization problem.

A cost function that captures the deviation of the predictions from the expected values can be defined as follows:
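A common choice for such a cost, assumed here rather than taken from the article, is the mean squared error: the average of the squared differences between predictions and targets. Below is a minimal sketch in plain Python. The function names `predict` and `mse_cost` and the toy data are hypothetical and serve only to illustrate the idea.

```python
def predict(weights, bias, x):
    # Linear model: dot product of weights and features, plus a bias term.
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def mse_cost(weights, bias, X, y):
    # Mean squared error between predictions and targets.
    n = len(X)
    return sum((predict(weights, bias, xi) - yi) ** 2 for xi, yi in zip(X, y)) / n

# Made-up data that follows y = 2x exactly.
X = [[1.0], [2.0], [3.0]]
y = [2.0, 4.0, 6.0]

perfect = mse_cost([2.0], 0.0, X, y)  # coefficients that match the data
worse = mse_cost([1.0], 0.0, X, y)    # coefficients that underestimate

print(perfect, worse)
```

Lower cost means a better fit, so finding the regression coefficients becomes a matter of minimizing `mse_cost` over `weights` and `bias`.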

For this article, I’m assuming that you know the basics of the Perceptron algorithm.

Linear models often model data well, but sometimes they are not enough. To be more concrete, I will introduce the topic with an example:

Suppose that we are dealing with a simple feature vector that lives on the real line. In other words:

For the sake of simplicity, let’s consider 3 data points described as follows:
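To see why a linear model can fall short on the real line, consider a hypothetical set of three points (assumed for this sketch, not taken from the article): x = -1 and x = 1 labeled +1, and x = 0 labeled -1. In one dimension a linear classifier sign(w·x + b) is just a threshold with an orientation, so it is enough to try one threshold in each gap between the points, in both directions. The names `classify` and `separates` are made up for illustration.

```python
# Hypothetical data: three points on the real line with labels +1, -1, +1.
points = [-1.0, 0.0, 1.0]
labels = [1, -1, 1]

def classify(w, b, x):
    # Perceptron-style decision rule in one dimension.
    return 1 if w * x + b > 0 else -1

def separates(w, b):
    # True if every point receives its correct label.
    return all(classify(w, b, x) == y for x, y in zip(points, labels))

# A 1D linear classifier is a threshold t = -b / w plus an orientation (sign of w).
# Thresholds before, between, and after the points cover every distinct labeling.
thresholds = [-2.0, -0.5, 0.5, 2.0]
candidates = [(w, -w * t) for w in (1.0, -1.0) for t in thresholds]

any_separates = any(separates(w, b) for w, b in candidates)
print(any_separates)
```

No candidate labels all three points correctly, because a single threshold cannot assign one class to the outer points and the other class to the middle one.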

Electrical Engineer, BI analyst and Data Scientist