"Statistics are the triumph of the quantitative method, and the quantitative method is the victory of sterility and death." (Hilaire Belloc, British-French writer and historian)
Why do we need to understand Regularization?
Have you ever felt that your model's training metrics look good, yet it performs poorly on unseen data? You are also unsure what to change, since everything in training looks perfect. That's right: it might be a case of overfitting.
There are a few ways to overcome the problem of overfitting, but in this post we will focus only on the concept of regularization.
Let's understand it through a real-life example:
Suppose there is a kid who is good at games like car racing, but he has never driven a car in the real world. He knows driving only in theory, yet he is overconfident that he can easily drive any car on the road without any issue. One of his friends asks him to drive to the mall to go bowling. The kid accepts this as a challenge and, overconfident in his driving skills, takes his father's car. For a while on the open road there is no issue with his driving, but as traffic appears he starts to panic, and instead of putting his foot on the brake pedal, he presses the accelerator. He rams into another car.
Now, let's understand what went wrong here!
He knows everything and is a great racer, but only in games; when it comes time to test his skills in the real world, he fails. This is the problem of overfitting.
So, in simple words: when a model performs well in training but fails on unseen data, that is called overfitting.
[Image source: speedlux.com]
So, how would you solve this problem? To make him drive safely on real roads, you would first teach him the basics of the car without taking him into traffic; then you would take him onto empty roads; after that, you would gradually add obstacles in front of him so that he can learn how to turn, how to brake, when to change gears, when to speed up, and how to check his mirrors.
So, I think this is the simplest intuition for regularization. Hopefully you now have a basic understanding of the concept.
We use regularization to turn complex models into simpler ones. In the context of machine learning, it shrinks the coefficients of the model towards zero, discouraging overly complex models and thereby preventing overfitting.
The basic idea is to add a complexity term to the loss that gives a bigger loss for complex models. In simple words, the higher we go up, the farther we can fall. This is what we are doing here!
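To make this concrete, here is a minimal NumPy sketch of the idea. The data, coefficient, and λ value are made up purely for illustration, and the complexity term here uses the sum of absolute coefficient values, one common choice discussed below:

```python
import numpy as np

# Made-up toy data and a single coefficient, purely for illustration
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.1, 3.9, 6.2, 8.1])
beta = np.array([2.0])   # model coefficient(s)
lam = 0.5                # regularization strength (lambda), arbitrary here

predictions = X @ beta
sse = np.sum((y - predictions) ** 2)   # plain sum of squared errors
penalty = lam * np.sum(np.abs(beta))   # complexity term: larger coefficients cost more
loss = sse + penalty                   # the regularized loss we would minimize

print(f"SSE = {sse:.3f}, regularized loss = {loss:.3f}")
```

The larger the coefficients grow, the larger the penalty, so minimizing this loss pulls the coefficients towards zero.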
There are two main techniques of regularization:
Lasso regression (L1 regularization)
SSE is the sum of squared errors of the regression fit line; the objective is to minimize the distance between the fit line and the data points. Lasso modifies the SSE by adding a penalty term equal to the sum of the absolute values of the coefficients, which penalizes larger coefficients:
SSE + λ Σ |βᵢ|
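As a quick sketch of how this looks in practice, here is lasso regression via scikit-learn. The data is synthetic, and the alpha value, which plays the role of λ above, is arbitrary:

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic data: 5 features, but only the first two actually matter
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 5))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

lasso = Lasso(alpha=0.1)   # alpha controls the strength of the L1 penalty
lasso.fit(X, y)
print(lasso.coef_)         # coefficients of the irrelevant features shrink towards 0
```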
Ridge regression (L2 regularization)
Ridge modifies the SSE by adding a penalty term equal to the square of the magnitude of the coefficients:
SSE + λ Σ βᵢ²
But with L2, the coefficients will never become exactly zero; ridge regression only shrinks them close to zero, so the final model will include all predictors. L1, on the other hand, provides a sparse solution: it can shrink some coefficients exactly to zero. The sketch below shows this contrast.
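Here is a small scikit-learn sketch contrasting the two on synthetic data where only the first feature actually matters (the alpha values are arbitrary):

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic data: only feature 0 has any real effect on y
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=100)

lasso = Lasso(alpha=0.5).fit(X, y)
ridge = Ridge(alpha=0.5).fit(X, y)

print("Lasso:", np.round(lasso.coef_, 3))  # irrelevant coefficients become exactly 0
print("Ridge:", np.round(ridge.coef_, 3))  # irrelevant coefficients are small but non-zero
```

This is why lasso is often used when you also want feature selection, while ridge is preferred when you want to keep every predictor but dampen its influence.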
Feel free to share your opinions in the comments!