"Statistics are the triumph of the quantitative method, and the quantitative method is the victory of sterility and death." (Hilaire Belloc, British-French writer and historian)
Why do we need to understand Regularization?
Have you ever felt that your model's training metrics look good, yet it performs poorly on unseen data? You are also unsure what to change, since everything in training looks perfect. That's right: it might be a case of overfitting.
There are a few ways to overcome the problem of overfitting, but in this post we will focus only on the concept of regularization.
Let's understand it through a real-life example:
Suppose there is a kid who is good at games like car racing, but he has never driven a car in the real world. He knows driving only in theory, yet he is overconfident that he can easily drive any car on the road without any issue. One of his friends asks him to drive to the mall to go bowling. The kid accepts this as a challenge and, overconfident in his driving skills, takes his father's car. For a while on the open road there is no issue with his driving, but as traffic appears he starts to panic, and instead of putting his foot on the brake pedal, he presses the accelerator. He rams into another car.
Now, let's understand what went wrong here!
He knows everything and is a great racer, but only in games; when it comes time to test his skills in the real world, he fails. This is the problem of overfitting.
So, in simple words: when a model performs well in training but fails on unseen data, that is called overfitting.
[Image source: speedlux.com]
So, how would you solve this problem? To make him drive safely on real roads, you would first teach him the basics of the car without taking him into traffic; then you would take him onto empty roads; after that, you would gradually add obstacles in front of him so that he can learn how to turn, how to brake, when to change gears, when to speed up, and how to check his mirrors.
So, I think this is the simplest intuition for regularization. Hopefully you now have a basic understanding of the concept.
We use regularization to turn complex models into simpler ones. In the context of machine learning, it shrinks the coefficients of the model towards zero, discouraging overly complex models and thereby preventing overfitting.
The basic idea is to add a complexity term to the loss that gives a bigger loss for complex models. In simple words, the higher we go up, the farther we can fall. This is what we are doing here!
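To make this concrete, here is a minimal NumPy sketch of the idea. The data, coefficient, and λ value are made up purely for illustration, and the complexity term here uses the sum of absolute coefficient values, one common choice discussed below:

```python
import numpy as np

# Made-up toy data and a single coefficient, purely for illustration
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.1, 3.9, 6.2, 8.1])
beta = np.array([2.0])   # model coefficient(s)
lam = 0.5                # regularization strength (lambda), arbitrary here

predictions = X @ beta
sse = np.sum((y - predictions) ** 2)   # plain sum of squared errors
penalty = lam * np.sum(np.abs(beta))   # complexity term: larger coefficients cost more
loss = sse + penalty                   # the regularized loss we would minimize

print(f"SSE = {sse:.3f}, regularized loss = {loss:.3f}")
```

The larger the coefficients grow, the larger the penalty, so minimizing this loss pulls the coefficients towards zero.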
There are two main techniques of regularization:
Lasso regression (L1 regularization)
SSE is the sum of squared errors of the regression fit line; the objective is to minimize the distance between the fit line and the data points. Lasso modifies the SSE by adding a penalty term equal to the sum of the absolute values of the coefficients, which penalizes larger coefficients:
SSE + λ Σ |βᵢ|
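As a quick sketch of how this looks in practice, here is lasso regression via scikit-learn. The data is synthetic, and the alpha value, which plays the role of λ above, is arbitrary:

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic data: 5 features, but only the first two actually matter
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 5))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

lasso = Lasso(alpha=0.1)   # alpha controls the strength of the L1 penalty
lasso.fit(X, y)
print(lasso.coef_)         # coefficients of the irrelevant features shrink towards 0
```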
Ridge regression (L2 regularization)
Ridge modifies the SSE by adding a penalty term equal to the square of the magnitude of the coefficients:
SSE + λ Σ βᵢ²
But with L2, the coefficients will never become exactly zero; ridge regression only shrinks them close to zero, so the final model will include all predictors. L1, on the other hand, provides a sparse solution: it can shrink some coefficients exactly to zero. The sketch below shows this contrast.
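Here is a small scikit-learn sketch contrasting the two on synthetic data where only the first feature actually matters (the alpha values are arbitrary):

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic data: only feature 0 has any real effect on y
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=100)

lasso = Lasso(alpha=0.5).fit(X, y)
ridge = Ridge(alpha=0.5).fit(X, y)

print("Lasso:", np.round(lasso.coef_, 3))  # irrelevant coefficients become exactly 0
print("Ridge:", np.round(ridge.coef_, 3))  # irrelevant coefficients are small but non-zero
```

This is why lasso is often used when you also want feature selection, while ridge is preferred when you want to keep every predictor but dampen its influence.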
Feel free to share your opinions in the comments!