Gradient descent intuition
Answers
Just adding to an existing post here: an intuitive way to think of gradient descent is to imagine the path of a river originating from the top of a mountain.
The goal of gradient descent is exactly what the river strives to achieve - namely, to reach the bottom-most point (at the foothill) by climbing down the mountain.
Now, if the terrain of the mountain is shaped in such a way that the river doesn't have to stop anywhere before arriving at its final destination (the lowest point at the foothill), then this is the ideal case we desire. In machine learning terms, this amounts to saying we have found the global minimum (or optimum) of the solution starting from the initial point (the top of the hill).
However, it could be that the nature of the terrain forms several pits in the path of the river, which could trap the river and cause it to stagnate. In machine learning terms, such pits are called local minima, which are not desirable. There are a bunch of ways to get out of them (which I am not discussing here).
Gradient descent is therefore prone to getting stuck in a local minimum, depending on the nature of the terrain (or the function, in ML terms). But when you have a special kind of mountain terrain (shaped like a bowl - in ML terms this is called a convex function), the algorithm is always guaranteed to find the optimum. You can visualize this by picturing the river again. These kinds of special terrains (a.k.a. convex functions) are always a blessing for optimization in ML.
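Not part of the original answer, but here is a minimal Python sketch of the convex ("bowl") case the paragraph describes. The function, starting point, and learning rate are all arbitrary choices for illustration:

```python
# Gradient descent on a convex "bowl": f(x) = (x - 3)^2.
# Like the river on bowl-shaped terrain, it always reaches the single
# lowest point, no matter where on the bowl it starts.

def grad(x):
    return 2 * (x - 3)  # derivative of (x - 3)^2

x = -10.0   # start anywhere on the bowl
lr = 0.1    # learning rate (step size)
for _ in range(200):
    x -= lr * grad(x)   # step downhill, against the gradient

print(round(x, 4))  # converges to 3.0, the unique global minimum
```

Try a different starting value of `x`: for a convex function it lands at the same minimum every time.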
Also, depending on where at the top of the mountain you initially start (i.e., the initial values of the function), you might end up following a different path. Similarly, depending on the speed at which the river climbs down (i.e., the learning rate or step size of the gradient descent algorithm), you might arrive at the final destination in a different manner. Both of these criteria can affect whether you fall into a pit (a local minimum) or are able to avoid it.
For the math behind the Gradient Descent algorithm, there are several tutorials which explain it more rigorously and so I won't mention all that here.
Hope this helps to convey the high level idea!
OTHER ANSWERS

Kiran Kannar, Master's Computer Science, University of California, San Diego (2018)
Answered Dec 10, 2016
Have you played Hill Climb Racing?

It’s okay even if you have never played this game. The real game depends a lot on physics, but for the sake of this answer, imagine the car’s objective is to drive down a hill to the lowest point without climbing up the second hill adjacent to it (picture a V formed by two hills). Leaving aside the physics for the moment: you can move down the hill at constant speed, making multiple stops to refuel, and you are not allowed to stop until you run out of fuel.
Also, there is a benevolent magnetic force called gMagneto that exerts its control and helps you take appropriately spaced stops on the way to the lowest point. The distance between stops is not equal: it depends on the strength of gMagneto’s force and on the steepness of the hill at the car’s current position.
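The gMagneto idea maps directly onto the gradient descent update rule: each "stop distance" is the learning rate times the slope at the car's current spot, so stops get closer together as the hill flattens near the bottom. A small sketch of my own (the hill shape and numbers are arbitrary):

```python
# Each step's length = learning rate * slope at the current position.
# On a hill shaped like f(x) = x^2 (lowest point at x = 0), the slope
# shrinks as the car descends, so the stops get closer together.

def slope(x):
    return 2 * x  # derivative of x^2

x, lr = 8.0, 0.2  # starting position and "gMagneto strength"
for stop in range(5):
    step = lr * slope(x)  # steeper hill -> longer hop to the next stop
    x -= step
    print(f"stop {stop}: moved {step:.3f}, now at x = {x:.3f}")
```

Running this shows each printed step strictly shorter than the last, which is exactly the "appropriately distanced stops" behavior described above.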