Gradient descent intuition
Answers
Just adding to an existing post here: an intuitive way to think of gradient descent is to imagine the path of a river originating from the top of a mountain.
The goal of gradient descent is exactly what the river strives to achieve - namely, to reach the bottom-most point (at the foothill) by climbing down the mountain.
Now, if the terrain of the mountain is shaped in such a way that the river doesn't have to stop anywhere before arriving at its final destination (the lowest point at the foothill), then this is the ideal case we desire. In machine learning terms, this amounts to saying we have found the global minimum (or optimum) of the solution starting from the initial point (the top of the hill).
However, it could be that the nature of the terrain forms several pits in the path of the river, which could trap the river and cause it to stagnate. In machine learning terms, such pits are called local minima, which are not desirable. There are a bunch of ways to get out of them (which I am not discussing here).
Gradient descent is therefore prone to getting stuck in a local minimum, depending on the nature of the terrain (or the function, in ML terms). But when you have a special kind of mountain terrain (shaped like a bowl - in ML terms this is called a convex function), the algorithm is always guaranteed to find the optimum. You can visualize this by picturing the river again. These kinds of special terrains (a.k.a. convex functions) are always a blessing for optimization in ML.
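Not part of the original answer, but here is a minimal Python sketch of the convex ("bowl") case the paragraph describes. The function, starting point, and learning rate are all arbitrary choices for illustration:

```python
# Gradient descent on a convex "bowl": f(x) = (x - 3)^2.
# Like the river on bowl-shaped terrain, it always reaches the single
# lowest point, no matter where on the bowl it starts.

def grad(x):
    return 2 * (x - 3)  # derivative of (x - 3)^2

x = -10.0   # start anywhere on the bowl
lr = 0.1    # learning rate (step size)
for _ in range(200):
    x -= lr * grad(x)   # step downhill, against the gradient

print(round(x, 4))  # converges to 3.0, the unique global minimum
```

Try a different starting value of `x`: for a convex function it lands at the same minimum every time.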
Also, depending on where at the top of the mountain you initially start (i.e., the initial values of the function), you might end up following a different path. Similarly, depending on the speed at which the river climbs down (i.e., the learning rate or step size of the gradient descent algorithm), you might arrive at the final destination in a different manner. Both of these criteria can affect whether you fall into a pit (a local minimum) or are able to avoid it.
For the math behind the Gradient Descent algorithm, there are several tutorials which explain it more rigorously and so I won't mention all that here.
Hope this helps to convey the high level idea!
OTHER ANSWERS

Kiran Kannar, Master's Computer Science, University of California, San Diego (2018)
Answered Dec 10, 2016
Have you played Hill Climb Racing?

It’s okay even if you have never played this game. The real game depends a lot on physics, but for the sake of this answer, imagine the car’s objective is to drive down a hill to the lowest point without climbing up the second hill adjacent to it (picture a V formed by two hills). Leaving aside the physics for the moment: you can move down the hill at constant speed, making multiple stops to refuel, and you are not allowed to stop until you run out of fuel.
Also, there is a benevolent magnetic force called gMagneto that exerts its control and helps you take appropriately spaced stops on the way to the lowest point. The distance between stops is not equal: it depends on the strength of gMagneto’s force and on the steepness of the hill at the car’s current position.
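The gMagneto idea maps directly onto the gradient descent update rule: each "stop distance" is the learning rate times the slope at the car's current spot, so stops get closer together as the hill flattens near the bottom. A small sketch of my own (the hill shape and numbers are arbitrary):

```python
# Each step's length = learning rate * slope at the current position.
# On a hill shaped like f(x) = x^2 (lowest point at x = 0), the slope
# shrinks as the car descends, so the stops get closer together.

def slope(x):
    return 2 * x  # derivative of x^2

x, lr = 8.0, 0.2  # starting position and "gMagneto strength"
for stop in range(5):
    step = lr * slope(x)  # steeper hill -> longer hop to the next stop
    x -= step
    print(f"stop {stop}: moved {step:.3f}, now at x = {x:.3f}")
```

Running this shows each printed step strictly shorter than the last, which is exactly the "appropriately distanced stops" behavior described above.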