You run gradient descent for 15 iterations with α = 0.3 and compute J(θ) after each iteration. You find that the value of J(θ) decreases quickly, then levels off. Based on this, which of the following conclusions seems most plausible?
Answers
Answer:
α = 0.3 is an effective choice of learning rate.
Explanation:
Among the given options there are three possible conclusions -
a. Rather than use the current value of α, it would be more promising to try a larger value of α (say α = 1.0).
b. Rather than use the current value of α, it would be more promising to try a smaller value of α (say α = 0.1).
c. α = 0.3 is an effective choice of learning rate.
Hence,
Gradient descent = 15 iterations
α = 0.3
J(θ) = decreases quickly, then levels off
Thus,
α = 0.3 is an effective choice of learning rate (option c): we want gradient descent to converge quickly to the minimum, and a cost that falls sharply and then flattens out is exactly what fast convergence looks like, so the current setting of α is a good one.
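To make the picture concrete, here is a minimal sketch in Python on an assumed toy cost J(θ) = 2θ² (not the quiz's actual hypothesis or data), showing the "decreases quickly, then levels off" curve that a well-chosen α produces:

```python
# Minimal sketch: gradient descent on an assumed toy cost
# J(theta) = 2 * theta**2, whose gradient is 4 * theta.
def gradient_descent(alpha, theta=1.0, iters=15):
    costs = []
    for _ in range(iters):
        theta -= alpha * 4 * theta    # theta := theta - alpha * dJ/dtheta
        costs.append(2 * theta ** 2)  # record J(theta) after each iteration
    return costs

# With alpha = 0.3 the cost drops sharply, then flattens near 0:
print([round(c, 6) for c in gradient_descent(0.3)[:5]])
# -> [0.08, 0.0032, 0.000128, 5e-06, 0.0]
```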
Answer:
Explanation:
You run gradient descent for 15 iterations with α = 0.3 and compute J(θ) after each iteration. You find that the value of J(θ) decreases quickly, then levels off. Based on this, the most plausible conclusion among the given options is the following - α = 0.3 is an effective choice of learning rate.
Since we want gradient descent to converge quickly to the minimum, the current setting of α can be considered a good one.
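For contrast, the same toy cost (again, an assumed example rather than the quiz's actual model) shows why the other two options are less plausible: a larger α (say 1.0) makes J(θ) increase over time because each update overshoots the minimum, while a smaller α (say 0.1) still converges but more slowly:

```python
# Same assumed toy cost J(theta) = 2 * theta**2 as in the sketch above.
def costs_for(alpha, theta=1.0, iters=3):
    out = []
    for _ in range(iters):
        theta -= alpha * 4 * theta    # theta := theta - alpha * dJ/dtheta
        out.append(2 * theta ** 2)    # J(theta) after the update
    return out

print(costs_for(1.0))  # -> [18.0, 162.0, 1458.0]   J(theta) increases (diverges)
print(costs_for(0.1))  # -> [0.72, 0.2592, 0.093312] converges, but slowly
print(costs_for(0.3))  # -> [0.08, 0.0032, 0.000128] converges fastest
```

In practice, plotting J(θ) against the iteration number for a few candidate values of α and keeping the one whose curve decreases fastest without ever rising is the usual way to pick the learning rate.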