  1. Suppose batch gradient descent in a deep network is taking excessively long to find a value of the parameters that achieves a small value for the cost function J. Which of the following techniques could help find parameter values that attain a small value for J? (Check all that apply)
    •  Try mini-batch gradient descent (Correct)
    •  Try initializing all the weights to zero
    •  Try better random initialization for the weights (Correct)
    •  Try tuning the learning rate α (Correct)
    •  Try using Adam (Correct)
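For intuition, here is a minimal NumPy sketch (an illustrative assumption, not part of the quiz) that combines several of the listed techniques on a toy one-layer problem: He-style random initialization instead of zeros, mini-batch gradient descent over shuffled data, and Adam updates with a tunable learning rate α, all driving down a mean-squared-error cost J. Every variable name and hyperparameter value below is an illustrative choice.

```python
# Toy sketch: mini-batch gradient descent + Adam + He-style initialization.
# All names and hyperparameters here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data: y = X @ w_true + noise
X = rng.normal(size=(1024, 10))
w_true = rng.normal(size=(10, 1))
y = X @ w_true + 0.1 * rng.normal(size=(1024, 1))

# He initialization: scale by sqrt(2 / fan_in) rather than using zeros,
# so parameters start asymmetric and gradients are reasonably scaled.
W = rng.normal(size=(10, 1)) * np.sqrt(2.0 / 10)

# Adam state and hyperparameters (alpha is the tunable learning rate)
m = np.zeros_like(W)
v = np.zeros_like(W)
alpha, beta1, beta2, eps = 0.01, 0.9, 0.999, 1e-8

batch_size, t = 64, 0
for epoch in range(20):
    perm = rng.permutation(len(X))          # reshuffle each epoch
    for start in range(0, len(X), batch_size):
        idx = perm[start:start + batch_size]
        Xb, yb = X[idx], y[idx]
        # Gradient of the mini-batch MSE cost J = mean((Xb @ W - yb)^2)
        grad = 2.0 * Xb.T @ (Xb @ W - yb) / len(Xb)
        t += 1
        # Adam: bias-corrected first- and second-moment estimates
        m = beta1 * m + (1 - beta1) * grad
        v = beta2 * v + (1 - beta2) * grad ** 2
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        W -= alpha * m_hat / (np.sqrt(v_hat) + eps)

print("final cost J:", float(np.mean((X @ W - y) ** 2)))
```

Note that this one-layer toy would still train from all-zero weights (linear regression has no symmetry to break), but in a multi-layer network zero initialization makes every hidden unit in a layer compute the same output and receive the same gradient, so the units never differentiate, which is why that option does not help.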

