Which of these statements about deep learning programming frameworks are true?

(Check all that apply)
- A programming framework allows you to code up deep learning algorithms with typically fewer lines of code than a lower-level language such as Python.
- Deep learning programming frameworks require cloud-based machines to run.
- Even if a project is currently open source,…

Finding good hyperparameter values is very time-consuming. So typically you should do it once at the start of the project, and try to find very good hyperparameters so that you don’t ever have to revisit tuning them again.

True or false?
- True
- False

If you think β (hyperparameter for momentum) is between 0.9 and 0.99, which of the following is the recommended way to sample a value for beta?

- r = np.random.rand()
  beta = r*0.09 + 0.9
- r = np.random.rand()
  beta = 1 - 10**(-r + 1)
- r = np.random.rand()
  beta = 1 - 10**(-r - 1)
…
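A quick sketch of why the last option is the recommended one (the sample count below is my own illustration, not part of the quiz): it makes 1 - β log-uniform over [0.01, 0.1], so β stays inside [0.9, 0.99] and the search spends as much effort near 0.99, where small changes in β matter most, as near 0.9.

    import numpy as np

    r = np.random.rand(10000)        # r ~ Uniform[0, 1]
    beta = 1 - 10 ** (-r - 1)        # 1 - beta is log-uniform in [0.01, 0.1]

    print(beta.min(), beta.max())    # always inside [0.9, 0.99]
    # By contrast, beta = r*0.09 + 0.9 samples beta uniformly, wasting most
    # samples where 1/(1 - beta), the effective averaging window, barely moves.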

During hyperparameter search, whether you try to babysit one model (“Panda” strategy) or train a lot of models in parallel (“Caviar”) is largely determined by:

- Whether you use batch or mini-batch optimization
- The presence of local minima (and saddle points) in your neural network
- The amount of computational power you can access
- The number of…

Every hyperparameter, if set poorly, can have a huge negative impact on training, and so all hyperparameters are about equally important to tune well.

True or False?
- True
- False

Yes. We've seen in lecture that some hyperparameters, such as the learning rate, are more critical than others.

After training a neural network with Batch Norm, at test time, to evaluate the neural network on a new example you should:

- Skip the step where you normalize using μ and σ² since a single test example cannot be normalized.
- If you implemented Batch Norm on mini-batches of (say) 256 examples, then to evaluate on one test…
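The standard approach from the course is to estimate μ and σ² during training with an exponentially weighted average across mini-batches, then reuse those estimates at test time. A minimal sketch of that bookkeeping (the class name and momentum value are my own, hypothetical choices):

    import numpy as np

    class RunningBatchNormStats:
        """Track exponentially weighted averages of mini-batch mean/variance."""

        def __init__(self, dim, momentum=0.9):
            self.momentum = momentum
            self.running_mu = np.zeros(dim)
            self.running_var = np.ones(dim)

        def update(self, z_batch):
            # z_batch: (batch_size, dim) pre-activations from one mini-batch
            mu, var = z_batch.mean(axis=0), z_batch.var(axis=0)
            self.running_mu = self.momentum * self.running_mu + (1 - self.momentum) * mu
            self.running_var = self.momentum * self.running_var + (1 - self.momentum) * var

        def normalize_test(self, z, eps=1e-8):
            # Works for a single example, since mu/var come from training.
            return (z - self.running_mu) / np.sqrt(self.running_var + eps)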

Which of the following statements about γ and β in Batch Norm are true?

- β and γ are hyperparameters of the algorithm, which we tune via random sampling.
- They set the mean and variance of the linear variable z[l] of a given layer.
- There is one global value of γ and one global value of β for each layer, and applies to all the hidden…
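To make "set the mean and variance" concrete, here is a small numerical check (the synthetic data and the particular γ, β values are my own illustration): after z is normalized to zero mean and unit variance, the learned β and γ become the mean and standard deviation of the transformed variable.

    import numpy as np

    rng = np.random.default_rng(0)
    z = rng.normal(loc=5.0, scale=3.0, size=(256, 4))   # fake pre-activations

    z_norm = (z - z.mean(axis=0)) / np.sqrt(z.var(axis=0) + 1e-8)
    gamma, beta = 2.0, -1.0          # learned by gradient descent in practice
    z_tilde = gamma * z_norm + beta

    print(z_tilde.mean(axis=0))      # ~ beta  (-1.0)
    print(z_tilde.std(axis=0))       # ~ gamma ( 2.0)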

In batch normalization as presented in the videos, if you apply it on the lth layer of your neural network, what are you normalizing?

…
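As presented in the videos, Batch Norm normalizes the pre-activation z[l] = W[l]a[l-1] + b[l], not the post-activation a[l]. A minimal sketch of one layer's forward pass (the function name and the ReLU choice are my own):

    import numpy as np

    def dense_bn_relu(a_prev, W, b, gamma, beta, eps=1e-8):
        z = a_prev @ W + b                  # z[l], the quantity being normalized
        z_norm = (z - z.mean(axis=0)) / np.sqrt(z.var(axis=0) + eps)
        z_tilde = gamma * z_norm + beta     # learned scale and shift
        # Note: with Batch Norm the bias b is redundant, since beta plays its role.
        return np.maximum(0.0, z_tilde)     # a[l] = g(z_tilde)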

If searching among a large number of hyperparameters, you should try values in a grid rather than random values, so that you can carry out the search more systematically and not rely on chance. True or False?

- True
- False
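A quick sketch of why random sampling is preferred when hyperparameters differ in importance (the 25-point budget and the ranges are my own illustration): a 5 x 5 grid tries only 5 distinct values of each hyperparameter, while 25 random points try 25 distinct values of each.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 25

    # Grid search: 5 x 5 points, only 5 distinct values per hyperparameter.
    log_lr_grid, beta_grid = np.meshgrid(np.linspace(-4, -1, 5),
                                         np.linspace(0.8, 0.99, 5))

    # Random search: 25 points, 25 distinct values per hyperparameter.
    log_lr_rand = rng.uniform(-4, -1, n)
    beta_rand = rng.uniform(0.8, 0.99, n)

    print(np.unique(log_lr_grid).size, np.unique(log_lr_rand).size)   # 5 vs 25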

Which of the following statements about Adam is False?

- Adam combines the advantages of RMSProp and momentum
- We usually use "default" values for the hyperparameters β1, β2 and ε in Adam (β1 = 0.9, β2 = 0.999, ε = 10^-8)
- The learning rate hyperparameter α in Adam usually needs to be tuned.
- Adam should be used with batch gradient…
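For reference, a minimal sketch of one Adam update with those default hyperparameters (the function and parameter names are my own; t is the 1-indexed step count used for bias correction):

    import numpy as np

    def adam_step(w, dw, v, s, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
        v = beta1 * v + (1 - beta1) * dw         # momentum-style first moment
        s = beta2 * s + (1 - beta2) * dw ** 2    # RMSProp-style second moment
        v_hat = v / (1 - beta1 ** t)             # bias correction
        s_hat = s / (1 - beta2 ** t)
        w = w - lr * v_hat / (np.sqrt(s_hat) + eps)
        return w, v, s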

Suppose batch gradient descent in a deep network is taking excessively long to find a value of the parameters that achieves a small value for the cost function J. Which of the following techniques could help find parameter values that attain a small value for J?

(Check all that apply)
- Try mini-batch gradient descent
- Try initializing all…
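Mini-batch gradient descent helps because each parameter update costs one mini-batch rather than a full pass over the training set. A minimal sketch of the per-epoch batching (the helper name, batch size, and shuffle-then-slice scheme are my own):

    import numpy as np

    def minibatch_indices(m, batch_size=64, rng=None):
        # Shuffle the m examples each epoch, then slice into mini-batches;
        # gradient steps happen per batch instead of per full epoch.
        rng = rng or np.random.default_rng()
        perm = rng.permutation(m)
        for start in range(0, m, batch_size):
            yield perm[start:start + batch_size]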

These plots were generated with gradient descent, with gradient descent with momentum (β = 0.5), and with gradient descent with momentum (β = 0.9). Which curve corresponds to which algorithm?

Consider this figure (not reproduced here; it shows three curves labeled (1)-(3)).
- (1) is gradient descent with momentum (small β), (2) is gradient descent with momentum (small β), (3) is gradient descent
- (1) is gradient descent…
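For intuition, a minimal sketch of the momentum update behind the question (the function is my own; with β = 0 it reduces to plain gradient descent): larger β averages the gradient over more past steps, damping oscillations and producing the smoothest curve.

    import numpy as np

    def momentum_step(w, dw, v, lr=0.01, beta=0.9):
        v = beta * v + (1 - beta) * dw   # exponentially weighted gradient average
        return w - lr * v, v             # beta = 0.9 smooths more than beta = 0.5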