Gradient Descent

Gradient descent is one of the most popular optimization algorithms and by far the most common way to optimize neural networks.

Gradient descent is a technique for minimizing loss by computing the gradient of the loss with respect to the model’s parameters, conditioned on training data. Informally, gradient descent iteratively adjusts the parameters, gradually finding the best combination of weights and biases to minimize the loss.
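As a concrete illustration, here is a minimal sketch of that update loop for a linear model with a mean-squared-error loss. The data, learning rate, and variable names (X, y, w, b, lr) are illustrative assumptions, not taken from the text.

```python
import numpy as np

# Minimal sketch: gradient descent for linear regression with an MSE loss.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                  # training inputs
y = X @ np.array([2.0, -1.0, 0.5]) + 3.0       # targets from a known rule

w = np.zeros(3)   # weights
b = 0.0           # bias
lr = 0.1          # learning rate

for step in range(200):
    y_pred = X @ w + b
    error = y_pred - y
    # Gradient of the MSE loss with respect to w and b
    grad_w = 2 * X.T @ error / len(y)
    grad_b = 2 * error.mean()
    # Move each parameter a small step against its gradient
    w -= lr * grad_w
    b -= lr * grad_b
```

Each iteration nudges the weights and bias in the direction that most reduces the loss, scaled by the learning rate.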

Variants of gradient descent

The variants below differ in how much data is used to compute the gradient for each parameter update; a short sketch contrasting them follows the list.

  1. Batch gradient descent
  2. Stochastic gradient descent
  3. Mini-batch gradient descent
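
The sketch below shows how a single training loop can cover all three cases depending on how many examples are drawn per update. It assumes the same illustrative linear-model loss as above; the names gradient, train, and batch_size are hypothetical.

```python
import numpy as np

def gradient(w, X_batch, y_batch):
    # MSE gradient for a linear model, computed on one batch of examples
    error = X_batch @ w - y_batch
    return 2 * X_batch.T @ error / len(y_batch)

def train(w, X, y, lr=0.05, epochs=20, batch_size=None):
    # batch_size=None  -> batch gradient descent: the full dataset per update
    # batch_size=1     -> stochastic gradient descent: one example per update
    # batch_size=k > 1 -> mini-batch gradient descent
    n = len(y)
    size = n if batch_size is None else batch_size
    for _ in range(epochs):
        idx = np.random.permutation(n)          # reshuffle each epoch
        for start in range(0, n, size):
            sel = idx[start:start + size]
            w = w - lr * gradient(w, X[sel], y[sel])
    return w
```

Batch gradient descent gives the most accurate gradient per step but is the slowest per update; stochastic updates are cheap but noisy; mini-batches trade off between the two.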

Gradient descent optimization algorithms

The following algorithms extend the basic update rule to speed up and stabilize convergence; sketches of the momentum and Adam update rules follow the list.

  1. Momentum
  2. Nesterov accelerated gradient
  3. Adadelta
  4. Adagrad
  5. RMSprop
  6. Adam
  7. AdaMax
  8. Nadam
  9. AMSGrad
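
As a rough guide to how these methods modify the basic update, here are minimal sketches of two of them: classical momentum and Adam. The function names, default hyperparameters, and the grad_fn interface are illustrative assumptions.

```python
import numpy as np

def sgd_momentum(grad_fn, w, lr=0.01, gamma=0.9, steps=100):
    # Momentum: accumulate a velocity vector that smooths successive updates
    v = np.zeros_like(w)
    for _ in range(steps):
        g = grad_fn(w)
        v = gamma * v + lr * g
        w = w - v
    return w

def adam(grad_fn, w, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8, steps=100):
    # Adam: per-parameter step sizes from running first and second moments
    m = np.zeros_like(w)
    v = np.zeros_like(w)
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g          # first moment (mean)
        v = beta2 * v + (1 - beta2) * g ** 2     # second moment (uncentered variance)
        m_hat = m / (1 - beta1 ** t)             # bias correction
        v_hat = v / (1 - beta2 ** t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w
```

For a toy quadratic loss, grad_fn could be `lambda w: 2 * (w - 3.0)`, whose minimizer is at 3.0; with a suitably large learning rate and enough steps, both optimizers drive the parameters toward it.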