No More Pesky Learning Rates

May 8th, 2015


Tom Schaul, Sixin Zhang, Yann LeCun
Courant Institute of Mathematical Sciences, New York University
715 Broadway, New York, NY 10003, USA
schaul@cims.nyu.edu, zsx@cims.nyu.edu, yann@cims.nyu.edu
arXiv:1206.1106v2 [stat.ML] 18 Feb 2013

Abstract

The performance of stochastic gradient descent (SGD) depends critically on how learning rates are tuned and decreased over time. We propose a method to automatically adjust multiple learning rates so as to minimize the expected error at any one time. The method relies on local gradient variations across samples. In our approach, learning rates can increase as well as decrease, making it suitable for non-stationary problems. Using a number of convex and non-convex learning tasks, we show that the resulting algorithm matches the performance of SGD or other adaptive approaches with their best settings obtained through systematic search, and effectively removes the need for learning rate tuning.

[...] learning rates for different parameters), so as to minimize some estimate of the expectation of the loss at any one time. Starting from an idealized scenario where every sample's contribution to the loss is quadratic and separable, we derive a formula for the optimal learning rates for SGD, based on estimates of the variance of the gradient. The formula has two components: one that captures variability across samples, and one that captures the local curvature, both of which can be estimated
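The excerpt above stops before stating the formula, so the following is only a minimal sketch of the idea in one dimension, not the authors' algorithm. It assumes a noisy quadratic loss 0.5 * h * (theta - x)^2 per sample, with x drawn from a Gaussian, and assumes the rate takes the form (1/h) * (E[g])^2 / E[g^2], where g is the per-sample gradient and h the curvature. The function names and the constants are hypothetical and chosen for illustration.

```python
import random


def estimated_rate(grads, curvature):
    """Plug-in estimate of (1/h) * (E[g])^2 / E[g^2] from per-sample gradients.

    Assumed form for a 1-D noisy quadratic; 'curvature' plays the role of h.
    """
    n = len(grads)
    mean_g = sum(grads) / n                      # estimate of E[g]
    mean_g_sq = sum(g * g for g in grads) / n    # estimate of E[g^2]
    if mean_g_sq == 0.0:
        return 0.0                               # no gradient signal at all
    return (mean_g * mean_g) / (curvature * mean_g_sq)


def sample_gradients(theta, h, mu, sigma, n, rng):
    """Gradients of 0.5 * h * (theta - x)^2 for n samples x ~ N(mu, sigma^2)."""
    return [h * (theta - rng.gauss(mu, sigma)) for _ in range(n)]


rng = random.Random(0)
h, mu, sigma = 2.0, 0.0, 1.0  # hypothetical curvature, optimum, sample noise

# Far from the optimum the gradient is mostly signal; near it, mostly noise.
far = estimated_rate(sample_gradients(10.0, h, mu, sigma, 1000, rng), h)
near = estimated_rate(sample_gradients(0.1, h, mu, sigma, 1000, rng), h)
print(far, near)
```

In this sketch the ratio of the squared mean gradient to the mean squared gradient is the component that reflects variability across samples, and the division by h is the curvature component. The estimated rate stays close to 1/h while the parameter is far from the optimum and shrinks toward zero as sample noise starts to dominate the gradient, which is the behaviour a hand-tuned decreasing schedule tries to imitate.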
