Applied Mathematics Youth Seminar (Lunch Seminar): Optimal Learning Rate Schedules under Functional Scaling Laws: Power Decay and Warmup-Stable-Decay
Speaker: Zilin Wang (王梓麟)
Time: 2026-04-22 12:00-13:30
Venue: Room 413, Zhihua Building (智华楼413)
Abstract:
We study optimal learning rate schedules (LRSs) under the functional scaling law (FSL) framework, where the loss is written explicitly as a functional of the schedule. This formulation reveals a sharp phase transition governed by two exponents: a source exponent $s>0$ controlling signal learning and a capacity exponent $\beta>1$ controlling noise forgetting. In the easy-task regime, the optimal schedule follows a power decay to zero, with exponent determined by $\beta$. In the hard-task regime, the optimal schedule takes a warmup-stable-decay (WSD) form: it maintains the largest admissible learning rate for most of training and decays only near the end, with a vanishing decay fraction. We further analyze shape-fixed schedules, showing how the tail exponent governs both their optimality and their limitations through capacity saturation. This yields a principled evaluation of commonly used schedules such as cosine and linear decay. Finally, we apply the power-decay LRS to one-pass SGD for kernel regression and show that the last iterate attains the exact minimax-optimal rate.
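As a rough illustration of the two schedule shapes named in the abstract, the following is a minimal sketch and not the speaker's construction. The function names, the `exponent` parameter, and the `warmup_frac` and `decay_frac` defaults are all assumptions chosen for readability; the abstract states only that the power-decay exponent is determined by $\beta$ and that the WSD decay fraction vanishes, without giving formulas.

```python
def power_decay_lr(step, total_steps, peak_lr, exponent):
    """Power decay to zero: peak_lr * (remaining fraction of training) ** exponent.

    `exponent` is a free parameter here (assumption); the abstract says only
    that the optimal exponent is determined by beta.
    """
    remaining = 1.0 - step / total_steps
    return peak_lr * remaining ** exponent


def wsd_lr(step, total_steps, peak_lr, warmup_frac=0.01, decay_frac=0.1):
    """Warmup-stable-decay: linear warmup, constant plateau, linear decay to zero.

    The fractions are illustrative defaults (assumption), and the linear
    shape of the final decay is likewise a choice made for this sketch.
    """
    warmup_steps = round(warmup_frac * total_steps)
    decay_steps = max(1, round(decay_frac * total_steps))
    decay_start = total_steps - decay_steps
    if step < warmup_steps:
        # Ramp linearly up to peak_lr over the warmup phase.
        return peak_lr * (step + 1) / warmup_steps
    if step < decay_start:
        # Stable phase: hold the largest admissible learning rate.
        return peak_lr
    # Final phase: decay linearly to zero at step == total_steps.
    return peak_lr * (total_steps - step) / decay_steps


if __name__ == "__main__":
    total = 1000
    for s in (0, 250, 500, 900, 999):
        print(s, power_decay_lr(s, total, 0.1, 2.0), wsd_lr(s, total, 0.1))
```

Shrinking `decay_frac` as `total_steps` grows mimics the vanishing decay fraction described for the hard-task regime, while the power-decay form reduces the learning rate throughout training.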
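About the speaker:
Zilin Wang (王梓麟) is a PhD student who enrolled in 2024, advised by Lei Wu (吴磊) in mathematical sciences. Main research area: deep learning theory.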
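Everyone is welcome to attend the lunch seminar on Wednesday, April 22. The talk runs 12:15-13:30, and lunch is served from 12:00. Faculty and students who plan to attend should fill in the following questionnaire by 20:00 on April 21: //v.wjx.cn/vm/QyKXG1T.aspx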