📚 Related Papers:

- Adam: A Method for Stochastic Optimization. Available at: https://arxiv.org/abs/1412.6980

- Adaptive Subgradient Methods for Online Learning and Stochastic Optimization (JMLR). Available at: https://www.jmlr.org/papers/volume12/duchi11a/duchi11a.pdf

- An overview of gradient descent optimization algorithms. Available at: https://arxiv.org/abs/1609.04747

- MuonRec: Shifting the Optimizer Paradigm Beyond Adam in Scalable Generative Recommendation. Available at: https://arxiv.org/abs/2603.00416
