Calculus & Optimization
Core Concepts
- Derivative
- Partial derivative
- Gradient
- Chain rule
- Taylor expansion
- Lagrange multipliers
- Convex optimization
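Several of these concepts can be checked numerically. The sketch below (the function f(x, y) and the evaluation point are arbitrary illustrative choices, not from the original outline) computes partial derivatives and the gradient analytically, then verifies them with central finite differences:

```python
import math

# f(x, y) = x**2 * y + sin(y): an arbitrary example function.
def f(x, y):
    return x**2 * y + math.sin(y)

# Analytic partial derivatives:
#   df/dx = 2*x*y
#   df/dy = x**2 + cos(y)
def grad_f(x, y):
    return (2 * x * y, x**2 + math.cos(y))

# Central finite differences approximate each partial derivative,
# a direct numerical application of the definition of the derivative.
def numerical_grad(func, x, y, h=1e-6):
    dfdx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    dfdy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return (dfdx, dfdy)

x, y = 1.5, 0.7
print("analytic gradient :", grad_f(x, y))
print("numerical gradient:", numerical_grad(f, x, y))
```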
Applications in Large Models
Backpropagation
- Backpropagation is the chain rule applied systematically: the gradient of the loss with respect to each parameter is obtained by multiplying local derivatives backward through the network, as sketched below.
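A minimal sketch of that idea on a hypothetical one-neuron-per-layer "network" (all names and values below are illustrative, not a reference implementation): the forward pass composes functions, and the backward pass multiplies their local derivatives in reverse order.

```python
import math

# Tiny two-parameter "network": z = w1*x, a = tanh(z), y_hat = w2*a,
# loss = (y_hat - y)**2. All values are illustrative.
x, y = 0.5, 1.0
w1, w2 = 0.8, -0.3

# Forward pass: compose the functions.
z = w1 * x
a = math.tanh(z)
y_hat = w2 * a
loss = (y_hat - y) ** 2

# Backward pass: multiply local derivatives, one chain-rule factor per step.
dL_dyhat = 2 * (y_hat - y)        # d(loss)/d(y_hat)
dL_dw2 = dL_dyhat * a             # d(y_hat)/d(w2) = a
dL_da = dL_dyhat * w2             # d(y_hat)/d(a)  = w2
dL_dz = dL_da * (1 - a**2)        # d(tanh(z))/dz  = 1 - tanh(z)**2
dL_dw1 = dL_dz * x                # dz/d(w1)       = x

print("dL/dw1 =", dL_dw1, " dL/dw2 =", dL_dw2)
```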
Model Training
- Training is, at its core, the optimization problem of minimizing a loss function; common optimizers (SGD, Adam, RMSProp) are all variants of gradient descent.
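As a sketch of the "variants of gradient descent" point, the toy loop below minimizes the hypothetical loss L(w) = (w - 3)^2 with plain gradient descent and with an Adam-style update (the hyperparameters are the usual illustrative defaults, not prescriptions); both drive w toward the minimizer w = 3.

```python
# Toy loss L(w) = (w - 3)**2, with gradient dL/dw = 2*(w - 3).
def grad(w):
    return 2 * (w - 3)

# Plain gradient descent: step against the gradient.
w, lr = 0.0, 0.1
for _ in range(100):
    w -= lr * grad(w)
print("gradient descent :", w)

# Adam-style update: gradient descent plus bias-corrected
# running estimates of the gradient's first and second moments.
w, lr = 0.0, 0.1
m = v = 0.0
beta1, beta2, eps = 0.9, 0.999, 1e-8
for t in range(1, 101):
    g = grad(w)
    m = beta1 * m + (1 - beta1) * g        # first moment (mean)
    v = beta2 * v + (1 - beta2) * g * g    # second moment (uncentered)
    m_hat = m / (1 - beta1**t)             # bias correction
    v_hat = v / (1 - beta2**t)
    w -= lr * m_hat / (v_hat**0.5 + eps)
print("Adam-style update:", w)
```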
Activation Functions
- The derivatives of activation functions govern how gradients propagate: since backpropagation multiplies one activation derivative per layer, small derivatives compound into vanishing gradients in deep networks.
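A minimal illustration of why this matters (the depth and input value are arbitrary choices): the sigmoid's derivative never exceeds 0.25, so the chain-rule product shrinks exponentially with depth, while ReLU's derivative of 1 on positive inputs leaves the gradient intact.

```python
import math

def sigmoid_deriv(x):
    s = 1 / (1 + math.exp(-x))
    return s * (1 - s)          # at most 0.25, attained at x = 0

def relu_deriv(x):
    return 1.0 if x > 0 else 0.0

# Ignoring weights for simplicity, the backward pass multiplies one
# activation derivative per layer; depth and input are arbitrary.
depth, x = 20, 0.5
print(f"sigmoid factor after {depth} layers: {sigmoid_deriv(x) ** depth:.3e}")
print(f"relu    factor after {depth} layers: {relu_deriv(x) ** depth:.3e}")
```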
Model Convergence Analysis
- Analyzing whether and how fast training converges draws on convergence theory from calculus; for example, gradient descent on a smooth convex loss converges at rate O(1/k) in the number of steps k.
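Concretely, a standard result states that for an L-smooth convex f, gradient descent with step size 1/L satisfies f(x_k) - f(x*) <= L * ||x_0 - x*||^2 / (2k). The sketch below checks this bound on an arbitrary illustrative quadratic (not an example from the original text):

```python
# f(x, y) = x**2 + 10*y**2: L-smooth with L = 20 (the largest Hessian
# eigenvalue), minimized at the origin. Illustrative choice only.
def f(x, y):
    return x**2 + 10 * y**2

def grad(x, y):
    return (2 * x, 20 * y)

L = 20.0
x, y = 3.0, 1.0
dist0_sq = x**2 + y**2          # ||x_0 - x*||**2, since x* = (0, 0)
for k in range(1, 11):
    gx, gy = grad(x, y)
    x, y = x - gx / L, y - gy / L
    bound = L * dist0_sq / (2 * k)
    print(f"k={k:2d}  f(x_k) = {f(x, y):9.6f}  <=  bound = {bound:8.4f}")
```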