Backpropagation
Loss function
Optimization
- Optimization Algorithms (SGD with momentum, RMSProp, Adam)
- Optimizing loss with weight initialization
- Batch Normalization
- RMSNorm
- Diagnostic tools to watch while training a neural network
- Skip Connections
Training
Misc
- Matrix Visualization
- SwiGLU activation (not mine, but offers the best explanation)