Training Neural Networks

Karpathy’s advice on training neural networks

Deep Learning Concepts

Simple explanations of deep learning concepts

How to Scale Your Model (must read)

https://jax-ml.github.io/scaling-book/

The Ultra-Scale Playbook: Training LLMs on GPU Clusters

https://huggingface.co/spaces/nanotron/ultrascale-playbook

Good coding style (shape suffixes)

https://medium.com/@NoamShazeer/shape-suffixes-good-coding-style-f836e72e24fd
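A minimal sketch of what a shape-suffix convention can look like, assuming the idea is to end each tensor name with one letter per dimension. The dimension letters, sizes, and variable names below are illustrative choices for this sketch and are not taken from the linked post.

```python
import numpy as np

# Dimension key (illustrative sizes chosen for this sketch):
#   B: batch size, L: sequence length, D: model width, V: vocabulary size
B, L, D, V = 2, 5, 8, 11

rng = np.random.default_rng(0)

# Each name ends with the letters of its dimensions, in order.
token_ids_BL = rng.integers(0, V, size=(B, L))
embedding_VD = rng.normal(size=(V, D))
w_out_DV = rng.normal(size=(D, V))

# Embedding lookup: indexing a (V, D) table with (B, L) ids gives (B, L, D).
x_BLD = embedding_VD[token_ids_BL]

# The einsum subscripts mirror the suffixes, so shape mismatches are easy to spot.
logits_BLV = np.einsum("BLD,DV->BLV", x_BLD, w_out_DV)

# Reducing over V leaves a (B, L) array, and the suffix says so.
next_token_BL = logits_BLV.argmax(axis=-1)
```

With the suffixes in place, a line such as `np.einsum("BLD,DV->BLV", x_BLD, w_out_DV)` can be checked by reading the names alone.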

How to sample from an LLM (top-k, top-p)

https://huggingface.co/blog/how-to-generate
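A rough, dependency-free sketch of top-k and top-p (nucleus) filtering over a single vector of next-token logits. The function names `filter_top_k_top_p` and `sample_token` are hypothetical helpers written for this note, not part of any library, and real implementations may differ in details such as tie-breaking.

```python
import math
import random

def filter_top_k_top_p(logits, top_k=0, top_p=1.0):
    """Softmax the logits, keep only the top-k / top-p tokens, renormalize."""
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Token indices sorted from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: -probs[i])

    kept = []
    mass_before = 0.0
    for rank, i in enumerate(order):
        # top-k: stop once k tokens are kept (top_k=0 disables this filter).
        if top_k > 0 and rank >= top_k:
            break
        # top-p: stop once the kept tokens already cover top_p of the mass,
        # so the token that crosses the threshold is still included.
        if top_p < 1.0 and mass_before >= top_p:
            break
        kept.append(i)
        mass_before += probs[i]

    kept_mass = sum(probs[i] for i in kept)
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / kept_mass
    return filtered

def sample_token(logits, top_k=0, top_p=1.0, rng=random):
    """Draw one token index from the filtered distribution."""
    probs = filter_top_k_top_p(logits, top_k, top_p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

For example, `sample_token(logits, top_k=50, top_p=0.95)` restricts sampling to at most 50 tokens and then to the smallest prefix of those covering 95% of the probability mass.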