RMSNorm

15 Jan 2025 - cohlem

Recap of LayerNorm

Let’s first recap by understanding why LayerNorm was used:

In the RMSNorm paper, the authors raise concerns about LayerNorm.

[Image: rms1]

They also show that RMSNorm is invariant (its output does not change) to re-scaling of the inputs or of the weight matrix.
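The invariance can be checked directly from the definition. This is a sketch that assumes the usual RMSNorm form with pre-activations $\mathbf{a} = W\mathbf{x}$, a learned gain $\mathbf{g}$, and a positive scale factor $\delta$; these symbols are introduced here for illustration and are not from the original post.

```latex
\begin{aligned}
\mathrm{RMS}(\mathbf{a}) &= \sqrt{\tfrac{1}{n}\sum_{i=1}^{n} a_i^2},
\qquad
\bar{a}_i = \frac{a_i}{\mathrm{RMS}(\mathbf{a})}\, g_i \\
\mathrm{RMS}(\delta\mathbf{a}) &= \sqrt{\tfrac{1}{n}\sum_{i=1}^{n} (\delta a_i)^2}
= \delta\,\mathrm{RMS}(\mathbf{a}) \qquad (\delta > 0) \\
\frac{\delta a_i}{\mathrm{RMS}(\delta\mathbf{a})}\, g_i
&= \frac{a_i}{\mathrm{RMS}(\mathbf{a})}\, g_i = \bar{a}_i
\end{aligned}
```

Because $\mathbf{a} = W\mathbf{x}$, scaling either $W$ or $\mathbf{x}$ by $\delta$ scales $\mathbf{a}$ by $\delta$, and the factor cancels in the normalization.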

[Image: rms2]

This indicates that a change in the scale of the inputs or of the weights doesn’t affect the output of RMSNorm.
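The scale invariance can also be checked numerically. Below is a minimal NumPy sketch, not the paper's reference implementation: `rms_norm`, the vector `x`, the matrix `W`, and the scale factors are all hypothetical names and values chosen for illustration, and the small epsilon that practical implementations add for numerical stability is left out so that the invariance holds up to floating-point error.

```python
import numpy as np


def rms_norm(a, gain=None):
    """Divide `a` by its root mean square over the last axis, then apply an optional gain."""
    rms = np.sqrt(np.mean(a ** 2, axis=-1, keepdims=True))
    out = a / rms
    return out if gain is None else out * gain


rng = np.random.default_rng(0)
x = rng.normal(size=(4,))    # hypothetical input vector
W = rng.normal(size=(3, 4))  # hypothetical weight matrix

base = rms_norm(W @ x)                    # reference output
scaled_input = rms_norm(W @ (10.0 * x))   # input scaled by 10
scaled_weights = rms_norm((0.1 * W) @ x)  # weights scaled by 0.1

print(np.allclose(base, scaled_input))
print(np.allclose(base, scaled_weights))
```

Both comparisons should report that the outputs match, since the scale factor appears in both the numerator and the RMS term and cancels.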