Max Norm Constraints/Regularization

Max norm constraint is a regularization method that enforces an absolute upper bound on the magnitude of the weight vector for every neuron, using projected gradient descent to enforce the constraint.
In practice, this corresponds to performing the parameter update as normal and then enforcing the constraint by clamping the weight vector 𝑤⃗ of every neuron to satisfy ‖𝑤⃗‖₂≤𝑐: any weight vector whose norm exceeds 𝑐 is rescaled so that its norm equals 𝑐.
- typical values of 𝑐 are on the order of 3 or 4
- ‖·‖₂ is the L2 norm
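
The update-then-clamp step described above can be sketched as follows. This is a minimal NumPy illustration, not a reference implementation: the function names `max_norm_project` and `sgd_step_with_max_norm`, the `eps` guard, and the convention that each row of `W` holds one neuron's incoming weights are all assumptions made for the example.

```python
import numpy as np

def max_norm_project(W, c=3.0, eps=1e-12):
    """Rescale each row of W so that its L2 norm is at most c.

    Assumption: each row of W is the weight vector of one neuron.
    """
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    # Rows already inside the ball get scale 1; the rest are shrunk to norm c.
    # eps only guards against division by zero for all-zero rows.
    scale = np.minimum(1.0, c / np.maximum(norms, eps))
    return W * scale

def sgd_step_with_max_norm(W, grad, lr=0.1, c=3.0):
    """One plain gradient step followed by the max-norm projection."""
    W = W - lr * grad               # ordinary parameter update
    return max_norm_project(W, c)   # then enforce the norm bound

# Usage: the first row has norm 5 and is scaled down; the second has norm 0.5
# and is left unchanged.
W = np.array([[3.0, 4.0], [0.3, 0.4]])
W_proj = max_norm_project(W, c=3.0)
print(np.linalg.norm(W_proj, axis=1))
```

Rescaling rather than clipping individual entries keeps the direction of each weight vector and changes only its length, which is what projecting onto an L2 ball amounts to.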
Some people report improvements when using this form of regularization.
One of its appealing properties is that the weights cannot “explode” even when the learning rate is set too high, because the norm of every weight vector stays bounded after each update.
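
The bounded-weights property can be seen in a toy setting. The sketch below is an illustrative assumption, not taken from the original note: it runs gradient descent on f(w) = 0.5·‖w‖₂², whose gradient is w, with a deliberately oversized learning rate. The helper names `project` and `run` are hypothetical.

```python
import numpy as np

def project(w, c):
    """Rescale w onto the L2 ball of radius c if it lies outside it."""
    norm = np.linalg.norm(w)
    return w if norm <= c else w * (c / norm)

def run(lr, steps, c=None):
    """Gradient descent on f(w) = 0.5 * ||w||^2 (gradient is w itself).

    Returns the final L2 norm of w. If c is given, w is projected
    after every update.
    """
    w = np.array([1.0, 1.0])
    for _ in range(steps):
        w = w - lr * w              # with lr = 3 this maps w to -2w
        if c is not None:
            w = project(w, c)       # max-norm constraint
    return np.linalg.norm(w)

unconstrained = run(lr=3.0, steps=50)        # norm doubles on every step
constrained = run(lr=3.0, steps=50, c=3.0)   # norm is capped at about c
print(unconstrained, constrained)
```

Note that the constraint does not make this run converge; it only keeps the weights from growing without bound while the learning rate is too high.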

／var／log marcus chiu