Training Quantization with Outlier Suppression at Training Time
Handling activation outliers in Transformer models is crucial to minimizing quantization error. In this blog post, we explore a simpler W8A8 (8-bit weights, 8-bit activations) training quantization setup that does not rely on any explicit activation outlier suppression scheme.
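To make the outlier problem concrete, here is a minimal NumPy sketch of symmetric per-tensor int8 quantization. It is an illustration only, not the scheme used in this post: the helper names (`quantize_int8`, `dequantize`, `quant_mse`), the Gaussian activations, and the outlier value of 100 are all assumptions chosen for the example. It shows that a single large activation stretches the quantization scale and raises the error for every other value in the tensor.

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor int8 quantization: returns (int8 values, scale)."""
    max_abs = float(np.max(np.abs(x)))
    # The scale maps the largest magnitude onto 127; guard against all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map int8 values back to float32."""
    return q.astype(np.float32) * scale

def quant_mse(x):
    """Mean squared error introduced by a quantize/dequantize round trip."""
    q, scale = quantize_int8(x)
    return float(np.mean((x - dequantize(q, scale)) ** 2))

# Hypothetical activations: standard normal values, then one injected outlier.
rng = np.random.default_rng(0)
acts = rng.standard_normal(4096).astype(np.float32)
acts_outlier = acts.copy()
acts_outlier[0] = 100.0  # assumed outlier magnitude, for illustration only

mse_clean = quant_mse(acts)
mse_outlier = quant_mse(acts_outlier)
print(f"MSE without outlier: {mse_clean:.3e}")
print(f"MSE with outlier:    {mse_outlier:.3e}")
```

Because the scale is set by the largest magnitude, the outlier forces a much coarser step size on every other value, which is why outlier handling usually gets so much attention in activation quantization.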