no code implementations • 21 Feb 2024 • Xiao-Yang Liu, Jie Zhang, Guoxuan Wang, Weiqing Tong, Anwar Walid
However, the resulting model still consumes a large amount of GPU memory.
Model Compression • Quantization