1 code implementation • 26 Apr 2022 • Linnan Wang, Chenhan Yu, Satish Salian, Slawomir Kierat, Szymon Migacz, Alex Fit Florea
This paper aims to expedite model customization with a model hub containing models tiered by inference latency and optimized via Neural Architecture Search (NAS).
Ranked #2 on Neural Architecture Search on ImageNet
no code implementations • CVPR 2022 • Linnan Wang, Chenhan Yu, Satish Salian, Slawomir Kierat, Szymon Migacz, Alex Fit Florea
To achieve this goal, we build a distributed NAS system that searches over a novel search space consisting of the prominent factors impacting latency and accuracy.
2 code implementations • 9 Sep 2020 • Swetha Mandava, Szymon Migacz, Alex Fit Florea
Transformer-based models consist of interleaved feed-forward blocks, which capture content meaning, and comparatively more expensive self-attention blocks, which capture context meaning.
Ranked #1 on Language Modelling on enwiki8
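The contrast between the two block types can be sketched minimally: a per-token feed-forward transform versus a token-mixing self-attention step, interleaved with residual connections. This is a NumPy illustration only, not the paper's implementation; the dimensions, ReLU feed-forward, and single-head attention are assumptions chosen for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, wq, wk, wv):
    # Token-mixing block: every token attends to every other token
    # (cost grows quadratically in sequence length -> "more expensive").
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    return scores @ v

def feed_forward(x, w1, w2):
    # Per-token block: applied to each position independently.
    return np.maximum(x @ w1, 0.0) @ w2

rng = np.random.default_rng(0)
n, d, d_ff = 8, 16, 32          # sequence length, model dim, hidden dim (illustrative)
x = rng.normal(size=(n, d))

# A standard transformer interleaves the two block types 1:1 with residuals.
for _ in range(2):
    wq, wk, wv = (0.1 * rng.normal(size=(d, d)) for _ in range(3))
    x = x + self_attention(x, wq, wk, wv)
    w1, w2 = 0.1 * rng.normal(size=(d, d_ff)), 0.1 * rng.normal(size=(d_ff, d))
    x = x + feed_forward(x, w1, w2)

print(x.shape)  # (8, 16)
```

The PAR Transformer's observation is that this fixed 1:1 interleaving need not be optimal; rearranging the ratio and placement of the two block types can reduce cost while preserving accuracy.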