FastMoE

Introduced by He et al. in FastMoE: A Fast Mixture-of-Expert Training System

FastMoE is a distributed MoE training system based on PyTorch with common accelerators. The system provides a hierarchical interface for both flexible model design and adaption to different applications, such as Transformer-XL and Megatron-LM.

Source: FastMoE: A Fast Mixture-of-Expert Training System

Read Paper See Code

Papers

Paper	Code	Results	Date	Stars

Tasks

Task	Papers	Share
Language Modelling	1	100.00%

Usage Over Time

This feature is experimental; we are continuously improving our matching algorithm.

Components

Component	Type	Add Remove
🤖 No Components Found	You can add them if they exist; e.g. Mask R-CNN uses RoIAlign

Categories

Add Remove

Distributed Methods

Hybrid Parallel Methods

2D Parallel Distributed Methods