The Synthesizer is a model that learns synthetic attention weights without token-token interactions. Unlike the Transformer, it eschews not only dot-product self-attention but content-based self-attention altogether: rather than computing pairwise dot products, Synthesizer learns to synthesize the self-alignment matrix directly. It is transformation-based, relies only on simple feed-forward layers, and completely dispenses with dot products and explicit token-token interactions.
The new module employed by the Synthesizer is called "Synthetic Attention": a way of learning to attend without explicitly attending (i.e., without dot-product or content-based attention). Instead, Synthesizer generates the alignment matrix independently of token-token dependencies.
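As a concrete illustration, the dense variant of Synthetic Attention can be sketched with plain numpy: each token's row of the alignment matrix is produced by a two-layer feed-forward network applied to that token alone, so no dot products between token pairs are ever computed. All weight names and shapes below are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dense_synthesizer_attention(X, W1, b1, W2, b2, Wv):
    # Synthesize the (L, L) alignment logits from each token on its own:
    # B_i = W2 @ relu(W1 @ x_i + b1) + b2 -- no pairwise dot products.
    H = np.maximum(X @ W1 + b1, 0.0)   # (L, d_hidden) per-token features
    B = H @ W2 + b2                    # (L, L) synthetic attention logits
    A = softmax(B, axis=-1)            # row-wise attention weights
    V = X @ Wv                         # values, as in a standard Transformer
    return A @ V

# Toy dimensions: sequence length L, model width d (illustrative only).
rng = np.random.default_rng(0)
L, d = 4, 8
X = rng.standard_normal((L, d))
W1 = rng.standard_normal((d, 16)); b1 = np.zeros(16)
W2 = rng.standard_normal((16, L)); b2 = np.zeros(L)
Wv = rng.standard_normal((d, d))
out = dense_synthesizer_attention(X, W1, b1, W2, b2, Wv)
print(out.shape)  # (4, 8)
```

Note that because the second projection maps to length `L`, this sketch ties the synthesized matrix to a fixed maximum sequence length, a trade-off the dense variant makes in exchange for avoiding token-token interactions.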
Source: Synthesizer: Rethinking Self-Attention in Transformer Models
Task | Papers | Share |
---|---|---|
Pose Estimation | 3 | 7.14% |
Zero-Shot Object Detection | 2 | 4.76% |
Voice Cloning | 2 | 4.76% |
Object Detection | 2 | 4.76% |
Face Reenactment | 2 | 4.76% |
Texture Synthesis | 2 | 4.76% |
Language Modelling | 2 | 4.76% |
Machine Translation | 2 | 4.76% |
Text Generation | 2 | 4.76% |
Component | Type |
---|---|
Dense Synthesized Attention | Attention Mechanisms |
Factorized Dense Synthesized Attention | Attention Mechanisms |
Factorized Random Synthesized Attention | Attention Mechanisms |
Multi-Head Attention | Attention Modules |
Random Synthesized Attention | Attention Mechanisms |
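The Random Synthesized Attention component listed above takes the idea one step further: the alignment matrix is a learned parameter that depends on neither the tokens nor their pairwise interactions. The following numpy sketch illustrates this under assumed shapes; the factorized variant (noted in a comment) is likewise an illustrative decomposition, not the paper's exact code.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def random_synthesizer_attention(X, R, Wv):
    # R is a trainable (L, L) parameter: the attention pattern is fixed
    # across inputs and conditioned on no token content at all.
    A = softmax(R, axis=-1)
    return A @ (X @ Wv)

rng = np.random.default_rng(1)
L, d = 4, 8
X = rng.standard_normal((L, d))
R = rng.standard_normal((L, L))   # learned directly during training
# Factorized variant (fewer parameters): R = R1 @ R2.T with R1, R2 of
# shape (L, k) for some small rank k.
Wv = rng.standard_normal((d, d))
out = random_synthesizer_attention(X, R, Wv)
print(out.shape)  # (4, 8)
```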