Jigsaw AlexNet (Goyal19, ImageNet-1K)

| Training Techniques | Jigsaw, Weight Decay, SGD with Momentum |
|---|---|
| Architecture | Convolution, Dropout, Dense Connections, ReLU, Max Pooling, Softmax |
| ID | alexnet_in1k_jigsaw_goyal |
Jigsaw AlexNet (Goyal19, ImageNet-22K)

| Training Techniques | Jigsaw, Weight Decay, SGD with Momentum |
|---|---|
| Architecture | Convolution, Dropout, Dense Connections, ReLU, Max Pooling, Softmax |
| ID | alexnet_in22k_jigsaw_goyal |
Jigsaw AlexNet (Goyal19, YFCC100M)

| Training Techniques | Jigsaw, Weight Decay, SGD with Momentum |
|---|---|
| Architecture | Convolution, Dropout, Dense Connections, ReLU, Max Pooling, Softmax |
| ID | alexnet_yfcc100m_jigsaw_goyal |
Jigsaw ResNet-50 - 100 permutations achieves 83.3% Top 1 Accuracy on ImageNet
| Training Techniques | Jigsaw, Weight Decay, SGD with Momentum |
|---|---|
| Architecture | 1x1 Convolution, Bottleneck Residual Block, Batch Normalization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax |
| ID | rn50_in1k_perm100_jigsaw |
Jigsaw ResNet-50 - 10K permutations achieves 81.9% Top 1 Accuracy on ImageNet
| Training Techniques | Jigsaw, Weight Decay, SGD with Momentum |
|---|---|
| Architecture | 1x1 Convolution, Bottleneck Residual Block, Batch Normalization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax |
| ID | rn50_in1k_perm10k_jigsaw |
Jigsaw ResNet-50 (Goyal19, ImageNet-1K)

| Training Techniques | Jigsaw, Weight Decay, SGD with Momentum |
|---|---|
| Architecture | 1x1 Convolution, Bottleneck Residual Block, Batch Normalization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax |
| ID | rn50_in1k_jigsaw_goyal |
Jigsaw ResNet-50 (Goyal19, ImageNet-22K)

| Training Techniques | Jigsaw, Weight Decay, SGD with Momentum |
|---|---|
| Architecture | 1x1 Convolution, Bottleneck Residual Block, Batch Normalization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax |
| ID | rn50_in22k_jigsaw_goyal |
Jigsaw ResNet-50 (Goyal19, YFCC100M)

| Training Techniques | Jigsaw, Weight Decay, SGD with Momentum |
|---|---|
| Architecture | 1x1 Convolution, Bottleneck Residual Block, Batch Normalization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax |
| ID | rn50_yfcc100m_jigsaw_goyal |
Jigsaw ResNet-50 (ImageNet-1K, 2K permutations) achieves 82% Top 1 Accuracy on ImageNet
| Training Techniques | Jigsaw, Weight Decay, SGD with Momentum |
|---|---|
| Architecture | 1x1 Convolution, Bottleneck Residual Block, Batch Normalization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax |
| ID | rn50_in1k_perm2k_jigsaw |
Jigsaw ResNet-50 (ImageNet-22K, 2K permutations) achieves 82.9% Top 1 Accuracy on ImageNet
| Training Techniques | Jigsaw, Weight Decay, SGD with Momentum |
|---|---|
| Architecture | 1x1 Convolution, Bottleneck Residual Block, Batch Normalization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax |
| ID | rn50_in22k_perm2k_jigsaw |
Jigsaw is a self-supervision approach that uses jigsaw-like puzzles as the pretext task for learning image representations: an image is split into a grid of patches, the patches are shuffled by a permutation drawn from a fixed set, and the network is trained to predict which permutation was applied. This particular set of models includes improved Jigsaw models following Goyal et al. (2019).
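The pretext task can be sketched as follows. This is an illustrative toy example, not VISSL's actual implementation; the function names and the distinct-sampling shortcut (a cheap stand-in for the maximal-Hamming-distance permutation selection of Noroozi & Favaro) are assumptions made here for clarity.

```python
# Toy sketch of the jigsaw pretext task: cut an image into a 3x3 grid of
# patches, shuffle them with a permutation from a fixed set, and use the
# permutation's index as the classification label.
import random

import numpy as np


def make_permutation_set(n_perms, n_tiles=9, seed=0):
    """Sample n_perms distinct permutations of the tile indices.

    (A cheap stand-in for the maximal-Hamming-distance selection used by
    Noroozi & Favaro, 2016.)
    """
    rng = random.Random(seed)
    perms = [tuple(range(n_tiles))]
    while len(perms) < n_perms:
        candidate = tuple(rng.sample(range(n_tiles), n_tiles))
        if candidate not in perms:
            perms.append(candidate)
    return perms


def jigsaw_example(image, perms, rng=None):
    """Return shuffled 3x3 patches and the permutation index (the label)."""
    rng = rng or random.Random(0)
    h, w = image.shape[:2]
    ph, pw = h // 3, w // 3
    patches = [
        image[i * ph:(i + 1) * ph, j * pw:(j + 1) * pw]
        for i in range(3)
        for j in range(3)
    ]
    label = rng.randrange(len(perms))          # which permutation was used
    shuffled = [patches[k] for k in perms[label]]
    return shuffled, label


perms = make_permutation_set(100)              # e.g. the 100-permutation setting
img = np.zeros((225, 225, 3), dtype=np.uint8)  # dummy image
patches, label = jigsaw_example(img, perms)
```

In the 100 / 2K / 10K permutation settings above, only the size of this fixed permutation set changes; a larger set makes the classification pretext task harder.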
Get started with VISSL by trying one of the Colab tutorial notebooks.
@article{DBLP:journals/corr/abs-1905-01235,
author = {Priya Goyal and
Dhruv Mahajan and
Abhinav Gupta and
Ishan Misra},
title = {Scaling and Benchmarking Self-Supervised Visual Representation Learning},
journal = {CoRR},
volume = {abs/1905.01235},
year = {2019},
url = {http://arxiv.org/abs/1905.01235},
archivePrefix = {arXiv},
eprint = {1905.01235},
timestamp = {Mon, 28 Sep 2020 08:19:37 +0200},
biburl = {https://dblp.org/rec/journals/corr/abs-1905-01235.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
@article{DBLP:journals/corr/NorooziF16,
author = {Mehdi Noroozi and
Paolo Favaro},
title = {Unsupervised Learning of Visual Representations by Solving Jigsaw
Puzzles},
journal = {CoRR},
volume = {abs/1603.09246},
year = {2016},
url = {http://arxiv.org/abs/1603.09246},
archivePrefix = {arXiv},
eprint = {1603.09246},
timestamp = {Mon, 13 Aug 2018 16:49:09 +0200},
biburl = {https://dblp.org/rec/journals/corr/NorooziF16.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
@misc{goyal2021vissl,
author = {Priya Goyal and Benjamin Lefaudeux and Mannat Singh and Jeremy Reizenstein and Vinicius Reis and
Min Xu and Matthew Leavitt and Mathilde Caron and Piotr Bojanowski and Armand Joulin and
Ishan Misra},
title = {VISSL},
howpublished = {\url{https://github.com/facebookresearch/vissl}},
year = {2021}
}
| MODEL | TOP 1 ACCURACY |
|---|---|
| Jigsaw ResNet-50 (Goyal19, ImageNet-22K) | 53.09% |
| Jigsaw ResNet-50 (Goyal19, YFCC100M) | 51.37% |
| Jigsaw ResNet-50 - 100 permutations | 48.57% |
| Jigsaw ResNet-50 - 10K permutations | 48.11% |
| Jigsaw ResNet-50 (ImageNet-1K, 2K permutations) | 46.73% |
| Jigsaw ResNet-50 (Goyal19, ImageNet-1K) | 46.58% |
| Jigsaw ResNet-50 (ImageNet-22K, 2K permutations) | 44.84% |
| Jigsaw AlexNet (Goyal19, ImageNet-22K) | 37.5% |
| Jigsaw AlexNet (Goyal19, YFCC100M) | 37.01% |
| Jigsaw AlexNet (Goyal19, ImageNet-1K) | 34.82% |