The benchmarks section lists all benchmarks using a given dataset or any of
its variants. We use variants to distinguish between results evaluated on
slightly different versions of the same dataset. For example, ImageNet 32⨉32
and ImageNet 64⨉64 are variants of the ImageNet dataset.
WMT 2016 is a collection of datasets used in shared tasks of the First Conference on Machine Translation. The conference builds on ten previous Workshops on statistical Machine Translation.
The conference featured ten shared tasks:
a news translation task,
an IT domain translation task,
a biomedical translation task,
an automatic post-editing task,
a metrics task (assess MT quality given reference translation).
a quality estimation task (assess MT quality without access to any reference),