DuReader

Introduced by He et al. in DuReader: a Chinese Machine Reading Comprehension Dataset from Real-world Applications

DuReader is a large-scale, open-domain Chinese machine reading comprehension dataset. It consists of 200K questions, 420K answers and 1M documents. The questions and documents are drawn from Baidu Search and Baidu Zhidao, and the answers are manually generated. Each question also carries two manual type annotations: one of Entity, Description or YesNo, and one of Fact or Opinion.
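The two annotation axes described above can be modelled as a small record type. The following is a minimal sketch, not the dataset's actual file schema: the class name, the field names and the label spellings are assumptions chosen to mirror the description, and the sample questions are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass

# Label sets taken from the dataset description; exact spellings in the
# released files may differ (assumption).
QUESTION_TYPES = {"Entity", "Description", "YesNo"}
FACT_OR_OPINION = {"Fact", "Opinion"}


@dataclass(frozen=True)
class QuestionAnnotation:
    """Hypothetical record holding one question and its two type labels."""
    question: str
    question_type: str      # one of QUESTION_TYPES
    fact_or_opinion: str    # one of FACT_OR_OPINION

    def __post_init__(self):
        # Reject labels outside the two annotation axes.
        if self.question_type not in QUESTION_TYPES:
            raise ValueError(f"unknown question type: {self.question_type}")
        if self.fact_or_opinion not in FACT_OR_OPINION:
            raise ValueError(f"unknown fact/opinion label: {self.fact_or_opinion}")


def count_by_label(annotations):
    """Count questions per (question_type, fact_or_opinion) pair."""
    return Counter((a.question_type, a.fact_or_opinion) for a in annotations)


if __name__ == "__main__":
    # Hypothetical sample questions, for illustration only.
    sample = [
        QuestionAnnotation("Which city is the capital of France?", "Entity", "Fact"),
        QuestionAnnotation("Is this phone worth buying?", "YesNo", "Opinion"),
    ]
    print(count_by_label(sample))
```

Because every question receives one label from each axis, the combination yields six possible categories, which is what `count_by_label` tallies.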

Source: https://arxiv.org/pdf/1711.05073v4.pdf

Benchmarks

Task	Dataset Variant	Best Model
Open-Domain Question Answering	DuReader	ERNIE 2.0 Large
Reading Comprehension (Few-Shot)	DuReader	PanGu-α 2.6B
Reading Comprehension (Zero-Shot)	DuReader	PanGu-α 2.6B
Reading Comprehension (One-Shot)	DuReader	PanGu-α 2.6B