Chart2Text is a dataset crawled from 23,382 freely accessible pages on statista.com in early March 2020, yielding a total of 8,305 charts and associated summaries. For each chart, the chart image, the underlying data table, the title, the axis labels, and a human-written summary describing the statistic were downloaded.
8 PAPERS • NO BENCHMARKS YET
SciGen is a challenge dataset for reasoning-aware data-to-text generation. It consists of tables from scientific articles and their corresponding descriptions. SciGen has two distinguishing properties: (1) the tables mostly contain numerical values, and (2) the corresponding descriptions require arithmetic reasoning. SciGen is therefore the first dataset that assesses the arithmetic reasoning capabilities of generation models on complex input structures, i.e., tables from scientific articles. SciGen opens new avenues for future research in reasoning-aware text generation and evaluation.
2 PAPERS • NO BENCHMARKS YET