The General Language Understanding Evaluation (GLUE) benchmark is a collection of nine natural language understanding tasks: the single-sentence tasks CoLA and SST-2, the similarity and paraphrasing tasks MRPC, STS-B, and QQP, and the natural language inference tasks MNLI, QNLI, RTE, and WNLI.
2,776 PAPERS • 25 BENCHMARKS
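As a rough illustration of the task grouping described above, the sketch below encodes the nine tasks as a dictionary and checks that each can be loaded by name. It assumes the Hugging Face `datasets` library and its "glue" configuration names (e.g. `sst2`, `stsb`); these are tooling assumptions, not part of the benchmark definition itself.

```python
# A minimal sketch of the GLUE task grouping, assuming the Hugging Face
# `datasets` library and its "glue" configuration names.
from datasets import load_dataset

GLUE_TASKS = {
    "single-sentence": ["cola", "sst2"],
    "similarity and paraphrase": ["mrpc", "stsb", "qqp"],
    "natural language inference": ["mnli", "qnli", "rte", "wnli"],
}

for group, tasks in GLUE_TASKS.items():
    for task in tasks:
        # Each task downloads as a DatasetDict with its own splits.
        dataset = load_dataset("glue", task)
        print(f"{group:>30} | {task:<5} | splits: {list(dataset)}")
```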
The Corpus of Linguistic Acceptability (CoLA) consists of 10,657 sentences from 23 linguistics publications, expertly annotated for acceptability (grammaticality) by their original authors. The public version contains the 9,594 sentences of the training and development sets and excludes the 1,063 sentences of the held-out test set.
627 PAPERS • 3 BENCHMARKS
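A minimal sketch of inspecting the public CoLA splits follows, again assuming CoLA is accessed through the Hugging Face `datasets` entry for GLUE; the split names and the meaning of the binary label in that distribution (1 for acceptable, 0 for unacceptable) are assumptions about that release rather than part of the corpus description above.

```python
# A minimal sketch of loading the public CoLA splits via the Hugging Face
# `datasets` library (a tooling assumption, not part of CoLA itself).
from datasets import load_dataset

cola = load_dataset("glue", "cola")

# The public release covers the training and development (validation) sets;
# the held-out test split ships without gold labels.
print({split: len(cola[split]) for split in cola})

example = cola["train"][0]
# Each example pairs a sentence with a binary acceptability label
# (1 = acceptable, 0 = unacceptable in this distribution).
print(example["sentence"], example["label"])
```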
DaLAJ 1.0 is a dataset of linguistic acceptability judgments for Swedish, comprising 9,596 sentences in its first version, together with an initial experiment using it for the binary classification task. DaLAJ is based on the SweLL second-language learner data, consisting of essays at different levels of proficiency.
5 PAPERS • 1 BENCHMARK
ItaCoLA (Italian Corpus of Linguistic Acceptability) is a corpus for monolingual and cross-lingual acceptability studies, containing almost 10,000 Italian sentences annotated with acceptability judgments.
The Russian Corpus of Linguistic Acceptability (RuCoLA) is built from the ground up under the well-established binary linguistic acceptability (LA) approach. RuCoLA consists of 9.8k in-domain sentences from linguistic publications and 3.6k out-of-domain sentences produced by generative models.
4 PAPERS • 1 BENCHMARK