SNLI-VE

Introduced by Xie et al. in Visual Entailment: A Novel Task for Fine-Grained Image Understanding

Visual Entailment (VE) is a task over image-sentence pairs in which the premise is an image rather than a natural language sentence, as it would be in traditional Textual Entailment tasks. A trained VE model predicts whether the image semantically entails the text. SNLI-VE is a VE dataset built from the Stanford Natural Language Inference (SNLI) corpus and the Flickr30k dataset.

Source: https://github.com/necla-ml/SNLI-VE
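
To make the task description concrete, here is a minimal sketch of how a VE example and an accuracy check might be represented. It assumes the three-way label set SNLI-VE inherits from SNLI (entailment, neutral, contradiction). The class name `VEExample`, its field names, the toy records, and the `always_entailment` baseline are all hypothetical and chosen for illustration; they are not the schema or code of the official repository.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

# Three-way label set inherited from SNLI.
LABELS = ("entailment", "neutral", "contradiction")


@dataclass(frozen=True)
class VEExample:
    """One image-premise / text-hypothesis pair (illustrative field names)."""
    image_id: str    # identifier of the premise image (hypothetical naming)
    hypothesis: str  # the sentence to be judged against the image
    label: str       # gold label, one of LABELS

    def __post_init__(self) -> None:
        if self.label not in LABELS:
            raise ValueError(f"unknown label: {self.label!r}")


def accuracy(examples: Iterable[VEExample],
             predict: Callable[[str, str], str]) -> float:
    """Fraction of examples where predict(image_id, hypothesis) equals the gold label."""
    items = list(examples)
    if not items:
        raise ValueError("no examples to evaluate")
    correct = sum(predict(ex.image_id, ex.hypothesis) == ex.label for ex in items)
    return correct / len(items)


def always_entailment(image_id: str, hypothesis: str) -> str:
    """Trivial baseline that ignores both inputs (a stand-in for a real model)."""
    return "entailment"


# Made-up records for illustration only; these are not taken from SNLI-VE.
toy_examples = [
    VEExample("img_0001.jpg", "Two people are outdoors.", "entailment"),
    VEExample("img_0001.jpg", "The people are siblings.", "neutral"),
    VEExample("img_0001.jpg", "Nobody is in the picture.", "contradiction"),
]

if __name__ == "__main__":
    print(accuracy(toy_examples, always_entailment))
```

A real model would replace `always_entailment` with a function that loads the image for `image_id` and scores it against the hypothesis; the evaluation loop stays the same.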

Benchmarks

Task	Dataset Variant	Best Model
Visual Entailment	SNLI-VE val	OFA
Visual Entailment	SNLI-VE test	OFA

Dataset Loaders

allenai/allennlp-models (513 stars)
necla-ml/SNLI-VE (105 stars)

Tasks

Visual Question Answering (VQA)
Natural Language Inference
Visual Reasoning

Similar Datasets

e-ViL

Visual Question Answering v2.0

NLVR

e-SNLI-VE
