Video-Adverb Retrieval (Unseen Compositions)

2 papers with code • 3 benchmarks • 3 datasets

The task is to recognize adverbs in adverb-action compositions that were not seen during training.
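
A minimal sketch of what "unseen compositions" means in practice, assuming each sample is a hypothetical `(video_id, adverb, action)` tuple. The helper `split_by_composition` and the toy samples are illustrative only; the benchmarks listed on this page define their own splits.

```python
def split_by_composition(samples, unseen_pairs):
    """Split samples so that held-out (adverb, action) pairs never occur in training.

    `samples` is a list of (video_id, adverb, action) tuples (hypothetical format).
    `unseen_pairs` is the set of (adverb, action) compositions reserved for testing.
    """
    unseen = set(unseen_pairs)
    train = [s for s in samples if (s[1], s[2]) not in unseen]
    test = [s for s in samples if (s[1], s[2]) in unseen]
    return train, test


# Toy data: every adverb and every action appears in training,
# but the composition ("gently", "press") is held out for testing.
samples = [
    ("v1", "firmly", "fold"),
    ("v2", "gently", "fold"),
    ("v3", "firmly", "press"),
    ("v4", "gently", "press"),
]
train, test = split_by_composition(samples, [("gently", "press")])

train_pairs = {(adv, act) for _, adv, act in train}
test_pairs = {(adv, act) for _, adv, act in test}
```

In this toy split the model sees "gently" and "press" separately during training, but must recognize them together only at test time.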

Benchmarks

These leaderboards track progress in Video-Adverb Retrieval (Unseen Compositions).

Dataset	Best Model
VATEX Adverbs	ReGaDa
MSR-VTT Adverbs	ReGaDa
ActivityNet Adverbs	ReGaDa

Datasets

Most implemented papers

How Do You Do It? Fine-Grained Action Understanding with Pseudo-Adverbs

hazeld/pseudoadverbs • CVPR 2022

We aim to understand how actions are performed and identify subtle differences, such as 'fold firmly' vs. 'fold gently'.

Video-adverb retrieval with compositional adverb-action embeddings

ExplainableML/ReGaDa • 26 Sep 2023

We propose a framework for video-to-adverb retrieval (and vice versa) that aligns video embeddings with their matching compositional adverb-action text embedding in a joint embedding space.
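
The idea of retrieving adverbs in a joint embedding space can be sketched as follows. This is a toy illustration, not the ReGaDa implementation: the vectors are made up, and `compose` (an element-wise sum) is a hypothetical stand-in for whatever learned composition a real model uses. Only the retrieval step, ranking candidate adverbs by cosine similarity to the video embedding, is shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def compose(adverb_vec, action_vec):
    """Hypothetical composition: element-wise sum of adverb and action embeddings."""
    return [a + b for a, b in zip(adverb_vec, action_vec)]


def rank_adverbs(video_vec, action_vec, adverb_vecs):
    """Rank candidate adverbs by similarity of their composition to the video."""
    scores = {
        name: cosine(video_vec, compose(vec, action_vec))
        for name, vec in adverb_vecs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Made-up 3-dimensional embeddings, for illustration only.
adverb_vecs = {"firmly": [1.0, 0.0, 0.0], "gently": [0.0, 1.0, 0.0]}
fold = [0.0, 0.0, 1.0]
video = [0.9, 0.1, 1.0]  # a video that lies close to "fold firmly"

ranking = rank_adverbs(video, fold, adverb_vecs)
```

Here `ranking` places "firmly" ahead of "gently", because the composed "firmly" + "fold" vector points in nearly the same direction as the video embedding. Adverb-to-video retrieval would reverse the roles and rank videos against a fixed composition.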