AVA is a project that provides audiovisual annotations of video for improving our understanding of human activity. Each of the video clips has been exhaustively annotated by human annotators, and together they represent a rich variety of scenes, recording conditions, and expressions of human activity. There are annotations for:
98 PAPERS • 7 BENCHMARKS
VPCD contains multi-modal annotations (face, body and voice) for all primary and secondary characters from a range of diverse TV-shows and movies. It is used for evaluating multi-modal person-clustering. It contains body-tracks for each annotated character, face-tracks when visible, and voice-tracks when speaking, with their associated features.
7 PAPERS • 1 BENCHMARK