A curated list of different papers and datasets in various areas of audio-visual processing

772 GitHub stars, 31 curated resources, 1 category, last refreshed 4 hours ago

What's inside

Datasets

  • ACAV100M

    Built by processing 140 million full-length videos (total duration 1,030 years) to produce a dataset of 100 million 10-second clips (31 years) with high audio-visual correspondence.

  • AIST++

    A large-scale 3D human dance motion dataset containing a wide variety of 3D motion paired with music. It is built upon the AIST Dance Database, an uncalibrated multi-view collection of dance videos.

  • AudioSet

    Audio-Visual Classification

  • AudioSet Single Source

    Subset of AudioSet videos containing only a single sounding object

  • AudioSetZSL

    Audio-Visual Zero-shot Learning

  • AuDio Visual Aerial sceNe reCognition datasEt (ADVANCE)

    Geotagged aerial images and sounds, classified into 13 scene classes

Showing a sample of 31 resources. View the full list on GitHub →