awesome-audio-visual-question-answering
github.com/swarupbehera/awesome-audio-visual-question-answering
A curated list of resources in audio-visual question answering and related areas. :-)
GitHub Stars: 17
Curated Resources: 45
Categories: 1 (Papers)
Last Refreshed: 13 min ago
Use this list with your AI agent
Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:
"Show me 2024 resources from awesome-audio-visual-question-answering"
Installation instructions →
What's inside
Papers
- AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension (2024)
Zhou Zhao, Yichong Leng, Jin Xu, Jingren Zhou, Qian Yang, Xiaohuan Zhou, Chang Zhou, Yunfei Chu, Ziyue Jiang, Wenrui Liu, YuanJun Lv (Code available)
- Answering Diverse Questions via Text Attached with Key Audio-Visual Clues (2024)
Xin Liu, Zitong Yu, Qilang Ye (Code available)
- AQUALLM: Audio Question Answering Data Generation Using Large Language Models (2023)
Swarup Ranjan Behera, Praveen Kumar Pokala, Krishna Mohan Injeti, Jaya Sai Kiran Patibandla, Balakrishna Reddy Pailla (Code not available)
- Audio-Visual Adaptive Fusion Network for Question Answering Based on Contrastive Learning (2025)
- AVicuna: Audio-Visual LLM with Interleaver and Context-Boundary Alignment for Temporal Referential Dialogue (2024)
Jing Bi et al. (Code not available)
- AVQA: A Dataset for Audio-Visual Question Answering on Videos (2022)
Pinci Yang et al. (Code not available)
Showing a sample of 45 resources. View the full list on GitHub →