awesome-multi-modal-dialog
github.com/yuco-z/awesome-multi-modal-dialog
[Paperlist] Awesome paper list of multimodal dialog, including methods, datasets and metrics
GitHub Stars: 37
Curated Resources: 64
Categories: 2
Last Refreshed: 1 hour ago
What's inside
Methods
- Answer-Driven Visual State Estimator for Goal-Oriented Visual Dialogue
Question Generation
GuessWhat?!; MS-COCO
- Are You Talking to Me? Reasoned Visual Dialog Generation through Adversarial Learning
Visual Grounded Dialogue
VisDial v0.9
- Ask No More: Deciding when to guess in referential visual dialogue
Visual Grounded Dialogue
Guessing
- Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model
Visual Grounded Dialogue
VisDial v0.9
- Beyond task success: A closer look at jointly learning to see, ask, and GuessWhat
Visual Grounded Dialogue
GuessWhat?!
- Category-Based Strategy-Driven Question Generator for Visual Dialogue
Question Generation
GuessWhat?!
Datasets
- CLEVR: A diagnostic dataset for compositional language and elementary visual reasoning
visual QA (VQA)
- CLEVR-Dialog: A Diagnostic Dataset for Multi-Round Reasoning in Visual Dialog
visual-grounded dialog (VGD)
- Constructing Multi-Modal Dialogue Dataset by Replacing Text with Semantically Relevant Images
multimodal conv. (MMC)
- Embodied Question Answering (EQA)
visual QA (VQA)
- GuessWhat?! Visual object discovery through multi-modal dialogue
visual QA (VQA)
- Image-Chat: Engaging Grounded Conversations
visual-grounded dialog (VGD)
Showing a sample of 64 resources. View the full list on GitHub.