Context Awesome

awesome-multi-modal-dialog

github.com/yuco-z/awesome-multi-modal-dialog

[Paperlist] Awesome paper list of multimodal dialog, including methods, datasets and metrics

GitHub Stars: 36
Curated Resources: 64
Categories: 2 (Datasets, Methods)
Last Refreshed: 21 hours ago

Use this list with your AI agent

Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:

"Show me question generation resources from awesome-multi-modal-dialog"


What's inside

Methods

Answer-Driven Visual State Estimator for Goal-Oriented Visual Dialogue (Question Generation)
GuessWhat?!; MS-COCO
Are You Talking to Me? Reasoned Visual Dialog Generation through Adversarial Learning (Visual Grounded Dialogue)
VisDial v0.9
Ask No More: Deciding when to guess in referential visual dialogue (Visual Grounded Dialogue)
Guessing
Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model (Visual Grounded Dialogue)
VisDial v0.9
Beyond task success: A closer look at jointly learning to see, ask, and GuessWhat (Visual Grounded Dialogue)
GuessWhat?!
Category-Based Strategy-Driven Question Generator for Visual Dialogue (Question Generation)
GuessWhat?!

Datasets

CLEVR: A diagnostic dataset for compositional language and elementary visual reasoning
visual QA (VQA)
CLEVR-Dialog: A Diagnostic Dataset for Multi-Round Reasoning in Visual Dialog
visual-grounded dialog (VGD)
Constructing Multi-Modal Dialogue Dataset by Replacing Text with Semantically Relevant Images
multimodal conv. (MMC)
Embodied Question Answering (EQA)
visual QA (VQA)
GuessWhat?! Visual object discovery through multi-modal dialogue
visual QA (VQA)
Image-Chat: Engaging Grounded Conversations
visual-grounded dialog (VGD)

Showing a sample of the 64 resources. The full list is on GitHub at github.com/yuco-z/awesome-multi-modal-dialog.