awesome-referring-image-segmentation

A collection of papers about Referring Image Segmentation.

GitHub Stars: 826

Curated Resources

Learning From Box Annotations for Referring Image Segmentation
ASDA: Adaptive Selection based Referring Image Segmentation
BKINet: Bilateral Knowledge Interaction Network for Referring Image Segmentation
BRINet: Bi-directional Relationship Inferring Network for Referring Image Segmentation
BUSNet: Bottom-Up Shift and Reasoning for Referring Image Segmentation
CARIS: Context-Aware Referring Image Segmentation

1st MeViS Challenge (CVPR 2024 Workshop: Pixel-level Video Understanding in the Wild)

3D-GRES: Generalized 3D Referring Expression Segmentation
3D-STMN: Dependency-Driven Superpoint-Text Matching Network for End-to-End 3D Referring Expression Segmentation
InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Instance Multi-level Contextual Referring
RefMask3D: Language-Guided Transformer for 3D Referring Segmentation
SegPoint: Segment Any Point Cloud via Large Language Model
X-RefSeg3D: Enhancing Referring 3D Instance Segmentation via Structured Cross-Modal Graph Neural Networks

CLEVR-Ref+: Diagnosing Visual Reasoning with Referring Expressions
ClevrTex: A Texture-Rich Benchmark for Unsupervised Multi-Object Segmentation
Google-Ref: Generation and Comprehension of Unambiguous Object Descriptions
ReferIt: ReferItGame: Referring to Objects in Photographs of Natural Scenes
ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language
UNC+: Modeling Context in Referring Expressions

DsHmp: Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation
HTML: Hybrid Temporal-scale Multimodal Learning Framework for Referring Video Object Segmentation
LBDT: Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation
LMPM: MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions
Locater: Local-Global Context Aware Transformer for Language-Guided Video Segmentation
LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation

RIS-LAD: A Benchmark and Model for Referring Low-Altitude Drone Image Segmentation
RS2-SAM2: Customized SAM2 for Referring Remote Sensing Image Segmentation
SurgRef: Where It Moves, It Matters: Referring Surgical Instrument Segmentation via Motion

Showing a sample of 97 resources. The full list is available on GitHub.