awesome-vision-language-models-for-earth-observation

A curated list of awesome vision and language resources for earth observation.

CapERA: Captioning Events in Aerial Videos
Size: 2,864 videos and 14,320 captions, with each video paired with five unique captions
Change Detection-Based Visual Question Answering Dataset
Size: 2,968 pairs of multitemporal images and more than 122,000 question–answer pairs
Classes: 6
Resolution: 512×512 pixels
Platforms: Based on the semantic change detection dataset (SECOND)
Use: Remote Sensing Visual Question Answering
Dense Labeling Remote Sensing Dataset (DLRSD)
Size: 2,100 images
Number of Classes: 21
Resolution: 256 x 256
Platforms: Extension of the UC Merced dataset
Use: Remote Sensing Image Retrieval (RSIR), Classification and Semantic Segmentation
DIOR-Remote Sensing Visual Grounding Dataset (RSVGD)
Size: 38,320 RS image-query pairs and 17,402 RS images
Number of Classes: 20
Resolution: 800 x 800
Platforms: DIOR dataset
Use: Remote Sensing Visual Grounding
FloodNet Visual Question Answering Dataset
Size: 11,000 question-image pairs
Resolution: 224 x 224
Platforms: UAV-DJI Mavic Pro quadcopters, after Hurricane Harvey
Use: Remote Sensing Visual Question Answering
LAION-EO
Size: 24,933 samples, with 40.1% English captions as well as captions in other common languages from LAION-5B
Resolution: mean height of 633.0 pixels (up to 9,999) and mean width of 843.7 pixels (up to 19,687)
Platforms: Based on LAION-5B