awesome-distributed-ml
github.com/shenggan/awesome-distributed-ml
A curated list of awesome projects and papers for distributed training or inference
GitHub Stars: 279
Curated Resources: 94
Categories: 2 (Open Source Projects, Papers)
Last Refreshed: 7 hours ago
What's inside
Papers
- Accelerating Distributed MoE Training and Inference with Lina (Mixture-of-Experts System)
- Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Multi-GPU Platforms (Graph Neural Networks System)
- ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training (Memory Efficient Training)
- Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning (Auto Parallelization)
- Amazon SageMaker Model Parallelism: A General and Flexible Framework for Large Model Training (Hybrid Parallelism & Framework)
- A Survey on Auto-Parallelism of Neural Networks Training (Survey)
Open Source Projects
- Alpa: Auto Parallelization for Large-Scale Neural Networks
- ColossalAI: A Unified Deep Learning System for Large-Scale Parallel Training
- DeepSpeed: A Deep Learning Optimization Library that Makes Distributed Training and Inference Easy, Efficient, and Effective.
- EasyDist: Automated Parallelization System and Infrastructure
- Easy Parallel Library: A General and Efficient Deep Learning Framework for Distributed Model Training
- exo: Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚
Showing a sample of 94 resources. View the full list on GitHub →