awesome-llmops
github.com/tensorchord/awesome-llmops
An awesome & curated list of best LLMOps tools for developers
What's inside
Training
- Accelerate (Frameworks for Training)
🚀 A simple way to train and use PyTorch models with multi-GPU, TPU, mixed-precision.
- Aim (Experiment Tracking)
An easy-to-use and performant open-source experiment tracker.
- alpaca-lora (Foundation Model Fine Tuning)
Instruct-tune LLaMA on consumer hardware
- Apache MXNet (Frameworks for Training)
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler.
- axolotl (Frameworks for Training)
A tool designed to streamline the fine-tuning of various AI models, offering support for multiple configurations and architectures.
- Caffe (Frameworks for Training)
A fast open framework for deep learning.
LLMOps
- agenta
The LLMOps platform to build robust LLM apps. Easily experiment and evaluate different prompts, models, and workflows to build robust apps.
- AgentField
Open-source control plane for building and operating AI agents like APIs at scale, with routing, memory, observability, identity, auth, and policy controls.
- AgentMark
Type-Safe Markdown-based Agents
- ai-evaluation
Evaluation framework for automated, reproducible scoring of LLM, agent, and workflow performance.
- AI studio
A reliable open-source AI studio for building the core infrastructure stack of your LLM applications. It gives you visibility, makes your application reliable, and prepares it for production with features such as caching, rate limiting, exponential retry, and model fallback.
- Arize-Phoenix
ML observability for LLMs, vision, language, and tabular models.
Optimizations
- agent-opt
Automated optimization engine for improving agent workflows using feedback-driven iterative refinements.
- Entroly
Information-theoretic context optimization proxy. Cuts LLM token costs by 70–95% with zero accuracy loss using greedy submodular knapsack maximization.
- FeatherCNN
FeatherCNN is a high performance inference engine for convolutional neural networks.
- Forward
A library for high performance deep learning inference on NVIDIA GPUs.
Code AI
- AgentsMesh
Self-hostable AI Agent Workforce Platform. Multi-agent orchestration with remote AI workstations (AgentPods), PTY sandbox + git worktree isolation, built-in Kanban, and per-pod MCP server. Supports Claude Code, Codex CLI, Gemini CLI, Aider, OpenCode.
- AIDE
Open-source ML engineering agent that uses tree search to explore solution spaces. Automates machine learning experimentation from data analysis to model training.
- Bernstein
Deterministic Python orchestrator for 37 CLI coding agents (Claude Code, Codex CLI, Gemini CLI, GitHub Copilot CLI, Cursor, Aider, OpenHands, OpenCode, Goose, Qwen, Ollama, ...) running in parallel git worktrees. First-class MCP server, quality gates, cost tracking with budgets.
- CodeGeeX
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
- CodeGen
CodeGen is an open-source model for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.
- CodeT5
Open Code LLMs for Code Understanding and Generation.
Large Scale Deployment
- Airflow (Workflow)
A platform to programmatically author, schedule and monitor workflows.
- aqueduct (Workflow)
An Open-Source Platform for Production Data Science
- Argo Workflows (Workflow)
Workflow engine for Kubernetes.
- ClearML (ML Platforms)
Auto-Magical CI/CD to streamline your ML workflow. Experiment Manager, MLOps and Data-Management.
- Comet (Model Management)
Comet is an MLOps platform that offers Model Production Management, a Model Registry, and full model lineage from training straight through to production. Use Comet for model reproducibility, model debugging, model versioning, model visibility, model auditing, model governance, and model monitoring.
- dstack (ML Platforms)
Open-source confidential AI framework for secure LLM deployment with data privacy, providing hardware-enforced isolation for production ML workloads.
Search
- Airweave (Hybrid search)
An easy way to turn any app into searchable data for LLMs.
- AquilaDB (Vector search)
An easy-to-use neural search engine. Index latent vectors along with JSON metadata and do efficient k-NN search.
- Awadb (Vector search)
AI-native database for embedding vectors
- Chroma (Vector search)
The open-source embedding database
- Epsilla (Vector search)
A 10x faster, cheaper, and better vector database
- Infinity (Vector search)
The AI-native database built for LLM applications, providing incredibly fast vector and full-text search
Model
- Alpaca (Large Language Model)
Code and documentation to train Stanford's Alpaca models, and generate the data.
- bark (Audio Foundation Model)
Bark is a transformer-based text-to-audio model created by Suno. Bark can generate highly realistic, multilingual speech as well as other audio - including music, background noise and simple sound effects.
- BELLE (Large Language Model)
A 7B large language model fine-tuned on a corpus of 34B Chinese characters, based on LLaMA and Alpaca.
- Bloom (Large Language Model)
BigScience Large Open-science Open-access Multilingual Language Model
- ChatGLM2-6B (Large Language Model)
ChatGLM2-6B is the second-generation version of the open-source bilingual (Chinese-English) chat model ChatGLM-6B.
- disco-diffusion (CV Foundation Model)
A frankensteinian amalgamation of notebooks, models and techniques for the generation of AI Art and Animations.
Serving
- Alpaca-LoRA-Serve (Large Model Serving)
Alpaca-LoRA as a chatbot service
- BentoML (Frameworks/Servers for Serving)
The Unified Model Serving Framework
- Clip-as-a-service (Large Model Serving)
Serving the OpenAI CLIP model
- CTranslate2 (Large Model Serving)
Fast inference engine for Transformer models in C++
- DeepSpeed-MII (Large Model Serving)
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
- Faster Whisper (Large Model Serving)
Fast inference engine for Whisper in C++ using CTranslate2.
Showing a sample of 386 resources. View the full list on GitHub.