
An awesome, curated list of the best LLMOps tools for developers

GitHub Stars: 5.8k
Curated Resources: 386
Categories: 14
Last Refreshed: 17 hours ago
Model, Serving, Security, LLMOps, Search, Code AI, Training, Data, Large Scale Deployment, Performance, AutoML, Optimizations, Federated ML, Awesome Lists

Use this list with your AI agent

Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:

"Show me frameworks for training resources from awesome-llmops"

Installation instructions →

What's inside

Training

  • Accelerate (Frameworks for Training)

    🚀 A simple way to train and use PyTorch models with multi-GPU, TPU, mixed-precision.

  • Aim (Experiment Tracking)

    An easy-to-use and performant open-source experiment tracker.

  • alpaca-lora (Foundation Model Fine Tuning)

    Instruct-tune LLaMA on consumer hardware

  • Apache MXNet (Frameworks for Training)

    Lightweight, portable, flexible distributed/mobile deep learning with a dynamic, mutation-aware dataflow dependency scheduler.

  • axolotl (Frameworks for Training)

    A tool designed to streamline the fine-tuning of various AI models, offering support for multiple configurations and architectures.

  • Caffe (Frameworks for Training)

    A fast open framework for deep learning.

LLMOps

  • agenta

    The LLMOps platform to build robust LLM apps. Easily experiment with and evaluate different prompts, models, and workflows.

  • AgentField

    Open-source control plane for building and operating AI agents like APIs at scale, with routing, memory, observability, identity, auth, and policy controls.

  • AgentMark

    Type-Safe Markdown-based Agents

  • ai-evaluation

    Evaluation framework for automated, reproducible scoring of LLM, agent, and workflow performance.

  • AI studio

    A reliable open-source AI studio for building the core infrastructure stack for your LLM applications. It lets you gain visibility, make your application reliable, and prepare it for production with features such as caching, rate limiting, exponential retry, model fallback, and more.

  • Arize-Phoenix

    ML observability for LLMs, vision, language, and tabular models.

Optimizations

  • agent-opt

    Automated optimization engine for improving agent workflows using feedback-driven iterative refinements.

  • Entroly

    Information-theoretic context optimization proxy. Cuts LLM token costs by 70–95% with zero accuracy loss using greedy submodular knapsack maximization.

  • FeatherCNN

    FeatherCNN is a high-performance inference engine for convolutional neural networks.

  • Forward

    A library for high-performance deep learning inference on NVIDIA GPUs.

Code AI

  • AgentsMesh

    Self-hostable AI Agent Workforce Platform. Multi-agent orchestration with remote AI workstations (AgentPods), PTY sandbox + git worktree isolation, built-in Kanban, and per-pod MCP server. Supports Claude Code, Codex CLI, Gemini CLI, Aider, OpenCode.

  • AIDE

    Open-source ML engineering agent that uses tree search to explore solution spaces. Automates machine learning experimentation from data analysis to model training.

  • Bernstein

    Deterministic Python orchestrator for 37 CLI coding agents (Claude Code, Codex CLI, Gemini CLI, GitHub Copilot CLI, Cursor, Aider, OpenHands, OpenCode, Goose, Qwen, Ollama, ...) running in parallel git worktrees. First-class MCP server, quality gates, cost tracking with budgets.

  • CodeGeeX

    CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)

  • CodeGen

    CodeGen is an open-source model for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.

  • CodeT5

    Open Code LLMs for Code Understanding and Generation.

Large Scale Deployment

  • Airflow (Workflow)

    A platform to programmatically author, schedule and monitor workflows.

  • aqueduct (Workflow)

    An open-source platform for production data science.

  • Argo Workflows (Workflow)

    Workflow engine for Kubernetes.

  • ClearML (ML Platforms)

    Auto-Magical CI/CD to streamline your ML workflow. Experiment Manager, MLOps and Data-Management.

  • Comet (Model Management)

    Comet is an MLOps platform that offers Model Production Management, a Model Registry, and full model lineage from training straight through to production. Use Comet for model reproducibility, model debugging, model versioning, model visibility, model auditing, model governance, and model monitoring.

  • dstack (ML Platforms)

    Open-source confidential AI framework for secure LLM deployment with data privacy, providing hardware-enforced isolation for production ML workloads.

Search

  • Airweave (Hybrid search)

    An easy way to turn any app into searchable data for LLMs.

  • AquilaDB (Vector search)

    An easy-to-use neural search engine. Index latent vectors along with JSON metadata and do efficient k-NN search.

  • Awadb (Vector search)

    AI-native database for embedding vectors.

  • Chroma (Vector search)

    The open-source embedding database.

  • Epsilla (Vector search)

    A 10x faster, cheaper, and better vector database

  • Infinity (Vector search)

    The AI-native database built for LLM applications, providing fast vector and full-text search.

Model

  • Alpaca (Large Language Model)

    Code and documentation to train Stanford's Alpaca models, and generate the data.

  • bark (Audio Foundation Model)

    Bark is a transformer-based text-to-audio model created by Suno. Bark can generate highly realistic, multilingual speech as well as other audio - including music, background noise and simple sound effects.

  • BELLE (Large Language Model)

    A 7B large language model fine-tuned on a 34B Chinese character corpus, based on LLaMA and Alpaca.

  • Bloom (Large Language Model)

    BigScience Large Open-science Open-access Multilingual Language Model

  • ChatGLM2-6B (Large Language Model)

    ChatGLM2-6B is the second-generation version of the open-source bilingual (Chinese-English) chat model ChatGLM-6B.

  • disco-diffusion (CV Foundation Model)

    A frankensteinian amalgamation of notebooks, models and techniques for the generation of AI Art and Animations.

Serving

  • Alpaca-LoRA-Serve (Large Model Serving)

    Alpaca-LoRA as a chatbot service.

  • BentoML (Frameworks/Servers for Serving)

    The Unified Model Serving Framework

  • Clip-as-a-service (Large Model Serving)

    Serving the OpenAI CLIP model.

  • CTranslate2 (Large Model Serving)

    Fast inference engine for Transformer models in C++.

  • DeepSpeed-MII (Large Model Serving)

    MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

  • Faster Whisper (Large Model Serving)

    Fast inference engine for Whisper in C++ using CTranslate2.

Showing a sample of 386 resources. View the full list on GitHub →