awesome-production-machine-learning
github.com/eric-erki/awesome-production-machine-learning
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
What's inside
Adversarial Robustness Libraries
- AdvBox
Generate adversarial examples from the command line with zero coding, using PaddlePaddle, PyTorch, Caffe2, MXNet, Keras, and TensorFlow. Includes 10 attacks and 6 defenses. Used to implement
- Adversarial DNN Playground
the attack library is limited in size, but it has a nice front-end to it with buttons you can press!
- AdverTorch
library for adversarial attacks / defenses specifically for PyTorch.
- Alibi Detect
alibi-detect is a Python package focused on outlier, adversarial and concept drift detection. The package aims to cover both online and offline detectors for tabular data, text, images and time series. The outlier detection methods should allow the user to identify global, contextual and collective outliers.
- Artificial Adversary
- CleverHans
library for testing adversarial attacks / defenses maintained by some of the most important names in adversarial ML, namely Ian Goodfellow (ex-Google Brain, now Apple) and Nicolas Papernot (Google Brain). Comes with some nice tutorials!
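To make the idea of an adversarial example concrete, here is a minimal sketch of one well-known attack, the fast gradient sign method (FGSM), against a tiny logistic-regression model. FGSM is chosen only for illustration. The model, weights, and function names below are hypothetical and are not the API of any library listed above.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def sign(value):
    return (value > 0) - (value < 0)


def fgsm_perturb(weights, bias, x, y, epsilon):
    """One FGSM step: move each feature by epsilon in the direction
    that increases the cross-entropy loss for the true label y."""
    p = predict(weights, bias, x)
    # For logistic regression, d(loss)/d(x_i) = (p - y) * w_i.
    grad = [(p - y) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


# Hypothetical model and input, assumed for this sketch only.
weights, bias = [2.0, -1.0], 0.0
x, y = [1.0, 0.5], 1

x_adv = fgsm_perturb(weights, bias, x, y, epsilon=0.5)
print(predict(weights, bias, x), predict(weights, bias, x_adv))
```

The perturbed input lowers the model's confidence in the true label, which is the behaviour the attack and defense libraries above are built to produce and to guard against at scale.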
Explaining Black Box Models and Datasets
- Aequitas
An open-source bias audit toolkit for data scientists, machine learning researchers, and policymakers to audit machine learning models for discrimination and bias, and to make informed and equitable decisions around developing and deploying predictive risk-assessment tools.
- Alibi
Alibi is an open source Python library aimed at machine learning model inspection and interpretation. The initial focus of the library is on black-box, instance-based model explanations.
- anchor
Code for the paper
- captum
model interpretability and understanding library for PyTorch developed by Facebook. It contains general purpose implementations of integrated gradients, saliency maps, smoothgrad, vargrad and others for PyTorch models.
- casme
Example of using classifier-agnostic saliency map extraction on ImageNet, presented in the paper
- ContrastiveExplanation (Foil Trees)
Python script for model agnostic contrastive/counterfactual explanations for machine learning. Accompanying code for the paper
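Integrated gradients, one of the attribution methods named in the captum entry, can be sketched in a few lines. This is a plain-Python approximation on a toy function with a hand-written gradient, assumed here for illustration. It is not captum's implementation, and every name below is hypothetical.

```python
def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Approximate integrated gradients with a midpoint Riemann sum
    along the straight path from baseline to x."""
    totals = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grads = grad_fn(point)
        totals = [t + g for t, g in zip(totals, grads)]
    # Scale the averaged gradient by the distance from the baseline.
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, totals)]


def toy_model(x):
    # Hypothetical model: f(x) = x0^2 + 3 * x1
    return x[0] ** 2 + 3.0 * x[1]


def toy_gradient(x):
    # Analytic gradient of toy_model.
    return [2.0 * x[0], 3.0]


x = [2.0, 1.0]
baseline = [0.0, 0.0]
attributions = integrated_gradients(toy_gradient, x, baseline)
print(attributions)
```

A useful sanity check is the completeness property: the attributions should sum to the difference between the model output at the input and at the baseline.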
Commercial Platforms
- Algorithmia
Cloud platform to build, deploy and serve machine learning models
- Amazon SageMaker
End-to-end machine learning development and deployment interface: build notebooks backed by EC2 instances, then host trained models behind an API
- cnvrg.io
An end-to-end platform to manage, build and automate machine learning
- Comet.ml
Machine learning experiment management. Free for open source and students
- Dataiku
Collaborative data science platform powering both self-service analytics and the operationalization of machine learning models in production.
- DataRobot
Automated machine learning platform which enables users to build and deploy machine learning models.
Data Storage Optimisation
- Alluxio
A virtual distributed storage system that bridges the gap between computation frameworks and storage systems.
- Apache Arrow
In-memory columnar representation of data compatible with Pandas, Hadoop-based systems, etc
- Apache Kafka
Distributed event streaming platform
- Apache Parquet
On-disk columnar representation of data compatible with Pandas, Hadoop-based systems, etc
- BayesDB
Database that allows for built-in non-parametric Bayesian model discovery and querying for data on a database-like interface
- ClickHouse
ClickHouse is an open source column-oriented database management system supported by Yandex
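Several entries above describe a columnar representation of data. As a rough illustration of what that means, the sketch below pivots row-oriented records into one list per column. It uses plain Python containers only, is not the Arrow or Parquet API, and assumes every record has the same keys.

```python
# Row-oriented: one dict per record (hypothetical sample data).
rows = [
    {"user": "a", "clicks": 3},
    {"user": "b", "clicks": 5},
    {"user": "c", "clicks": 2},
]


def to_columns(records):
    """Pivot a list of records into a dict of column lists."""
    columns = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


columns = to_columns(rows)

# An aggregate over one field now scans a single contiguous list
# instead of touching every record.
total_clicks = sum(columns["clicks"])
print(columns, total_clicks)
```

Keeping each column together is what lets analytical queries read only the fields they need.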
Data Pipeline ETL Frameworks
- Apache Airflow
Data Pipeline framework built in Python, including scheduler, DAG definition and a UI for visualisation
- Apache Nifi
Apache NiFi was made for dataflow. It supports highly configurable directed graphs of data routing, transformation, and system mediation logic.
- Azkaban
Azkaban is a batch workflow job scheduler created at LinkedIn to run Hadoop jobs. Azkaban resolves the ordering through job dependencies and provides an easy to use web user interface to maintain and track your workflows.
- Genie
Job orchestration engine to interface and trigger the execution of jobs from Hadoop-based systems
- Luigi
Luigi is a Python module that helps you build complex pipelines of batch jobs, handling dependency resolution, workflow management, visualisation, etc
- Neuraxle
A framework for building neat pipelines, providing the right abstractions to chain your data transformation and prediction steps with data streaming, as well as doing hyperparameter searches (AutoML).
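The dependency resolution these schedulers perform can be pictured as a topological sort over a job graph. The sketch below uses Python's standard-library `graphlib` (Python 3.9+) on a hypothetical four-job pipeline. The job names are invented for illustration, and this is not how any particular framework above is implemented.

```python
from graphlib import TopologicalSorter

# Each job maps to the set of jobs it depends on (hypothetical pipeline).
dependencies = {
    "extract": set(),
    "clean": {"extract"},
    "train": {"clean"},
    "report": {"train", "clean"},
}


def execution_order(deps):
    """Return the jobs in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

A real scheduler adds retries, scheduling, and a UI on top, but the ordering step reduces to this kind of graph traversal. `TopologicalSorter` raises `CycleError` if the jobs depend on each other in a loop.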
Data Stream Processing
- Apache Flink
Open source stream processing framework with powerful stream and batch processing capabilities.
- Apache Samza
Distributed stream processing framework. It uses Apache Kafka for messaging, and Apache Hadoop YARN to provide fault tolerance, processor isolation, security, and resource management.
- Brooklin
Distributed service from LinkedIn for streaming data between heterogeneous source and destination systems, such as data stores and messaging systems.
- Faust
Stream processing library built on Python's asyncio and an async Kafka client, inspired by Kafka Streams.
- Kafka Streams
Kafka client library for building applications and microservices where the input and output are stored in Kafka clusters
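What distinguishes stream processing from batch processing is that each event is handled as it arrives, with state carried between events. A minimal sketch of that pattern, a stateful per-key running count over an iterator, is shown below. It involves no Kafka client and none of the frameworks above, and the event names are hypothetical.

```python
from collections import defaultdict


def running_counts(events):
    """Consume events one at a time and emit the updated count per key."""
    counts = defaultdict(int)
    for key in events:
        counts[key] += 1
        yield key, counts[key]


# A finite iterator stands in for an unbounded stream in this sketch.
stream = iter(["click", "view", "click", "click"])
results = list(running_counts(stream))
print(results)
```

The frameworks in this section add partitioning, fault-tolerant state, and delivery guarantees around the same basic loop.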
Model and Data Versioning
- Apache Marvin
- Catalyst
High-level utils for PyTorch DL & RL research. It was developed with a focus on reproducibility, fast experimentation and code/ideas reusing.
- D6tflow
A Python library for building complex data science workflows.
- DAGsHub
The home for data science collaboration. A platform, based on DVC, for data science project management and collaboration.
- Data Version Control (DVC)
A Git-based version control tool for managing versions of datasets and models
- FGLab
Machine learning dashboard, designed to make prototyping experiments easier.
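One common way to version large artifacts such as datasets and models is to identify each version by a hash of its content. The sketch below shows that general idea with an in-memory registry. The function names and the 12-character truncated SHA-256 identifier are assumptions made for this example, not the behaviour of any tool listed above.

```python
import hashlib

# name -> {version id -> stored bytes}; hypothetical in-memory store.
registry = {}


def content_version(data: bytes) -> str:
    """Derive a short version id from the artifact's content."""
    return hashlib.sha256(data).hexdigest()[:12]


def register(name: str, data: bytes) -> str:
    """Store the artifact under its content-derived version id."""
    version = content_version(data)
    registry.setdefault(name, {})[version] = data
    return version


v1 = register("model", b"weights-v1")
v1_again = register("model", b"weights-v1")  # same content, same id
v2 = register("model", b"weights-v2")        # new content, new id
print(v1, v2)
```

Because the identifier depends only on the bytes, registering unchanged content is idempotent, and a small text pointer holding the id can be committed to Git while the artifact itself lives elsewhere.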
Function as a Service Frameworks
- Apache OpenWhisk
Open source, distributed serverless platform that executes functions in response to events at any scale.
- Fission
(Early Alpha) Serverless functions as a service framework on Kubernetes
- Hydrosphere Mist
Serverless proxy for Apache Spark clusters
- Hydrosphere ML Lambda
Open source model management cluster for deploying, serving and monitoring machine learning models and ad-hoc algorithms with a FaaS architecture
- KNative Serving
Kubernetes-based serverless microservices with "scale-to-zero" functionality.
- OpenFaaS
Serverless functions framework with RESTful API on Kubernetes
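The core of the function-as-a-service model is that small functions run in response to events. The sketch below imitates that with an in-process registry and dispatcher. The decorator, event type, and handler are all hypothetical, and real platforms add packaging, isolation, and autoscaling that are not modelled here.

```python
# event type -> list of registered handler functions
handlers = {}


def on(event_type):
    """Register the decorated function as a handler for event_type."""
    def decorator(fn):
        handlers.setdefault(event_type, []).append(fn)
        return fn
    return decorator


@on("image.uploaded")
def make_thumbnail(payload):
    return f"thumbnail for {payload['name']}"


def dispatch(event_type, payload):
    """Invoke every handler registered for the event and collect results."""
    return [fn(payload) for fn in handlers.get(event_type, [])]


out = dispatch("image.uploaded", {"name": "cat.png"})
print(out)
```

An event with no registered handler simply produces an empty result list, which loosely mirrors the idea that nothing runs when nothing is triggered.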
Showing a sample of 232 resources. View the full list on GitHub →