awesome-seml
github.com/se-ml/awesome-seml
A curated list of articles that cover the software engineering best practices for building machine learning applications.
GitHub Stars: 1.4k
Curated Resources: 84
Categories: 7
Broad Overviews, Data Management, Model Training, Deployment and Operation, Social Aspects, Governance, Tooling
Use this list with your AI agent
Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:
"Show me model training resources from awesome-seml"
Installation instructions →
What's inside
Model Training
- 10 Best Practices for Deep Learning
- Apples-to-apples in cross-validation studies: pitfalls in classifier performance measurement
- Fairness On The Ground: Applying Algorithmic Fairness Approaches To Production Systems
- How do you manage your Machine Learning Experiments?
- Machine Learning Testing: Survey, Landscapes and Horizons
- Nitpicking Machine Learning Technical Debt
Governance
- A Human-Centered Interpretability Framework Based on Weight of Evidence
- An Architectural Risk Analysis Of Machine Learning Systems
- Beyond Debiasing
- Closing the AI Accountability Gap: Defining an End-to-End Framework for Internal Algorithmic Auditing
- Inherent trade-offs in the fair determination of risk scores
- Responsible AI practices
Broad Overviews
- AI Engineering: 11 Foundational Practices
- Best Practices for Machine Learning Applications
- Engineering Best Practices for Machine Learning
- Hidden Technical Debt in Machine Learning Systems
- Rules of Machine Learning: Best Practices for ML Engineering
- Software Engineering for Machine Learning: A Case Study
Tooling
- Aim
Aim is an open source experiment tracking tool.
- Airflow
Programmatically author, schedule and monitor workflows.
- Alibi Detect
Python library focused on outlier, adversarial and drift detection.
- Archai
Neural architecture search.
- Data Version Control (DVC)
DVC is a data and ML experiments management tool.
- Facets Overview / Facets Dive
Robust visualizations to aid in understanding machine learning datasets.
Data Management
- A Survey on Data Collection for Machine Learning: A Big Data - AI Integration Perspective (2019)
- Automating Large-Scale Data Quality Verification
- Data management challenges in production machine learning
- Data Validation for Machine Learning
- How to organize data labelling for ML
- The curse of big data labeling and three ways to solve it
Deployment and Operation
- Best Practices in Machine Learning Infrastructure
- Building Continuous Integration Services for Machine Learning
- Continuous Delivery for Machine Learning
- Continuous Training for Production ML in the TensorFlow Extended (TFX) Platform
- Fairness Indicators: Scalable Infrastructure for Fair ML Systems
- Machine Learning Logistics
Showing a sample of 84 resources. View the full list on GitHub →