awesome-python-data-science
github.com/thomasjpfan/awesome-python-data-science
A curated list of Python libraries used for data science.
What's inside
Deployment
- airflow
Platform to programmatically author, schedule, and monitor workflows; commonly used for ETL.
- evidently
Evidently helps evaluate machine learning models during validation and monitor them in production.
- kubeflow
Machine Learning Toolkit for Kubernetes.
- lore
Lore makes machine learning approachable for Software Engineers and maintainable for Machine Learning Researchers.
- mlflow
Open source platform for the complete machine learning lifecycle.
- onnx
Open Neural Network Exchange.
Feature Extraction
- albumentations (Images and Video)
Fast image augmentation library.
- Augmentor (Images and Video)
Image augmentation library.
- BERT-pytorch (Text/NLP)
PyTorch implementation of Google AI's 2018 BERT.
- BlingFire (Text/NLP)
A lightning fast Finite State machine and REgular expression manipulation library.
- categorical-encoding (General Feature Extraction)
scikit-learn-compatible categorical variable encoders.
- Causality (Time Series)
Causal analysis.
Exploration
- alibi
Algorithms for monitoring and explaining machine learning models.
- alibi-detect
Open source Python library focused on outlier, adversarial and drift detection. The package aims to cover both online and offline detectors for tabular data, text, images and time series.
- cleanlab
Finding label errors in datasets and learning with noisy labels.
- dabl
Data Analysis Baseline Library.
- Dora
Exploratory data analysis.
- dtale
Flask/React client for visualizing pandas data structures.
Deep Learning Tools
Visualization
Misc
- annoy (General Feature Extraction)
Approximate Nearest Neighbors.
- crayon
A language-agnostic interface to TensorBoard.
- faiss
A library for efficient similarity search and clustering of dense vectors.
- fbpca
Fast Randomized PCA/SVD.
- mmh3
MurmurHash3, a set of fast and robust hash functions.
- pipeline
Standard Runtime For Every Real-Time Machine Learning.
Scientific
AutoML
- autokeras
Automated machine learning in Keras.
- auto_ml
Automated machine learning.
- auto-sklearn
Automated machine learning.
- devol
Automated deep neural network design via genetic programming.
- featuretools
Automated feature engineering.
- MLBox
Automated Machine Learning python library.
Showing a sample of 321 resources. View the full list on GitHub.