awesome-datascience
github.com/academic/awesome-datascience
An awesome Data Science repository for learning data science and applying it to real-world problems.
What's inside
Training Resources
- 1000 Data Science Projects (Tutorials)
- 12 free Data Science projects to practice Python and Pandas (Tutorials)
- 365 Data Science Course (MOOC's)
- A 2020 Vision of Linear Algebra (G. Strang) (MOOC's)
- AI Expert Roadmap (Free Courses)
Roadmap to becoming an Artificial Intelligence Expert
- A list of colleges and universities offering degrees in data science (Colleges)
Other Awesome Lists
- 100 NLP Papers
- 40+ Data Analytics Projects Ideas
- AI Dev Jobs
Job board focused on AI/ML engineering roles with 5,400+ listings and a free REST API.
- AI in Data Science: Uses, Roles, and Tools
- awesome-awesomeness
- Awesome Community Detection
Fun
- 250k+ Job Postings (Datasets)
An expanding dataset of historical job postings from Luxembourg from 2020 to today. Free with 250k+ job postings hosted on AWS Data Exchange.
- 5000 Images of Clothes (Datasets)
- Academic Torrents (Datasets)
- A community-curated database of well-known people, places, and things (Datasets)
- A Deep Catalog of Human Genetic Variation (Datasets)
- ADS-B Exchange (Datasets)
Specific datasets for aircraft and Automatic Dependent Surveillance-Broadcast (ADS-B) sources.
Literature and Media
- 8bitconcepts (Journals, Publications and Magazines)
AI industry research and analysis with papers on AI pricing, enterprise adoption, and evaluation frameworks.
- Aditi Rastogi (Bloggers)
ML, DL, Data Science blog
- Advances in Evolutionary Algorithms (Books)
Free Download
- Advances in Genetic Programming, Vol. 3 (Books)
Free Download
- Adventures in Data Land (Bloggers)
- Adversarial Learning (Podcasts)
The Data Science Toolbox
- AdaBoost (Comparison)
- Adaptive resonance theory (Comparison)
- Aerosolve (Miscellaneous Tools)
A machine learning package built for humans.
- AI for Database (Miscellaneous Tools)
Chat with your database in natural language — no SQL needed. Get instant insights, build self-refreshing dashboards, and trigger automated workflows based on database changes.
- Albumentations (Miscellaneous Tools)
A fast, framework-agnostic image augmentation library that implements a diverse set of augmentation techniques. Supports classification, segmentation, and detection out of the box. It has been used to win a number of deep learning competitions on Kaggle and Topcoder, as well as competitions held as part of CVPR workshops.
- altair (Deep Learning Packages)
Agents
- ADK-Rust (Frameworks)
Production-ready AI agent development kit for Rust with model-agnostic design (Gemini, OpenAI, Anthropic), multiple agent types (LLM, Graph, Workflow), MCP support, and built-in telemetry.
- ai-evaluation (Tools)
Open-source LLM and agent evaluation framework with 50+ metrics, LLM-as-Judge augmentation, and guardrail scanners (jailbreak, PII, prompt-injection). Useful for scoring RAG outputs, agent trajectories, and function-calling behavior in data-science workflows.
- Arch Tools (Tools)
61 production-ready AI API tools for data science workflows: code analysis, web scraping, NLP, image generation, crypto data, and search. REST API and MCP protocol support.
- BGPT MCP (Research & Knowledge Retrieval)
MCP server that gives AI agents access to a database of scientific papers built from raw experimental data extracted from full-text studies. Returns 25+ structured fields per paper including methods, results, sample sizes, and quality scores.
- CAJAL (Tools)
Local AI agent for generating publication-ready scientific papers with real arXiv citations, IMRaD structure, and tribunal scoring. Runs 100% offline via Ollama with 4B-9B models. MIT licensed.
- Chunk Tuner (Research & Knowledge Retrieval)
Open-source Python library and MCP server to benchmark document chunking strategies for RAG, score retrieval quality, and recommend configurations for a corpus.
Socialize
- Alexey Grigorev (Twitter Accounts)
Data science author
- Analytics, Data Mining, Predictive Modeling, Artificial Intelligence (Facebook Accounts)
- Analytics Vidhya (Data Science Competitions)
- Berkeley Institute for Data Science (GitHub Groups)
- Big Data Analytics using R (Facebook Accounts)
- Big Data Analytics with R and Hadoop (Facebook Accounts)
What is Data Science?
- a very short history of #datascience
The story of how data scientists became sexy is mostly the story of the coupling of the mature discipline of statistics with a very young one--computer science. The term “Data Science” has emerged only recently to specifically designate a new profession that is expected to make sense of the vast stores of big data. But making sense of data has a long history and has been discussed by scientists, statisticians, librarians, computer scientists and others for years. The following timeline traces the evolution of the term “Data Science” and its use, attempts to define it, and related terms.
Showing a sample of 904 resources. View the full list on GitHub →