awesome-dlab
github.com/dlab-berkeley/awesome-dlab
😎 Awesome lists about all kinds of topics and tools interesting to D-Labbers
Datasets
- Awesome Public Data
A list of topic-centric public data sources collected and tidied from blogs, answers, and user responses.
- California COVID Assessment Tool
A Shiny application, usable with any US state, for assessing the many models available for understanding COVID-19 transmission and spread. It brings together several publicly available data sources and can be supplemented with your own data to improve the assessment.
- Case.Law
all official, book-published United States case law — every volume designated as an official report of decisions by a court within the United States.
- DEA Pain Pills Database
The Washington Post published a significant portion of a database that tracks the path of every opioid pain pill, from manufacturer to pharmacy, in the United States between 2006 and 2012.
- tidyethnicnews
R package for turning one of the largest databases on ethnic newspapers and magazines (Ethnic NewsWatch) into a tidyverse-ready dataframe. The package takes 0.0005 seconds to turn 100 newspaper articles into a tidy dataframe.
- tidytweetjson
R package for turning tweet JSON files into a tidyverse-ready dataframe. The package takes 18 minutes to turn 1 million tweets into a dataframe.
Python
- Awesome Python
more awesomeness related to this topic.
R
- Awesome R
more awesomeness related to this topic.
- makereproducible
- rio: A Swiss-Army Knife for Data I/O
Import, Export, and Convert Data Files including web-based import, reading compressed files directly without explicit decompression, and 'convert()' function for converting between file types.
Databases
- Awesome SQL
more awesomeness related to this topic.
- binder-postgres
Demo of launching a binderhub notebook server with a free running Postgres server.
- fuzzy string matching with Postgresql
examples of different ways to match strings using PostgreSQL and extensions.
- SQLite
A completely embedded, full-featured relational database in a few 100k that you can include right into your project.
- sqlitebiter
a CLI tool to convert CSV / Excel / HTML / JSON / and many other formats to a SQLite database file.
- SQL Join Types Explained in Visuals
Simple, useful visual explanation of joins in SQL.
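As a quick illustration of the join types and the embedded SQLite database listed above, here is a minimal sketch using Python's standard-library sqlite3 module. The table names, column names, and rows are made up for the example and do not come from any of the linked resources.

```python
import sqlite3

# In-memory database; tables, columns, and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE papers (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO papers VALUES (1, 1, 'Notes');
""")

# INNER JOIN keeps only authors that have at least one matching paper.
inner_rows = conn.execute(
    "SELECT a.name, p.title FROM authors a "
    "INNER JOIN papers p ON p.author_id = a.id ORDER BY a.id"
).fetchall()

# LEFT JOIN keeps every author; a missing paper comes back as NULL (None).
left_rows = conn.execute(
    "SELECT a.name, p.title FROM authors a "
    "LEFT JOIN papers p ON p.author_id = a.id ORDER BY a.id"
).fetchall()

print(inner_rows)
print(left_rows)
conn.close()
```

The inner join drops the author with no paper, while the left join keeps that author with an empty title, which is the difference the visual guide depicts.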
Systems Administration
- Awesome Sysadmin
more awesomeness related to this topic.
- Ops School
Comprehensive program that will help you learn to be an operations engineer.
Cloud Computing
- Binder
Turn a Git repo into a collection of interactive notebooks. A great tool for teaching workshops.
Rosetta Stones
- Data Science Rosetta Stone
A Tutorial of and Translation between Data Science Programming Languages
- Rosetta
Python, R, Stata Rosetta Stone. Projects implemented in each language side-by-side.
- Stata to Pandas Cross-Walk
Bash
- jid
JSON Incremental Digger to drill down interactively by using filtering queries like jq.
- jq
jq is a lightweight and flexible command-line JSON processor.
- miller
With Miller, you get to use named fields without needing to count positional indices, using familiar formats such as CSV, TSV, JSON, and positionally-indexed.
- q
Run SQL directly on CSV or TSV files.
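The idea of running SQL directly on CSV data can be sketched in plain Python with the standard-library csv and sqlite3 modules. This is not how q is implemented; it is only an illustration of the concept. The helper name query_csv, the table name data, and the sample rows are all assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content; in practice this would be read from a file.
CSV_TEXT = "item,qty\napple,3\npear,12\nplum,7\n"


def query_csv(csv_text, sql, table="data"):
    """Load CSV text into an in-memory SQLite table and run one SQL query."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # first row supplies the column names
    conn = sqlite3.connect(":memory:")
    try:
        columns = ", ".join(f'"{name}"' for name in header)
        placeholders = ", ".join("?" for _ in header)
        conn.execute(f'CREATE TABLE "{table}" ({columns})')
        # Remaining rows are inserted as text values.
        conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


# Values are stored as text, so cast before comparing numerically.
rows = query_csv(
    CSV_TEXT,
    "SELECT item FROM data WHERE CAST(qty AS INTEGER) > 5 ORDER BY item",
)
print(rows)
```

Loading into an in-memory table keeps the sketch short. A dedicated tool such as q or sqlitebiter is the better choice for real files.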