awesome-apache-airflow
github.com/jghoman/awesome-apache-airflow
Curated list of resources about Apache Airflow
Airflow Summit 2020 videos
Slide deck presentations and online videos
- Advanced Data Engineering Patterns with Apache Airflow
Video of
- Airflow Breeze - Development and Test Environment for Apache Airflow
Screencast showing how to use Breeze environment by
- Airflow @ Lyft
Talks from
- Apache Airflow in the Cloud: Programmatically orchestrating workloads with Python
Slides from
- Apache Airflow @ Umuzi.org
- Apache Airflow YouTube tutorials
Libraries, Hooks, Utilities
- afctl
A CLI tool that includes everything required to create, manage and deploy Airflow projects faster and more smoothly.
- airflow-code-editor
A plugin for Apache Airflow that lets you edit DAGs in the browser.
- airflow-config
- Airflow Ditto
An extensible framework for transforming an Airflow DAG into another DAG that is flow-isomorphic with the original, so that it can run in different environments (e.g. on different clouds, or even on different container frameworks, such as Apache Spark on YARN vs Kubernetes). Comes with out-of-the-box support for EMR-to-HDInsight DAG transforms.
- Airflow DVC plugin
Plugin for open-source version-control system for data science and Machine Learning pipelines -
- Airflow ECR Plugin
Plugin to refresh AWS ECR login token at regular intervals. This is helpful where DockerOperator needs to pull images hosted on ECR.
Non-English resources
- Airflow
Overview of Airflow: concepts and basic usage, with a use case.
- Airflowはすごいぞ!100行未満で本格的なデータパイプライン
(🇯🇵Japanese)
- Airflow - Automatizando seu fluxo de trabalho
(🇧🇷Portuguese)
- Airflow Documentation-Chinese
(🇨🇳Chinese)
- AirflowのタスクログをS3に保存する方法
(🇯🇵Japanese)
- Gestion de Tâches avec Apache Airflow
Overview of Airflow, basic concepts and how to write and trigger a DAG.
Books, blogs, podcasts, and such
- Airflow 2.0: DAG Authoring Redesigned
Blog post about new ways of writing DAGs in Airflow 2.0.
- Airflow 2.0 Providers
Blog post about providers packages in Airflow 2.0.
- Data Pipelines with Apache Airflow
A Manning book (Early Access September 2019) on Airflow.
- Handling Airflow logs with Kubernetes Executor
A blog post that outlines how to set up remote S3 logging when using KubernetesExecutor, without creating complex infrastructure.
- Maxime Beauchemin
Maxime's blog on Medium, which gives insight into the philosophy behind Apache Airflow.
- Robert Chang
Blog posts about data engineering with Apache Airflow that explain the reasoning behind it and include code examples.
Airflow deployment solutions
- Airflow-Component
Lightweight installer of federated Airflow-Airflow (RabbitMQ) reference architecture on Compute node(s).
- airflow-cookbook
Chef cookbook for deploying Airflow.
- airflow-k8s-executor-on-GKE
A detailed tutorial to get a scalable, low-maintenance Airflow Kubernetes executor environment deployed on
- airflow-on-kubernetes
A guide on all relevant resources, scripts and projects that relate to running Airflow on Kubernetes.
- airflow-pipeline
Airflow Docker container that comes preconfigured for Spark and Hadoop. It can be docker pulled at
- Apache Airflow Multi-Tier Free Deployment on Azure
A free Azure Resource Manager (ARM) template by Bitnami providing a one-click solution for Airflow deployment on Azure for production use-cases.
Best practices, lessons learned and cool use cases
- Airflow Dag Management & Versioning
Efficiently manage the DAG release process by using Git submodules.
- Airflow Dag Python Package Management
Managing Python package dependencies across 100+ DAGs can become painful. It's hard to keep track of which packages are used by which DAG, and hard to clean up during DAG removal or upgrade. Learn how KubernetesPodOperator and DockerOperator can fix this.
- Airflow: Lesser Known Tips, Tricks, and Best Practises
- Airflow Lessons from the Data Engineering Front in Chicago
- Airflow: Why is nothing working? - TL;DR Airflow’s SubDagOperator causes deadlocks
Deep dive into troubleshooting a troublesome Airflow DAG, with good tips on how to diagnose problems.
- Apache Airflow as an External scheduler for distributed systems
Introductions and tutorials
- Airflow Repository Template
A boilerplate repository for developing locally with Airflow, with linting & tests for valid DAGs and plugins. Just clone and run
- Apache Airflow 2.0 Tutorial
This article covers the basic concepts behind Airflow and the problems it solves.
- Apache Airflow Monitoring Metrics
A two-part series by
- Beyond CRON: an introduction to Workflow Management Systems
- Dustin Stansbury
- ETL with Apache Airflow for Data Analysis on Transaction Data
Showing a sample of 192 resources. The full list is on GitHub.