A curated list of awesome pipeline toolkits inspired by Awesome Sysadmin

GitHub Stars: 6.6k
Curated Resources: 232
Categories: 12
Last Refreshed: 3 hours ago
  • Pipeline frameworks & libraries
  • Workflow platforms
  • Workflow languages
  • Workflow standardization initiatives
  • ETL & Data orchestration
  • Literate programming (aka interactive notebooks)
  • Extract, transform, load (ETL)
  • Continuous Delivery workflows
  • Build automation tools
  • Automated workflow composition
  • Other projects
  • Related lists

Use this list with your AI agent

Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:

"Show me pipeline frameworks & libraries resources from awesome-pipeline"

Installation instructions →

What's inside

Pipeline frameworks & libraries

  • ActionChain

    A workflow system for simple linear success/failure workflows.

  • Adage

    Small package to describe workflows that are not completely known at definition time.

  • AiiDA

    Workflow manager with a strong focus on provenance, performance, and extensibility.

  • Airflow

    Python-based workflow system created by Airbnb.

  • Anduril

    Component-based workflow framework for scientific data analysis.

  • Antha

    High-level language for biology.

Workflow platforms

  • ActivePapers

    Computational science made reproducible and publishable.

  • Active Workflow

    Polyglot workflows without leaving the comfort of your technology stack.

  • Anvi’o

    A community and framework centered around metagenomics, designed to facilitate reproducible exploration and visualization of data.

  • Apache Airavata

    Framework for executing and managing computational workflows on distributed computing resources.

  • Arteria

    Event-driven automation for sequencing centers. Initiates workflows based on events.

  • Arvados

    A container-based workflow platform.

Automated workflow composition

  • APE

    A tool for the automated exploration of possible computational workflows based on semantic annotations.

Continuous Delivery workflows

  • Argo

    Get stuff done with container-native workflows for Kubernetes.

  • CDS

    A pipeline-based Continuous Delivery Service written in Golang.

Related lists

Build automation tools

  • Bazel

    Build software just as engineers do at Google.

  • doit

    Highly generalized task-management and automation in Python.

  • Gradle

    Unified cross-platform builds.

  • Just

    Command and recipe runner similar to Make, built in Rust.

  • Make

    The GNU Make build system.

  • Prodmodel

    Build system for data science pipelines.

Literate programming (aka interactive notebooks)

Extract, transform, load (ETL)

  • Bruin

    Data pipeline framework supporting SQL and Python in the same DAG. Built-in data quality assertions, cross-database lineage, and incremental processing. Targets data warehouses (BigQuery, Snowflake, Postgres, etc.).

  • Cadence

  • Dataform

    Dataform is a framework for managing SQL-based operations in your data warehouse.

  • DataRaven

    Managed cloud object storage transfers for ingestion workflows.

  • Hevo

    Hevo is a fully automated, no-code data pipeline platform that supports 150+ ready-to-use integrations across databases, SaaS applications, cloud storage, SDKs, and streaming services.

  • Kiba ETL

    A data processing & ETL framework for Ruby.

Showing a sample of 232 resources. View the full list on GitHub →