A curated list of awesome Data Activation resources, libraries, tools and applications.

GitHub Stars: 3
Curated Resources: 39
Categories: 8
Last Refreshed: 1 month ago
  • ETL (Extract, Transform, Load)
  • Reverse ETL (rETL)
  • Data Warehouses and Lakes
  • Data Integration Patterns
  • Data Governance and Quality
  • Real-time Data Activation
  • Machine Learning for Data Activation
  • Resources

Use this list with your AI agent

Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:

"Show me open source retl tools resources from awesome-data-activation"


What's inside

Reverse ETL (rETL)

  • Airbyte (Open Source rETL Tools)

    While primarily known for ETL, Airbyte also supports reverse ETL workflows for certain destinations.

  • Census (Commercial rETL Platforms)

    A reverse ETL platform that syncs data from your warehouse to your business tools, enabling operational analytics.

  • Grouparoo (Open Source rETL Tools)

    An open-source framework for syncing customer data from your data warehouse to cloud-based tools.

  • Hightouch (Commercial rETL Platforms)

    A reverse ETL platform that syncs data from your data warehouse to business tools without code.

  • Jitsu (Open Source rETL Tools)

    An open-source data integration platform that can handle both ETL and reverse ETL processes.

  • Meltano (Open Source rETL Tools)

    An open-source ELT platform that supports reverse ETL through its extensive plugin ecosystem.

Data Warehouses and Lakes

  • Amazon Redshift (Cloud Data Warehouses)

    A fully managed, petabyte-scale data warehouse service in the cloud, part of the AWS ecosystem.

  • Amazon S3 (Data Lakes)

    Object storage built to store and retrieve any amount of data from anywhere, commonly used as a data lake solution.

  • AWS Lake Formation (Data Lakehouses)

    A service that makes it easy to set up, secure, and manage your data lake.

  • Azure Data Lake Storage (Data Lakes)

    A highly scalable data lake solution for big data analytics, built on Azure Blob Storage.

  • Azure Synapse Analytics (Cloud Data Warehouses)

    An integrated analytics service that brings together data integration, enterprise data warehousing, and big data analytics.

  • Cloudera Data Platform (Data Lakes)

    A hybrid data platform for data engineering, streaming analytics, and data science workloads.

ETL (Extract, Transform, Load)

  • Apache Kafka (Open Source ETL Tools)

    A distributed streaming platform that can be used for building real-time data pipelines and streaming apps.

  • Apache NiFi (Open Source ETL Tools)

    A powerful and scalable system to process and distribute data between disparate systems.

  • AWS Glue (Commercial ETL Platforms)

    A fully managed extract, transform, and load (ETL) service that makes it easy to prepare and load data for analytics.

  • Fivetran (Commercial ETL Platforms)

    A cloud-based data integration platform that enables data engineers to build data pipelines to sync data from various sources to data warehouses.

  • Google Cloud Dataflow (Commercial ETL Platforms)

    A fully managed streaming analytics service that minimizes latency, processing time, and cost through autoscaling and batch processing.

  • Informatica PowerCenter (Commercial ETL Platforms)

    An enterprise-grade data integration platform for complex, high-performance data management.

Resources

Showing a sample of 39 resources. View the full list on GitHub →