Topic

Data Engineering

17 items across the graph are tagged with Data Engineering.

From the graph · 17

repo
apache/airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

repo
argoproj/argo-workflows

Workflow Engine for Kubernetes

repo
mage-ai/mage-ai

🧙 Build, run, and manage data pipelines for integrating and transforming data.

repo
feast-dev/feast

The Open Source Feature Store for AI/ML

repo
Zipstack/unstract

LLM-Driven Extraction of Unstructured Data — Built for API Deployments & ETL Pipeline Workflows

repo
Eventual-Inc/Daft

High-performance data engine for AI and multimodal workloads. Process images, audio, video, and structured data at any scale

repo
Netflix/maestro

Maestro: Netflix’s Workflow Orchestrator

repo
lakehq/sail

Drop-in Apache Spark replacement written in Rust, unifying batch processing, stream processing, and compute-intensive AI workloads.

repo
apache/hamilton

Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and sca…

repo
quixio/quix-streams

Python Streaming DataFrames for Kafka

repo
semantica-agi/semantica

Semantica 🧠 • Build AI systems that can explain, trace, and justify every decision. Knowledge graphs, context graphs, reasoning engines, provenance, and govern…

repo
kimtth/software-architect-mindmap

🧠 Mindmap of 🗺️ Software Architecture, Software Engineering: An Overview of Software Terminologies and Concepts.

repo
benseverndev-oss/goldenmatch

Zero-config entity resolution & record linkage. The zero-tuning Fellegi-Sunter path beats hand-tuned Splink head-to-head and scales from a CSV to a verified 100…

repo
mloda-ai/mloda

mloda.ai - Open Data Access for AI and ML. Plugin-based. Traceable. Framework-agnostic.

repo
rasinmuhammed/misata

High-performance open-source synthetic data engine. Uses LLMs for schema design and vectorized NumPy for deterministic, scalable generation.

repo
SuperCowPowers/workbench

Workbench: An easy-to-use Python API for creating and deploying AWS SageMaker Models

repo
PatrickWiloak/cloud-data-ai-security-zero-to-hero

Cloud + Data + AI + Security from zero to hero. 122+ certs across 22 providers, 37 plain-English concepts, 15 hands-on builds, cross-cloud + AI service comparis…
