Data Engineering Directory
AI tools for the modern data stack
Curated by the Data Stack Community
Search across ingestion, processing, orchestration, observability, and governance platforms to assemble an intelligent data pipeline.
50+ copilots across warehouses, ETL, observability, and cataloging.
60 tools shown
| # | What it's good for | AI highlights | Category |
|---|---|---|---|
| 1 | SQL, ML, document processing | AI SQL, doc processing, vector search, NL querying | Warehouse / AI ETL |
| 2 | Developer experience | Code generation, pipeline suggestions, auto-SQL | Warehouse |
| 3 | Analytics and SQL | Natural language to SQL, model-assisted optimizations | Warehouse |
| 4 | ETL, notebooks, SQL | AI SQL, notebook assistants, code generation | Unified analytics |
| 5 | Analytics on DuckDB | AI SQL assistant, insights | Warehouse / Processing |
| 6 | SQL, analytics | Auto SQL explanation and tuning | Warehouse |
| 7 | ETL pipelines, Spark-based jobs | AI code generation, schema inference, test generation | Cloud ETL |
| 8 | Batch + streaming ETL | AI job explanations and pipeline optimization | Cloud ETL |
| 9 | ETL pipeline design | Natural language pipeline generation, code suggestions | Cloud ETL |
| 10 | Traditional ETL + governance | Smart mappings and rule suggestions | ETL |
| 11 | Enterprise ETL | Semantic matching, metadata enrichment | ETL |
| 12 | Cloud ETL UI | AI pipeline generation, smart suggestions | Cloud ETL |
| 13 | No-code ETL | AI mapping plus anomaly detection | No-code ETL |
| 14 | No-code ETL | AI-based transformations and mapping | No-code / ETL |
| 15 | SQL transformations | Model generation, test suggestions | Transformation |
| 16 | SQL + notebooks | AI SQL and insight generation | Transformation |
| 17 | DuckDB analytics | NL to SQL and automated analysis | Transformation |
| 18 | Notebook/SQL BI | Generate SQL and narrative analysis | Transformation |
| 19 | Collaborative notebooks | Write SQL/Python with AI | Transformation |
| 20 | Data modeling | AI model inference and linting | Transformation |
| 21 | Automated ETL/ELT | Generate pipelines from natural language | Transformation |
| 22 | SQL generation | Natural language to SQL via LLMs | Transformation |
| 23 | Ingestion connectors | Generate connectors from plain language | Ingestion |
| 24 | Managed ingestion | AI-mapped schemas and transforms | Ingestion |
| 25 | Open-source ingestion | AI code generation for connectors | Ingestion |
| 26 | Reverse ETL | NL transform logic and AI audience building | Integration |
| 27 | Reverse ETL | AI-driven transformation logic | Integration |
| 28 | Lightweight ingestion | Smart mapping suggestions | Ingestion |
| 29 | Data observability | AI anomaly detection and root-cause | Observability |
| 30 | Data testing | AI-generated data quality rules | Data quality |
| 31 | Data quality | AI rule generation and NL checks | Data quality |
| 32 | Data observability | AI baselines and anomaly detection | Observability |
| 33 | AI-native data quality | Automated drift and completeness checks | Data quality |
| 34 | Data reliability | AI anomaly detection | Observability |
| 35 | Pipeline observability | AI pipeline anomaly detection | Observability |
| 36 | Data monitoring | Lineage-based anomaly predictions | Observability |
| 37 | Cataloging | Auto-tagging and natural language search | Catalog |
| 38 | Governance + catalog | Semantic classification with NL search | Catalog |
| 39 | Active metadata catalog | AI lineage and documentation | Catalog |
| 40 | Open-source catalog | AI tag inference | Catalog |
| 41 | Open-source metadata catalog | Semantic suggestions | Catalog |
| 42 | Usage analytics catalog | AI column naming and documentation | Catalog |
| 43 | Orchestration | AI-generated flows | Orchestration |
| 44 | Orchestration | Code and graph generation | Orchestration |
| 45 | AI-native pipeline tool | Autogenerated ETL pipelines | Orchestration |
| 46 | Workflow automation | Generate DAGs from natural language | Orchestration |
| 47 | Document to structured ETL | AI extraction, OCR, classification | AI extraction |
| 48 | Enterprise extraction | OCR plus entity extraction | AI extraction |
| 49 | Document extraction | Entity extraction and summarization | AI extraction |
| 50 | OCR + structure extraction | NLP entity extraction | AI extraction |
| 51 | Parsing PDFs for RAG | AI structuring and tables | AI extraction |
| 52 | AI-based structuring | Transform unstructured text into structured data | AI extraction |
| 53 | Index building + ETL | Auto-chunking and schema extraction | RAG / ETL |
| 54 | ETL for LLM pipelines | AI extraction, structuring, loaders | RAG / ETL |
| 55 | Turn documents into embeddings | Automatic chunking and metadata ETL | RAG / ETL |
| 56 | Vector ingestion | Managed embedding pipelines | RAG / ETL |
| 57 | Vector DB ingestion | Unstructured to vector pipelines | RAG / ETL |
| 58 | Vector DB ingestion | Auto-schema inference | RAG / ETL |
| 59 | Embedding pipelines | AI-based structuring | RAG / ETL |
| 60 | Vector ingestion | Smart metadata extraction | RAG / ETL |