TL;DR
ELT (Extract, Load, Transform) is the modern data-pipeline pattern where raw data is extracted from sources, loaded into the warehouse first, and then transformed inside the warehouse using SQL-based tools (typically dbt). ELT became dominant in the mid-2010s when cloud warehouses (Snowflake, BigQuery, Redshift) made warehouse-side transformation cheaper and easier than staging-environment transformation. ELT preserves raw data for reprocessing and centralises transformation logic in version-controlled SQL.
What is ELT?
ELT inverts the original ETL pattern: instead of transforming data in a staging environment before loading, ELT loads raw data into the warehouse first and runs transformations there using warehouse-native compute.
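As a minimal sketch of the inversion, assuming Snowflake-flavoured SQL and hypothetical stage, table, and column names, the load step lands extracted rows untouched in a raw schema and the transform is ordinary SQL run afterwards on warehouse compute:

```sql
-- Load: land extracted rows as-is in a raw schema. Snowflake-style COPY shown
-- for illustration; raw.orders is assumed to be a table with a single VARIANT
-- column (payload), and @extract_stage a stage the EL tool writes JSON to.
copy into raw.orders
from @extract_stage/orders/
file_format = (type = 'json');

-- Transform: later, inside the warehouse, plain SQL reshapes the raw rows.
-- No external staging environment is involved.
create or replace view analytics.stg_orders as
select
    payload:id::number              as order_id,
    payload:customer_id::number     as customer_id,
    payload:amount::numeric(12, 2)  as order_amount,
    payload:created_at::timestamp   as ordered_at
from raw.orders;
```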
The pattern became dominant when three changes converged: cloud warehouses made compute cheap and elastic; warehouse-native transformation tools (dbt) became mature; and the value of preserving raw data for reprocessing became widely recognised.
Why ELT won
- Warehouse compute became cheap and elastic. Modern warehouses scale compute on demand; running heavy transforms inside them no longer carries the cost penalty it did in the on-premise era.
- Raw data preservation enables reprocessing. When requirements change or bugs are discovered, ELT pipelines can re-run transforms against preserved raw data. ETL pipelines often require re-extracting from source — sometimes impossible if source systems have changed.
- SQL is more accessible than ETL DSLs. Analytics-engineering work in dbt (SQL + Jinja templating + tests) is accessible to a wider talent pool than work in the proprietary DSLs of legacy ETL tools.
- Version control and CI/CD on transformations. dbt projects live in git, support automated testing, and integrate with CI/CD — closing the gap between data engineering and software engineering practice.
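To make the last point concrete, here is roughly what one transform looks like as a version-controlled dbt model (a sketch; the source, model, and column names are hypothetical):

```sql
-- models/staging/stg_orders.sql: a dbt model is just a SELECT in a text
-- file; dbt compiles it and materialises the result inside the warehouse.
{{ config(materialized='view') }}

select
    id                              as order_id,
    customer_id,
    cast(amount as numeric(12, 2))  as order_amount,
    lower(status)                   as order_status,
    created_at                      as ordered_at
from {{ source('shop', 'orders') }}
```

Because the model is a plain text file, it lives in git, goes through code review, and can be built and tested in CI (dbt build) before it reaches production.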
Modern ELT stack
| Stage | Common tools (2025) |
|---|---|
| Extract + Load | Fivetran, Airbyte, Stitch, Segment, custom Singer-based pipelines |
| Warehouse / Lakehouse | Snowflake, BigQuery, Databricks, Redshift, ClickHouse |
| Transform | dbt (overwhelmingly dominant), SQLMesh, Coalesce |
| Orchestration | Airflow, Dagster, Prefect |
| Reverse-ETL | Hightouch, Census |
| Semantic / metric layer | dbt Semantic Layer (MetricFlow), Cube, LookML |
Common ELT patterns
Modern ELT projects typically organise transformations in layers; a dbt sketch of how the layers chain together follows the list:
- Source / staging layer: light cleaning of raw extracted data (rename columns, cast types, deduplicate)
- Intermediate layer: business logic, joins, derived attributes — typically the bulk of the transformation work
- Mart layer: dimensionally modeled tables (facts + dimensions) consumed by BI and downstream tools
- Metric / semantic layer: centralised metric definitions exposed via headless BI to consumers
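A sketch of how the first three layers chain together via dbt's ref() function (model and column names hypothetical):

```sql
-- models/intermediate/int_customer_orders.sql (intermediate layer):
-- business logic joining staging models.
select
    c.customer_id,
    count(o.order_id)    as lifetime_orders,
    sum(o.order_amount)  as lifetime_revenue
from {{ ref('stg_customers') }} as c
left join {{ ref('stg_orders') }} as o
    on o.customer_id = c.customer_id
group by c.customer_id
```

```sql
-- models/marts/fct_customer_revenue.sql (mart layer): the table BI reads.
select
    customer_id,
    lifetime_orders,
    lifetime_revenue
from {{ ref('int_customer_orders') }}
```

ref() is also how dbt derives the dependency graph, so staging models build before intermediate models and intermediate models before marts.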
Common pitfalls
1. Skipping the staging layer. Loading raw data and immediately joining and transforming produces brittle pipelines. The staging layer is cheap and pays off in maintenance.
2. Putting business logic in BI tools. When metric calculations live in Looker or Tableau rather than dbt models, the business logic is invisible to other consumers and tightly coupled to a specific tool.
3. No tests on transforms. Modern ELT tools (dbt) make data-quality testing easy — uniqueness, not-null, accepted-values, custom assertions. Skipping tests is the most common cause of silent pipeline breakage.
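dbt's generic tests (unique, not_null, accepted_values) are declared in YAML properties files; as a SQL illustration of a custom assertion, a dbt "singular" test is simply a query that selects the rows violating the rule and passes only when nothing comes back (model and column names hypothetical):

```sql
-- tests/assert_no_negative_order_amounts.sql: a dbt singular test.
-- dbt runs the query; any row it returns counts as a failure.
select
    order_id,
    order_amount
from {{ ref('stg_orders') }}
where order_amount < 0
```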
Related concepts
ETL is the predecessor pattern. Reverse ETL is the complementary pattern that syncs warehouse data back into operational systems. CDC (change data capture) is the change-aware extraction technique. Dimensional modeling is the discipline applied in the mart layer.
Frequently asked questions
Why is ELT better than ETL?
For typical cloud analytical workloads, ELT is cheaper (warehouse compute is elastic), more flexible (raw data is preserved for reprocessing), more accessible (SQL-based dbt workflows), and better integrated with software-engineering practice (git, CI/CD, testing). For some specific use cases (real-time processing, data residency, legacy environments) ETL remains the right choice.
What's the role of dbt in ELT?
dbt is the dominant transformation layer in modern ELT stacks. It compiles SQL with Jinja templating into warehouse-executable transforms, supports testing and documentation, and integrates with version control and CI/CD. Most modern ELT projects use dbt for the 'T' step.
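As a rough illustration of that compilation step (relation names hypothetical; exact DDL varies by adapter and materialisation), dbt resolves ref() to the fully qualified warehouse relation and wraps the model's SELECT in the DDL its materialisation implies:

```sql
-- A model body like:  select ... from {{ ref('int_customer_orders') }}
-- compiles, for a table materialisation, to roughly:
create or replace table analytics.fct_customer_revenue as
select
    customer_id,
    lifetime_orders,
    lifetime_revenue
from analytics.int_customer_orders;
```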
Should ELT preserve raw data forever?
Yes — usually. Storage is cheap; reprocessing requirements are unpredictable. The cost of preserving raw data is typically far less than the cost of being unable to reprocess when requirements change. Most modern ELT projects preserve raw data indefinitely or with very long retention.
Fairview is an operating intelligence platform that consumes from dbt-modeled ELT pipelines directly — preserving the analytics-engineering discipline teams have built rather than recreating transformation logic in a Fairview-specific layer. Start your free trial →
Siddharth Gangal is the founder of Fairview. He built the dbt-aware ingestion layer after watching companies build careful ELT models only to have new operating tools force them to expose raw extracted data anyway — bypassing the entire transformation layer they had spent quarters perfecting.