The data warehouse is the brain. Reverse ETL is the nervous system that lets the brain control the body.
For decades the data warehouse was a one-way destination. Data flowed in from operational systems, analysts queried it, dashboards rendered, and decisions happened in meetings. Around 2019 someone asked: what if we sent the warehouse's insights back into the operational systems automatically? That question created the reverse ETL category.
The use cases
- Lifecycle marketing: send "high-value customer" status from the warehouse to your email tool, your ad platforms, your CS tooling.
- Sales prioritization: push lead scores from the warehouse into Salesforce so reps work the right accounts first.
- Product personalization: feed user segments back into Pendo, Intercom, or your in-app messaging system.
- Finance ops: sync customer health into Stripe metadata, NetSuite contacts, accounts receivable.
Why it was hard before
Each operational tool has its own API, rate limits, and idempotency model. Each warehouse has its own query interface. With N warehouses and M tools, the integration matrix is N×M, and the engineering cost of building and maintaining it in-house was prohibitive. Hightouch and Census productized this pain, and the category was born.
The pattern that works
The warehouse remains canonical. dbt models compute the facts you want operational tools to know. Reverse ETL diffs the model output against the destination's current state and pushes only the changes. Schedule frequency depends on the use case: every 5 minutes for sales prioritization, daily for marketing campaigns.
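A minimal sketch of the diff step described above, in Python. It assumes both sides can be loaded into dictionaries keyed by a shared primary key. The names `diff_rows`, `model_rows`, and `destination_rows` are hypothetical, and in practice `destination_rows` may be a stored snapshot of the last successful sync rather than a live read from the tool.

```python
from typing import Any

Row = dict[str, Any]


def diff_rows(
    model_rows: dict[str, Row],
    destination_rows: dict[str, Row],
) -> dict[str, Any]:
    """Compare warehouse model output with the destination's known state.

    Both inputs map a primary key (hypothetical: an account or user id)
    to the fields being synced. Only the differences are returned.
    """
    # Rows the model has that the destination has never seen.
    to_create = {
        key: row for key, row in model_rows.items()
        if key not in destination_rows
    }
    # Rows present on both sides whose synced fields differ.
    to_update = {
        key: row for key, row in model_rows.items()
        if key in destination_rows and destination_rows[key] != row
    }
    # Rows that dropped out of the model since the last sync.
    to_delete = [key for key in destination_rows if key not in model_rows]
    return {"create": to_create, "update": to_update, "delete": to_delete}


if __name__ == "__main__":
    model = {"a": {"score": 90}, "b": {"score": 40}, "c": {"score": 10}}
    destination = {"a": {"score": 90}, "b": {"score": 35}, "d": {"score": 5}}
    print(diff_rows(model, destination))
```

Unchanged rows (key `a` here) produce no API call at all, which is what keeps a frequent schedule within the destination's rate limits.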
What to watch out for
Operational tools have schemas you do not control. A custom field name change in Salesforce silently breaks the sync. Build alerting for unexpected destination errors. Treat the operational tool's data model as a contract.
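One way to treat the destination's data model as a contract is to check it before every sync run. The sketch below assumes you can fetch the destination's current field names and types through whatever metadata endpoint the tool exposes. The fetch itself is left out, and the field names in `EXPECTED_FIELDS` are hypothetical examples.

```python
# Hypothetical contract: the destination fields this sync writes to,
# and the type each one is expected to have.
EXPECTED_FIELDS = {
    "Lead_Score__c": "double",
    "Customer_Segment__c": "string",
}


def check_contract(expected: dict[str, str], actual: dict[str, str]) -> list[str]:
    """Return a list of human-readable contract violations (empty if none).

    `actual` maps field name to type as currently reported by the
    destination; how it is fetched depends on the tool.
    """
    problems = []
    for name, expected_type in expected.items():
        if name not in actual:
            # A renamed or deleted field shows up here.
            problems.append(f"missing field: {name}")
        elif actual[name] != expected_type:
            problems.append(
                f"type changed: {name} expected {expected_type}, got {actual[name]}"
            )
    return problems


if __name__ == "__main__":
    # Simulates an admin renaming Lead_Score__c in the destination.
    current = {"Lead_Score_v2__c": "double", "Customer_Segment__c": "string"}
    for problem in check_contract(EXPECTED_FIELDS, current):
        print("ALERT:", problem)
```

Failing the run and alerting on a non-empty result turns a silent break into a visible one.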
What you avoid
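The bad alternative is "let each app integrate with each app via Zapier-style point-to-point." That is the integration hairball that reverse ETL replaces. A star topology with the warehouse at the center is better: each tool needs one connection to the warehouse instead of one to every other tool, and every tool receives the same canonical definitions.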
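The difference in scale is simple arithmetic. The sketch below counts connections in the worst case where every pair of apps ends up linked, and compares that with one connection per app in a star. The function names are illustrative only.

```python
def point_to_point_links(n_apps: int) -> int:
    """Worst case: every pair of apps has its own integration."""
    return n_apps * (n_apps - 1) // 2


def star_links(n_apps: int) -> int:
    """Star topology: each app has one connection to the warehouse."""
    return n_apps


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, point_to_point_links(n), star_links(n))
```

With 10 apps that is 45 possible pairwise integrations against 10 warehouse syncs, and the gap widens quadratically as tools are added.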
What we ship
For most client engagements that are already on dbt + Snowflake or dbt + BigQuery, we install Hightouch (managed) or write minimal custom Python jobs (self-hosted). The break-even point for self-hosted versus managed is around five sync destinations.