# Official Runs and History
Spotlight DQ supports persisting DQ and reconciliation metrics across multiple dbt runs, building a time series you can use for trend analysis and run-to-run comparison.
## How it works
By default (`dq_persist_results: false`), only the current-run summary tables are built, and the history models emit zero rows.

When you set `dq_persist_results: true`, the package appends (upserts) current-run results into three incremental history tables:
| History table | Source |
|---|---|
| `reconciliation_history__monthly_summary` | `reconciliation__monthly_summary` |
| `dq_history__input_layer_field_summary` | `dq_results__input_layer_field_summary` |
| `dq_history__terminology_field_summary` | `dq_results__terminology_field_summary` |
Each row is stamped with `run_id` and `run_ts` so you can distinguish results from different runs.
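For scheduled pipelines, persistence can also be switched on in `dbt_project.yml` rather than on the command line. A minimal sketch, using the variables documented below:

```yaml
# dbt_project.yml (sketch): enable history persistence for all runs
vars:
  dq_persist_results: true
  dq_run_id: "official_2026_03"  # optional; defaults to dbt's invocation_id
```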
## Run metadata
| Variable | Default | Purpose |
|---|---|---|
| `dq_persist_results` | `false` | Enable history persistence |
| `dq_run_id` | dbt `invocation_id` | Identifier stamped on each row |
| `tuva_last_run` | `run_started_at` | Timestamp stamped on each row |
For scheduled production runs, use a meaningful `dq_run_id` so history is easy to query:
```shell
dbt run -s tag:spotlight --vars '{dq_persist_results: true, dq_run_id: "official_2026_03"}'
```
## Upsert behavior
History tables use adapter-aware incremental strategies:
| Adapter | Strategy |
|---|---|
| Snowflake, BigQuery, Databricks, Redshift, Spark | merge |
| Fabric, others | delete+insert |
If you re-run with the same `dq_run_id`, existing rows for that run are updated rather than duplicated. This makes reruns safe.
Schema evolution is handled with `on_schema_change: append_new_columns`. If you add new DQ checks in a package upgrade, the new columns appear in history without breaking existing data.
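Taken together, the upsert behavior can be pictured as a standard dbt incremental model. The following is an illustrative sketch only, not the package's actual source; the `unique_key` grain and select list are assumptions for illustration:

```sql
-- Sketch of the history-model incremental pattern (column grain is assumed)
{{
    config(
        materialized='incremental',
        incremental_strategy='merge',          -- delete+insert on Fabric and others
        unique_key=['run_id', 'data_source', 'model_name', 'field_name'],
        on_schema_change='append_new_columns'  -- new DQ columns extend history safely
    )
}}

select
    '{{ var("dq_run_id", invocation_id) }}' as run_id,
    s.*
from {{ ref('dq_results__terminology_field_summary') }} as s
```

Because `unique_key` includes `run_id`, a rerun with the same `dq_run_id` merges over the prior rows instead of inserting duplicates.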
## Latest vs previous run views
Each history table has two companion views:
- `*__latest_official_run`: rows from the most recent `run_ts`/`run_id`
- `*__previous_official_run`: rows from the second most recent `run_ts`/`run_id`
These let you compare consecutive runs without writing window functions:
```sql
-- Which terminology fields degraded between the last two runs?
select
    l.model_name,
    l.field_name,
    p.valid_rows_applicable_pct as previous_pct,
    l.valid_rows_applicable_pct as latest_pct,
    l.valid_rows_applicable_pct - p.valid_rows_applicable_pct as delta
from spotlight.dq_history__terminology_field_summary__latest_official_run l
join spotlight.dq_history__terminology_field_summary__previous_official_run p
    on l.data_source = p.data_source
    and l.payer = p.payer
    and l.plan = p.plan
    and l.model_name = p.model_name
    and l.field_name = p.field_name
where l.valid_rows_applicable_pct < p.valid_rows_applicable_pct
order by delta asc;
```
## Typical workflow
- **Dev runs:** leave `dq_persist_results` at `false`. Iterate on upstream data and models without polluting history.
- **Official runs:** set `dq_persist_results: true` with a meaningful `dq_run_id`. Run on a schedule (weekly, monthly, or after each data refresh).
- **Monitoring:** query the latest/previous views or the full history table to track trends, set alerts on degradation, and report to stakeholders.
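For the monitoring step, the full history table supports simple trend queries without the paired views. A sketch, using the columns described above; the `model_name` and `field_name` values are hypothetical placeholders:

```sql
-- Trend of terminology validity for one field across all official runs
select
    run_id,
    run_ts,
    valid_rows_applicable_pct
from spotlight.dq_history__terminology_field_summary
where model_name = 'medical_claim'          -- hypothetical model
  and field_name = 'place_of_service_code'  -- hypothetical field
order by run_ts;
```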
## Analysis queries
The package includes analysis SQL files in the `analyses/` directory for common diagnostic scenarios:
- `dq_terminology_latest_vs_previous_deltas.sql`: compares terminology validation metrics between the latest and previous official runs, showing row counts and percentage deltas.
- `reconciliation_enrollment_diagnostics_*.sql`: a series of queries for investigating enrollment-to-claims mismatches, including monthly dropoff analysis, mismatch reason bucketing, and unmatched member investigation.
Run these with `dbt compile` to generate executable SQL (written under `target/compiled/`), then run the compiled output directly in your warehouse.