Official Runs and History

Spotlight DQ supports persisting DQ and reconciliation metrics across multiple dbt runs, building a time series you can use for trend analysis and run-to-run comparison.

How it works

By default (dq_persist_results: false), only current-run summary tables are built. History models emit zero rows.

When you set dq_persist_results: true, the package upserts current-run results into three incremental history tables:

History table                            Source
reconciliation_history__monthly_summary  reconciliation__monthly_summary
dq_history__input_layer_field_summary    dq_results__input_layer_field_summary
dq_history__terminology_field_summary    dq_results__terminology_field_summary

Each row is stamped with run_id and run_ts so you can distinguish results from different runs.

Run metadata

Variable            Default            Purpose
dq_persist_results  false              Enable history persistence
dq_run_id           dbt invocation_id  Identifier stamped on each row
tuva_last_run       run_started_at     Timestamp stamped on each row
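If you prefer not to pass these variables on every invocation, they can also be set at the project level. A minimal sketch, assuming standard dbt_project.yml vars syntax (this is illustrative, not the package's shipped configuration):

```yaml
# dbt_project.yml -- illustrative sketch only
vars:
  dq_persist_results: true        # enable history persistence
  dq_run_id: "official_2026_03"   # stamped on each history row
  # tuva_last_run defaults to run_started_at and is usually left unset
```

Command-line --vars still override project-level values, so a one-off rerun can reuse a previous dq_run_id without editing the project file.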

For scheduled production runs, use a meaningful dq_run_id so history is easy to query:

dbt run -s tag:spotlight --vars '{dq_persist_results: true, dq_run_id: "official_2026_03"}'

Upsert behavior

History tables use adapter-aware incremental strategies:

Adapter                                           Strategy
Snowflake, BigQuery, Databricks, Redshift, Spark  merge
Fabric, others                                    delete+insert

If you re-run with the same dq_run_id, existing rows for that run are updated rather than duplicated. This makes reruns safe.

Schema evolution is handled with on_schema_change: append_new_columns. If you add new DQ checks in a package upgrade, new columns appear in history without breaking existing data.
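The behavior described above matches a standard dbt incremental model. As a hedged illustration (the package's actual model config may differ; the unique_key shown is an assumption inferred from the rerun semantics above, and the ref'd source table is the one named in the mapping table earlier):

```sql
-- Illustrative dbt incremental config consistent with the upsert
-- behavior described above; the package's real config may differ.
{{
    config(
        materialized='incremental',
        incremental_strategy='merge',       -- delete+insert on Fabric and others
        unique_key=['run_id', 'model_name', 'field_name'],  -- assumed grain
        on_schema_change='append_new_columns'
    )
}}

select
    '{{ var("dq_run_id", invocation_id) }}' as run_id,
    s.*
from {{ ref('dq_results__terminology_field_summary') }} s
```

With a merge strategy and a stable unique_key, rerunning the same run_id rewrites that run's rows in place, which is why reruns are safe.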

Latest vs previous run views

Each history table has two companion views:

  • *__latest_official_run: rows from the most recent run_ts / run_id
  • *__previous_official_run: rows from the second most recent run_ts / run_id
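Conceptually, the latest view simply filters history to the most recent run. A sketch of the idea (not necessarily the package's actual view definition):

```sql
-- Sketch: rows from the most recent run only
select *
from spotlight.dq_history__terminology_field_summary
where run_ts = (
    select max(run_ts)
    from spotlight.dq_history__terminology_field_summary
)
```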

These let you compare consecutive runs without writing window functions:

-- Which terminology fields degraded between the last two runs?
select
    l.model_name,
    l.field_name,
    p.valid_rows_applicable_pct as previous_pct,
    l.valid_rows_applicable_pct as latest_pct,
    l.valid_rows_applicable_pct - p.valid_rows_applicable_pct as delta
from spotlight.dq_history__terminology_field_summary__latest_official_run l
join spotlight.dq_history__terminology_field_summary__previous_official_run p
    on l.data_source = p.data_source
    and l.payer = p.payer
    and l.plan = p.plan
    and l.model_name = p.model_name
    and l.field_name = p.field_name
where l.valid_rows_applicable_pct < p.valid_rows_applicable_pct
order by delta asc;

Typical workflow

  1. Dev runs: leave dq_persist_results at false. Iterate on upstream data and models without polluting history.
  2. Official runs: set dq_persist_results: true with a meaningful dq_run_id. Run on a schedule (weekly, monthly, or after each data refresh).
  3. Monitoring: query latest/previous views or the full history table to track trends, set alerts on degradation, and report to stakeholders.
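The dev vs. official split above maps to two invocation styles (the tag:spotlight selector is taken from the earlier example; adjust it to your project):

```shell
# Dev run: persistence is off by default, so nothing is appended to history
dbt run -s tag:spotlight

# Official run: persist results under a meaningful run id
dbt run -s tag:spotlight --vars '{dq_persist_results: true, dq_run_id: "official_2026_03"}'
```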

Analysis queries

The package includes analysis SQL files in the analyses/ directory for common diagnostic scenarios:

  • dq_terminology_latest_vs_previous_deltas.sql: compares terminology validation metrics between the latest and previous official runs, showing row counts and percentage deltas.
  • reconciliation_enrollment_diagnostics_*.sql: a series of queries for investigating enrollment-to-claims mismatches, including monthly dropoff analysis, mismatch reason bucketing, and unmatched member investigation.

Run these with dbt compile to generate executable SQL, then execute the compiled output directly in your warehouse.
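For example (the compiled path follows dbt's standard target/compiled layout; the exact project directory name depends on your project):

```shell
# Compile the project; analyses are compiled alongside models
dbt compile

# The executable SQL lands under target/compiled/<project>/analyses/
cat target/compiled/*/analyses/dq_terminology_latest_vs_previous_deltas.sql
```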