Quickstart
1. Add package dependencies
In your project packages.yml:
packages:
- git: "https://github.com/illuminatehealth/ih_spotlight_dq.git"
revision: main
- package: tuva-health/the_tuva_project
version: [">=0.15.0", "<1.0.0"]
- package: dbt-labs/dbt_utils
version: [">=1.0.0", "<2.0.0"]
Then install:
dbt deps
2. Ensure upstream models exist
Spotlight DQ reads from Tuva core, input-layer, reference, and terminology models. Your project must already build these:
Core models:
core__member_months: enrollment grain (person, year-month)core__medical_claim: standardized medical claims
Input-layer models:
input_layer__eligibilityinput_layer__medical_claiminput_layer__pharmacy_claim
Reference data:
reference_data__calendarreference_data__ansi_fips_state
Terminology tables (used for validation):
terminology__race,terminology__medicare_dual_eligibilityterminology__ms_drg,terminology__apr_drgterminology__revenue_center,terminology__hcpcs_level_2terminology__icd_10_cm,terminology__icd_10_pcsterminology__discharge_disposition,terminology__admit_source,terminology__admit_typeterminology__bill_type,terminology__place_of_serviceterminology__provider,terminology__ndc
If you're already running the Tuva Project, these should all exist.
3. Build current-run summaries
dbt run -s tag:spotlight
This builds three summary tables in the spotlight schema:
reconciliation__monthly_summarydq_results__input_layer_field_summarydq_results__terminology_field_summary
4. Validate outputs
-- Monthly reconciliation: enrollment vs claims alignment
select data_source, payer, year_month_int, member_months, claims, pmpm_paid
from spotlight.reconciliation__monthly_summary
order by year_month_int;
-- Field completeness: which fields have the most nulls?
select model_name, field_name, non_null_rows_pct
from spotlight.dq_results__input_layer_field_summary
order by non_null_rows_pct asc;
-- Terminology validation: which fields have the most invalid values?
select model_name, field_name, valid_rows_applicable_pct
from spotlight.dq_results__terminology_field_summary
where applicable_rows > 0
order by valid_rows_applicable_pct asc;
5. Persist official-run history
For production runs where you want to track DQ trends over time:
dbt run -s tag:spotlight --vars '{dq_persist_results: true}'
With an explicit run ID and timestamp:
dbt run -s tag:spotlight --vars '{dq_persist_results: true, dq_run_id: "official_2026_03", tuva_last_run: "2026-03-01 00:00:00"}'
See Official Runs and History for how persistence works.
6. Run tests
dbt test -s tag:spotlight
Tests verify:
- Grain uniqueness on all summary and history tables
- Percentage columns are between 0 and 100
- Required fields (
run_id,data_source,model_name,field_name) are not null