Skip to main content

Quickstart

1. Add package dependencies

In your project packages.yml:

packages:
- git: "https://github.com/illuminatehealth/ih_spotlight_dq.git"
revision: main
- package: tuva-health/the_tuva_project
version: [">=0.15.0", "<1.0.0"]
- package: dbt-labs/dbt_utils
version: [">=1.0.0", "<2.0.0"]

Then install:

dbt deps

2. Ensure upstream models exist

Spotlight DQ reads from Tuva core, input-layer, reference, and terminology models. Your project must already build these:

Core models:

  • core__member_months: enrollment grain (person, year-month)
  • core__medical_claim: standardized medical claims

Input-layer models:

  • input_layer__eligibility
  • input_layer__medical_claim
  • input_layer__pharmacy_claim

Reference data:

  • reference_data__calendar
  • reference_data__ansi_fips_state

Terminology tables (used for validation):

  • terminology__race, terminology__medicare_dual_eligibility
  • terminology__ms_drg, terminology__apr_drg
  • terminology__revenue_center, terminology__hcpcs_level_2
  • terminology__icd_10_cm, terminology__icd_10_pcs
  • terminology__discharge_disposition, terminology__admit_source, terminology__admit_type
  • terminology__bill_type, terminology__place_of_service
  • terminology__provider, terminology__ndc

If you're already running the Tuva Project, these should all exist.

3. Build current-run summaries

dbt run -s tag:spotlight

This builds three summary tables in the spotlight schema:

  • reconciliation__monthly_summary
  • dq_results__input_layer_field_summary
  • dq_results__terminology_field_summary

4. Validate outputs

-- Monthly reconciliation: enrollment vs claims alignment
select data_source, payer, year_month_int, member_months, claims, pmpm_paid
from spotlight.reconciliation__monthly_summary
order by year_month_int;

-- Field completeness: which fields have the most nulls?
select model_name, field_name, non_null_rows_pct
from spotlight.dq_results__input_layer_field_summary
order by non_null_rows_pct asc;

-- Terminology validation: which fields have the most invalid values?
select model_name, field_name, valid_rows_applicable_pct
from spotlight.dq_results__terminology_field_summary
where applicable_rows > 0
order by valid_rows_applicable_pct asc;

5. Persist official-run history

For production runs where you want to track DQ trends over time:

dbt run -s tag:spotlight --vars '{dq_persist_results: true}'

With an explicit run ID and timestamp:

dbt run -s tag:spotlight --vars '{dq_persist_results: true, dq_run_id: "official_2026_03", tuva_last_run: "2026-03-01 00:00:00"}'

See Official Runs and History for how persistence works.

6. Run tests

dbt test -s tag:spotlight

Tests verify:

  • Grain uniqueness on all summary and history tables
  • Percentage columns are between 0 and 100
  • Required fields (run_id, data_source, model_name, field_name) are not null