In-Warehouse Product

EMPI Engine + Workbench

Patient identity breaks quietly, then it breaks trust. EMPI Engine + Workbench gives healthcare data teams transparent matching, analyst-grade stewardship, and auditable golden records without moving PHI outside the warehouse.

Deploy the matching engine, configuration editor, and review workbench inside Snowflake, Databricks, or Microsoft Fabric.

Keep every threshold, rule, and analyst decision inspectable so governance teams can understand exactly how a person_id was created.

Book EMPI walkthrough Read the docs

Product brief

Identity resolution with governed stewardship

EMPI turns identity resolution into governed reconciliation: every match, split, merge, and survivorship choice can be inspected before it shapes downstream measures.

Resolve patient identity where PHI already lives
Give stewards a real review queue, not a spreadsheet
Publish person-level records with auditable decisions

Problem

Identity resolution usually forces a bad tradeoff

Teams often end up choosing between dbt-only pipelines with no usable analyst workflow and SaaS platforms that require PHI to leave their own cloud environment.

Silent fragmentation

One person shows up as multiple patients across eligibility, claims, and clinical sources, which breaks downstream patient-level measures.

Opaque logic

Stakeholders cannot explain why records matched, why they did not, or how to tune the clerical-review band responsibly.

Stewardship bottlenecks

Uncertain candidates pile up with no deliberate work queue, merge controls, or audit trail that holds up under governance review.

Approach

Transparent probabilistic matching with practical governance

EMPI combines a warehouse-native linkage engine with a config editor and review workbench, so one operating model covers matching, stewardship, and downstream publishing.

Configure openly

Blocking rules, thresholds, survivorship, and scoring behavior are visible and tunable instead of buried inside an opaque service.
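As a rough illustration of what a visible blocking rule can look like, here is a minimal Python sketch. It is not EMPI's configuration format or engine code: the field names (source_id, last_name, dob), the rule itself, and the helper names are all assumptions made for this example. It shows the general idea of blocking: only records that share a blocking key are compared, which keeps the candidate set small and the rule easy to read.

```python
from collections import defaultdict
from itertools import combinations


def blocking_key(record):
    # Hypothetical rule: same date of birth and same first letter of last name.
    return (record["dob"], record["last_name"][:1].upper())


def candidate_pairs(records):
    # Group source record ids by blocking key.
    blocks = defaultdict(list)
    for rec in records:
        blocks[blocking_key(rec)].append(rec["source_id"])
    # Only records inside the same block become candidate pairs.
    pairs = []
    for ids in blocks.values():
        pairs.extend(combinations(sorted(ids), 2))
    return sorted(pairs)


# Illustrative records only; not real data.
records = [
    {"source_id": "claims:1", "last_name": "Nguyen", "dob": "1980-02-14"},
    {"source_id": "elig:7", "last_name": "nguyen", "dob": "1980-02-14"},
    {"source_id": "clin:3", "last_name": "Smith", "dob": "1980-02-14"},
]
pairs = candidate_pairs(records)
```

In this sketch the two Nguyen records land in one block and form a single candidate pair, while the Smith record is never compared against them. A looser key produces more candidates and more review work; a tighter key risks missing true matches.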

Run in your warehouse

dbt-native SQL and Python models execute where the source data already lives, under your IAM and network controls.

Resolve with context

Analysts can split, merge, or ignore candidate matches and review golden records through a real workbench, with decisions flowing back into the next run.
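One way to picture decisions flowing back into the next run is the sketch below. It is an assumption-laden illustration, not the product's implementation: the function name, the decision labels, and the choice of the smallest member id as the cluster label are all hypothetical. Steward decisions are applied on top of the automatic links, then records are grouped with a simple union-find.

```python
def resolve_clusters(record_ids, auto_matches, steward_decisions):
    """Group records into clusters, honoring steward overrides.

    steward_decisions maps frozenset({id_a, id_b}) to "merge", "split",
    or "ignore" (hypothetical labels for this sketch).
    """
    parent = {rid: rid for rid in record_ids}

    def find(x):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # The smaller id stays root, so the cluster label is stable.
            parent[max(ra, rb)] = min(ra, rb)

    links = {frozenset(pair) for pair in auto_matches}
    for pair, decision in steward_decisions.items():
        if decision == "merge":
            links.add(pair)
        elif decision == "split":
            # Removes the direct link only; a transitive path through
            # other records could still join the pair in this sketch.
            links.discard(pair)
        # "ignore" leaves the automatic result untouched.

    for pair in links:
        a, b = tuple(pair)
        union(a, b)
    return {rid: find(rid) for rid in record_ids}


clusters = resolve_clusters(
    record_ids=["r1", "r2", "r3", "r4"],
    auto_matches=[("r1", "r2"), ("r3", "r4")],
    steward_decisions={
        frozenset({"r1", "r2"}): "split",
        frozenset({"r2", "r3"}): "merge",
    },
)
```

Here the steward split separates r1 from r2, and the steward merge joins r2 to the r3/r4 cluster. A production design would also need to enforce splits against transitive links, which this sketch deliberately leaves out.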

Capabilities

What your team gets on day one

Matching and configuration

A guided configuration layer keeps blocking strategy, thresholds, and survivorship logic readable and versionable.

Explainable Fellegi-Sunter score behavior
Adapter-aware dbt execution patterns
Deterministic person_id outputs and survivorship-aligned golden records
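To make explainable Fellegi-Sunter scoring concrete, the following Python sketch computes standard per-field match weights and places the total in a match, clerical-review, or non-match band. The m and u probabilities, the two thresholds, and the field names are invented for illustration; they are not EMPI defaults, and real values would be estimated and tuned against actual candidate quality.

```python
import math

# Hypothetical parameters: m = P(field agrees | same person),
# u = P(field agrees | different people).
FIELD_PARAMS = {
    "last_name": {"m": 0.95, "u": 0.01},
    "dob": {"m": 0.98, "u": 0.002},
    "zip": {"m": 0.90, "u": 0.05},
}

# Hypothetical thresholds bounding the clerical-review band.
UPPER_THRESHOLD = 8.0
LOWER_THRESHOLD = 0.0


def field_weights(agreements):
    # Agreement weight: log2(m / u). Disagreement weight: log2((1 - m) / (1 - u)).
    weights = {}
    for field, agrees in agreements.items():
        m = FIELD_PARAMS[field]["m"]
        u = FIELD_PARAMS[field]["u"]
        if agrees:
            weights[field] = math.log2(m / u)
        else:
            weights[field] = math.log2((1 - m) / (1 - u))
    return weights


def classify(total_weight):
    if total_weight >= UPPER_THRESHOLD:
        return "match"
    if total_weight < LOWER_THRESHOLD:
        return "non_match"
    return "clerical_review"


# Example pair: name and zip agree, date of birth does not.
weights = field_weights({"last_name": True, "dob": False, "zip": True})
decision = classify(sum(weights.values()))
```

Because the total is a sum of per-field weights, the weights dictionary doubles as the explanation: a reviewer can see which fields pushed the pair toward a match and which pulled it away. In this example the disagreeing date of birth drags the total into the clerical-review band.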

Stewardship and governance

The workbench turns ambiguous candidates into an operational queue instead of a spreadsheet exercise.

Split, merge, and ignore workflows
Queue management and prioritization controls
Audit history for every review action

Implementation

A practical rollout from first run to production stewardship

EMPI implementations follow a clear sequence so matching quality, analyst workflow, and downstream reporting impacts are validated before broad adoption.

Baseline and configure

Profile source identity quality, define the matching universe, and align survivorship logic to downstream reporting expectations before running at scale.

Run and calibrate in your warehouse

Stand up the engine, config editor, and review workbench in your cloud environment, then tune thresholds with stewards against real candidate quality.

Operationalize stewardship

Move split, merge, and review operations into day-to-day workflows with auditable controls so identity quality improves continuously after go-live.

Next step

Want to see how EMPI Engine + Workbench fits your warehouse and data contract?

We can talk through your current cloud platform, upstream source contract, and the operational workflow that needs to be supported before we recommend an implementation path.
