
Outputs and Contract

Prediction outputs

These are the tables you'll query after a successful run. All outputs are written to the `ml` schema (or `{tuva_schema_prefix}_ml` if a prefix is configured).

| Table | Description |
| --- | --- |
| `train_model_registry` | Train/reuse status, artifact URI, diagnostics, and model metadata for each run. |
| `predict_values` | Predicted values by person, anchor month, target definition, and horizon. |
| `predict_probabilities_long` | Row-level probability outputs for both count thresholds (P(Y >= k)) and spend percentiles (P(spend in top k%)). |
| `train_metrics_long` | Train/test evaluation metrics for both absolute (MAE, RMSE, R²) and threshold (AUC, Brier, log loss) families. |

- `target_values` is rendered as plain text in output tables. Single-value targets appear as values like `emergency department`, not JSON arrays.
- `train_model_registry` is the current-run output. Historical trained bundles are tracked separately in `train_model_registry_history` for reuse decisions.

Targets and horizons

| Target measure | Target dimension | Target values | Default horizons (months) | Enabled by default |
| --- | --- | --- | --- | --- |
| `paid_amount` | all | (none) | 6, 12 | Yes |
| `encounter_count` | `encounter_group` | `["inpatient"]` | 12 | Yes |
| `encounter_count` | `encounter_type` | `["emergency department"]` | 12 | Yes |
| `encounter_count` | `encounter_type` | `["inpatient skilled nursing"]` | 12 | Yes |
- `paid_amount` targets are the normalized paid amount per member month over the horizon window.
- `encounter_count` targets are the normalized encounter count per member month.
- Target definitions are fully configurable via `ml_target_policy` using `target_dimension` plus exact `target_values`.
- If one `ml_target_policy` row contains multiple `target_values`, the package expands it into one target per value.
- Any target can optionally be sliced to exact `encounter_group` or `encounter_type` values. For example, `paid_amount` can stay unsliced across all claims or be limited to values like `acute inpatient` or `emergency department`.
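As noted above, one `ml_target_policy` row with multiple `target_values` expands into one target per value. A minimal sketch of that expansion (illustrative only, not the package's code; the field names follow the vars described above, but the exact expansion rule is an assumption):

```python
def expand_target_policy(rows):
    """Expand each ml_target_policy row into one target per target value."""
    expanded = []
    for row in rows:
        # Unsliced targets (empty target_values) keep a single None slice.
        values = row.get("target_values") or [None]
        for value in values:
            expanded.append({
                "target_measure": row["target_measure"],
                "target_dimension": row["target_dimension"],
                "target_value": value,
            })
    return expanded

policy = [
    {"target_measure": "paid_amount", "target_dimension": "all",
     "target_values": []},
    {"target_measure": "encounter_count", "target_dimension": "encounter_type",
     "target_values": ["emergency department", "inpatient skilled nursing"]},
]

targets = expand_target_policy(policy)  # 3 targets: 1 unsliced + 2 sliced
```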

Count probability outputs default to P(Y >= k) for k = 1, 2, 3, 5.

Spend percentile probability outputs default to P(spend in top k%) for k = 1, 5 on `paid_amount` targets. The percentile cutoffs are derived separately for each `data_source` and each spend target/horizon combination.
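The per-group cutoff derivation can be pictured as follows. This is an illustrative sketch, not the package's implementation; the nearest-rank percentile method and the column names are assumptions:

```python
import math
from collections import defaultdict

def top_k_cutoffs(rows, k_pct):
    """Return the spend value at the top-k% boundary for each data_source."""
    by_source = defaultdict(list)
    for row in rows:
        by_source[row["data_source"]].append(row["paid_amount"])
    cutoffs = {}
    for source, amounts in by_source.items():
        amounts.sort()
        # Nearest-rank (100 - k)th percentile; spend above it is in the top k%.
        rank = math.ceil(len(amounts) * (100 - k_pct) / 100)
        cutoffs[source] = amounts[max(rank - 1, 0)]
    return cutoffs

# 100 members with spend 1..100: the top-5% cutoff is the 95th-percentile value.
rows = [{"data_source": "source_a", "paid_amount": float(i)} for i in range(1, 101)]
cutoffs = top_k_cutoffs(rows, 5)
```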

Required upstream models

The package reads from these Tuva core models:

| Upstream model | Purpose |
| --- | --- |
| `core__member_months` | Anchor population and membership/exposure windows. |
| `core__patient` | Demographic context at anchor month. |
| `core__medical_claim` | Utilization/cost lookbacks and outcome labels. |
| `core__condition` | Condition feature derivation from clinical history. |
| `cms_hcc__int_hcc_hierarchy` | HCC feature derivation for risk context. |

Additional assumptions:

- Member month history exists by `person_id` and `year_month`.
- Medical claims include date and encounter context for lookback/outcome logic.
- HCC inputs may arrive either at person grain or at `person_id` + payer grain depending on Tuva version; the package normalizes both shapes to person-level HCC features.
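The grain normalization in the last bullet can be sketched as follows (illustrative only; the `payer` and `hcc_code` column names are assumptions based on the grain description above):

```python
def normalize_hcc_grain(rows):
    """Collapse HCC rows to person grain: one row per (person_id, hcc_code),
    regardless of whether the source also carries a payer column."""
    seen = set()
    person_level = []
    for row in rows:
        key = (row["person_id"], row["hcc_code"])
        if key not in seen:
            seen.add(key)
            person_level.append({"person_id": row["person_id"],
                                 "hcc_code": row["hcc_code"]})
    return person_level

# The same person/HCC pair reported under two payers collapses to one row.
rows = [
    {"person_id": "p1", "payer": "A", "hcc_code": "19"},
    {"person_id": "p1", "payer": "B", "hcc_code": "19"},
    {"person_id": "p1", "payer": "A", "hcc_code": "85"},
]
person_rows = normalize_hcc_grain(rows)  # 2 person-level HCC rows
```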

Internal models

These models are used internally by the pipeline. You generally won't need to query them directly, but they're useful for debugging. Even when you configure features and targets via vars, these staging models are still built: they materialize your var settings into normalized runtime tables consumed by the downstream SQL and Python models.

Config and contract models:

| Model | Purpose |
| --- | --- |
| `stg_runtime_config` | Runtime key/value settings for train/predict. |
| `stg_feature_policy` | Feature-group enablement policy. |
| `stg_target_policy` | Target policy with normalized target keys and claim-filter definitions. |
| `stg_count_probability_policy` | Count-threshold policy for P(Y >= k). |
| `stg_spend_percentile_probability_policy` | Spend-percentile policy for P(spend in top k%). |
| `int_anchor_population` | `person_id` × `anchor_month` anchor rows and eligibility flags. |
| `int_features_long` | Sparse features at person-anchor-feature-window grain. |
| `int_labels_long` | Labels at person-anchor-target-horizon grain. |
| `int_feature_dictionary` | Feature metadata dictionary. |
| `int_hcc_features_source` | Normalized person-level HCC source used to absorb Tuva version grain differences before feature generation. |
| `train_model_registry_history` | Append-only history of trained bundles used to decide whether future runs can reuse an existing model version. |
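One plausible shape for the reuse decision against that history is sketched below. This is an assumption-heavy illustration: the fingerprinting scheme, matching rule, and column names are not specified by the package docs.

```python
import hashlib
import json

def config_fingerprint(config):
    """Stable hash of a training configuration (assumed matching key)."""
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def find_reusable_bundle(history, config):
    """Return the most recently trained matching bundle, or None to retrain."""
    fingerprint = config_fingerprint(config)
    matches = [h for h in history if h["fingerprint"] == fingerprint]
    return max(matches, key=lambda h: h["trained_at"]) if matches else None

config = {"targets": ["paid_amount"], "horizons": [6, 12]}
history = [
    {"fingerprint": config_fingerprint(config), "trained_at": "2024-01-01"},
    {"fingerprint": config_fingerprint(config), "trained_at": "2024-03-01"},
    {"fingerprint": config_fingerprint({"targets": ["encounter_count"]}),
     "trained_at": "2024-04-01"},
]
bundle = find_reusable_bundle(history, config)  # newest matching bundle
```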

Matrix and sparse coordinate artifacts:

| Model | Purpose |
| --- | --- |
| `int_model_matrix_train` | One row per (`person_id`, `anchor_month`) in the training population, with all features packed as a sparse JSON/VARIANT object. This is the pre-assembled training dataset before numeric matrix conversion. |
| `int_model_matrix_predict` | Same structure as the training matrix, scoped to the configured prediction anchor month. Each row is a member to be scored. |
| `int_model_matrix_train_row_index` | Assigns a sequential integer row index to each training anchor. Required to construct sparse matrix row coordinates for the Python training runtime. |
| `int_model_matrix_predict_row_index` | Same as above, for prediction anchors. |
| `int_model_matrix_train_triplets` | Explodes the JSON feature objects into one record per (anchor, feature, value). These triplets are the intermediate format before mapping to numeric sparse coordinates. |
| `int_model_matrix_predict_triplets` | Same as above, for prediction anchors. |
| `int_feature_index_train` | Maps each feature name to a stable integer column index. Ensures consistent column ordering when assembling the matrix in Python, so the same feature always lands in the same column position during both training and prediction. |
| `int_sparse_coords_train` | Final (row_idx, col_idx, value) tuples consumed by the Python training runtime to assemble a sparse matrix for XGBoost training. |
| `int_sparse_coords_predict` | Same format as the train coordinates, used by the Python runtime to assemble the prediction matrix for inference. |
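To make the coordinate semantics concrete, here is a minimal pure-Python assembly of those (row_idx, col_idx, value) tuples into a matrix. The actual runtime builds a sparse matrix for XGBoost; this dense version is only an illustration of how the coordinates map to cells:

```python
def assemble_dense(coords, n_rows, n_cols):
    """Build a dense row-major matrix from COO triples; absent cells stay 0.0."""
    matrix = [[0.0] * n_cols for _ in range(n_rows)]
    for row_idx, col_idx, value in coords:
        matrix[row_idx][col_idx] = value
    return matrix

# Two anchors (rows) and three features (columns); only non-zero cells stored.
coords = [(0, 0, 1.0), (0, 2, 3.5), (1, 1, 2.0)]
matrix = assemble_dense(coords, n_rows=2, n_cols=3)
# matrix == [[1.0, 0.0, 3.5], [0.0, 2.0, 0.0]]
```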