
Outputs and Contract

Prediction outputs

These are the tables you'll query after a successful run. All outputs are written to the `ml` schema (or `{tuva_schema_prefix}_ml` if a prefix is configured).

| Table | Description |
| --- | --- |
| `train_model_registry` | Train/reuse status, artifact URI, diagnostics, and model metadata for each run. |
| `predict_values` | Predicted values by person, anchor month, target definition, and horizon. |
| `predict_probabilities_long` | Row-level probability outputs for both count thresholds (P(Y >= k)) and spend percentiles (P(spend in top k%)). |
| `train_metrics_long` | Train/test evaluation metrics for both absolute (MAE, RMSE, R²) and threshold (AUC, Brier, log loss) families. |

- `target_values` is rendered as plain text in output tables. Single-value targets appear as values like `emergency department`, not JSON arrays.
- `train_model_registry` is the current-run output. Historical trained bundles are tracked separately in `train_model_registry_history` for reuse decisions.

Targets and horizons

| Target measure | Target dimension | Target values | Default horizons (months) | Enabled by default |
| --- | --- | --- | --- | --- |
| `paid_amount` | all | (none) | 6, 12 | Yes |
| `encounter_count` | `encounter_group` | `["inpatient"]` | 12 | Yes |
| `encounter_count` | `encounter_type` | `["emergency department"]` | 12 | Yes |
| `encounter_count` | `encounter_type` | `["inpatient skilled nursing"]` | 12 | Yes |
- `paid_amount` targets are the normalized paid amount per member month over the horizon window.
- `encounter_count` targets are the normalized encounter count per member month.
- Target definitions are fully configurable via `ml_target_policy` using `target_dimension` plus exact `target_values`.
- If one `ml_target_policy` row contains multiple `target_values`, the package expands it into one target per value.
- Any target can optionally be sliced to exact `encounter_group` or `encounter_type` values. For example, `paid_amount` can stay unsliced across all claims or be limited to values like `acute inpatient` or `emergency department`.
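As noted above, one `ml_target_policy` row with multiple `target_values` expands into one target per value. A minimal sketch of that expansion (illustrative only, not the package's code; the field names follow the vars described above, but the exact expansion rule is an assumption):

```python
def expand_target_policy(rows):
    """Expand each ml_target_policy row into one target per target value."""
    expanded = []
    for row in rows:
        # Unsliced targets (empty target_values) keep a single None slice.
        values = row.get("target_values") or [None]
        for value in values:
            expanded.append({
                "target_measure": row["target_measure"],
                "target_dimension": row["target_dimension"],
                "target_value": value,
            })
    return expanded

policy = [
    {"target_measure": "paid_amount", "target_dimension": "all",
     "target_values": []},
    {"target_measure": "encounter_count", "target_dimension": "encounter_type",
     "target_values": ["emergency department", "inpatient skilled nursing"]},
]

targets = expand_target_policy(policy)  # 3 targets: 1 unsliced + 2 sliced
```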

Count probability outputs default to P(Y >= k) for k = 1, 2, 3, 5.

Spend percentile probability outputs default to P(spend in top k%) for k = 1, 5 on `paid_amount` targets. The percentile cutoffs are derived separately for each `data_source` and each spend target/horizon combination.
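The per-group cutoff derivation can be pictured as follows. This is an illustrative sketch, not the package's implementation; the nearest-rank percentile method and the column names are assumptions:

```python
import math
from collections import defaultdict

def top_k_cutoffs(rows, k_pct):
    """Return the spend value at the top-k% boundary for each data_source."""
    by_source = defaultdict(list)
    for row in rows:
        by_source[row["data_source"]].append(row["paid_amount"])
    cutoffs = {}
    for source, amounts in by_source.items():
        amounts.sort()
        # Nearest-rank (100 - k)th percentile; spend above it is in the top k%.
        rank = math.ceil(len(amounts) * (100 - k_pct) / 100)
        cutoffs[source] = amounts[max(rank - 1, 0)]
    return cutoffs

# 100 members with spend 1..100: the top-5% cutoff is the 95th-percentile value.
rows = [{"data_source": "source_a", "paid_amount": float(i)} for i in range(1, 101)]
cutoffs = top_k_cutoffs(rows, 5)
```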

Required upstream models

The package reads from these Tuva core models:

| Upstream model | Purpose |
| --- | --- |
| `core__member_months` | Anchor population and membership/exposure windows. |
| `core__patient` | Demographic context at anchor month. |
| `core__medical_claim` | Utilization/cost lookbacks and outcome labels. |
| `core__condition` | Condition feature derivation from clinical history. |
| `cms_hcc__int_hcc_hierarchy` | HCC feature derivation for risk context. |

Additional assumptions:

- Member month history exists by `person_id` and `year_month`.
- Medical claims include date and encounter context for lookback/outcome logic.
- HCC inputs may arrive either at person grain or at `person_id` + payer grain depending on Tuva version; the package normalizes both shapes to person-level HCC features.
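The grain normalization in the last bullet can be sketched as follows (illustrative only; the `payer` and `hcc_code` column names are assumptions based on the grain description above):

```python
def normalize_hcc_grain(rows):
    """Collapse HCC rows to person grain: one row per (person_id, hcc_code),
    regardless of whether the source also carries a payer column."""
    seen = set()
    person_level = []
    for row in rows:
        key = (row["person_id"], row["hcc_code"])
        if key not in seen:
            seen.add(key)
            person_level.append({"person_id": row["person_id"],
                                 "hcc_code": row["hcc_code"]})
    return person_level

# The same person/HCC pair reported under two payers collapses to one row.
rows = [
    {"person_id": "p1", "payer": "A", "hcc_code": "19"},
    {"person_id": "p1", "payer": "B", "hcc_code": "19"},
    {"person_id": "p1", "payer": "A", "hcc_code": "85"},
]
person_rows = normalize_hcc_grain(rows)  # 2 person-level HCC rows
```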

Internal models

These models are used internally by the pipeline. You generally won't need to query them directly, but they're useful for debugging. Even when you configure features and targets via vars, these staging models are still built: they materialize your var settings into normalized runtime tables consumed by the downstream SQL and Python models.

Config and contract models:

| Model | Purpose |
| --- | --- |
| `stg_runtime_config` | Runtime key/value settings for train/predict. |
| `stg_feature_policy` | Feature-group enablement policy. |
| `stg_target_policy` | Target policy with normalized target keys and claim-filter definitions. |
| `stg_count_probability_policy` | Count-threshold policy for P(Y >= k). |
| `stg_spend_percentile_probability_policy` | Spend-percentile policy for P(spend in top k%). |
| `int_anchor_population` | `person_id` × `anchor_month` anchor rows and eligibility flags. |
| `int_features_long` | Sparse features at person-anchor-feature-window grain. |
| `int_labels_long` | Labels at person-anchor-target-horizon grain. |
| `int_feature_dictionary` | Feature metadata dictionary. |
| `int_hcc_features_source` | Normalized person-level HCC source used to absorb Tuva version grain differences before feature generation. |
| `train_model_registry_history` | Append-only history of trained bundles used to decide whether future runs can reuse an existing model version. |
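One plausible shape for the reuse decision against that history is sketched below. This is an assumption-heavy illustration: the fingerprinting scheme, matching rule, and column names are not specified by the package docs.

```python
import hashlib
import json

def config_fingerprint(config):
    """Stable hash of a training configuration (assumed matching key)."""
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def find_reusable_bundle(history, config):
    """Return the most recently trained matching bundle, or None to retrain."""
    fingerprint = config_fingerprint(config)
    matches = [h for h in history if h["fingerprint"] == fingerprint]
    return max(matches, key=lambda h: h["trained_at"]) if matches else None

config = {"targets": ["paid_amount"], "horizons": [6, 12]}
history = [
    {"fingerprint": config_fingerprint(config), "trained_at": "2024-01-01"},
    {"fingerprint": config_fingerprint(config), "trained_at": "2024-03-01"},
    {"fingerprint": config_fingerprint({"targets": ["encounter_count"]}),
     "trained_at": "2024-04-01"},
]
bundle = find_reusable_bundle(history, config)  # newest matching bundle
```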

Matrix and sparse coordinate artifacts:

| Model | Purpose |
| --- | --- |
| `int_model_matrix_train` | One row per (`person_id`, `anchor_month`) in the training population, with all features packed as a sparse JSON/VARIANT object. This is the pre-assembled training dataset before numeric matrix conversion. |
| `int_model_matrix_predict` | Same structure as the training matrix, scoped to the configured prediction anchor month. Each row is a member to be scored. |
| `int_model_matrix_train_row_index` | Assigns a sequential integer row index to each training anchor. Required to construct sparse matrix row coordinates for the Python training runtime. |
| `int_model_matrix_predict_row_index` | Same as above, for prediction anchors. |
| `int_model_matrix_train_triplets` | Explodes the JSON feature objects into one record per (anchor, feature, value). These triplets are the intermediate format before mapping to numeric sparse coordinates. |
| `int_model_matrix_predict_triplets` | Same as above, for prediction anchors. |
| `int_feature_index_train` | Maps each feature name to a stable integer column index. Ensures consistent column ordering when assembling the matrix in Python, so the same feature always lands in the same column position during both training and prediction. |
| `int_sparse_coords_train` | Final (row_idx, col_idx, value) tuples consumed by the Python training runtime to assemble a sparse matrix for XGBoost training. |
| `int_sparse_coords_predict` | Same format as the train coordinates, used by the Python runtime to assemble the prediction matrix for inference. |
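To make the coordinate semantics concrete, here is a minimal pure-Python assembly of those (row_idx, col_idx, value) tuples into a matrix. The actual runtime builds a sparse matrix for XGBoost; this dense version is only an illustration of how the coordinates map to cells:

```python
def assemble_dense(coords, n_rows, n_cols):
    """Build a dense row-major matrix from COO triples; absent cells stay 0.0."""
    matrix = [[0.0] * n_cols for _ in range(n_rows)]
    for row_idx, col_idx, value in coords:
        matrix[row_idx][col_idx] = value
    return matrix

# Two anchors (rows) and three features (columns); only non-zero cells stored.
coords = [(0, 0, 1.0), (0, 2, 3.5), (1, 1, 2.0)]
matrix = assemble_dense(coords, n_rows=2, n_cols=3)
# matrix == [[1.0, 0.0, 3.5], [0.0, 2.0, 0.0]]
```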