ML Models for Bias Prediction
WIFA-UQ supports multiple machine learning models for predicting residual bias after calibration. Each model offers different trade-offs between accuracy, interpretability, and uncertainty quantification.
Overview
The bias predictor learns to predict residual model error as a function of atmospheric features:
residual_bias = f(ABL_height, wind_veer, lapse_rate, turbulence_intensity, ...)
┌─────────────────────────────────────────────────────────────────────────────┐
│ Bias Prediction Pipeline │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Input Features ML Model Output │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ ABL_height │ │ │ │ predicted_bias │ │
│ │ wind_veer │ ──► │ XGB / PCE / │ ──► │ │ │
│ │ lapse_rate │ │ SIR / Linear │ │ (± uncertainty) │ │
│ │ TI │ │ │ │ │ │
│ │ ... │ └─────────────────┘ └─────────────────┘ │
│ └─────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
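Conceptually, the predictor is ordinary supervised regression on the calibration database: atmospheric features in, residual bias out. A minimal standalone sketch of that idea using generic scikit-learn (not the WIFA-UQ API; the arrays below are placeholders):

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Placeholder arrays: one row per calibration case.
# X holds atmospheric features (ABL_height, wind_veer, ...),
# y holds the residual bias left after calibration.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 0.3 * X[:, 0] - 0.1 * X[:, 1] ** 2 + 0.05 * rng.normal(size=200)

model = GradientBoostingRegressor().fit(X, y)   # any regressor works here
predicted_bias = model.predict(X[:5])           # residual bias for new cases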
Model Comparison
| Model | regressor | Interpretability | UQ Method | Best For |
|---|---|---|---|---|
| XGBoost | XGB | SHAP values | Ensemble variance | Best accuracy, feature importance |
| PCE | PCE | Sobol indices | Analytical | Smooth responses, global sensitivity |
| SIR Polynomial | SIRPolynomial | Coefficients | Analytical | Dimension reduction, interpretability |
| Ridge | Linear | Coefficients | Analytical | Baseline, linear relationships |
| Lasso | Linear | Sparse coefficients | Analytical | Feature selection |
| ElasticNet | Linear | Sparse coefficients | Analytical | Mixed regularization |
Quick Start
error_prediction:
  calibrator: MinBiasCalibrator
  regressor: XGB              # Model choice
  regressor_params:
    max_depth: 3
    n_estimators: 100
  features:
    - ABL_height
    - wind_veer
    - lapse_rate
    - turbulence_intensity
XGBoost (XGB)
Gradient boosted decision trees. Generally provides the best predictive accuracy and robust feature importance via SHAP values.
When to Use
- Best accuracy is the priority
- You want SHAP-based feature importance
- Non-linear relationships exist between features and bias
- You have sufficient training data (> 50-100 cases)
Configuration
error_prediction:
  regressor: XGB
  regressor_params:
    max_depth: 3           # Tree depth (default: 3)
    n_estimators: 100      # Number of trees (default: 100)
    learning_rate: 0.1     # Step size shrinkage (default: 0.1)
    min_child_weight: 1    # Minimum sum of instance weight in child
    subsample: 1.0         # Fraction of samples per tree
    colsample_bytree: 1.0  # Fraction of features per tree
    random_state: 42       # Reproducibility
Parameter Tuning Guide
| Parameter | Low Value Effect | High Value Effect | Recommended |
|---|---|---|---|
| max_depth | Underfitting | Overfitting | 2-5 |
| n_estimators | Underfitting | Slower, diminishing returns | 50-200 |
| learning_rate | Needs more trees | Overfitting risk | 0.05-0.3 |
| min_child_weight | More complex trees | More regularization | 1-10 |
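These ranges can be explored with an ordinary cross-validated grid search. A minimal sketch using xgboost and scikit-learn directly, outside the WIFA-UQ configuration (X and y stand in for the feature matrix and residual-bias vector):

import numpy as np
from sklearn.model_selection import GridSearchCV
from xgboost import XGBRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 4))     # placeholder features
y = rng.normal(size=120)          # placeholder residual bias

search = GridSearchCV(
    XGBRegressor(n_estimators=100, random_state=42),
    param_grid={"max_depth": [2, 3, 4], "learning_rate": [0.05, 0.1, 0.3]},
    scoring="neg_root_mean_squared_error",
    cv=5,
)
search.fit(X, y)
print(search.best_params_, -search.best_score_)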
SHAP Feature Importance
XGBoost automatically generates SHAP (SHapley Additive exPlanations) analysis:
Output files:
shap_summary.png # Beeswarm plot of feature impacts
shap_feature_importance.png # Bar chart of mean |SHAP|
Interpreting SHAP plots:
- Beeswarm plot: Each point is a prediction; x-axis shows impact on output; color shows feature value
- Feature importance: Mean absolute SHAP value per feature
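The same plots can be reproduced outside the pipeline with the shap package, given a fitted XGBoost model and the training feature matrix. A minimal sketch on placeholder data (in practice these come from the calibration database):

import numpy as np
import pandas as pd
import shap
from xgboost import XGBRegressor

# Placeholder training data; columns match the configured feature names.
rng = np.random.default_rng(0)
X_train = pd.DataFrame(rng.normal(size=(100, 3)),
                       columns=["ABL_height", "wind_veer", "lapse_rate"])
y_train = rng.normal(size=100)

booster = XGBRegressor(max_depth=3, n_estimators=100).fit(X_train, y_train)

explainer = shap.TreeExplainer(booster)
shap_values = explainer.shap_values(X_train)

shap.summary_plot(shap_values, X_train)                   # beeswarm plot
shap.summary_plot(shap_values, X_train, plot_type="bar")  # mean |SHAP| per feature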
API Usage
from wifa_uq.postprocessing.error_prediction import BiasPredictor
import xarray as xr

database = xr.load_dataset("results_stacked_hh.nc")

predictor = BiasPredictor(
    database,
    regressor_name="XGB",
    regressor_params={"max_depth": 3, "n_estimators": 100},
    feature_names=["ABL_height", "wind_veer", "lapse_rate"]
)

# Fit on the indices of the calibrated training cases
predictor.fit(calibrated_sample_indices)

# Predict bias
predictions = predictor.predict(X_test)

# Get SHAP values
shap_values = predictor.get_shap_values()
Strengths & Limitations
Strengths:
- Best out-of-the-box accuracy
- Handles non-linear relationships
- Robust to outliers
- SHAP provides trustworthy feature importance

Limitations:
- Less interpretable than linear models
- Can overfit with small datasets
- No analytical uncertainty (uses ensemble variance)
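Because gradient boosting has no closed-form predictive variance, the ensemble-variance idea mentioned above amounts to refitting the model with different random seeds and measuring the spread of the predictions. A minimal illustration with xgboost directly, not the WIFA-UQ internals:

import numpy as np
from xgboost import XGBRegressor

rng = np.random.default_rng(0)
X, y = rng.normal(size=(150, 4)), rng.normal(size=150)   # placeholder data
X_new = rng.normal(size=(10, 4))

# Small ensemble: same hyperparameters, different seeds and row subsamples
preds = np.stack([
    XGBRegressor(max_depth=3, n_estimators=100, subsample=0.8,
                 random_state=seed).fit(X, y).predict(X_new)
    for seed in range(10)
])
bias_mean = preds.mean(axis=0)   # point prediction
bias_std = preds.std(axis=0)     # ensemble spread as an uncertainty proxy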
Polynomial Chaos Expansion (PCE)
Represents the bias function as a polynomial expansion in the input features. Provides analytical sensitivity indices (Sobol) and smooth predictions.
When to Use
- You need Sobol sensitivity indices for variance decomposition
- The response is expected to be smooth and polynomial-like
- You want analytical uncertainty quantification
- Interpretability of polynomial coefficients is valuable
Configuration
error_prediction:
  regressor: PCE
  regressor_params:
    degree: 3              # Maximum polynomial degree (default: 3)
    q_norm: 0.75           # Hyperbolic truncation parameter (default: 0.75)
    fit_type: LeastSquares # Fitting method: LeastSquares or LARS
Parameters Explained
| Parameter | Description | Typical Range |
|---|---|---|
| degree | Maximum total polynomial degree | 2-5 |
| q_norm | Controls interaction term truncation (lower = sparser) | 0.5-1.0 |
| fit_type | LeastSquares (dense) or LARS (sparse) | - |
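For reference, the usual hyperbolic (q-norm) truncation keeps only the multi-indices $\alpha$ whose q-norm does not exceed the maximum degree $p$; with q_norm = 1 this is the standard total-degree basis, and smaller values drop high-order interaction terms first:

$$\mathcal{A}_{p,q} = \left\{ \alpha \in \mathbb{N}^d : \left( \sum_{i=1}^{d} \alpha_i^{\,q} \right)^{1/q} \le p \right\}$$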
Polynomial basis:
For features $(x_1, x_2)$ with degree=2:
$$f(x) = c_0 + c_1 x_1 + c_2 x_2 + c_{11} x_1^2 + c_{12} x_1 x_2 + c_{22} x_2^2$$
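A least-squares fit of this kind of expansion can be sketched with plain polynomial features (monomials are used here for illustration; PCE implementations typically use polynomials orthogonal with respect to the input distributions):

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(100, 2))              # features (x1, x2)
y = 1.0 + 0.5 * X[:, 0] - 0.3 * X[:, 0] * X[:, 1]  # placeholder response

# degree=2 builds exactly the basis above: 1, x1, x2, x1^2, x1*x2, x2^2
surrogate = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
surrogate.fit(X, y)
print(surrogate.named_steps["linearregression"].coef_)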
Sobol Sensitivity Indices
PCE automatically computes Sobol indices for global sensitivity analysis:
| Index | Meaning |
|---|---|
| First-order ($S_i$) | Variance from feature $i$ alone |
| Total-order ($S_{Ti}$) | Variance from $i$ including all interactions |
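For an expansion in an orthonormal basis, these indices follow directly from the coefficients: the first-order index of feature $i$ sums the squared coefficients of terms involving only $x_i$ (multi-index set $\mathcal{A}_i$), while the total-order index sums those of every term in which $x_i$ appears, both normalized by the total variance:

$$S_i = \frac{\sum_{\alpha \in \mathcal{A}_i} c_\alpha^2}{\sum_{\alpha \neq 0} c_\alpha^2}, \qquad S_{Ti} = \frac{\sum_{\alpha:\, \alpha_i > 0} c_\alpha^2}{\sum_{\alpha \neq 0} c_\alpha^2}$$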
Output files:
sobol_indices.csv # Numerical values
sobol_barplot.png # Visualization
API Usage
from wifa_uq.postprocessing.error_prediction import BiasPredictor

predictor = BiasPredictor(
    database,
    regressor_name="PCE",
    regressor_params={"degree": 3, "q_norm": 0.75},
    feature_names=["ABL_height", "wind_veer", "lapse_rate"]
)
predictor.fit(calibrated_sample_indices)

# Get Sobol indices
sobol_first, sobol_total = predictor.get_sobol_indices()
print("First-order Sobol indices:")
for feat, val in zip(predictor.feature_names, sobol_first):
    print(f"  {feat}: {val:.3f}")
Strengths & Limitations
Strengths:
- Analytical Sobol indices for sensitivity analysis
- Smooth, continuous predictions
- Well-suited for UQ workflows
- Interpretable polynomial coefficients

Limitations:
- Assumes polynomial response structure
- Can struggle with discontinuities or sharp transitions
- Number of terms grows with degree and features
- Requires careful degree selection
Sliced Inverse Regression Polynomial (SIRPolynomial)
Combines Sliced Inverse Regression (SIR) for dimension reduction with polynomial regression. Finds low-dimensional projections of features that best explain the response.
When to Use
- You have many features and suspect lower-dimensional structure
- You want dimension reduction combined with prediction
- Interpretability of the projection directions is valuable
- Linear combinations of features drive the response
Configuration
error_prediction:
  regressor: SIRPolynomial
  regressor_params:
    n_directions: 2  # Number of SIR directions to keep (default: 2)
    n_slices: 10     # Number of slices for SIR (default: 10)
    degree: 2        # Polynomial degree in reduced space (default: 2)
How It Works
┌─────────────────────────────────────────────────────────────────────────────┐
│ SIR Polynomial Pipeline │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Original Features (p dims) SIR Directions (d dims) Polynomial │
│ ┌─────────────────────┐ ┌─────────────────────┐ ┌─────────┐ │
│ │ ABL_height │ │ z₁ = w₁·X │ │ │ │
│ │ wind_veer │ SIR │ z₂ = w₂·X │ ──► │ ŷ=P(z) │ │
│ │ lapse_rate │ ───► │ │ │ │ │
│ │ TI │ │ (d << p) │ └─────────┘ │
│ │ ... │ └─────────────────────┘ │
│ └─────────────────────┘ │
│ (p=10) (d=2) │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
- SIR finds directions $w_i$ such that $z_i = w_i \cdot X$ captures response variation
- Polynomial fits $\hat{y} = P(z_1, z_2, ...)$ in the reduced space
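The SIR step itself is compact enough to sketch from scratch. The helper below is an illustration of the algorithm, not the WIFA-UQ implementation: it slices the sorted response, averages the centred features within each slice, and takes the leading generalized eigenvectors of the between-slice covariance:

import numpy as np
from scipy.linalg import eigh

def sir_directions(X, y, n_directions=2, n_slices=10):
    """Sliced Inverse Regression: directions w such that w @ x drives y."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    sigma = np.cov(Xc, rowvar=False)

    # Between-slice covariance of the centred features
    M = np.zeros((p, p))
    for idx in np.array_split(np.argsort(y), n_slices):
        m = Xc[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)

    # Generalized eigenproblem M w = lambda * sigma * w; keep largest eigenvalues
    _, vecs = eigh(M, sigma)
    return vecs[:, ::-1][:, :n_directions].T   # shape (n_directions, p)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                  # placeholder features
y = np.tanh(X @ np.array([1.0, 0.5, 0.0, 0.0, 0.0])) + 0.1 * rng.normal(size=200)
W = sir_directions(X, y)                       # the polynomial is then fit on X @ W.T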
Parameters Explained
| Parameter | Description | Guidance |
|---|---|---|
| n_directions | Dimensionality of reduced space | Start with 1-2, increase if underfitting |
| n_slices | Slices for estimating inverse regression | 5-20, more for larger datasets |
| degree | Polynomial degree in reduced space | 1-3 |
Interpreting Directions
The SIR directions reveal which feature combinations matter:
predictor = BiasPredictor(
    database,
    regressor_name="SIRPolynomial",
    regressor_params={"n_directions": 2},
    feature_names=features
)
predictor.fit(calibrated_indices)

# Get SIR directions
directions = predictor.get_sir_directions()
print("SIR Direction 1:")
for feat, weight in zip(features, directions[0]):
    print(f"  {feat}: {weight:.3f}")
Example output:
SIR Direction 1:
  ABL_height: 0.72
  wind_veer: 0.45
  lapse_rate: 0.12
  TI: 0.52
This indicates the first direction is dominated by ABL_height, with sizeable contributions from TI and wind_veer and little from lapse_rate.
Strengths & Limitations
Strengths:
- Effective dimension reduction
- Interpretable projection directions
- Handles many features gracefully
- Can reveal unexpected feature combinations
Limitations:
- Assumes linear projections capture response structure
- Sensitive to n_slices parameter
- Less common, fewer reference implementations
- May miss complex non-linear interactions
Linear Regressors
Simple and interpretable models for baseline comparisons and when linear relationships are expected.
Available Variants
| Variant | regressor_params.linear_type | Regularization | Best For |
|---|---|---|---|
| Ridge | ridge (default) | L2 (shrinkage) | Collinear features, default |
| Lasso | lasso | L1 (sparsity) | Feature selection |
| ElasticNet | elasticnet | L1 + L2 | Mixed benefits |
| OLS | ols | None | Simple baseline |
Configuration
Ridge (default):
error_prediction:
  regressor: Linear
  regressor_params:
    linear_type: ridge
    alpha: 1.0       # Regularization strength
Lasso:
error_prediction:
  regressor: Linear
  regressor_params:
    linear_type: lasso
    alpha: 0.1
ElasticNet:
error_prediction:
  regressor: Linear
  regressor_params:
    linear_type: elasticnet
    alpha: 0.5
    l1_ratio: 0.5    # Balance between L1 and L2
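For orientation, the variants differ only in the penalty added to the least-squares loss (scaling conventions differ between implementations; this is the generic form, with coefficients $\beta$, regularization strength $\alpha$, and mixing ratio $\rho$ corresponding to l1_ratio):

$$\hat\beta = \arg\min_\beta \; \lVert y - X\beta \rVert_2^2 + P(\beta), \quad P_{\text{ridge}} = \alpha \lVert \beta \rVert_2^2, \quad P_{\text{lasso}} = \alpha \lVert \beta \rVert_1, \quad P_{\text{enet}} = \alpha \left( \rho \lVert \beta \rVert_1 + (1-\rho) \lVert \beta \rVert_2^2 \right)$$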
Interpreting Coefficients
Linear models provide direct coefficient interpretation:
predictor = BiasPredictor(
    database,
    regressor_name="Linear",
    regressor_params={"linear_type": "ridge", "alpha": 1.0},
    feature_names=features
)
predictor.fit(calibrated_indices)

# Get coefficients
coefs = predictor.get_coefficients()
print("Feature coefficients:")
for feat, coef in zip(features, coefs):
    print(f"  {feat}: {coef:.4f}")
Interpretation: A coefficient of 0.002 for ABL_height means a 1-unit increase in ABL_height increases the predicted bias by 0.002 (normalized units), with all other features held fixed.
When to Use Each
| Scenario | Recommended |
|---|---|
| Baseline comparison | Ridge or OLS |
| Many correlated features | Ridge |
| Want automatic feature selection | Lasso |
| Uncertain about regularization type | ElasticNet |
| Very few features, lots of data | OLS |
Strengths & Limitations
Strengths:
- Highly interpretable coefficients
- Fast training and prediction
- Analytical uncertainty estimates
- Good baseline for comparison

Limitations:
- Cannot capture non-linear relationships
- May underfit complex responses
- Feature engineering required for interactions
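If interaction terms are needed, they can be engineered before the linear fit. A minimal sketch with scikit-learn (illustrative only; this feature construction happens outside the Linear regressor configuration):

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 3))                               # e.g. ABL_height, wind_veer, lapse_rate
y = 0.4 * X[:, 0] * X[:, 1] + 0.1 * rng.normal(size=120)    # interaction-driven bias

# Standardize, add pairwise interaction terms, then fit a ridge model
model = make_pipeline(
    StandardScaler(),
    PolynomialFeatures(degree=2, interaction_only=True, include_bias=False),
    Ridge(alpha=1.0),
)
model.fit(X, y)
print(model.named_steps["ridge"].coef_)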
Model Selection Guide
Decision Tree
Start here:
1. Need Sobol sensitivity indices? Yes → PCE. No → 2.
2. Many features (>5) with suspected low-dimensional structure? Yes → SIRPolynomial. No → 3.
3. Need best accuracy or SHAP feature importance? Yes → XGB. No → 4.
4. Simple baseline or linear relationships expected? → Linear.
Comparison by Criterion
| Criterion | Best Choice | Second Choice |
|---|---|---|
| Predictive accuracy | XGB | PCE |
| Interpretability | Linear | SIRPolynomial |
| Sensitivity analysis | PCE (Sobol) | XGB (SHAP) |
| Small dataset (<50) | Linear | PCE (low degree) |
| Many features | SIRPolynomial | XGB |
| Production deployment | XGB or Linear | PCE |
Common Configuration Patterns
High-Accuracy Setup
error_prediction:
  regressor: XGB
  regressor_params:
    max_depth: 4
    n_estimators: 200
    learning_rate: 0.05
    min_child_weight: 3
  features:
    - ABL_height
    - wind_veer
    - lapse_rate
    - turbulence_intensity
    - wind_speed
Sensitivity Analysis Setup
error_prediction:
  regressor: PCE
  regressor_params:
    degree: 3
    q_norm: 0.75
  features:
    - ABL_height
    - wind_veer
    - lapse_rate
    - turbulence_intensity
  sensitivity_analysis:
    output_sobol: true
Interpretable Baseline
error_prediction:
  regressor: Linear
  regressor_params:
    linear_type: ridge
    alpha: 1.0
  features:
    - ABL_height
    - wind_veer
    - lapse_rate
Dimension Reduction Setup
error_prediction:
  regressor: SIRPolynomial
  regressor_params:
    n_directions: 2
    n_slices: 10
    degree: 2
  features:
    - ABL_height
    - wind_veer
    - lapse_rate
    - turbulence_intensity
    - capping_inversion_strength
    - capping_inversion_thickness
    - wind_speed
    - wind_direction
Troubleshooting
"XGBoost is overfitting"
Symptoms: Training RMSE much lower than test RMSE.
Solutions:
- Reduce max_depth (try 2-3)
- Reduce n_estimators
- Increase min_child_weight
- Add regularization: reg_alpha, reg_lambda
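To confirm the symptom before tuning, compare train and test scores under cross-validation. A quick check with xgboost and scikit-learn directly, outside the WIFA-UQ pipeline (X and y are placeholders for the feature matrix and residual bias):

import numpy as np
from sklearn.model_selection import cross_validate
from xgboost import XGBRegressor

rng = np.random.default_rng(0)
X, y = rng.normal(size=(100, 4)), rng.normal(size=100)    # placeholder data

cv = cross_validate(
    XGBRegressor(max_depth=6, n_estimators=300),           # deliberately flexible model
    X, y,
    cv=5,
    scoring="neg_root_mean_squared_error",
    return_train_score=True,
)
print("train RMSE:", -cv["train_score"].mean())
print("test RMSE: ", -cv["test_score"].mean())             # a large gap indicates overfitting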
"PCE predictions are unstable"
Symptoms: Large prediction variance, coefficients vary across CV folds.
Solutions:
- Reduce degree
- Reduce q_norm for a sparser basis with fewer interaction terms
- Use fit_type: LARS for sparse solution
- Ensure features are scaled
"Linear model has high bias"
Symptoms: Both training and test errors are high.
Solutions:
- Add polynomial features manually
- Add interaction terms
- Switch to XGB or PCE for non-linear capture
- Check if features are properly normalized
"SIRPolynomial gives poor results"
Symptoms: Predictions don't track actual values.
Solutions:
- Increase n_directions
- Adjust n_slices (try 5-20)
- Increase degree in reduced space
- Verify sufficient data for dimension reduction
See Also
- Feature Engineering — Available features and custom features
- Cross-Validation — Validation strategies
- Global Calibration — Calibration before bias prediction
- Configuration Reference — Full YAML options