xai

Tools to explain aspects of a model.

compute_partial_dependence(pred_fun, X, features, grid=10, weights=None, n_max=1000, rng=None)

Compute partial dependence.

This is a fast brute-force method to compute partial dependence values for the given grid.

Parameters:

Name Type Description Default
pred_fun callable

Prediction function, such that pred_fun(X) gives predicted values.

required
X array-like of shape (n_obs, n_features)

The dataframe or array of features to be passed to the model predict function.

required
features int or str

Column index or column name of the feature in X.

required
grid Series or int

Values of the feature specified by features for which to compute partial dependence. If an integer is specified, a grid of grid points of the given feature is constructed automatically using binning.

10
weights array-like of shape (n_obs) or None

Case weights. If given, partial dependence values are computed as weighted averages of the predictions with these weights.

None
n_max int or None

The number of rows to subsample from X. This speeds up computation, in particular for slow predict functions.

1000
rng (Generator, int or None)

The random number generator. The input is internally wrapped by np.random.default_rng(rng).

None

Returns:

Type Description
np.ndarray of shape (n_grid,)

Partial dependence values for the grid.
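The brute-force approach can be sketched as follows. This is a minimal illustration, not the library's implementation: the function name is made up, X is assumed to be a numeric array indexed by column position, and subsampling and weights are omitted.

```python
import numpy as np

def partial_dependence_sketch(pred_fun, X, feature_index, grid):
    """Brute-force partial dependence: for each grid value, override the
    feature column with that value for all rows and average predictions."""
    X = np.asarray(X, dtype=float)
    pd_values = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature_index] = value  # set the feature to the grid value everywhere
        pd_values.append(pred_fun(X_mod).mean())
    return np.array(pd_values)
```

For a linear model, the resulting curve is linear in the grid values, which makes the sketch easy to check by hand.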

compute_permutation_importance(pred_fun, X, y, features=None, scoring_function=SquaredError(), weights=None, n_repeats=5, n_max=10000, scoring_orientation='smaller_is_better', rng=None)

Compute permutation feature importance.

This function calculates permutation feature importance for features and/or feature groups according to the idea in [Breiman] and [Fisher].

For each feature (group), permutation importance measures how much the model performance worsens when shuffling the values of that feature (group) before calculating predictions. The idea is that if a feature is important, then shuffling its values will lead to a large drop in model performance. Shuffling is done n_repeats times, and mean differences and mean ratios are returned along with their standard errors.

Note that the model is never retrained during this process.
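The core loop can be sketched as follows. This is an illustrative simplification, not the library's code: the function name is invented, squared error stands in for the configurable scoring function, and only mean score differences (no ratios, standard errors, weights, or subsampling) are computed.

```python
import numpy as np

def permutation_importance_sketch(pred_fun, X, y, n_repeats=5, rng=None):
    """Shuffle each column n_repeats times and record how much the
    mean squared error increases relative to the unshuffled baseline."""
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    baseline = np.mean((y - pred_fun(X)) ** 2)  # score on intact data
    result = {}
    for j in range(X.shape[1]):
        diffs = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            rng.shuffle(X_perm[:, j])  # permute one feature, keep the rest intact
            score = np.mean((y - pred_fun(X_perm)) ** 2)
            diffs.append(score - baseline)
        result[j] = float(np.mean(diffs))
    return result  # mean score difference per feature index
```

A feature the model ignores yields a difference of exactly zero, since shuffling it leaves the predictions unchanged.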

Parameters:

Name Type Description Default
pred_fun callable

A callable to get predictions, i.e. pred_fun(X).

required
X array-like of shape (n_obs, n_features)

The dataframe or array of features to be passed to the model predict function.

required
y ArrayLike

1D array of shape (n_observations,) containing the target values.

required
features Optional[Union[list, tuple, set, dict]]

Iterable of feature names/indices of features in X. The default None will use all features in X. Can also be a dictionary with lists of feature names/indices as values. The keys of the dictionary are used as feature group names. Example: {"x1": ["x1"], "x2": ["x2"], "size": ["x1", "x2"]}. Passing a dictionary is also useful if you want to represent feature indices of a numpy array as strings. Example: {"area": 0, "age": 1}.

None
scoring_function callable

A scoring function with signature roughly fun(y_obs, y_pred, weights) -> float.

SquaredError()
weights array-like of shape (n_obs) or None

Case weights passed to the scoring_function.

None
n_repeats int

Number of times to repeat the permutation for each feature group.

5
n_max int or None

Maximum number of observations used. If the number of observations is greater than n_max, a random subset of size n_max will be drawn from X, y, (and weights). Pass None for no subsampling.

10_000
scoring_orientation str

Direction of scoring function. Use "smaller_is_better" if smaller values are better (e.g., average losses), or "greater_is_better" if greater values are better (e.g., R-squared).

"smaller_is_better"
rng (Generator, int or None)

The random number generator used for shuffling values and for subsampling n_max rows. The input is internally wrapped by np.random.default_rng(rng).

None

Returns:

Name Type Description
df DataFrame

A DataFrame with one row per feature (group) and the following columns:

  • feature: Feature name or feature group name.
  • difference_mean: Mean of the score differences.
  • difference_stderr: Standard error, i.e. standard deviation of difference_mean. (None if n_repeats = 1.)
  • ratio_mean: Mean of the score ratios.
  • ratio_stderr: Standard error, i.e. standard deviation of ratio_mean. (None if n_repeats = 1.)
References
[Breiman]

Breiman, L. (2001). "Random Forests". Machine Learning, 45(1), 5-32. https://doi.org/10.1023/A:1010933404324

[Fisher]

Fisher, A., Rudin, C., and Dominici, F. (2019). "All Models Are Wrong, but Many Are Useful: Learning a Variable's Importance by Studying an Entire Class of Prediction Models Simultaneously". Journal of Machine Learning Research, 20(177), 1-81.

Examples:

>>> import numpy as np
>>> import polars as pl
>>> from sklearn.linear_model import LinearRegression
>>> # Create a synthetic dataset
>>> rng = np.random.default_rng(1)
>>> n = 1000
>>> X = pl.DataFrame(
...     {
...         "rooms": rng.choice([2.5, 3.5, 4.5], n),
...         "area": rng.uniform(30, 120, n),
...         "age": rng.uniform(0, 100, n),
...     }
... )
>>> y = X["area"] + 20 * X["rooms"] + rng.normal(0, 10, n)
>>> model = LinearRegression()
>>> _ = model.fit(X, y)
>>> perm_importance = compute_permutation_importance(
...     pred_fun=model.predict,
...     X=X,
...     y=y,
...     rng=1,
... )
>>> perm_importance
shape: (3, 5)
┌─────────┬─────────────────┬───────────────────┬────────────┬──────────────┐
│ feature ┆ difference_mean ┆ difference_stderr ┆ ratio_mean ┆ ratio_stderr │
│ ---     ┆ ---             ┆ ---               ┆ ---        ┆ ---          │
│ str     ┆ f64             ┆ f64               ┆ f64        ┆ f64          │
╞═════════╪═════════════════╪═══════════════════╪════════════╪══════════════╡
│ rooms   ┆ 524.213195      ┆ 8.813555          ┆ 6.263515   ┆ 0.088495     │
│ area    ┆ 1328.885114     ┆ 15.924463         ┆ 14.343058  ┆ 0.159894     │
│ age     ┆ 0.174047        ┆ 0.090023          ┆ 1.001748   ┆ 0.000904     │
└─────────┴─────────────────┴───────────────────┴────────────┴──────────────┘

Using feature subsets

>>> perm_importance = compute_permutation_importance(
...     pred_fun=model.predict,
...     X=X,
...     y=y,
...     features=["area", "age"],
...     rng=1,
... )

Using feature groups

>>> perm_importance = compute_permutation_importance(
...     pred_fun=model.predict,
...     X=X,
...     y=y,
...     features={"size": ["area", "rooms"], "age": "age"},
...     rng=1,
... )

plot_permutation_importance(pred_fun, X, y, features=None, scoring_function=SquaredError(), weights=None, n_repeats=5, n_max=10000, scoring_orientation='smaller_is_better', rng=None, max_display=15, which='difference', confidence_level=0.95, ax=None)

Plot permutation importance as a bar plot with confidence intervals.

Parameters:

Name Type Description Default
pred_fun callable

A callable to get predictions, i.e. pred_fun(X).

required
X array-like of shape (n_obs, n_features)

The dataframe or array of features to be passed to the model predict function.

required
y ArrayLike

1D array of shape (n_observations,) containing the target values.

required
features Optional[Union[list, tuple, set, dict]]

Iterable of feature names/indices of features in X. The default None will use all features in X. Can also be a dictionary with lists of feature names/indices as values. The keys of the dictionary are used as feature group names. Example: {"x1": ["x1"], "x2": ["x2"], "size": ["x1", "x2"]}. Passing a dictionary is also useful if you want to represent feature indices of a numpy array as strings. Example: {"area": 0, "age": 1}.

None
scoring_function callable

A scoring function with signature roughly fun(y_obs, y_pred, weights) -> float.

SquaredError()
weights array-like of shape (n_obs) or None

Case weights passed to the scoring_function.

None
n_repeats int

Number of times to repeat the permutation for each feature group.

5
n_max int or None

Maximum number of observations used. If the number of observations is greater than n_max, a random subset of size n_max will be drawn from X, y, (and weights). Pass None for no subsampling.

10_000
scoring_orientation str

Direction of scoring function. Use "smaller_is_better" if smaller values are better (e.g., average losses), or "greater_is_better" if greater values are better (e.g., R-squared).

"smaller_is_better"
rng (Generator, int or None)

The random number generator used for shuffling values and for subsampling n_max rows. The input is internally wrapped by np.random.default_rng(rng).

None
max_display int or None

Maximum number of features to display, by default 15. If None, all features are displayed.

15
which str

Should difference or ratio scores be shown? Either "difference" or "ratio".

"difference"
confidence_level float

Confidence level for error bars. If 0, no error bars are plotted. The value must satisfy 0 <= confidence_level < 1. Set to 0.683 to show standard errors.

0.95
ax matplotlib.axes.Axes or plotly Figure

Axes or Figure object to draw the plot onto; otherwise, the current Axes is used.

None

Returns:

Name Type Description
ax

Either the matplotlib axes or the plotly figure.
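One way the error bars can be derived from the returned standard errors is a two-sided normal-approximation interval; whether the library uses exactly this construction is an assumption, and the helper name below is illustrative.

```python
from statistics import NormalDist

def ci_half_width(stderr, confidence_level=0.95):
    """Half-width of a two-sided normal-approximation confidence interval:
    z * stderr, where z is the (0.5 + confidence_level / 2) normal quantile."""
    z = NormalDist().inv_cdf(0.5 + confidence_level / 2)
    return z * stderr
```

With confidence_level=0.683 the quantile z is close to 1, so the bars span roughly one standard error on each side, matching the hint in the confidence_level parameter description.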