Correlation¶
CorrelationResult¶
dataclass
CorrelationResult(matrix: DataFrame, blocks: list[tuple[str, int, int]], block_stats: DataFrame, method: CorrelationMethod)
Output of every correlation metric in this module.
matrix¶
instance-attribute
Square DataFrame whose index and columns are the feature names, in graph order.
blocks¶
instance-attribute
List of (concept_path, start_idx, end_idx_exclusive) tuples. Each tuple gives the half-open row/column range that one concept occupies in matrix.
block_stats¶
instance-attribute
One row per block. Columns: concept_path, size, mean_abs, median_abs, min, max.
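As a minimal sketch of how blocks maps onto matrix, the example below slices diagonal and off-diagonal blocks out of a hand-written matrix. The feature names, concept paths, and correlation values are made up for illustration, and the helpers diagonal_block and off_diagonal_block are hypothetical, not part of this module.

```python
import pandas as pd

# Hypothetical stand-in for CorrelationResult.matrix: four features in graph order.
features = ["age", "income", "hr_mean", "hr_max"]
matrix = pd.DataFrame(
    [
        [1.0, 0.6, 0.1, 0.0],
        [0.6, 1.0, 0.2, 0.1],
        [0.1, 0.2, 1.0, 0.9],
        [0.0, 0.1, 0.9, 1.0],
    ],
    index=features,
    columns=features,
)

# Hypothetical stand-in for CorrelationResult.blocks:
# (concept_path, start_idx, end_idx_exclusive) over rows/cols.
blocks = [("demographics", 0, 2), ("vitals", 2, 4)]


def diagonal_block(matrix, block):
    """Return the within-concept sub-matrix for one block."""
    _, start, end = block
    return matrix.iloc[start:end, start:end]


def off_diagonal_block(matrix, row_block, col_block):
    """Return the cross-concept sub-matrix between two blocks."""
    _, r0, r1 = row_block
    _, c0, c1 = col_block
    return matrix.iloc[r0:r1, c0:c1]


vitals = diagonal_block(matrix, blocks[1])
cross = off_diagonal_block(matrix, blocks[0], blocks[1])
print(vitals)
print(cross)
```

Because the end index is exclusive, each tuple can be passed directly to iloc slicing without adjustment.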
feature_correlation¶
feature_correlation(graph: ConceptGraph, X: DataFrame | ndarray, *, feature_names: Sequence[str] | None = None, method: CorrelationMethod = 'spearman') -> CorrelationResult
Block-structured correlation matrix on feature values (P14).
Diagonal blocks reveal within-concept coherence; off-diagonal blocks reveal boundary leakage (features in different concepts that turn out to be highly correlated).
X may be either a pd.DataFrame, in which case column names are taken from X.columns, or an np.ndarray of shape (N, F), in which case column names must be supplied via feature_names.
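The two accepted input forms can be illustrated with a small sketch that normalises either one to a named DataFrame and then computes a Spearman matrix with pandas. The helper to_frame is hypothetical, and raising ValueError when names are missing is an assumption about reasonable behaviour, not a description of what the library does. The graph-based ordering and block bookkeeping are omitted.

```python
import numpy as np
import pandas as pd


def to_frame(X, feature_names=None):
    """Hypothetical helper: return X as a DataFrame with named columns."""
    if isinstance(X, pd.DataFrame):
        return X  # column names come from X.columns
    X = np.asarray(X)
    if feature_names is None:
        # Assumption: an unnamed array cannot be matched to features.
        raise ValueError("feature_names is required when X is an ndarray")
    if X.ndim != 2 or X.shape[1] != len(feature_names):
        raise ValueError("X must have shape (N, F) with F == len(feature_names)")
    return pd.DataFrame(X, columns=list(feature_names))


# Toy data: column b is a monotone function of a; column c mostly decreases.
X_array = np.array(
    [
        [1.0, 2.0, 10.0],
        [2.0, 4.0, 7.0],
        [3.0, 6.0, 8.0],
        [4.0, 8.0, 1.0],
    ]
)
names = ["a", "b", "c"]

corr_from_array = to_frame(X_array, names).corr(method="spearman")
corr_from_frame = to_frame(pd.DataFrame(X_array, columns=names)).corr(method="spearman")
print(corr_from_array)
```

Both calls produce the same matrix, since the only difference between the two input forms is where the column names come from.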
Source code in src/concept_graph_xai/metrics/correlation.py
nullity_correlation¶
nullity_correlation(graph: ConceptGraph, X: DataFrame, *, method: CorrelationMethod = 'spearman') -> CorrelationResult
Block-structured correlation matrix on feature missingness (P15a).
Built on X.isna(). A high diagonal-block value means the features in that concept tend to go missing together, which is directly relevant to the "this branch is missing" AUC-drop scenario.
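The idea of correlating missingness rather than values can be sketched with plain pandas: take the X.isna() indicator matrix and correlate its columns. The column names and data below are made up, and the graph-based ordering and block bookkeeping are left out.

```python
import numpy as np
import pandas as pd

# Toy data: lab_a and lab_b are missing on the same rows; age is missing elsewhere.
X = pd.DataFrame(
    {
        "lab_a": [1.0, np.nan, 3.0, np.nan, 5.0, 6.0],
        "lab_b": [0.5, np.nan, 0.7, np.nan, 0.9, 1.1],
        "age": [30.0, 41.0, np.nan, 52.0, 63.0, 74.0],
    }
)

# 1.0 where a value is missing, 0.0 where it is present.
missing = X.isna().astype(float)

# Correlate the missingness indicators column by column.
nullity_corr = missing.corr(method="spearman")
print(nullity_corr)
```

In this sketch a column with no missing values has a constant indicator, so its correlations come out as NaN. How the library itself treats such columns is not stated here.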
Source code in src/concept_graph_xai/metrics/correlation.py
shap_correlation¶
shap_correlation(graph: ConceptGraph, feature_names: Sequence[str], shap_values: ndarray, *, method: CorrelationMethod = 'spearman') -> CorrelationResult
Block-structured correlation of SHAP values across samples (P17).
Two features whose raw values are uncorrelated can still be SHAP-redundant. Diagonal blocks near 1 indicate that features inside a concept push the model in the same way; off-diagonal blocks near 1 indicate that the model treats different concepts as substitutes.
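A rough sketch of the computation, assuming shap_values has shape (N, F) with columns ordered like feature_names: correlate the SHAP columns across samples, then average an off-diagonal block to see whether two concepts look like substitutes. The SHAP values, feature names, and block layout below are invented for illustration and do not come from a real model.

```python
import numpy as np
import pandas as pd

# Hypothetical inputs: 5 samples, 3 features (made-up SHAP values).
feature_names = ["hr_mean", "hr_max", "income"]
shap_values = np.array(
    [
        [0.10, 0.05, -0.20],
        [0.20, 0.10, -0.10],
        [0.30, 0.20, 0.00],
        [0.40, 0.30, 0.30],
        [0.50, 0.45, 0.10],
    ]
)

# Hypothetical blocks: (concept_path, start_idx, end_idx_exclusive).
blocks = [("vitals", 0, 2), ("demographics", 2, 3)]

# Correlation of SHAP values across samples, one column per feature.
shap_corr = pd.DataFrame(shap_values, columns=feature_names).corr(method="spearman")

# Mean correlation in the vitals-vs-demographics off-diagonal block.
(_, r0, r1), (_, c0, c1) = blocks
cross_mean = float(shap_corr.iloc[r0:r1, c0:c1].to_numpy().mean())
print(shap_corr)
print(cross_mean)
```

A cross-block mean close to 1, as in this toy data, is the pattern described above for concepts that the model treats as substitutes.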