NNPDF solves an inverse problem: we have a set of experimental data and we want to find the underlying PDFs that best explain the data.
Theory (QCD, QED) lets us compute the observable values \(\mathcal{O}_n\) from a point in PDF space; the inverse map is obtained by fitting precise experimental data.
The PDF space is 9-dimensional, expressed in either the flavor basis or the evolution basis.
Why does a given flavor look like this in a given \(x\) region? Which datasets are responsible for this behavior?
Black-box systems are a recurrent interpretability issue in modern DL. We can take inspiration from the methods developed there to solve the interpretability problem of NNPDF.
XAI answer: What's the impact of one feature on the output of the model?
Inherited from game theory: the Shapley value \(\phi_i\) represents the contribution of a player \(i\), averaged over all coalitions \(S\) it can join.
\[ \phi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|! \, (|N| - |S| - 1)!}{|N|!} \left[ v(S \cup \{i\}) - v(S) \right] \]where \(N\) is the set of all players, \(S\) is a subset of players not including \(i\), and \(v(S)\) is the value of the coalition \(S\).
Computational cost scales exponentially with the number of players: all \(2^{|N|}\) coalitions must be evaluated.
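As a concrete check of the formula above, here is a minimal Python sketch computing exact Shapley values for a toy three-player game (the classic "glove game"; function names are illustrative, not part of any NNPDF code):

```python
from math import factorial
from itertools import combinations

def shapley_values(n, v):
    """Exact Shapley values for an n-player game with value function v(frozenset)."""
    phi = [0.0] * n
    for i in range(n):
        others = [p for p in range(n) if p != i]
        for r in range(len(others) + 1):
            for combo in combinations(others, r):
                S = frozenset(combo)
                # Shapley weight |S|! (|N|-|S|-1)! / |N|!
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                phi[i] += w * (v(S | {i}) - v(S))
    return phi

# Glove game: player 0 holds a left glove, players 1 and 2 a right glove each;
# a coalition is worth 1 only if it can form a matching pair.
def v(S):
    return 1.0 if 0 in S and (1 in S or 2 in S) else 0.0

print(shapley_values(3, v))  # [2/3, 1/6, 1/6] up to floating point
```

The left-glove holder is pivotal in more orderings, so it receives the larger share, and the values sum to \(v(N) - v(\emptyset) = 1\) (efficiency).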
SHAP adapts Shapley values for deep-learning applications, making them computable for large numbers of players (features) under certain assumptions and approximations.
Applying Shapley Values to NNPDF, we treat the PDF of each flavor as an input of our black-box model.
Loading trained PDFs from n3fit.
For selected datasets, we can run the following Shapley analysis:
What we compute: the exact Shapley value for each PDF flavor by averaging its marginal contribution across all coalitions (all subsets of flavors).
Computational cost: exponential in the number of flavors, since the fit must be evaluated for each coalition. In the flavor basis, with \(N=9\) flavors, there are \(2^9 = 512\) coalitions.
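To make the bookkeeping explicit, a quick count for the flavor basis (pure arithmetic, no fit involved):

```python
n = 9                              # flavors in the flavor basis
coalitions = 2 ** n                # distinct subsets S whose chi2 is needed
marginal_terms = n * 2 ** (n - 1)  # (S, S ∪ {i}) pairs summed over all flavors i
print(coalitions, marginal_terms)  # 512 2304
```

Although 2304 marginal contributions enter the Shapley sums, memoizing \(v(S)\) means only the 512 distinct coalition values ever need to be evaluated.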
Using exact Shapley values, we make no assumption of feature independence.
We can build coalitions with correlated perturbations. How do we infer the correlations between the flavors?
Sum rules:
\[ \text{Momentum sum rule:} \quad \int_0^1 dx\, x\left(g(x,Q)+\Sigma(x,Q)\right)=1, \] \[ \text{Valence sum rules:} \quad \int_0^1 dx\, V(x,Q)=\int_0^1 dx\, V_8(x,Q)=3, \qquad \int_0^1 dx\, V_3(x,Q)=1. \]Sum rules enforce physical PDFs and correlate our perturbations within a coalition.
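A numerical illustration of how a sum rule fixes a normalization: take a toy valence shape \(V(x) = N\, x^{a-1}(1-x)^b\) (illustrative exponents, not an NNPDF parametrization) and fix \(N\) from the valence sum rule \(\int_0^1 V\,dx = 3\):

```python
from math import gamma

def beta(a, b):
    """Euler Beta function B(a, b) = Γ(a)Γ(b)/Γ(a+b)."""
    return gamma(a) * gamma(b) / gamma(a + b)

# Toy valence shape V(x) = N x^(a-1) (1-x)^b; the valence sum rule
# ∫_0^1 V(x) dx = 3 fixes the normalization N = 3 / B(a, b+1).
a, b = 2.0, 3.0
N = 3.0 / beta(a, b + 1.0)

def V(x):
    return N * x ** (a - 1.0) * (1.0 - x) ** b

# Midpoint-rule check that the sum rule is satisfied
m = 10000
h = 1.0 / m
integral = sum(V((k + 0.5) * h) for k in range(m)) * h
print(round(integral, 6))  # → 3.0
```

Perturbing a flavor that enters a sum rule therefore forces a compensating change elsewhere; this is the mechanism that correlates perturbations within a coalition.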
Contact : raphael.bonnet-guerrini@unimi.it
This work was supported by the European Union's Horizon Europe research and innovation programme under the Marie Sklodowska-Curie grant agreement No 101168829, Challenging AI with Challenges from Physics: How to solve fundamental problems in Physics by AI and vice versa (AIPHY).
# INPUT: observables; perturbation parameters mu, sigma, amplitude;
#        n_samples; n = n_flavors
# OUTPUT: shapley_vals, baseline_chi2, cache
from math import factorial
from itertools import combinations

def power_set(items):
    """All subsets of items as frozensets (empty and full set included;
    subsets containing i are skipped in the loop below)."""
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

baseline_chi2 = evaluate_chi2(observables, flavor_subset=[])  # v(∅)
cache = {frozenset(): baseline_chi2}  # subset -> v(S), memoized
all_subsets = power_set(range(n))

def v(S):
    """v(S): chi2 with only the flavors in S perturbed (memoized)."""
    if S not in cache:
        cache[S] = evaluate_chi2(observables, flavor_subset=sorted(S),
                                 mu=mu, sigma=sigma,
                                 amplitude=amplitude, n_samples=n_samples)
    return cache[S]

shapley_vals = [0.0] * n
for i in range(n):
    for S in all_subsets:
        if i in S:
            continue
        delta = v(S | {i}) - v(S)  # marginal contribution of flavor i
        s = len(S)
        w = factorial(s) * factorial(n - s - 1) / factorial(n)  # Shapley weight
        shapley_vals[i] += w * delta

# RETURN: shapley_vals, baseline_chi2, evaluated_coalitions = len(cache)
# COMPLEXITY: time ~ O(n · 2^n · cost_eval), space ~ O(2^n) (memoized)
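The algorithm above can be exercised end-to-end with a mock \(\chi^2\) (the toy function below is a hypothetical stand-in, not the actual n3fit evaluation); the efficiency property \(\sum_i \phi_i = v(N) - v(\emptyset)\) is a useful sanity check:

```python
from math import factorial
from itertools import combinations

# Mock stand-in for evaluate_chi2 (illustrative only): each perturbed flavor
# lowers the chi2 by 2, with an extra gain when flavors 0 and 1 are both in.
def mock_chi2(flavor_subset):
    S = set(flavor_subset)
    chi2 = 10.0 - 2.0 * len(S)
    if {0, 1} <= S:
        chi2 -= 1.0
    return chi2

n = 3
cache = {}  # frozenset -> v(S), memoized as in the pseudocode above

def v(S):
    if S not in cache:
        cache[S] = mock_chi2(sorted(S))
    return cache[S]

shapley_vals = [0.0] * n
for i in range(n):
    others = [p for p in range(n) if p != i]
    for r in range(len(others) + 1):
        for combo in combinations(others, r):
            S = frozenset(combo)
            w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
            shapley_vals[i] += w * (v(S | {i}) - v(S))

# Efficiency property: the Shapley values sum to v(N) - v(∅)
print(sum(shapley_vals), v(frozenset(range(n))) - v(frozenset()))
```

With this toy, flavors 0 and 1 split the interaction term equally on top of their individual contributions, and all \(2^3 = 8\) coalition values end up memoized in `cache`.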