$\newcommand{\E}{\mathbb{E}}$ $\newcommand{\R}{\mathbb{R}}$

Deep Learning in Practice

Chapter 2: Interpretability


$\newcommand{\epsi}{\varepsilon}$

Overview:

I - Visualization / Analysis (of a trained neural network)

At the neuron level
At the layer level
The case of CNN
At the functional level (of a network already trained)
The case of LLMs: chain of thought
About optimization visualization
General visualization tools
By sub-task design: “explainable AI”

II - Interpretability: societal impact and approaches

Why interpretability is important: what is at stake
Interpretability by design: "Explainable AI"
Interpretability of data: causality

III - Issues related to datasets

Dataset poisoning
Dataset contamination
Fairness
Differential privacy

I - Visualization / Analysis (of a trained neural network)

At the neuron level

pick a neuron, study its activities on the training set
- show its history (particularly relevant for recurrent networks)
- show its distribution of activities (possibly as a function of input classes in the case of a classification task)
what does it see?
- display its receptive field (particularly relevant in convolutional networks, on images, etc.)
what does it react to?
- display input patterns (in computer vision: images or image patches taken from the dataset) that maximize its activity (or lead to some target activity)
- compute & display the artificial pattern that would theoretically activate that neuron the most (or: that would activate it in a given way, to reach a given target value).
  $\implies$ by gradient ascent on the input (gradient descent when matching a target value): backpropagate the activation from the neuron through the layers down to the input, and iteratively modify the input image (starting from random noise or from a given image)
  - Specific case of last layer neurons in classification tasks:
    - adversarial examples
      [Intriguing properties of neural networks; Szegedy et al, ICLR 2014]
    - deep dreams: making objects appear in an image
    - $\implies$ inspired by neural style transfer [A Neural Algorithm of Artistic Style, Gatys, Ecker & Bethge, 2015]
does it have any impact, actually? Which neuron influenced the network decision the most?
- derivative $\frac{df}{da}$ of the output of the network w.r.t. the activity of that neuron
- causality-style analysis (replace $a$ with typical values and check influence: costly)
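
The search for a maximally activating input can be sketched on a toy case. The snippet below assumes a single hypothetical neuron $a(x) = \tanh(w \cdot x + b)$; the weights, the norm bound `radius` and all function names are illustrative and do not come from any library. Since the activity is to be maximized, the update follows the gradient (gradient ascent), and the input is projected back onto a ball so that it stays bounded.

```python
import math

def neuron(x, w, b):
    """Activity of a toy neuron: tanh(w . x + b)."""
    return math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + b)

def grad_wrt_input(x, w, b):
    """Gradient of the activity w.r.t. the input: (1 - a^2) * w."""
    a = neuron(x, w, b)
    return [(1.0 - a * a) * wi for wi in w]

def maximize_activation(w, b, x0, lr=0.1, steps=200, radius=1.0):
    """Gradient ascent on the input, kept inside a ball of given radius."""
    x = list(x0)
    for _ in range(steps):
        g = grad_wrt_input(x, w, b)
        x = [xi + lr * gi for xi, gi in zip(x, g)]
        norm = math.sqrt(sum(xi * xi for xi in x))
        if norm > radius:  # projection step (an assumption of this sketch)
            x = [xi * radius / norm for xi in x]
    return x

# Hypothetical weights; the starting "image" is the zero input.
w, b = [3.0, -4.0], 0.0
x_start = [0.0, 0.0]
x_best = maximize_activation(w, b, x_start)
```

For this toy neuron the optimized input aligns with the weight vector, which is the expected "preferred pattern"; in a real network the gradient would be obtained by backpropagation instead of a closed form.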

At the layer level

show what the layer actually saw from a given input (rebuild the input from the activities)
[Understanding deep image representations by inverting them; Mahendran and Vedaldi, CVPR 2015]
With CCA (Canonical Correlation Analysis), check whether the features developed in a given layer are correlated with another set of explainable features (e.g., handmade).
- variants and extensions, e.g. CKA (Centered Kernel Alignment)
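
As an illustration of the CKA variant, here is a minimal sketch of linear CKA between two activation matrices (rows = samples, columns = features), assuming the usual formula $\|Y^\top X\|_F^2 / (\|X^\top X\|_F \, \|Y^\top Y\|_F)$ on column-centered data. The function names and the toy matrices are made up for the example.

```python
import math

def center_columns(X):
    """Subtract the mean of each feature (column)."""
    n = len(X)
    means = [sum(row[j] for row in X) / n for j in range(len(X[0]))]
    return [[row[j] - means[j] for j in range(len(row))] for row in X]

def cross_gram_sq_norm(X, Y):
    """Squared Frobenius norm of X^T Y."""
    total = 0.0
    for j in range(len(X[0])):
        for k in range(len(Y[0])):
            s = sum(X[i][j] * Y[i][k] for i in range(len(X)))
            total += s * s
    return total

def linear_cka(X, Y):
    """Linear CKA similarity between two sets of features of the same samples."""
    Xc, Yc = center_columns(X), center_columns(Y)
    numerator = cross_gram_sq_norm(Xc, Yc)
    denominator = math.sqrt(cross_gram_sq_norm(Xc, Xc) * cross_gram_sq_norm(Yc, Yc))
    return numerator / denominator

# Toy activations of 4 samples in a 2-feature layer (hypothetical values).
layer_a = [[1, 0], [0, 1], [1, 1], [2, 0]]
# Same features, rotated by 90 degrees and scaled by 2.
layer_b = [[0, 2], [-2, 0], [-2, 2], [0, 4]]
# A single feature unrelated to layer_a.
layer_c = [[1], [-1], [1], [-1]]
```

Here `linear_cka(layer_a, layer_b)` is 1 up to rounding, since linear CKA is invariant to rotations and isotropic scaling of the features, while `linear_cka(layer_a, layer_c)` is much lower.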

The case of CNN

filter visualisation (first layer: easy, but next ones? deconv!)
[Feature visualization of CNN trained on ImageNet by Matthew D. Zeiler and Rob Fergus]
display which parts of the input image were actually looked at by the network and were important in the decision process
- grad-CAM
  [Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization; Selvaraju et al, ICCV 2017 / IJCV 2019]
  - consider a classification task; output: $y = (y_c)$ : probability distribution over classes
  - consider a convolutional layer (preferably, the last one): its activities $A^k_{ij}$ are indexed by pixel location $(i,j)$ and feature number $k$
  - importance of feature $k$ for class $c$:
    $\alpha^c_k = \frac{1}{\#\text{pixels}} \sum_{ij} \frac{\partial y_c}{\partial A^k_{ij}} \;\;$ (easily obtainable by averaging backpropagated quantities)
    $\implies$ kind of linearization (mapping activities to outputs)
  - heatmap: importance of pixel $(i,j)$ described by $\sum_k \alpha^c_k A^k_{ij}$
  - compute and display $\text{ReLU}(\sum_k \alpha^c_k A^k)$
- based on CAM: Class Activation Maps
  [Learning Deep Features for Discriminative Localization; Zhou et al, CVPR 2016]
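
The Grad-CAM formulas above can be sketched directly, assuming the activities $A^k_{ij}$ and the backpropagated gradients $\partial y_c / \partial A^k_{ij}$ are already available as nested lists indexed `[k][i][j]` (in practice a deep learning framework would provide them). The function name and the toy numbers are illustrative only.

```python
def grad_cam(activations, gradients):
    """Return (alphas, heatmap) for one class.

    activations[k][i][j]: activity of feature k at pixel (i, j)
    gradients[k][i][j]:   derivative of the class score w.r.t. that activity
    """
    num_features = len(activations)
    height = len(activations[0])
    width = len(activations[0][0])
    # alpha_k: average over pixels of the gradient for feature k
    alphas = [
        sum(sum(row) for row in gradients[k]) / (height * width)
        for k in range(num_features)
    ]
    # heatmap: ReLU of the alpha-weighted sum of feature maps
    heatmap = [
        [
            max(0.0, sum(alphas[k] * activations[k][i][j] for k in range(num_features)))
            for j in range(width)
        ]
        for i in range(height)
    ]
    return alphas, heatmap

# Toy example: 2 features on a 2x2 grid (hypothetical values).
activations = [[[1, 0], [0, 2]], [[0, 3], [1, 0]]]
gradients = [[[0.5, 0.5], [0.5, 0.5]], [[-1, -1], [-1, -1]]]
alphas, heatmap = grad_cam(activations, gradients)
```

In this toy case the first feature supports the class and the second one opposes it, so only the pixels where the first feature is active remain after the ReLU.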

Analyzing inputs

Which parts of the input (which input features) were responsible for this decision for that particular sample?

if few features (low-dimension input): Shapley values: quantify each feature's importance, by checking whether conditioning on it changes the distribution of outputs (over the remaining training samples)
path gradient (as in Integrated Gradients): start from a baseline (e.g., a fully black image), draw a straight line (in the input space) to the desired input sample, and integrate along that path the sensitivity of the output w.r.t. the input $\implies$ produces a heat map
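
The path-gradient attribution described above can be sketched as follows, assuming a toy differentiable model and finite-difference gradients (a real implementation would use backpropagation). The model `toy_model`, the step count and the function names are assumptions of this example. Each attribution is the input difference times the average gradient along the straight path.

```python
def numerical_gradient(f, x, eps=1e-5):
    """Central finite-difference gradient of f at x."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += eps
        x_minus[i] -= eps
        grad.append((f(x_plus) - f(x_minus)) / (2 * eps))
    return grad

def path_attributions(f, x, baseline, steps=100):
    """Integrate the gradient of f along the segment baseline -> x (midpoint rule)."""
    n = len(x)
    gradient_sum = [0.0] * n
    for s in range(steps):
        t = (s + 0.5) / steps
        point = [b + t * (xi - b) for xi, b in zip(x, baseline)]
        g = numerical_gradient(f, point)
        for i in range(n):
            gradient_sum[i] += g[i]
    return [(xi - b) * gs / steps for xi, b, gs in zip(x, baseline, gradient_sum)]

def toy_model(x):
    """Hypothetical scalar model: x0 * x1 + x2^2."""
    return x[0] * x[1] + x[2] ** 2

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
attributions = path_attributions(toy_model, x, baseline)
```

A useful sanity check is that the attributions sum to `toy_model(x) - toy_model(baseline)`, which follows from the fundamental theorem of calculus along the path.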

Which parts of the inputs, which patterns are statistically meaningful across the dataset?

CAV (Concept Activation Vectors) / ACE (Automatic Concept-based Explanations) :
- segment the input images into small parts,
- cluster them (according to a latent space representation from a reference pre-trained network such as VGG) into "concepts",
- and estimate each "concept"'s importance in terms of impact on the decisions (using the TCAV metric, based on layer activity statistics over images with/without the concept)

Which input samples are similar or influential?

input similarity: causing same output... or same activations inside the network: perceptual loss
input similarity... according to the network
for a given test example and associated prediction, which examples in the training set influence that prediction? In the sense that changing their labels, or performing extra training steps on those examples, would change the prediction on the test sample.
influence functions

At the functional level (of a network already trained)

e.g., to visualize the dynamics of a Q-learning (DQN) agent: data clustering → colors in MDP
[Graying the black box: Understanding DQNs; Tom Zahavy, Nir Ben Zrihem, Shie Mannor, ICML 2016]
Information Bottleneck: using information theory to study the flow of information in the network (from layer to layer)
[Compressing Neural Networks using the Variational Information Bottleneck; Bin Dai, Chen Zhu, David Wipf, 2018]

The case of LLMs: chain of thought

Just ask the LLM to explain its decision instead of just giving the final answer:

by showing examples of answers including explanations (for similar questions, beforehand in the prompt): [Chain-of-Thought Prompting Elicits Reasoning in Large Language Models; Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc Le, Denny Zhou, NeurIPS 2022]
by just asking to show all steps: [Large Language Models are Zero-Shot Reasoners; Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa, NeurIPS 2022]
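
Both prompting styles can be sketched as plain string construction. The helper names, the "Q:/A:" layout and the sample question are illustrative assumptions; only the trigger sentence "Let's think step by step." is taken from the zero-shot paper cited above.

```python
def few_shot_cot_prompt(examples, question):
    """Few-shot style: show worked examples (with explanations) before the new question.

    examples: list of (question, reasoning, answer) triples.
    """
    parts = []
    for q, reasoning, answer in examples:
        parts.append(f"Q: {q}\nA: {reasoning} The answer is {answer}.")
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

def zero_shot_cot_prompt(question):
    """Zero-shot style: simply ask for the steps."""
    return f"Q: {question}\nA: Let's think step by step."

# Made-up example, for illustration only.
examples = [
    (
        "A box holds 4 pens. How many pens are in 3 boxes?",
        "Each box holds 4 pens, and 3 times 4 is 12.",
        "12",
    )
]
question = "A shelf holds 5 books. How many books are on 6 shelves?"
few_shot = few_shot_cot_prompt(examples, question)
zero_shot = zero_shot_cot_prompt(question)
```

The resulting strings would then be sent to the language model of your choice; the model call itself is deliberately left out of this sketch.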

About optimization visualization

display the accuracy as a function of time, etc. (to pick a good learning rate, etc.)
project the network (seen as a function) on a low-dimensional space (e.g. 2D), e.g. using t-SNE (available in scikit-learn), in order to visualize its training as a planar curve (e.g. to see the effect of random initializations; oscillations vs clear convergence).

General visualization tools

To map any kind of (potentially high-dimensional) data (inputs, network activities, outputs...) onto a very low-dimensional space (typically, 2D or 3D) for visualization purposes:

good old PCA: best linear projection to keep as much data variance as possible
non-linear "projections", which optimize the low-dimensional coordinates directly (or a neural network mapping, in parametric variants) to preserve neighborhood or distance structure: t-SNE, UMAP, PyMDE (Minimum-Distortion Embedding)
clustering / summarizing, using LLMs as general knowledge

By sub-task design: “explainable AI”

Cf below.

NB: in the following, most is not specific to deep learning, but applicable to ML in general

II - Interpretability: societal impact and approaches

Why interpretability is important: what is at stake

Example: medical diagnosis

need for an explanation of the final score (as a diagnosis aid): why should we trust the prediction? Can we see on which elements the decision was based?
Isabelle Guyon's skin disease classification tool [patent (pdf), (html)]
- hand-crafted features are designed, and labeled by their type (are they based on: color? texture? shape? etc.);
- at test time, for each diagnosis, the importance of each feature is estimated (a bit in the spirit of Grad-CAM); then a score for each type of feature is given (e.g., this decision was based for 40% on color-type features, for 25% on shape-type features, etc.)

Example of biases in data

early computer vision dataset built for binary classification: town vs landscape pictures
- issue: town pictures taken from a car with a fixed camera location, with the car brand logo always appearing at the bottom of the picture at the same place...
disease recognition/detection from medical scans
- different hospitals for different diseases $\implies$ learn to identify the scanner parameters, not the disease itself

Societal impact: "Weapons of Math Destruction" by Cathy O'Neil

companies using black-box software (provided by other companies) for important matters (for the life of people involved), such as hiring (example in the book: waiter position), firing (example in a big school), loans (whom should the bank lend money to?)
- examples of widespread algorithms behaving arbitrarily, sometimes even stochastically
- no feedback or questioning possible (in spite of heavy consequences)
- same arbitrary algorithm used everywhere $\implies$ people trapped in an arbitrariness nightmare (no loan / hiring possible anywhere if everyone uses the same software)
- using illegal criteria, or proxy for them (e.g., living neighborhood for ethnicity)
- It was found in 2016 that COMPAS, the algorithm used for recidivism prediction (in the US), produces a much higher false positive rate for black people than for white people (and jail duration is based on it!)
self-reinforcing/predicting (self-fulfilling prophecy): police patrol optimization: go more often in ghettos $\implies$ arrest more people in ghettos $\implies$ go more in ghettos $\implies$ etc. $\implies$ focus on ghettos and forget the rest (during that time, no white-collar crime investigation)

$\implies$ crucial: feedback (from people involved), explainability, right to contest/appeal
$\implies$ think twice about the impact of your algorithms before deploying them

Be responsible and careful

"With great power comes great responsibility" (guess the source;)

machine learning tools are becoming more and more powerful
software: easy to deploy, potential great impact
choose which impact and where: finance, advertising, or humanitarian? (shortage of machine learners, so nobody will do what you refuse to do)
- Thales announced it will not produce killer robots; Google left a military drone project after an employee revolt (update, Feb. 2025: Google works for the military again); NB: France/Europe/Russia/US = the world's biggest weapon producers
- double-check that your work aims at the right target: example in France with dubious algorithmic harassment of the most fragile part of the population in the name of fighting against subsidy cheaters (1%, mostly involuntary), while it would make more sense to help the 30% who do not receive the subsidies that are meant for them
many discussions about AI ethics; in particular, Montreal declaration for responsible AI
"FAT"-ML: Fairness, Accountability, and Transparency in Machine Learning
- principles defined on fatml.org
- key concepts: Responsibility, Explainability, Accuracy, Auditability, Fairness

Interpretability by design: "Explainable AI"

By breaking the pipeline into interpretable steps
Example: image captioning

[Women also Snowboard: Overcoming Bias in Captioning Models; Lisa Anne Hendricks, Kaylee Burns, Kate Saenko, Trevor Darrell, Anna Rohrbach; FAT-ML 2018]
- pipeline: input image → regions of interest → object classification (for each region) → captioning based on objects found
- grad-CAM on mistaken caption indicates what the neural network was looking at to take its decision
- example of bias found by analysing mistakes: "man sitting in front of computer" (while it's a woman) with "man" linked to the computer, not the person sitting
[Grounding Visual Explanations; Lisa Anne Hendricks, Ronghang Hu, Trevor Darrell, Zeynep Akata; ECCV 2018]
- captioning pipeline with a criterion favoring words (object subparts) that are both discriminative (for the object class) and relevant (for the input image)
Principle:

Pipeline:

Results:

Interpretability of data: causality

Growing field of machine learning

given a set of random variables (i.e., a dataset of examples of joint realization of these variables), determine which ones depend on which ones (oriented dependency graph)
NB: causality is not correlation
eg, sometimes, A and B are correlated because they're both caused by another variable, C
[Bernhard Schölkopf's team; book], [Isabelle Guyon's team; workshop/challenge]
tutorial on causality: slides by Michele Sebag and video recordings of my course: part 1 and part 2

III - Issues related to datasets

Dataset poisoning

Possible to forge a dataset:

in each image, add some invisible noise (e.g. color of one particular pixel) extremely correlated with the label to predict
machine learning algorithms trained on that dataset will learn that obvious dependency (invisible noise / label) and nothing else
anyone who trains on it will get a model that does not generalize (to examples not in the dataset)

Variation:

not pixel noise, but other objects. For instance, in a classification task including the category 'cat', build a dataset where all cat pictures also include chairs, so that the algorithm actually learns to detect chairs, and not necessarily cats.
don't explicitly build such a dataset, but put such pictures on the web, well indexed by search engines, so that automatic dataset builders include them

Dataset contamination

To test a model, one needs a test set that no one has ever seen. Is it still possible to find one? And if so, how long before models get trained on it (on purpose or not), so that it is no longer a real test?

Also: many texts or images on the internet are now created by AI: be careful to train on real data

Fairness

Overview:

problems at stake (societal impact)
a number of different definitions
some of which are not compatible
ensuring fairness decreases accuracy
examples of algorithms

Intro

NB: unfairness might be more subtle than expected
eg: word2vec trained on Google News:

provides a "linear" embedding $f$ of words, such that, for instance, $f($Paris$) - f($London$) \,\approx\, f($France$) - f($UK$)$, $f($man$) - f($woman$) \;\,\approx\, f($king$) - f($queen$)$, etc.
but also $f($man$) - f($woman$) \;\,\approx\, f($computer programmer$) - f($homemaker$) \;\,\approx\, f($surgeon$) - f($nurse$)$...
$\implies$ de-bias...
[Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings; Tolga Bolukbasi, Kai-Wei Chang, James Zou, Venkatesh Saligrama, Adam Kalai; NIPS 2016]
explore word2vec online:
- visualize the cloud of all words, as represented by their word2vec embeddings, projected in a low-dim space
- find the best word D such that D is to C what B is to A
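
The analogy query ("find D such that D is to C what B is to A") can be sketched with cosine similarity on the vector $f(B) - f(A) + f(C)$. The 3-dimensional embeddings below are hand-made for the example and are not real word2vec vectors; the function names are also assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(embeddings, a, b, c):
    """Return the word D whose embedding is closest to f(b) - f(a) + f(c)."""
    target = [
        fb - fa + fc
        for fa, fb, fc in zip(embeddings[a], embeddings[b], embeddings[c])
    ]
    candidates = [word for word in embeddings if word not in (a, b, c)]
    return max(candidates, key=lambda word: cosine(embeddings[word], target))

# Hand-made toy embeddings: axis 0 = "person", axis 1 = "female", axis 2 = "royal".
toy_embeddings = {
    "man": [1.0, 0.0, 0.0],
    "woman": [1.0, 1.0, 0.0],
    "king": [1.0, 0.0, 1.0],
    "queen": [1.0, 1.0, 1.0],
    "apple": [0.0, 0.2, 0.1],
}
best = analogy(toy_embeddings, "man", "woman", "king")
```

With real embeddings the same query also surfaces the biased analogies mentioned above, since the procedure only follows the geometry learned from the corpus.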

Definition 1: fairness by unawareness

Simplistic version: unawareness

do not include sensitive features (such as gender, ethnicity...) in the data
matches the notion of "disparate treatment"
not sufficient: can use proxies (e.g., hair length for gender, address for ethnicity...)

Definition 2: fairness by awareness

[Fairness through awareness; Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, Rich Zemel; ITCS 2012]

relaxed notion: for each individual $x$, the prediction $f(x)$ is stochastic: distribution $D(x)$
quantify unfairness: for any pair of samples $x, x'$, require $d_D( D(x), D(x') ) \leqslant d_X(x,x')$ : distance between the distributions of outputs is less than the original distance
doesn't rely on predefined groups of people (as in next definitions) but on individuals directly
issues: which metrics $d_D$ and $d_X$ ?

$\newcommand{\hY}{\widehat{Y}}$

Definition 3: Equal opportunity / $\epsi$-fairness (group-based)

[Equality of Opportunity in Supervised Learning; Moritz Hardt, Eric Price, Nathan Srebro; NIPS 2016]

input $(X, A)$ with $A$ = sensitive attribute (gender, ethnicity...)
binary outcome variable, $Y=1$ is "success" (e.g., being hired)
binary prediction $\hY$; predicted success is thus when $\hY = 1$
the point is to ensure that chances of success ("opportunity"), for individuals deserving it, do not depend on the sensitive attribute $A$
equal opportunity: $$\forall a,a', \;\;\; p\left(\left.\hY=1\right|A=a,Y=1\right) \;=\; p\left(\left.\hY=1\right|A=a',Y=1\right) $$ i.e. $p($predicted success$\big|A=a,$ truth=success$) \;=\; p($predicted success$\big|A=a',$ truth=success$)$ for all groups $a,a'$
i.e. same chance to succeed when should succeed
NB: this definition relies on the notion of groups (people with same sensitive attribute)

$\epsi$-fairness: same but approximately:

difference $< \epsi$ : $$\left|\, p\left(\left.\hY=1\right|A=a,Y=1\right) \;-\; p\left(\left.\hY=1\right|A=a',Y=1\right)\, \right| \;<\; \epsi$$
(further details) [Empirical Risk Minimization under Fairness Constraints; Michele Donini, Luca Oneto, Shai Ben-David, John Shawe-Taylor, Massimiliano Pontil; NIPS 2018]
(variation on the loss) [Decoupled classifiers for group-fair and efficient machine learning; Cynthia Dwork, Nicole Immorlica, Adam Tauman Kalai, Max Leiserson; 1st Conference on Fairness, Accountability and Transparency 2018]
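
The equal opportunity criterion and its $\epsi$-relaxed version can be checked empirically as follows. This is a minimal sketch, assuming records given as triples $(a, y, \hat{y})$ and estimating probabilities by frequencies; the function names and the toy records are invented for the example.

```python
def true_positive_rates(records):
    """records: iterable of (a, y, y_hat). Returns {a: estimate of P(y_hat=1 | A=a, Y=1)}."""
    hits, positives = {}, {}
    for a, y, y_hat in records:
        if y == 1:
            positives[a] = positives.get(a, 0) + 1
            hits[a] = hits.get(a, 0) + (1 if y_hat == 1 else 0)
    return {a: hits[a] / positives[a] for a in positives}

def equal_opportunity_gap(records):
    """Largest difference of true positive rates between two groups."""
    rates = list(true_positive_rates(records).values())
    return max(rates) - min(rates)

def is_epsilon_fair(records, eps):
    """True if all groups have true positive rates within eps of each other."""
    return equal_opportunity_gap(records) < eps

# Toy records (group, true outcome, predicted outcome).
records = (
    [("a", 1, 1)] * 3 + [("a", 1, 0)] * 1 + [("a", 0, 0)] * 2
    + [("b", 1, 1)] * 2 + [("b", 1, 0)] * 2 + [("b", 0, 1)] * 1
)
rates = true_positive_rates(records)
gap = equal_opportunity_gap(records)
```

Only the individuals with $Y=1$ enter the computation: errors made on people with $Y=0$ are ignored by this particular definition.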

More generally: this definition is group-based: one checks that for every sub-group the distribution of errors / of outputs is the same.
Principle: probability of outcome (or success) should not depend (or not much) on the sensitive attribute

Example: study of the main commercial face classification software, tested on a grid of age/gender/etc. bins (check the performance on each subset: young white males, adult Asian women, etc.) [Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification; Joy Buolamwini, Timnit Gebru; 1st Conference on FAT, 2018]

accuracy not homogeneous at all: many more mistakes when classifying the gender of dark-skinned women
$\implies$ led to the organization of a challenge in order to perform well on all sub-categories
academics got better results than industry

Group-fairness:

Impact disparity: outputs conditioned on a subgroup (eg., gender) have different probabilities [what we want to avoid]
Treatment disparity: explicitly treat subgroups differently, to obtain impact parity [a possible way to solve the problem; whether good or not is debated]
Warning! Secondary effects might happen. Example: try to achieve both impact & treatment parity, for girls/boys admission at university. If "gender" is removed from data, but "hair length" is still there, as "hair length" is a (bad) proxy for "gender", short-haired women will be rejected and long-haired men will be accepted.
[Does mitigating ML's impact disparity require treatment disparity? Zachary C. Lipton, Alexandra Chouldechova, Julian McAuley; NIPS 2018]

$\newcommand{\hy}{\widehat{y}}$ 3 possible requirements (with the same notations as above: sensitive attribute $A$ to be independent of, prediction $\hY$, true label or value $Y$ to predict):

| Criterion | Statement | Formal condition | Interpretation |
|---|---|---|---|
| independence | $\hY$ independent of $A$ | $\forall a, a', \hy, \;\; p(\hY=\hy \mid A=a) \;=\; p(\hY=\hy \mid A=a')$ | outcome probability independent of the group / sensitive information |
| separation | $\hY$ independent of $A$ given $Y$ | $\forall a, a', y, \hy, \;\; p(\hY=\hy \mid A=a,Y=y) \;=\; p(\hY=\hy \mid A=a',Y=y)$ | $A$ doesn't influence the distribution knowing skills: Equalized odds |
| sufficiency | $Y$ independent of $A$ given $\hY$ | $\forall a, a', y, \hy, \;\; p(Y=y \mid A=a,\hY=\hy) \;=\; p(Y=y \mid A=a',\hY=\hy)$ | $A$ doesn't influence the error distribution $y \mid \hy$ |

→ variations: do not require strict equality, but |difference| $< \epsi$, or ratio of probabilities $< 1+ \epsi$

NB: these group-based definitions are in general incompatible (if $A$ and $Y$ are correlated, you can't have any 2 of these independences at once, except in degenerate cases such as a perfect predictor)
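
The three requirements can be measured on data, which also illustrates the tension between them. The sketch below assumes binary $A$, $Y$, $\hat{Y}$, records given as triples $(a, y, \hat{y})$, and probabilities estimated by frequencies; the helper names and the toy confusion counts are made up. The toy classifier has the same true and false positive rates in both groups (separation holds), but the groups have different base rates, so independence and sufficiency are violated.

```python
def rate(records, event, condition):
    """Frequency of `event` among the records satisfying `condition`."""
    selected = [r for r in records if condition(r)]
    return sum(1 for r in selected if event(r)) / len(selected)

def gap(values):
    return max(values) - min(values)

def fairness_gaps(records, groups=(0, 1)):
    """Return (independence, separation, sufficiency) gaps; 0 means the criterion holds."""
    independence = gap(
        [rate(records, lambda r: r[2] == 1, lambda r: r[0] == a) for a in groups]
    )
    separation = max(
        gap([rate(records, lambda r: r[2] == 1,
                  lambda r: r[0] == a and r[1] == y) for a in groups])
        for y in (0, 1)
    )
    sufficiency = max(
        gap([rate(records, lambda r: r[1] == 1,
                  lambda r: r[0] == a and r[2] == y_hat) for a in groups])
        for y_hat in (0, 1)
    )
    return independence, separation, sufficiency

def make_group(a, tp, fn, fp, tn):
    """Build records (a, y, y_hat) from confusion-matrix counts."""
    return ([(a, 1, 1)] * tp + [(a, 1, 0)] * fn
            + [(a, 0, 1)] * fp + [(a, 0, 0)] * tn)

# Group 0: base rate 1/3; group 1: base rate 2/3; same TPR (0.5) and FPR (0.25).
records = make_group(0, tp=2, fn=2, fp=2, tn=6) + make_group(1, tp=4, fn=4, fp=1, tn=3)
independence_gap, separation_gap, sufficiency_gap = fairness_gaps(records)
```

Since $Y$ and $\hat{Y}$ are binary, comparing the probabilities of the value 1 is enough; with more classes one would take the maximum gap over all values.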

Definition 4: Causality (Counterfactual fairness)

[Counterfactual Fairness; Matt J. Kusner, Joshua R. Loftus, Chris Russell, Ricardo Silva; NIPS 2017]

suppose we know the causality graph between attributes (e.g., variable A causes variable B, etc.)
the sensitive attributes should not influence the outcome
to check: replace the sensitive attributes with various values : does it change the outcome probabilities of the algorithm?
→ causality testing
formulation, for a binary sensitive attribute $A \in \{0,1\}$ and input data $X$: $$\forall X,a,\hy,\;\;\;\;\;\; p( \hY_{A \leftarrow 0}= \hy | X, A=a ) \;=\; p( \hY_{A \leftarrow 1}= \hy | X, A=a ) $$ where $\hY_{A \leftarrow 0}$ means "when replacing sensitive attribute $A$ with a particular value 0"
issue: $\hY_{A \leftarrow 0}$ in practice? and which causality graph? $\implies$ hot research topic

Algorithms

depend on the fairness definition (of course)
in general: enforcing fairness will decrease accuracy $\implies$ fairness/accuracy trade-off

Type 1 [before training]: pre-process data, to remove sensitive data
Type 2 [while training]: enforce fairness while optimizing
Type 3 [after training]: at post-processing: change thresholds/biases

Type 3 works well but requires the sensitive information at test time

Example of type 2 with adversarial approach:

[Turning a Blind Eye: Explicit Removal of Biases and Variation from Deep Neural Network Embeddings; Mohsan Alvi, Andrew Zisserman and Christoffer Nellaker; ECCV 2018]
consider biases in face datasets (age, gender, ethnicity)
can remove a bias when learning a network with an adversarial approach, where the adversarial network tries to recover the sensitive attribute from a middle representation layer of the main network: impossibility to retrieve the sensitive data means independence from it

or enforce (soft, relaxed) constraints explicitly.

Example of type 1 :

similar idea, using Information Bottleneck concepts:
from data $(x,a)$, build a new representation $z$ (to be used later for classification or regression, but we don't know the task yet):
map $(x,a) \mapsto z$
such that the mutual information $I(X,Z)$ is maximized while $I(A,Z)$ is minimized: i.e., keep relevant information only

Differential privacy

[NB: in French: "privacy" = "confidentialité"]

Issues regarding privacy

Why care about privacy? Isn't anonymization sufficient?
Netflix prize, 2007:

offered 1 million dollars to anyone able to increase by 10% their recommendation system performance
provided an anonymized dataset of users, with movie preferences (i.e. user names replaced)
Arvind Narayanan and Vitaly Shmatikov managed to re-identify part of the users, using IMDb (where users rate the movies they've watched)
standard process for un-anonymizing datasets: combine with other dataset(s); even if each of them is mostly uninformative, taken together, the information can be retrieved.
other example: anonymized electricity consumer dataset (including approximate location) + white pages + ... $\implies$ re-identify
yet another example: 87% of American citizens can be identified from their birth date, gender and zip code. A public release of "anonymized" medical data by the Massachusetts Group Insurance Commission in 1997, combined with the voter roll database...
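
The "combine with other datasets" process can be sketched as a join on quasi-identifiers (here birth date, gender and zip code). All records below are fabricated for illustration, and the field and function names are assumptions of this example.

```python
def link_records(anonymized, public, keys=("birth_date", "gender", "zip_code")):
    """Re-identify anonymized records matching exactly one person in a public list."""
    index = {}
    for person in public:
        signature = tuple(person[k] for k in keys)
        index.setdefault(signature, []).append(person["name"])
    reidentified = {}
    for record in anonymized:
        matches = index.get(tuple(record[k] for k in keys), [])
        if len(matches) == 1:  # unique match: the record is re-identified
            reidentified[record["record_id"]] = matches[0]
    return reidentified

# Fabricated "anonymized" medical records (names removed).
anonymized = [
    {"record_id": 17, "birth_date": "1950-03-12", "gender": "F",
     "zip_code": "02139", "diagnosis": "asthma"},
    {"record_id": 42, "birth_date": "1981-11-02", "gender": "M",
     "zip_code": "02139", "diagnosis": "flu"},
]
# Fabricated public list (e.g., a voter roll) containing names.
public = [
    {"name": "Alice Example", "birth_date": "1950-03-12", "gender": "F", "zip_code": "02139"},
    {"name": "Bob Example", "birth_date": "1981-11-02", "gender": "M", "zip_code": "02139"},
    {"name": "Carl Example", "birth_date": "1981-11-02", "gender": "M", "zip_code": "02139"},
]
reidentified = link_records(anonymized, public)
```

Record 17 is re-identified because its combination of attributes is unique in the public list, while record 42 stays ambiguous: privacy here only comes from sharing one's attributes with other people.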

Why care if no dataset sharing?
If you (e.g., Google) train an algorithm on your client database (containing private data) and provide the trained algorithm to all clients as a service: it might be possible to extract private data (of other clients) from it

example: email auto-completion, when using infrequent words
from a neural network, it is (sometimes) possible to rebuild some of the data that were used for training, such as outliers, as the network has overfitted them

Queries on a database:

arbitrary queries on a private statistical database necessarily reveal some amount of private information; the entire information content of the database can be retrieved with a surprisingly small number of random queries.
[Revealing information while preserving privacy; Kobbi Nissim and Irit Dinur; SIGMOD-SIGACT-SIGART symposium on Principles of database systems 2003]

$\epsi$-differential privacy

Formalization of the amount of noise needed to be added to query answers to keep privacy, i.e. not be able to distinguish a dataset from the same dataset + one more element : $\epsi$-differential privacy

[Calibrating noise to sensitivity in private data analysis; Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith; Conference on Theory of Cryptography, 2006]
Notations:
- algo: $A$
- dataset: $D_1$
- dataset $D_1$ + one element: $D_2$
Definition:
Algo $A$ has $(\epsi, \delta)$-privacy iff:
for all subsets $S$ of Im$(A)$, for all datasets $D_1$ and $D_2$ differing by one element only, $$ p\left( A(D_1) \in S\right) \;\; \leqslant \;\; e^\epsi\; p\left( A(D_2) \in S\right) \,+\, \delta$$ i.e. the two probabilities are very close (interesting for small $\epsi$ and $\delta$)
Variant: $\epsi$-privacy: idem with $\delta = 0$.
How to ensure $\epsi$-privacy?
- add noise to query answers
- provably (and quantifiably), the fewer individuals involved in a query, the more noise needed
$\implies$ Gödel prize in 2017
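
Adding noise to query answers can be sketched with the Laplace mechanism on a mean query. Assumptions of this sketch: values are clipped to a known range, neighboring datasets have the same size $n$ (one element replaced), so the sensitivity of the mean is $(\text{upper} - \text{lower})/n$, and the noise scale is sensitivity$/\epsi$. Function names and data are illustrative.

```python
import random

def laplace_noise(scale, rng):
    """Laplace(0, scale) sample: difference of two i.i.d. exponentials of mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noise_scale_for_mean(n, eps, lower=0.0, upper=1.0):
    """Scale of the Laplace noise for the mean of n values in [lower, upper]."""
    sensitivity = (upper - lower) / n
    return sensitivity / eps

def private_mean(values, eps, rng, lower=0.0, upper=1.0):
    """Noisy answer to the query 'mean of the values'."""
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / len(clipped)
    return true_mean + laplace_noise(noise_scale_for_mean(len(values), eps, lower, upper), rng)

rng = random.Random(0)  # fixed seed, for reproducibility of the sketch
values = [0.2, 0.4, 0.6, 0.8]
answers = [private_mean(values, eps=1.0, rng=rng) for _ in range(5000)]
average_answer = sum(answers) / len(answers)
```

The scale returned by `noise_scale_for_mean` decreases as $1/n$, which matches the remark above: a query involving few individuals needs more noise. Each answered query also consumes privacy budget, so the averaging over 5000 answers is done here only to show that the noise is unbiased.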

To go further:

Example of privacy-preserving pipeline

Example of advanced ML pipeline taking into account privacy:
[Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data; Nicolas Papernot, Martin Abadi, Ulfar Erlingsson, Ian Goodfellow, Kunal Talwar; ICLR 2017]
Keypoints:

train several classifiers on different datasets [the more classifiers, the more private the result will be, but not too many otherwise only small data left to train each classifier]
make an ensemble method, with noise [crucial for $(\epsi, \delta)$-privacy proof]
label (a small part of) a publicly-available dataset using that ensemble classifier
train another network ("student") to imitate that ensemble classifier ("teacher") on that public dataset (in a semi-supervised manner)
share the student network $\implies$ has not seen any private data!
proofs rely on the number of requests made to the ensemble classifier: so, train the student with as few labeled examples as possible (which gives a privacy bound), hence the semi-supervised training. Once the student is shared, this number of requests no longer grows, as new requests are addressed to the student, not to the private classifiers; intensive usage is thus possible without privacy drop
results: only a small accuracy drop
yet the proven privacy level $\epsi$ is not really "small": in the case of filling a form with binary questions and aggregated statistics, $\epsi = 8$ corresponds to being able to identify your actual answer with probability 0.998.
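
The noisy ensemble step can be sketched as a noisy majority vote: count the teachers' votes per class, add Laplace noise of scale $1/\gamma$ to each count, and return the class with the highest noisy count. The parameter name `gamma`, the function names and the vote counts are assumptions of this example; the teachers themselves are replaced by a fixed list of votes.

```python
import random

def laplace_noise(scale, rng):
    """Laplace(0, scale) sample: difference of two i.i.d. exponentials of mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_majority_vote(teacher_votes, num_classes, gamma, rng):
    """Label chosen by the ensemble: argmax of the vote counts plus Laplace(1/gamma) noise."""
    counts = [0] * num_classes
    for vote in teacher_votes:
        counts[vote] += 1
    noisy_counts = [c + laplace_noise(1.0 / gamma, rng) for c in counts]
    return max(range(num_classes), key=lambda k: noisy_counts[k])

rng = random.Random(0)  # fixed seed, for reproducibility of the sketch
# Hypothetical votes of 100 teachers on one public example: strong consensus on class 2.
teacher_votes = [2] * 90 + [0] * 5 + [1] * 5
label = noisy_majority_vote(teacher_votes, num_classes=3, gamma=0.5, rng=rng)
```

When the teachers largely agree, the noise almost never changes the answer, which is why a large ensemble costs little accuracy; a smaller `gamma` means more noise and stronger privacy per query.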

Federated learning

When training on sensitive data that should not be shared, for instance:

healthcare: hospital datasets (e.g., Owkin)
predictive keyboards: do not send everything typed by every user to a central server! (Google's Gboard)

Setup:

$N$ local servers (hospital, client, user...) with their own private dataset
train only one algorithm, the same on all servers
share the parameters (send parameter updates), either with a central server, or in a peer-to-peer fashion
$\implies$ no data transfer, hence privacy
$\implies$ yet parameter transfer might leak information (cf above)
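
The setup above can be sketched with a central server averaging locally trained parameters (federated averaging). This is a toy version under stated assumptions: a one-parameter linear model $y = w x$, a few local gradient steps per round, and averaging weighted by local dataset size. Function names, hyperparameters and data are made up for the example.

```python
def local_update(w, data, lr=0.1, epochs=5):
    """A few gradient steps on the local mean squared error; the data never leaves the client."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def federated_averaging(client_datasets, rounds=20, w0=0.0):
    """Server loop: broadcast w, collect locally updated parameters, average them."""
    w = w0
    total = sum(len(data) for data in client_datasets)
    for _ in range(rounds):
        local_weights = [local_update(w, data) for data in client_datasets]
        w = sum(len(data) * lw for data, lw in zip(client_datasets, local_weights)) / total
    return w

# Three hypothetical clients, each holding private (x, y) pairs generated by y = 3x.
client_datasets = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5), (1.5, 4.5), (1.0, 3.0)],
    [(2.0, 6.0)],
]
w_global = federated_averaging(client_datasets)
```

Only parameters travel between clients and server; as noted above, these parameter updates may still leak information about the local data, so this sketch provides no formal privacy guarantee by itself.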

To go further

Bias, fairness, privacy...: course by Isabelle Guyon and Kim Gerdes at Paris-Sud: slides and recordings
Analysis/explanation of a trained network, + same topics as above: course by Piotr Mardziel at CMU

