Explainable AI - Grad-CAM (Gradient-weighted Class Activation Mapping)

Grad-CAM uses the gradient information flowing into the last convolutional layer of the CNN to assign importance values to each neuron for a particular decision of interest.

To obtain the class-discriminative localization map $L^c_{Grad-CAM} \in \mathbb{R}^{u \times v}$ of width u and height v for any class c, we first compute the gradient of the score for class c, $y^c$ (before the softmax), with respect to the feature map activations $A^k$ of a convolutional layer, i.e. ${\partial y^c \over \partial A^k}$.

These gradients flowing back are global-average-pooled over the width and height dimensions (indexed by i and j respectively, with $Z$ the number of spatial positions in the feature map) to obtain the neuron importance weights $\alpha^c_k$:

$\alpha^c_k = {1\over Z}\sum_i \sum_j {\partial y^c \over \partial A^k_{ij}}$

This weight $\alpha^c_k$ represents a partial linearization of the deep network downstream from A, and captures the importance of feature map k for a target class c. We then perform a weighted combination of the forward activation maps, followed by a ReLU, to obtain:

$L^c_{Grad-CAM} = ReLU(\sum_k \alpha^c_k A^k)$
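The two steps (pooling the gradients into weights, then taking the ReLU of the weighted sum of feature maps) can be sketched in a few lines. This is only a minimal illustration on made-up numbers, using plain Python lists so it runs without any dependencies. The function name `grad_cam` and the toy values are assumptions for illustration; in a real model the activations and gradients would come from the chosen convolutional layer via an autodiff framework.

```python
def grad_cam(activations, gradients):
    """Toy Grad-CAM on nested lists.

    activations[k][i][j] is A^k at spatial position (i, j).
    gradients[k][i][j] is assumed to hold d y^c / d A^k_ij.
    """
    num_maps = len(activations)
    rows = len(activations[0])
    cols = len(activations[0][0])
    z = rows * cols  # Z: number of spatial positions per feature map

    # alpha^c_k: global average pooling of the gradients over i and j
    alphas = [sum(sum(row) for row in gradients[k]) / z for k in range(num_maps)]

    # L^c = ReLU(sum_k alpha^c_k * A^k), computed position by position
    heatmap = [
        [max(0.0, sum(alphas[k] * activations[k][i][j] for k in range(num_maps)))
         for j in range(cols)]
        for i in range(rows)
    ]
    return alphas, heatmap


# Toy input: K = 2 feature maps of size 2 x 2 (values are made up)
A = [[[1.0, 2.0], [0.0, 1.0]],
     [[2.0, 0.0], [1.0, 3.0]]]
dY_dA = [[[0.5, 0.5], [0.5, 0.5]],
         [[-1.0, 0.0], [0.0, -1.0]]]

alphas, cam = grad_cam(A, dY_dA)
print(alphas)  # [0.5, -0.5]
print(cam)     # [[0.0, 1.0], [0.0, 0.0]]
```

In this toy case the second feature map gets a negative weight, so positions where it dominates are clipped to zero by the ReLU; only the position supported by the first map stays highlighted.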

Research Question:

Which elements, objects, or parts of consumers’ posted photos have an effect on consumer dissatisfaction?

Potential Problems:

Are there enough negative reviews with photos to train a CNN-based ML model?
What’s the main mathematical model of this research?
How are we going to turn the Grad-CAM image output into variables?
- Previous research (The Power of Brand Selfies, JMR 2021) used this method only to show that, for a CNN trained to classify review sentiment, the region containing the logo and the consumer is responsible for the prediction on selfies with a brand logo.
How are we going to validate the result?
- After Grad-CAM, ask consumers what they think about the highlighted region of the image (reverse).
What if color, rather than a certain region of the image, is what matters for classification?
- Maybe showing the limitations of this method could be a good point.
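
On the question of turning the Grad-CAM output into variables, one possible option (an assumption, not something settled in these notes) is to measure what share of the heatmap's total mass falls inside a region of interest, such as a bounding box around an object. The helper `region_share`, the box coordinates, and the heatmap values below are all hypothetical; the box would have to come from manual annotation or an object detector.

```python
def region_share(heatmap, box):
    """Fraction of total Grad-CAM mass inside a box.

    box = (row0, col0, row1, col1), with end-exclusive row1 and col1.
    Returns 0.0 if the heatmap is all zeros.
    """
    row0, col0, row1, col1 = box
    total = sum(sum(row) for row in heatmap)
    if total == 0:
        return 0.0
    inside = sum(sum(row[col0:col1]) for row in heatmap[row0:row1])
    return inside / total


# Made-up 3 x 3 heatmap and a hypothetical object box covering
# rows 0-1 of the last column
cam = [[0.0, 1.0, 3.0],
       [0.0, 2.0, 2.0],
       [0.0, 0.0, 0.0]]
share = region_share(cam, (0, 2, 2, 3))
print(share)  # 0.625
```

A per-image value like this could then enter a regression as an ordinary numeric variable, although it would not capture the color concern raised above.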