CDAM is a novel post-hoc explanation method for vision transformers (ViTs) that is highly sensitive to the chosen target class and reveals evidence and counter-evidence through signed relevance scores.
To run `CDAM.ipynb` locally, we recommend creating a Python >= 3.9 virtual environment, for example with Mamba. Inside the environment, install the dependencies:

```shell
pip install -r requirements_local.txt
```

and create a Jupyter kernel:

```shell
python -m ipykernel install --user --name cdam_kernel
```
We provide a checkpoint for a classifier head trained on ImageNet. With `train_imagenet.py` you can also train the model yourself, but you have to download the ImageNet dataset first.
Our live demo is the fastest way to try out CDAM!
We introduce CDAM, a novel method for visualizing input feature relevance of ViT classifications. CDAM scales the attention by how relevant the corresponding tokens are for the model's decision. Beyond targeting classifier outputs, we create explanations for a similarity measure in the latent space of the ViT. This allows for explanations of arbitrary concepts, defined by the user through a few sample images.
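The concept-based target can be sketched as follows: the latent embeddings of a few user-provided sample images are averaged into a concept vector, and the similarity of the explained image's embedding to that vector serves as the score to be explained. All names and shapes below are illustrative stand-ins, not the repository's actual API.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 8  # toy latent dimension

# Hypothetical latent representations (e.g. the ViT's [CLS] embeddings):
# a few user-provided concept images and the image to be explained.
concept_embeddings = rng.normal(size=(3, dim))
image_embedding = rng.normal(size=dim)

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# The similarity target: closeness of the image to the mean concept embedding.
concept = concept_embeddings.mean(axis=0)
score = cosine(image_embedding, concept)
assert -1.0 <= score <= 1.0
```

Explaining this scalar `score` instead of a class logit yields relevance maps for the user-defined concept.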
CDAMs are obtained by calculating the gradients of the class or similarity score with respect to the tokens entering the final transformer block. The CDAM score for token $i$ is

$$\mathrm{CDAM}_i = \sum_k \frac{\partial S}{\partial T_{i,k}} \, T_{i,k},$$

where $S$ is the class or similarity score and $T_{i,k}$ is the $k$-th entry of token $T_i$ entering the final transformer block.
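This gradient-times-token computation can be illustrated on a toy setup where the gradient is available in closed form: a linear classifier head applied to mean-pooled tokens. The model, shapes, and names here are illustrative assumptions; the repository's actual implementation takes gradients through the ViT's final transformer block instead.

```python
import numpy as np

rng = np.random.default_rng(0)
n_tokens, dim, n_classes = 5, 8, 3

T = rng.normal(size=(n_tokens, dim))   # tokens entering the "final block" (toy stand-in)
W = rng.normal(size=(dim, n_classes))  # toy linear classifier head

def class_score(tokens, c):
    # S_c: mean-pool the tokens, then apply the linear head for class c.
    return tokens.mean(axis=0) @ W[:, c]

c = 1
# Analytic gradient of S_c w.r.t. each token entry: dS_c/dT[i, k] = W[k, c] / n_tokens
grad = np.tile(W[:, c] / n_tokens, (n_tokens, 1))

# Signed relevance per token: sum_k dS_c/dT[i, k] * T[i, k]
cdam = (grad * T).sum(axis=1)

# Because this toy score is linear in the tokens, the signed token
# relevances decompose the class score exactly.
assert np.isclose(cdam.sum(), class_score(T, c))
```

Positive entries of `cdam` mark tokens contributing evidence for the target, negative entries counter-evidence; in the real model the gradient is obtained by backpropagation rather than in closed form.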