Commit

vizu

mlelarge committed Dec 13, 2023
1 parent 0791823 commit 22cea76
Showing 2 changed files with 7 additions and 2 deletions.
9 changes: 7 additions & 2 deletions modules/12-attention.md
@@ -7,6 +7,7 @@
\toc



## Attention with RNNs

The first attention mechanism was proposed in [Neural Machine Translation by Jointly Learning to Align and Translate](https://arxiv.org/abs/1409.0473) by Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio (presented at ICLR 2015).
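A minimal PyTorch sketch of the idea (not the paper's exact architecture; the class name `AdditiveAttention` and the dimensions are illustrative): the current decoder state is scored against every encoder state, and the softmax-normalized scores weight a context vector.

```python
import torch
import torch.nn as nn

class AdditiveAttention(nn.Module):
    """Bahdanau-style additive attention (illustrative sketch)."""
    def __init__(self, enc_dim, dec_dim, attn_dim):
        super().__init__()
        self.W_enc = nn.Linear(enc_dim, attn_dim, bias=False)
        self.W_dec = nn.Linear(dec_dim, attn_dim, bias=False)
        self.v = nn.Linear(attn_dim, 1, bias=False)

    def forward(self, dec_state, enc_states):
        # dec_state: (batch, dec_dim); enc_states: (batch, seq_len, enc_dim)
        scores = self.v(torch.tanh(self.W_enc(enc_states)
                                   + self.W_dec(dec_state).unsqueeze(1)))  # (batch, seq_len, 1)
        weights = scores.softmax(dim=1)              # alignment weights over source positions
        context = (weights * enc_states).sum(dim=1)  # (batch, enc_dim) weighted sum
        return context, weights.squeeze(-1)
```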
@@ -104,11 +105,15 @@ Each of these layers is applied on each of the inputs given to the transformer block
Note that this block is equivariant: if we permute the inputs, then the outputs are permuted by the same permutation. As a result, the order of the inputs is irrelevant to the transformer block, and in particular cannot be exploited by it.
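A quick numerical check of this equivariance, using PyTorch's `nn.MultiheadAttention` as a stand-in self-attention layer (a sketch for illustration, not part of the module's code):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
attn = nn.MultiheadAttention(embed_dim=16, num_heads=2, batch_first=True).eval()
x = torch.randn(1, 5, 16)            # (batch, seq_len, embed_dim), no positional encoding
perm = torch.randperm(5)

out, _ = attn(x, x, x)                                   # self-attention on the original order
out_perm, _ = attn(x[:, perm], x[:, perm], x[:, perm])   # same inputs, permuted

# permuting the inputs permutes the outputs identically
print(torch.allclose(out[:, perm], out_perm, atol=1e-6))  # True
```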
The important notion of positional encoding allows us to take order into account: a deterministic, unique encoding of each time step is added to the input tokens.
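For instance, the sinusoidal encoding of "Attention Is All You Need" can be precomputed and added to the embeddings; a minimal sketch, assuming an even `d_model` (the function name is ours):

```python
import torch

def sinusoidal_positional_encoding(seq_len, d_model):
    """PE[t, 2i] = sin(t / 10000^(2i/d_model)), PE[t, 2i+1] = cos(t / 10000^(2i/d_model))."""
    pos = torch.arange(seq_len).unsqueeze(1).float()                  # (seq_len, 1)
    div = torch.exp(torch.arange(0, d_model, 2).float()
                    * (-torch.log(torch.tensor(10000.0)) / d_model))  # (d_model/2,)
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(pos * div)
    pe[:, 1::2] = torch.cos(pos * div)
    return pe

tokens = torch.randn(5, 32)                          # (seq_len, d_model) token embeddings
x = tokens + sinusoidal_positional_encoding(5, 32)   # order is now recoverable
```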
## LLM Visualization
Have a look at Brendan Bycroft’s beautifully crafted interactive explanation of the transformer architecture:
[![gif](/modules/extras/attention/transformer_vizu.gif)](https://bbycroft.net/llm)
## Transformers using Named Tensor Notation
In [Transformers using Named Tensor Notation](https://hackmd.io/@mlelarge/HkVlvrc8j), we derive the formal equations for the Transformer block using named tensor notation.
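The central formula there is scaled dot-product attention, $\mathrm{Attention}(Q,K,V) = \mathrm{softmax}(QK^\top/\sqrt{d_k})\,V$; as a plain-tensor sketch (not the named-tensor formulation itself):

```python
import torch

def scaled_dot_product_attention(Q, K, V):
    # Q, K: (..., seq_len, d_k); V: (..., seq_len, d_v)
    d_k = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / d_k**0.5   # (..., seq_len, seq_len)
    return scores.softmax(dim=-1) @ V             # softmax(QK^T / sqrt(d_k)) V
```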
## Hacking a simple Transformer block
Binary file added modules/extras/attention/transformer_vizu.gif
