
Encoder-Decoder Attention

Cross-attention in which decoder queries attend to encoder outputs, introduced in the original Transformer for sequence-to-sequence tasks; now standard in most modern encoder-decoder models.
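The mechanism can be sketched as scaled dot-product attention where the queries come from decoder states and the keys and values come from encoder outputs. A minimal NumPy sketch, omitting the learned Q/K/V projection matrices and multi-head splitting used in the actual Transformer (the names `decoder_states` and `encoder_outputs` are illustrative):

```python
import numpy as np

def cross_attention(decoder_states, encoder_outputs):
    """Scaled dot-product cross-attention.

    Queries come from the decoder; keys and values come from the
    encoder outputs. Learned projections are omitted for brevity.
    """
    Q = decoder_states            # shape (T_dec, d)
    K = V = encoder_outputs       # shape (T_enc, d)
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # (T_dec, T_enc)
    # softmax over encoder positions, numerically stabilized
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                              # (T_dec, d)

dec = np.random.randn(3, 8)   # 3 decoder positions, model dim 8
enc = np.random.randn(5, 8)   # 5 encoder positions, same model dim
out = cross_attention(dec, enc)
print(out.shape)              # (3, 8): one context vector per decoder position
```

Each decoder position thus receives a weighted mixture of encoder representations, which is how source-sequence information flows into the decoder at every layer.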


Last changed by zetl · stable 5d · history
