Transformer Architecture
The transformer architecture is the neural network design underlying virtually all modern large language models. Its central feature is the attention mechanism, which lets every position in a sequence draw directly on every other position, so the model can capture relationships between words that are far apart in a text.
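To make the attention idea concrete, here is a minimal sketch of scaled dot-product attention in plain Python. It is an illustration under simplifying assumptions, not a full transformer layer: the learned query, key, and value projections, multiple heads, and positional information are all omitted, and the function names and toy embeddings are hypothetical.

```python
import math


def softmax(scores):
    """Turn a list of raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors.

    Each query is compared with every key, so a position can draw on
    any other position no matter how far away it is in the sequence.
    """
    d_k = len(keys[0])
    width = len(values[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [dot(q, k) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        mixed = [sum(w * v[i] for w, v in zip(weights, values))
                 for i in range(width)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical toy embeddings for a 5-token sequence (values are made up).
# The first and last tokens share the same embedding.
tokens = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
]

# Self-attention: the sequence attends to itself.
outputs, weights = attention(tokens, tokens, tokens)
```

In this toy setup the first token gives the distant last token exactly the same weight as it gives itself, and more weight than it gives its immediate neighbor, because the weights depend on content similarity and not on distance.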