{"cells": [{"cell_type": "markdown", "id": "4b91d552", "metadata": {"papermill": {"duration": 0.014646, "end_time": "2025-04-03T19:15:27.449581", "exception": false, "start_time": "2025-04-03T19:15:27.434935", "status": "completed"}, "tags": []}, "source": ["\n", "# Tutorial 5: Transformers and Multi-Head Attention\n", "\n", "* **Author:** Phillip Lippe\n", "* **License:** CC BY-SA\n", "* **Generated:** 2025-04-03T19:15:20.711095\n", "\n", "In this tutorial, we will discuss one of the most impactful architectures of the last 2 years: the Transformer model.\n", "Since the paper Attention Is All You Need by Vaswani et al. had been published in 2017,\n", "the Transformer architecture has continued to beat benchmarks in many domains, most importantly in Natural Language Processing.\n", "Transformers with an incredible amount of parameters can generate long, convincing essays, and opened up new application fields of AI.\n", "As the hype of the Transformer architecture seems not to come to an end in the next years,\n", "it is important to understand how it works, and have implemented it yourself, which we will do in this notebook.\n", "This notebook is part of a lecture series on Deep Learning at the University of Amsterdam.\n", "The full list of tutorials can be found at https://uvadlc-notebooks.rtfd.io.\n", "\n", "\n", "---\n", "Open in [{height=\"20px\" width=\"117px\"}](https://colab.research.google.com/github/PytorchLightning/lightning-tutorials/blob/publication/.notebooks/course_UvA-DL/05-transformers-and-MH-attention.ipynb)\n", "\n", "Give us a \u2b50 [on Github](https://www.github.com/Lightning-AI/lightning/)\n", "| Check out [the documentation](https://lightning.ai/docs/)\n", "| Join us [on Discord](https://discord.com/invite/tfXFetEZxv)"]}, {"cell_type": "markdown", "id": "228a503d", "metadata": {"papermill": {"duration": 0.012648, "end_time": "2025-04-03T19:15:27.476021", "exception": false, "start_time": "2025-04-03T19:15:27.463373", "status": "completed"}, "tags": []}, "source": ["## Setup\n", "This notebook requires some packages besides pytorch-lightning."]}, {"cell_type": "code", "execution_count": 1, "id": "9b89aa0e", "metadata": {"colab": {}, "colab_type": "code", "execution": {"iopub.execute_input": "2025-04-03T19:15:27.502545Z", "iopub.status.busy": "2025-04-03T19:15:27.502111Z", "iopub.status.idle": "2025-04-03T19:15:28.687912Z", "shell.execute_reply": "2025-04-03T19:15:28.686615Z"}, "id": "LfrJLKPFyhsK", "lines_to_next_cell": 0, "papermill": {"duration": 1.201829, "end_time": "2025-04-03T19:15:28.690411", "exception": false, "start_time": "2025-04-03T19:15:27.488582", "status": "completed"}, "tags": []}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["\u001b[33mWARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable.It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv. 
"source": ["! pip install --quiet \"pytorch-lightning >=2.0,<2.6\" \"torchvision\" \"torch >=1.8.1,<2.7\" \"seaborn\" \"torchmetrics >=1.0,<1.8\" \"numpy <3.0\" \"matplotlib\""]}, {"cell_type": "markdown", "id": "1690e492", "metadata": {"papermill": {"duration": 0.020319, "end_time": "2025-04-03T19:15:28.732020", "exception": false, "start_time": "2025-04-03T19:15:28.711701", "status": "completed"}, "tags": []}, "source": ["