id
type
5 (blog/news article)
status
21 (imported old-v2, waiting for another import)
review version
0
cleanup version
0
pending deletion
0 (-)
created at
2025-09-16 22:58:59
updated at
2026-01-10 23:43:51
url
https://nlp.seas.harvard.edu/2018/04/03/attention.html
url length
54
url crc
48805
url crc32
3458383525
location type
1 (url matches target location, page_location is empty)
canonical status
30 (canonical url is different, page_canonical_page_id points to it)
canonical page id
domain id
domain tld
2295
domain parts
0
originating warc id
-
originating url
https://data.commoncrawl.org/crawl-data/CC-MAIN-2025-33/segments/1754151576670.96/warc/CC-MAIN-20250814162913-20250814192913-00994.warc.gz
source type
11 (CommonCrawl)
server ip
Publication date
2025-07-18 05:23:33
Fetch attempts
0
Original html size
193628
Normalized and saved size
190330
title
The Annotated Transformer
excerpt
content
There is now a new version of this blog post updated for modern PyTorch. The Transformer from “Attention is All You Need” has been on a lot of people’s minds over the last year. Besides producing major improvements in translation quality, it provides a new architecture for many other NLP tasks. The paper itself is very clearly written, but the conventional wisdom has been that it is quite difficult to implement correctly. In this post I present an “annotated” version of the paper in the form of a line-by-line implementation. I have reordered and deleted some sections from the original paper and added comments throughout. This document itself is a working notebook, and should be a completely usable implementation. In total there are 400 lines of library code which can process 27,000 tokens per second on 4 GPUs. To follow along you will first need to install PyTorch. The complete notebook...
author
updated
1768944998
block type
0
extracted fields
104
extracted bits
title
full content
content was extracted heuristically
detected location
0
detected language
1 (English)
category id
-
index version
1
paywall score
0
spam phrases
0
text nonlatin
0
text cyrillic
0
text characters
33565
text words
7428
text unique words
1339
text lines
1
text sentences
203
text paragraphs
1
text words per sentence
36
text matched phrases
0
text matched dictionaries
0
links self subdomains
0
links other subdomains
0
links other domains
4
links spam adult
0
links spam random
0
links spam expired
0
links ext activities
14
links ext ecommerce
0
links ext finance
0
links ext crypto
0
links ext booking
0
links ext news
0
links ext leaks
0
links ext ugc
9
links ext klim
0
links ext generic
0
image author
featured image