Main

type

5 (blog/news article)

status

21 (imported old-v2, waiting for another import)

review version

0

cleanup version

0

pending deletion

0 (-)

created at

2025-10-21 16:54:22

updated at

2025-10-21 16:54:22

Address

url

https://adriencotton.com/blog/5/

url length

32

url crc

62370

url crc32

14152610

location type

1 (url matches target location, page_location is empty)

canonical status

30 (canonical url is different, page_canonical_page_id points to it)

canonical page id

2840219587

Source

domain id

40068030

domain tld

2211

domain parts

0

originating warc id

-

originating url

https://data.commoncrawl.org/crawl-data/CC-MAIN-2025-33/segments/1754151280906.43/warc/CC-MAIN-20250812014435-20250812044435-00617.warc.gz

source type

11 (CommonCrawl)

Server response

server ip

104.219.248.5

Publication date

2025-08-12 03:01:11

Fetch attempts

0

Original html size

100889

Normalized and saved size

76373

Content

title

Blog - Adrien Cotton

excerpt

content

author

updated

1762766676

Text analysis

block type

0

extracted fields

139

extracted bits

featured image
image author
title
OpenGraph suggests this is an article

detected location

0

detected language

1 (English)

category id

Sex and relationships (94)

index version

2025110801

paywall score

0

spam phrases

0

Text statistics

text nonlatin

0

text cyrillic

0

text characters

4561

text words

932

text unique words

309

text lines

177

text sentences

28

text paragraphs

14

text words per sentence

33

text matched phrases

12

text matched dictionaries

4
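
Consistency check

A few fields in this record look derived from others, so they can be recomputed as a sanity check. The sketch below is a minimal illustration, not the crawler's own code: the variable names are hypothetical, and it assumes that url length counts characters, that words per sentence is the word count divided by the sentence count with the fraction dropped, and that the content "updated" value is a Unix timestamp in seconds.

```python
from datetime import datetime, timezone

# Values copied from the record above.
url = "https://adriencotton.com/blog/5/"
text_words = 932
text_sentences = 28
updated_epoch = 1762766676  # assumed to be Unix seconds

# url length: assumed to be a plain character count of the url field.
url_length = len(url)

# text words per sentence: assumed to be integer division
# (932 / 28 is about 33.29, so flooring and rounding agree here).
words_per_sentence = text_words // text_sentences

# updated: interpreted as UTC for display purposes only.
updated_utc = datetime.fromtimestamp(updated_epoch, tz=timezone.utc)

print(url_length)          # 32, matches "url length"
print(words_per_sentence)  # 33, matches "text words per sentence"
print(updated_utc.isoformat())
```

The url crc and url crc32 fields are left out on purpose: the record does not say which checksum variant or input normalization produced them.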