Main

type

5 (blog/news article)

status

21 (imported old-v2, waiting for another import)

review version

0

cleanup version

0

pending deletion

0 (-)

created at

2025-10-14 04:16:24

updated at

2025-10-14 04:16:25

Address

url

https://blog.stuajnht.co.uk/2018/03/

url length

36

url crc

57539

url crc32

1483792579
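
The three URL fields above appear to be related: the URL is 36 characters long, and the stored url crc equals the low 16 bits of the stored url crc32. A minimal Python sketch of that check follows. The function names are hypothetical, and using `zlib.crc32` over the UTF-8 bytes is only an assumption about how the checksum was produced. The record does not say which CRC variant the importer uses, so the recomputed value may differ from the stored one.

```python
import zlib

URL = "https://blog.stuajnht.co.uk/2018/03/"
STORED_CRC32 = 1483792579  # "url crc32" value from the record
STORED_CRC16 = 57539       # "url crc" value from the record


def low16(value: int) -> int:
    """Return the low 16 bits of an integer."""
    return value & 0xFFFF


def url_crc32(url: str) -> int:
    """CRC-32 of the UTF-8 bytes (assumed variant, not confirmed by the record)."""
    return zlib.crc32(url.encode("utf-8")) & 0xFFFFFFFF


url_length = len(URL)
# True when the short checksum is just the low half of the 32-bit one.
crc_matches_low_bits = low16(STORED_CRC32) == STORED_CRC16
# Recomputed for comparison only; it may or may not equal STORED_CRC32.
recomputed = url_crc32(URL)
```

Keeping the 16-bit value as a truncation of the 32-bit one would let both fields come from a single hash pass, though that is a guess about the design rather than something the record states.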

location type

1 (url matches target location, page_location is empty)

canonical status

2 (missing canonical tag in html)

canonical page id

-

Source

domain id

516293276

domain tld

826

domain parts

0

originating warc id

-

originating url

https://data.commoncrawl.org/crawl-data/CC-MAIN-2025-33/segments/1754151281008.23/warc/CC-MAIN-20250812234112-20250813024112-00886.warc.gz

source type

11 (CommonCrawl)

Server response

server ip

172.67.221.25

Publication date

2025-08-13 01:21:03

Fetch attempts

0

Original html size

27734

Normalized and saved size

6739

Content

title

March 2018 – Stuajnht's Blog

excerpt

content

I’ve got to be honest, I haven’t been involved in the CyberCenturion IV competition as much as I was with the 3rd one. After watching the teams compete during the first competition round, during the second round the school closed early due to snow (I was involved with sending out all of the messages, and […]

author

updated

1763060294

Text analysis

block type

0

extracted fields

104

extracted bits

title
full content
content was extracted heuristically

detected location

0

detected language

1 (English)

category id

Other (16)

index version

2025110801

paywall score

0

spam phrases

0

Text statistics

text nonlatin

0

text cyrillic

0

text characters

244

text words

55

text unique words

39

text lines

1

text sentences

2

text paragraphs

1

text words per sentence

27
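
The value 27 is consistent with dividing the 55 words by the 2 sentences and dropping the fraction (55 / 2 = 27.5). A short sketch under that assumption follows. The function name and the zero-sentence convention are hypothetical, and the pipeline may use a different rounding rule.

```python
def words_per_sentence(words: int, sentences: int) -> int:
    """Integer average of words per sentence, truncating the fraction.

    Returns 0 when there are no sentences (an assumed convention).
    """
    if sentences <= 0:
        return 0
    return words // sentences


# Values taken from the "text words" and "text sentences" fields.
average = words_per_sentence(55, 2)
```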

text matched phrases

0

text matched dictionaries

0