id
type
0 (not classified)
status
10 (page successfully fetched)
review version
0
cleanup version
0
pending deletion
0 (-)
created at
2025-10-13 22:13:48
updated at
2026-02-25 18:26:18
url
https://jard.pl/polityka-prywatnosci/
url length
37
url crc
15619
url crc32
3632020739
location type
1 (url matches target location, page_location is empty)
canonical status
2 (missing canonical tag in html)
canonical page id
-
domain id
domain tld
616
domain parts
0
originating warc id
-
originating url
https://data.commoncrawl.org/crawl-data/CC-MAIN-2025-33/segments/1754151281020.56/warc/CC-MAIN-20250813024931-20250813054931-00074.warc.gz
source type
11 (CommonCrawl)
server ip
publication date
2025-06-25 07:41:24
fetch attempts
0
original html size
46491
normalized and saved size
20003
block type
0
extracted fields
136
extracted bits
title; OpenGraph suggests this is an article
detected location
0
detected language
121 (Polish)
category id
index version
1
paywall score
0
spam phrases
0
text nonlatin
0
text cyrillic
0
text characters
6581
text words
1039
text unique words
542
text lines
58
text sentences
48
text paragraphs
18
text words per sentence
21
text matched phrases
20
text matched dictionaries
5