id
type
0 (not classified)
status
10 (page successfully fetched)
review version
0
cleanup version
0
pending deletion
0 (-)
created at
2025-10-13 15:37:07
updated at
2026-02-25 16:00:25
url
http://jard.pl/polityka-prywatnosci/
url length
36
url crc
60971
url crc32
3186290219
location type
1 (url matches target location, page_location is empty)
canonical status
2 (missing canonical tag in html)
canonical page id
-
domain id
domain tld
616
domain parts
0
originating warc id
-
originating url
https://data.commoncrawl.org/crawl-data/CC-MAIN-2025-33/segments/1754151281020.56/warc/CC-MAIN-20250813024931-20250813054931-00313.warc.gz
source type
11 (CommonCrawl)
server ip
publication date
2025-06-25 07:10:13
fetch attempts
0
original html size
46475
normalized and saved size
19996
block type
0
extracted fields
136
extracted bits
title
OpenGraph suggests this is an article
detected location
0
detected language
121 (Polish)
category id
index version
2025110801
paywall score
0
spam phrases
0
text nonlatin
0
text cyrillic
0
text characters
6581
text words
1039
text unique words
542
text lines
58
text sentences
48
text paragraphs
18
text words per sentence
21
text matched phrases
20
text matched dictionaries
5