id
type
0 (not classified)
status
21 (imported old-v2, waiting for another import)
review version
0
cleanup version
0
pending deletion
0 (-)
created at
2025-11-01 06:40:01
updated at
2025-11-01 06:40:02
url
https://12gwiazd.pl/polityka-cookies/
url length
37
url crc
42074
url crc32
3737494618
location type
1 (url matches target location, page_location is empty)
canonical status
10 (verified canonical url)
canonical page id
domain id
domain tld
616
domain parts
0
originating warc id
-
originating url
https://data.commoncrawl.org/crawl-data/CC-MAIN-2025-33/segments/1754151280144.81/warc/CC-MAIN-20250810120801-20250810150801-00862.warc.gz
source type
11 (CommonCrawl)
server ip
Publication date
2025-08-10 13:48:46
Fetch attempts
0
Original HTML size
341419
Normalized and saved size
29100
title
Polityka cookies - Fundacja filmowa 12 Gwiazd
excerpt
content
author
updated
1762337255
block type
0
extracted fields
136
extracted bits
title
OpenGraph suggests this is an article
detected location
0
detected language
121 (Polish)
category id
index version
2025110801
paywall score
0
spam phrases
0
text nonlatin
0
text cyrillic
0
text characters
8009
text words
1203
text unique words
597
text lines
79
text sentences
64
text paragraphs
19
text words per sentence
18
text matched phrases
20
text matched dictionaries
4
image author
featured image