id
type
0 (not classified)
status
21 (imported old-v2, waiting for another import)
review version
0
cleanup version
0
pending deletion
0 (-)
created at
2025-10-20 16:21:11
updated at
2025-10-20 16:21:12
url
https://arbookslibrary.pl/index.php?route=common%2Fhome
url length
55
url crc
3606
url crc32
4152823318
location type
1 (url matches target location, page_location is empty)
canonical status
30 (canonical url is different, page_canonical_page_id points to it)
canonical page id
domain id
domain tld
616
domain parts
0
originating warc id
-
originating url
https://data.commoncrawl.org/crawl-data/CC-MAIN-2025-33/segments/1754151280914.63/warc/CC-MAIN-20250812045045-20250812075045-00517.warc.gz
source type
11 (CommonCrawl)
server ip
publication date
2025-08-12 05:10:32
fetch attempts
0
original html size
44325
normalized and saved size
34124
title
AR Books LibrARy webshop
excerpt
content
Our publications Organizm człowieka 59,99 zł Details Chemia na co dzień 59,99 zł Details Magiczna geometria 59,99 zł Details Ssaki w lesie 59,99 zł Details Zwierzęta wokół domu ...
author
updated
1763187174
block type
0
extracted fields
104
extracted bits
title
full content
content was extracted heuristically
detected location
0
detected language
121 (Polish)
category id
index version
2025110801
paywall score
0
spam phrases
0
text nonlatin
0
text cyrillic
0
text characters
3226
text words
554
text unique words
315
text lines
1
text sentences
33
text paragraphs
1
text words per sentence
16
text matched phrases
22
text matched dictionaries
8