id
type
5 (blog/news article)
status
21 (imported old-v2, waiting for another import)
review version
1
cleanup version
2
pending deletion
0 (-)
created at
2023-12-03 18:15:13
updated at
2025-12-23 21:11:34
url
https://amax.pl/aktualnosci/
url length
28
url crc
41698
url crc32
3191251682
location type
1 (url matches target location, page_location is empty)
canonical status
10 (verified canonical url)
canonical page id
domain id
domain tld
616
domain parts
2
originating warc id
-
originating url
https://amax.pl/post-sitemap.xml
source type
1 (sitemap)
server ip
publication date
2025-08-04 18:51:31
fetch attempts
0
original HTML size
102634
normalized and saved size
61734
title
Aktualności - AMAX
excerpt
content
author
updated
1766894284
block type
0
extracted fields
136
extracted bits
title
OpenGraph suggests this is an article
detected location
106
detected language
121 (Polish)
category id
Spam (233)
index version
2025123101
paywall score
0
spam phrases
4
text nonlatin
0
text cyrillic
0
text characters
2054
text words
354
text unique words
216
text lines
53
text sentences
36
text paragraphs
4
text words per sentence
9
text matched phrases
4
text matched dictionaries
1
image author
featured image