id
type
0
status
20
review version
1
cleanup version
2
pending deletion
0
created at
2023-08-17 09:21:43
updated at
2024-01-14 23:30:12
url
https://sportdziennik.com/latest-news/
url length
38
url crc
32638
url crc32
2904457086
location type
1
canonical status
10
canonical page id
domain id
domain tld
2211
domain parts
2
originating warc id
-
originating url
https://sportdziennik.com/sitemap-1.xml
source type
1
server ip
-
pubdate
2023-09-01 05:36:43
attempts
0
size orig
0
size saved
21127041
block type
0
extracted fields
8
extracted bits
-
detected location
0
detected language
126 (language undetectable (empty document, too short, or engines disagree))
category id
Spam (233)
index version
2025030501
paywall score
0
spam phrases
2966
links self subdomains
0
links other subdomains
255
links other domains
255
links spam adult
255
links spam random
235
links spam expired
26
links ext activities
78
links ext ecommerce
20
links ext finance
9
links ext crypto
50
links ext booking
0
links ext news
13
links ext leaks
57
links ext ugc
255
links ext klim
0
links ext generic
150