id
type
0 (not classified)
status
20 (imported old-v1, waiting for another import)
review version
0
cleanup version
0
pending deletion
0 (-)
created at
2025-01-25 15:05:07
updated at
2025-04-30 14:40:16
url
https://p.dw.com/p/4pcSw
url length
24
url crc
38685
url crc32
381785885
location type
4 (page_location points to new url in different domain)
canonical status
21 (canonical url is different, but detected as spam)
canonical page id
-
location
https://www.dw.com/fa-ir/%DA%98%D9%86%D8%B1%D8%A7%D9%84-%D8%A7%D8%B1%D9%88%D9%BE%D8%A7%DB%8C%DB%8C-%D8%AE%D9%88%D8%A7%D8%B3%D8%AA%D8%A7%D8%B1-%D8%A7%D8%B3%D8%AA%D9%82%D8%B1%D8%A7%D8%B1-%D8%B3%D8%B1%D8%A8%D8%A7%D8%B2%D8%A7%D9%86-%D8%A7%D8%AA%D8%AD%D8%A7%D8%AF%DB%8C%D9%87-%D8%A7%D8%B1%D9%88%D9%BE%D8%A7-%D8%AF%D8%B1-%DA%AF%D8%B1%DB%8C%D9%86%D9%84%D9%86%D8%AF-%D8%B4%D8%AF/a-71407938
domain id
domain tld
0
domain parts
0
originating warc id
-
originating url
https://p.dw.com/
source type
22 (Telegram)
server ip
-
publication date
2025-04-30 14:40:16
fetch attempts
0
original html size
121252
normalized and saved size
43208
block type
0
extracted fields
0
extracted bits
-
detected location
0
detected language
126 (language undetectable (empty document, too short, or engines disagree))
category id
Other (16)
index version
2025061201
paywall score
0
spam phrases
0
links self subdomains
0
links other subdomains
0
links other domains
3
links spam adult
0
links spam random
0
links spam expired
0
links ext activities
0
links ext ecommerce
0
links ext finance
0
links ext crypto
0
links ext booking
0
links ext news
27
links ext leaks
2
links ext ugc
6
links ext klim
0
links ext generic
0