Main

type

5 (blog/news article)

status

10 (page successfully fetched)

review version

1

cleanup version

2

pending deletion

0 (-)

created at

2023-12-09 05:11:34

updated at

2026-02-27 06:45:33

Address

url

https://widzew.com/aktualnosci/

url length

31

url crc

54647

url crc32

2556876151

location type

1 (url matches target location, page_location is empty)

canonical status

2 (missing canonical tag in html)

canonical page id

-
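
Note on the url fields: the url length above can be reproduced by counting the characters of the url. The sketch below also computes a checksum with Python's `zlib.crc32` over the UTF-8 bytes. That choice is an assumption: the record does not say which CRC variant or byte encoding produced `url crc32` (or the shorter `url crc`), so the computed value may not match the stored one.

```python
import zlib

# URL exactly as stored in the "url" field of this record.
url = "https://widzew.com/aktualnosci/"

# Character count; for this record it agrees with "url length" (31).
url_length = len(url)

# Assumption: standard CRC-32 (zlib) over UTF-8 bytes, masked to unsigned 32 bits.
# The record does not confirm that this is the variant used by the crawler.
url_crc32 = zlib.crc32(url.encode("utf-8")) & 0xFFFFFFFF

print(url_length, url_crc32)
```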

Source

domain id

61528673

domain tld

2211

domain parts

2

originating warc id

-

originating url

https://widzew.com/sitemap.xml

source type

11 (CommonCrawl)

Server response

server ip

176.31.33.142

Publication date

2025-06-24 18:56:35

Fetch attempts

0

Original html size

224519

Normalized and saved size

64874

Text analysis

block type

0

extracted fields

105

extracted bits

featured image
title
full content
content was extracted heuristically

detected location

113

detected language

121 (Polish)

category id

Piłka nożna (74)

index version

2025123101

paywall score

0

spam phrases

1

Text statistics

text nonlatin

0

text cyrillic

0

text characters

2187

text words

401

text unique words

193

text lines

1

text sentences

22

text paragraphs

1

text words per sentence

18

text matched phrases

10

text matched dictionaries

4
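
Note on the text statistics: the words-per-sentence figure is consistent with dividing the word count by the sentence count. The sketch below recomputes it from the values in this record. Whether the pipeline rounds or truncates is not stated, so integer division is an assumption here; for these values both approaches give the same result. The `unique_ratio` name is hypothetical and is not a field of the record.

```python
# Values copied from the "Text statistics" fields of this record.
text_words = 401
text_unique_words = 193
text_sentences = 22

# Assumption: integer division; rounding 401 / 22 to the nearest integer also gives 18.
words_per_sentence = text_words // text_sentences

# Hypothetical derived metric (not in the record): share of distinct words.
unique_ratio = text_unique_words / text_words

print(words_per_sentence, f"{unique_ratio:.3f}")
```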