Main

type: 0
status: 21
review version: 0
cleanup version: 0
pending deletion: 0
created at: 2025-11-03 12:55:35
updated at: 2025-11-03 12:55:36

Address

url: https://8a.pl/regulamin-newsletter
url length: 34
url crc: 51861
url crc32: 1131268757
location type: 1
canonical status: 10
canonical page id: 2895333542

Source

domain id: 199612566
domain tld: 616
domain parts: 0
originating warc id: -
originating url: https://data.commoncrawl.org/crawl-data/CC-MAIN-2025-33/segments/1754151280119.22/warc/CC-MAIN-20250810024619-20250810054619-00712.warc.gz
source type: 11

Server response

server ip: 104.26.14.185
pubdate: 2025-08-10 04:47:22
attempts: 0
size orig: 237354
size saved: 96303

Content

page id: 2895333542
title: Regulamin newsletter
excerpt:
content:
author:
updated: 1767197877

Text analysis

block type: 0
extracted fields: 8
extracted bits: title
detected location: 159
detected language: 121 (Polish)
category id: Moda (84)
index version: 2025123101
paywall score: 0
spam phrases: 0

Text statistics

text nonlatin: 0
text cyrillic: 0
text characters: 21574
text words: 3282
text unique words: 1180
text lines: 636
text sentences: 76
text paragraphs: 35
text words per sentence: 43
text matched phrases: 184
text matched dictionaries: 14
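
Several fields in this record appear to be derived from others, so they can be cross-checked. The sketch below is only an illustration under stated assumptions: the dictionary keys are hypothetical names, "text words per sentence" is assumed to be an integer division of words by sentences, "updated" is assumed to be a Unix timestamp in UTC, and "index version" is assumed to begin with a YYYYMMDD date. None of these rules is confirmed by the record itself.

```python
from datetime import datetime, timezone

# Values copied from the record above; the key names are hypothetical.
record = {
    "url": "https://8a.pl/regulamin-newsletter",
    "url_length": 34,
    "text_words": 3282,
    "text_sentences": 76,
    "text_words_per_sentence": 43,
    "updated": 1767197877,        # assumed to be Unix seconds, UTC
    "index_version": 2025123101,  # assumed to start with YYYYMMDD
}


def consistency_checks(rec):
    """Return a mapping of check name to True/False for the derived fields."""
    # Convert the timestamp to a UTC calendar day in YYYYMMDD form.
    updated_day = datetime.fromtimestamp(
        rec["updated"], tz=timezone.utc
    ).strftime("%Y%m%d")
    return {
        # The stored length should equal the character count of the URL.
        "url_length": len(rec["url"]) == rec["url_length"],
        # Assumption: the average is truncated, not rounded.
        "words_per_sentence": (
            rec["text_words"] // rec["text_sentences"]
            == rec["text_words_per_sentence"]
        ),
        # Assumption: the index version is prefixed with the update day.
        "index_version_day": str(rec["index_version"]).startswith(updated_day),
    }


if __name__ == "__main__":
    for name, ok in consistency_checks(record).items():
        print(f"{name}: {'ok' if ok else 'mismatch'}")
```

A mismatch under these assumptions would not necessarily mean the record is wrong; it may only mean the assumed derivation rule differs from the one the indexer uses.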