Main

type

5 (blog/news article)

status

21 (imported old-v2, waiting for another import)

review version

0

cleanup version

0

pending deletion

0 (-)

created at

2025-06-06 16:22:59

updated at

2025-11-20 04:34:14

Address

url

https://www.bbc.com/news/live/cp8vyy35g3mt

url length

42

url crc

47541

url crc32

178895285

location type

1 (url matches target location, page_location is empty)

canonical status

10 (verified canonical url)

canonical page id

-

Source

domain id

74498681

domain tld

0

domain parts

0

originating warc id

-

originating url

https://www.bbc.com/news/live/cp8vyy35g3mt?page=2

source type

10 (canonical url)

Server response

server ip

151.101.128.81

Publication date

2025-11-20 04:34:14

Fetch attempts

1

Original html size

438375

Normalized and saved size

125907

Content

title

excerpt

content

author

updated

1764065325

Text analysis

block type

0

extracted fields

0

extracted bits

detected location

0

detected language

1 (English)

index version

2025110801

paywall score

0

spam phrases

0

Text statistics

text nonlatin

0

text cyrillic

0

text characters

11106

text words

2366

text unique words

738

text lines

302

text sentences

76

text paragraphs

53

text words per sentence

31

text matched phrases

46

text matched dictionaries

8
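
Two of the fields above look derivable from other fields in the same record. The sketch below is a consistency check only, assuming that "url length" is the character count of "url" and that "text words per sentence" is "text words" divided by "text sentences" with the fraction dropped. Both definitions are assumptions inferred from the values shown, not a documented part of the indexing pipeline, and the variable names are illustrative.

```python
# Values copied from the record above.
url = "https://www.bbc.com/news/live/cp8vyy35g3mt"
text_words = 2366
text_sentences = 76

# Assumption: "url length" is the number of characters in the url field.
url_length = len(url)

# Assumption: "text words per sentence" is an integer quotient
# (fraction dropped); ordinary rounding gives the same result here.
words_per_sentence = text_words // text_sentences

print(url_length, words_per_sentence)  # prints: 42 31
```

Both results agree with the stored values (42 and 31). The "url crc" and "url crc32" fields are not checked, because the record does not say which checksum variant or input encoding produced them.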