id
type
5 (blog/news article)
status
21 (imported old-v2, waiting for another import)
review version
0
cleanup version
0
pending deletion
0 (-)
created at
2025-06-06 16:22:59
updated at
2025-11-20 04:34:14
url
https://www.bbc.com/news/live/cp8vyy35g3mt
url length
42
url crc
47541
url crc32
178895285
location type
1 (url matches target location, page_location is empty)
canonical status
10 (verified canonical url)
canonical page id
-
domain id
domain tld
0
domain parts
0
originating warc id
-
originating url
https://www.bbc.com/news/live/cp8vyy35g3mt?page=2
source type
10 (canonical url)
server ip
Publication date
2025-11-20 04:34:14
Fetch attempts
1
Original html size
438375
Normalized and saved size
125907
title
excerpt
content
author
updated
1764065325
block type
0
extracted fields
0
extracted bits
-
detected location
0
detected language
1 (English)
category id
index version
2025110801
paywall score
0
spam phrases
0
text nonlatin
0
text cyrillic
0
text characters
11106
text words
2366
text unique words
738
text lines
302
text sentences
76
text paragraphs
53
text words per sentence
31
text matched phrases
46
text matched dictionaries
8
links self subdomains
0
links other subdomains
20
links other domains
0
links spam adult
0
links spam random
0
links spam expired
0
links ext activities
0
links ext ecommerce
0
links ext finance
0
links ext crypto
0
links ext booking
0
links ext news
30
links ext leaks
0
links ext ugc
0
links ext klim
0
links ext generic
1
image author
featured image