id
type
5 (blog/news article)
status
21 (imported old-v2, waiting for another import)
review version
0
cleanup version
0
pending deletion
0 (-)
created at
2025-12-31 00:11:05
updated at
2025-12-31 00:11:05
url
https://www.cattlemansroadhouse.com/blog/blog-post
url length
50
url crc
7289
url crc32
2732792953
location type
1 (url matches target location, page_location is empty)
canonical status
10 (verified canonical url)
canonical page id
domain id
domain tld
2211
domain parts
2
originating warc id
13193615
originating url
source type
11 (CommonCrawl)
server ip
publication date
2025-08-03 00:05:38
fetch attempts
0
original html size
72703
normalized and saved size
20048
title
excerpt
content
Blog Post Published: August 10, 2023 Updated: August 10, 2023 Lorem ipsum dolor sit amet, consectetur adipiscing elit. Praesent iaculis nunc ante, ut elementum urna aliquam at. Aenean ultrices, dui nec lobortis malesuada, lectus ante blandit lorem, sed aliquam mi erat vel mauris. Aenean a arcu at ligula tempor semper at eget libero. Aliquam in sem ligula. Cras ut risus est. Mauris semper augue id velit condimentum venenatis. Phasellus aliquet turpis ac lorem consectetur scelerisque. Maecenas quis nisl id lacus tincidunt lacinia. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Integer metus sapien, hendrerit nec ante ac, vulputate malesuada tellus. Fusce accumsan semper enim, et hendrerit mauris tincidunt non. Curabitur sodales varius nisl a consequat. Curabitur ligula purus, fermentum in sodales vel, gravida s...
author
updated
1767346563
block type
0
extracted fields
96
extracted bits
full content
content was extracted heuristically
detected location
0
detected language
126 (language undetectable (empty document, too short, or engines disagree))
category id
-
index version
1
paywall score
0
spam phrases
0
text nonlatin
0
text cyrillic
0
text characters
2076
text words
378
text unique words
149
text lines
1
text sentences
48
text paragraphs
1
text words per sentence
7
text matched phrases
0
text matched dictionaries
0
links self subdomains
0
links other subdomains
4
links other domains
0
links spam adult
0
links spam random
0
links spam expired
0
links ext activities
0
links ext ecommerce
0
links ext finance
0
links ext crypto
0
links ext booking
0
links ext news
0
links ext leaks
0
links ext ugc
0
links ext klim
0
links ext generic
0
image author
featured image