id
type
5 (blog/news article)
status
21 (imported old-v2, waiting for another import)
review version
0
cleanup version
0
pending deletion
0 (-)
created at
2025-10-17 02:27:19
updated at
2025-10-17 02:27:20
url
https://4treesbuilding.ca/blog-2/
url length
33
url crc
47022
url crc32
3058808750
location type
1 (url matches target location, page_location is empty)
canonical status
10 (verified canonical url)
canonical page id
domain id
domain tld
124
domain parts
0
originating warc id
-
originating url
https://data.commoncrawl.org/crawl-data/CC-MAIN-2025-33/segments/1754151280951.94/warc/CC-MAIN-20250812141533-20250812171533-00856.warc.gz
source type
11 (CommonCrawl)
server ip
Publication date
2025-08-12 15:04:24
Fetch attempts
0
Original html size
289702
Normalized and saved size
198352
title
Blog - 4trees Cannabis Building
excerpt
content
author
https://www.facebook.com/4Treesbuilding
updated
1762741429
block type
0
extracted fields
141
extracted bits
featured image
article author
title
OpenGraph suggests this is an article
detected location
0
detected language
1 (English)
category id
index version
2025110801
paywall score
0
spam phrases
0
text nonlatin
0
text cyrillic
0
text characters
8436
text words
1635
text unique words
544
text lines
234
text sentences
39
text paragraphs
26
text words per sentence
41
text matched phrases
58
text matched dictionaries
2
links self subdomains
0
links other subdomains
6
links other domains
1
links spam adult
0
links spam random
0
links spam expired
0
links ext activities
1
links ext ecommerce
0
links ext finance
0
links ext crypto
0
links ext booking
0
links ext news
0
links ext leaks
0
links ext ugc
13
links ext klim
0
links ext generic
3
image author