id
processing priority
4
site type
3 (personal blog or private political site, e.g. Blogspot, Substack, also small blogs on own domains)
review version
11
html import
20 (imported)
first seen date
2024-02-25 00:03:28
expired found date
-
created at
2024-06-19 17:40:44
updated at
2025-04-19 19:30:27
length
35
crc
48283
tld
2211
nm parts
0
nm random digits
0
nm rare letters
0
is subdomain of id
13642151 (wordpress.com)
previous id
0
replaced with id
0
related id
-
dns primary id
0
dns alternative id
0
lifecycle status
0 (unclassified, or currently active)
deleted subdomains
0
page imported products
0
page imported random
0
page imported parking
0
count skipped due to recent timeouts on the same server IP
0
count content received but rejected due to 11-799
0
count dns errors
0
count cert errors
0
count timeouts
0
count http 429
0
count http 404
0
count http 403
0
count http 5xx
0
next operation date
-
server bits
-
server ip
-
mp import status
20
mp rejected date
-
mp saved date
-
mp size orig
279393
mp size raw text
96933
mp inner links count
0
mp inner links status
1 (no links)
title
Electric Archaeology
description
Human Cogitation, Human Explanation
site name
Electric Archaeology
author
updated
2025-12-29 14:36:00
raw text
Electric Archaeology | Human Cogitation, Human Explanation. Uncategorized. LLM as a discovery bridge for an API. Posted on 6 Feb 2024 by Shawn. How it all goes …in which I discuss the logic and functioning of two Jupyter notebooks that use Simon Willison’s LLM to act as a kind of discovery agent for a history API and an archaeology API. Both notebooks are available for copying and improving and use! I have been playing with GPT-Researcher, figuring out ways to make it use different data sources. I eventually got it to work with Open Alex, Open Context, and Chronicling America, and to write particular kinds of research reports. It was tricksy, and getting it to even start took a wee bit (tacit/hidden knowledge at play once again). I...
redirect type
30 (window.location)
block type
0 (no issues)
detected language
1 (English)
category id
AI [en] (229)
index version
2025123101
spam phrases
1
text nonlatin
0
text cyrillic
0
text characters
65535
text words
16010
text unique words
3103
text lines
936
text sentences
831
text paragraphs
195
text words per sentence
19
text matched phrases
24
text matched dictionaries
14
links self subdomains
0
links other subdomains
13 - ced.sascdn.com, llm.datasette.io, forum.obsidian.md, offshoreleaks.icij.org, chat.openai.com, cookbook.openai.com, mermaid.js.org, core.tdar.org, archaeologydataservice.ac.uk
links other domains
102 - electricarchaeology.ca, openalex.org, opencontext.org, scholar.social, tavily.com, hachyderm.io, elicit.com, carleton.ca, fulcra.design, oneusefulthing.org, openai.com, simonwillison.net, ldf.fi, mermaid.live, datasette.io, alexandriaarchive.org, dbreunig.com, metmuseum.org, samplereality.com
links spam adult
0
links spam random
0
links spam expired
0
links ext activities
1
links ext ecommerce
0
links ext finance
0
links ext crypto
0
links ext booking
0
links ext news
0
links ext leaks
0
links ext ugc
49 - s0.wp.com, widgets.wp.com, wp.me, secure.gravatar.com, s1.wp.com, wordpress.com, electricarchaeologist.files.wordpress.com, medium.com, en.wordpress.com
links ext klim
0
links ext generic
7
dol status
0
dol updated
2025-12-29 14:36:00
rss status
32 (unknown)
rss found date
2024-02-26 14:52:12
rss size orig
152553
rss items
10
rss spam phrases
0
rss detected language
1 (English)
inbefore feed id
-
inbefore status
0 (new)
sitemap status
30 (processing completed, results pushed to table crawler_sitemaps.ext_domain_sitemap_lists)
sitemap review version
2
sitemap urls count
888
sitemap urls adult
0
sitemap filtered products
0
sitemap filtered videos
2
sitemap found date
2024-02-25 00:03:28
sitemap process date
2024-11-12 22:33:06
sitemap first import date
-
sitemap last import date
-