Main

related bits

0

processing priority

4

site type

3 (personal blog or private political site, e.g. Blogspot, Substack, also small blogs on own domains)

review version

11

html import

20 (imported)

Events

first seen date

2024-08-26 17:27:15

expired found date

-

created at

2024-08-26 17:27:15

updated at

2025-04-22 13:56:48

Domain name statistics

length

20

crc

7141

tld

2211

nm parts

0

nm random digits

0

nm rare letters

0

Connections

is subdomain of id

13642151 (wordpress.com)

previous id

0

replaced with id

0

related id

-

dns primary id

0

dns alternative id

0

lifecycle status

0 (unclassified, or currently active)

Subdomains and pages

deleted subdomains

0

page imported products

0

page imported random

0

page imported parking

0

Error counters

count skipped due to recent timeouts on the same server IP

0

count content received but rejected due to 11-799

0

count dns errors

0

count cert errors

0

count timeouts

0

count http 429

0

count http 404

0

count http 403

0

count http 5xx

0

next operation date

-

Server

server bits

server ip

-

Mainpage statistics

mp import status

20

mp rejected date

-

mp saved date

-

mp size orig

134253

mp size raw text

18436

mp inner links count

0

mp inner links status

1 (no links)

Open Graph

title

snail

description

techie librarian; meatier than a seahorse

site name

snail

author

updated

2026-03-02 00:54:52

raw text

snail | techie librarian; meatier than a seahorse Skip to primary content Skip to secondary content snail techie librarian; meatier than a seahorse Search Main menu who am i flicks books home Post navigation ← Older posts messing with pdfs Posted on 17 July 2024 by snail Reply A key outcome of attending VALA was that it gave me an environment to think about stuff outside my usual boxes. I haven’t ruminated on tech stuff well in a while. A key area I have long wanted to develop is finding better access models for content on harvested web sites. I, via work, started harvesting government websites in 2014, and a key issue then was alternative approaches to collecting digital content eg annual reports. This has remained an itch. I’ve been thinking and experimenting around this topic sporadically, at times very sporadically, for some years. Mostly badly I suspect but it’s been useful for me. For a long time, I thought the key approach was to ...

Text analysis

redirect type

30 (window.location)

block type

0 (no issues)

detected language

1 (English)

category id

Coronavirus (17)

index version

1

spam phrases

0

Text statistics

text nonlatin

0

text cyrillic

0

text characters

14174

text words

3233

text unique words

1032

text lines

241

text sentences

145

text paragraphs

41

text words per sentence

22

text matched phrases

0

text matched dictionaries

0

RSS

rss status

32 (unknown)

rss found date

2024-08-28 06:02:04

rss size orig

36115

rss items

10

rss spam phrases

0

rss detected language

1 (English)

inbefore feed id

-

inbefore status

0 (new)

Sitemap

sitemap status

30 (processing completed, results pushed to table crawler_sitemaps.ext_domain_sitemap_lists)

sitemap review version

1

sitemap urls count

662

sitemap urls adult

0

sitemap filtered products

0

sitemap filtered videos

0

sitemap found date

2024-08-28 04:27:43

sitemap process date

2024-08-28 04:27:44

sitemap first import date

-

sitemap last import date

-