id
processing priority
4
site type
3 (personal blog or private political site, e.g. Blogspot, Substack, also small blogs on own domains)
review version
11
html import
20 (imported)
first seen date
2024-08-14 05:42:16
expired found date
-
created at
2024-08-14 05:42:16
updated at
2026-01-15 10:07:14
length
24
crc
11117
tld
2211
nm parts
0
nm random digits
0
nm rare letters
0
is subdomain of id
69893241 (blogspot.com)
previous id
0
replaced with id
0
related id
-
dns primary id
0
dns alternative id
0
lifecycle status
0 (unclassified, or currently active)
deleted subdomains
0
page imported products
0
page imported random
0
page imported parking
0
count skipped due to recent timeouts on the same server IP
0
count content received but rejected due to 11-799
0
count dns errors
0
count cert errors
0
count timeouts
0
count http 429
0
count http 404
0
count http 403
0
count http 5xx
0
next operation date
-
server bits
-
server ip
-
mp import status
20
mp rejected date
-
mp saved date
-
mp size orig
67021
mp size raw text
18373
mp inner links count
10
mp inner links status
10 (links queued, awaiting import)
title
Structured Learning
description
image
site name
author
updated
2026-03-01 18:54:26
raw text
Structured Learning Tuesday, July 3, 2007 Corrections to ACL Anthology URLs Thanks to an alert reader, I found out that several of the paper links in my previous postings on ACL and EMNLP-CoNLL papers were incorrect. The problem was that some of the BibTeX entries in the ACL DVD distributed in Prague have wrong ACL Anthology links, and I derived these postings semi-automatically from those BibTeX entries. I've edited the most recent posting to use the correct links. Posted by Fernando Pereira at 8:20 AM Labels: natural language processing Monday, July 2, 2007 Reposting interesting ACL and CoNLL-EMNLP papers I've now added a short comment to each paper. This list is created semi-automatically from BibDesk with a custom HTML export template and some minor post-editing. The red titles are my special picks. Frustratingly Easy Domain Adaptation H. Daume III Proceedings of the 45t...
redirect type
0 (-)
block type
0 (no issues)
detected language
1 (English)
category id
index version
1
spam phrases
0
text nonlatin
0
text cyrillic
0
text characters
14198
text words
2679
text unique words
800
text lines
362
text sentences
162
text paragraphs
24
text words per sentence
16
text matched phrases
0
text matched dictionaries
0
links self subdomains
0
links other subdomains
1 - cvlab.epfl.ch
links other domains
37 - aclweb.org, hunch.net
links spam adult
0
links spam random
0
links spam expired
0
links ext activities
6
links ext ecommerce
0
links ext finance
0
links ext crypto
0
links ext booking
0
links ext news
0
links ext leaks
0
links ext ugc
22 - blogger.com
links ext klim
0
links ext generic
0
dol status
0
dol updated
2026-03-01 18:54:26
rss status
32 (unknown)
rss found date
2024-08-20 12:46:36
rss size orig
52008
rss items
8
rss spam phrases
0
rss detected language
1 (English)
inbefore feed id
-
inbefore status
0 (new)
sitemap path
sitemap status
40 (completed successful import of reports.txt file to table in_pages)
sitemap review version
2
sitemap urls count
8
sitemap urls adult
0
sitemap filtered products
0
sitemap filtered videos
0
sitemap found date
2024-08-20 07:20:27
sitemap process date
2026-01-06 07:44:24
sitemap first import date
-
sitemap last import date
2026-01-15 10:07:14