Main

processing priority

4

site type

0 (generic, awaiting analysis)

review version

11

html import

20 (imported)

Events

first seen date

2023-10-02 05:11:22

expired found date

-

created at

2024-06-08 04:55:44

updated at

2025-12-31 23:09:35

Domain name statistics

length

15

crc

47449

tld

2688

nm parts

0

nm random digits

0

nm rare letters

0

Connections

is subdomain of id

51241385 (apache.org)

previous id

0

replaced with id

0

related id

-

dns primary id

0

dns alternative id

0

lifecycle status

0 (unclassified, or currently active)

Subdomains and pages

deleted subdomains

0

page imported products

0

page imported random

0

page imported parking

0

Error counters

count skipped due to recent timeouts on the same server IP

0

count content received but rejected due to 11-799

0

count dns errors

0

count cert errors

0

count timeouts

0

count http 429

0

count http 404

0

count http 403

0

count http 5xx

0

next operation date

-

Server

server bits

server ip

-

Mainpage statistics

mp import status

20

mp rejected date

-

mp saved date

-

mp size orig

63784

mp size raw text

33635

mp inner links count

1

mp inner links status

20 (imported)

Open Graph

title

description

image

site name

author

updated

2025-12-20 06:04:07

raw text

Apache Tika – Apache Tika Apache Tika - a content analysis toolkit The Apache Tika™ toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF). All of these file types can be parsed through a single interface, making Tika useful for search engine indexing, content analysis, translation, and much more. You can find the latest release on the download page. Please see the Getting Started page for more information on how to start using Tika. The Parser and Detector pages describe the main interfaces of Tika and how they work. For more in-depth documentation, see our wiki, especially for tika-server. If you're interested in contributing to Tika, please see the Contributing page or send an email to the Tika development list. Tika is a project of the Apache Software Foundation, and was formerly a subproject of Apache Lucene. Latest News 28 August 2023: Apache Tika Release Apache Tika 2.9.0 has be...

Text analysis

redirect type

0 (-)

block type

0 (no issues)

detected language

1 (English)

category id

SEC sites (10)

index version

2025110801

spam phrases

0

Text statistics

text nonlatin

0

text cyrillic

0

text characters

26193

text words

5785

text unique words

876

text lines

468

text sentences

281

text paragraphs

78

text words per sentence

20

text matched phrases

6

text matched dictionaries

6

RSS

rss path

rss status

1 (priority 1 already searched, no matches found)

rss found date

-

rss size orig

0

rss items

0

rss spam phrases

0

rss detected language

0 (awaiting analysis)

inbefore feed id

-

inbefore status

0 (new)

Sitemap

sitemap path

sitemap status

1 (priority 1 already searched, no matches found)

sitemap review version

1

sitemap urls count

0

sitemap urls adult

0

sitemap filtered products

0

sitemap filtered videos

0

sitemap found date

-

sitemap process date

-

sitemap first import date

-

sitemap last import date

-