Main

type

0 (not classified)

status

21 (imported old-v2, waiting for another import)

review version

0

cleanup version

0

pending deletion

0 (-)

created at

2025-06-07 04:30:57

updated at

2026-01-11 01:26:51

Address

url

https://civcm.psu.edu/sitemap.html

url length

34

url crc

49774

url crc32

1312080494

location type

1 (url matches target location, page_location is empty)

canonical status

2 (missing canonical tag in html)

canonical page id

-
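
The url length and url crc32 fields above could be recomputed along the following lines. This is only a sketch: the helper name `url_fingerprint` is hypothetical, and the record does not say which checksum variant or URL normalization the crawler uses, so the CRC-32 from zlib may not reproduce the stored value. The 16-bit url crc field is not attempted here.

```python
import zlib


def url_fingerprint(url: str) -> dict:
    """Return the byte length and a CRC-32 checksum of a URL.

    Assumption: the URL is hashed as UTF-8 bytes without any
    normalization. The crawler's actual procedure is not documented
    in this record.
    """
    data = url.encode("utf-8")
    return {
        "length": len(data),
        # Mask to an unsigned 32-bit value for a stable representation.
        "crc32": zlib.crc32(data) & 0xFFFFFFFF,
    }


fingerprint = url_fingerprint("https://civcm.psu.edu/sitemap.html")
print(fingerprint["length"])  # 34, matching the url length field
print(fingerprint["crc32"])   # compare against the stored url crc32
```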

Source

domain id

200736302

domain tld

0

domain parts

0

originating warc id

-

originating url

https://civcm.psu.edu/sitemap-misc.xml

source type

1 (sitemap)

Server response

server ip

23.185.0.253

Publication date

2025-07-18 03:14:20

Fetch attempts

1

Original html size

67039

Normalized and saved size

3126

Content

title

XML Sitemap

excerpt

content

This is an XML Sitemap which is supposed to be processed by search engines which follow the XML Sitemap standard like Ask.com, Bing, Google and Yahoo. It was generated using the WordPress content management system and the Google Sitemap Generator Plugin by Arne Brachhold. You can find more information about XML sitemaps on sitemaps.org and Google's list of sitemap programs. This file contains links to sub-sitemaps; follow them to see the actual sitemap content.

author

updated

1770920944

Text analysis

block type

0

extracted fields

104

extracted bits

title
full content
content was extracted heuristically

detected location

0

detected language

0 (awaiting analysis)

category id

-

index version

1

paywall score

0

spam phrases

0

Text statistics

text nonlatin

0

text cyrillic

0

text characters

380

text words

76

text unique words

55

text lines

1

text sentences

4

text paragraphs

1

text words per sentence

19

text matched phrases

0

text matched dictionaries

0
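
The text statistics above are internally consistent: 76 words across 4 sentences gives 19 words per sentence. A minimal sketch of that derived field follows, assuming it is an integer quotient. The function name `words_per_sentence` is hypothetical, and the record does not state how the crawler rounds.

```python
def words_per_sentence(words: int, sentences: int) -> int:
    """Average words per sentence as an integer.

    Assumption: the stored value is the floor of words / sentences.
    Returns 0 when there are no sentences, to avoid division by zero.
    """
    if sentences <= 0:
        return 0
    return words // sentences


# Values taken from the record: text words = 76, text sentences = 4.
print(words_per_sentence(76, 4))  # 19
```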