Main

type

5 (blog/news article)

status

21 (imported old-v2, waiting for another import)

review version

0

cleanup version

0

pending deletion

0 (-)

created at

2025-12-08 12:04:43

updated at

2025-12-08 12:04:44

Address

url

https://www.tepunahamatatini.ac.nz/2018/07/30/interns-work-to-enhance-use-of-te-reo-maori/

url length

90

url crc

60827

url crc32

4116049307
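
The derived address fields above can be cross-checked against the URL itself. The following is a minimal sketch, not the indexer's actual logic. It assumes that "url length" is the character count of the URL and that "url crc" is the low 16 bits of "url crc32". The second assumption holds for the values in this record, but the record does not document it as a rule. The record also does not say which CRC-32 variant is used, so the sketch does not recompute the checksum.

```python
# Values copied from this record.
url = "https://www.tepunahamatatini.ac.nz/2018/07/30/interns-work-to-enhance-use-of-te-reo-maori/"
url_length = 90
url_crc = 60827
url_crc32 = 4116049307

# Assumption: "url length" is the number of characters in the URL.
length_matches = len(url) == url_length

# Assumption: "url crc" is the low 16 bits of "url crc32".
# This is an observation about this record, not a documented rule.
crc_matches = (url_crc32 & 0xFFFF) == url_crc

print(length_matches, crc_matches)
```

Both checks pass for this record. A mismatch on another record would suggest that one of the assumptions does not generalise.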

location type

1 (url matches target location, page_location is empty)

canonical status

10 (verified canonical url)

canonical page id

3110283579

Source

domain id

1083656

domain tld

554

domain parts

0

originating warc id

-

originating url

https://data.commoncrawl.org/crawl-data/CC-MAIN-2025-33/segments/1754151279867.65/warc/CC-MAIN-20250805125539-20250805155539-00980.warc.gz

source type

11 (CommonCrawl)

Server response

server ip

13.55.10.250

Publication date

2025-08-05 13:11:40

Fetch attempts

0

Original html size

59364

Normalized and saved size

20910

Content

title

Interns work to enhance use of te reo Māori | Te Pūnaha Matatini

excerpt

content

Interns work to enhance use of te reo Māori
In the summer of 2016-2017, Te Hiku Media and Te Pūnaha Matatini co-funded a number of student internships – work from which led to the development of Kōrero Māori – a project to teach machines how to speak te reo Māori. One of the interns was Jamie Chow, a conjoint BComm/BEng (Honours) degree student from the University of Auckland. Jamie’s work on Te Hiku’s Data Analytics and Visualisation Project involved using online audience data to measure the performance of the organisation’s digital platform, matching it with other information such as demographics and geographical data.
Internship leads to ongoing employment opportunity
Following his 10-week summer internship, Jamie continued working on the project for Te Hiku in part-time employment over the course of 2017. “We kept Jamie on board,” says Te Hiku’s R&D Scientist and Engineer Keoni Mahelona. “He had the internship ...

author

updated

1767137745

Text analysis

block type

0

extracted fields

104

extracted bits

title
full content
content was extracted heuristically

detected location

0

detected language

1 (English)

category id

Other [en] (231)

index version

2025123101

paywall score

0

spam phrases

0

Text statistics

text nonlatin

0

text cyrillic

0

text characters

2864

text words

594

text unique words

261

text lines

1

text sentences

25

text paragraphs

1

text words per sentence

23
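
The "text words per sentence" value is consistent with the word and sentence counts above if the division is truncated to an integer. The following is a small sketch of that check. The truncating division is an assumption, since the record does not state how the value is rounded.

```python
# Values copied from this record.
text_words = 594
text_sentences = 25
text_words_per_sentence = 23

# Assumption: the stored value is the integer (floor) quotient.
derived = text_words // text_sentences
consistent = derived == text_words_per_sentence

print(derived, consistent)
```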

text matched phrases

3

text matched dictionaries

1