id
type
5 (blog/news article)
status
21 (imported old-v2, waiting for another import)
review version
0
cleanup version
0
pending deletion
0 (-)
created at
2025-12-05 05:11:28
updated at
2025-12-05 05:11:30
url
https://benhay.es/posts/building-pipelines-kedro/
url length
49
url crc
2187
url crc32
973998219
location type
1 (url matches target location, page_location is empty)
canonical status
2 (missing canonical tag in html)
canonical page id
-
domain id
domain tld
724
domain parts
0
originating warc id
-
originating url
https://data.commoncrawl.org/crawl-data/CC-MAIN-2025-33/segments/1754151279890.42/warc/CC-MAIN-20250806043623-20250806073623-00385.warc.gz
source type
11 (CommonCrawl)
server ip
Publication date
2025-08-06 06:05:09
Fetch attempts
0
Original html size
40526
Normalized and saved size
38858
title
Ben Hayes - Building Data Pipelines with Kedro
excerpt
content
Table of Contents
Introduction
Choosing Kedro - An explanation of what Kedro does and doesn't do
Using Kedro - An example of a Kedro project
Conclusion

Folks have compared data to oil in the past, and while the metaphor may be tired, building pipelines in both cases consumes valuable time and resources and often results in a mess. For data pipelines, there’s a handy solution: use Kedro to clean, process, and analyze your data in neat pipelines. Kedro leverages software engineering best practices and can optimize the processing of nodes within the pipeline, but more on that later.

Background and History
Kedro is now an open-source project within the Linux Foundation’s AI & Data umbrella. The project, however, did not begin that way. Kedro was conceived at QuantumBlack to accelerate data pipelining and data science prototyping. There are after all repeatable and even reusab...
author
Ben Hayes
updated
1768245305
block type
0
extracted fields
109
extracted bits
featured image
article author
title
full content
content was extracted heuristically
detected location
0
detected language
1 (English)
category id
-
index version
1
paywall score
0
spam phrases
0
text nonlatin
0
text cyrillic
0
text characters
13185
text words
2608
text unique words
678
text lines
1
text sentences
111
text paragraphs
1
text words per sentence
23
text matched phrases
0
text matched dictionaries
0
links self subdomains
0
links other subdomains
0
links other domains
4
links spam adult
0
links spam random
0
links spam expired
0
links ext activities
0
links ext ecommerce
0
links ext finance
0
links ext crypto
0
links ext booking
0
links ext news
0
links ext leaks
0
links ext ugc
13
links ext klim
0
links ext generic
0
image author