Crawling websites using “Data Prep Kit”

A hands-on exercise using “data-prep-kit” to crawl websites and store the results as Parquet files.
Introduction
Alongside Docling, “IBM Research” has open-sourced another set of tools which could be ...