Modern Web Scraping with Erez Naveh
Listen now
Description
Today it’s estimated there are over 1 billion websites on the internet. Much of this content is optimized to be viewed by human eyes, not consumed by machines. However, creating systems to automatically parse and structure the web greatly extends its utility, and paves the way for innovative solutions and applications. The industry of web
More Episodes
Apache Iceberg is an open source high-performance format for huge data tables. Iceberg enables the use of SQL tables for big data, while making it possible for engines like Spark and Hive to safely work with the same tables, at the same time. Iceberg was started at Netflix by Ryan Blue and Dan...
Published 03/07/24
Starburst is a data lake analytics platform. It’s designed to help users work with structured data at scale, and is built on the open source platform, Trino. Adam Ferrari is the SVP of Engineering at Starburst. He joins the show to talk about Starburst, data engineering, and what it takes to...
Published 02/06/24