Trifacta with Joe Hellerstein
Description
If you haven’t encountered a data quality problem, then you haven’t yet worked on a large enough project. Invariably, a gap exists between the state of raw data and what an analyst or machine learning engineer needs to solve their problem. Many organizations that need to automate data preparation workflows look to Trifacta as a solution.
More Episodes
Apache Iceberg is an open source, high-performance format for huge data tables. Iceberg enables the use of SQL tables for big data, while making it possible for engines like Spark and Hive to safely work with the same tables at the same time. Iceberg was started at Netflix by Ryan Blue and Dan...
Published 03/07/24
Starburst is a data lake analytics platform. It’s designed to help users work with structured data at scale, and is built on the open source platform, Trino. Adam Ferrari is the SVP of Engineering at Starburst. He joins the show to talk about Starburst, data engineering, and what it takes to...
Published 02/06/24