Episodes
Let's face it, out of all those petabytes of data you've been hoarding, only a small fraction of it is creating business value for you today. When you scan the same data multiple times and transfer it over the wire, you're wasting time, compute cycles, and ultimately money. This gets worse when you're pulling data across regions or clouds from disaggregated Trino clusters. In situations like these, caching solutions like Alluxio can make a tremendous impact on the latency and cost of your...
Published 01/02/23
We're going to discuss all of the awesome sessions that happened during Trino Summit this year. Manfred, Cole, and I will be joined by Martin, Dain, Brian Zhan, and Claudius for their perspectives and what they found most interesting about the summit. We also dive into stats around the summit and some exciting topics discussed off-camera. Finally, we'll cover some key takeaways from the Trino Contributor Congregation that took place the day after and some of the topics we went over there. -...
Published 12/20/22
- Releases: 14:43 - Concept of the episode: Intro to Hudi and the Hudi connector: 22:29 - Concept of the episode: Merge on read and copy on write tables: 28:28 - Concept of the episode: Hudi metadata table: 39:24 - Concept of the episode: Hudi data layout: 46:39 - Concept of the episode: Robinhood Trino and Hudi use cases: 51:12 - Concept of the episode: Current state and roadmap for the Hudi connector: 1:03:15 - Pull request of the episode: PR 14445: Fault-tolerant execution for PostgreSQL...
Published 11/16/22
Join us for this next episode of the broadcast, where we bring back Ryan Blue, the creator of Iceberg, to discuss some of the latest happenings in the Iceberg community. We also discuss and demo a bunch of new features that have come out in the Trino Iceberg connector. We also have a new guest, Tabular Developer Advocate Sam Redai, shedding light on this incredible community as well! Since the first episodes, Iceberg has finalized the v2 spec and added a lot of new features along the way....
Published 09/12/22
In this episode we sit down with engineers, Steve Morgan and Edward Morgan, to discuss how they use Trino at Raft. Raft provides consulting services and is particularly skilled at DevSecOps. One particular challenge they face is dealing with fragmented government infrastructure. In this episode, we dive in to learn how Trino enables Raft to supply government sector clients with a data fabric solution. Raft takes a special stance on using and contributing to open source solutions that run well...
Published 09/08/22
We'll be doing a more focused look at a specific feature that's being added to Trino: polymorphic table functions. We're excited to talk about what they do, where we are so far, where we're going, and how you can leverage them to make Trino better than ever! Show Notes: https://trino.io/episodes/38.html Show Page: https://trino.io/broadcast/
Published 08/17/22
- Concept of the episode: How to strengthen the Trino community: 15:07 - Concept of the episode: Pull request process: 30:33 - Concept of the episode: Impact of community and developer experience: 33:07 - Concept of the episode: Community metrics for better decision making: 44:00 - Pull requests of the episode: PR 12259: Support updating Iceberg table partitioning: 1:09:42 - Demo of the episode: Iceberg table partition migrations: 1:16:00 - Question of the episode: Can I force a pushdown...
Published 08/04/22
As Trino preps to jump to Java 17, we discuss the latest features added from Java 11 to Java 17, talk with Martin about a few of the potential uses of new features like the Vector API, language improvements, and G1GC speedups, and finally dive into some of the features that we'll be implementing in the upcoming months under a new project in Trino! - Intro song: 00:00 - Intro: 00:36 - Releases: 8:17 - Question of the episode: Will Trino be making a vectorized C++ version of...
Published 06/16/22
- Intro song: 00:00 - Intro: 00:32 - Releases: 4:22 - Concept of the episode: Packaging Trino: 21:28 - Additional topic of the episode: Modernizing Trino with Java 17: 46:49 - Pull requests of the episode: Worker stats in the Web UI: 55:25 - Question of the episode: HDFS supported by Delta Lake connector?: 1:01:52 - Demo of the episode: Tarball installation and new Web UI feature: 1:05:58 Show Notes: https://trino.io/episodes/35.html Show Page: https://trino.io/broadcast/
Published 05/24/22
We start with news from Trino releases 372, 373, and 374 and an update on Project Tardigrade. Then we dive into the details of the new Delta Lake connector contributed to Trino by Starburst. - Intro song: 00:00 - Intro: 00:37 - Releases: 2:05 - Project Tardigrade update: 9:21 - Concept of the episode: A new connector for Delta Lake object storage. 18:37 - Pull requests of the episode: Add Delta Lake connector and documentation. 26:10 - Demo of the episode: Delta Lake connector in...
Published 03/18/22
- Concept of the month: High Availability with Trino: 20:23 - PR of the month: PR 8956 Add support for external db for schema management in mongodb connector: 1:04:09 - Bonus PR of the month: PR 8202 Metadata for alias in elasticsearch connector only uses the first mapping: 1:15:15 - Demo of the month: Trino Fiddle: A tool for easy online testing and sharing of Trino SQL problems and their solutions: 1:32:08 - Question of the month: Does the Trino Hive connector support CarbonData?:...
Published 02/28/22
While Trino has been proven to run batch analytic workloads at scale, many have avoided long-running batch jobs for fear of query failure. Join this month's broadcast discussing the project introducing granular fault-tolerance to Trino. Codenamed Project Tardigrade, it is being thoughtfully crafted to maintain the speed advantage that Trino has over other query engines while increasing the resiliency of queries. We will discuss some of the design proposals being considered with Tardigrade...
Published 02/18/22
Concept of the week: ReplicaSets, Deployments, and Services Demo of the month: Deploy Trino k8s to Amazon EKS PR of the week: PR 8921: Support TRUNCATE TABLE statement Question of the week: How do I run system.sync_partition_metadata with different catalogs? Show Notes: https://trino.io/episodes/31.html Show Page: https://trino.io/broadcast/
Published 02/18/22
Concept of the week: Trino and dbt, a hot data mesh PR of the week: Partitioned table tests and fixed PR 9757 Question of the week: What’s the difference between location and external_location? Show Notes: https://trino.io/episodes/30.html Show Page: https://trino.io/broadcast/
Published 12/20/21
Concept of the week: What is Trino? PR of the week: PR 8821 Add HTTP/S query event logger Question of the week: Does the Hive connector depend on the Hive runtime? Show Notes: https://trino.io/episodes/29.html Show Page: https://trino.io/broadcast/
Published 12/20/21
Concept of the week: Event Stream abstractions and Pravega Demo of the week: Event Stream abstractions and Pravega PR of the week: Pravega presto-connector PR 49 Question of the week: What is the point of Trino Forum and what is the relationship to Trino Slack? Show Notes: https://trino.io/episodes/28.html Show Page: https://trino.io/broadcast/
Published 11/17/21
Concept of the week: LakeFS and Git on Object Storage Demo of the week: Running Trino on LakeFS PR of the week: PR 8762 Add query error info to cluster overview page in web UI Question of the week: Why are deletes so limited in Trino? Show Notes: https://trino.io/episodes/27.html Show Page: https://trino.io/broadcast/
Published 11/17/21
Concept of the week: Data discovery and Amundsen Concept of the week: Amundsen Architecture Concept of the week: Amundsen as a subcomponent to data mesh PR of the week: Index Trino Views Question of the week: Can I add a UDF without restarting Trino? Show Notes: https://trino.io/episodes/26.html Show Page: https://trino.io/broadcast/
Published 11/17/21
Concept of the week: Change Data Capture Concept of the week: Debezium Concept of the week: Debezium + Trino at Zomato PR of the week: PR 4140 Implement aggregation pushdown in Pinot Question of the week: Is there an array function that flattens a row into three rows? Show Notes: https://trino.io/episodes/25.html Show Page: https://trino.io/broadcast/
Published 10/02/21
Concept of the week: K8s architecture: Containers, Pods, and kubelets PR of the week: PR 11 Merge contributor version of k8s charts with the community version Demo: Running the Trino charts with kubectl Show Notes: https://trino.io/episodes/24.html Show Page: https://trino.io/broadcast/
Published 09/17/21
Concept of the week: Row pattern matching and MATCH_RECOGNIZE PR of the week: PR 8348 Document row pattern recognition in window Demo: Showing MATCH_RECOGNIZE functionality by example Question of the week: How do you tag a list of rows with custom periodic rules? Show Notes: https://trino.io/episodes/23.html Show Page: https://trino.io/broadcast/
Published 08/09/21
This episode will cover LinkedIn's journey to upgrade from PrestoSQL to Trino and some of the operational challenges LinkedIn's engineering team has faced at their scale. Concept of the week: Trino usage at LinkedIn Concept of the week: Trino hardware and operational scale Concept of the week: Challenges operating at scale Concept of the week: Open source at LinkedIn Concept of the week: PrestoSQL to Trino upgrade challenges Concept of the week: PrestoSQL to Trino upgrade steps PR of the...
Published 08/03/21
- Question of the week: Can dbt connect to different databases in the same project? - Concept of the week: What is dbt? - Concept of the week: dbt + Trino - Demo: Querying Trino from a dbt project - PR of the week: PR 8283 Externalised destination table cache expiry duration for BigQuery Connector Show Notes: https://trino.io/episodes/21.html Show Page: https://trino.io/broadcast/
Published 07/15/21
Concept of the week: Trino for the Trinewbie Concept of the week: Marius' Journey Concept of the week: Contributing to Trino PR of the week: PR 8135 Set default time zone for the current session Demo: Contributing to Trino Question of the week: How do I search nested objects in Elasticsearch from Trino? We didn't have time to run through the demo. I created another video outside of the show if you need help with the contribution process:...
Published 06/29/21
Concept of the week: Ingesting into Iceberg with Pulsar and Flink at BlueCat: 17:30 Concept of the week: BlueCat Overview: 20:31 Concept of the week: Single Tenant to Multi-Tenant: 21:33 Concept of the week: Pre-Iceberg: 26:13 Concept of the week: Iceberg: 39:29 PR of the week: PR 1905 Add format_number function: 1:01:55 Demo: Showing the format_number functionality: 1:04:38 Question of the week: How do I search nested objects in Elasticsearch from Trino?: 1:08:54 Show Notes: ...
Published 06/11/21