Open-Source Data Catalog Amundsen with Mark Grover @ Stemma
Description
In this episode of Building The Backend, we hear from Mark Grover, founder of Stemma and co-creator of Amundsen. Stemma is a fully managed data catalog powered by Amundsen, the leading open-source data catalog. Below are the top 3 value bombs:
Automated data catalogs are critical for wrangling the growing volume of data across organizations (e.g., being able to identify that only 10 of the 150 columns in a table are used downstream).
Tribal knowledge and context cannot be automated, so data catalogs cannot be 100% automated.
Amundsen is an open-source data catalog originally created at Lyft. Stemma has created a managed version of Amundsen.
Help me improve the podcast by completing this 60-second survey: https://buildingthebackend.com/survey
More Episodes
In this episode, we speak with Justin Borgman, Chairman & CEO at Starburst, which is based on open-source Trino (formerly PrestoSQL) and was recently valued at $3.35 billion after securing its Series D funding. We discuss the convergence of data warehouses and data lakes, why data lakes fail and...
Published 03/15/22
In this episode, we speak with Paul Singman, Developer Advocate at Treeverse / LakeFS. LakeFS is an open-source project that allows you to transform your object storage into a Git-like repository. Top 3 takeaways: LakeFS enables use cases like debugging to quickly view historical versions of your...
Published 03/01/22