All episodes of Data Brew by Databricks

Episodes

Kumo AI & Relational Deep Learning | Data Brew | Episode 34

Published 10/14/24

LLMs: Internals, Hallucinations, and Applications | Data Brew | Season 5 | Episode 4

Our fifth season dives into large language models (LLMs), from understanding the internals to the risks of using them and everything in between. While we're at it, we'll be enjoying our morning brew. In this session, we interviewed Chengyin Eng (Senior Data Scientist, Databricks), Sam Raymond (Senior Data Scientist, Databricks), and Joseph Bradley (Lead Production Specialist - ML, Databricks) on the best practices around LLM use cases, prompt engineering, and how to adapt MLOps for LLMs...

Published 07/21/23

Demonstrate–Search–Predict Framework | Data Brew | Season 5 | Episode 3

We will dive into LLMs for our fifth season, from understanding the internals to the risks of using them and everything in between. While we’re at it, we’ll be enjoying our morning brew. In this session, we interviewed Omar Khattab, a computer science Ph.D. student at Stanford and creator of the Demonstrate–Search–Predict (DSP) framework, to discuss DSP, common applications, and the future of NLP.

Published 06/29/23

Generative AI Risks | Data Brew | Season 5 | Episode 2

We will dive into LLMs for our fifth season, from understanding the internals to the risks of using them and everything in between. While we’re at it, we’ll be enjoying our morning brew. In this session, we interviewed Yaron Singer, CEO of Robust Intelligence, Professor of Computer Science at Harvard University, and guest of Data Brew Season 3 (our first repeat guest!). We discuss generative AI, the trends toward embracing LLMs, and how the surface area for vulnerabilities...

Published 06/08/23

John Snow Labs & SparkNLP | Data Brew | Season 5 | Episode 1

For our fifth season, we will dive into LLMs, from understanding the internals to the risks of using them and everything in between. While we’re at it, we’ll be enjoying our morning brew. In this session, we interviewed David Talby, CTO at John Snow Labs, which helps healthcare & life science companies put AI to good use. David's interests include natural language processing, applied artificial intelligence in healthcare, and responsible AI.

Published 06/01/23

Data Brew Season 4 Episode 6: Professional Athletes

For our fourth season, we focus on connected health and how data & AI augment and improve our daily health. While we’re at it, we’ll be enjoying our morning brew. Shayna Powless and Eli Ankou, professional cyclist for L39ion of Los Angeles and defensive tackle for the Buffalo Bills, respectively, provide valuable insight on how professional athletes leverage data to improve their performance and how they combine their passion for sports with the Dreamcatcher Foundation. See more at...

Published 06/09/22

Data Brew Season 4 Episode 5: Public Health: Education, Access, and Policy

For our fourth season, we focus on connected health and how data & AI augment and improve our daily health. While we’re at it, we’ll be enjoying our morning brew. Matt Willis, Marin County Public Health Officer, shares the three pillars of public health: education, access, and policy, and the critical role data plays in addressing the COVID-19 pandemic & opioid epidemic. See more at databricks.com/data-brew

Published 05/05/22

Data Brew Season 4 Episode 4: 1283 Days of Running (and Counting)

For our fourth season, we focus on connected health and how data & AI augment and improve our daily health. While we’re at it, we’ll be enjoying our morning brew. Running the length of the US every year, Alexandra Matthiesen shares her motivational secrets for running 1,283 consecutive days (and counting!) and redefining physical and mental limits. See more at databricks.com/data-brew

Published 04/14/22

Data Brew Season 4 Episode 3: Last Man Standing

For our fourth season, we focus on connected health and how data & AI augment and improve our daily health. While we’re at it, we’ll be enjoying our morning brew. Winner of the infamous Last Man Standing race (running 246 miles in 59 hours), Guillaume merges the world of competitive long-distance running with data science to push the boundaries of body and mind. See more at databricks.com/data-brew

Published 03/31/22

Data Brew Season 4 Episode 2: NBA Analytics

For our fourth season, we focus on connected health and how data & AI augment and improve our daily health. While we’re at it, we’ll be enjoying our morning brew. Alexander Powell chronicles the evolution of sports analytics and how professional sports teams use data as a competitive advantage. See more at databricks.com/data-brew

Published 03/10/22

Data Brew Season 4 Episode 1: Reducing Injury & Increasing Retention of Industrial Athletes

For our fourth season, we focus on connected health and how data & AI augment and improve our daily health. While we’re at it, we’ll be enjoying our morning brew. Globally, 38,000 people get hurt on the job every hour. In the United States alone, over $250 billion is spent on workplace injuries annually. Sean Petterson, founder and CEO of StrongArm Tech, discusses the role of wearable devices in reducing workplace injury and increasing retention of industrial athletes. See more at...

Published 02/24/22

Data Brew Season 3 Episode 6: Open Source

For our third season, we focus on how leaders use data for change. Whether it’s building data teams or using data as a constructive catalyst, we interview subject matter experts from industry to dive deeper into these topics. For our season 3 finale, Nithya Ruff discusses the open-source ecosystem, ways to contribute to open-source projects (hint: it’s not just about the code), and how businesses can balance community and company interests. With 95% of open-source contributions coming from...

Published 10/28/21

Data Brew Season 3 Episode 5: Sustainability & Sake

For our third season, we focus on how leaders use data for change. Whether it’s building data teams or using data as a constructive catalyst, we interview subject matter experts from industry to dive deeper into these topics. We interview Junta Nakai in our most unique location yet - Brooklyn Kura - the first non-Japanese sake distillery in New York. In this episode, Junta shares the philosophical, economic, and tactical approaches to sustainability and ESG, as well as the secrets to brewing...

Published 10/14/21

Data Brew Season 3 Episode 4: Executive Education

For our third season, we focus on how leaders use data for change. Whether it’s building data teams or using data as a constructive catalyst, we interview subject matter experts from industry to dive deeper into these topics. Did you know that the average tenure of a board member is longer than the average tenure of a marriage in the United States? In this episode, Coco Brown discusses the benefits and drawbacks of the long tenures of corporate boards, their current structure, the impact of...

Published 10/07/21

Data Brew Season 3 Episode 3: 3 T’s to Securing AI Systems: Tests, tests, and more tests

For our third season, we focus on how leaders use data for change. Whether it’s building data teams or using data as a constructive catalyst, we interview subject matter experts from industry to dive deeper into these topics. What does it mean to make your machine learning system “production-ready”? Yaron Singer walks us through the infrastructure, testing procedures, and more that help make ML systems ready for the real world in this episode of Data Brew. See more at databricks.com/data-brew

Published 09/30/21

Data Brew Season 3 Episode 2: Data Culture Outside ‘The Valley’

For our third season, we focus on how leaders use data for change. Whether it’s building data teams or using data as a constructive catalyst, we interview subject matter experts from industry to dive deeper into these topics. Have you ever had a spam call automatically blocked for you? You can thank First Orion for that: in one day, they blocked or scam-tagged over 108 million calls on T-Mobile alone! In this episode, we have the pleasure of chatting with Charles Morgan and Kent Welch,...

Published 09/23/21

Data Brew Season 3 Episode 1: Disrupt: Challenge your Business Assumptions

For our third season, we focus on how leaders use data for change. Whether it’s building data teams or using data as a constructive catalyst, we interview subject matter experts from industry to dive deeper into these topics. In this season opener, Elena Donio shares her experience using data and domain knowledge to disrupt the traditional service and sales compensation model. She also discusses how to build companies that scale, manage corporate cultural evolution, and the influence of...

Published 09/16/21

Data Brew Season 2 Episode 9: Data Driven Software

For our second season of Data Brew, we will be focusing on machine learning, from research to production. We will interview folks in academia and industry to discuss topics such as data ethics, production-grade infrastructure for ML, hyperparameter tuning, AutoML, and many more. We branch, version, and test our code, but what if we treated data like code? Tim Hunter joins us to discuss the open-source Data-Driven Software (DDS) package and how it leads to immense gains in collaboration and...

Published 07/21/21

Data Brew Season 2 Episode 8: Feature Engineering

For our second season of Data Brew, we will be focusing on machine learning, from research to production. We will interview folks in academia and industry to discuss topics such as data ethics, production-grade infrastructure for ML, hyperparameter tuning, AutoML, and many more. Is there ever a “one-size-fits-all” approach to feature engineering? Find out this and more with Amanda Casari and Alice Zheng, co-authors of the Feature Engineering for Machine Learning book. See more at...

Published 07/09/21

Data Brew Season 2 Episode 7: Interpretable Machine Learning

For our second season of Data Brew, we will be focusing on machine learning, from research to production. We will interview folks in academia and industry to discuss topics such as data ethics, production-grade infrastructure for ML, hyperparameter tuning, AutoML, and many more. What does it mean for a model to be “interpretable”? Ameet Talwalkar shares his thoughts on IML (Interpretable Machine Learning), how it relates to data privacy and fairness, and his research in this field. See more...

Published 07/01/21

Data Brew Season 2 Episode 6: AutoML

For our second season of Data Brew, we will be focusing on machine learning, from research to production. We will interview folks in academia and industry to discuss topics such as data ethics, production-grade infrastructure for ML, hyperparameter tuning, AutoML, and many more. Erin LeDell shares valuable insight on AutoML, what problems are best solved by it, its current limitations, and her thoughts on the future of AutoML. We also discuss founding and growing the Women in Machine...

Published 06/17/21

Data Brew Season 2 Episode 5: ML Applications

For our second season of Data Brew, we will be focusing on machine learning, from research to production. We will interview folks in academia and industry to discuss topics such as data ethics, production-grade infrastructure for ML, hyperparameter tuning, AutoML, and many more. Good machine learning starts with high-quality data. Irina Malkova shares her experience managing and ensuring high-fidelity data, developing custom metrics to satisfy business needs, and discusses how to improve...

Published 06/10/21

Data Brew Season 2 Episode 4: Hyperparameter and Neural Architecture Search

For our second season of Data Brew, we will be focusing on machine learning, from research to production. We will interview folks in academia and industry to discuss topics such as data ethics, production-grade infrastructure for ML, hyperparameter tuning, AutoML, and many more. Liam Li is a leading researcher in the fields of hyperparameter optimization and neural architecture search, and is the author of the seminal Hyperband paper. In this session, Liam discusses the evolution of...

Published 05/13/21

Data Brew Season 2 Episode 3: Infrastructure for ML

For our second season of Data Brew, we will be focusing on machine learning, from research to production. We will interview folks in academia and industry to discuss topics such as data ethics, production-grade infrastructure for ML, hyperparameter tuning, AutoML, and many more. Adam Oliner discusses how to design your infrastructure to support ML, from integration tests to glue code, the importance of iteration, and centralized vs decentralized data science teams. He provides valuable...

Published 05/05/21