Episodes
Summary Large Language Models (LLMs) have rapidly captured the attention of the world with their impressive capabilities. Unfortunately, they are often unpredictable and unreliable. This makes building a product based on their capabilities a unique challenge. Jignesh Patel is building DataChat to bring the capabilities of LLMs to organizational analytics, allowing anyone to have conversations with their business data. In this episode he shares the methods that he is using to build a product...
Published 03/03/24
Summary Machine learning is a powerful set of technologies, holding the potential to dramatically transform businesses across industries. Unfortunately, ML projects often fail to achieve their intended goals. This failure is due to a lack of collaboration and investment across technological and organizational boundaries. To help improve the success rate of machine learning projects Eric Siegel developed the six-step bizML framework, outlining the process to ensure that...
Published 02/18/24
Summary One of the most time consuming aspects of building a machine learning model is feature engineering. Generative AI offers the possibility of accelerating the discovery and creation of feature pipelines. In this episode Colin Priest explains how FeatureByte is applying generative AI models to the challenge of building and maintaining machine learning pipelines. Announcements Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it...
Published 02/11/24
Summary Every business develops its own specific workflows to address its internal organizational needs. Not all of them are properly documented, or even visible. Workflow automation tools have tried to reduce the manual burden involved, but they are rigid and require substantial investment of time to discover and develop the routines. Boaz Hecht co-founded 8Flow to iteratively discover and automate pieces of workflows, bringing visibility and collaboration to the internal organizational...
Published 01/28/24
Summary Machine learning and AI applications hold the promise of drastically impacting every aspect of modern life. With that potential for profound change comes a responsibility for the creators of the technology to account for the ramifications of their work. In this episode Nicholas Cifuentes-Goodbody guides us through the minefields of social, technical, and ethical considerations that are necessary to ensure that this next generation of technical and economic systems is equitable and...
Published 01/28/24
Summary Building machine learning systems and other intelligent applications is a complex undertaking. This often requires retrieving data from a warehouse engine, adding an extra barrier to every workflow. The RelationalAI engine was built as a co-processor for your data warehouse that adds a greater degree of flexibility in the representation and analysis of the underlying information, simplifying the work involved. In this episode CEO Molham Aref explains how RelationalAI is designed,...
Published 12/31/23
Summary Machine learning and generative AI systems have produced truly impressive capabilities. Unfortunately, many of these applications are not designed with the privacy of end-users in mind. TripleBlind is a platform focused on embedding privacy preserving techniques in the machine learning process to produce more user-friendly AI products. In this episode Gharib Gharibi explains how the current generation of applications can be susceptible to leaking user data and how to counteract those...
Published 11/22/23
Summary Software development involves an interesting balance of creativity and repetition of patterns. Generative AI has accelerated the ability of developer tools to provide useful suggestions that speed up the work of engineers. Tabnine is one of the main platforms offering an AI powered assistant for software engineers. In this episode Eran Yahav shares the journey that he has taken in building this product and the ways that it enhances the ability of humans to get their work done, and...
Published 11/13/23
Summary Software systems power much of the modern world. For applications that impact the safety and well-being of people there is an extra set of precautions that need to be addressed before deploying to production. If machine learning and AI are part of that application then there is a greater need to validate the proper functionality of the models. In this episode Erez Kaminski shares the work that he is doing at Ketryx to make that validation easier to implement and incorporate into the...
Published 11/08/23
Summary Large language models have gained a substantial amount of attention in the area of AI and machine learning. While they are impressive, there are many applications where they are not the best option. In this episode Piero Molino explains how declarative ML approaches allow you to make the best use of the available tools across use cases and data formats.
Published 10/24/23
Summary Artificial Intelligence is experiencing a renaissance in the wake of breakthrough natural language models. With new businesses sprouting up to address the various needs of ML and AI teams across the industry, it is a constant challenge to stay informed. Matt Turck has been compiling a report on the state of ML, AI, and Data for his work at FirstMark Capital. In this episode he shares his findings on the ML and AI landscape and the interesting trends that are...
Published 10/15/23
Summary A core challenge of machine learning systems is getting access to quality data. This often means centralizing information in a single system, but that is impractical in highly regulated industries, such as healthcare. To address this hurdle Rhino Health is building a platform for federated learning on health data, so that everyone can maintain data privacy while benefiting from AI capabilities. In this episode Ittai Dayan explains the barriers to ML in healthcare and how they have...
Published 09/11/23
Summary Satellite imagery has given us a new perspective on our world, but it is limited by the field of view for the cameras. Synthetic Aperture Radar (SAR) allows for collecting images through clouds and in the dark, giving us a more consistent means of collecting data. In order to identify interesting details in such a vast amount of data it is necessary to use the power of machine learning. ICEYE has a fleet of satellites continuously collecting information about our planet. In this...
Published 06/17/23
Summary The focus of machine learning projects has long been the model that is built in the process. As AI powered applications grow in popularity and power, the model is just the beginning. In this episode Josh Tobin shares his experience from his time as a machine learning researcher up to his current work as a founder at Gantry, and the shift in focus from model development to machine learning systems.
Published 05/29/23
Summary Machine learning models have predominantly been built and updated in a batch modality. While this is operationally simpler, it doesn't always provide the best experience or capabilities for end users of the model. Tecton has been investing in the infrastructure and workflows that enable building and updating ML models with real-time data to allow you to react to real-world events as they happen. In this episode CTO Kevin Stumpf explores the benefits of real-time machine learning and...
Published 03/09/23
Summary Shopify uses machine learning to power multiple features in their platform. In order to reduce the amount of effort required to develop and deploy models they have invested in building an opinionated platform for their engineers. They have gone through multiple iterations of the platform and their most recent version is called Merlin. In this episode Isaac Vidas shares the use cases that they are optimizing for, how it integrates into the rest of their data platform, and how they...
Published 02/02/23
Summary All data systems are subject to the "garbage in, garbage out" problem. For machine learning applications bad data can lead to unreliable models and unpredictable results. Anomalo is a product designed to alert on bad data by applying machine learning models to various storage and processing systems. In this episode Jeremy Stanley discusses the various challenges that are involved in building useful and reliable machine learning models with unreliable data and the interesting problems...
Published 01/24/23
Summary Building a machine learning model one time can be done in an ad-hoc manner, but if you ever want to update it and serve it in production you need a way of repeating a complex sequence of operations. Dagster is an orchestration engine that understands the data that it is manipulating so that you can move beyond coarse task-based representations of your dependencies. In this episode Sandy Ryza explains how his background in machine learning has informed his work on the Dagster project...
Published 12/02/22
Summary Machine learning is a data-hungry approach to problem solving. Unfortunately, there are a number of problems that would benefit from the automation provided by artificial intelligence capabilities that don’t come with troves of data to build from. Christopher Nguyen and his team at Aitomatic are working to address the "cold start" problem for ML by letting humans generate models by sharing their expertise through natural language. In this episode he explains how that works, the...
Published 09/28/22
Summary Data is one of the core ingredients for machine learning, but the format in which it is understandable to humans is not a useful representation for models. Embedding vectors structure data in a form that is native to how models interpret and manipulate information. In this episode Frank Liu shares how the Towhee library simplifies the work of translating your unstructured data assets (e.g. images, audio, video, etc.) into embeddings that you can use efficiently for machine...
Published 09/21/22
Summary Because machine learning models are constantly interacting with inputs from the real world they are subject to a wide variety of failures. The most commonly discussed error condition is concept drift, but there are numerous other ways that things can go wrong. In this episode Wojtek Kuberski explains how NannyML is designed to compare the predicted performance of your model against its actual behavior to identify silent failures and provide context to allow you to determine whether...
Published 09/14/22
Summary Using machine learning in production requires a sophisticated set of cooperating technologies. A majority of resources that are available for understanding how to design and operate these platforms are focused on either simple examples that don’t scale, or over-engineered technologies designed for the massive scale of big tech companies. In this episode Jacopo Tagliabue shares his vision for "ML at reasonable scale" and how you can adopt these patterns for building your own...
Published 09/10/22
Summary The increasing sophistication of machine learning has enabled dramatic transformations of businesses and introduced new product categories. At Assembly AI they are offering advanced speech recognition and natural language models as an API service. In this episode founder Dylan Fox discusses the unique challenges of building a business with machine learning as the core product.
Published 09/09/22
Summary The majority of machine learning projects that you read about or work on are built around batch processes. The model is trained, and then validated, and then deployed, with each step being a discrete and isolated task. Unfortunately, the real world is rarely static, leading to concept drift and model failures. River is a framework for building streaming machine learning projects that can constantly adapt to new information. In this episode Max Halford explains how the project works,...
Published 08/26/22