Deep Papers
Listen now
More Episodes
Due to the cumbersome nature of human evaluation and limitations of code-based evaluation, Large Language Models (LLMs) are increasingly being used to assist humans in evaluating LLM outputs. Yet LLM-generated evaluators often inherit the problems of the LLMs they evaluate, requiring further...
Published 05/13/24
This week we explore ReAct, an approach that enhances the reasoning and decision-making capabilities of LLMs by combining step-by-step reasoning with the ability to take actions and gather information from external sources in a unified framework. To learn more about ML observability, join the...
Published 04/26/24
This week, we’ve covering Amazon’s time series model: Chronos. Developing accurate machine-learning-based forecasting models has traditionally required substantial dataset-specific tuning and model customization. Chronos however, is built on a language model architecture and trained with billions...
Published 04/04/24