Summary
Software development involves an interesting balance of creativity and repetition of patterns. Generative AI has accelerated the ability of developer tools to provide useful suggestions that speed up the work of engineers. Tabnine is one of the main platforms offering an AI-powered assistant for software engineers. In this episode Eran Yahav shares the journey that he has taken in building this product, the ways that it enhances the ability of humans to get their work done, and the situations where humans have to adapt to the tool.
Announcements
Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery.
Your host is Tobias Macey and today I'm interviewing Eran Yahav about building an AI-powered developer assistant at Tabnine.
Interview
Introduction
How did you get involved in machine learning?
Can you describe what Tabnine is and the story behind it?
What are the individual and organizational motivations for using AI to generate code?
What are the real-world limitations of generative AI for creating software? (e.g. size/complexity of the outputs, naming conventions, etc.)
What are the elements of skepticism/oversight that developers need to exercise while using a system like Tabnine?
What are some of the primary ways that developers interact with Tabnine during their development workflow?
Are there any particular styles of software for which an AI is more appropriate/capable? (e.g. webapps vs. data pipelines vs. exploratory analysis, etc.)
For natural languages there is a strong bias toward English in the current generation of LLMs. How does that translate into computer languages? (e.g. Python, Java, C++, etc.)
Can you describe the structure and implementation of Tabnine?
Do you rely primarily on a single core model, or do you have multiple models with subspecialization?
How have the design and goals of the product changed since you first started working on it?
What are the biggest challenges in building a custom LLM for code?
What are the opportunities for specialization of the model architecture given the highly structured nature of the problem domain?
For users of Tabnine, how do you assess/monitor the accuracy of recommendations?
What are the feedback and reinforcement mechanisms for the model(s)?
What are the most interesting, innovative, or unexpected ways that you have seen Tabnine's LLM-powered coding assistant used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on AI-assisted development at Tabnine?
When is an AI developer assistant the wrong choice?
What do you have planned for the future of Tabnine?
Contact Info
LinkedIn
Website
Parting Question
From your perspective, what is the biggest barrier to adoption of machine learning today?
Closing Announcements
Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
Links
Tabnine
Technion University
Program Synthesis
Context Stuffing
Elixir
Dependency Injection
COBOL
Verilog
MidJourney
The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0
Support The Machine Learning Podcast