37 episodes

Welcome to The Voice Box, a podcast all about the fascinating world of voice and speech technology! Every week, we bring you in-depth interviews with experts in the field, covering everything from the latest advances in natural language processing to the ethical implications of voice assistants.


Whether you're a developer working on your own voice-powered project, a business trying to elevate your customer experience through conversational AI, bots, or automated quality controls, or simply someone who's interested in the future of how we interact with technology, The Voice Box has something for you. Our guests come from a wide range of backgrounds, including academia, industry, and startups, giving you a well-rounded perspective on the state of the field.


In each episode, we delve into the business and technical details of how voice and speech technology works, exploring topics like machine learning, signal processing, natural language understanding and processing, and linguistics. But we also go beyond the nuts and bolts to discuss the broader implications of these technologies for society. How will voice assistants change the way we live and work? What are the ethical considerations surrounding their use? How will voice biometrics secure and protect our accounts? We tackle these and other big questions on The Voice Box.


So, join us as we explore the exciting and rapidly evolving world of voice and speech technology. With The Voice Box, you'll stay up to date on the latest developments and gain a deeper understanding of the technologies shaping our future.


The Voice Box is hosted by brothers Jeff and Darin Adams. Jeff has a 30-year history working in speech & language technology. He led R&D teams at Dragon, Nuance, Yap, and Amazon before founding Cobalt Speech & Language. Darin has a 30-year history in television broadcasting and is a consummate interviewer. With Jeff and Darin, you will always be informed and entertained.


The Voice Box is sponsored by Cobalt Speech and Language, a leading provider of voice technology to companies around the world. Drop us a line to explore how we can help you win with voice: info@cobaltspeech.com


Podcast theme music is graciously provided by The Garden Plot. (Spotify)

The Voice Box Jeff Adams

    • Technology
    • 5.0 • 7 Ratings


    David Forman - Clarity Creative


    David Forman is a visionary entrepreneur and digital marketing expert, renowned for his innovative use of speech and voice technology to revolutionize business operations. 


    With a background in corporate management and a passion for maximizing efficiency, David has consistently sought out cutting-edge solutions to streamline processes and drive success.


    As the founder of Clarity Creative, a leading digital marketing agency, David has spearheaded initiatives to leverage speech recognition technology, notably utilizing Otter.ai to record and transcribe meetings seamlessly. 


    His forward-thinking approach has not only transformed how his team communicates and collaborates but has also significantly enhanced client satisfaction and project turnaround times.


    David's philosophy of "productive laziness" underscores his commitment to finding efficient solutions to complex challenges, inspiring his team to embrace innovation and prioritize effectiveness in their endeavors. 


    Beyond his entrepreneurial pursuits, David is dedicated to sharing his expertise and insights, actively engaging with industry peers to explore the vast potential of speech and voice technology in diverse contexts.


    With a keen focus on continual improvement and a drive to push the boundaries of what is possible, David Forman stands at the forefront of the digital marketing landscape, guiding businesses toward greater efficiency, productivity, and success through the power of speech technology.






    Main Takeaways


    1. Efficiency through Technology: David demonstrates how embracing speech and voice technology, exemplified by Otter.ai, can significantly enhance productivity and streamline business operations, particularly in recording and transcribing meetings with precision and ease.


    2. Productive Laziness Philosophy: David's concept of "productive laziness" highlights the importance of seeking efficient solutions to tasks and processes, ultimately freeing up time for more meaningful endeavors and driving progress in both personal and professional spheres.


    3. Future of Speech Technology: Looking ahead, David envisions further advancements in speech technology, including automatic summarization of meetings and task assignment based on transcriptions, underscoring the transformative potential of AI-driven solutions in reshaping work dynamics and communication methods.






    Key Timestamps


    [00:01:41] Introduction of David Forman by Darin and Jeff, highlighting his expertise in leveraging speech and voice technology to transform business operations.


    [00:06:21] David explains the concept of "productive laziness" from his previous job, emphasizing the importance of finding efficient solutions to tasks and processes.


    [00:09:55] David discusses his vision for the future of speech technology, including automatic summarization of meetings and task assignment based on transcriptions.


    [00:15:57] Jeff compares Cobalt's speech technology with Otter.ai, highlighting Cobalt's focus on providing core technology for integration into other platforms and discussing ongoing projects in emotion and speaker recognition.


    [00:17:36] Discussion of advancements in speech technology, such as task assignment based on transcriptions, and of the importance of characterizing speakers' voices for enhanced functionality, offering a glimpse into the future of Cobalt's developments.






    Guest Quotes


    1. "Productive laziness is about finding more time to do things efficiently, not wasting time on unnecessary tasks." - David Forman


    2. "Speech and voice technology have unlocked a new level of efficiency in our meetings, allowing us to focus on meaningful tasks rather than tedious note-taking." - David Forman


    3. "Embrace innovation and prioritize effectiveness in your endeavors to drive progress and success in both personal and professional spheres." - David Forman

    • 18 min
    Alyson Pace - Voiceitt


    • 28 min
    Manuj Aggarwal - TetraNoodle Technologies


    Manuj Aggarwal is an expert in artificial intelligence, having worked in the field since 2007. 


    He believes that AI is not just coming; it has been changing the world for at least the past 15 years. From smartphones to voice-enabled devices like Alexa, AI is already all around us.


    Manuj sees a future where AI will not take over humanity or cause mass unemployment, but will instead displace some jobs while creating new and more satisfying opportunities for those who upskill themselves and become early adopters. He is excited about the potential of AI to change the world and looks forward to helping others navigate the transformation it will bring.


    In our interview with him, Manuj shares his optimistic view on AI and its potential to create a harmonious utopian world. He talks about how AI can afford humanity almost anything and build more harmonious relationships. 


    He envisions a more fulfilling lifestyle as a result of AI and notes that AI technology is already present in every device, from smartphones to voice assistants.


    We also discuss how AI will lead to fundamental shifts in how we think about human life, creating hyper-personalized customer service and even personalized medicine and healthcare.  


    Listen to this episode to discover more about the potential benefits of AI, the importance of enhancing voice and speech, and the significance of having an open mind when it comes to adopting new tech.






    Key Takeaways


    • The positive potential of AI in creating a utopian world
    • Efficiency and cost-effectiveness of AI in producing resources
    • AI's ability to simulate all human senses, including speech, and detect COVID through coughs
    • AI's presence in common technology and its optimization of the technology we currently use
    • The displacement of some jobs with the introduction of new technology
    • Machine-led thinking and increased value placed on emotional intelligence
    • Personalization through AI and its potential use in personalized medicine
    • Darin’s personal story and fear of the unknown with AI





    Timestamps


    [00:00:22] Darin’s "Viper" story.


    [00:06:10] AI mimics human senses, detects COVID via cough.


    [00:08:21] Taking action for success.


    [00:11:10] Enhance your voice, and how you can build a community and solve problems.


    [00:15:46] AI recognizes diseases through coughs.


    [00:18:02] Voice recognition technology enables personalized service.






    Quotes


    The Impact of AI on Jobs: "AI is not going to take over humanity. It is not going to cause mass unemployment. Yes, there will be some job displacement, but that always happens when a new groundbreaking technology is introduced."

    The Future of Emotional Intelligence: "We are entering what I call a post thinking era...we are going to be valuing emotional intelligence more than our intellectual intelligence."





    Connect with Manuj 


    LinkedIn - https://www.linkedin.com/in/manujaggarwal/?originalSubdomain=ca 


    Website - https://www.manujaggarwal.com/ 


    Twitter - https://twitter.com/manujagro 

    • 21 min
    Allison Smith - The IVR Voice


    Allison Smith is a well-known voice actress who is famous for her work in the entertainment industry. She has provided her voice for several automated systems, including hotel wake-up calls and workout apps. 


    A great story that Allison shares is about how she once experienced the oddity of hearing her voice wake her up in a hotel room. Allison's husband even downloaded a fitness app with her voice to motivate him at the gym, but eventually switched to a different voice due to her constant encouragement. 


    Despite these amusing experiences, Allison Smith continues to be a highly sought-after voice artist in the industry.


    On today’s episode, we discuss the various advancements in text-to-speech technology.


    We touch on the possibility of developing a system that can detect someone's truthfulness and the challenges of doing so. 


    We also talk about ChatGPT, which is capable of mimicking a human voice flawlessly. Allison suggests that a hybrid approach combining both human and AI voices will be the future of the industry. 






    Key Takeaways


    • Detecting lies and synthetic voices: Allison discusses the challenges of developing a system that can detect when someone is lying or telling the truth, as people have different ways of interpreting what sounds truthful. They also talk about the rise of new synthetic voices that mimic human speech and are used in entertainment, language localization, and other areas.
    • Good IVR and vocal inflections: Allison explains the importance of having good interactive voice response (IVR) prompts that flow naturally and sound like a human conversation. They also discuss the importance of using different vocal inflections for numbers, dates, and other information.
    • Voice-over experience and AI voices: Allison describes her experience as a voice-over artist for various clients, including medical ads that require her to sound cheerful while listing side effects. They also talk about the evolution of text-to-speech technology and the emergence of ChatGPT, which can mimic human voices flawlessly. Allison predicts that a hybrid approach combining human and AI voices will be the future of the industry, and mentions having a voice clone built based on her own voice.
    • Emotional metrics and speech synthesis: Allison discusses the use of bots to analyze emotional metrics in callers' voices to measure urgency in crisis situations, and the possibility of measuring fertility by analyzing the sound of women's voices.





    Timestamps


    [00:00:00] TV job led to tanning salon recording.


    [00:06:22] Text-to-speech technology advances the alarm voiceover industry.


    [00:10:14] Recording Cepstral speech from script fragments.


    [00:14:10] Technology measures emotion and fertility in voice.


    [00:17:48] Technology created better voices, from parametric to concatenative.


    [00:22:17] Synthetic voices used in the entertainment localization industry.


    [00:24:25] Detecting truth and lies is difficult.






    Quotes


    The Trend in AI Voices: "They [AI developers] want to be very conversational... they want it to sound that casual and that conversational."





    The Future of Emotion Detection: "I was really, absolutely blown away by some of the things that they can do. For example, there was one speaker that was talking about if somebody were to call into a crisis line, and they would have Bots that would actually gauge exactly how urgent their request is just by measuring the metrics of their emotion in their voice, which is astounding."





    Connect with Allison


    LinkedIn -  https://www.linkedin.com/in/allisonsmith3/ 


    Website - https://www.theivrvoice.com/ 


    Twitter - https://twitter.com/voicegal 

    • 26 min
    Jillian Domingue - Lucero


    Jillian Domingue is the founder and CEO of Lucero, a youth-driven, therapist-approved gamified wellness app for tweens, teens, and their crew.


    She has a Bachelor’s degree in Human Development and Family Sciences from The University of Texas and over a decade of experience building programs, products, and services to improve the lives of individuals and families. 


    Her experience as a foster mom and daily life as an adoptive mom to two young children inspires and influences her work developing Lucero. She generally enjoys working on big ideas that push the boundaries of what has been done before to maximize positive social outcomes. 


    Jillian also believes that we can find better ways to balance purpose with profit and that if the right technologies, people, ideas, and business models come together, we can and will change the world for the better.






    Key Takeaways


    • We discuss the idea behind the creation of Lucero, and how Jillian and her team at Lucero help young people become more self-aware of their feelings and emotions
    • Creating great conversations between tweens and adults via game-based activities
    • How Jillian is working alongside experienced, creative game creators and therapists to ensure they’re able to impact as many framilies (any combination of youth and adults who want to radically support each other) as possible
    • The massive milestone Jillian hopes to cover, and how voice-to-text technology can help Lucero in ensuring faster and more effective communication between little kids and wellness tech





    Connect with Jillian


    Website - https://lucerospeaks.com/ 


    LinkedIn - https://www.linkedin.com/in/jillian-domingue/ 

    • 24 min
    Itay Baruchi - MyndYou


    Itay is fascinated by how technology and neuroscience can be harnessed to develop new and disruptive tools for healthcare systems.


    Itay did his Ph.D. in Physics and Neuroscience and has over 20 years of experience in inventing and developing cutting-edge technologies. As a researcher, developer, and founder at a variety of startup companies in different sectors such as optics and renewable energy, Itay has an extensive background in leading multidisciplinary technological teams from an idea stage up to a working product.


    Interested in the idea that cognitive monitoring can improve healthcare systems and enhance diagnostics and care, he leads the technical efforts at MyndYou.






    Key Takeaways


    • Itay describes the idea behind the creation of the AI-powered platform called MyEleanor, and how she works
    • Who is Eleanor intended for, and what section of the population does she mainly check in on?
    • Itay shares with us the kind of questions MyEleanor asks to collect data from verbal cues and detect deteriorating health
    • Success stories of patients who have built a great relationship with MyEleanor
    • With the health system in the US incentivized to reduce readmission rates, we discuss why MyEleanor would be an effective way of significantly reducing patient readmissions and freeing up hospital resources for those who really need them
    • Who benefits more from an Eleanor call: the patient, the care nurses, or the health system, or does it ensure success for all of them?
    • How widespread is MyEleanor, and how can someone in need get access to the product?





    Connect with Itay


    Website - https://myndyou.com/ 


    LinkedIn - https://www.linkedin.com/in/itaybaruchi/ 


    Twitter - https://twitter.com/MyndYou_ 

    • 27 min

