Episodes
The confluence of sensing, communication and computing technologies is allowing capture of and access to data, in diverse forms and modalities, in ways that were unimaginable even a few years ago. These include data that afford the analysis and interpretation of multimodal cues of verbal and non-verbal human behavior to facilitate human behavioral research and its translational applications. They carry crucial information about not only a person’s intent, identity and traits but also underlying attitudes...
Published 03/08/16
Human-like Singing and Talking Machines: Flexible Speech Synthesis in Karaoke, Anime, Smart Phones, Video Games, Digital Signage, TV and Radio Programs. This talk will give an overview of a statistical approach to flexible speech synthesis. To construct human-like talking machines, speech synthesis systems must be able to generate speech in an arbitrary speaker's voice, with various speaking styles in different languages, varying emphasis and focus, and/or emotional...
Published 06/06/15
Clarification in spoken dialogue systems, such as those in mobile applications, often consists of simple requests to “Please repeat” or “Please rephrase” when the system fails to understand a word or phrase. However, human-human dialogues rarely include such questions. When humans ask for clarification of user input such as “I want to travel on XXX”, they typically use targeted clarification questions, such as “When do you want to travel?” However, systems frequently make mistakes when they try to...
Published 05/28/15
Automatically extracting social meaning from language is one of the most exciting challenges in natural language understanding. In this talk I’ll summarize a number of recent results using the tools of natural language processing to help extract and understand social meaning from texts of different sorts. We’ll explore the relationship between language, economics and social psychology in the automatic processing of the language of restaurant menus and reviews. And I’ll show how natural...
Published 10/22/14
What effect does language have on people, and what effect do people have on language? You might say in response, "Who are you to discuss these problems?" and you would be right to do so; these are Major Questions that science has been tackling for many years. But as a field, I think natural language processing and computational linguistics have much to contribute to the conversation, and I hope to encourage the community to further address these issues. To this end, I'll describe two efforts...
Published 04/29/14
As speech recognition continues to improve, new applications of the technology have been enabled. It is now common to search for information and send accurate short messages by speaking into a cellphone - something completely impractical just a few years ago. Another application that has recently been gaining attention is "Spoken Term Detection" - using speech recognition technology to locate key words or phrases of interest in running speech of variable quality. Spoken Term Detection can be...
Published 04/02/14
In contrast to traditional rule-based approaches to building spoken dialogue systems, recent research has shown that it is possible to implement all of the required functionality with statistical models trained by a combination of supervised learning and reinforcement learning. This new approach to spoken dialogue is based on the mathematics of partially observable Markov decision processes (POMDPs) in which user inputs are treated as observations of some underlying belief state, and...
Published 12/20/13