Technology gives a voice to those who face speaking challenges. Researcher Keith Vertanen's work to create faster, more fluid augmentative and alternative communication (AAC) interfaces has earned recognition and funding from the National Science Foundation.
Vertanen, an assistant professor in the Michigan Technological University Computer Science Department, received the NSF CAREER Award, a milestone for up-and-coming researchers that recognizes outstanding achievement and potential. The award will provide $538,799 in funding for Vertanen's five-year project, "Technology Assisted Conversations." The project goal: improve AAC devices that help non-speaking individuals, including those with physical or cognitive disabilities, and their communication partners have faster and more interactive face-to-face conversations.
"Over the past decades, we have witnessed how technology has changed the way people live, work and receive entertainment. Yet the level of change for people who face speaking challenges is not as dramatic. I am proud that Dr. Vertanen is going to address this important issue."
The project builds on Vertanen's experience in text entry research. In 2015, his paper investigating fast mobile text entry using VelociTap, an algorithm that auto-corrects a user's input even when they miss most of their intended keys, won a CHI (Computer Human Interaction) Conference Series best paper award. In 2016, Vertanen received a Google Faculty award to investigate accelerating text input via abbreviations.
"The computer science department has excelled in both teaching and research. Faculty and staff work extremely hard to enhance the student learning experience through high-quality teaching and cutting-edge research. Dr. Vertanen is one of the high performers. His CAREER project will undoubtedly improve the department’s mission," Song says.
More Than a Two-Way Street
"Spoken conversation can reach speeds of 150-plus words per minute," Vertanen says. "But users of predictive AAC devices can average less than 10 words per minute. This rate differential is incredibly problematic for the everyday conversations you and I take for granted. A key challenge is to accelerate the communication speed of these users," Vertanen says. Turn-taking in larger groups can be even more problematic. "By the time an AAC user produces a message, it may no longer be relevant to the conversation. Improving other aspects such as this are also important pieces to the puzzle."
Algorithms will incorporate both speech recognition and speaker identification techniques to inform the predictions of the AAC device. "If I'm an AAC user, what you're saying is relevant to what I'm likely to say in response. Who you are is relevant as well. You have different conversations with different people," Vertanen says. Making such context-sensitive predictions both accurate and robust is a key challenge. Vertanen's approach will require collecting unique types of conversational data as well as leveraging the latest advances in language modeling.
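The idea of conditioning predictions on the conversational context can be illustrated with a toy sketch. This is not Vertanen's actual algorithm, and every class name, weight, and piece of data below is invented for illustration: it simply mixes a per-partner word-frequency prior with a boost for words that echo the partner's most recent turn.

```python
from collections import Counter

class ContextAwarePredictor:
    """Toy illustration of context-sensitive word prediction.

    Ranks likely reply words by combining (a) how often this AAC user
    has used each word with this particular partner before, and (b) a
    boost for words appearing in the partner's latest utterance.
    All weights and data here are hypothetical.
    """

    def __init__(self):
        # word counts observed per conversation partner (hypothetical data)
        self.partner_counts = {}

    def observe(self, partner_id, reply_words):
        # record words the user actually produced with this partner
        self.partner_counts.setdefault(partner_id, Counter()).update(reply_words)

    def predict(self, partner_id, partner_utterance, top_n=3):
        counts = self.partner_counts.get(partner_id, Counter())
        total = sum(counts.values()) or 1
        context = set(partner_utterance)
        scored = {}
        for word, count in counts.items():
            score = count / total      # speaker-specific prior
            if word in context:
                score *= 2.0           # arbitrary boost for contextual relevance
            scored[word] = score
        ranked = sorted(scored.items(), key=lambda kv: -kv[1])
        return [word for word, _ in ranked[:top_n]]
```

A real system would replace the frequency table with a trained language model and the fixed boost with learned weights, but the structure, a prior over the user's vocabulary reshaped by who is speaking and what was just said, is the same.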
The project will also explore having the communication partner actively participate in improving the AAC device's predictions. "Often in conversations, the communication partner helps to co-construct a response. They know the person and the conversational context. They may be able to short-circuit things by guessing what the user is about to say. We'll explore whether users like a technology-enabled version of that," Vertanen says. "We'll look at having the communication partner provide hints that feed the system's predictions. As a bonus, it helps keep the partner actively engaged while waiting for the next turn in the dialogue."
Involving people who use the technology is a vital part of the innovation process. "It's important to reach out and engage AAC users, to guide the research and allow them to contribute to the research that affects them," says Vertanen. Users will be involved in designing the interfaces the research team is building. To help engage with end-users, Vertanen assembled an advisory panel of AAC users, industry practitioners and researchers.
Creating an Inclusive Community
As part of the project, Vertanen plans to build an online community he calls ImagineVille. "The idea is to have volunteers engage in various conversational tasks. For example, two people might have a conversation in which one speaks while the other responds using an interface that only allows slow typing. This allows us to collect the unique types of data we need. It will also generate public awareness about the challenges associated with having a conversation when one person has a much slower speaking rate. I hope actual AAC users will also participate, providing authentic data for comparison," Vertanen says. Whether ImagineVille will take off remains to be seen, but he's hopeful.
"People are willing to dump a bucket of ice water over their head to raise money for ALS. I think people will be willing to spend time chatting online with strangers to help improve the interfaces people with ALS may use to communicate."
Most of the project will take place at Michigan Tech. But some research will happen virtually and in other locations. An educational element is woven throughout the project, including working with Michigan Tech undergraduates on software development. The project will also host a Summer Youth Program on the technology behind texting, with scholarships offered to young women and other underrepresented groups, including students with disabilities. "Texting is a ubiquitous part of everyday life for many young people. They text all the time and in all sorts of situations, often typing quite quickly and inaccurately. The auto-correction that bails them out is actually quite simple. You tally up how often words occur in a large amount of text, say a novel. You then use basic probability theory to make educated guesses about what a person actually intended to type," Vertanen says.
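The tally-and-guess approach Vertanen describes can be sketched in a few lines, in the style of the well-known toy spelling corrector popularized by Peter Norvig. This is an illustration, not the algorithm behind any particular product: count word frequencies in a body of text, then, for an unknown typed word, pick the most frequent known word that is one edit away.

```python
from collections import Counter
import re

def train_counts(text):
    # tally how often each word occurs in a body of text (e.g. a novel)
    return Counter(re.findall(r"[a-z']+", text.lower()))

def edits1(word):
    # every string one keystroke-style edit away:
    # a deletion, adjacent swap, replacement, or insertion
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
    inserts = [a + c + b for a, b in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)

def correct(word, counts):
    # keep the word if it's known; otherwise guess the most frequent
    # known word within one edit -- a crude application of probability
    if word in counts:
        return word
    candidates = [w for w in edits1(word) if w in counts]
    return max(candidates, key=counts.get) if candidates else word
```

Here the "educated guess" is simply the candidate with the highest count, i.e. the most probable word under a unigram model; production systems add keyboard-geometry error models and context, but the probabilistic core is the same.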
Vertanen was working on his master's in speech and language processing at the University of Cambridge when he met his PhD advisor, who introduced him to "this crazy interface called Dasher. Basically, you zoomed through these boxes of letters to write. Dasher is particularly good for people who communicate via an eye- or head-tracker. We were demoing Dasher at a conference on assistive technologies when I first realized the scope of the problem. The vendor hall was chock-full of predictive AAC devices and more or less their predictions all sucked," Vertanen says. "Despite decades of research in speech and language processing, not much seemed to trickle down. Communication is a fundamental human right, and I think there is wide scope for computer scientists to improve the communication ability of those who speak via technology." In the end, Vertanen's doctoral work had little to do with this interface. But Dasher is back.
The project includes a series of enhancements to Dasher, as well as a new tablet interface and a helper app that will allow conversation partners to offer suggestions to the AAC interface.
Can We Talk?
The project has implications far beyond the scope of communication solutions for those with cognitive or physical disabilities. "The primary focus of the project is to improve the technology for people who are non-speaking. People who use technology as their voice. But the things we're developing could have application to everyone," Vertanen says. For example, ever have trouble remembering names? Or hearing conversations in a crowded room? What about an augmented reality headset to help you speak a language you're not fluent in? All are possibilities based on the project's research into conversational interactive systems.