Noise-cancelling headphones have become very effective at creating an auditory blank slate. But letting certain sounds from a wearer’s environment back through that erasure remains a hurdle for researchers. The latest version of Apple’s AirPods Pro, for instance, automatically adjusts sound levels for wearers by sensing when they’re in conversation, but the user has no control over whom to listen to or when this happens.
A team at the University of Washington has created an artificial intelligence system that lets a headphone wearer “enroll” a speaker by looking at them for three to five seconds. The system, called “Target Speech Hearing,” then suppresses all other sounds in the environment and plays only the enrolled speaker’s voice in real time, even as the listener moves around in noisy places and no longer faces the speaker.
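As a rough illustration of that two-phase flow, the Python sketch below learns a profile of the target voice from a few seconds of enrollment audio and then uses it to filter each incoming frame. The team’s system uses trained neural networks for both steps; the simple spectral-profile matching here is only a stand-in to show the control flow, and every class and function name is hypothetical, not from the released code.

```python
import numpy as np

class TargetSpeechHearingSketch:
    """Illustrative two-phase pipeline: enroll a voice, then extract it."""

    def __init__(self, sample_rate: int = 16_000, frame: int = 512):
        self.sample_rate = sample_rate
        self.frame = frame
        self.profile = None  # averaged magnitude spectrum of the enrolled voice

    def enroll(self, clip: np.ndarray) -> None:
        """Learn a spectral profile from a 3-5 s clip recorded while facing the speaker."""
        usable = len(clip) // self.frame * self.frame
        frames = clip[:usable].reshape(-1, self.frame)
        self.profile = np.abs(np.fft.rfft(frames, axis=1)).mean(axis=0) + 1e-8

    def extract(self, frame: np.ndarray) -> np.ndarray:
        """Suppress spectral content that does not match the enrolled profile."""
        assert self.profile is not None and len(frame) == self.frame
        spec = np.fft.rfft(frame)
        mag = np.abs(spec) + 1e-8
        # Attenuate frequencies that are louder than the enrolled profile
        # suggests; the real system replaces this with a learned model.
        gain = np.minimum(1.0, self.profile / mag)
        return np.fft.irfft(spec * gain, n=self.frame)

# Usage with synthetic audio standing in for real recordings:
rng = np.random.default_rng(0)
ths = TargetSpeechHearingSketch()
ths.enroll(rng.standard_normal(5 * 16_000))  # 5 s enrollment clip
clean = ths.extract(rng.standard_normal(512))  # one real-time frame
```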
The team’s findings were presented May 14 in Honolulu at the ACM CHI Conference on Human Factors in Computing Systems. The code for the proof-of-concept device is available for others to build on, though the system is not commercially available.
“We tend to think of AI now as web-based chatbots that answer questions,” said senior author Shyam Gollakota, a UW professor in the Paul G. Allen School of Computer Science & Engineering. “But in this project, we use AI to modify the auditory perception of anyone wearing headphones, based on their preferences. With our devices you can clearly hear a single speaker even in a noisy environment with lots of other people talking.”
The system works with off-the-shelf headphones fitted with microphones. Because the wearer looks at the person talking during enrollment, the sound waves of that voice reach the microphones on both sides of the headset nearly simultaneously; the system tolerates a 16-degree margin of error. Machine learning software then learns the enrolled speaker’s vocal patterns and keeps playing that voice back to the listener, and its focus improves as the speaker keeps talking, giving the system more training data. The team tested the system on 21 subjects, who on average rated the clarity of the enrolled speaker’s voice nearly twice as high as that of the unfiltered audio.
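A minimal sketch of that direction check, for illustration: it estimates the left-right arrival-time difference by cross-correlation and accepts a speaker only if the implied angle falls within the 16-degree margin. The microphone spacing and sample rate below are assumed values, not figures from the paper.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s in air
MIC_SPACING = 0.18      # metres between the two earcup mics (assumed)
SAMPLE_RATE = 16_000    # Hz (assumed)

def estimate_tdoa(left: np.ndarray, right: np.ndarray) -> float:
    """Relative delay between the channels, in seconds (sign = which ear leads)."""
    corr = np.correlate(left, right, mode="full")
    lag = np.argmax(corr) - (len(right) - 1)  # lag in samples; 0 = simultaneous
    return lag / SAMPLE_RATE

def facing_speaker(left: np.ndarray, right: np.ndarray,
                   margin_deg: float = 16.0) -> bool:
    """True if the dominant source lies within margin_deg of straight ahead."""
    # A source at angle theta off-axis adds a path difference of
    # MIC_SPACING * sin(theta), hence a delay of that over the speed of sound.
    max_delay = MIC_SPACING * np.sin(np.radians(margin_deg)) / SPEED_OF_SOUND
    return abs(estimate_tdoa(left, right)) <= max_delay

# Demo with synthetic audio: a ~0.06 ms shift passes, a ~0.5 ms shift fails.
rng = np.random.default_rng(1)
voice = rng.standard_normal(4096)
print(facing_speaker(voice, np.roll(voice, 1)))  # True: well within 16 degrees
print(facing_speaker(voice, np.roll(voice, 8)))  # False: source too far off-axis
```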