
The Shape of Sound: What SpeechCompass Teaches Us About Directional UX & Inclusive Design

By CMCD Research

Behavioral Systems • Accessible Design • Emerging Interfaces


When Google Research published their deep dive into Sound Localization for Group Captions, it wasn’t just an accessibility win. It was a signal: UX design is about to get spatial in a whole new way.

[Image: A group of people in conversation, with one person using SpeechCompass to speak with and listen to the others. Caption: Spatial technology is the future of UX.]

At Commit Me Co Design (CMCD), we’ve long argued that accessibility is a systems-level UX choice. And SpeechCompass (CHI 2025 Best Paper) proves it: real-time captioning that doesn’t just transcribe speech, but orients it. Visually. Spatially. Intelligently.


Let’s break it down.


🔊 What Is SpeechCompass?


SpeechCompass is a system developed by Google researchers to improve mobile captioning in noisy, multi-person environments.


Unlike standard live-caption apps, SpeechCompass uses a four-microphone array and real-time sound localization to:


  • Distinguish one speaker from another

  • Estimate where they’re speaking from

  • Visually display directional guidance on a smartphone screen alongside captions


This creates a “compass-like” captioning interface — a literal map of the conversation.
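
For the technically curious, here is a minimal sketch of the general idea behind finding direction with microphones. It is a textbook time-difference-of-arrival estimate for a single pair of microphones, not SpeechCompass's actual algorithm, which this post does not describe. The function name, the sign convention, and the 10 cm spacing are all assumptions made for illustration.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air at room temperature


def angle_from_delay(delay_s: float, mic_spacing_m: float) -> float:
    """Estimate the angle of arrival (degrees) for one pair of microphones.

    Assumed convention (hypothetical): delay_s is the arrival time at the
    left microphone minus the arrival time at the right microphone, so a
    positive delay means the sound reached the right microphone first.
    0 degrees is straight ahead; positive angles are to the right.
    Uses the far-field model: sin(angle) = speed_of_sound * delay / spacing.
    """
    if mic_spacing_m <= 0:
        raise ValueError("mic_spacing_m must be positive")
    ratio = SPEED_OF_SOUND_M_S * delay_s / mic_spacing_m
    # Noisy measurements can push the ratio slightly outside [-1, 1].
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))


# Example: microphones 10 cm apart, sound arrives at the right one first.
example_angle = angle_from_delay(0.05 / SPEED_OF_SOUND_M_S, 0.10)
```

With these example numbers the estimate comes out at roughly 30 degrees to the right. A single pair of microphones cannot tell front from back, which is one reason a design might use more microphones for a full 360-degree view.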


From the Google blog:


“We heard from participants that the ability to quickly know who said what — and where they are — was essential for engagement and emotional clarity in conversation.”

🧬 Why CMCD Is Paying Attention


We’re not hardware engineers; we’re behavioral systems designers. And this project touches several UX megatrends we’re tracking for 2025–2026:


1. Spatial Interfaces Are the Future of Mobile UX

As phones blur into wearables and AR devices, interfaces will need to reflect physical space. SpeechCompass is one of the first prominent examples of what we’d call spatial UX affordances: design that maps information onto physical direction and position, not just onto a flat screen.


Takeaway: Don’t just ask “What was said?” Ask “From where?”, “By whom?”, and “In what rhythm or cadence?” Information design is no longer flat.



2. Cognitive Load ≠ Comprehension

The standard caption stream assumes a single, linear source of speech. In reality, group conversations are messy: overlapping voices, unknown speakers, noisy cafés. By separating speakers and adding directional cues, SpeechCompass reduces cognitive load, a concept CMCD uses heavily in Behavioral UX Audits. For users, that can mean faster comprehension, lower anxiety, and more trust.


Takeaway: Your interface isn’t just about presenting data. It’s about reducing effort to interpret it.
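
To make "separating speakers" concrete, here is a small sketch. It assumes each caption segment already arrives with an estimated direction in degrees, groups segments into fixed direction sectors, and labels each sector as a speaker. The function name, the 45-degree sector width, and the input format are hypothetical and chosen only for illustration.

```python
def label_by_direction(segments, sector_width_deg=45.0):
    """Attach a speaker label to each (angle_deg, text) caption segment.

    Segments whose angles fall in the same sector share a label. Labels are
    numbered in the order each sector is first heard.
    """
    if sector_width_deg <= 0:
        raise ValueError("sector_width_deg must be positive")
    speakers = {}  # sector index -> speaker label
    labeled = []
    for angle_deg, text in segments:
        # Wrap the angle into [0, 360) and find which sector it falls in.
        sector = int((angle_deg % 360.0) // sector_width_deg)
        if sector not in speakers:
            speakers[sector] = f"Speaker {len(speakers) + 1}"
        labeled.append((speakers[sector], text))
    return labeled


conversation = [
    (10.0, "Shall we order?"),
    (200.0, "Give me a minute."),
    (15.0, "No rush."),
]
labeled_conversation = label_by_direction(conversation)
```

Here the first and third lines land in the same sector and share the label "Speaker 1", while the second gets "Speaker 2". A real system would need something more robust than fixed sectors, for example when a speaker sits on a sector boundary or moves around the table.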



3. Accessibility as Infrastructure, Not Add-On

The project was co-designed with D/deaf and Hard-of-Hearing participants from the start. This wasn’t a tacked-on feature — it was an infrastructure-first decision to build around the needs of often-excluded users.


Takeaway: Accessibility doesn’t slow down innovation — it leads it. Many of the best UX breakthroughs of the next decade will come from designing for the edges of usability.



4. Low-Power, Localized Intelligence

SpeechCompass runs on a low-power, real-time audio processor, enabling on-device, offline usage. That’s crucial for privacy, portability, and broader adoption.

This mirrors CMCD’s belief in intelligent, autonomous front-ends that don’t always depend on cloud processing.


Takeaway: The future of interaction is local, private, fast — and personalized.



🎯 What This Means for CMCD Clients & Collaborators


Whether you’re a nonprofit, arts collective, or product studio, here’s how this research should inform your next digital move:


  • Spatial design is not just for AR. Even websites and mobile apps can guide user attention directionally (think scroll, hierarchy, microanimations, and behavioral prompts).

  • Captioning ≠ accessibility-only. It’s a storytelling tool. Imagine using localized captioning in brand videos, live events, or onboarding flows to guide engagement.

  • Design for chaos. Overlap, noise, and messiness are the norm. If your system can’t handle real-world entropy, it’s not designed for real people.


Your brand doesn’t just need a caption.

It needs a compass.


Let’s design it together.



 
 
 
