We are entering a multi-species civilization. Humans and AIs will work together, think together, live alongside each other. Voice is the interface for that world—the highest-bandwidth channel we have, and the one that every human already shares.
Extrian is an applied audio research lab. We build datasets, models, and evaluation frameworks for end-to-end speech systems—preserving the full dimensionality of human expression across languages, dialects, and speaking styles. Our work sits at the foundation of the next generation of voice-native AI.
This is the interface between species.
The largest conversational audio collection effort ever undertaken. Millions of hours. Research-ready.
Why existing benchmarks underweight naturalness, and a proposed framework for measuring it.
What high-fidelity, linguistically diverse speech data unlocks for end-to-end model performance.
Turn-taking, backchannels, and the acoustic structure of natural dialogue.