AssemblyAI

Software Development

San Francisco, California 41,780 followers

Industry-leading Speech AI models to automatically recognize and understand speech.

About us

AssemblyAI is the best way to build Voice AI apps. We build the industry’s best speech-to-text and speech understanding models, including promptable speech recognition, that serve as critical infrastructure for top Voice AI products like Granola, Dovetail, Ashby, and Cluely. Our speech-to-text models lead the industry in accuracy and quality, so you can build reliable product experiences on top of voice data. And our Speech Understanding models help you go beyond transcription to uncover insights, identify speakers, and highlight key information. We make it simple to get started, with a developer-first API and usage-based pricing that scales effortlessly to millions of hours.

Website
http://www.assemblyai.com
Industry
Software Development
Company size
51-200 employees
Headquarters
San Francisco, California
Type
Privately Held
Founded
2017


Updates

  • If you're at HumanX and building anything that touches customer conversations (support, CX, or voice), this one's worth your time. 👇 Our CEO, Dylan Fox, is joining ASAPP, NiCE, and TLDR for a panel on Turning Customer Conversations into Action. Expect a practical look at how teams are going from raw conversations → structured insight → real decisions and workflows. 📍 The Grove Theater 🗓 April 7 | 3:15–3:40 PM. If you're thinking about how conversation data plugs into your stack (or your business), come by!

  • We talk about ambient scribes a lot. So when S2 E2 of The Pitt opened with one going wrong, it felt close to home. Well-timed, right before the launch of Medical Mode. A doctor walks in, pulls up the AI-generated chart, and reads: "It says she takes Risperdal... and she takes Restoril when needed for sleep." Accuracy was the butt of the joke: "AI. Almost intelligent." That's the problem with general-purpose ASR in clinical settings: it hits 95%+ accuracy on a consult, yet might get "hydrochlorothiazide" wrong every time. Our new Medical Mode is optimized for medical entity recognition: one parameter to enable, and it works on both pre-recorded and streaming audio. We had a little too much fun making this. Watch our (corrected) clip 👇

  • AssemblyAI reposted this

    Dylan Fox

    We looked through 22 popular Voice AI datasets, including VoxPopuli, Earnings-22, and AfriSpeech: datasets with 25K+ downloads/month. They're full of errors. Wrong company names. Wrong people names. Entire languages dropped from multilingual audio. Hundreds of sections marked <inaudible>. When we shipped Universal-3 Pro a few weeks ago, WER went up on some benchmarks. So we dug in. Listened to the audio. Read the transcripts side by side. What we found was very surprising: our model was beating the human ground truth, and getting penalized for it! I wrote up everything we found, with real audio examples you can listen to yourself. Full post: https://lnkd.in/ePfY-ZXw
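The penalty Dylan describes falls directly out of how WER is computed. A minimal sketch in plain Python (our illustration, not AssemblyAI's evaluation code): WER is word-level edit distance against the reference, so when the reference transcript itself contains an error, a model that transcribes the audio correctly still gets charged an edit.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming edit distance over words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[len(ref)][len(hyp)] / len(ref)

# Hypothetical example: the human labeler gave up on a proper noun,
# but the model transcribed the audio correctly.
flawed_reference = "the CEO of <inaudible> reported strong earnings"
model_output = "the CEO of AssemblyAI reported strong earnings"

# The model is penalized one substitution despite being right: 1/7 ≈ 0.14
print(wer(flawed_reference, model_output))
```

Note the asymmetry: the metric has no notion of which side is actually correct; it only trusts whichever string is designated as the reference.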

  • AssemblyAI reposted this

    Ryan Seams

    When we launched Universal-3 Pro, we knew the model was good. We didn't expect it to consistently outperform human transcribers. Right after launch, customers told us: "the model is failing our evals." So we dug in. Turns out the problem wasn't always the model: human labels are far from perfect. In many cases, the model is now actually more accurate than the ground truth. Which leads to a bigger issue: WER-based evals are breaking. Join AssemblyAI next week for a workshop where we cover:
    - common errors in human-labeled files that affect WER
    - how to quickly spot and fix issues with your labeled files
    - going beyond WER to run a proper eval
    - what a vibe eval is and how to scale one up
    Voice AI models are evolving fast. Your eval strategy needs to keep up.
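One illustrative way to "go beyond WER", sketched under our own assumptions rather than taken from the workshop material: score only the entity tokens you actually care about (drug names, customer names), so a single critical miss isn't diluted by hundreds of correctly transcribed filler words.

```python
def entity_error_rate(reference: str, hypothesis: str, entities: set[str]) -> float:
    """Fraction of entity mentions in the reference that are missing
    from the hypothesis. A deliberately simple, order-insensitive sketch."""
    ref_terms = [w for w in reference.lower().split() if w in entities]
    if not ref_terms:
        return 0.0
    hyp_terms = set(hypothesis.lower().split())
    missed = [t for t in ref_terms if t not in hyp_terms]
    return len(missed) / len(ref_terms)

# Hypothetical clinical example: aggregate word accuracy looks fine
# (1 error in 6 words), but half the drug mentions are wrong.
drugs = {"hydrochlorothiazide", "risperdal", "restoril"}
ref = "she takes risperdal and hydrochlorothiazide daily"
hyp = "she takes risperdal and hydrochloride daily"

print(entity_error_rate(ref, hyp, drugs))  # 0.5
```

A production eval would align words positionally and handle multi-word entities; the point here is only that weighting every token equally, as WER does, hides exactly the errors that matter most.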

  • Three companies. Three different voice AI use cases. One thing they all agreed on: transcription quality is key 🔑. Last night they got together at Granola's London office to talk about it. A few themes kept surfacing:
    🔹 Speaker diarisation isn't a nice-to-have; it's foundational
    🔹 Domain-specific terminology accuracy makes or breaks real-world deployments
    🔹 The real-time vs. post-call trade-off looks different depending on your product
    🔹 Multilingual support and voice tonality detection are the next frontier
    Hearing how differently each company has shaped its pipeline was a reminder of how much surface area "voice AI" actually covers. Special shoutout to our speakers: Jonathan Kim (Granola), Adrien Wald (CoLoop), and Shane Lynn (EdgeTier), moderated by Ryan Seams. 🌟 Thanks to everyone who came out. 📍 London, we'll be back.

  • General-purpose ASR: 95%+ accuracy on a clinical consult. Also general-purpose ASR: gets "hydrochlorothiazide" wrong every time. Introducing Medical Mode, a correction pass on top of Universal-3 Pro optimized for medical entity recognition. Enable it with one parameter. The real failure mode isn't the transcript; it's what comes next. Most healthcare AI pipelines feed transcripts into an LLM to produce SOAP notes, discharge summaries, and referral letters. Wrong drug name in, wrong drug name out. Errors don't attenuate. They propagate. Medical Mode catches them before they get that far. Works on both pre-recorded and streaming audio. No commitments or up-charges for BAAs to meet HIPAA compliance. $0.15/hr. See our benchmarks here → https://lnkd.in/gjvknmCA Test with your own audio → https://lnkd.in/gA97USAW

  • Medical Mode is now available for clinical workflows. We built Medical Mode because a transcript that's 95% accurate can still be unusable in a clinical setting. Errors in general-purpose ASR are often concentrated on exactly the tokens clinicians care about most: drug names, dosages, and clinical terminology. "Lisprohumalog" is a phonetically reasonable guess. It's also not a real medication. Most healthcare AI products feed a transcript into an LLM to produce structured output. A wrong drug name in the transcript becomes a wrong drug name in the SOAP note, the discharge summary, the referral letter. Errors don't attenuate through the pipeline. They propagate. Medical Mode runs a correction pass optimized specifically for medical entity recognition: drug names, procedures, clinical terminology. The base model's noise handling and latency characteristics stay the same. Medical Mode just refines the output on the tokens that actually matter. Works on both Universal-3 Pro pre-recorded and Universal-3 Pro Streaming. No commitments or up-charges for BAAs to meet HIPAA compliance.
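Since the posts describe Medical Mode as a single parameter on top of the existing API, enabling it might look roughly like the sketch below. The endpoint and header shape follow AssemblyAI's public v2 transcript API, but the field name `medical_mode` is a placeholder assumption, not the documented parameter; check the AssemblyAI docs for the actual name.

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder; set your real AssemblyAI key

def build_transcript_request(audio_url: str, medical: bool = True) -> urllib.request.Request:
    """Build (but do not send) a transcript request with the
    hypothetical single-parameter medical toggle enabled."""
    payload = {
        "audio_url": audio_url,
        "medical_mode": medical,  # hypothetical parameter name
    }
    return urllib.request.Request(
        "https://api.assemblyai.com/v2/transcript",
        data=json.dumps(payload).encode(),
        headers={
            "authorization": API_KEY,
            "content-type": "application/json",
        },
        method="POST",
    )

req = build_transcript_request("https://example.com/consult.mp3")
print(json.loads(req.data)["medical_mode"])  # True
```

Sending the request would be `urllib.request.urlopen(req)` followed by polling the returned transcript ID; the sketch stops at request construction so it stays runnable offline.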


Funding

AssemblyAI: 6 total rounds

Last round: Series C, US$ 50.0M
