Together AI

Software Development

San Francisco, California 82,631 followers

Accelerate inference, model shaping, and pre-training on a research-optimized platform.

About us

Together AI is the AI Native Cloud, purpose-built for AI engineers and researchers with a full suite of tooling across inference, model shaping, and pre-training. AI natives can use Together AI as a full-stack AI platform — from a high-performance inference engine built for reliable and fast scaling to on-demand GPU clusters and massive-scale AI factories. Together AI continuously pushes the frontier forward by productizing cutting-edge research from our world-leading AI systems research team. By combining research velocity with production-grade infrastructure, we enable companies to reliably scale AI-native applications as fast as the field evolves. Trusted by leading AI natives like Cursor, Decagon, Eleven Labs, AI21, Hedra, and Cartesia, as well as SaaS innovators such as Salesforce, Zoom, and Zomato, Together AI powers the next generation of AI-native applications.

Website
https://together.ai
Industry
Software Development
Company size
201-500 employees
Headquarters
San Francisco, California
Type
Privately Held
Founded
2022
Specialties
Artificial Intelligence, Cloud Computing, LLM, Open Source, and Decentralized Computing

Locations

  • Primary

    251 Rhode Island St

    Suite 205

    San Francisco, California 94103, US



Updates

  • New from Together Research: LLMs can fix the query plans your database optimizer gets wrong, resulting in up to 4.78x faster execution.

    Cost-based optimizers fail when they miss semantic correlations. A filter that prunes 15M rows to 2.9M gets applied after a join instead of before, because the optimizer assumed independence where none existed.

    DBPlanBench exposes DataFusion's physical operator graph to an LLM, which applies targeted JSON patch edits to fix join ordering without regenerating the full plan.

    On TPC-H and TPC-DS workloads:
    → Up to 4.78x speedup on complex multi-join queries
    → 60.8% of queries improved by more than 5%
    → Build memory reduced from 3.3 GB to 411 MB on a single benchmark query
    → Plans optimized at small scale transfer directly to larger databases

    Paper and code are open-source. Link in comments.

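The targeted-patch idea described in the post can be sketched in a few lines. Everything below is an illustrative assumption: the plan schema, the field names (`children`, `filter_after_join`), and the restriction to RFC 6902 "replace" operations are invented for the sketch; DataFusion's actual physical-plan serialization and DBPlanBench's patch vocabulary will differ.

```python
# Minimal sketch of LLM-guided plan patching (hypothetical plan schema;
# DataFusion's real physical-plan representation differs).

def get_node(plan, path):
    """Resolve a JSON Pointer-style path against a nested dict/list."""
    node = plan
    for part in path.strip("/").split("/"):
        node = node[int(part)] if isinstance(node, list) else node[part]
    return node

def apply_patch(plan, patch):
    """Apply a list of 'replace' operations in place (tiny RFC 6902 subset)."""
    for op in patch:
        assert op["op"] == "replace"
        *parent_path, key = op["path"].strip("/").split("/")
        parent = get_node(plan, "/".join(parent_path)) if parent_path else plan
        if isinstance(parent, list):
            parent[int(key)] = op["value"]
        else:
            parent[key] = op["value"]
    return plan

# Toy physical plan: a selective filter was placed *after* the join.
plan = {
    "op": "HashJoin",
    "children": [
        {"op": "Scan", "table": "lineitem"},   # 15M rows, unfiltered
        {"op": "Scan", "table": "orders"},
    ],
    "filter_after_join": "l_shipdate > '1995-03-15'",
}

# An LLM can emit a small patch pushing the filter below the join,
# instead of regenerating the whole plan:
patch = [
    {"op": "replace", "path": "/children/0",
     "value": {"op": "Filter", "expr": "l_shipdate > '1995-03-15'",
               "child": {"op": "Scan", "table": "lineitem"}}},
    {"op": "replace", "path": "/filter_after_join", "value": None},
]

apply_patch(plan, patch)
print(plan["children"][0]["op"])  # Filter now runs before the join
```

The edit is local: only two nodes change, which is what makes small, targeted patches cheaper and safer than having the model regenerate an entire operator graph.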
  • We’re excited to announce Deepgram's speech and voice models on Together AI. AI natives can now deploy Deepgram’s STT and TTS models natively on Together AI Dedicated Model Inference and run the full real-time voice stack, from transcription to reasoning to synthesis, on the AI Native Cloud.

    Deepgram’s lineup includes Flux, Nova-3, Nova-3 Multilingual, and Aura-2, built for low-latency voice agents operating in real-world production environments.

    Highlights:
    → Conversational STT built for turn-taking: Flux delivers 250ms end-of-turn detection for faster, more natural exchanges
    → Production-ready transcription and synthesis: Nova-3, Nova-3 Multilingual, and Aura-2 support noisy audio, multilingual interactions, and clear enterprise voice output
    → Enterprise infrastructure by default: dedicated workloads, 99.9% uptime SLA, zero data retention, SOC 2 Type II, HIPAA-ready support, and data residency options

    Read the announcement: https://lnkd.in/gVHwraYT

  • New from Together Research: Aurora, speculative decoding that adapts to shifting traffic in real time.

    Static draft models degrade as domains change, and offline retraining can't keep pace. Aurora fixes this: an open-source, RL-based framework that learns continuously from live inference traces without interrupting serving.

    Key results:
    → 1.25x speedup over a well-trained static speculator through online adaptation
    → Online training from scratch surpasses a carefully pretrained baseline
    → Under abrupt domain shifts, Aurora recovers quickly

    You don't need extensive offline pretraining; Aurora learns from the first requests it serves.

    Code is open-sourced. Read the blog and paper (links in comments).

  • Together AI reposted this

    Since 2023, we’ve known that Vipul Ved Prakash and his team at Together AI are building something special. As more companies across verticals adopt AI, it’s clear that generic, off-the-shelf models won’t cut it. That’s what makes Together AI so critical. They’re building a platform that enables enterprises of all shapes and sizes to pre-train their own proprietary models tailored to their individual workflows, providing the essential infrastructure that makes verticalized AI safer and more affordable.

    Today, Together AI returns to the Enterprise Tech 30 list as a late-stage company, proving its longevity and quality in a crowded, noisy AI market. The ET30, by Wing Venture Capital and Eric Newcomer, is voted on by 90+ leading investors and corporate development leaders. It recognizes the private companies with the most potential to shape the future of enterprise technology. They see what we see: the companies who leverage AI in the right way will be the ones defining the next generation of business.

    Congratulations to the entire team at Together AI on this well-deserved milestone, and to all companies honored this year. #ET30 https://lnkd.in/g--BxZP2

  • New from Together Research: small models can beat GPT-4o on long context with the right system design.

    The instinct when context windows hit 128K or 1M tokens is to throw everything into one prompt. In practice, performance degrades as length grows. Our new paper, accepted at #ICLR2026, introduces a framework to study when and why "Divide & Conquer" works, and how to design it effectively.

    The core insight: long-context failures come from three distinct noise sources:
    1/ Model noise: confusion grows superlinearly with input length
    2/ Task noise: chunks lose cross-document context
    3/ Aggregator noise: the Manager fails to stitch partial answers correctly

    Naive "MapReduce" approaches collapse on that third point. The fix is a Planner agent that rewrites the task prompt so Workers return exactly what the Manager needs.

    Results: Llama-3-70B and Qwen-72B using this framework consistently outperform GPT-4o single-shot on retrieval, QA, and summarization as context length scales. The smaller models win, and they're cheaper and faster.

    The limit: tasks with high cross-chunk dependency, where a clue on page 1 connects to page 100, still favor the single-shot approach.

    Blog, paper, and code in the comments.

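The Planner → Workers → Manager pattern can be illustrated with a toy retrieval task. All names and the chunking scheme below are invented for the sketch; in the real framework each worker and the manager would be LLM calls, and the Planner would rewrite a natural-language prompt rather than a dict.

```python
# Toy sketch of the Divide & Conquer pattern from the post. The Planner's
# effect is simulated by giving Workers a *structured* output contract
# (needle, local position) that the Manager can merge mechanically,
# which is what reduces "aggregator noise".

def chunk(text, size):
    return [text[i:i + size] for i in range(0, len(text), size)]

def worker(task, piece):
    # Planner-rewritten task: return (needle, local offset) hits,
    # not free-form prose the Manager would have to reinterpret.
    needle = task["needle"]
    return [(needle, i) for i in range(len(piece)) if piece.startswith(needle, i)]

def manager(partials, offsets):
    # Merge structured partials; chunk offsets restore global positions.
    hits = []
    for off, part in zip(offsets, partials):
        hits += [(n, off + i) for n, i in part]
    return hits

doc = "noise " * 50 + "SECRET=42 " + "noise " * 50
task = {"needle": "SECRET=42"}

pieces = chunk(doc, 64)
offsets = [i * 64 for i in range(len(pieces))]
answer = manager([worker(task, p) for p in pieces], offsets)
print(answer)  # [('SECRET=42', 300)]
```

The toy also exposes the limitation the post names: a needle straddling a chunk boundary (or an answer requiring facts from two distant chunks) is invisible to any single worker, which is exactly the high cross-chunk-dependency regime where single-shot prompting still wins.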
  • We had a big week at #NVIDIAGTC 2026. Here’s what we shipped together with NVIDIA:

    • NVIDIA Dynamo 1.0 — Dynamo is already baked into our full-stack AI platform, and this release pushes inference performance even further for teams running production workloads.
    • NVIDIA OpenShell via NemoClaw — We’re hosting the OpenShell runtime, so developers can tap into 150+ optimized models and build autonomous agents without sacrificing safety or scale.
    • NVIDIA Nemotron 3 Super — This 120B parameter hybrid MoE model (just 12B active per token) brings long-horizon reasoning and multi-agent collaboration to production, and is now available via Together Dedicated Model Inference.
    • NVIDIA Parakeet TDT 0.6B V3 — Fast, reliable transcription meets Together’s inference infrastructure. Building real-time voice agents just got a lot more straightforward.

    Read it all here: https://lnkd.in/g-kxFKR8

    Open innovation is what drives Together AI, and this is what it looks like in practice.

    #TogetherAI #NVIDIAGTC #OpenSource #AIInfrastructure #Inference

  • Together AI is heading to HumanX, and we're bringing the AI Native Cloud experience to booth #819. Stop by for live demos with our Solution Architects, customer story activations, research meet & greets, and a build-your-own hat bar and custom comic book station.

    We're also taking the stage twice:
    📅 April 7 | Dan Fu, VP of Kernels — Building for compute that doesn't exist yet (but almost does)
    📅 April 9 | Vipul Ved Prakash, CEO — Open or closed models: What AI natives & enterprises need to know

    Come find us, see what we've been building, and hear from real AI-native teams getting real results with Together AI. 👉 Booth #819. See you there: https://lnkd.in/gXZvTzky

  • #NVIDIAGTC 2026 is a wrap. Here’s what we were up to 👇

    • Sessions with Together AI Senior Director Yineng Zhang and co-founder Percy Liang
    • Deep dives with customers like Cursor & Decagon
    • A packed agenda of lightning talks at our booth
    • Announcements including availability of NVIDIA Dynamo 1.0 and NVIDIA Parakeet TDT 0.6B V3 on Together AI
    • Trivia, a hat bar, custom comic books and, of course, a meeting with Jensen

    We can't wait to see you next year!
