Baseten

Software Development

San Francisco, CA 24,647 followers

Own your inference.

Discover all 231 employees

About us

Inference is everything. Baseten is an AI infrastructure platform giving you the tooling, expertise, and hardware needed to bring great AI products to market - fast. Our proprietary Inference Stack combines cutting-edge performance research with highly performant, reliable infrastructure to give you out-of-the-box global availability with 99.99% uptime.

Website: https://www.baseten.co/
Industry: Software Development
Company size: 51-200 employees
Headquarters: San Francisco, CA
Type: Privately Held
Specialties: developer tools, software engineering, artificial intelligence, and machine learning

Products

Baseten

Machine Learning Software

At Baseten we provide all the infrastructure you need to deploy and serve ML models performantly, scalably, and cost-efficiently. Get started in minutes and avoid getting tangled in complex deployment processes. You can deploy best-in-class open-source models and take advantage of optimized serving for your own models. Our horizontally scalable services take you from prototype to production, with light-speed inference on infra that autoscales with your traffic. Best in class doesn't mean breaking the bank: run your models on the best infrastructure without running up costs by taking advantage of our scale-to-zero feature.

Locations

Primary

San Francisco, CA, US

New York, NY, US

Updates

Baseten reposted this
Amir Haghighat

Baseten•14K followers
2d
Almost half of US doctors use OpenEvidence every day to converse with a world of medical knowledge and the latest medical research breakthroughs. A few minutes of downtime can result in thousands of suboptimal clinical decisions. We’re proud they’ve chosen us to power their inference.

Baseten reposted this
Dannie Herzberg

Baseten•10K followers
2d
Nearly half of all U.S. physicians rely on OpenEvidence every single day - at the bedside, in the operating room, and at the point of care. It surfaces research in real time, exactly when doctors need it most. Baseten powers the inference that makes this possible - a responsibility we take very seriously. Thank you, Team OpenEvidence, for what you do and for the trust.

Baseten reposted this
OpenEvidence

35,513 followers
2d
OpenEvidence answers over 1 million questions every day from more than half of practicing physicians in the US. Physicians, and their patients, need OpenEvidence to provide the most accurate, up-to-date information in those critical moments. Downtime or delays have real-life consequences; we partner with Baseten to provide the inference infrastructure that makes sure our answers are always available when physicians are making those high-stakes clinical decisions. Baseten came through our office recently to talk to us about why this is so important.

Baseten

24,647 followers
3d
Gemma 4 is live on Baseten and available to all customers on day 0 via the Baseten model library. All models in the Gemma 4 family are multimodal, supporting text and image inputs with text output. Key capabilities include:
-> Advanced reasoning and thinking
-> Coding and function calling
-> OCR for document understanding
-> Long context windows up to 256K tokens
But most impressive is how Gemma 4 is pushing the boundaries of model architecture with innovations including alternative attention mechanisms, Proportional RoPE, Per-Layer Embeddings (PLE), KV-Cache Sharing, native aspect ratio handling for vision, and a smaller frame window for audio. All are designed to improve efficiency, accuracy, and scalability. Try it today: https://lnkd.in/gEVxUuxh
Baseten reposted this
NVIDIA

5,011,692 followers
4d
Delivered performance, not peak chip specifications, drives AI factory productivity. Rigorous benchmarks are the only way to see past the noise. In MLPerf Inference v6.0, NVIDIA extreme co-design delivered the highest token output across the broadest range of models and scenarios. Maximizing token output drives down token cost and maximizes AI factory productivity. Read the blog post to dive into the details: https://nvda.ws/3OaEE5b Baseten, CoreWeave, MLCommons

Why Do Performance Benchmarks Matter?

Baseten

24,647 followers
4d
What if LLMs could remember as humans do? LLM memory is either perfect and lossless or ultra-compressed. What would a slightly compressed working memory that extends the context window look like? Our researchers built a 7M-parameter perceiver that compresses KV caches 8x while preserving 90%+ factual retention. Unlike existing compaction methods, we trained a model to do this in a single forward pass. We see this as the first step toward models that actually learn from experience. Read here: https://lnkd.in/eqaVCPSv
Baseten

24,647 followers
4d
We've had a great month of March! A brief recap:
-> NVIDIA GTC, featuring book signing, ice cream, and swag
-> KubeCon EMEA, including a 2000+ person House of Kube event
-> AI engineering leaders dinners at Wolfsbane (SF) and Manhatta (NYC)
-> Baseten-branded ice cream social in SF
-> AI/ML trivia night in the West Village
-> NYC office warming party
Want to attend our next event? Sign up here: https://lnkd.in/gNZuscJ4

Baseten reposted this
Rootly

11,467 followers
6d
500+ engineers proved it at KubeCon: sometimes all people need is a chance to unwind after a long conference day. As snacks, wine, and soft drinks washed over the RAI's Boat House, SREs, DevOps, and Platform Engineers shared how they're keeping up with the faster-than-ever cycles in the industry. Thanks to Rootly, Upwind Security, Baseten, Checkly, Cloudsmith, MetalBear, FusionAuth, Echo, Twingate, and Spotify for Backstage for making the evening unforgettable.

Funding

Baseten 6 total rounds

Last Round

Series D Oct 5, 2025

US$ 150.0M

Investors

Bond + 8 Other investors

Employees at Baseten

Jason Dupree

12K followers

Marylise Tauzia

I have over 15 years of…•1K followers

Tarun Diwan

Baseten•26K followers

William Lau

751 followers

Similar pages

Decagon

Fireworks AI

ElevenLabs

Harvey

Arize AI

Metronome

Sierra

Together AI

Parsed

Render
