Introducing NousCoder-14b, a competitive olympiad programming model. Our latest blog details the full findings from extensive experiments and logs, with the full stack released: the RL environment, benchmark, and harness built in Atropos, all fully reproducible with our open training stack. NousCoder-14b was post-trained from Qwen3-14B by researcher in residence Joe Li over the course of 4 days on 48 B200s, using our Atropos framework and Modal's autoscaler. Trained with verifiable execution rewards, it achieves a Pass@1 accuracy of 67.87%, a +7.08% improvement over Qwen's baseline. View the full blog post here: https://lnkd.in/gAQc9rmk
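A verifiable execution reward of the kind mentioned above can be sketched in a few lines: run the candidate program against judge-style test cases and grant a binary reward only if every case passes. This is a minimal illustration, not the Atropos environment's actual interface; the function name and the (stdin, expected_stdout) test-case format here are hypothetical.

```python
import subprocess
import sys

def execution_reward(candidate_code: str, test_cases: list[tuple[str, str]]) -> float:
    """Binary reward: 1.0 iff the candidate program passes every test case.

    Each test case is a (stdin, expected_stdout) pair, as in a typical
    olympiad judge. Illustrative sketch only, not the real Atropos API.
    """
    for stdin_data, expected in test_cases:
        try:
            result = subprocess.run(
                [sys.executable, "-c", candidate_code],
                input=stdin_data,
                capture_output=True,
                text=True,
                timeout=5,  # guard against infinite loops
            )
        except subprocess.TimeoutExpired:
            return 0.0
        if result.returncode != 0 or result.stdout.strip() != expected.strip():
            return 0.0
    return 1.0

# Example: a tiny "print the sum" task
code = "a, b = map(int, input().split()); print(a + b)"
print(execution_reward(code, [("1 2", "3"), ("10 5", "15")]))  # → 1.0
```

In an RL loop, this scalar would be the terminal reward for each sampled completion; the binary pass/fail signal is what makes the reward "verifiable" rather than learned.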
Nous Research
Technology, Information and Internet
New York, NY 6,215 followers
World-class open source models
About us
Nous Research is an AI research lab creating world-class models out in the open. We are best known for the Hermes series of open-source models, which are general purpose, human-aligned, lightweight models downloaded more than 50 million times on HuggingFace. In the process of developing these models, Nous is building a fully open AI stack, allowing anyone to meaningfully participate in the development of frontier intelligence, beginning with our fully distributed pre-training network, Psyche.
- Website
-
https://nousresearch.com/
- Industry
- Technology, Information and Internet
- Company size
- 11-50 employees
- Headquarters
- New York, NY
- Type
- Privately Held
- Founded
- 2023
Locations
-
Primary
New York, NY, US
Updates
-
Today we open source Nomos 1. At just 30B parameters, it scores 87/120 on this year’s Putnam, one of the world’s most prestigious math competitions. This score would rank #2/3988 in 2024 and marks our first step with hillclimb towards creating a SOTA AI mathematician. Nomos 1 achieved its 87/120 with 8 perfect scores, while Qwen3-30ba3b-Thinking-2507 scored 24/120 when run in the same harness under the same conditions, indicating that the performance is largely due to post-training and data quality rather than the harness. Submissions were anonymized and blind-graded by a human Putnam top-200 contestant. The exact files sent to our human annotators for grading are available, de-anonymized, here: https://lnkd.in/ejUeS8JC, along with the runbooks used to generate them: https://lnkd.in/exNc4-kT. We open-source our model here https://lnkd.in/eHvQfMZK and our reasoning harness here https://lnkd.in/e32B9aeK.
-
We are excited to announce a drastic 70%+ price reduction on our flagship hybrid reasoning models. Try them now at https://lnkd.in/e6g5NF8z
-
Nous Research reposted this
Check out Vercel's blog for a little article on how we use BotID at Nous Research: https://lnkd.in/g9gWwdT5 . Thanks Vercel folks, you're always a pleasure to work with. :)
-
We are excited to announce our partnership with hillclimb to create a model that can take on the world's most difficult mathematics challenges.
hillclimb is the first human superintelligence community, dedicated to building golden datasets for AGI, starting with math. Data is accelerating on every axis. 3 years ago, we were impressed that ChatGPT could get good SAT scores. Today, GPT-5 and other frontier models assist meaningfully in coding, research, and tool use. Within two years, AI systems will autonomously extend the frontier of science. hillclimb's team of IMO medalists, Lean experts, and PhDs is designing RL environments for frontier labs to push state-of-the-art intelligence. 2 years ago, the founders retired from competing professionally in esports and taught themselves to code. Jun Park pursued research at DeepMind and Ibrakhim Ustelbay built the first AI interviewer, growing it to 100k+ users. In the business of digitizing human intelligence for frontier labs, they understand both how to rigorously assess true expertise and how to turn that into data researchers actually want. Congrats to the team on the launch! 🚀 https://lnkd.in/g4ZTds6P
-
As large language models continue to scale, centralized pre-training faces growing challenges: high barriers to entry, data availability, compute concentration, and trust. Decentralized pre-training offers a promising alternative by leveraging globally distributed compute and data while maintaining model quality and security. Watch the full talk from the PyTorch Conference by our Chief Scientist, Bowen Peng. https://lnkd.in/eEjTCQZn
The Four Pillars of Large-Scale Decentralized Pre-Training: Methods, Bottlenecks... - Bowen Peng
https://www.youtube.com/
-
Nous Research reposted this
This has been a fun release to work on, because it pulled together many threads we've been juggling for a while: - The Hermes 4 models themselves (which are genuinely distinctive imo) - The custom RL framework that made fine-tuning possible - A fresh and rather experimental chat UI (won't be to everyone's taste, but no point in us all building the same chat over and over!) - Lots of behind-the-scenes tweaks to our developer portal for key management, credit tracking, payments, etc. - An auth provider migration (to Privy.io) - Loads of explorations and comparisons of different inference providers - An innovative cross-chat memory system, backed by a fancy knowledge graph. Lots of moving parts that clicked into place. Nice work team. :)
Nous Research presents Hermes 4, our latest line of open-source models. https://lnkd.in/eFf3WCbG Hermes 4 builds on our legacy of user-aligned models with expanded test-time compute capabilities. Special attention was given to making the models creative and interesting to interact with, unencumbered by censorship, and neutrally aligned while maintaining state of the art level math, coding, and reasoning performance for open weight models. You can try Hermes 4 in the new, revamped Nous Chat UI. https://lnkd.in/ewQ6Mj4P For the first week, all Hermes 4 inference in Nous Chat is free of charge.
-
Measuring Thinking Efficiency in Reasoning Models: The Missing Benchmark https://lnkd.in/eJFWZRsb We measured token usage across reasoning models: open models output 1.5-4x more tokens than closed models on identical tasks, but with huge variance depending on task type (up to 10x on simple questions). This hidden cost often negates per-token pricing advantages. Token efficiency should become a primary target alongside accuracy benchmarks, especially considering non-reasoning use cases. Read the thorough review of reasoning efficiency across the open and closed model landscape in our latest blog post in collaboration with our researcher in residence, Tim. See more of their work here: https://lnkd.in/eju3abfC
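The comparison the post describes boils down to a ratio of completion token counts on identical prompts. A minimal sketch, with made-up numbers rather than the blog's measurements (the real benchmark aggregates over many tasks per category):

```python
# Per-task completion token counts (illustrative made-up numbers),
# keyed by task id, for one open and one closed model on identical prompts.
open_model_tokens = {"math_1": 1800, "math_2": 2400, "trivia_1": 900}
closed_model_tokens = {"math_1": 700, "math_2": 800, "trivia_1": 90}

# Token-overhead ratio per task: how many more tokens the open model
# spends than the closed model on the same prompt.
ratios = {
    task: open_model_tokens[task] / closed_model_tokens[task]
    for task in open_model_tokens
}
print(ratios)  # note the blow-up on the simple question (10x here)

# Aggregate with a mean over tasks; the variance matters as much as the mean.
mean_ratio = sum(ratios.values()) / len(ratios)
print(f"mean token overhead: {mean_ratio:.2f}x")
```

Multiplying this overhead ratio by each model's per-token price is what reveals the "hidden cost" the post refers to: a cheaper per-token rate can still lose once the extra reasoning tokens are counted.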
-
Announcing the Nous RL Environments Hackathon in SF! Create with Atropos, Nous' RL environments framework, and claim your stake of a $50,000 prize pool. Partners include xAI, Nvidia, Nebius, Akash Network, Lambda, TensorStax, Runpod and Cerebral Valley. You can also use the atropos-rl-gym channel in our Discord to discuss and learn in preparation! May 18th. Sign up below 👇👇 https://lnkd.in/gjusg2x9 https://lnkd.in/eFzRqbXa