IBM released VAKRA, a new benchmark for evaluating enterprise agents on multi-hop, multi-source reasoning 🙌🏻 Interesting to see gpt-oss-120b come in second, right behind Gemini-3-Flash at the top of the leaderboard 👏 Pretty cool how they consistently release enterprise agentic work (that no other lab does!). Get started with their demo → https://lnkd.in/dc_jVuW4