Arena runs the most trusted ranking system for AI chatbots like ChatGPT and Claude. But here’s the twist: the companies being ranked are also funding the leaderboard. Arena went from a university research project to a $25 million startup in just seven months, with backing from some of the same AI companies it evaluates.
The platform lets regular people test different AI models side-by-side without knowing which is which, then vote for the better response. This “blind” testing has made Arena’s rankings incredibly influential – they affect which AI companies get funding and media attention.
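The article doesn't say how Arena turns those pairwise votes into a leaderboard, but a common way to aggregate head-to-head preferences is an Elo-style rating update. The sketch below is purely illustrative (the model names, starting ratings, and `k` factor are made up, not Arena's actual system):

```python
# Illustrative only: Arena's real aggregation method isn't described in
# this article. This shows how anonymous pairwise votes could be turned
# into a ranking with a simple Elo-style update.

def elo_update(rating_a, rating_b, winner, k=32):
    """Update two ratings after one head-to-head vote ('a' or 'b' wins)."""
    expected_a = 1 / (1 + 10 ** ((rating_b - rating_a) / 400))
    score_a = 1.0 if winner == "a" else 0.0
    rating_a += k * (score_a - expected_a)
    rating_b += k * ((1 - score_a) - (1 - expected_a))
    return rating_a, rating_b

# Two hypothetical models, both starting at 1000.
ratings = {"model_x": 1000.0, "model_y": 1000.0}
votes = ["a", "a", "b", "a"]  # voters preferred model_x 3 times out of 4
for winner in votes:
    ratings["model_x"], ratings["model_y"] = elo_update(
        ratings["model_x"], ratings["model_y"], winner
    )
leaderboard = sorted(ratings, key=ratings.get, reverse=True)
print(leaderboard)
```

Because voters never see which model produced which response, the label a vote attaches to ("a" or "b") carries no brand information, which is what makes this kind of blind aggregation hard for any one company to steer.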
The Judge Gets Paid by the Contestants
Arena started as an academic project at UC Berkeley, designed to create fair AI rankings that couldn’t be manipulated. The researchers wanted to solve a real problem: AI companies were cherry-picking examples to make their models look better than competitors.
But as Arena became the go-to authority for AI rankings, it also became a business. The startup now takes money from Anthropic, OpenAI, and other AI giants whose models appear on the leaderboard. Arena says this doesn't influence its rankings, since the voting is done by the public, not by Arena itself.
The company claims its “crowd-sourced” approach keeps things honest – with over a million votes from real users, it’s harder for any single company to game the system.
What Happens Next
Arena plans to expand beyond chatbots to rank AI models for images, video, and other tasks. The bigger question is whether a ranking system can stay neutral when it’s funded by the companies it judges.

