LLM Stats is the go-to place to analyze and compare AI models across benchmarks, pricing and capabilities. Compare model performance easily through our playground and API that gives you access to hundreds of models at once.
This is incredibly useful for developers! As a UI/UX designer with experience in data-heavy tools, I'm curious: how did you approach the information architecture to make comparing hundreds of models feel manageable? The challenge of presenting complex technical data without overwhelming users is huge. Love the clean interface. Congrats on the launch!
The LLM rankings are actually very nice! Very helpful! I hope you continue maintaining them.
I really like how the information is displayed and aggregated. Literally made me use Grok 4 Fast more across apps that I use and well... dang. Would be really cool to have a @Raycast extension to compare models or do quick ranking lookups! Congrats on the launch :)
I'm an avid user of LLM Stats, and I think the chat feature is amazing!
Do you plan to include real-world tests like long conversations, tool use, or memory tasks? Those are becoming important lately.
Been drowning in model docs lately, pricing vs quality is a headache. Having cost per 1k tokens next to coding and long context scores on one site is useful. I care about fresh data, so I'll watch how often runs update. Will try the playground against my own prompts.
Hey Makers! 👋
I’m Jonathan, one of the creators of LLM Stats, a community-first leaderboard for comparing language models on cost, benchmarks and more.
Quick backstory: this project was born late last year out of a personal need. I was spending hours scouring different sources to figure out which models were best for another project I was working on.
Now, we're working towards building the best semi-private, open and reproducible AI benchmarking community. We believe there's a growing need for independent benchmarks and environments that measure the progress of AI in areas like coding, science, visuals and long-horizon tasks.
We're backed by Y Combinator and leaders of Hugging Face, Harvard Medical School, Daytona, Insight Data Science and many more.
Would love to hear your thoughts. See you on the platform!