Braintrust

Rapidly ship AI without guesswork

Developer Tools
Artificial Intelligence

Evaluate your AI applications with Braintrust: the enterprise-grade stack for building high quality AI products. From experiment tracking, to prompt playground, to data management, we take uncertainty and tedium out of shipping AI.

Top comment

Thanks so much for hunting us, @rrhoover! We're excited to introduce Braintrust, a platform for running and tracking AI evaluations (“evals”) [1].

At my previous startup, Impira, and while leading AI at Figma, we had a recurring problem: we never knew whether changes we made to our products would improve or regress key user scenarios. We built some tooling to solve this problem and, after talking to other developers, learned that it was a widespread issue. Specifically, it’s challenging to establish a great dev loop that lets you systematically improve and ship high quality AI products.

We worked with the teams at Zapier, Coda, and Replit to refine Braintrust. We consistently heard that they were facing challenges with evaluation, so we built Braintrust to help them. Today we’re releasing the product for anyone to use, including a free plan [3].

There are a lot of LLM tooling products on the market. Here are a few ways Braintrust is different:

- Rather than showing eval metrics in your observability tools, Braintrust offers an “experiment tracking” workflow, meaning you can try out changes while developing them, and drill down into diffs between other experiments and git branches before you ship. Check out our docs [2] for more details.
- We believe strongly that you should own your data, so we support on-premises and private VPC deployments.
- We natively and equally support TypeScript and Python.
- We have a flexible free plan for builders and an unlimited free plan for academic and non-commercial open source projects [3]. We will introduce a self-service paid (“Pro”) tier, hopefully with feedback from this community.

Our mission is to enable developers to build high quality, reliable AI products. We couldn’t be doing this without Elad Gil, who helped me incubate the initial idea and team, which today includes founding designer Coleen Baik and founding engineer Manu Goyal. Also, big thanks to David Song, from Elad’s team, who is helping us as well.

We’re excited to launch today [4], but we know there’s a lot left to build and are excited to hear your feedback.

[1] https://www.braintrustdata.com
[2] http://www.braintrustdata.com/do...
[3] https://www.braintrustdata.com/p...
[4] https://www.braintrustdata.com/b...
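
For readers new to evals, the experiment-tracking idea described above can be illustrated with a minimal sketch in plain Python. This is not the Braintrust SDK: every name here (`exact_match`, `run_experiment`, `diff_experiments`, the toy dataset and the two task functions) is hypothetical and made up for illustration. It only shows why a per-case diff between two experiments can be more informative than a single aggregate score.

```python
from statistics import mean

# Hypothetical scorer: 1.0 when the output matches the expected answer
# (ignoring case and surrounding whitespace), otherwise 0.0.
def exact_match(output: str, expected: str) -> float:
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0

# Run one "experiment": apply a task function to every case and score it.
def run_experiment(task, dataset):
    return {
        case["input"]: exact_match(task(case["input"]), case["expected"])
        for case in dataset
    }

# Compare two experiments case by case instead of only by their averages.
def diff_experiments(baseline, candidate):
    return {
        "baseline_mean": mean(baseline.values()),
        "candidate_mean": mean(candidate.values()),
        "improved": sorted(k for k in baseline if candidate[k] > baseline[k]),
        "regressed": sorted(k for k in baseline if candidate[k] < baseline[k]),
    }

# Toy dataset and two stand-in "model versions" (assumptions, not real models).
dataset = [
    {"input": "2+2", "expected": "4"},
    {"input": "capital of France", "expected": "Paris"},
    {"input": "opposite of hot", "expected": "cold"},
]

def baseline_task(question: str) -> str:
    answers = {"2+2": "4", "capital of France": "paris"}
    return answers.get(question, "unknown")

def candidate_task(question: str) -> str:
    answers = {"2+2": "four", "capital of France": "Paris", "opposite of hot": "cold"}
    return answers.get(question, "unknown")

report = diff_experiments(
    run_experiment(baseline_task, dataset),
    run_experiment(candidate_task, dataset),
)
print(report)
```

In this toy run both experiments have the same average score, yet the diff shows one case that improved and one that regressed, which is the kind of change an aggregate metric alone would hide.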

Comment highlights

We're seeing a massive shift in how some software products are built, which of course introduces new challenges not covered by incumbent infra. Braintrust is worth watching, founded by @ankur_goyal, former Head of ML Platform at Figma (by way of acquisition).

I've built user facing interfaces with AI backends before and I see exactly what Braintrust is doing! Clear pain point here, I've faced it way too many times! Awesome idea :)

I like this challenge. We use AI in our product, morphdb.io, but evaluating the AI we’re building is a huge issue for us. I want to try it, and we also provide AI infrastructure to our clients using Morph. Your product looks like it will help us deliver!

It's been amazing getting to see Ankur Goyal work on Braintrust to help those building with AI evaluate these fun non-deterministic models. 😅 At Zapier, we've used it to successfully measure and improve our AI-first products. 📈 Congrats on the launch Braintrust team! 🚀

Oh, this is intriguing! Your platform seems incredibly powerful. Congratulations on the successful launch, and keep up the excellent work! Dania from True Nation

Braintrust has quickly become an essential platform for engineers on my team who are working on AI features. Given how hard it is to know precisely what LLMs are capable of, tools that make it easy for engineers to be data driven are critical for ensuring product quality and preventing regressions!

Congrats on the launch! Braintrust looks promising for improving AI product quality. The "experiment tracking" feature is unique and useful, along with the flexible free plan and support for Typescript and Python. I'm interested in learning more about how it can benefit my work. - Volodymyr from True Nation!

Congrats on the launch! Just wondering how you're looking into the relevancy of the responses generated w.r.t the prompts themselves?

Congrats on the launch!! This is a massive problem for productionizing AI agents!

Braintrust has significantly optimized Coda's AI operations! Initially, we started with a rather manual process for GenAI evals, and with Braintrust, we've implemented a robust workflow that facilitates scalable evaluations. Moreover, their datasets feature has streamlined the management of benchmark datasets at scale, thereby minimizing the workload on our AI team. This efficiency enables us to concentrate more on feature and model development tasks.