
Bench for Claude Code

Store, review, and share your Claude Code sessions

Developer Tools
Artificial Intelligence
Data Visualization

Hunted by Chris Messina

Claude Code just opened a PR. But do you really know what it did? With Bench, you can automatically store every session and easily find out what happened. Spot issues at a glance, dig into every tool call and file change, and share the full session with others through a single link: no further context needed. When things go right, embed the history in your PRs. When things go wrong, send the link to a colleague to ask for help. Free, no limits. One prompt to set up on Mac and Linux.

Top comment

Hey Product Hunt! 👋

I’m Manuel, co-founder of Silverstream AI. Since 2018, I’ve been working on AI agents across Google, Meta, and Mila. Now I’m building Bench for Claude Code with a small team.

If you use Claude Code a lot and want to store, review, or share its sessions, this tool is for you. Once connected, Bench automatically records and organizes your sessions, letting you inspect and debug them on your own or share them with your team to improve your workflows.

Getting started is simple:
• Go to bench.silverstream.ai and set it up in under a minute on Mac or Linux
• Keep using Claude Code as usual
• Open Bench when you need to understand or share a session


That’s it.

Bench is completely free. We built it for ourselves and now want as many developers as possible to try it and shape it with us.


We’ll be here all day reading and replying to feedback (without using Claude 😂). Would love to hear what you think!


Btw, support for more agents is coming soon, so stay tuned!

Comment highlights

Amazing! I always felt the missing piece in PRobe (shameless plug-in) was giving my LLM buddy the developer's Claude Code/Cursor chat!

Congrats on the launch!

Quick question: any plans to add Windows support? I'd love to use this but I'm currently stuck on Windows 🙂

The session sharing feature is what makes this stand out. I've lost count of how many times I've wanted to show a teammate "hey look what Claude did here" and had to resort to copy-pasting terminal output into Slack.

Being able to just send a link to a full session with context would save so much back and forth. Especially useful for code reviews where you want to show the AI's reasoning, not just the final diff.

Why aren't your Claude Code sessions as easy and convenient to share as Google Docs?

Well now they can be, thanks to Silverstream's Bench!

This free tool gives you insight into your agent runs so you can see where things went off the rails, then share a link with your colleagues to track down and fix the issue.

This is a simple, useful tool you can start using now to get a handle on your agents' performance.

I've been doing a lot of heavy building, and the "how did I get here" problem is one I face every day.

This is interesting.

Feels like we’re starting to treat AI outputs more like artifacts that need to be tracked and reviewed, not just ephemeral responses.

Curious how you think about evaluation over time.

For example, do you see this becoming something like a feedback loop where past sessions actually improve future workflows?

This is useful. I use Claude through Cursor daily and half the time I wish I could go back and review what it actually changed across a session. Being able to store and review sessions would save a lot of second-guessing.

Hi everyone! 👋 I’m Giulio, co-founder and COO at Silverstream AI.


It feels like we’re all trying to buy back time these days. There’s always more to do, and never enough hours. That’s why I really think tools like Bench for Claude Code matter.


Agents are getting better fast, which means longer and more complex sessions. Hopefully more reliable too. But even as trust increases, I don’t think we’ll ever fully give up control. We’ll always want the option to see what they’re doing, as long as it doesn’t slow us down.


That’s exactly what we’re building Bench for.
If you try it out, I’d really appreciate your feedback. It’ll help us shape our product in the right direction.

So basically you can use this to correct your other AI or just Claude ...

Hey Product Hunt! I'm Omar, Founding Researcher at Silverstream AI.

We originally built Bench as an internal tool to make debugging our own agents less painful, and it's become something I reach for every day.

My favorite part? The high-level run overview. When an agent run has hundreds of steps, being able to scan the whole thing at a glance and immediately spot where something went wrong is a huge time-saver. From there, I can zoom in all the way down to the model's reasoning traces at the exact step where things broke, which makes a real difference when you're trying to understand why an agent made a certain decision, not just what it did.

As we kept adding features, we realized Bench had become too useful to keep to ourselves, so here we are! 🚀

We're starting with Claude Code, but support for more agents is on the way. Give it a try and let us know what you think!

I've tackled similar challenges with code reviews and context sharing, and I love how Bench automates session storage. How do you handle sensitive data in stored sessions to ensure developers aren’t accidentally sharing proprietary code?

Hey folks! I’m Simone, Co-founder and CTO of Silverstream AI.

Really happy to be launching this today. I’m excited to share it, and very curious to hear your feedback!

One habit we’ve introduced across the team is linking Bench sessions in PRs whenever Claude Code was involved in creating or debugging a change. It gives reviewers a lot more context on how a bug was found and fixed, instead of just showing the final diff.

That’s been one of the most useful workflows for us, and I’d recommend it to other teams using Claude Code too.

I’m also using Bench in a research setting, where session data helps generate detailed methodology reports showing how results were obtained. I’m already finding it useful, and I think there’s a lot more to unlock there!

Looking forward to your thoughts. I want to make Bench as useful for other devs as it's been so far for us, and your input really matters!

I've been thinking about this for a while now. Traditional git-style version control isn't optimal for the AI coding era: you lose information from your Claude Code terminal or your AI coding tool of choice. Cool to see this getting productized. Congrats on the launch!

Great-looking observability layer to see what's happening behind the scenes! I think it will surely help teams optimize their processes.

Congrats on the launch!

Now add observability + failure handling, otherwise it’s just scheduled guessing.

Nice.

Most people don’t need logs.

They need to understand why the agent made a bad decision and how to prevent it next time.

I love finding Claude Code related products daily on PH. This looks great!