The idea for Agent-Blackbox came from a real 3 AM Slack message I never want to see again.
Our multi-agent pipeline had just wiped the staging database. PM-agent said the requirement was clean. Coder-agent said it followed the spec perfectly. Verifier-agent said it never even received output. Four hours of grepping through thousands of log lines later, we still had no idea which agent actually failed.
That's the "accountability vacuum", and I built Agent-Blackbox to solve it.
**What problem does it solve?**
When a single LLM call fails, you debug it. When a chain of 5 agents fails, you play detective. Every agent points fingers at the next one. Logs are scattered across systems. There's no verifiable chain of custody for decisions.
Agent-Blackbox gives you `git blame` for AI agents. In 3 seconds, it tells you exactly which agent broke the chain, with cryptographic proof, not guesswork.
**How it works (the 10-second version)**
Under the hood, it implements two IETF drafts:
- **JEP (Judgment Event Protocol)**: a minimal, cryptographically signed log format for agent decisions.
- **JAC (Judgment Accountability Chain)**: a `task_based_on` field that links every decision to its parent.
Four verbs, `J` (Judge), `D` (Delegate), `T` (Terminate), and `V` (Verify), model any accountability workflow. Every action produces a signed JEP receipt with Ed25519 signatures. When something breaks, you trace the `task_based_on` chain back to the failure point.
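To make the receipt format concrete, here is a minimal sketch of how a signed JEP-style event and its JAC link might look. All names (`jep_event`, `verify`, the field layout) are hypothetical, and for brevity the Ed25519 signature is stood in by stdlib HMAC so the sketch runs with no dependencies:

```python
import hashlib
import hmac
import json
import uuid

SECRET = b"demo-key"  # stand-in for an Ed25519 private key

def jep_event(verb, agent, payload, task_based_on=None):
    """Build a signed JEP-style receipt (hypothetical field names)."""
    assert verb in ("J", "D", "T", "V")  # Judge, Delegate, Terminate, Verify
    event = {
        "id": str(uuid.uuid4()),
        "verb": verb,
        "agent": agent,
        "payload": payload,
        "task_based_on": task_based_on,  # JAC link to the parent decision
    }
    # Sign a canonical serialization of the event body.
    body = json.dumps(event, sort_keys=True).encode()
    event["sig"] = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return event

def verify(event):
    """Recompute the signature over every field except `sig`."""
    body = json.dumps(
        {k: v for k, v in event.items() if k != "sig"}, sort_keys=True
    ).encode()
    expected = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(event["sig"], expected)

root = jep_event("D", "pm-agent", "write migration spec")
child = jep_event("J", "coder-agent", "drop staging tables", root["id"])
assert verify(root) and verify(child)
```

The point of the canonical (sorted-keys) serialization is that any byte-level tampering with a receipt after the fact invalidates its signature, which is what makes the chain of custody verifiable rather than merely logged.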
**How it evolved**
The first version was a messy internal script that just hashed logs. Then I realized the problem was bigger than my team: *everyone* deploying multi-agent systems was hitting the same wall. So I rebuilt it around IETF drafts to make it an open standard, not another closed ecosystem.
v1.0 ships with:
- ✅ Rust core engine (fast)
- ✅ Python SDK (TypeScript coming)
- ✅ `blame-finder dashboard` for visual causality trees
- 🚧 LangChain / CrewAI native adapters (in progress)
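Tracing a failure back through `task_based_on` links is conceptually just a parent walk over the receipts. A rough sketch (again with hypothetical names, not the shipped API):

```python
def blame_chain(events, failing_id):
    """Walk task_based_on links from the failing event back to the root,
    returning the responsible agents in causal order (root first)."""
    by_id = {e["id"]: e for e in events}
    chain = []
    cur = by_id.get(failing_id)
    while cur is not None:
        chain.append(cur)
        cur = by_id.get(cur["task_based_on"])  # None at the root
    return [e["agent"] for e in reversed(chain)]

# Toy receipts mirroring the pipeline from the intro.
events = [
    {"id": "e1", "agent": "pm-agent", "task_based_on": None},
    {"id": "e2", "agent": "coder-agent", "task_based_on": "e1"},
    {"id": "e3", "agent": "verifier-agent", "task_based_on": "e2"},
]
print(blame_chain(events, "e3"))  # ['pm-agent', 'coder-agent', 'verifier-agent']
```

With signatures verified at each hop, the first receipt whose output contradicts its child's input pinpoints the agent that broke the chain.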
**What's next?**
I want to make Agent-Blackbox the default accountability layer for agentic workflows, much as `git blame` is for code. PDF/HTML blame reports, real-time alerting, and deeper framework integrations are on the roadmap.
**Try it yourself**
```bash
pip install agent-blame-finder
blame-finder dashboard
```
GitHub: https://github.com/hjs-spec/Agen...
I'd love feedback, especially from anyone else who's been woken up at 3 AM by a broken agent pipeline. What's your current debugging workflow? How do you trace failures across agents today?
**About Agent-Blackbox on Product Hunt**
"git blame for AI agents - find who broke prod in 3s"
Agent-Blackbox was submitted on Product Hunt and earned 2 upvotes and 1 comment, placing #209 on the daily leaderboard. The repo (hjs-spec/Agent-Blackbox) describes it as "Agent Blame-Finder: a cryptographic blackbox for multi-agent systems. Find which agent messed up in 3 seconds. JEP/JAC IETF reference implementation."
On Product Hunt, Agent-Blackbox competes in the Task Management, Developer Tools, Artificial Intelligence, and GitHub topics, which collectively have 1.1M followers.
**Who hunted Agent-Blackbox?**
Agent-Blackbox was hunted by yuqiang@JEP. On Product Hunt, a "hunter" is the community member who submits a product to the platform, uploading the images and link and tagging the makers behind it.