Shimmy v2.0
The first pure-Rust GGUF inference engine. No C. No Python.
Open Source
Developer Tools
Artificial Intelligence

Hunted by

Mike Kuykendall

This product was not featured by Product Hunt yet.
It will not be visible on their landing page and won't be ranked (cannot win product of the day regardless of upvotes).

Two 5,200-token runs. Same model. SHA-identical byte output. That's a proof, not a benchmark. Shimmy v2.0 ships Airframe: pure-Rust GPU inference with hand-written WGSL compute shaders. No llama.cpp. No C. No Python. No CUDA. The first production GGUF engine that is Rust all the way down, including the GPU shaders. Run TinyLlama, Llama 3.2, Phi, and DeepSeek from GGUF files. Drop-in for AnythingLLM, Open WebUI, Cursor, and Zed via the OpenAI or Ollama API. Windows, macOS, Linux. Install with: cargo install shimmy

Top comment

The novel part, Helical Shift: When the KV cache fills, a GPU compute shader slides the cached keys and values backward in the sequence dimension. Because keys and values are stored in raw pre-RoPE form (no position encoding baked in), the slide is a pure data copy, with no trigonometric recomputation needed. Two independent 5,200-token runs crossing multiple compaction boundaries produce SHA-identical output. That's not an optimization; it's a provable mathematical invariant.

Why this matters: Other local inference tools such as llama.cpp and whisper.cpp have a C or C++ core that Rust wrappers call through FFI. Airframe is the first production-ready GGUF inference engine that is Rust all the way down, including the GPU shaders.

Tech stack:
- 13,586 lines Rust + 855 lines WGSL
- wgpu (WebGPU), bytemuck, tokio, axum
- Targets: Windows (D3D12), macOS (Metal), Linux (Vulkan)

What you can do right now:
- Run TinyLlama, Phi, Llama 3.2, DeepSeek Coder, and others from GGUF files
- Connect AnythingLLM, SillyTavern, Zed, Cursor, Open WebUI via Ollama or OpenAI API
- Generate beyond your context limit without crashes or garbage output

Privacy first! Own your process from implementation to production! Down with our evil corporate AI overlords.

https://github.com/Michael-A-Kuy...
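The Helical Shift idea from the comment can be illustrated with a small CPU-side sketch. This is not Airframe's code: the real slide runs in a WGSL compute shader, and every name here (`KvCache`, `slide_by`, `rope`, `run`) is hypothetical, as are the RoPE base of 10000 and the choice to drop a fixed number of oldest tokens. It only shows why storing raw keys makes the slide a plain copy, with position applied at read time from the slot index.

```rust
/// Toy KV cache holding keys and values in raw (pre-RoPE) form.
/// Hypothetical CPU sketch; not Airframe's actual data layout.
struct KvCache {
    head_dim: usize,
    capacity: usize, // maximum number of cached tokens
    slide_by: usize, // assumed policy: drop this many oldest tokens when full
    len: usize,      // tokens currently cached
    keys: Vec<f32>,  // capacity * head_dim, one row per slot
    values: Vec<f32>,
}

impl KvCache {
    fn new(capacity: usize, head_dim: usize, slide_by: usize) -> Self {
        assert!(slide_by >= 1 && slide_by <= capacity);
        let size = capacity * head_dim;
        Self { head_dim, capacity, slide_by, len: 0, keys: vec![0.0; size], values: vec![0.0; size] }
    }

    /// Append one token's raw key and value, sliding first if the cache is full.
    fn push(&mut self, k: &[f32], v: &[f32]) {
        assert_eq!(k.len(), self.head_dim);
        assert_eq!(v.len(), self.head_dim);
        if self.len == self.capacity {
            self.slide();
        }
        let at = self.len * self.head_dim;
        self.keys[at..at + self.head_dim].copy_from_slice(k);
        self.values[at..at + self.head_dim].copy_from_slice(v);
        self.len += 1;
    }

    /// Drop the oldest tokens and move the rest toward slot 0.
    /// No arithmetic touches the data: it is a byte-for-byte copy.
    fn slide(&mut self) {
        let n = self.slide_by.min(self.len);
        let (start, end) = (n * self.head_dim, self.len * self.head_dim);
        self.keys.copy_within(start..end, 0);
        self.values.copy_within(start..end, 0);
        self.len -= n;
    }

    fn key(&self, slot: usize) -> &[f32] {
        assert!(slot < self.len);
        &self.keys[slot * self.head_dim..(slot + 1) * self.head_dim]
    }
}

/// Apply RoPE to a raw key at read time, using the slot index as the position.
fn rope(raw: &[f32], pos: usize) -> Vec<f32> {
    let d = raw.len();
    assert!(d % 2 == 0, "head_dim must be even");
    let mut out = vec![0.0; d];
    for i in 0..d / 2 {
        let theta = pos as f32 * 10000f32.powf(-2.0 * i as f32 / d as f32);
        let (s, c) = theta.sin_cos();
        out[2 * i] = raw[2 * i] * c - raw[2 * i + 1] * s;
        out[2 * i + 1] = raw[2 * i] * s + raw[2 * i + 1] * c;
    }
    out
}

/// Feed a deterministic synthetic key stream through the cache and return
/// the bytes of the live key region, so two runs can be compared exactly.
fn run(n_tokens: usize) -> Vec<u8> {
    let mut cache = KvCache::new(8, 4, 4);
    for t in 0..n_tokens {
        let k: Vec<f32> = (0..4).map(|i| ((t * 4 + i) as f32 * 0.37).sin()).collect();
        cache.push(&k, &k);
    }
    let live = cache.len * cache.head_dim;
    cache.keys[..live].iter().flat_map(|x| x.to_le_bytes()).collect()
}
```

Because the cache never stores rotated keys, a slide cannot introduce floating-point drift; the attention step would call something like `rope(cache.key(slot), slot)` after the slide. Comparing `run(n)` against a second `run(n)` is a scaled-down analogue of the two-run byte comparison the comment describes, although it checks only the cache contents and not model output.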

About Shimmy v2.0 on Product Hunt

“The first pure-Rust GGUF inference engine. No C. No Python.”

Shimmy v2.0 was submitted on Product Hunt and earned 0 upvotes and 1 comment, placing #33 on the daily leaderboard.

On the analytics side, Shimmy v2.0 competes within Open Source, Developer Tools and Artificial Intelligence — topics that collectively have 1.1M followers on Product Hunt. The dashboard above tracks how Shimmy v2.0 performed against the three products that launched closest to it on the same day.

Who hunted Shimmy v2.0?

Shimmy v2.0 was hunted by Mike Kuykendall. A “hunter” on Product Hunt is the community member who submits a product to the platform — uploading the images, the link, and tagging the makers behind it. Hunters typically write the first comment explaining why a product is worth attention, and their followers are notified the moment they post. Around 79% of featured launches on Product Hunt are self-hunted by their makers, but a well-known hunter still acts as a signal of quality to the rest of the community. See the full all-time top hunters leaderboard to discover who is shaping the Product Hunt ecosystem.

For a complete overview of Shimmy v2.0 including community comment highlights and product details, visit the product overview.
