
SelfHostLLM

Calculate the GPU memory you need for LLM inference

Calculate GPU memory requirements and maximum concurrent requests for self-hosted LLM inference. Supports Llama, Qwen, DeepSeek, Mistral, and more. Plan your AI infrastructure efficiently.
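The estimate behind a calculator like this reduces to two numbers: the VRAM the model weights occupy, and the KV cache each in-flight request consumes. Below is a minimal Python sketch of that arithmetic; the function names, the FP16 and overhead defaults, and the example architecture figures are illustrative assumptions, not SelfHostLLM's actual formulas.

```python
# Minimal sketch of the kind of estimate such a calculator performs.
# All formulas, defaults, and numbers below are assumptions for
# illustration, not SelfHostLLM's actual implementation.

def weights_vram_gb(params_billions: float, bytes_per_param: float) -> float:
    """VRAM needed just to hold the model weights, in GB."""
    # billions of params x bytes per param = gigabytes
    return params_billions * bytes_per_param

def kv_cache_per_request_gb(n_layers: int, hidden_size: int,
                            context_len: int, bytes_per_value: float = 2.0) -> float:
    """KV cache for one request: 2 (K and V) x layers x hidden dim x context."""
    return 2 * n_layers * hidden_size * context_len * bytes_per_value / 1e9

def max_concurrent_requests(gpu_vram_gb: float, weights_gb: float,
                            kv_per_request_gb: float, overhead_gb: float = 2.0) -> int:
    """Requests that fit once weights and runtime overhead are resident."""
    free_gb = gpu_vram_gb - weights_gb - overhead_gb
    return max(0, int(free_gb // kv_per_request_gb))

# Example: a hypothetical Llama-style 8B model (32 layers, hidden size 4096)
# served at FP16 with an 8k context window on a single 24 GB GPU.
weights = weights_vram_gb(8, 2.0)                 # ~16.0 GB
kv = kv_cache_per_request_gb(32, 4096, 8192)      # ~4.3 GB per request
print(max_concurrent_requests(24, weights, kv))   # -> 1
```

Quantization enters through `bytes_per_param` (2.0 for FP16, roughly 0.5 for 4-bit), which is why a 4-bit 8B model leaves headroom for several long-context requests where the FP16 version fits only one.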

Top comment

Built to simplify planning for self-hosted AI deployments.

Unlike other AI infrastructure tools, SelfHostLLM lets you precisely estimate GPU requirements and concurrency for Llama, Qwen, DeepSeek, Mistral, and more, using custom configurations.

But now I want to see Apple silicon added to the mix!

Update: Now there's a Mac version too!