repo · GitHub · Trust 82 · Primary · Published 13h ago · Live · 9h ago

mohitsoni48/TurboLLM

Run any local LLM engine, auto-tuned to your GPU — polished web UI + OpenAI/Anthropic-compatible API. Point Claude Code at your own machine in one command. No Electron, no Python, offline-first.

Lineage graph

Paper → model → repo connections mined from source citations (Tier-1 exact match).

Covers

news: How're you deploying LLMs in production now-a-days? What's the best and most affordable way? [D]
news: A barebones CPU-only inference engine for Qwen 3, written from scratch in pure C
news: Holo3.1: Fast & Local Computer Use Agents
news: Show HN: NanoEuler – GPT-2 scale model in pure C/CUDA from scratch
news: OpenAI and Broadcom unveil LLM-optimized inference chip

Related across the graph

news: OpenAI and Broadcom unveil LLM-optimized inference chip
news: Holo3.1: Fast & Local Computer Use Agents
news: How're you deploying LLMs in production now-a-days? What's the best and most affordable way? [D]
news: A barebones CPU-only inference engine for Qwen 3, written from scratch in pure C
news: Show HN: NanoEuler – GPT-2 scale model in pure C/CUDA from scratch

Topics

ai anthropic-api claude-code gguf gpu inference llama-cpp llama-server llm local-llm