Repo · GitHub · Trust 82 · Primary · Published yesterday · Live · 21h ago

raketenkater/ggrun

Auto-tuned launcher for GGUF models on llama.cpp / ik_llama.cpp — OpenAI-compatible server with multi-GPU tensor-split, MoE expert placement, measured flag tuning (AI Tune), hardware-matched HuggingFace downloads, and crash recovery. An Ollama alternative for multi-GPU rigs.
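As a rough illustration of what a multi-GPU tensor split involves, the sketch below derives split proportions from each GPU's VRAM and formats them for llama.cpp's `--tensor-split` flag, which accepts a comma-separated list of proportions. This is an assumption-laden example, not ggrun's method: the description says ggrun tunes flags from measurements, whereas this simply divides by VRAM size. The helper names `tensor_split` and `tensor_split_flag` are hypothetical.

```python
def tensor_split(vram_gib):
    """Return per-GPU proportions (summing to 1) proportional to VRAM.

    Hypothetical helper: a naive VRAM-proportional split, not ggrun's
    measured tuning.
    """
    if not vram_gib or any(v <= 0 for v in vram_gib):
        raise ValueError("need at least one GPU, each with positive VRAM")
    total = sum(vram_gib)
    return [v / total for v in vram_gib]


def tensor_split_flag(vram_gib):
    """Format the proportions as a llama.cpp --tensor-split argument."""
    parts = ",".join(f"{p:.2f}" for p in tensor_split(vram_gib))
    return f"--tensor-split {parts}"


if __name__ == "__main__":
    # Example rig (made up): two 24 GiB cards and one 12 GiB card.
    print(tensor_split_flag([24, 24, 12]))
```

A VRAM-proportional split is only a starting point; in practice the best ratio also depends on context size, KV-cache placement, and which GPU holds the output layers, which is presumably why a launcher would measure rather than compute it.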

Lineage graph

Paper → model → repo connections mined from source citations (Tier-1 exact match).

Covers

news: OpenAI’s Jalapeño chip is Big Tech’s spiciest move away from Nvidia
news: OpenAI and Broadcom unveil LLM-optimized inference chip
news: Build real agentic apps using CUGA: two dozen working examples on a lightweight harness
news: OpenAI unveils its first custom chip, built by Broadcom

Related to

model: openai/gpt-oss-120b

Covers (incoming)

news: GPT-5.6 launches, but OpenAI is taking it slow - IBM

Topics

cuda gguf golang inference-server llama-cpp llamacpp llm local-llm localllama metal