Llama Cpp
16 items across the graph — tagged with Llama Cpp.
A Python framework for self-hosted LLM tool-calling and multi-step agentic workflows
Sudoless Apple Silicon system monitor (native SwiftUI GUI) with ANE / Media Engine / memory-bandwidth tracking
Local-first AI agent, optimized for local AI models. Long context window. Proper tool calling. Runs privately on your device.
Auto-tuned launcher for GGUF models on llama.cpp / ik_llama.cpp — OpenAI-compatible server with multi-GPU tensor-split, MoE expert placement, measured flag tuni…
Run any local LLM engine, auto-tuned to your GPU — polished web UI + OpenAI/Anthropic-compatible API. Point Claude Code at your own machine in one command. No E…
Run large language models locally on Intel Macs with AMD GPUs — native macOS app with Metal acceleration
Neutral, reproducible benchmark for local LLMs on Apple Silicon (Mac · iPhone · iPad) — MLX, llama.cpp, CoreML, Apple Foundation Models
Self-hosted AI workspace where chat becomes visual workflows, multi-agent operations, and reviewable automations. Local memory; local or cloud models
Neve AI is a privacy-first local AI platform, built to deliver a high-performance experience when running LLMs, reducing dependence…
Advanced code editor using local AI
From-scratch C++/CUDA inference engine for the NVIDIA RTX 5090 (sm_120a) — the best single-GPU backend for agentic AI: tool calling, long-context loops, reasoni…
Off Grid AI — private, on-device AI. Run open models (text, vision, image, voice) locally through one OpenAI-compatible gateway. No cloud, no accounts, no API k…
Unified KV cache compression for LLM inference — TurboQuant, IsoQuant, PlanarQuant, TriAttention. 10 methods, GPU-validated, multi-GPU planner. Compress KV cach…
Open-source self-hosted home AI inference platform for AMD Strix Halo — multi-backend slots, OpenAI-compatible gateway, Vue 3 + FastAPI + systemd.
Dual-engine (llama.cpp + vLLM) LLM benchmarking pipeline for GGUF & safetensors on NVIDIA GPUs — speed, quality, live dashboard, publishable cards.
