Npu
3 items across the graph are tagged with Npu.
From the graph · 3
repo
FastFlowLM/FastFlowLM
Run LLMs on AMD Ryzen™ AI NPUs in minutes. Just like Ollama, but purpose-built and deeply optimized for the AMD NPUs.
ModelEngine-Group/unified-cache-management
Persist and reuse KV cache to speed up your LLM.
Hal0ai/hal0
Open-source, self-hosted home AI inference platform for AMD Strix Halo: multi-backend slots, OpenAI-compatible gateway, Vue 3 + FastAPI + systemd.
