Inference Engine
7 items across the graph are tagged with Inference Engine.
Large-scale LLM inference engine
Qualcomm® AI Hub Models is our collection of state-of-the-art machine learning models optimized for performance (latency, memory, etc.) and ready to deploy on Qu…
NobodyWho is an inference engine that lets you run LLMs locally and efficiently on any device.
Pure Rust + CUDA LLM inference engine — no PyTorch, OpenAI-compatible, serves Qwen3 to Kimi-K2
AMD MIVisionX is a computer vision toolkit built around a highly optimized, conformant open-source implementation of the Khronos OpenVX™ 1.3 specification. As o…
The open source control plane for AI inference
From-scratch C++/CUDA inference engine for the NVIDIA RTX 5090 (sm_120a) — the best single-GPU backend for agentic AI: tool calling, long-context loops, reasoni…
