Topic

CUDA Kernels

2 items across the graph tagged with CUDA Kernels.

From the graph · 2

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Pure Rust + CUDA LLM inference engine: no PyTorch, OpenAI-compatible, serves models from Qwen3 to Kimi-K2.