Amd
16 items across the graph — tagged with Amd.
From the graph · 16
A high-throughput and memory-efficient inference and serving engine for LLMs
LMCache: Supercharge Your LLM with the Fastest KV Cache Layer
Lemonade helps users discover and run local AI apps by serving optimized LLMs right from their own GPUs and NPUs. Join our discord: https://discord.gg/5xXzkMu8Z…
Vendor-agnostic orchestration for training, inference and agentic workloads across NVIDIA, AMD, TPU, and Tenstorrent on clouds, Kubernetes, and bare metal.
Run LLMs on AMD Ryzen™ AI NPUs in minutes. Just like Ollama - but purpose-built and deeply optimized for the AMD NPUs.
UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g., GPU-driven)
Open Source Continuous Inference Benchmark Research Platform — Kimi K2.7-Code, MiniMax M3, DeepSeekv4, GLM5 - GB200 NVL72 vs MI355X vs B200 vs GB300 NVL72 & soo…
RDNA-native LLM inference engine in Rust.
AMD MIVisionX is a computer vision toolkit built around a highly optimized, conformant open-source implementation of the Khronos OpenVX™ 1.3 specification. As o…
Tensor Fusion is a state-of-the-art GPU virtualization and pooling solution designed to optimize GPU cluster utilization to its fullest potential.
Run large language models locally on Intel Macs with AMD GPUs — native macOS app with Metal acceleration
(GPU accelerated) Multi-arch (linux/amd64, linux/arm64/v8) JupyterLab Python docker images. Please submit Pull Requests to the GitLab repository. Mirror of
(GPU accelerated) Multi-arch (linux/amd64, linux/arm64/v8) Data Science dev containers for R, Python, Julia and MAX/Mojo
Dashboard for InferenceX™, Open Source Continuous Inference
Open-source self-hosted home AI inference platform for AMD Strix Halo — multi-backend slots, OpenAI-compatible gateway, Vue 3 + FastAPI + systemd.
Dependency/tool package detected from repository manifests (@streamdown/math).
