Topic

Inference Server

6 items across the graph — tagged with Inference Server.

From the graph · 6

repo

jundot/omlx

LLM inference server with continuous batching & SSD caching for Apple Silicon — managed from the macOS menu bar

repo

roboflow/inference

Turn any computer or edge device into a command center for your computer vision projects.

repo

superlinked/sie

Open-source inference server and production cluster for all the models your agent needs.

repo

basetenlabs/truss

The simplest way to serve AI/ML models in production

repo

raketenkater/ggrun

Auto-tuned launcher for GGUF models on llama.cpp / ik_llama.cpp — OpenAI-compatible server with multi-GPU tensor-split, MoE expert placement, measured flag tuni…

repo

kibae/onnxruntime-server

ONNX Runtime Server: a server that provides TCP and HTTP/HTTPS REST APIs for ONNX inference.
