Model Serving
6 items across the graph — tagged with Model Serving.
ModelTC/LightLLM
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed perf…
thu-pacman/chitu
High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.
vllm-project/vllm-ascend
Community-maintained hardware plugin for vLLM on Ascend.
alibaba/rtp-llm
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
basetenlabs/truss
The simplest way to serve AI/ML models in production.
openvinotoolkit/model_server
A scalable inference server for models optimized with OpenVINO™.
