Topic

GPU

50 items across the graph, tagged with GPU.

From the graph · 50

repo
pytorch/pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration

repo
deeplearning4j/deeplearning4j

Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for keras, tensorflow, and onnx/pytorch, a modular…

repo
isl-org/Open3D

Open3D: A Modern Library for 3D Data Processing

repo
apache/tvm

Open Machine Learning Compiler Framework

repo
triton-inference-server/server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

repo
skypilot-org/skypilot

Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, Slurm, 20+ clouds, on-prem).

repo
NVIDIA/DALI

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and infer…

repo
Andyyyy64/whichllm

Find the local LLM that actually runs and performs best on your hardware. Ranked by real, recency-aware benchmarks, not parameter count. One command, run it ins…

repo
rapidsai/cuml

cuML - RAPIDS Machine Learning Library

repo
pytorch/executorch

On-device AI across mobile, embedded and edge for PyTorch

repo
lemonade-sdk/lemonade

Lemonade helps users discover and run local AI apps by serving optimized LLMs right from their own GPUs and NPUs. Join our discord: https://discord.gg/5xXzkMu8Z…

repo
crmne/ruby_llm

One delightful Ruby framework for every major AI provider. Build AI agents, chatbots, RAG apps, and multimodal workflows in beautiful, expressive code.

repo
llm-d/llm-d

Achieve state of the art inference performance with modern accelerators on Kubernetes

repo
NVIDIA/TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada and Blackwel…

repo
thu-pacman/chitu

High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.

repo
NVIDIA/physicsnemo

Open-source deep-learning framework for building, training, and fine-tuning deep learning models using state-of-the-art Physics-ML methods

repo
dstackai/dstack

Vendor-agnostic orchestration for training, inference and agentic workloads across NVIDIA, AMD, TPU, and Tenstorrent on clouds, Kubernetes, and bare metal.

repo
kubeflow/trainer

Distributed AI Model Training and LLM Fine-Tuning on Kubernetes

repo
beam-cloud/beta9

Ultrafast serverless GPU inference, sandboxes, and background jobs

repo
tenstorrent/tt-metal

TT-NN operator library, and TT-Metalium low-level kernel programming model.

repo
utkuozdemir/nvidia_gpu_exporter

NVIDIA GPU exporter for Prometheus using the nvidia-smi binary

repo
uccl-project/uccl

UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g., GPU-driven)

repo
uxlfoundation/scikit-learn-intelex

Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application

repo
CliMA/Oceananigans.jl

🌊 Julia software for fast, friendly, flexible, ocean-flavored fluid dynamics on CPUs and GPUs

repo
NVIDIA/raft

RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form bui…

repo
kaito-project/kaito

Kubernetes AI Toolchain Operator

repo
mosecorg/mosec

A high-performance ML model serving framework that offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine

repo
NVIDIA-BioNeMo/bionemo-recipes

BioNeMo Recipes: For building and adapting AI models in drug discovery at scale

repo
brucefan1983/GPUMD

Graphics Processing Units Molecular Dynamics

repo
NVIDIA/cuvs

cuVS - a library for vector search and clustering on the GPU

repo
kennss/SiliconScope

Sudoless Apple Silicon system monitor (native SwiftUI GUI) with ANE / Media Engine / memory-bandwidth tracking

repo
felladrin/MiniSearch

Minimalist web-searching platform with an AI assistant that runs directly from your browser. Demo: https://felladrin-minisearch.hf.space

repo
openinfer-project/openinfer

Pure Rust + CUDA LLM inference engine — no PyTorch, OpenAI-compatible, serves Qwen3 to Kimi-K2

repo
Kaden-Schutt/hipfire

RDNA-native LLM inference engine in Rust.

repo
NVIDIA/aicr

Tooling for optimized, validated, and reproducible GPU-accelerated AI runtime in Kubernetes

repo
ModelEngine-Group/unified-cache-management

Persist and reuse KV Cache to speed up your LLM.

repo
FootprintAI/Containarium

Open-source agent runtime — SSH-native isolation, eBPF egress policy, Kubernetes + LXC backends, GPU passthrough, MCP-native CLI

repo
DoubangoTelecom/compv

Insanely fast Open Source Computer Vision library for ARM and x86 devices (up to 50 times faster than OpenCV)

repo
ertis-research/kafka-ml

Kafka-ML: connecting the data stream with ML/AI frameworks (now TensorFlow and PyTorch!)

repo
traceopt-ai/traceml

A lightweight runtime health check for PyTorch training runs.

repo
RapidFireAI/rapidfireai

RapidFire AI: Rapid AI Customization from RAG to Fine-Tuning

repo
NexusGPU/tensor-fusion

Tensor Fusion is a state-of-the-art GPU virtualization and pooling solution designed to optimize GPU cluster utilization to its fullest potential.

repo
defilantech/LLMKube

Kubernetes operator for self-hosted LLM inference across a heterogeneous GPU fleet: NVIDIA CUDA, AMD Vulkan, and Apple Silicon Metal. Runtimes: llama.cpp, vLLM,…

repo
mohitsoni48/TurboLLM

Run any local LLM engine, auto-tuned to your GPU — polished web UI + OpenAI/Anthropic-compatible API. Point Claude Code at your own machine in one command. No E…

repo
SikamikanikoBG/homelab-monitor

Plug-and-play homelab dashboard in one container — GPU, local-AI VRAM, Docker, systemd, host health. Built-in read-only MCP server so AI agents can explore it t…

repo
gammahazard/locate-anything

Sleek, mobile-friendly web UI for NVIDIA LocateAnything-3B — open-vocabulary object detection & grounding on your own GPU, via one docker compose up.

repo
attevon-llc/OpenTranscribe

Self-hosted AI-powered transcription platform with speaker diarization, search, and collaboration features. Built with Svelte, FastAPI, and Docker for easy depl…

repo
engeldlgado/toshllm

Run large language models locally on Intel Macs with AMD GPUs — native macOS app with Metal acceleration

repo
NikolasEnt/ollama-webui-intel

Ollama with Intel (i)GPU acceleration in Docker, plus a benchmark

repo
b-data/jupyterlab-python-docker-stack

(GPU accelerated) Multi-arch (linux/amd64, linux/arm64/v8) JupyterLab Python docker images. Please submit Pull Requests to the GitLab repository. Mirror of

Related topics