repo · GitHub · Trust 82 · Primary · Published yesterday · Live · 22h ago
LMCache/LMCache
LMCache: Supercharge Your LLM with the Fastest KV Cache Layer
Lineage graph
Paper → model → repo connections mined from source citations (Tier-1 exact match).
Covers
news: Biggest, baddest model to fill 144GB VRAM + 120GB RAM to the brim, regardless of speed
news: I mapped which local LLMs actually fit each RAM tier, 8 to 128GB (open dataset)
news: I compiled LLM inference pricing across 7 providers — the caching numbers are surprising(spreadsheet included) [R]
news: Devs - you have 64gb of VRAM - which model do you use for coding?
Implements
Related across the graph
news: I compiled LLM inference pricing across 7 providers — the caching numbers are surprising(spreadsheet included) [R]
news: I mapped which local LLMs actually fit each RAM tier, 8 to 128GB (open dataset)
news: Devs - you have 64gb of VRAM - which model do you use for coding?
news: Biggest, baddest model to fill 144GB VRAM + 120GB RAM to the brim, regardless of speed
paper: One-Step Gradient Delay is Not a Barrier for Large-Scale Asynchronous Pipeline Parallel LLM Pretraining
