repo · GitHub · Trust 82 · Primary · Published 19h ago · Live · 6h ago
juyterman1000/entroly
Cut your Claude / OpenAI / Gemini bill by 70–95% on AI coding. A local proxy that compresses context, keeps provider caches hot, and verifies LLM output ($0 hallucination guard). Drop-in for Cursor, Claude Code, Codex, Aider + 34 more, and custom providers: 30s, no code changes.
Lineage graph
Paper → model → repo connections mined from source citations (Tier-1 exact match).
Covers
news: Show HN: Smart model routing directly in Claude, Codex and Cursor
news: I spent ~4.5 months building a free, self-hosted AI gateway: one endpoint for 237 providers (90+ free), auto-fallback, and a token-compression pipeline (MIT)
news: Claude Code costs up to $200 a month. Goose does the same thing for free.
news: Help optimizing llama.cpp + Qwen 27B on RTX PRO 6000 Blackwell for coding agents
Implements
Related across the graph
paper: TraceLab: Characterizing Coding Agent Workloads for LLM Serving
news: Show HN: Smart model routing directly in Claude, Codex and Cursor
news: Claude Code costs up to $200 a month. Goose does the same thing for free.
news: Help optimizing llama.cpp + Qwen 27B on RTX PRO 6000 Blackwell for coding agents
news: I spent ~4.5 months building a free, self-hosted AI gateway: one endpoint for 237 providers (90+ free), auto-fallback, and a token-compression pipeline (MIT)
