repo · GitHub · Trust 82 · Primary · Published yesterday · Live · 8h ago
teilomillet/retrain
A Python library that uses Reinforcement Learning (RL) to train LLMs.
Lineage graph
Paper → model → repo connections mined from source citations (Tier-1 exact match).
Related across the graph
news: Would having a dedicated programming language specifically for LLMs be a viable solution? [D]
paper: Generative Skill Composition for LLM Agents
news: I shrank a transformer until every number fitted on the screen and made the weights editable [R]
paper: Is One Layer Enough? Training A Single Transformer Layer Can Match Full-Parameter RL Training
paper: Reinforcement Learning without Ground-Truth Solutions can Improve LLMs
