Repo · GitHub · Trust 82 · Primary · Published yesterday · Live · 21h ago

rllm-org/rllm

Democratizing Reinforcement Learning for LLMs

Lineage graph

Paper → model → repo connections mined from source citations (Tier-1 exact match).

Implements

  • Paper: Reinforcement Learning without Ground-Truth Solutions can Improve LLMs
  • Paper: Which Tokens Matter? Adaptive Token Selection for RLVR with the Relative Surprisal Index
  • Paper: Triadic Werewolf: A Jester Role for Multi-Hop Theory of Mind in LLMs
  • Paper: Tandem Reinforcement Learning with Verifiable Rewards
  • Paper: Is One Layer Enough? Training A Single Transformer Layer Can Match Full-Parameter RL Training

Related across the graph

  • Paper: Triadic Werewolf: A Jester Role for Multi-Hop Theory of Mind in LLMs
  • Paper: Which Tokens Matter? Adaptive Token Selection for RLVR with the Relative Surprisal Index
  • Paper: Tandem Reinforcement Learning with Verifiable Rewards
  • Paper: Is One Layer Enough? Training A Single Transformer Layer Can Match Full-Parameter RL Training
  • Paper: Reinforcement Learning without Ground-Truth Solutions can Improve LLMs

Knowledge path: Triadic Werewolf: A Jester Role for Multi-Hop Theory of Mind in LLMs (paper) → Which Tokens Matter? Adaptive Token Selection for RLVR with the Relative Surprisal Index (paper) → Tandem Reinforcement Learning with Verifiable Rewards (paper) → rllm-org/rllm (repo)

Topics

agent-framework, agentic-workflow, coding-agent, distributed-training, llm-reasoning, llm-training, machine-learning, ml-infrastructure, ml-platform, reinforcement-learning

Graph trust: 82 (Primary)
Graph score: 5669