Paper · arXiv · Trust 82 · Primary · Published yesterday · Live · 4h ago

EvoPolicyGym: Evaluating Autonomous Policy Evolution in Interactive Environments

Autonomous agents are increasingly expected to improve executable policies through feedback, yet existing evaluations often collapse this process into a final score or confound it with open-ended software-engineering progress. We introduce Autonomous Policy Evolution, a controlled evaluation setting in which a harness-model agent repeatedly edits an executable policy system under a fixed interaction budget. We instantiate this setting in EvoPolicyGym, a benchmark built from compact interactive RL environments that evaluates how agents iteratively improve explored policies. On the EvoPolicyGym

Lineage graph

Paper → model → repo connections mined from source citations (Tier-1 exact match).

Why these links exist

Each author below is linked via arXiv authorship to the paper "EvoPolicyGym: Evaluating Autonomous Policy Evolution in Interactive Environments":

  • Zhilin Wang
  • Han Song
  • Runzhe Zhan
  • Jusen Du
  • Jiacheng Chen
  • Tianle Li
  • Qingyu Yin
  • Yulun Wu
  • Zhennan Shen
  • Tong Zhu
  • Yanshu Li
  • Guanjie Chen
  • Derek F. Wong
  • Yafu Li
  • Yu Cheng
  • Yang Yang
