paper · arXiv · Trust 82 · Primary · Published 3d ago · Live · 2d ago

FinPersona-Bench: A Benchmark for Longitudinal Psychometric Stability of Autonomous Financial Agents

Large Language Models (LLMs) are increasingly deployed as autonomous financial agents initialized with explicit behavioral mandates such as "preserve capital" or "avoid speculative bets" that are meant to govern every decision throughout deployment. In practice, however, as market context accumulates over long horizons, these mandates gradually lose their behavioral influence, a phenomenon we formalize as Mandate Salience Decay (MSD). To measure MSD objectively, we introduce FinPersona-Bench, a simulation benchmark in which a synthetic market decouples observable price from hidden fundamental

Lineage graph

Paper → model → repo connections mined from source citations (Tier-1 exact match).

Covers

news · Production-grade AI agents for financial compliance: Lessons from Stripe

Topics

cs.CL