paper · arXiv

Long-horizon credit assignment in RL

A method for propagating reward across very long action sequences.
