We introduce IGPO, an RL algorithm for fine-grained credit assignment in search agent training. By modeling agentic search turns as an incremental information acquisition process, IGPO defines rewards ...