"We develop an approach for solving time-consistent risk-sensitive stochastic optimization problems using model-free reinforcement learning (RL). Specifically, we assume agents assess the risk of a sequence of random variables using dynamic convex risk measures. We employ a time-consistent dynamic programming principle to determine the value of a particular policy, and develop policy gradient update rules. We further develop an actor-critic style algorithm using neural networks to optimize over policies. Finally, we demonstrate the performance and flexibility of our approach by applying it to optimization problems in statistical arbitrage trading and obstacle avoidance robot control."
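The abstract describes policy gradient updates under a convex risk measure rather than the usual expected cost. As a minimal illustration of that idea (not the authors' algorithm), the sketch below uses a *static* CVaR objective, one particular convex risk measure, in a one-step bandit: the empirical CVaR is computed via the Rockafellar–Uryasev variational form, and a score-function (REINFORCE-style) gradient pushes a softmax policy away from actions responsible for tail costs. The arm distributions, step size, and batch size are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def cvar(costs, alpha=0.9):
    """Empirical CVaR_alpha of a cost sample via the Rockafellar-Uryasev
    form  min_w  w + E[(C - w)_+] / (1 - alpha),  minimized at the
    alpha-quantile of the costs."""
    w = np.quantile(costs, alpha)
    return w + np.mean(np.maximum(costs - w, 0.0)) / (1.0 - alpha)

def sample_cost(arm, rng):
    # Hypothetical two-armed bandit: arm 0 has the lower *mean* cost but a
    # fat right tail; arm 1 costs slightly more on average but is stable.
    if arm == 0:
        return rng.normal(1.0, 3.0)
    return rng.normal(1.5, 0.1)

# Softmax policy over the two arms, trained with a score-function gradient
# on the tail-excess term of the CVaR objective.
theta = np.zeros(2)
alpha, lr, batch = 0.9, 0.05, 64
for _ in range(2000):
    p = np.exp(theta - theta.max())
    p /= p.sum()
    arms = rng.choice(2, size=batch, p=p)
    costs = np.array([sample_cost(a, rng) for a in arms])
    w = np.quantile(costs, alpha)                      # empirical VaR level
    excess = np.maximum(costs - w, 0.0) / (1.0 - alpha)  # per-sample tail cost
    grad = np.zeros(2)
    for a, g in zip(arms, excess):
        score = (np.arange(2) == a).astype(float) - p  # d log pi(a) / d theta
        grad += g * score
    theta -= lr * grad / batch                         # descend: costs, not rewards

p = np.exp(theta - theta.max())
p /= p.sum()
# A risk-neutral (mean-cost) agent would prefer arm 0; the CVaR-sensitive
# gradient instead shifts probability toward the low-tail-risk arm 1.
```

The key design choice mirrors the paper's theme: the policy is evaluated through a risk functional of the cost distribution, so the gradient weights each sampled action by its contribution to the tail excess `(C - w)_+ / (1 - alpha)` rather than by the raw cost.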