Risk-Aware Continuous Control with Neural Contextual Bandits
- 1. NEC Laboratories Europe GmbH
- 2. i2CAT Foundation
- 3. ICREA
Description
Recent advances in learning techniques have garnered attention for their applicability to a diverse range of real-world sequential decision-making problems. Yet, many practical applications have critical constraints for operation in real environments. Most learning solutions neglect the risk of failing to meet these constraints, hindering their implementation in real-world contexts. In this paper, we propose a risk-aware decision-making framework for contextual bandit problems, accommodating constraints and continuous action spaces. Our approach employs an actor multi-critic architecture, with each critic characterizing the distribution of performance and constraint metrics. Our framework is designed to cater to various risk levels, effectively balancing constraint satisfaction against performance. To demonstrate the effectiveness of our approach, we first compare it against state-of-the-art baseline methods in a synthetic environment, highlighting the impact of intrinsic environmental noise across different risk configurations. Finally, we evaluate our framework in a real-world use case involving a 5G mobile network where only our approach consistently satisfies the system constraint (a signal processing reliability target) with a small performance toll (8.5% increase in power consumption).
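The trade-off between risk level and performance described above can be illustrated with a minimal sketch. It is not the paper's actor multi-critic architecture: as a simplifying assumption, the learned distributional critics are replaced by Monte Carlo samples from a hypothetical toy environment, and the actor is replaced by a grid search over a one-dimensional action in [0, 1]. All names (`select_action`, `toy_reward`, `toy_constraint`, `risk_level`, `limit`) are illustrative and do not come from the paper.

```python
import random
from statistics import mean


def empirical_quantile(samples, q):
    """Return the q-quantile (0 <= q <= 1) of samples, nearest-rank style."""
    ordered = sorted(samples)
    # Clamp the index so q = 1.0 maps to the largest sample.
    idx = min(len(ordered) - 1, max(0, int(q * len(ordered))))
    return ordered[idx]


def select_action(candidates, reward_model, constraint_model,
                  risk_level, limit, n_samples=2000, seed=0):
    """Pick the candidate with the highest mean reward among those whose
    constraint metric is at or below `limit` at the `risk_level` quantile."""
    rng = random.Random(seed)
    best_action, best_reward = None, float("-inf")
    for a in candidates:
        # Stand-in for a distributional critic: sample the constraint metric.
        costs = [constraint_model(a, rng) for _ in range(n_samples)]
        if empirical_quantile(costs, risk_level) > limit:
            continue  # constraint violated too often at this risk level
        avg_reward = mean(reward_model(a, rng) for _ in range(n_samples))
        if avg_reward > best_reward:
            best_action, best_reward = a, avg_reward
    return best_action


# Hypothetical toy environment (an assumption, not from the paper):
# larger actions yield more reward but also a larger, noisier constraint metric.
def toy_reward(a, rng):
    return a + rng.gauss(0.0, 0.05)


def toy_constraint(a, rng):
    return a + rng.gauss(0.0, 0.1)


candidates = [i / 20 for i in range(21)]  # grid over the action range [0, 1]
cautious = select_action(candidates, toy_reward, toy_constraint,
                         risk_level=0.99, limit=0.8)
neutral = select_action(candidates, toy_reward, toy_constraint,
                        risk_level=0.50, limit=0.8)
print("risk-averse action:", cautious)
print("risk-neutral action:", neutral)
```

In this toy setting, a stricter risk level (checking the 0.99-quantile of the constraint metric rather than its median) pushes the chosen action away from the constraint boundary and so gives up some reward, which mirrors the balance between constraint satisfaction and performance that the abstract describes.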
Files
Name | Size |
---|---|
Risk_Aware_Decision_Making_for_Continuous_Control.pdf (md5:913ad92e8879193575df4a9c0811df51) | 4.3 MB |
Additional details
Funding
- European Commission
- DAEMON – Network intelligence for aDAptive and sElf-Learning MObile Networks 101017109
- European Commission
- BeGREEN – Beyond 5G Artificial Intelligence Assisted Energy Efficient Open Radio Access Network 101097083
- European Commission
- ORIGAMI – Optimized resource integration and global architecture for mobile infrastructure for 6G 101139270