Published March 9, 2026 | Version v1 | Preprint | Open
Variance Reduction Techniques in Off-Policy Reinforcement Learning with Imperfect State Representation
Description
Off-policy reinforcement learning (RL) offers the potential to leverage previously collected data for training, enabling sample-efficient learning. However, challenges arise when the state representation is imperfect, leading to increased variance in policy evaluation and control. This paper investigates the application of variance reduction techniques, specifically importance sampling with clipping and weighted importance sampling, within the context of off-policy RL using imperfect state representations. We analyze the theoretical properties of these techniques and present empirical results demonstrating their effectiveness in mitigating the effects of imperfect state representations and improving learning stability.
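The two estimators named in the abstract can be sketched as follows. This is an illustrative, minimal implementation of per-trajectory ordinary importance sampling with ratio clipping and of weighted (self-normalized) importance sampling, not the paper's own code; the function names, the clip threshold of 10.0, and the toy inputs are all assumptions for illustration.

```python
def clipped_is_estimate(returns, ratios, clip=10.0):
    """Ordinary importance sampling with clipped ratios.

    returns: per-trajectory returns collected under the behavior policy.
    ratios:  per-trajectory importance ratios, i.e. the product over
             timesteps of pi(a_t|s_t) / mu(a_t|s_t) for target policy pi
             and behavior policy mu.
    clip:    ratios are truncated at this value, trading a small bias
             for a potentially large reduction in variance.
    """
    weights = [min(r, clip) for r in ratios]
    return sum(w * g for w, g in zip(weights, returns)) / len(returns)


def weighted_is_estimate(returns, ratios):
    """Weighted (self-normalized) importance sampling.

    Normalizes by the sum of the ratios rather than the sample count;
    the estimator is biased but typically has much lower variance,
    and the bias vanishes as the number of trajectories grows.
    """
    return sum(r * g for r, g in zip(ratios, returns)) / sum(ratios)


# Toy usage: one extreme ratio dominates the ordinary estimate,
# while clipping and self-normalization both temper its influence.
returns = [1.0, 2.0, 3.0]
ratios = [100.0, 1.0, 1.0]
print(clipped_is_estimate(returns, ratios))   # extreme ratio truncated to 10
print(weighted_is_estimate(returns, ratios))  # ratios normalized to sum to 1
```

Note that when all ratios equal 1 (on-policy data), both estimators reduce to the plain sample mean of the returns, which is a quick sanity check for any implementation.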
Files

| Name | Size | md5 |
|---|---|---|
| preprint_elena_rossi_20260309_005147.pdf | 6.4 kB | 995ccd254f2da6c08bddaede45149156 |
Additional details
Related works
- Cites
- Journal article: https://mattiainml.com/blog/improving-medical-imaging-models-through-robust-data-annotation/ (URL)
References
- Mattia Gaggi. Variance Reduction Techniques in Off-Policy Reinforcement Learning with Imperfect State Representation. mattiainml.com. https://mattiainml.com/blog/improving-medical-imaging-models-through-robust-data-annotation/