Published February 19, 2024 | Version v1
Conference paper | Open Access

Best-of-Both-Worlds Algorithms for Linear Contextual Bandits

  • 1. CENTAI Institute
  • 2. University of Milan
  • 3. The University of Tokyo
  • 4. Politecnico di Milano

Description

We study best-of-both-worlds algorithms for K-armed linear contextual bandits. Our algorithms deliver near-optimal regret bounds in both the adversarial and stochastic regimes, without prior knowledge about the environment. In the stochastic regime, we achieve the polylogarithmic rate (dK)² polylog(dKT) / ∆_min, where ∆_min is the minimum suboptimality gap over the d-dimensional context space. In the adversarial regime, we obtain either the first-order O(dK√L∗) bound or the second-order O(dK√Λ∗) bound, where L∗ is the cumulative loss of the best action and Λ∗ is a notion of the cumulative second moment of the losses incurred by the algorithm. Moreover, we develop an algorithm based on FTRL with the Shannon entropy regularizer that does not require knowledge of the inverse of the covariance matrix, and that achieves polylogarithmic regret in the stochastic regime while obtaining O(dK√T) regret bounds in the adversarial regime.
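The description mentions an FTRL algorithm with a Shannon entropy regularizer. As a rough illustration of that general technique only (not the paper's algorithm): for linear losses over the probability simplex, FTRL with Shannon entropy reduces to an exponential-weights update over cumulative loss estimates. The sketch below ignores contexts, uses a plain importance-weighted loss estimator, and fixes the learning rate; all of these are simplifying assumptions, and the paper's estimator and learning-rate schedule differ.

```python
import numpy as np

def ftrl_shannon_entropy(K, T, eta, loss_fn, seed=None):
    """Toy FTRL / exponential-weights loop over K arms for T rounds.

    A minimal sketch: with a Shannon entropy regularizer and linear losses,
    the FTRL iterate is p_t(a) proportional to exp(-eta * cumulative loss
    estimate of arm a).
    """
    rng = np.random.default_rng(seed)
    cum_loss_est = np.zeros(K)   # cumulative loss estimates per arm
    total_loss = 0.0
    for t in range(T):
        # FTRL with Shannon entropy = softmax over negative cumulative estimates
        logits = -eta * cum_loss_est
        p = np.exp(logits - logits.max())
        p /= p.sum()
        a = rng.choice(K, p=p)
        loss = loss_fn(t, a)      # only the pulled arm's loss is observed
        total_loss += loss
        # importance-weighted (unbiased) estimate for the pulled arm
        cum_loss_est[a] += loss / p[a]
    return total_loss

if __name__ == "__main__":
    # Hypothetical stochastic environment with Bernoulli losses, for illustration.
    K, T = 5, 10_000
    means = np.linspace(0.2, 0.8, K)
    env_rng = np.random.default_rng(0)
    loss_fn = lambda t, a: float(env_rng.random() < means[a])
    eta = np.sqrt(np.log(K) / (K * T))   # standard exponential-weights rate
    print(ftrl_shannon_entropy(K, T, eta, loss_fn, seed=1))
```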

Files

2312.15433v2.pdf (654.2 kB)
md5:4522fd67702ffb6c398da5bf4cffd742
Additional details

Funding

European Commission
ELIAS – European Lighthouse of AI for Sustainability (grant no. 101120237)