Published June 18, 2025 | Version v1.0

skcpda/bandit-quota: v1.0 Bandit-quota

Description

Bandit‑Quota (6‑arm) — BEIR (all 13 BEIR datasets)

A lightweight contextual‑bandit retrieval demo that combines six off‑the‑shelf dense encoders with a latency‑aware Thompson‑sampling policy.
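
The exact policy lives in `scripts/bandit_quota_artifact.py`; purely as an illustration of the idea (not the artifact's code, and with the per-query context dropped for brevity), a Beta-Bernoulli Thompson sampler whose sampled reward is discounted by each arm's running latency could look like this:

```python
import random

class LatencyAwareThompson:
    """Toy 6-arm Thompson sampler that trades retrieval quality against latency.

    Illustrative sketch only: the latency penalty `lam` and the reward
    bookkeeping are assumptions, not the artifact's actual policy.
    """

    def __init__(self, n_arms: int = 6, lam: float = 0.1):
        self.alpha = [1.0] * n_arms       # Beta posterior "successes"
        self.beta = [1.0] * n_arms        # Beta posterior "failures"
        self.avg_latency = [0.0] * n_arms # running mean latency per arm (seconds)
        self.pulls = [0] * n_arms
        self.lam = lam                    # weight of the latency penalty

    def select_arm(self) -> int:
        # Sample a plausible reward per arm, then penalise historically slow arms.
        scores = [
            random.betavariate(self.alpha[i], self.beta[i]) - self.lam * self.avg_latency[i]
            for i in range(len(self.alpha))
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm: int, reward: float, latency: float) -> None:
        # reward in [0, 1], e.g. a scaled per-query nDCG@10; latency in seconds.
        self.alpha[arm] += reward
        self.beta[arm] += 1.0 - reward
        self.pulls[arm] += 1
        self.avg_latency[arm] += (latency - self.avg_latency[arm]) / self.pulls[arm]
```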

The pipeline reproduces the headline results reported in our CIKM 2025 resource‑track submission:

```
Bandit    nDCG@10 ≈ 0.704   mean latency ≈ 0.91 s/query
Union‑6   nDCG@10 ≈ 0.491   mean latency ≈ 6.97 s/query
```

Everything lives in a single, self‑contained script — `scripts/bandit_quota_artifact.py` — that you can run on any CPU‑only machine with ≥16 GB RAM.

---

Requirements

  • Python 3.9 – 3.12
  • `pip install -r requirements.txt` (≈ 900 MB once all HF models are cached)
  • No GPU needed: the reranker and encoders run comfortably on a modern laptop.

> Tip: if you have already cached the models elsewhere, set `TRANSFORMERS_OFFLINE=1` before running.
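
The same offline switch can also be flipped from inside Python before any Hugging Face import, which is handy in notebooks (the second variable covers newer `huggingface_hub` releases):

```python
import os

# Set BEFORE importing transformers / sentence-transformers so that model
# weights are resolved from the local cache instead of the Hugging Face Hub.
os.environ["TRANSFORMERS_OFFLINE"] = "1"
os.environ["HF_HUB_OFFLINE"] = "1"
```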

---

Quick‑start

```bash
# 1) clone and enter
$ git clone https://github.com/skcpda/bandit-quota
$ cd bandit-quota

# 2) (optional) create a virtual-env, then install dependencies
$ python -m venv .venv && source .venv/bin/activate
$ pip install -r requirements.txt

# 3) run the artifact script
$ python scripts/bandit_quota_artifact.py
```

The script automatically downloads the BEIR SciFact test split (~9 MB) on first launch, then produces the per‑arm baselines, the naïve union run, and the Bandit‑Quota scores.
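
The download is handled inside the artifact script; for reference, fetching and loading SciFact with the `beir` package typically looks like the sketch below (the output directory is arbitrary):

```python
from beir import util
from beir.datasets.data_loader import GenericDataLoader

# Download and unpack the SciFact benchmark (~9 MB) into ./datasets/scifact.
url = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/scifact.zip"
data_path = util.download_and_unzip(url, "datasets")

# corpus:  {doc_id: {"title": ..., "text": ...}}
# queries: {qid: query_text},  qrels: {qid: {doc_id: relevance}}
corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split="test")
print(f"{len(queries)} test queries")   # 300 for SciFact
```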

Expected terminal tail:

```
=== SciFact test (300 queries) ===
Bandit    nDCG@10 0.7043   mean lat 0.907s
Union‑6   nDCG@10 0.4908   mean lat 6.970s
```
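
If you want to recompute the metric from a run of your own, the `beir` evaluation helper (a thin wrapper around pytrec_eval) can be called as in this toy sketch, where `results` maps each query id to `{doc_id: score}`:

```python
from beir.retrieval.evaluation import EvaluateRetrieval

# Toy inputs: one query with two judged documents.
qrels = {"q1": {"d1": 1, "d2": 0}}          # ground-truth relevance judgements
results = {"q1": {"d1": 0.92, "d2": 0.41}}  # scores from whichever retriever you ran

ndcg, _map, recall, precision = EvaluateRetrieval.evaluate(qrels, results, k_values=[10])
print("nDCG@10:", ndcg["NDCG@10"])
```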

Similarly, any other BEIR dataset can be run with commands like these (a small batch‑driver sketch follows the dataset list below):

  • python scripts/bandit_quota.py --dataset nfcorpus
  • python scripts/bandit_quota.py --dataset trec-covid

Here is the full list of BEIR datasets:

  • TREC-COVID (COVID-19 literature)
  • NFCorpus (medical / nutrition information retrieval)
  • SciFact (scientific claim verification)
  • SCIDOCS (scientific document retrieval)
  • FEVER (fact verification)
  • Climate-FEVER (climate change verification)
  • HotpotQA (multi-hop QA)
  • NaturalQuestions (open-domain QA)
  • FiQA-2018 (financial QA)
  • ArguAna (argument retrieval)
  • CQADupStack (duplicate-question retrieval) – treated as separate StackExchange sub-forums (android, english, gaming, webmasters, etc.)
  • DBPedia (entity retrieval)
  • TREC-NEWS (news article retrieval)
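
To sweep several benchmarks in one go, a small driver can shell out to the script for each dataset key. The keys below are the standard BEIR identifiers; whether every one of them is mapped in the script's `URLS` table is an assumption, so trim the list to what you need (CQADupStack sub-forums are addressed individually in BEIR, e.g. `cqadupstack/webmasters`):

```python
import subprocess

# Standard BEIR dataset keys, as passed to --dataset above.
DATASETS = [
    "trec-covid", "nfcorpus", "scifact", "scidocs", "fever",
    "climate-fever", "hotpotqa", "nq", "fiqa", "arguana",
    "dbpedia-entity", "trec-news",
]

for name in DATASETS:
    print(f"=== {name} ===")
    subprocess.run(
        ["python", "scripts/bandit_quota.py", "--dataset", name],
        check=True,   # stop the sweep if any run fails
    )
```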

Command-line interface at a glance: `scripts/rerank_single.py`

```bash
python scripts/rerank_single.py \
       --arm bge \
       --dataset scifact \
       --topk 200 \
       --rerank 50
```

| Flag | Required? | Default | Accepted values | What it controls |
|------|-----------|---------|-----------------|-------------------|
| `--arm` | yes | – | `bge`, `contr`, `mpnet`, `gtr`, `minilm`, `distil` | Which dense encoder to fire. |
| `--dataset` | no | `scifact` | any BEIR key you’ve mapped in `URLS` | Target benchmark corpus. |
| `--topk` | no | 200 | positive int | How many hits to pull per encoder before merging. |
| `--rerank` | no | 50 | positive int | How many of the merged hits the MiniLM cross-encoder re-scores. |
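
For orientation only, a parser exposing these four flags would be declared roughly as follows; this is a sketch built from the defaults in the table above, not the actual code in `scripts/rerank_single.py`:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Rerank a single-arm BEIR run.")
    parser.add_argument("--arm", required=True,
                        choices=["bge", "contr", "mpnet", "gtr", "minilm", "distil"],
                        help="Which dense encoder to fire.")
    parser.add_argument("--dataset", default="scifact",
                        help="Any BEIR key mapped in URLS.")
    parser.add_argument("--topk", type=int, default=200,
                        help="Hits to pull per encoder before merging.")
    parser.add_argument("--rerank", type=int, default=50,
                        help="Merged hits the MiniLM cross-encoder re-scores.")
    return parser

if __name__ == "__main__":
    print(build_parser().parse_args())
```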



Citation

If you build on this work, please cite the resource paper: (To be updated soon)

---

License

Released under the MIT License — see the `LICENSE` file for full text.

Files

`skcpda/bandit-quota-v1.0.zip` (17.1 MB, md5:5b63589f86f74f96ac6d7b830a842a7d)
