AOL4FOLTR
Description
AOL4FOLTR is the first learning-to-rank (LTR) dataset designed specifically for evaluating federated online learning-to-rank (FOLTR) algorithms.
Because it includes user identifiers and timestamps, the dataset allows the simulation of real user behavior with heterogeneous data and in asynchronous federated learning settings.
The dataset consists of two files:
- letor.txt.gz (55 GB uncompressed)
- metadata.csv
letor.txt contains the query-document pairs for all query logs in standard LETOR format. Each query-document pair holds a binary label derived from user clicks, and is further represented by a 103-dimensional vector. We document the features in our code repository.
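As a rough illustration of the standard LETOR layout (`<label> qid:<id> <index>:<value> ...`, optionally followed by a `#` comment), the sketch below parses one line and streams lines from the compressed file. The helper names `parse_letor_line` and `iter_letor` are hypothetical, the sample line is made up, and the exact index range and sparsity of the features should be checked against the code repository.

```python
import gzip
from typing import Dict, Iterator, Tuple

Example = Tuple[int, str, Dict[int, float]]

def parse_letor_line(line: str) -> Example:
    """Parse one LETOR line into (label, qid, {feature_index: value})."""
    # Drop an optional trailing comment introduced by '#'.
    body = line.split("#", 1)[0].strip()
    tokens = body.split()
    # Accept "1" as well as "1.0" for the binary click label.
    label = int(float(tokens[0]))
    if not tokens[1].startswith("qid:"):
        raise ValueError("expected 'qid:<id>' as the second token")
    qid = tokens[1][len("qid:"):]
    features: Dict[int, float] = {}
    for tok in tokens[2:]:
        idx, val = tok.split(":", 1)
        features[int(idx)] = float(val)
    return label, qid, features

def iter_letor(path: str) -> Iterator[Example]:
    """Stream examples from a gzip-compressed LETOR file, line by line."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                yield parse_letor_line(line)

# Made-up line for illustration; not taken from the dataset.
sample = "1 qid:42 1:0.5 2:0.0 103:3.25 # illustrative"
label, qid, features = parse_letor_line(sample)
```

Reading through `gzip.open` avoids writing the 55 GB uncompressed file to disk, at the cost of sequential-only access.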
Each query log is cross-referenced by qid in metadata.csv, which provides contextual information: the user, timestamp, raw query, target document ID, and a list of 20 candidate documents.
The document IDs and user IDs map directly to the AOL-IA dataset; the query IDs do not. For access to the raw document contents, please refer to the AOL-IA dataset.
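To illustrate how the user and timestamp fields in metadata.csv could drive a federated simulation, here is a minimal sketch that partitions query logs into per-user clients and orders each client's logs chronologically. The column names (`qid`, `user_id`, `time`, `query`), the timestamp format, and the sample rows are all assumptions made for illustration; the real header of metadata.csv should be checked first.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical column names and made-up rows, not taken from the dataset.
SAMPLE = """qid,user_id,time,query
1,u1,2006-03-01 10:00:00,weather
2,u2,2006-03-01 09:30:00,news
3,u1,2006-03-01 08:15:00,maps
"""

def group_by_user(rows, user_col="user_id", time_col="time"):
    """Group rows by user and sort each user's rows by timestamp."""
    clients = defaultdict(list)
    for row in rows:
        clients[row[user_col]].append(row)
    for logs in clients.values():
        # Assumes ISO-like timestamps that datetime.fromisoformat can parse.
        logs.sort(key=lambda r: datetime.fromisoformat(r[time_col]))
    return dict(clients)

clients = group_by_user(csv.DictReader(io.StringIO(SAMPLE)))
# Per-client query IDs in the order each user issued them.
client_qids = {user: [r["qid"] for r in logs] for user, logs in clients.items()}
```

For the full 1.0 GB file, replacing the in-memory sample with `open("metadata.csv", newline="")` keeps the same logic; each client's ordered qids can then be used to look up the matching rows in letor.txt.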
Files
(10.0 GB)
| Name | Size | MD5 |
|---|---|---|
| letor.txt.gz | 9.0 GB | 75362ad22b58cdf2b1252ea325f6beb1 |
| metadata.csv | 1.0 GB | bbd1cdfeec45120b9d8b0423dc5bd003 |
Additional details
Additional titles
- Subtitle: A Large-Scale Web Search Dataset for Federated Online Learning to Rank
Funding
- Dutch Research Council: BLOCK.2019.004
Software
- Repository URL: https://github.com/mg98/aol4foltr
- Programming language: Python