QReCC - Question Rewriting in Conversational Context
====================================================

QReCC contains 14K conversations with 81K question-answer pairs.

See the GitHub repository [1] for more information.

In the version published here, qrecc-test.json has an additional field, "Passages", which contains the IDs of the passages that (1) were retrieved by one of the methods from the paper [2] and (2) have a token-overlap F1 of at least 0.8 with the human answer to the question.
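
The filter in condition (2) can be illustrated with a minimal sketch. The exact tokenization and normalization used to build the "Passages" field are not stated here, so the lowercased whitespace tokenization below is an assumption, and the function names `token_f1` and `passes_threshold` are hypothetical.

```python
from collections import Counter


def token_f1(passage: str, answer: str) -> float:
    """Token-overlap F1 between two strings.

    Assumption: tokens are lowercased whitespace-separated words; the
    dataset authors may have normalized the text differently.
    """
    passage_tokens = passage.lower().split()
    answer_tokens = answer.lower().split()
    if not passage_tokens or not answer_tokens:
        # Two empty strings match perfectly; otherwise there is no overlap.
        return float(passage_tokens == answer_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it occurs in both strings.
    overlap = sum((Counter(passage_tokens) & Counter(answer_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(passage_tokens)
    recall = overlap / len(answer_tokens)
    return 2 * precision * recall / (precision + recall)


def passes_threshold(passage: str, answer: str, threshold: float = 0.8) -> bool:
    """True if the passage would be kept under the 'F1 >= 0.8' condition."""
    return token_f1(passage, answer) >= threshold
```

Under these assumptions, a passage that shares three of four tokens with a four-token answer has precision and recall of 0.75 each, so its F1 is 0.75 and it would not be listed.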

passages.zip contains the passages created as described in the GitHub repository [3] (i.e., after applying paragraph_chunker.py).


SCAI-QReCC-21
-------------
The test dataset is used in the SCAI QReCC'21 shared task [4]. This collection contains the following files:
  - scai-qrecc21-turns.json
    Adaptation of qrecc-test.json. The fields "Rewrite", "Passages", and "Answer" are renamed to "Truth_rewrite", "Truth_passages", and "Truth_answer", respectively. A field "Transformer_rewrite" is added that contains the question rewrites produced by the "Transformer++" approach from the paper [2].
  - scai-qrecc21-questions.json
    Generated from scai-qrecc21-turns.json [5]. Contains the "Conversation_no", "Turn_no", "Context", and "Question" for each turn.
  - scai-qrecc21-questions-with-rewrites.json
    Generated from scai-qrecc21-turns.json [5]. Contains the "Conversation_no", "Turn_no", "Context", and "Question" for each turn and for each decontextualized turn. A decontextualized turn has an empty context and a question that is (1) not rewritten, (2) rewritten with the "Transformer++" approach from the paper [2], or (3) rewritten by a human (i.e., the content of the "Truth_rewrite" field). New "Conversation_no" values are created for these additional turns.
  - scai-qrecc21-ground-truth.json
    Generated from scai-qrecc21-turns.json [5]. Contains the "Truth_rewrite", "Truth_passages", and "Truth_answer" fields for each turn. The "Conversation_no" and "Turn_no" are provided for the different rewrites (see scai-qrecc21-questions-with-rewrites.json) in `Turns.*rewrite-type*`, where the rewrite type is one of "model" (not decontextualized), "original" (no rewrite), "transformer" (rewritten with the "Transformer++" approach from the paper [2]), and "human" (the rewrite is equal to the "Truth_rewrite" field).
  - scai-qrecc21-naacl-baseline.json
    Run file for the end-to-end approach from the paper [2], used as a baseline in the task.
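
The relationship between the files above can be sketched in Python. This is an illustration only, not the script referenced in [5]: the field names come from the descriptions above, while the list-of-objects layout, the list-valued "Context", the example record, and the helper names `split_turns` and `decontextualized_questions` are assumptions. The scheme for assigning new "Conversation_no" values is not specified here and is therefore left out.

```python
import json

# Field names are taken from the file descriptions above; the layout
# (a JSON list of turn objects) is an assumption.
QUESTION_FIELDS = ("Conversation_no", "Turn_no", "Context", "Question")
TRUTH_FIELDS = ("Truth_rewrite", "Truth_passages", "Truth_answer")


def split_turns(turns):
    """Split turn records into question-only and ground-truth-only records."""
    questions = [{key: turn[key] for key in QUESTION_FIELDS} for turn in turns]
    truths = [
        {key: turn[key] for key in ("Conversation_no", "Turn_no") + TRUTH_FIELDS}
        for turn in turns
    ]
    return questions, truths


def decontextualized_questions(turn):
    """Return the three context-free variants of a turn, keyed by rewrite type."""
    return {
        "original": {"Context": [], "Question": turn["Question"]},
        "transformer": {"Context": [], "Question": turn["Transformer_rewrite"]},
        "human": {"Context": [], "Question": turn["Truth_rewrite"]},
    }


# Hypothetical example turn. With the real data, load the list with
# json.load(open("scai-qrecc21-turns.json", encoding="utf-8")) instead.
example_turns = json.loads("""
[{
  "Conversation_no": 1,
  "Turn_no": 2,
  "Context": ["What is QReCC?", "A dataset for conversational question answering."],
  "Question": "How many conversations does it contain?",
  "Truth_rewrite": "How many conversations does QReCC contain?",
  "Transformer_rewrite": "How many conversations does QReCC contain?",
  "Truth_passages": ["hypothetical_passage_id"],
  "Truth_answer": "QReCC contains 14K conversations."
}]
""")

questions, truths = split_turns(example_turns)
variants = decontextualized_questions(example_turns[0])
```

Keeping the questions and the ground truth in separate records mirrors the split between the files a participant receives and the file used for evaluation.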
    

References
----------
[1] https://github.com/apple/ml-qrecc
[2] https://arxiv.org/abs/2010.04898
[3] https://github.com/apple/ml-qrecc/tree/main/collection
[4] https://scai.info/scai-qrecc/
[5] https://github.com/scai-conf/SCAI-QReCC-21/blob/main/code/util/turns_split_data_and_truth.sh