Reproducibility of the Experimental Result of BERT for Evidence Retrieval and Claim Verification
- 1. Technische Universität Wien
Description
TU Wien Experiment Design For Data Science Assignment 2
Group 43
András Bonifác Kónya (ID:01502933)
Branimir Raguž (ID:12123474)
Thummanoon Kunanuntakij (ID:12122522)
Abstract
We attempt to reproduce the results of BERT for Evidence Retrieval and Claim Verification [1]. The original paper uses BERT for evidence-based claim verification on the FEVER dataset of 50K Wikipedia pages [2]. It reports a new state-of-the-art recall of 87.1 for retrieving evidence sentences from the dataset and ranks second on the leaderboard with a FEVER score of 69.7. We discuss the experiment design and the metrics used, and attempt to reproduce the results. After reviewing the process described in the original paper, we conclude that the experiment design is questionable and that the results might not generalize well. Although we were unable to confirm the reported numbers, owing to various difficulties in recreating the dataset and to time constraints, we document the problems we encountered and our efforts to resolve them.
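The two figures quoted above, evidence recall and FEVER score, can be illustrated with a minimal sketch. It assumes the usual FEVER conventions: recall is computed over verifiable claims only, and a claim counts toward the FEVER score when its label is correct and, unless the label is NOT ENOUGH INFO, at least one complete gold evidence set appears among the top five predicted sentences. This is not the official scorer and not the code used in [1]. The function names, the dictionary keys, and the toy data are hypothetical, and the functions return fractions, whereas the paper reports percentages.

```python
from typing import Dict, List, Set, Tuple

# A sentence is identified by (Wikipedia page title, sentence index).
Sentence = Tuple[str, int]

NEI = "NOT ENOUGH INFO"


def evidence_hit(predicted: List[Sentence],
                 gold_sets: List[Set[Sentence]],
                 k: int = 5) -> bool:
    """True if at least one complete gold evidence set lies in the top-k predictions."""
    top_k = set(predicted[:k])
    return any(gold <= top_k for gold in gold_sets)


def evidence_recall(examples: List[Dict], k: int = 5) -> float:
    """Fraction of verifiable claims (label other than NEI) with an evidence hit."""
    verifiable = [ex for ex in examples if ex["label"] != NEI]
    if not verifiable:
        return 0.0
    hits = sum(
        evidence_hit(ex["predicted_evidence"], ex["gold_evidence"], k)
        for ex in verifiable
    )
    return hits / len(verifiable)


def fever_score(examples: List[Dict], k: int = 5) -> float:
    """Fraction of claims with a correct label and, unless NEI, an evidence hit."""
    if not examples:
        return 0.0
    correct = 0
    for ex in examples:
        if ex["predicted_label"] != ex["label"]:
            continue  # a wrong label never counts, whatever was retrieved
        if ex["label"] == NEI or evidence_hit(
            ex["predicted_evidence"], ex["gold_evidence"], k
        ):
            correct += 1
    return correct / len(examples)


# Hypothetical toy data, for illustration only.
toy_examples = [
    {   # correct label, complete evidence retrieved
        "label": "SUPPORTS", "predicted_label": "SUPPORTS",
        "gold_evidence": [{("A", 0)}],
        "predicted_evidence": [("A", 0), ("B", 1)],
    },
    {   # correct label, but only half of a two-sentence evidence set retrieved
        "label": "REFUTES", "predicted_label": "REFUTES",
        "gold_evidence": [{("C", 2), ("C", 3)}],
        "predicted_evidence": [("C", 2)],
    },
    {   # NEI claims need no evidence
        "label": NEI, "predicted_label": NEI,
        "gold_evidence": [],
        "predicted_evidence": [],
    },
    {   # evidence retrieved, but wrong label
        "label": "SUPPORTS", "predicted_label": "REFUTES",
        "gold_evidence": [{("D", 0)}],
        "predicted_evidence": [("D", 0)],
    },
]

if __name__ == "__main__":
    print("evidence recall:", evidence_recall(toy_examples))
    print("FEVER score:", fever_score(toy_examples))
```

Under these assumptions the FEVER score cannot exceed what retrieval allows on verifiable claims, because a missed evidence set forfeits the claim even when the label is right.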
Files
Name | Size |
---|---|
Reproducibility of the Experimental Result of BERT for Evidence Retrieval.pdf (md5:9b50a46e1bbca5a407e4adb6b8eb2f6d) | 267.9 kB |
Additional details
References
- Soleimani, A., Monz, C., Worring, M.: BERT for Evidence Retrieval and Claim Verification. In: Jose, J., et al. (eds.) Advances in Information Retrieval. ECIR 2020. Lecture Notes in Computer Science, vol. 12036. Springer, Cham. https://doi.org/10.1007/978-3-030-45442-5_45
- Thorne, J., Vlachos, A., Christodoulopoulos, C., Mittal, A.: FEVER: a large-scale dataset for fact extraction and verification. arXiv preprint arXiv:1803.05355 (2018)
- Hanselowski, A., et al.: UKP-Athene: multi-sentence textual entailment for claim verification. In: Proceedings of the First Workshop on Fact Extraction and VERification (FEVER), pp. 103–108. Association for Computational Linguistics, Brussels, November 2018. https://doi.org/10.18653/v1/W18-5516
- Soleimani, A.: ASoleimaniB/BERT_FEVER. GitHub, https://github.com/ASoleimaniB/BERT_FEVER/tree/d630e7150554c72319b37729f0522b462b63603c (2020)