Published January 3, 2023 | Version 1.0.0
Preprint Open

A comprehensive review of automatic text summarization techniques: method, data, evaluation and coding

Description

We provide a literature review of Automatic Text Summarization (ATS) systems, following a citation-based approach. We start with popular, well-known papers that we had in hand on each topic we want to cover, and then track the "backward citations" (papers cited by that initial set) and the "forward citations" (newer papers that cite the initial set). To organize the different methods, we present the diverse approaches to ATS according to the mechanisms they use to generate a summary. Besides presenting the methods, we also provide an extensive review of the datasets available for summarization tasks and of the methods used to evaluate the quality of summaries. Finally, we present an empirical exploration of these methods using the CNN Corpus dataset, which provides golden summaries for extractive and abstractive methods.

Notes

Partial financial support from the Ministry of Science and Technology of Brazil (MCTI) and CNPq: DOC (grant number 302629/2019-0) and LW (grant number 309545/2021-8).

Files

ats-experiments-contents.zip (12.9 MB)
md5:971caa89c7a4d5dde709b06a535ac853
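
The md5 value listed for ats-experiments-contents.zip can be used to check that a downloaded copy is intact. The following is a minimal sketch using only Python's standard hashlib module; the helper names md5_of_file and matches_record are hypothetical and are not part of the archive, and the local file path is an assumption about where the download was saved.

```python
import hashlib

# md5 checksum as listed on this record for ats-experiments-contents.zip
EXPECTED_MD5 = "971caa89c7a4d5dde709b06a535ac853"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_MD5):
    """Return True if the file's md5 digest equals the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

For example, matches_record("ats-experiments-contents.zip") should return True for an uncorrupted download saved under that name in the current directory.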