Published December 1, 2023 | Version v1
Journal article Open

A hybrid approach for text summarization using semantic latent Dirichlet allocation and sentence concept mapping with transformer

  • 1. Amrita Vishwa Vidyapeetham
  • 2. Oracle (United States)

Description

Automatic text summarization generates a summary whose sentences reflect the essential and relevant information of the original documents. Extractive summarization requires semantic understanding, while abstractive summarization requires a better intermediate text representation. This paper proposes a hybrid approach for generating text summaries that combines extractive and abstractive methods. To improve the semantic understanding of the model, we propose two novel extractive methods: semantic latent Dirichlet allocation (semantic LDA) and sentence concept mapping. We then generate an intermediate summary by applying our proposed sentence ranking algorithm over the sentence concept mapping. This intermediate summary is input to a transformer-based abstractive model fine-tuned with a multi-head attention mechanism. Our experimental results demonstrate that the proposed hybrid model generates coherent summaries from an intermediate extractive summary that captures the semantics of the source. As we increase the number of concepts and the number of words in the summary, the ROUGE precision and F1 scores of the proposed model improve.
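
For readers unfamiliar with the metrics named above, the following is a minimal sketch of ROUGE-1 (unigram overlap) precision, recall, and F1. It assumes lowercased whitespace tokenization and a single reference summary; the function name `rouge_1` and the example strings are illustrative only and are not taken from the paper, whose exact evaluation setup is not described here.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> dict:
    """Unigram-overlap ROUGE-1 scores (illustrative sketch, not the paper's setup)."""
    # Assumption: simple lowercased whitespace tokenization.
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()

    # Clipped overlap: each word counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())

    precision = overlap / len(cand_tokens) if cand_tokens else 0.0
    recall = overlap / len(ref_tokens) if ref_tokens else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    scores = rouge_1("the cat sat", "the cat sat on the mat")
    print(scores)
```

Precision divides the overlap by the candidate length and recall divides it by the reference length, so changing the summary length affects the two scores differently; F1 is their harmonic mean.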

Files

67 30497 EM K.pdf (584.8 kB)
md5:1756031609425a4d7a7f9b031f82d269