Published November 6, 2019 | Version v1
Conference paper Open

Text Analysis of ETDs in ProQuest Dissertations and Theses (PQDT) Global (2016-2018)

  • 1. University of Delhi


The information explosion in the form of ETDs poses the challenge of managing them and extracting appropriate knowledge for decision making. The present study proposes a solution to this problem by applying topic mining and prediction modeling tools to 263 ETDs in the field of library science submitted to the PQDT Global database during 2016-2018. The study was divided into two phases. The first phase determined the core topics of the ETDs using Topic-Modeling-Tool (TMT), which is based on latent Dirichlet allocation (LDA). The second phase employed prediction analysis on the RapidMiner platform to annotate future research articles on the basis of the modeled topics. The core topics (tags) for the period studied were found to be book history, school librarian, public library, communicative ecology, and informatics. Topic modeling was followed by text network and trend analysis of the high-probability co-occurring words. Lastly, a prediction model using a Support Vector Machine (SVM) classifier was created to predict the placement of future ETDs submitted to PQDT Global under the five modeled topics (a to e). The predictive model, trained on the training data set, performed perfectly on the test data set.
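The first phase can be illustrated with a minimal sketch of LDA fitted by collapsed Gibbs sampling. This is not the Topic-Modeling-Tool used in the study. The function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the four toy documents are all assumptions made for illustration, and none of the data comes from the 263 ETDs.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling.

    Hypothetical helper for illustration; alpha and beta are assumed
    symmetric Dirichlet priors, not values taken from the study.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]         # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                       # total tokens in topic k
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed per-document topic proportions.
    theta = [
        [(n + alpha) / (len(doc) + num_topics * alpha) for n in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    # Most frequent words currently assigned to each topic.
    top_words = [
        [w for w, c in topic_word[t].most_common(3) if c > 0]
        for t in range(num_topics)
    ]
    return theta, top_words


# Toy corpus (invented for this sketch).
docs = [
    "public library community outreach public library".split(),
    "school librarian student reading school librarian".split(),
    "public library funding community".split(),
    "school librarian curriculum student".split(),
]
theta, top_words = lda_gibbs(docs, num_topics=2)

# Tag each document with its highest-probability topic.
tags = [max(range(len(row)), key=row.__getitem__) for row in theta]
print(top_words, tags)
```

The dominant-topic tags produced this way could serve as class labels for the second phase, in which a classifier such as an SVM is trained to place new documents under the modeled topics. That step is not sketched here.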


Text Analysis of ETDs in ProQuest Dissertations and Theses.pdf

Files (1.1 MB)