Machine Learning-Based Semantic Analysis of Scientific Publications for Knowledge Extraction in Safety-Critical Domains
Authors/Creators
Description
This article presents the development of a modular software suite for automated analysis of scientific publications in PDF format. The system integrates vectorization, clustering, topic modelling, dimensionality reduction, and fuzzy logic to combine formal (vector-based) and semantic (topic-based) approaches. Interactive 3D visualization supports intuitive exploration of thematic clusters, allowing users to highlight relevant documents and adjust analytical parameters. Validation on a maritime safety case study confirmed the system’s ability to process large publication collections, identify relevant sources, and reveal underlying knowledge structures. Compared to established frameworks such as PRISMA or Scopus/WoS Analytics, the proposed tool operates directly on full-text content, provides deeper thematic classification, and does not require subscription-based databases. The study also addresses limitations arising from data bias and reproducibility issues that affect semantic interpretability in safety-critical decision-making systems. The approach offers practical value for organizations in safety-critical domains (including transportation, energy, cybersecurity, and human–machine interaction), where rapid access to thematically related research is essential.
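To make the idea of combining a vector-based and a topic-based signal with fuzzy logic more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the tokenizer, the TF-IDF weighting, the keyword set standing in for a topic model, the ramp membership function with its 0.0 and 0.5 thresholds, and the minimum operator used as fuzzy AND are all assumptions chosen for illustration. Clustering, dimensionality reduction, and PDF parsing are left out.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def topic_overlap(doc, topic_terms):
    """Share of topic keywords found in the document (stand-in for a topic model)."""
    if not topic_terms:
        return 0.0
    tokens = set(doc.lower().split())
    return len(tokens & topic_terms) / len(topic_terms)


def ramp(x, low, high):
    """Fuzzy membership in 'high': 0 at or below low, 1 at or above high, linear between."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def fuzzy_relevance(vector_score, topic_score):
    """Fuzzy AND (minimum t-norm) of the two 'high' memberships; thresholds are assumed."""
    return min(ramp(vector_score, 0.0, 0.5), ramp(topic_score, 0.0, 0.5))


def rank(docs, query, topic_terms):
    """Score each document against the query (vector-based) and the topic (keyword-based)."""
    vectors = tfidf_vectors(docs + [query])
    query_vec = vectors[-1]
    return [
        fuzzy_relevance(cosine(vec, query_vec), topic_overlap(doc, topic_terms))
        for doc, vec in zip(docs, vectors)
    ]


# Hypothetical toy data, not taken from the study.
docs = [
    "ship collision risk assessment in maritime navigation",
    "deep learning for image classification benchmarks",
]
topic = {"maritime", "ship", "navigation", "collision"}
scores = rank(docs, "maritime collision risk", topic)
```

With the minimum operator, a document scores as relevant only when both signals agree, which is one simple way to read "combining formal and semantic approaches". A weighted average or a rule base would be equally plausible choices.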
Files
| Name | Size |
|---|---|
| make-07-00150-v2.pdf (md5:ae0ab86d6748748477a71dcc85f5e2c8) | 3.2 MB |
Additional details
Funding
- Ministry of Education, Youth and Sports
- Programme Johannes Amos Comenius