Machine Learning techniques and visualization tools for STM images at CNR-IOM labs
Authors/Creators
Description
This thesis reports the activities carried out on a scientific archive of scanning tunneling microscopy (STM) images, with the objective of organizing it into a more structured and convenient dataset. The work, performed in close cooperation with scientists, is framed within the urgent and relevant issue of correct data management in scientific research. The solutions developed in this thesis are designed around the European Commission guidelines on research data management, which follow a concise set of principles known as the FAIR Data Principles. The problem was tackled with different tools and techniques that address several shortcomings of the original dataset, meeting the needs of the data producers on the one hand and opening the dataset to a broader audience on the other. In particular, the achievements of this work include the creation of a database to store the metadata of the scientific images and the development of a web service to explore these metadata visually through interactive tools. Moreover, an analysis of the images and their representation led to the implementation of a novel technique to automatically detect and correct a particular type of image artifact. Finally, this thesis focused on the development of a workflow for labeling the images based on the material composition of the measured samples. This goal was achieved using machine learning techniques, such as feature learning, to retrieve relevant images that help researchers in the manual labeling process. The complete source code used to reach the objectives of this thesis is open-source and publicly accessible.
The thesis is organized as follows. Chapter one briefly introduces the problem of scientific data management, frames the specific challenges faced in this work within the data management activities of CNR-IOM, and illustrates the necessary technical elements of STM images. The second chapter introduces the original dataset, describes the development steps of the metadata database, and discusses the enhancements made to improve researchers' interaction with this new service. The third chapter illustrates the metadata web service and its integration in the Trieste Advance Data service (TriDAS) website, with particular attention to the metadata selection process and further platform development. Chapter four describes the representation and common artifacts of the images, presenting an original solution for one of them. It also describes the image labeling method implemented together with the research group. Finally, chapter five draws some conclusions and highlights possible additional steps that can be performed on the dataset.
Files
| Name | Size | Checksum |
|---|---|---|
| Machine_Learning_techniques_and_visualization_tools_for_STM_images_at_CNR_IOM_labs.pdf | 13.5 MB | md5:efd2b55faa1832e521d10e07cbc46651 |
Additional details
Related works
- References: Software, 10.5281/zenodo.4019641 (DOI)