Unveiling Hidden Bonds: A Deep Autoencoder Framework for the Autonomous Isolation and Archetype Generation of Crystallization Water in Mineral ATR-IR Spectroscopy
Authors/Creators
Description
Infrared (IR) spectroscopy is essential for mineralogical analysis, but spectral classification is often complicated by high dimensionality and subtle band overlaps, particularly in the diagnostic hydration region (2800-3800 cm⁻¹). This study introduces an unsupervised machine learning framework that uses a Densely Connected Autoencoder (DAE) for feature extraction and dimensionality reduction of 150 mineral ATR-IR spectra sourced from the RRUFF database. The core methodology is a two-stage K-Means clustering approach. In the first stage, the DAE is trained on the full spectral range (400-3800 cm⁻¹) to establish classes based on fundamental structural chemistry (e.g., silicates vs. carbonates). In the second stage, the DAE input is restricted to the hydration range so that minerals are separated by H₂O/OH bonding type. The DAE learned a compact 40-dimensional latent representation. The second stage autonomously isolated a highly distinct spectral archetype (Cluster 9), dominated by gypsum (CaSO₄·2H₂O), which represents the pure, noise-free pseudo-spectrum of crystallization water. This archetype shows the expected two narrow, sharp H₂O peaks and is clearly differentiated from the broader bands of complex/acidic hydrates (Cluster 3) and the single, sharp signals of structural hydroxyl groups (Cluster 5). The methodology offers a data-driven way to generate clean spectral standards, enabling comparison with potentially noisy or historical ATR-IR measurements without manual denoising.
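The two-stage idea described above can be illustrated with a minimal, standard-library-only sketch. It is not the study's implementation: the autoencoder is omitted, so K-Means runs directly on band intensities rather than on 40-dimensional latent codes. The four spectra are synthetic toy Gaussians, the band positions and widths are illustrative assumptions, and treating a cluster centroid as the "archetype" is likewise an assumption made for this example. All function names are hypothetical.

```python
import math

def gaussian_band(w, center, width, height):
    """One Gaussian absorption band evaluated at wavenumber w (cm^-1)."""
    return height * math.exp(-((w - center) ** 2) / (2.0 * width ** 2))

def make_spectrum(wavenumbers, bands):
    """Sum of Gaussian bands; each band is (center, width, height)."""
    return [sum(gaussian_band(w, c, s, h) for c, s, h in bands) for w in wavenumbers]

def restrict(wavenumbers, spectra, lo, hi):
    """Keep only the channels whose wavenumber lies in [lo, hi]."""
    idx = [i for i, w in enumerate(wavenumbers) if lo <= w <= hi]
    return [[s[i] for i in idx] for s in spectra]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, n_iter=100):
    """Plain Lloyd's K-Means with deterministic farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep an empty cluster's old center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

def count_peaks(values, min_height):
    """Number of strict local maxima above min_height."""
    return sum(
        1 for i in range(1, len(values) - 1)
        if values[i] > min_height and values[i] > values[i - 1] and values[i] > values[i + 1]
    )

# Toy data (assumed values, not RRUFF measurements).
wavenumbers = list(range(400, 3801, 20))
SULFATE_LIKE = (1100, 60, 1.0)
CARBONATE_LIKE = (1420, 60, 1.0)
WATER_DOUBLET = [(3400, 25, 0.4), (3520, 25, 0.4)]  # two narrow H2O-like bands
HYDROXYL = [(3640, 25, 0.4)]                         # one sharp OH-like band

spectra = [
    make_spectrum(wavenumbers, [SULFATE_LIKE] + WATER_DOUBLET),
    make_spectrum(wavenumbers, [CARBONATE_LIKE] + WATER_DOUBLET),
    make_spectrum(wavenumbers, [SULFATE_LIKE] + HYDROXYL),
    make_spectrum(wavenumbers, [CARBONATE_LIKE] + HYDROXYL),
]

# Stage 1: cluster on the full range (400-3800 cm^-1).
labels_full, _ = kmeans(spectra, k=2)

# Stage 2: cluster only on the hydration range (2800-3800 cm^-1).
hydration = restrict(wavenumbers, spectra, 2800, 3800)
labels_hyd, centers_hyd = kmeans(hydration, k=2)

# Centroid of the cluster containing the first (water-doublet) spectrum.
archetype = centers_hyd[labels_hyd[0]]
print("stage 1 labels:", labels_full)
print("stage 2 labels:", labels_hyd)
print("peaks in doublet-cluster centroid:", count_peaks(archetype, 0.1))
```

In this toy setting the strong fingerprint bands drive the full-range grouping, while restricting the input to the hydration window lets the weaker H₂O/OH bands decide the clusters. This is the motivation the description gives for the second stage.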
Files
densedense.pdf (1.6 MB)
md5:f49e29eb65bb478aa25c1dcc6a2a9353