Published April 22, 2026 | Version v5
Dataset
Open
OneDZ: A Global Detrital Zircon Database and Implications for Constructing Giant Geoscience Database
Authors/Creators
- Li, Keran1, 2
- Hu, Xiumian1, 2
- Chai, Rong3
- Yang, Jianghai4
- Xue, Weiwei5
- Pan, Yingdi1, 2
- Li, Taiyang6
- Fang, Can7
- Ma, Anlin1, 2
- Huang, Hu8, 9, 10
- Guo, Qianqian11
- Yang, Wentao12
- Hu, Lisha13
- Qi, Liang8, 9, 10
- Chen, Guohui14
- Sun, Gaoyuan15
- Zhang, Shijie16
- Deng, Tao1, 2
- Li, Kuizhou8, 9, 17
- Sun, Jiaopeng18, 19
- Gao, Biao20, 21, 22
- 1. State Key Laboratory of Mineral Deposit Research
- 2. School of Earth Sciences and Engineering, Nanjing University
- 3. Chinese Academy of Geological Sciences
- 4. School of Earth Sciences, China University of Geosciences (Wuhan)
- 5. State Key Laboratory of Isotope Geochemistry
- 6. College of Computer Science and Cyber Security, Chengdu University of Technology
- 7. Hangzhou Research Institute, Huawei Technologies
- 8. State Key Laboratory of Oil and Gas Reservoir Geology and Exploitation
- 9. Institute of Sedimentary Geology, Chengdu University of Technology
- 10. Key Laboratory of Deep-time Geography and Environment Reconstruction and Applications of Ministry of Natural Resources, Chengdu University of Technology
- 11. College of Earth and Planetary Sciences, University of Chinese Academy of Sciences
- 12. School of Resources and Environment, Henan Polytechnic University
- 13. College of Marine Geosciences, Ocean University of China
- 14. School of Earth Sciences and Engineering, Hohai University
- 15. College of Oceanography, Hohai University
- 16. College of Tourism, Henan Normal University, Xinxiang
- 17. College of Earth and Planetary Sciences, Chengdu University of Technology
- 18. State Key Laboratory of Continental Dynamics
- 19. Department of Geology, Northwest University
- 20. State Key Laboratory of Palaeobiology and Stratigraphy
- 21. Nanjing Institute of Geology and Paleontology
- 22. Center for Excellence in Life and Palaeoenvironment, Chinese Academy of Sciences
Description
# onedz_datasets_csv
This directory contains the **split CSV datasets** of the ZirconRegular_LLM project. All files are partitioned into manageable parts (~100,000–130,000 rows each) for batch processing, LLM ingestion, or memory-constrained workflows.
## Directory Structure
```
onedz_datasets_csv/
│
├── Total_UPb_split_parts/              # Main U-Pb geochronology database
│   ├── zircon_upb_part_01.csv
│   ├── zircon_upb_part_02.csv
│   └── ... (22 parts total)
│
├── Total_LuHf_split_parts/             # Lu-Hf isotope database (all files expert-checked)
│   ├── zircon_luhf_part_01.csv
│   ├── zircon_luhf_part_02.csv
│   └── zircon_luhf_part_03.csv
│
└── Experts_checked_UPb_split_parts/    # Expert-reviewed U-Pb subsets
    ├── expert_upb_part_01.csv
    ├── expert_upb_part_02.csv
    └── ... (14 parts total)
```
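Given the layout above, the part files of one dataset can be enumerated in numerical order with the Python standard library. The following is a minimal sketch: the helper name `list_parts` is hypothetical, and the assumption that every part file name ends in `_part_NN.csv` is inferred from the tree above rather than taken from project tooling.

```python
import re
from pathlib import Path

# Matches the trailing part number in names such as zircon_upb_part_07.csv.
# Assumption: all part files follow this naming pattern.
PART_RE = re.compile(r"_part_(\d+)\.csv$")


def list_parts(dataset_dir):
    """Return the part files in dataset_dir sorted by their numeric suffix."""
    numbered = []
    for path in Path(dataset_dir).glob("*.csv"):
        match = PART_RE.search(path.name)
        if match:  # skip any CSV that does not follow the naming pattern
            numbered.append((int(match.group(1)), path))
    return [path for _, path in sorted(numbered)]
```

For example, `list_parts("onedz_datasets_csv/Total_UPb_split_parts")` should return the U-Pb parts in order, provided the archive has been extracted to that relative path. Sorting on the parsed integer rather than on the file name keeps the order correct even if the zero padding ever changes.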
## Dataset Summary
| Dataset | Parts | Est. Total Rows | Columns | Content |
|---------|-------|-----------------|---------|---------|
| `Total_UPb_split_parts` | 22 | ~2,550,000 | 64 | Full detrital zircon U-Pb age database |
| `Total_LuHf_split_parts` | 3 | ~297,000 | 33 | Lu-Hf isotope data linked to U-Pb records (expert-checked) |
| `Experts_checked_UPb_split_parts` | 14 | ~1,497,000 | 64 | Peer-reviewed regional compilations (quality-controlled) |
---
## File Format
All CSV files follow the project standard:
| Property | Specification |
|----------|---------------|
| **Encoding** | UTF-8 with BOM (`utf-8-sig`) |
| **Delimiter** | Comma (`,`) |
| **Line endings** | LF (`\n`) |
| **Header** | Single header row with standardized column names |
| **Quoting** | Double-quoted fields when containing commas or newlines |
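Because the files carry a byte-order mark, decoding them as plain `utf-8` leaves an invisible `\ufeff` character glued to the first column name. The sketch below shows how the `utf-8-sig` codec and the standard `csv` module handle the format described in the table. The two-line `SAMPLE` is a made-up stand-in for a real part file, and `read_rows` is a hypothetical helper name.

```python
import csv
import io

# Stand-in for a part file: BOM, header row, and one quoted field containing a comma.
# The values are invented for illustration only.
SAMPLE = "\ufeffLead_Author,Year,Title\nDoe,2020,\"Zircon ages, revisited\"\n".encode("utf-8")


def read_rows(binary_stream):
    """Decode a UTF-8-with-BOM CSV stream and return its rows as dicts."""
    # newline="" lets the csv module handle newlines inside quoted fields itself.
    text = io.TextIOWrapper(binary_stream, encoding="utf-8-sig", newline="")
    return list(csv.DictReader(text))


rows = read_rows(io.BytesIO(SAMPLE))
print(list(rows[0]))     # column names, with no stray BOM on the first one
print(rows[0]["Title"])  # the quoted field keeps its embedded comma
```

For a file on disk, `open(path, encoding="utf-8-sig", newline="")` passed directly to `csv.DictReader` achieves the same result.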
### U-Pb Standard Columns (64 total)
- **Bibliographic**: `Lead_Author`, `Year`, `Journal`, `Vol`, `Pages`, `Title`, `Web_Link`
- **Sample**: `Published_Sample_ID`, `Country_State`, `Region`, `Continent`, `Major_Geographic_Geologic_Unit`, `Minor_Geologic_Geographic_Unit`, `Group`, `Formation`, `Member`, `Locality`, `Profile`, `Latitude`, `Longitude`
- **Depositional Age**: `Depos_Age_Period`, `Depos_Age_Epoch`, `Depos_Age_Stage`, `Max_Depos_Age_Ma`, `Est_Depos_Age_Ma`, `Min_Depos_Age_Ma`
- **Analytical**: `Spectrometer`, `Spectrometer_Location`, `Institution`, `Spectrometer_Mode`, `Rock_Type_one`, `Rock_Type_two`, `Rock_Type_three`, `Grain`, `Spot_Location`, `Spot_diam`
- **Isotope Ratios**: `Pb206U238_iso`, `Pb207U235_iso`, `Pb207Pb206_iso`, `Pb208Th232_iso` (with one-sigma uncertainties)
- **Calculated Ages**: `Pb206U238_age`, `Pb207U235_age`, `Pb207Pb206_age`, `Best_age` (with one- and two-sigma uncertainties), `Discord`
- **Elemental**: `U_ppm`, `Th_ppm`, `Pb_ppm`, `Pb206Pb204`, `Pb204Pb206`, `UTh_ratio`, `ThU_ratio`
### Lu-Hf Columns (33 total)
Includes all bibliographic and sample metadata columns above, plus:
- `Upb_Age`, `Upb_Age_two_sigma`
- `176Hf177Hf_iso`, `176Lu177Hf_iso`, `176Yb177Hf_iso` (with 2-sigma uncertainties)
- `epsilon_Hf_0`, `epsilon_Hf_t` (with 1-sigma and 2-sigma uncertainties)
- `TDM1_Ma`, `TDM2_Ma` (with 2-sigma uncertainties)
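For readers who want to sanity-check the `epsilon_Hf_0` and `epsilon_Hf_t` columns, the conventional epsilon-Hf definition can be sketched as below. This README does not state which reference values the compiled studies used, so the CHUR ratios and the decay constant in the code are assumptions (commonly cited values), and recomputed numbers may differ slightly from the stored ones. The function name `epsilon_hf` is hypothetical.

```python
import math

# Assumed reference values, NOT taken from this dataset's documentation:
LAMBDA_LU176 = 1.867e-11  # 176Lu decay constant, per year
CHUR_HF = 0.282785        # present-day 176Hf/177Hf of the chondritic reservoir (CHUR)
CHUR_LU = 0.0336          # present-day 176Lu/177Hf of CHUR


def epsilon_hf(hf_ratio, lu_ratio=0.0, age_ma=0.0):
    """Epsilon Hf at age_ma (Ma); with age_ma=0 this is the present-day value."""
    growth = math.expm1(LAMBDA_LU176 * age_ma * 1e6)  # e^(lambda * t) - 1
    sample_initial = hf_ratio - lu_ratio * growth     # sample ratio corrected back to t
    chur_initial = CHUR_HF - CHUR_LU * growth         # CHUR ratio corrected back to t
    return (sample_initial / chur_initial - 1.0) * 1e4
```

Under these assumptions, `epsilon_hf(row_hf)` corresponds to `epsilon_Hf_0`, and `epsilon_hf(row_hf, row_lu, row_age)` corresponds to `epsilon_Hf_t`, where the three inputs would come from `176Hf177Hf_iso`, `176Lu177Hf_iso`, and `Upb_Age`.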
---
## Usage Notes
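1. **Load order**: When reassembling a full dataset, load its parts in numerical order (`01` → `22` for `Total_UPb_split_parts`, `01` → `03` for `Total_LuHf_split_parts`, `01` → `14` for `Experts_checked_UPb_split_parts`).
2. **No row overlap**: Parts are split sequentially, so no row appears in more than one part of the same dataset.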
3. **Cross-dataset linkage**: Use `Lead_Author` + `Year` + `Published_Sample_ID` + `Grain` to link U-Pb records with Lu-Hf records.
4. **Expert vs. Total**: `Experts_checked_UPb_split_parts` is a **subset** of the total database, curated from peer-reviewed regional compilations. It does not contain all rows from `Total_UPb_split_parts`.
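The cross-dataset linkage in note 3 can be sketched as a lookup on the four-column composite key. This is an illustrative sketch rather than project code: `link_key` and `link_records` are hypothetical helpers, the whitespace trimming is an added assumption, and the sample rows (including the `Best_age` and `epsilon_Hf_t` values) are invented.

```python
KEY_COLUMNS = ("Lead_Author", "Year", "Published_Sample_ID", "Grain")


def link_key(row):
    """Build the composite key, trimming whitespace around each component."""
    return tuple((row.get(col) or "").strip() for col in KEY_COLUMNS)


def link_records(upb_rows, luhf_rows):
    """Pair each Lu-Hf row with every U-Pb row that shares its composite key."""
    index = {}
    for row in upb_rows:
        index.setdefault(link_key(row), []).append(row)
    return [(hf, upb) for hf in luhf_rows for upb in index.get(link_key(hf), [])]


# Invented rows standing in for dicts read from the CSV parts.
UPB_ROWS = [
    {"Lead_Author": "Doe", "Year": "2020", "Published_Sample_ID": "S1", "Grain": "1", "Best_age": "510"},
    {"Lead_Author": "Doe", "Year": "2020", "Published_Sample_ID": "S1", "Grain": "2", "Best_age": "1120"},
]
LUHF_ROWS = [
    {"Lead_Author": "Doe", "Year": "2020", "Published_Sample_ID": "S1", "Grain": "1 ", "epsilon_Hf_t": "-3.2"},
]

pairs = link_records(UPB_ROWS, LUHF_ROWS)
print(len(pairs))  # number of matched (Lu-Hf, U-Pb) pairs
```

Keeping matches as a list per key means a key that occurs more than once on the U-Pb side stays visible instead of being silently overwritten. Indexing the full U-Pb database this way holds every row in memory, so for the complete dataset it may be preferable to index only the key and the columns actually needed.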
Files
| Name | Size | Checksum |
|---|---|---|
| onedz_datasets_csv.zip | 161.4 MB | md5:ed2edefc62053b62fbc71a140fec32b0 |
Additional details
Identifiers
- Other: onedz
Dates
- Created: 2025-10-22