White Matter Microstructure and Macrostructure Brain Charts Across the Human Lifespan - Models
Description
Documentation Table of Contents
- Introduction
- Build and Testing Environment
- Expected Runtime and Memory Usage
- How to Run
- Loading Docker
- Obtaining Centile Curves
- Aligning Out-Of-Sample Datasets
- Expected Outputs
- Compare to Test Data
Introduction
The Docker image available on this page contains the fitted models for macrostructural and microstructural brain charts across the human lifespan (0-100 years of age). Researchers can use this Docker image to align their out-of-sample (new) datasets to these brain charts.
Build and Testing Environment
The container was built and tested on a machine running Ubuntu 20.04 with 62.5GB of memory and an Intel(R) Xeon(R) W-2255 CPU running at 3.70GHz. The Docker image was also tested and runs successfully on a RedHat 7.7 OS machine (CPU: Intel(R) Xeon(R) Gold 6138 CPU @ 2.00GHz).
Expected Runtime and Memory Usage
Running the Docker image should require less than 5GB of memory.
How to Run
Before doing anything, Docker needs to be installed. Official instructions can be found here: (https://docs.docker.com/get-started/get-docker/). Docker must be running properly before proceeding.
Loading Docker
Download the Docker image (provided as a .tar file) and save it somewhere it can easily be located. Then run the following command:
docker load -i </path/to/docker/.tar/file>
Alternatively, the Docker image can be loaded through the Docker Desktop GUI. Confirm that it is properly loaded by running
docker images
where the name of the Docker image (r_lifespan_env) should be listed.
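For example, assuming the archive was downloaded to the current directory as r_lifespan_env.tar (an illustrative filename; yours may differ):
# Load the image from the downloaded archive
docker load -i ./r_lifespan_env.tar
# Confirm the image is available
docker images | grep r_lifespan_env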
Obtaining Centile Curves
For researchers who wish to examine the normative trajectories of features more closely, we also provide a method for obtaining centiles of trajectories in a CSV format. To do so, run the following command:
docker run --rm \
-v </path/to/output/directory>:/OUTPUTS \
r_lifespan_env \
python3 /WMLifespan/scripts/output_centile_curves.py \
<tract> <measure> /OUTPUTS/centiles.csv
where </path/to/output/directory> is the directory in which you wish to save the centile CSV file, and centiles.csv is the name of the file in which the discrete values of the normative trajectory for the given <tract> and <measure> will be saved. Note that <tract> must be one of the TractSeg-defined tract names found on the TractSeg GitHub page here: https://github.com/MIC-DKFZ/TractSeg, whereas <measure> must be one of {fa-mean, md-mean, ad-mean, rd-mean, volume, surface_area, avg_length}.
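For example, to save the centile curves of mean FA for the left arcuate fasciculus to /home/user/charts (an illustrative output path and output filename):
docker run --rm \
-v /home/user/charts:/OUTPUTS \
r_lifespan_env \
python3 /WMLifespan/scripts/output_centile_curves.py \
AF_left fa-mean /OUTPUTS/AF_left-fa-mean_centiles.csv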
Aligning Out-Of-Sample Datasets
One of the most important aspects of brain charts is the ability to score new data within the normative trajectories to determine how abnormal quantitative brain metrics are. For any researchers who would like to use these brain charts, we provide the following tutorial:
1.) Preprocess diffusion MRI (dMRI) data to correct for susceptibility-induced and eddy-current-induced artifacts. We recommend using the PreQual pipeline, as it provides a QA document to determine whether the data are acceptable to use (https://github.com/MASILab/PreQual, https://zenodo.org/records/14593034). Instructions for running PreQual can be found in both the GitHub repository and the Zenodo page.
2.) Fit diffusion tensors (dwi2tensor) and obtain FA/MD/AD/RD microstructural maps (tensor2metric) for the dMRI data. For consistency, we use MRtrix3 software (v3.0.3).
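As an illustration of this step only, the following is a minimal sketch assuming FSL-style gradient files and a brain mask; all filenames are hypothetical:
# Fit the diffusion tensor (gradient table supplied in FSL format)
dwi2tensor dwmri.nii.gz tensor.nii.gz -fslgrad dwmri.bvec dwmri.bval -mask brain_mask.nii.gz
# Derive the FA/MD/AD/RD maps from the fitted tensor (-adc outputs mean diffusivity)
tensor2metric tensor.nii.gz -fa fa.nii.gz -adc md.nii.gz -ad ad.nii.gz -rd rd.nii.gz -mask brain_mask.nii.gz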
3.) Resample the preprocessed dMRI data and the FA/MD/AD/RD maps to a 1mm isotropic voxel size. For consistency, we use the MRtrix3 mrgrid command (example: mrgrid dwmri.nii.gz regrid dwmri_1mm_iso.nii.gz -voxel 1).
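The same regridding can be applied to each metric map; a brief sketch with illustrative filenames (output names chosen here to match the FA_map.nii.gz, MD_map.nii.gz, etc. used in the scilpy commands of step 5):
# Resample each native-space metric map to the same 1mm isotropic grid
mrgrid fa.nii.gz regrid FA_map.nii.gz -voxel 1
mrgrid md.nii.gz regrid MD_map.nii.gz -voxel 1
mrgrid ad.nii.gz regrid AD_map.nii.gz -voxel 1
mrgrid rd.nii.gz regrid RD_map.nii.gz -voxel 1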
4.) Run TractSeg (https://github.com/MIC-DKFZ/TractSeg) on the resampled data to obtain the 72 TractSeg-defined white matter tracts as .tck files.
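The exact TractSeg invocation depends on the installed version; the following is only an illustrative sketch (assuming dwmri_1mm_iso.bvals and dwmri_1mm_iso.bvecs sit next to the resampled dMRI volume), and the TractSeg documentation linked above should be treated as authoritative:
# Segment bundles from the resampled diffusion data (also produces peaks.nii.gz)
TractSeg -i dwmri_1mm_iso.nii.gz -o tractseg_output --raw_diffusion_input
# Estimate bundle endings and tract orientation maps, then generate .tck tractograms
TractSeg -i tractseg_output/peaks.nii.gz -o tractseg_output --output_type endings_segmentation
TractSeg -i tractseg_output/peaks.nii.gz -o tractseg_output --output_type TOM
Tracking -i tractseg_output/peaks.nii.gz -o tractseg_output --tracking_format tck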
5.) Obtain microstructural and macrostructural measures for each of the 72 white matter tracts. For consistency, we use scilpy (https://github.com/scilus/scilpy, v1.5.0) to obtain the microstructural and macrostructural features. For this version, the scilpy scripts are called scil_evaluate_bundles_individual_measures.py for macrostructural measures and scil_compute_bundle_mean_std.py for microstructural measures, and the commands are:
scil_compute_bundle_mean_std.py <TRACT>.tck FA_map.nii.gz MD_map.nii.gz AD_map.nii.gz RD_map.nii.gz --density_weighting --reference=dwmri_1mm_iso.nii.gz > <TRACT>-DTI.json
scil_evaluate_bundles_individual_measures.py <TRACT>.tck <TRACT>-SHAPE.json --reference=dwmri_1mm_iso.nii.gz
where <TRACT> is the name of a TractSeg-defined tract.
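To process all 72 tracts, the two commands above can be wrapped in a simple loop; a minimal sketch assuming the .tck files are located in tractseg_output/TOM_trackings (an illustrative path):
# Run both scilpy scripts for every TractSeg bundle, writing one JSON per tract and measure type
for tck in tractseg_output/TOM_trackings/*.tck; do
tract=$(basename ${tck} .tck)
scil_compute_bundle_mean_std.py ${tck} FA_map.nii.gz MD_map.nii.gz AD_map.nii.gz RD_map.nii.gz --density_weighting --reference=dwmri_1mm_iso.nii.gz > ${tract}-DTI.json
scil_evaluate_bundles_individual_measures.py ${tck} ${tract}-SHAPE.json --reference=dwmri_1mm_iso.nii.gz
done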
6.) Before alignment, the data must be properly formatted in a CSV file that can be read by the Docker image. The CSV is required to have columns age, sex, and diagnosis, where age is a numerical value, sex is a binary variable with “male” encoded as 0 and “female” encoded as 1, and diagnosis is a categorical variable. Typically developing/aging (also referred to as “cognitively normal”) participants are encoded as “CN” for diagnosis, and to perform alignment there must be rows in the CSV file that contain “CN” as the diagnosis. For better alignment, include as many “CN” participants as possible; a small number of participants may result in poorly aligned data and thus poorly estimated centile scores. There must also be at least one quantitative variable column in the CSV file, where quantitative variables are named as:
<tract>-<measure>
<tract> must be one of the TractSeg-defined tract names, whereas <measure> must be one of {fa-mean, md-mean, ad-mean, rd-mean, volume, surface_area, avg_length}. Thus, the CSV should follow a format such as:
age | sex | diagnosis | AF_left-fa-mean | AF_right-md-mean | …
---|---|---|---|---|---
75.1 | 0 | CN | 0.453 | 0.00110 | …
45 | 1 | CN | 0.562 | 0.00140 | …
62.5 | 1 | AD | 0.398 | 0.00098 | …
… | … | … | … | … | …
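As a plain-text file, the example above corresponds to something like the following (standard comma-separated formatting is assumed, and additional <tract>-<measure> columns can be appended in the same way; the heredoc is merely an illustrative way to create the file):
cat > input.csv << 'EOF'
age,sex,diagnosis,AF_left-fa-mean,AF_right-md-mean
75.1,0,CN,0.453,0.00110
45,1,CN,0.562,0.00140
62.5,1,AD,0.398,0.00098
EOF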
Note that in cases where rows have empty entries, centile scores will not be calculated for those metrics, which will therefore have missing entries in the respective centile score output. Further, only rows labeled with a “CN” diagnosis and with non-missing centile values will be used for estimating the random effect terms (for alignment purposes) for a particular measure.
7.) Run the following Docker command:
docker run --rm -v </path/to/OOS.csv>:/INPUTS/input.csv \
-v </path/to/output/directory>:/OUTPUTS \
r_lifespan_env \
python3 /WMLifespan/scripts/perform_OOS_alignment.py \
/INPUTS/input.csv /OUTPUTS/aligned.csv
where aligned.csv is the destination file in which the aligned centile score values will be saved and input.csv is the structured CSV file from step 6.). The aligned.csv file will contain a new column for each of the metric columns the Docker image could find (which should follow the <tract>-<measure> naming).
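For example, with an input CSV at /home/user/data/OOS.csv and outputs written to /home/user/data/aligned (illustrative paths):
docker run --rm -v /home/user/data/OOS.csv:/INPUTS/input.csv \
-v /home/user/data/aligned:/OUTPUTS \
r_lifespan_env \
python3 /WMLifespan/scripts/perform_OOS_alignment.py \
/INPUTS/input.csv /OUTPUTS/aligned.csv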
As detailed in the Methods section, these normative curves are cross-sectional in nature. Thus, researchers performing out-of-sample alignment should only include cross-sectional data in the CSV file, i.e., one scan per participant. Should researchers wish to evaluate longitudinal data with the cross-sectional models, a flag can be used to also save the estimated random effect terms for the dataset. We also note that this alignment to the normative models assumes that the data in the CSV file come from the same primary dataset. Centile scores for multiple datasets must be calculated in separate Docker commands, each with its own distinct input CSV file.
Expected Outputs
For obtaining centile trajectories, the output CSV will contain one column with the sampled ages, and the remaining columns will contain the values of specific centiles at each of the sampled ages across the lifespan.
For the alignment process, the output CSV will be the input CSV with new columns appended, corresponding to the aligned centile values for each of the datapoints (with the column heading <tract>-<measure>_centile_score). Centile scores should be between 0 and 1, where each value is the percentile expressed as a decimal.
Compare to Test Data
TO BE ADDED IN A LATER RELEASE
Files (8.2 GB)
md5:9bf97ff94280c1f5f9864efec4fc15ae | 8.2 GB
Additional details
Software
- Programming language: Python, R
- Development Status: Active