White Matter Microstructure and Macrostructure Brain Charts Across the Human Lifespan - Post-processing Container
Description
Documentation Table of Contents
- Introduction
- Processing Flow
- Build and Testing Environment
- Expected Runtime and Memory Usage
- Download Container
- How to Run
- Expected Inputs
- DWI Requirements
- No Global/Normalized Measures (72 WM tracts only)
- Command(s) to Run
- Running in Singularity/Apptainer
- Expected Outputs
- Optional Flags
- No Global Measurements
- Running with BIDS-organized Data
- Expected Inputs
- Compare to Test Data
- Appendix
Introduction
This page documents the post-processing container for generating the white-matter phenotypes used by the "White Matter Microstructure and Macrostructure Brain Charts Across the Human Lifespan" models.
The Docker image for this post-processing container is available on Docker Hub at https://hub.docker.com/r/kimm58/wm_lifespan_processing.
This container takes preprocessed diffusion MRI data and converts it into a single brain-chart-ready phenotype table. Given a preprocessed DWI scan, bval/bvec files, and a brain mask (and optional T1-weighted scan) the pipeline fits DTI, reconstructs 72 TractSeg white-matter bundles, calculates microstructural and macrostructural features for each bundle, optionally calculates global white-matter and normalized macrostructural measures using FreeSurfer-derived T1 measures, and aggregates all extracted features into measurements.csv.
The resulting measurements.csv file is intended to be used with the companion record, "White Matter Microstructure and Macrostructure Brain Charts Across the Human Lifespan - Models". After quality control and the addition of participant metadata columns (`age`, `sex`, and `diagnosis`), this CSV can be used as input to the models container to align new datasets to the lifespan brain charts and obtain centile scores.
Processing Flow
Use this post-processing container if you have MRI data and need to extract the white-matter features used by the brain charts.
Use the companion Models record if you already have brain-chart-compatible feature columns and want to:
1. extract normative lifespan trajectories,
2. align a new dataset to the brain charts, or
3. obtain centile scores for participants.
Most users starting from MRI data will use both records in order:
1. run this post-processing container to generate `measurements.csv`;
2. QC the extracted features and add `age`, `sex`, and `diagnosis`;
3. run the Models container to obtain centile scores.
Build and Testing Environment
The Docker image was built and tested on a workstation running Ubuntu 22.04 with an Intel(R) Xeon(R) W-2255 CPU @ 3.70GHz (20 cores) and 62.5GB of memory.
Expected Runtime and Memory Usage
Running the entire containerized pipeline on the provided test data took roughly 18 hours. We recommend having at least 20GB of memory free to run the Docker image, although actual memory usage may vary depending on the size of the input data.
Download Container
As mentioned above, the container is hosted on Docker Hub at https://hub.docker.com/r/kimm58/wm_lifespan_processing. To pull the container as a Docker image, run:
docker pull kimm58/wm_lifespan_processing:<tag>
where <tag> is the desired version. To instead pull the container as an apptainer/singularity image, run:
apptainer pull wm_lifespan_processing.simg docker://kimm58/wm_lifespan_processing:<tag>
Depending on your permissions, you may need to run the above command with sudo.
How to Run
The container consists of a pipeline with multiple steps, each of which can be optionally skipped (provided any required upstream steps have already completed). In order, the pipeline steps are:
DTI Fitting --> TractSeg --> Bundle Measurement Calculation --> FreeSurfer --> DWI-T1 Registration --> FreeSurfer White Matter Mask Metric Calculation --> Aggregation of Measurements
NOTE: in order to run the FreeSurfer step, you must have a valid FreeSurfer license, which can be requested here.
Expected Inputs
To run the full pipeline, you must provide a preprocessed diffusion-weighted MRI brain scan and a T1-weighted brain scan of the same participant from the same scanning session. To remain consistent with the DWI preprocessing used for the associated paper, please use the PreQual preprocessing pipeline (link) for denoising and correction of susceptibility-, motion-, and eddy-current-induced artifacts. The inputs are as follows:
```
T1.nii.gz ##T1-weighted image
dwmri.nii.gz ##DWI image
dwmri.bval ##DWI bval file (FSL format)
dwmri.bvec ##DWI bvec file (FSL format)
mask.nii.gz ##binary mask of the brain in DWI space
```
The filenames must match those above. The dwmri.nii.gz, dwmri.bval, dwmri.bvec, and mask.nii.gz files are present in the PreQual pipeline outputs (PREPROCESSED directory). If you are not using the PreQual pipeline for preprocessing, you must rename your files to match and convert them to the expected formats (see the provided example files for the expected formatting).
DWI Requirements
Note that the recommended number of DWI directions for CSD is 45; CSD can be performed with 30 or fewer directions (the absolute minimum is 16), but using fewer directions may result in a noisier estimation of the DWI signal.
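As a quick sanity check before running the container, the number of unique diffusion-weighted directions can be counted from the FSL-format bval/bvec files described in the Expected Inputs section. This is a stdlib-only sketch, not part of the pipeline; the function name and b0 threshold are our own choices:

```python
# Count unique diffusion-weighted (b > 0) gradient directions from
# FSL-format bval/bvec files. A rough pre-flight check, not part of
# the container itself.

def count_directions(bval_path, bvec_path, b0_threshold=50):
    with open(bval_path) as f:
        bvals = [float(v) for v in f.read().split()]
    with open(bvec_path) as f:
        rows = [[float(v) for v in line.split()] for line in f if line.strip()]
    # FSL bvec layout: 3 rows (x, y, z), one column per DWI volume
    vecs = list(zip(*rows))
    dirs = {tuple(round(c, 3) for c in vec)
            for b, vec in zip(bvals, vecs) if b > b0_threshold}
    return len(dirs)
```

If the returned count is below roughly 30 (and certainly below 16), the CSD-based steps may produce unreliable results, per the recommendation above.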
No Global/Normalized Measures (72 WM tracts only)
A T1w image is required for the FreeSurfer-based global white-matter measures and normalized macrostructural measures. If you only have preprocessed DWI data, you can still run the tract-based portion of the pipeline to obtain the 72 TractSeg bundle measurements, but the global and normalized features will not be produced.
If you only want the tract-based measurements, or do not have a T1-weighted image, provide all inputs except the T1-weighted image and run the container with the flags that skip the T1-dependent processing steps (see below for the command).
Command(s) to Run
To run the entire processing pipeline, first place the input files (see Expected Inputs section) in a directory and create a new output directory:
mv T1.nii.gz dwmri.nii.gz dwmri.bval dwmri.bvec mask.nii.gz /path/to/input/dir
mkdir /path/to/output/dir
Next, run the following command:
docker run -it --rm -v /path/to/input/dir:/INPUTS \
-v /path/to/output/dir:/OUTPUTS \
-v </path/to/FreeSurfer/license/file>:/usr/local/freesurfer/.license \
<docker_img>
where </path/to/FreeSurfer/license/file> is your FreeSurfer license file and <docker_img> is the Docker image pulled above.
Running in Singularity/Apptainer
If you wish to run the container with Singularity/Apptainer instead, note that by default Singularity/Apptainer sets the HOME directory to that of the current shell session, whereas the Docker version sets HOME to /root (the Docker default). Because the TractSeg weights are saved in /root/.tractseg, Singularity/Apptainer may try to download the weights again when HOME points elsewhere. To circumvent this, create a small environment script that sets HOME=/root and bind it into the Singularity/Apptainer image; TractSeg should then read the weights from /root.
To do this, first create a dummy environment file:
echo -e '#!/bin/sh\nexport HOME=/root' > dummy_env.sh
The contents of the file will be:
```
#!/bin/sh
export HOME=/root
```
Then the command (binding in the dummy environment file) to run the container will instead be:
apptainer exec -ec -B /tmp:/tmp \
-B dummy_env.sh:/.singularity.d/env/90-environment.sh \
-B /path/to/input/dir:/INPUTS -B /path/to/output/dir:/OUTPUTS \
-B </path/to/FreeSurfer/license/file>:/usr/local/freesurfer/.license \
<apptainer_img> /bin/bash -c "bash /SCRIPTS/run_proc.sh"
If the issue persists, you can remedy it by copying the weights from inside the container to an external directory and placing them in the new HOME directory.
Expected Outputs
Upon completion of the pipeline, there will be outputs for every step:
- DTI Fitting results are saved in the `DTI` output subdirectory
- TractSeg `.tck` files are saved in the `Tractseg/TOM_trackings` output subdirectory
- Bundle measurements are saved in the `Tractseg/measures` output subdirectory
- FreeSurfer outputs are saved in the `freesurfer` output subdirectory
- Registration outputs are saved in the `REG` output subdirectory
- Global WM masking outputs are saved in the `FreesurferWhiteMatterMask` output subdirectory
- Aggregated measures are saved as the file `measurements.csv`
The main output most users will need is `measurements.csv`: a single aggregated spreadsheet containing the white-matter phenotypes generated by the selected pipeline steps.
For the full pipeline, this includes:
- 72 TractSeg bundle-level microstructural measures: FA, MD, AD, and RD
- 72 TractSeg bundle-level macrostructural measures: volume, surface area, and average length
- normalized tract macrostructural measures, when the T1w/FreeSurfer steps are run
- global cerebral white-matter FA, MD, AD, RD, and white-matter volume, when the T1w/FreeSurfer steps are run
This file is the recommended starting point for using the companion brain-chart models.
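As a sketch of the handoff to the models container, the required metadata columns can be appended to the aggregated CSV with stdlib `csv` alone. The function name is ours, and the metadata values in the example are placeholders; a real workflow would pull `age`, `sex`, and `diagnosis` from a participants table after quality control:

```python
import csv

def add_metadata(in_csv, out_csv, age, sex, diagnosis):
    """Append age/sex/diagnosis columns to an aggregated measurements CSV."""
    with open(in_csv, newline="") as f:
        rows = list(csv.reader(f))
    header, body = rows[0], rows[1:]
    header += ["age", "sex", "diagnosis"]
    for row in body:
        row += [str(age), sex, diagnosis]
    with open(out_csv, "w", newline="") as f:
        csv.writer(f).writerows([header] + body)
```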
Optional Flags
To skip any of the steps in the processing pipeline:
DTI Fitting --> TractSeg --> Bundle Measurement Calculation --> FreeSurfer --> DWI-T1 Registration --> FreeSurfer White Matter Mask Metric Calculation --> Aggregation of Measurements
use the following flags (see No Global Measurements section for an example):
--skip-dti --skip-tractseg --skip-scilpy --skip-freesurfer --skip-reg --skip-fswm --skip-aggregate
Note that if a step is skipped where a downstream process REQUIRES that previous step to have run, the outputs for that step must exist in the outputs directory in the expected format (see the provided example test outputs for file naming and directory structure).
No Global Measurements
To run the processing to obtain only the bundle measurements for the 72 TractSeg tracts (no global or normalized macrostructural measurements), run using the specific skip flags:
docker run -it --rm -v /path/to/input/dir:/INPUTS \
-v /path/to/output/dir:/OUTPUTS \
-v </path/to/FreeSurfer/license/file>:/usr/local/freesurfer/.license \
<docker_img> --skip-freesurfer --skip-reg --skip-fswm
Running with BIDS-organized Data
While not a BIDS-App, the container can be run on BIDS data without copying or moving files in the BIDS dataset. The key is that, inside the container, the inputs must appear with the expected names and structure; by binding individual files into /INPUTS, the same effect can be achieved:
docker run -it --rm -v /path/to/T1:/INPUTS/T1.nii.gz \
-v /path/to/preproc/dwi:/INPUTS/dwmri.nii.gz \
-v /path/to/preproc/bval:/INPUTS/dwmri.bval \
-v /path/to/preproc/bvec:/INPUTS/dwmri.bvec \
-v /path/to/dwi/mask:/INPUTS/mask.nii.gz \
-v /path/to/session/output/derivatives:/OUTPUTS \
-v </path/to/FreeSurfer/license/file>:/usr/local/freesurfer/.license \
<docker_img>
Compare to Test Data
For testing data, we use a scanning session (participant 001) from the Ageility dataset, openly available on NITRC here. The DWI data have been preprocessed using the PreQual pipeline, with all DWI acquisitions being combined together.
To run the test data, download and unzip the TEST_INPUTS.zip file and place the contents inside a new inputs directory </path/to/inputs>. Remember to have the location of the FreeSurfer license file on hand at </path/to/FS/license>. Create an output directory:
mkdir </path/to/outputs>
Then run:
docker run -it --rm -v </path/to/inputs>:/INPUTS \
-v </path/to/outputs>:/OUTPUTS \
-v </path/to/FS/license>:/usr/local/freesurfer/.license <docker_img>
While you may compare each individual file, a single comparison between the final aggregated measurement file </path/to/outputs>/measurements.csv and the corresponding file in the TEST_OUTPUTS.zip archive downloadable on this Zenodo page is usually sufficient.
NOTE: as some steps are non-deterministic in nature (e.g. registration), outputs may not be EXACTLY the same.
Appendix
Below, we describe the steps of the postprocessing container in detail.
DTI Fitting
First, custom Python code extracts all DWI volumes with b-values less than or equal to 1500 s/mm^2. Diffusion tensors are then fit to these volumes using dwi2tensor from MRtrix3 (version 3.0.3). Using the input DWI mask from PreQual and the tensors, scalar maps of FA, MD, AD, and RD are created using tensor2metric from MRtrix3.
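The volume selection described above reduces to a one-line filter over the bval file. This is a minimal sketch of the idea (the function name is ours, not the container's; the container operates on the NIfTI volumes themselves):

```python
def low_b_indices(bvals, max_b=1500):
    """Indices of DWI volumes with b-value <= max_b (in s/mm^2),
    i.e. the volumes retained for tensor fitting."""
    return [i for i, b in enumerate(bvals) if b <= max_b]

# Example: b0 and b=1000/1500 shells are kept; b=2000 is dropped.
# low_b_indices([0, 1000, 2000, 1500]) -> [0, 1, 3]
```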
TractSeg
First, the DTI scalar maps from the "DTI Fitting" step and the preprocessed DWI are resampled to 1mm isotropic resolution using mrgrid from MRtrix3. While not used in this step, the resampled images will be used later. TractSeg (version 2.8) is then run on the preprocessed DWI data with the following commands:
TractSeg -i $dwi1mm --raw_diffusion_input -o ${tractout} --bvals $bval --bvecs $bvec
TractSeg -i ${tractout}/peaks.nii.gz -o ${tractout} --output_type endings_segmentation
TractSeg -i ${tractout}/peaks.nii.gz -o ${tractout} --output_type TOM
Tracking -i ${tractout}/peaks.nii.gz -o ${tractout} --tracking_format tck
where $dwi1mm is the 1mm iso DWI file, ${tractout} is the output directory, $bval is the FSL-structured bval file from the preprocessed DWI, and $bvec is the FSL-structured bvec file from the preprocessed DWI.
Scilpy
For each tract ($s.tck file below) in the TractSeg outputs, the following Scilpy (version 1.5.0) commands are run:
scil_evaluate_bundles_individual_measures.py ${TOMtrack}/$s.tck ${measuresdir}/$s-SHAPE.json --reference ${dwi1mm}
scil_compute_bundle_mean_std.py ${TOMtrack}/$s.tck ${fa1mm} ${md1mm} ${ad1mm} ${rd1mm} --density_weighting --reference=${dwi1mm} > ${measuresdir}/$s-DTI.json
where ${TOMtrack} is the subdirectory in the TractSeg outputs containing the .tck files ("TOM_trackings" subdirectory), ${measuresdir} is the output directory for the microstructural and macrostructural bundle measurements, $s is a variable representing the name of each TractSeg tract, and ${fa1mm}, ${md1mm}, ${ad1mm}, ${rd1mm} are the 1mm isotropic DTI scalar maps.
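Each bundle therefore yields two small JSON files. A hedged sketch of flattening them into one row per bundle is below; the exact key layout of the Scilpy JSON outputs is an assumption here (check the real files in the test outputs), and the function name is ours:

```python
import json, os

def flatten_bundle_json(measures_dir, bundle):
    """Merge <bundle>-SHAPE.json and <bundle>-DTI.json into one flat dict.
    Assumes each file holds a flat or one-level-nested mapping of metrics."""
    row = {"bundle": bundle}
    for suffix in ("SHAPE", "DTI"):
        path = os.path.join(measures_dir, f"{bundle}-{suffix}.json")
        with open(path) as f:
            data = json.load(f)
        for key, val in data.items():
            if isinstance(val, dict):
                # e.g. {"FA": {"mean": ..., "std": ...}} -> FA_mean, FA_std
                for sub, v in val.items():
                    row[f"{key}_{sub}"] = v
            else:
                row[key] = val
    return row
```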
FreeSurfer
The recon-all command from FreeSurfer (version 7.2.0) is used to obtain measurements of white matter volume, total brain volume without ventricles, and estimated total intracranial volume from a T1-weighted (T1w) scan coming from the same imaging session. Furthermore, the FreeSurfer segmentation is later used to create a white matter mask to calculate the global DTI measurements.
T1-b0 Registration
First, the brain mask from FreeSurfer is re-oriented back into the native T1w space. This T1w mask is then used to skull-strip the T1w image using fslmaths from FSL (version 6.0.4). dwiextract from MRtrix3 is then used to extract the mean b0 image from the preprocessed DWI scan. The T1w image is registered to the DWI b0 using epi_reg from FSL. Finally, the convert3d library (version 1.0.0) is used to convert the calculated transformation matrix from FSL to ANTs/ITK format.
FreeSurfer White Matter Mask Metric Calculation
First, the FreeSurfer segmentation is used to create a cerebral white matter mask. Specifically, the following labels are included:
{2: "Left-Cerebral-White-Matter", 41: "Right-Cerebral-White-Matter", 77: "WM_hypointensities", 78: "Left_WM_hypointensities",
79: "Right_WM_hypointensities", 251: "CC_Posterior", 252: "CC_Mid_Posterior", 253: "CC_Central", 254: "CC_Mid_Anterior", 255: "CC_Anterior",
28: "Left_Ventral_Dorsal_Column", 60: "Right_Ventral_Dorsal_Column", 250: "Fornix", 85: "Optic_Chiasm"}
which are consistent with the white matter regions included in both the FreeSurfer white matter surface files and the MRtrix3 5TT anatomically-constrained tractography mask (here). Following this, the mri_label2vol command is used to re-orient the mask back to the original T1w space. The transformation matrices from the "T1-b0 Registration" step are then used to register the white matter mask to the DWI space. Finally, the average FA, MD, AD, and RD are calculated within the WM mask to obtain global measurements of all microstructural measures, using custom Python code.
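A minimal sketch of binarizing such a label image against this label set follows. It operates on a plain nested list for illustration (the container works on NIfTI volumes), and the function name is ours:

```python
# FreeSurfer aseg label IDs included in the cerebral WM mask (from the
# label table above).
WM_LABELS = {2, 41, 77, 78, 79, 251, 252, 253, 254, 255, 28, 60, 250, 85}

def wm_mask(label_slice):
    """Binarize a 2D slice of an aseg-style label image:
    1 where the voxel's label is in WM_LABELS, else 0."""
    return [[1 if v in WM_LABELS else 0 for v in row] for row in label_slice]
```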
Normalization and Aggregation of Measurements
Custom Python code is first used to normalize the macrostructural features (volume, surface area, average length) of each tract by the total WM volume, brain volume without ventricles, and estimated total intracranial volume from FreeSurfer. The normalization treats each reference volume as a sphere and divides by the matching quantity: volume is divided by the reference volume directly, average length by the sphere's radius, and surface area by the sphere's surface area SA = 4*pi*r^2.
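Under this sphere assumption, the normalization can be written out explicitly. This is a sketch with our own function names; `ref_volume` stands for any of the three FreeSurfer reference volumes:

```python
import math

def sphere_normalizers(ref_volume):
    """Radius and surface area of a sphere with volume ref_volume:
    r = (3V / (4*pi))^(1/3), SA = 4*pi*r^2."""
    r = (3.0 * ref_volume / (4.0 * math.pi)) ** (1.0 / 3.0)
    return r, 4.0 * math.pi * r * r

def normalize(volume, surface_area, avg_length, ref_volume):
    """Normalize each macrostructural feature by the matching sphere
    quantity: volume by V, surface area by SA, average length by r."""
    r, sa = sphere_normalizers(ref_volume)
    return volume / ref_volume, surface_area / sa, avg_length / r
```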
Finally, all measurements are aggregated into a single CSV file. These measurements include:
- Microstructural and macrostructural features for each of the 72 TractSeg bundles (the `$s-SHAPE.json` and `$s-DTI.json` files from the Scilpy scripts)
- Global white matter FA, MD, AD, RD, and volume measurements
- Normalized macrostructural measurements (volume, surface area, average length) for each of the 72 TractSeg bundles
Files (879.3 MB)

| Name | MD5 | Size |
|---|---|---|
| TEST_INPUTS.zip | md5:752f0770c1a4410e1ac1120391c5208c | 472.4 MB |
| TEST_OUTPUTS.zip | md5:bb60ec890ae3ecf2ae7af599f4087a9b | 406.9 MB |
Additional details
Software
- Repository URL
- https://hub.docker.com/r/kimm58/wm_lifespan_processing
- Programming language
- Python, Shell