DY06_FoV2.9 B2_recon EV rotate 6.5 degrees.txm

Author: Niels Jeppesen (niejep@dtu.dk)

This notebook reads the TXM data format. In our experience NIfTI is much faster to read than TXM using the packages we use here.

The structure-tensor package we will be using here is a 2D and 3D structure tensor package for Python implemented by Vedrana A. Dahl and Niels Jeppesen.

Installation

To run this notebook there are some prerequisites that must be installed in our Python environment. We generally recommend creating a new Python environment to run the notebooks in, for instance using conda create -n <env_name> python=3.7. This way you avoid conflicts with packages in your existing environment.

Install dependencies (for this notebook) using pip install numpy scipy scikit-image matplotlib nibabel tqdm structure-tensor. To start the notebook we will of course also need jupyter. Install the dxchange package from GitHub, see note below.

Note: The current release of the dxchange package has a bug in the TXM reader. This is fixed in the GitHub master branch, so for now the package must be installed directly from GitHub using pip install git+https://github.com/data-exchange/dxchange.git. The setup.py does not specify the package's requirements; they can be found in the dxchange requirements.txt instead.

Now, let's go ahead and import the required modules. The structure_tensor_workers.py file should be in the notebook folder.

Load and display data

First, we'll load a sample of the data and some metadata. We will be using the following folder structure:

Let's have a quick look at the data to ensure it was loaded correctly.

Crop data

Clearly a large part of the data is not of interest to us, so let's crop out the area of interest. You can cut out any part of the data you want to analyze.
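As a minimal sketch, cropping the loaded volume is plain NumPy slicing (assuming the loaded volume is a NumPy array called data; the bounds below are purely illustrative and should be adjusted to your region of interest):

```python
# Illustrative crop bounds (z, y, x); adjust to the area of interest.
z0, z1, y0, y1, x0, x1 = 100, 900, 200, 1200, 200, 1200
data_crop = data[z0:z1, y0:y1, x0:x1]
print(data_crop.shape)
```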

Create new shape.

Save as RAW

To work with the data, we will save it as a raw data file.
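For example, a cropped volume could be written as a flat binary file with NumPy (the file name and dtype are assumptions; use whatever matches your data):

```python
import numpy as np

# Write the cropped volume as a raw binary file (no header).
data_crop.astype(np.uint16).tofile('notebook_data/data_crop.raw')
```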

Choosing fiber threshold

To choose a good intensity threshold for the fibers we can use a histogram and choose the value manually. Alternatively we can use a method such as Otsu's threshold. For choosing the threshold we make sure only to include inside-sample data. This makes it easier to separate foreground (fibers) and background inside the sample.
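As a sketch, Otsu's threshold could be computed on the inside-sample voxels only (assuming a boolean sample_mask marking inside-sample voxels; the name is an assumption):

```python
import matplotlib.pyplot as plt
from skimage.filters import threshold_otsu

# Use only inside-sample voxels so the background does not skew the threshold.
inside = data_crop[sample_mask]
fiber_threshold = threshold_otsu(inside)

# Inspect the histogram and the chosen threshold.
plt.hist(inside.ravel(), bins=256)
plt.axvline(fiber_threshold, color='r')
plt.show()
```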

Structure tensor

We set $\sigma$ and $\rho$ based on the size of the fibers that we want to analyze. For more details see the related paper and StructureTensorFiberAnalysisDemo notebook.

Then, we set the block size (crop_size), which will determine the maximum size of the blocks we will partition the data into for the ST calculation. The maximum block size will depend on crop_size, the truncate value used for the Gaussian filters (default is 4), and $\rho$ or $\sigma$ (usually $\rho$, because it is the largest).
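As a rough sketch (an assumption based on SciPy-style Gaussian kernels, not the package's exact formula), the padding added to each block could be estimated like this:

```python
# Illustrative values; see the actual parameter setup below.
sigma, rho, crop_size, truncate = 2.0, 8.0, 512, 4

# SciPy-style kernel radius: int(truncate * sigma + 0.5); rho usually dominates.
kernel_radius = int(max(sigma, rho) * truncate + 0.5)
max_block_size = crop_size + 2 * kernel_radius
print(kernel_radius, max_block_size)  # 32, 576
```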

We also set the fiber_threshold and specify a list of devices to use for the calculation. The list determines how the blocks will be distributed between devices for calculations. If we have a dual GPU system supporting CUDA, we can specify ['cuda:0', 'cuda:1'], which will distribute blocks evenly between the two GPUs. ['cuda:0', 'cuda:1', 'cuda:1'] would move two thirds to GPU 1, while ['cuda:0', 'cuda:1', 'cpu'] will move one third of the blocks to the CPU. Remember to update the device list to match your system resources. Specifying CUDA devices that are not available will result in exceptions and/or undefined behaviour. If you don't have CUDA GPUs available, just set device = ['cpu'] and specify the number of processes later.

The class_vectors are used to segment the voxels based on the orientations of the ST eigenvectors. It is a list of unit vectors, which represent each of the fiber classes. The first vector is considered the primary orientation and will be used for calculating orientation metrics for each voxel. For more details see the related paper and StructureTensorFiberAnalysisDemo notebook.

Lastly, bins_per_degree determines the number of bins used per degree when aggregating the orientation metrics.
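Putting these parameters together, a setup cell could look roughly like the sketch below. All values, the axis convention for the class vectors, and the device list are illustrative assumptions; adapt them to your data and hardware:

```python
import numpy as np

# Structure tensor scales, chosen from the fiber size (illustrative values).
sigma = 2.0
rho = 8.0

# Block size, fiber threshold and devices (illustrative; match your system).
crop_size = 512
fiber_threshold = 60000
device = ['cuda:0', 'cuda:1', 'cuda:2', 'cuda:3']  # or ['cpu']

# Unit vectors defining the fiber classes; the first is the primary orientation.
# The axis order and class angles here are assumptions for illustration.
class_vectors = np.array([
    [0., 0., 1.],   # primary class
    [0., 1., 1.],   # +45 degrees in-plane
    [0., -1., 1.],  # -45 degrees in-plane
])
class_vectors /= np.linalg.norm(class_vectors, axis=1, keepdims=True)

# Resolution of the aggregated orientation histograms.
bins_per_degree = 10
```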

Create valid data mask

To create the mask for valid data we create a mask as before, except that we erode the mask by an amount equal to the kernel radius. This way we "remove" voxels that are affected by voxels without values (black/zero intensity).
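A sketch of such an eroded valid-data mask with SciPy (assuming zero intensity marks voxels without values, and using the kernel radius estimated above):

```python
import numpy as np
from scipy import ndimage

# Voxels that actually contain data (assumption: empty voxels are zero).
valid_mask = data_crop > 0

# Erode by the kernel radius, removing voxels influenced by the empty region.
valid_mask = ndimage.binary_erosion(
    valid_mask, structure=np.ones((3, 3, 3), dtype=bool), iterations=kernel_radius)
```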

Creating fiber mask

Now that we have a threshold and data mask, we combine them into a fiber mask, which is True for voxels which contain fibers.
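With the names used in the sketches above, this is a one-liner:

```python
# Fiber voxels: inside the valid region and above the intensity threshold.
fiber_mask = valid_mask & (data_crop > fiber_threshold)
```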

We can plot the mask on top of the data. Here, red marks fiber voxels, while blue marks non-fiber voxels: background and voxels ignored for being close to the edge of the data.

Calculating structure tensor (in blocks)

We can use the get_crops function to partition the data into blocks (crops) with proper padding. However, we will actually only use len(crops) here, as we will use multiprocessing to distribute the blocks across multiple devices. We may include a function just for calculating the number of blocks/crops at a later point.

We will be using the structure_tensor_analysis_v1 function to calculate the structure tensor, $S$, and do the eigendecomposition for the blocks in parallel. We will be saving the following metrics to the disk using memory maps created below:

  1. theta contains the angle, $\theta$, between vec and class 0, at each voxel.

To calculate these metrics, the structure_tensor_analysis_v1 function uses the calculate_angles function, which is explained further in the StructureTensorFiberAnalysisDemo notebook. However, the metrics returned by structure_tensor_analysis_v1 have been aggregated (binned), as returning all the metrics for each block unaggregated would obviously consume large amounts of memory and be infeasible for large volumes. We will combine the aggregated data from each block afterwards.

In the code below we create memory-mapped files for the eigenvectors, along with three other metrics. The structure_tensor_analysis_v1 function will be using these to save the metrics for the volume straight to the disk, which may require a significant amount of disk space, but shouldn't result in memory issues. Besides the five types shown below, the function also supports saving $S$ and the eigenvalues. See structure_tensor_workers.py for details. In the example below we will be saving the results using 16-bit precision data types to save space. This is usually fine for visualization and most statistics. Saving the metrics to disk is optional. If you don't need the per voxel orientations for later, don't create the memory-maps and remove the entries from the init_args dictionary below. This will save you a lot of disk space and probably reduce the processing time.
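As a sketch, 16-bit memory-maps for the eigenvectors and theta could be created like this (file names, dtypes, shapes and the axis order are assumptions; see structure_tensor_workers.py for the keys the workers actually expect in init_args):

```python
import numpy as np

# Eigenvector volume: one 3-vector per voxel, stored as float16 to save space.
vec = np.memmap('notebook_data/vec.dat', dtype=np.float16, mode='w+',
                shape=(3,) + data_crop.shape)

# Per-voxel angle to class 0, also stored in 16-bit precision.
theta = np.memmap('notebook_data/theta.dat', dtype=np.float16, mode='w+',
                  shape=data_crop.shape)
```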

Now we're finally ready to perform the analysis. We will be using multiprocessing.Pool to distribute the work across multiple CPU cores. If specified in the device list, the CPU will offload the majority of the work to a GPU, otherwise it'll do the calculations itself. Here, we will create four processes for each device in the list (processes=4 * len(device)). As our device list contains four GPUs, we will be starting 16 processes, four for each GPU. Beware that this may require a significant amount of GPU memory. If you experience out of memory exceptions, either reduce crop_size and/or the number of processes per device. Change the number of processes to fit your system.

All shared variables are passed to the workers using the init_args dictionary and the init_worker function. This function is only called once per worker. When called, it will create all the memory-maps needed to read and write data during calculations, based on init_args. We will use the pool.imap_unordered function to tell the workers to perform the calculations for a block, by passing the structure_tensor_analysis_v1 function along with the index of the block we want to analyze. The returned results are a tuple of aggregated metrics, as described earlier. For more details see the structure_tensor_analysis_v1 code. Below we will show how to combine the aggregated metrics and display them as histograms.
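A rough sketch of this pattern is shown below. The exact contents of init_args and the signatures of init_worker and structure_tensor_analysis_v1 should be taken from structure_tensor_workers.py; this only illustrates the Pool/initializer/imap_unordered mechanics:

```python
from multiprocessing import Pool

from tqdm import tqdm
from structure_tensor_workers import init_worker, structure_tensor_analysis_v1

# Four processes per device in the list (16 processes for four GPUs).
processes = 4 * len(device)

results = []
with Pool(processes=processes, initializer=init_worker, initargs=(init_args,)) as pool:
    # Each task is the index of one block; the workers read and write data
    # through the memory-maps created in init_worker.
    for res in tqdm(pool.imap_unordered(structure_tensor_analysis_v1,
                                        range(len(crops))),
                    total=len(crops)):
        results.append(res)
```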

Another option is to save all the metrics to disk (using memory-maps as described earlier) with reasonable precision and perform the statistics directly on the full orientation volumes. This approach would be similar to what's done in the StructureTensorFiberAnalysisDemo notebook, except that there the data is kept in memory instead of on disk. If you use this approach you may simply ignore the aggregated data returned by structure_tensor_analysis_v1, but it will require a significant amount of disk space to store all the orientation and segmentation data. If the volumes are very big, working with them may also be slow and memory intensive, even if you use memmap to access the data.

Histograms

The results list contains the binned metrics for each separate block, so before we can show these, we have to combine them. First we will filter out elements with no data (i.e. blocks with no fibers) and convert the list to a NumPy array.
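For instance (assuming blocks without fibers are returned as None):

```python
import numpy as np

# Drop empty blocks (no fibers) and stack the per-block aggregates.
res = np.array([r for r in results if r is not None])
```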

Stiffness measure

Angle from X ($\theta$)

The histograms below show the distribution of the absolute angles between the fibers and the X-axis unit vector, along with the median angle. The resolution of the histograms is determined by bins_per_degree.

In-plane orientation (rotation in XY-plane)

The histograms below show the distribution of the rotation around the Z-axis (in-XY-plane rotation), the mean and the standard deviation for each class.

To calculate the mean and standard deviation properly for the aggregated data, we create two functions. We need these to calculate the mean and standard deviation for the complete volume, based on the means and standard deviations for each block, which is what we have available in the res array.
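A sketch of such pooled statistics, assuming each block contributes a voxel count, a mean and a standard deviation:

```python
import numpy as np

def combine_means(counts, means):
    """Mean for the whole volume from per-block counts and means."""
    counts = np.asarray(counts, dtype=np.float64)
    means = np.asarray(means, dtype=np.float64)
    return np.sum(counts * means) / np.sum(counts)

def combine_stds(counts, means, stds):
    """Pooled standard deviation from per-block counts, means and stds."""
    counts = np.asarray(counts, dtype=np.float64)
    means = np.asarray(means, dtype=np.float64)
    stds = np.asarray(stds, dtype=np.float64)
    mean = combine_means(counts, means)
    # Law of total variance: within-block variance plus between-block variance.
    var = np.sum(counts * (stds ** 2 + (means - mean) ** 2)) / np.sum(counts)
    return np.sqrt(var)
```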

Using these functions, we calculate the mean and standard deviation for orientations in the XY-plane.

In-plane orientation (rotation in XZ-plane)

The histograms below show the distribution of the rotation around the Y-axis (in-XZ-plane rotation), the mean and the standard deviation for each class.

The class fractions and orientation distributions are useful for determining if the material follows production specifications. Here we've chosen a specific set of metrics to calculate and plot, but many other distributions can just as easily be obtained and plotted.

Slices

As we chose to save the eigenvectors for each voxel in the data volume to the disk using the vec memory-map, we can access all the eigenvectors along with the original data if we like. We also have orientation metrics available through the theta memory-map.
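Reading the saved results back is a matter of re-opening the memory-maps read-only (names, dtypes and axis order as assumed in the sketch above):

```python
import numpy as np

# Re-open the memory-maps without loading everything into RAM.
vec = np.memmap('notebook_data/vec.dat', dtype=np.float16, mode='r',
                shape=(3,) + data_crop.shape)
theta = np.memmap('notebook_data/theta.dat', dtype=np.float16, mode='r',
                  shape=data_crop.shape)

# Grab one YZ cross section and the matching eigenvectors.
x = data_crop.shape[2] // 2
data_slice = data_crop[:, :, x]
vec_slice = vec[:, :, :, x]
```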

Let's grab some data and their ST eigenvectors.

Let's have a look at the data and the eigenvectors.

Slices with angle information

We can use the calculate_angles function to calculate the segmentation and orientation metrics for the slices we just read from the volumes.

First we define some plotting functions, then we calculate vec_class, eta_os, theta, out_xy_angle and in_xy_angle for each of the slices and plot them. All the figures will be saved to the figures folder we created in the beginning. It is easy to create figures for other slices simply by changing which slices are extracted from the volumes.

NOTE: You can easily change which figures are made by commenting out the calls to fig_with_colorbar in the show_metrics function.

Layer orientations

The lay-up of the material means that the layers may not be completely evenly oriented. The layers are noticeable in the cross sections along the X-axis. Let's investigate the orientations in these cross sections and see if the orientation differs between layers.

For each of these slices, let's calculate the mean angle of the orientations across the visible layers.