Author: Niels Jeppesen (niejep@dtu.dk)
This notebook reads the TXM data format. In our experience, NIfTI is much faster to read than TXM with the packages we use here.
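If you plan to read the same volume many times, it may therefore pay off to convert it to NIfTI once. Below is a minimal sketch of such a conversion, assuming nibabel (which is among the dependencies below), that read_txm reads the full volume when no slice_range is given, and hypothetical file names; note that the voxel size is not encoded in the identity affine used here.

import nibabel as nib
import numpy as np
from dxchange import reader

# Read the whole TXM volume once (requires enough RAM) and save it as NIfTI.
volume, meta = reader.read_txm('../originals/sample.txm')
nib.save(nib.Nifti1Image(volume, affine=np.eye(4)), '../tmp/sample.nii')

# Later, load the NIfTI file instead of re-reading the TXM.
volume = np.asanyarray(nib.load('../tmp/sample.nii').dataobj)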
The structure-tensor package we will be using here is a 2D and 3D structure tensor package for Python, implemented by Vedrana A. Dahl and Niels Jeppesen.
To run this notebook there are some prerequisites that must be installed in our Python environment. We generally recommend creating a new Python environment to run the notebooks in, for instance using conda create -n <env_name> python=3.7. This way we avoid conflicts with packages in our existing environment.
Install the dependencies for this notebook using pip install numpy scipy scikit-image matplotlib nibabel tqdm structure-tensor. To start the notebook we will of course also need jupyter. Install the dxchange package from GitHub, see the note below.

Note: The current release of the dxchange package has a bug in the TXM reader. This is fixed in the GitHub master branch, so for now the package must be installed directly from GitHub using pip install git+https://github.com/data-exchange/dxchange.git. The setup.py does not specify requirements for the package; they can be found in the dxchange requirements.txt instead.
Now, let's go ahead and import the required modules. The structure_tensor_workers.py file should be in the notebook folder.
import os
from multiprocessing import Pool
import matplotlib as mpl
import matplotlib.font_manager as fm
import matplotlib.pyplot as plt
import numpy as np
from dxchange import reader
from mpl_toolkits.axes_grid1 import make_axes_locatable
from mpl_toolkits.axes_grid1.anchored_artists import AnchoredSizeBar
from scipy import ndimage
from scipy.spatial.transform import Rotation as R
from skimage import filters
from structure_tensor import eig_special_3d, structure_tensor_3d
from tqdm import tqdm
from structure_tensor_workers import calculate_angles, get_crops, init_worker, structure_tensor_analysis_v1
Warning: dxfile module not found
spefile module not found
netCDF4 module not found
EdfFile module not found
astropy module not found
plt.rcParams['image.interpolation'] = 'nearest'
First, we'll load a sample of the data and some metadata. We will be using the following folder structure:

- ../notebooks contains the notebooks.
- ../originals should contain the original data files.
- ../tmp should contain any temporary files or output we create.
- ../notebooks/figures/<file_name> contains optionally saved figures.

# Set file name and path.
file_name = 'DY06_FoV2.9 B2_recon EV rotate 6.5 degrees.txm'
file_path = os.path.join('../originals/', file_name)
# Change this to True to save figures to folder.
save_figures = False
# Create folder to save figures.
fig_path = os.path.join('figures', os.path.basename(file_name))
if save_figures and fig_path and not os.path.exists(fig_path):
    os.makedirs(fig_path)
# Read the first slice and metadata.
sample, meta = reader.read_txm(file_path, slice_range=(1, None, None))
data_shape = (meta['number_of_images'], meta['image_height'], meta['image_width'])
data_type = sample.dtype
voxel_size = meta['pixel_size']
fiber_diameter = 7.3
print('Shape:', data_shape)
print('Data type:', data_type)
print('Voxel size:', voxel_size, 'μm')
print('Fiber diameter:', fiber_diameter, 'μm')
Shape: (1003, 1013, 992)
Data type: uint16
Voxel size: 2.866401433944702 μm
Fiber diameter: 7.3 μm
Let's have a quick look at the data to ensure it was loaded correctly.
fig, axs = plt.subplots(1, 2, figsize=(20, 20))
i = data_shape[0] // 3
img, _ = reader.read_txm(file_path, slice_range=((i, i + 1), None, None))
axs[0].imshow(img.squeeze(), cmap='gray')
i = data_shape[0] // 3 * 2
img, _ = reader.read_txm(file_path, slice_range=((i, i + 1), None, None))
axs[1].imshow(img.squeeze(), cmap='gray')
[(ax.set_xlabel('Y'), ax.set_ylabel('Z')) for ax in axs]
plt.show()
Clearly a large part of the data is not of interest to us, so let's crop out the area of interest. You can cut out any part of the data you want to analyze.
x_slice = slice(75, data_shape[0] - 75)
y_slice = slice(255, data_shape[1] - 155)
z_slice = slice(155, data_shape[2] - 255)
fig, axs = plt.subplots(1, 2, figsize=(20, 20))
i = data_shape[0] // 3
img, _ = reader.read_txm(file_path, slice_range=((i, i + 1), y_slice, z_slice))
axs[0].imshow(img.squeeze(), cmap='gray')
i = data_shape[0] // 3 * 2
img, _ = reader.read_txm(file_path, slice_range=((i, i + 1), y_slice, z_slice))
axs[1].imshow(img.squeeze(), cmap='gray')
[(ax.set_xlabel('Y'), ax.set_ylabel('Z')) for ax in axs]
plt.show()
Create the new shape. Note that the last two axes are swapped relative to the original data, because we rotate each slice 90 degrees when saving the cropped volume below.
data_shape = (x_slice.stop - x_slice.start, z_slice.stop - z_slice.start, y_slice.stop - y_slice.start)
print('New shape:', data_shape)
New shape: (853, 582, 603)
To work with the data, we will save it as a raw data file.
temp_folder = '../tmp/'
if temp_folder and not os.path.exists(temp_folder):
    os.mkdir(temp_folder)
data_path = os.path.join(temp_folder, file_name + f'_{x_slice.start}-{x_slice.stop}_{y_slice.start}-{y_slice.stop}_{z_slice.start}-{z_slice.stop}.raw')
if not os.path.exists(data_path):
    # Number of images read at a time.
    # Reduce this to lower memory usage.
    chunk_size = 1024
    print('Creating new file:', data_path)
    data = np.memmap(data_path, dtype=data_type, shape=data_shape, mode='w+')
    for i in tqdm(range(0, data_shape[0], chunk_size)):
        x0 = i + x_slice.start
        x1 = min(x_slice.stop, x0 + chunk_size)
        chunk, _ = reader.read_txm(file_path, slice_range=((x0, x1), y_slice, z_slice))
        data[i:i + chunk_size] = np.rot90(chunk, k=1, axes=(1, 2))
else:
    print('Using existing file:', data_path)
    data = np.memmap(data_path, dtype=data_type, shape=data_shape, mode='r')
Using existing file: ../tmp/DY06_FoV2.9 B2_recon EV rotate 6.5 degrees.txm_75-928_255-858_155-737.raw
fig, axs = plt.subplots(2, 2, figsize=(20, 20))
axs[0, 0].imshow(data[data.shape[0] // 3], cmap='gray')
axs[0, 1].imshow(data[data.shape[0] // 3 * 2], cmap='gray')
[(ax.set_xlabel('Z'), ax.set_ylabel('Y')) for ax in axs[0]]
axs[1, 0].imshow(data[:, data.shape[1] // 2], cmap='gray')
axs[1, 0].set_xlabel('Y'), axs[1, 0].set_ylabel('X')
axs[1, 1].imshow(data[..., data.shape[-1] // 2], cmap='gray')
axs[1, 1].set_xlabel('Z'), axs[1, 1].set_ylabel('X')
plt.show()
To choose a good intensity threshold for the fibers, we can inspect a histogram and pick a value manually. Alternatively, we can use a method such as Otsu's threshold. When choosing the threshold we make sure to include only inside-sample data, which makes it easier to separate foreground (fibers) from background inside the sample.
threshold_bins = 1024
threshold_data = data[25:-25, 25:-25, 25:-25]
%%time
otsu_threshold = filters.threshold_otsu(threshold_data.reshape(-1, 1), nbins=threshold_bins)
hand_picked_threshold = 14400
print('Otsu threshold:', otsu_threshold)
print('Hand-picked threshold:', hand_picked_threshold)
Otsu threshold: 14463
Hand-picked threshold: 14400
CPU times: user 1.86 s, sys: 565 ms, total: 2.43 s
Wall time: 2.43 s
print('Otsu fiber fraction:', round(np.count_nonzero(threshold_data >= otsu_threshold) / threshold_data.size, 3))
print('Hand-picked fiber fraction:', round(np.count_nonzero(threshold_data >= hand_picked_threshold) / threshold_data.size, 3))
Otsu fiber fraction: 0.547
Hand-picked fiber fraction: 0.595
percentiles = [.5, 99.5]
percentile_values = np.percentile(threshold_data, percentiles)
print(f'Percentiles ({percentiles}): {percentile_values}')
Percentiles ([0.5, 99.5]): [13332. 15622.]
ax = plt.subplot(1, 1, 1)
ax.hist(threshold_data.flat, bins=threshold_bins, density=True)
ax.axvline(otsu_threshold, c='r')
ax.axvline(hand_picked_threshold, c='g')
for p in percentile_values:
    ax.axvline(p, c='k', ls='--')
ax.set_xlim(11000, 19000)
plt.show()
We set $\sigma$ and $\rho$ based on the size of the fibers that we want to analyze. For more details see the related paper and StructureTensorFiberAnalysisDemo notebook.
Then, we set the block size (crop_size), which determines the maximum size of the blocks we partition the data into for the ST calculation. The maximum block size depends on crop_size, the truncate value used for the Gaussian filters (default is 4), and $\rho$ or $\sigma$ (usually $\rho$, because it is the largest).
We also set the fiber_threshold and specify a list of devices to use for the calculation. The list determines how the blocks are distributed between devices. If we have a dual-GPU system supporting CUDA, we can specify ['cuda:0', 'cuda:1'], which will distribute the blocks evenly between the two GPUs. ['cuda:0', 'cuda:1', 'cuda:1'] would move two thirds of the blocks to GPU 1, while ['cuda:0', 'cuda:1', 'cpu'] would move one third of the blocks to the CPU. Remember to update the device list to match your system resources. Specifying CUDA devices that are not available will result in exceptions and/or undefined behaviour. If you don't have CUDA GPUs available, just set device = ['cpu'] and specify the number of processes later.
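To illustrate the proportions described above, here is a minimal sketch of how block indices could be mapped to devices. Round-robin assignment is an assumption for this illustration; the actual scheduling is handled by the workers in structure_tensor_workers.py and may differ.

# Hypothetical round-robin mapping of block indices to devices (illustration only).
def assign_devices(n_blocks, devices):
    # Cycle through the device list, one block at a time.
    return {i: devices[i % len(devices)] for i in range(n_blocks)}

print(assign_devices(6, ['cuda:0', 'cuda:1', 'cpu']))
# {0: 'cuda:0', 1: 'cuda:1', 2: 'cpu', 3: 'cuda:0', 4: 'cuda:1', 5: 'cpu'}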
The class_vectors are used to segment the voxels based on the orientations of the ST eigenvectors. It is a list of unit vectors, one for each fiber class. The first vector is considered the primary orientation and will be used for calculating orientation metrics for each voxel. For more details see the related paper and the StructureTensorFiberAnalysisDemo notebook.
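To make the class assignment concrete, a voxel can be given the class whose vector forms the smallest angle with the voxel's eigenvector. The sketch below illustrates this with the absolute dot product (orientations have no sign); the actual segmentation is performed by structure_tensor_analysis_v1, which may differ in details.

# vec: eigenvectors from eig_special_3d with shape (3, ...), one unit vector per voxel.
# class_vectors: array of unit class vectors with shape (n_classes, 3).
cos_abs = np.abs(np.einsum('i...,ci->c...', vec, class_vectors))
# Assign each voxel the class with the largest |cos(angle)|, i.e. the smallest angle.
voxel_class = np.argmax(cos_abs, axis=0)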
Lastly, bins_per_degree determines the number of bins used per degree when aggregating the orientation metrics.
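For example, aggregating per-voxel angles into fixed-width bins could look like the sketch below; the 0 to 90 degree range and the placeholder angles array are assumptions for this illustration.

# Hypothetical aggregation of per-voxel angles (in degrees) into fixed-width bins.
bins_per_degree = 10  # as set in the cell below
angles = np.random.uniform(0, 90, size=1000)  # placeholder for computed angles
bin_edges = np.linspace(0, 90, 90 * bins_per_degree + 1)  # 10 bins per degree
counts, _ = np.histogram(angles, bins=bin_edges)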
rho = fiber_diameter / voxel_size
rho = round(rho, 2)
sigma = rho / 2
print('sigma:', sigma)
print('rho:', rho)
truncate = 4
kernel_radius = int(max(rho, sigma) * truncate + 0.5)
print('kernel_radius:', kernel_radius)
crop_size = 230
print('crop_size:', crop_size)
print('Maximum block size:', crop_size + kernel_radius)
# Important: Listing invalid CUDA devices may result in exceptions.
device = ['cuda:0']
class_names = ['0']
class_vectors = np.array([[0, 0, 1]], dtype=np.float64)
class_vectors /= np.linalg.norm(class_vectors, axis=-1)[..., np.newaxis]
bins_per_degree = 10
sigma: 1.275
rho: 2.55
kernel_radius: 10
crop_size: 230
Maximum block size: 240
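The heavy lifting is done blockwise by the workers imported above, but the per-block computation boils down to the structure_tensor package's basic API. Here is a minimal single-block sketch; the crop boundaries are arbitrary for this example.

# Structure tensor and eigendecomposition for a single example crop.
crop = data[:100, :100, :100].astype(np.float64)
S = structure_tensor_3d(crop, sigma, rho)  # six unique tensor elements per voxel
val, vec = eig_special_3d(S)  # eigenvalues and the eigenvector of the smallest eigenvalue
# vec holds the dominant (fiber) orientation per voxel, shape (3, 100, 100, 100).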
To create the mask for valid data, we create a mask as before, except that we erode it by an amount equal to the kernel radius. This way we "remove" voxels that are affected by voxels without values (black/zero intensity).
# Create mask memory map.
mask_path = os.path.join(temp_folder, file_name + '_mask.raw')
mask = np.memmap(mask_path, dtype=bool, shape=data.shape, mode='w+')
# Get mask for values different from 0.
mask[:] = data != 0
# Erode mask to keep only valid voxels (those not affected by the edge).
mask[:] = ndimage.binary_erosion(mask, iterations=kernel_radius)
# Filter out the first and last slices on all axes.
ignore = 25
mask[:ignore] = False
mask[-ignore:] = False
mask[:, :ignore] = False
mask[:, -ignore:] = False
mask[..., :ignore] = False
mask[..., -ignore:] = False
Now that we have a threshold and a data mask, we combine them into a fiber mask, which is True for voxels that contain fibers.
fiber_threshold = otsu_threshold
mask &= data > fiber_threshold
We can plot the mask on top of the data. Here, red marks fiber voxels, while blue marks non-fiber voxels: the background and voxels ignored for being too close to the edge of the data.
plt.figure(figsize=(20, 20))
ax = plt.subplot(2, 2, 1)
ax.imshow(data[data.shape[0] // 3], cmap='gray')
ax.imshow(mask[data.shape[0] // 3], alpha=.5, cmap='bwr', interpolation='nearest')
ax = plt.subplot(2, 2, 2)
ax.imshow(data[data.shape[0] // 3 * 2], cmap='gray')
ax.imshow(mask[data.shape[0] // 3 * 2], alpha=.5, cmap='bwr', interpolation='nearest')
ax = plt.subplot(2, 2, 3)
ax.imshow(data[:, data.shape[1] // 2], cmap='gray')
ax.imshow(mask[:, data.shape[1] // 2], alpha=.5, cmap='bwr', interpolation='nearest')
ax = plt.subplot(2, 2, 4)
ax.imshow(data[..., data.shape[-1] // 2], cmap='gray')
ax.imshow(mask[..., data.shape[-1] // 2], alpha=.5, cmap='bwr', interpolation='nearest')
plt.show()