Mitosis Dataset for TCGA Diagnostic Slides
Description
Mitosis Detections and Mitotic Network in TCGA
This dataset contains mitosis detections, mitotic network structures, and social network analysis (SNA) measures derived from 11,161 diagnostic slides in The Cancer Genome Atlas (TCGA). Mitoses were automatically identified using the MDFS algorithm [1], and each detected mitosis was converted into a node within a mitotic network. The resulting graphs are provided in JSON format, with each file representing a single diagnostic slide.
JSON Data Format
Each JSON file contains four primary fields:
- edge_index
  Two parallel lists representing edges between nodes. The i-th element of the first list is the source node index, and the i-th element of the second list is the target node index.
- coordinates
  A list of [x, y] positions, one per node (mitosis). The (x, y) coordinates can be used for spatial visualization or further spatial analyses.
- feats
  A list of feature vectors, with each row corresponding to a node. These features include:
  - type (an integer representing the mitosis type. 1: typical mitosis, 2: atypical mitosis)
  - Node_Degree (the number of nodes connected to the node)
  - Clustering_Coeff (clustering coefficient of the node)
  - Harmonic_Cen (harmonic centrality of the node)
- feat_names
  The names of the features in feats. The order matches the columns in each node's feature vector.
Example JSON Snippet
{
"edge_index": [[1, 2, 6, 10], [2, 4, 8, 11]],
"coordinates": [[27689.0, 12005.0], [24517.0, 17809.0], ...],
"feats": [[1.0, 0.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.115], ...],
"feat_names": ["type", "Node_Degree", "Clustering_Coeff", "Harmonic_Cen"]
}
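Before analyzing a file, it can help to confirm that the four fields agree with each other. The following is a minimal sketch, not part of the dataset tooling: the function name `validate_graph` and the toy values are assumptions made for illustration, and the checks only cover the relationships stated in the field descriptions above.

```python
def validate_graph(data):
    """Return a list of problems found in one loaded graph dict (empty if none)."""
    required = ("edge_index", "coordinates", "feats", "feat_names")
    missing = [key for key in required if key not in data]
    if missing:
        return [f"missing field: {key}" for key in missing]

    problems = []
    num_nodes = len(data["coordinates"])
    # One feature row is expected per node
    if len(data["feats"]) != num_nodes:
        problems.append("feats and coordinates differ in length")
    # Each feature row should have one value per name in feat_names
    num_feats = len(data["feat_names"])
    if any(len(row) != num_feats for row in data["feats"]):
        problems.append("a feature row does not match feat_names")
    if any(len(xy) != 2 for xy in data["coordinates"]):
        problems.append("a coordinate is not an [x, y] pair")
    # edge_index should hold two lists of equal length (sources, targets)
    edge_index = data["edge_index"]
    if len(edge_index) != 2 or len(edge_index[0]) != len(edge_index[1]):
        problems.append("edge_index is not two parallel lists")
    return problems


# Toy graph with made-up values, shaped like the snippet above
toy = {
    "edge_index": [[0, 1], [1, 2]],
    "coordinates": [[10.0, 20.0], [30.0, 40.0], [50.0, 60.0]],
    "feats": [[1.0, 1.0, 0.0, 0.5], [2.0, 2.0, 0.0, 1.0], [1.0, 1.0, 0.0, 0.5]],
    "feat_names": ["type", "Node_Degree", "Clustering_Coeff", "Harmonic_Cen"],
}
print(validate_graph(toy))  # prints []
```

For a real file, pass the dict returned by `json.load` instead of the toy graph.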
Loading Data into NumPy
Below is a sample Python snippet to load one JSON file, extract node coordinates and the type feature, and combine them into a single NumPy array:
import json
import numpy as np
# Path to your JSON file
json_file_path = "example_graph.json"
with open(json_file_path, 'r') as f:
    data = json.load(f)
# Convert coordinates to NumPy
coordinates = np.array(data["coordinates"])
# Identify the "type" column
feat_names = data["feat_names"]
type_index = feat_names.index("type")
# Extract features and isolate the "type" column
feats = np.array(data["feats"])
node_types = feats[:, type_index].reshape(-1, 1)
# Combine x, y, and type into a single array (N x 3)
combined_data = np.hstack([coordinates, node_types])
print(combined_data)
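The type column can also be summarized per slide. The sketch below tallies typical and atypical mitoses using only the standard library; the helper name `count_mitosis_types` and the toy values are assumptions for illustration, while the 1/2 label mapping follows the feature description above.

```python
from collections import Counter

# Label mapping taken from the feature description: 1 = typical, 2 = atypical
TYPE_LABELS = {1: "typical", 2: "atypical"}


def count_mitosis_types(data):
    """Count nodes per mitosis type in one loaded graph dict."""
    type_index = data["feat_names"].index("type")
    # Types are stored as floats (e.g. 1.0), so cast to int before counting
    counts = Counter(int(row[type_index]) for row in data["feats"])
    return {label: counts.get(code, 0) for code, label in TYPE_LABELS.items()}


# Toy graph with made-up values; only the fields used here are included
toy = {
    "feats": [[1.0, 1.0, 0.0, 0.5], [2.0, 2.0, 0.0, 1.0], [1.0, 1.0, 0.0, 0.5]],
    "feat_names": ["type", "Node_Degree", "Clustering_Coeff", "Harmonic_Cen"],
}
print(count_mitosis_types(toy))  # prints {'typical': 2, 'atypical': 1}
```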
Building a NetworkX Graph
To visualize or analyze the network structure, you can construct a NetworkX graph as follows:
import json
import networkx as nx
import matplotlib.pyplot as plt
json_file_path = "example_graph.json"
with open(json_file_path, "r") as f:
    data = json.load(f)
# Create a NetworkX Graph
G = nx.Graph()
# Add each node with position attributes
for i, (x, y) in enumerate(data["coordinates"]):
    G.add_node(i, pos=(x, y))
# Add edges using the parallel lists in edge_index
# (Adjust for 1-based indexing if necessary)
for src, dst in zip(data["edge_index"][0], data["edge_index"][1]):
    G.add_edge(src, dst)
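The comment about 1-based indexing can be checked before adding edges. The sketch below is a heuristic only: the function name `guess_index_base` is hypothetical, and the result is inconclusive (`None`) whenever the indices fit both conventions.

```python
def guess_index_base(edge_index, num_nodes):
    """Guess whether edge indices start at 0 or 1; return None if ambiguous."""
    sources, targets = edge_index
    indices = list(sources) + list(targets)
    if not indices:
        return None  # no edges, nothing to infer
    lo, hi = min(indices), max(indices)
    if lo < 0 or hi > num_nodes or (lo == 0 and hi == num_nodes):
        raise ValueError("edge indices do not fit the number of nodes")
    if lo == 0:
        return 0  # index 0 can only occur with 0-based indexing
    if hi == num_nodes:
        return 1  # index N can only occur with 1-based indexing
    return None  # every index lies in 1..N-1, so both conventions fit


# Made-up edge lists for a graph with three nodes
print(guess_index_base([[0, 1], [1, 2]], 3))  # prints 0
print(guess_index_base([[1, 2], [2, 3]], 3))  # prints 1
```

If the result is 1, subtracting one from `src` and `dst` before calling `G.add_edge` keeps the edges aligned with the node ids assigned by `enumerate`.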
Visualizing the Mitotic Network Using TIAToolbox
With TIAToolbox installed, the mitotic networks can be overlaid on their respective whole slide images using the following command:
tiatoolbox visualize --slides path/to/slides --overlays path/to/overlays
Each slide and its overlay (the corresponding graph JSON file provided here) must have the same name. For more information, please refer to Visualization Interface Usage - TIA Toolbox 1.5.1 Documentation.
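The naming requirement can be checked before launching the viewer. The sketch below assumes that "same name" means the same file stem (the name without its final extension); that interpretation, the helper name `slides_without_overlay`, and the file names are assumptions for illustration.

```python
from pathlib import PurePath


def slides_without_overlay(slide_names, overlay_names):
    """Return sorted slide stems that have no overlay with the same stem."""
    overlay_stems = {PurePath(name).stem for name in overlay_names}
    slide_stems = {PurePath(name).stem for name in slide_names}
    return sorted(slide_stems - overlay_stems)


# Hypothetical file names for illustration only
slides = ["slide_A.svs", "slide_B.svs"]
overlays = ["slide_A.json"]
print(slides_without_overlay(slides, overlays))  # prints ['slide_B']
```

To check real folders, pass the file names found in each directory, for example `[p.name for p in Path("path/to/slides").iterdir()]` with `Path` imported from `pathlib`.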
If you use this dataset, please cite the following publication:
@article{jahanifar2024mitosis,
title={Mitosis detection, fast and slow: robust and efficient detection of mitotic figures},
author={Jahanifar, Mostafa and Shephard, Adam and Zamanitajeddin, Neda and Graham, Simon and Raza, Shan E Ahmed and Minhas, Fayyaz and Rajpoot, Nasir},
journal={Medical Image Analysis},
volume={94},
pages={103132},
year={2024},
publisher={Elsevier}
}
Files
| Checksum | Size |
|---|---|
| md5:09f9b86d291fbe7d07d262cfd990555d | 149.2 MB |
Additional details
Additional titles
- Alternative title
- Mitotic Networks of TCGA diagnostic slides
- Subtitle
- Mitosis detections, mitotic network, and SNA measures of the mitotic network for 11,161 diagnostic slides from TCGA
References
- [1] Jahanifar, Mostafa, et al. "Mitosis detection, fast and slow: robust and efficient detection of mitotic figures." Medical Image Analysis 94 (2024): 103132.