FlyWire Whole-brain Connectome Connectivity Data
Description
This repository contains the connectivity data for the FlyWire Connectome release. Currently, the latest release is version 783 (see also codex.flywire.ai).
The synapses represent a combination of four different data releases, which are brought together in the FlyWire whole-brain connectome release. The synapses (as points in space) were detected and published by Buhmann et al., 2021, who made use of a cleft segmentation produced by Heinrich et al., 2018. The neurotransmitters for these synapses were then predicted and released by Eckstein, Bates et al., 2024. The segmentation and neuron IDs (= root IDs) were proofread by the FlyWire consortium and are released by Dorkenwald et al., 2024 as part of the FlyWire connectome paper package.
Because multiple methods were involved in the production of this resource, the description of the methods is distributed across these manuscripts. We provide a summary in Dorkenwald et al., 2024 (Methods->Synaptic connections).
Some of the files are provided in the Feather file format. See the code examples at the end of this description for reading these files, including chunk-wise streaming to handle the large synapse table.
flywire_synapses_783.feather
a pandas dataframe with all ~130 million synapses, their locations, neurotransmitter predictions, and pre- and postsynaptic partners (= root IDs). This table contains all synapses that passed the thresholds (see the methods section in Dorkenwald et al., 2024), but not all synapses are associated with proofread neurons (e.g., see Discussion->Limitations of our reconstruction in Dorkenwald et al., 2024). Hence, not all root IDs in this table will have a match in the proofread root IDs array. This table is provided for completeness and to allow calculations of the total synaptic input and output of neurons independent of these limitations. A short loading example follows the column list below.
Columns:
- id: synapse ID
- pre_pt_root_id: presynaptic neuron ID
- post_pt_root_id: postsynaptic neuron ID
- connection_score: score assigned by Buhmann et al.; higher is better. We did not use this score to threshold synapses in any analysis.
- cleft_score: score derived from the cleft segmentation by Heinrich et al.; higher is better. We used a threshold of 50 for all analyses and for the released dataset. Synapses with a lower score are not included here but can be made available on request.
- gaba: probability for neurotransmitter=GABA
- ach: probability for neurotransmitter=Acetylcholine
- glut: probability for neurotransmitter=Glutamate
- oct: probability for neurotransmitter=Octopamine
- ser: probability for neurotransmitter=Serotonin
- da: probability for neurotransmitter=Dopamine
- neuropil: the name of the neuropil associated with this synapse. Symmetric neuropils contain a hemisphere annotation after '_'. E.g., ME_L is the medulla in the left hemisphere. For a mapping of abbreviations to full names, see Ext. Data Fig. 1 or https://codex.flywire.ai/app/neuropils
- post_pt_position_{x,y,z}: Coordinate within the postsynaptic neuron (synapses were identified with two points, one in each neuron). Coordinates are in nanometers.
- pre_pt_position_{x,y,z}: Coordinate within the presynaptic neuron (synapses were identified with two points, one in each neuron). Coordinates are in nanometers.
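The sketch below, assuming the file sits in the working directory, loads a subset of columns and assigns each synapse its most likely transmitter from the probability columns listed above; the nt_pred column name is illustrative and not part of the release:
import pandas as pd

# Neurotransmitter probability columns as documented above
nt_cols = ["gaba", "ach", "glut", "oct", "ser", "da"]

# Reading only the needed columns keeps memory use manageable for ~130M rows
syn = pd.read_feather(
    "flywire_synapses_783.feather",
    columns=["id", "pre_pt_root_id", "post_pt_root_id"] + nt_cols,
)

# Most likely transmitter per synapse (illustrative column name)
syn["nt_pred"] = syn[nt_cols].idxmax(axis=1)
print(syn["nt_pred"].value_counts())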
per_neuron_neuropil_count_post_783.feather
a pandas dataframe containing the number of postsynapses per neuropil and segment ID, i.e., a summarized version of flywire_synapses_783.feather
Columns:
- post_pt_root_id: segment ID
- neuropil: neuropil name. Symmetric neuropils contain a hemisphere annotation after '_'. E.g., ME_L is the medulla in the left hemisphere. For a mapping of abbreviations to full names, see Ext. Data Fig. 1 or https://codex.flywire.ai/app/neuropils
- Count: number of synapses for this segment ID and neuropil
per_neuron_neuropil_count_pre_783.feather
a pandas dataframe containing the number of presynapses per neuropil and segment ID, i.e., a summarized version of flywire_synapses_783.feather (a groupby sketch for deriving such counts follows the column list below)
Columns:
- pre_pt_root_id: segment ID
- neuropil: neuropil name. Symmetric neuropils contain a hemisphere annotation after '_'. E.g., ME_L is the medulla in the left hemisphere. For a mapping of abbreviations to full names, see Ext. Data Fig. 1 or https://codex.flywire.ai/app/neuropils
- Count: number of synapses for this segment ID and neuropil
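As a rough illustration (not guaranteed to reproduce the released counts exactly, since the summary files were generated as described in Dorkenwald et al., 2024), counts of this form can be derived from the full synapse table with a pandas groupby:
import pandas as pd

# Load only the columns needed for a presynapse count per neuropil
syn = pd.read_feather(
    "flywire_synapses_783.feather",
    columns=["pre_pt_root_id", "neuropil"],
)

# Number of presynapses per segment ID and neuropil
pre_counts = (
    syn.groupby(["pre_pt_root_id", "neuropil"])
    .size()
    .reset_index(name="Count")
)
print(pre_counts.head())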
proofread_root_ids_783.npy
an array of all proofread neuron IDs (= root IDs)
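A minimal sketch, assuming both files sit in the working directory, for loading this array with numpy and restricting the synapse table to synapses whose partners are both proofread (cf. the note above that not all root IDs in the synapse table are proofread):
import numpy as np
import pandas as pd

proofread_ids = set(np.load("proofread_root_ids_783.npy"))

syn = pd.read_feather(
    "flywire_synapses_783.feather",
    columns=["pre_pt_root_id", "post_pt_root_id"],
)

# Keep only synapses whose pre- and postsynaptic partners are both proofread
mask = syn["pre_pt_root_id"].isin(proofread_ids) & syn["post_pt_root_id"].isin(proofread_ids)
print(f"{mask.sum()} of {len(syn)} synapses connect two proofread neurons")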
proofread_connections_783.feather
a pandas dataframe containing the proofread subset of flywire_synapses_783.feather, summarized per neuron-neuron pair and neuropil, i.e., this table contains one entry for each neuron-neuron pair and neuropil with one or more synapses (see the aggregation sketch after the column list below)
Columns:
- pre_pt_root_id: presynaptic neuron ID
- post_pt_root_id: postsynaptic neuron ID
- neuropil: neuropil name. Symmetric neuropils contain a hemisphere annotation after '_'. E.g., ME_L is the medulla in the left hemisphere. For a mapping of abbreviations to full names, see Ext. Data Fig. 1 or https://codex.flywire.ai/app/neuropils
- syn_count: number of synapses between these two neurons in this neuropil
- gaba_avg: average probability across the synapses for neurotransmitter=GABA
- ach_avg: average probability across the synapses for neurotransmitter=Acetylcholine
- glut_avg: average probability across the synapses for neurotransmitter=Glutamate
- oct_avg: average probability across the synapses for neurotransmitter=Octopamine
- ser_avg: average probability across the synapses for neurotransmitter=Serotonin
- da_avg: average probability across the synapses for neurotransmitter=Dopamine
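For example, a weighted neuron-to-neuron edge list (summing syn_count across neuropils) can be built from this table; a minimal sketch, assuming the file sits in the working directory:
import pandas as pd

conn = pd.read_feather("proofread_connections_783.feather")

# Total synapse count per neuron-neuron pair, summed over neuropils
edges = conn.groupby(
    ["pre_pt_root_id", "post_pt_root_id"], as_index=False
)["syn_count"].sum()

print(edges.sort_values("syn_count", ascending=False).head())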
Code for reading and streaming feather files
Read feather files with pandas:
import pandas as pd
df = pd.read_feather(path)
Stream large feather files in chunks:
import pyarrow.feather as feather
table = feather.read_table(path)
# Total number of rows in the Feather file
num_rows = table.num_rows

# Define chunk size
chunk_size = 1000

# Read and process the data in chunks
for i in range(0, num_rows, chunk_size):
    end_row = min(i + chunk_size, num_rows)
    chunk = table.slice(i, end_row - i)  # Slice the table from i to end_row

    # Convert to pandas DataFrame if needed
    df_chunk = chunk.to_pandas()
    # Now you can process each chunk DataFrame as needed
    print(df_chunk.head())
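A usage sketch combining the chunked pattern above with the synapse table, tallying synapses per neuropil without converting the whole table to pandas at once (file name and chunk size are assumptions):
import pyarrow.feather as feather
from collections import Counter

table = feather.read_table(
    "flywire_synapses_783.feather", columns=["neuropil"]
)

chunk_size = 1_000_000
neuropil_counts = Counter()
for i in range(0, table.num_rows, chunk_size):
    chunk = table.slice(i, chunk_size)  # slices past the end are truncated safely
    neuropil_counts.update(chunk.to_pandas()["neuropil"])

print(neuropil_counts.most_common(10))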