
There is a newer version of the record available.

Published July 3, 2022 | Version v1
Journal article Open

Project files provided as supporting information to the manuscript "Kinetics of radiation-induced DNA double-strand breaks through coarse-grained simulations"

Description

README file for the project files provided as supporting information to the manuscript “Kinetics of radiation-induced DNA double-strand breaks through coarse-grained simulations”

Authors: Manuel Micheloni, Lorenzo Petrolli, Gianluca Lattanzi and Raffaello Potestio
==================================

The .zip file contains the following folders:

DNA_sequence_LAMMPS:
DNAsequence.txt: the LAMMPS data file containing the DNA sequence employed in the study.

DSB_input_datafile
Contains the binary LAMMPS data files employed as the starting point for the DSB MD simulations. Each file contains the DNA molecule at a given end-to-end distance, which also gives the binary file its name.

DSB_MD_simulation
0_MD_LAMMPS: 
Contains the LAMMPS simulation scripts. The subfolders follow the logic of the study: for each DNA extension (folders named 1000, 1100, …, 1300), we investigated different DSB motifs (folders named 0, 1, …, 4).
1_DSB_raw_data
Contains the relevant information about the MD trajectories. The files are named “out_Ree_bd_n.mat”, where Ree is the DNA end-to-end distance, bd is the DSB distance, and n = 1, …, Nt is the run index, with Nt the total number of independent MD runs.
Each “out” file contains: i) “free-energy”, the internal energy contribution of the nucleotides between the breaks, and ii) POS_left/right_branch, matrices that provide the ids and the positions (xyz) of the nucleotides in the break along the trajectory. For the distinction between “left” and “right”, see the section “Assessment of the residual contact interface of the DSBs at the rupture time” in the Supplementary data.
Finally, “POS_time” contains the length of the simulation run (step*dt).
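
As an illustration of the naming scheme above, the following minimal Python sketch splits an “out” file name into its three fields. The function and class names are hypothetical and not part of the deposited scripts; Ree and bd are kept as strings because their exact formatting in the file names is not specified here, and the example name in the usage note is invented.

```python
import re
from typing import NamedTuple


class OutFile(NamedTuple):
    """Fields encoded in an "out_Ree_bd_n.mat" file name (hypothetical helper)."""
    ree: str  # DNA end-to-end distance label, kept as text
    bd: str   # DSB distance label, kept as text
    run: int  # index n of the independent MD run


# Assumption: the three fields are separated by single underscores
# and contain no underscores themselves.
_PATTERN = re.compile(r"^out_(?P<ree>[^_]+)_(?P<bd>[^_]+)_(?P<n>\d+)\.mat$")


def parse_out_name(name: str) -> OutFile:
    """Return the (Ree, bd, n) fields of a raw-data file name."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"not an 'out_Ree_bd_n.mat' file name: {name!r}")
    return OutFile(match["ree"], match["bd"], int(match["n"]))
```

For example, parse_out_name("out_1000_2_7.mat") yields the fields ree="1000", bd="2", run=7, which can be used to group files by (Ree,bd) before analysis.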

2_Analysis_scripts
Contains the analysis scripts.
In folder 1_SigmoidalFitting, we provide the scripts that perform a sigmoidal fitting procedure [1] on the internal energy profile of the nucleotides between the strand breaks. 
Each script saves i) the internal energy barriers (activation_free_energy) and ii) the breaking times (breaking_time) of all MD simulations characterized by a given (Ree,bd). Quantities i) and ii) are then averaged: iii) dE and iv) tau_b report the respective mean value and standard deviation. All data can be found in 3_DSB_proc_data.
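
The averaging step can be sketched in a few lines of Python. This is not the deposited MATLAB code: the function name is hypothetical, and the use of the sample standard deviation (N-1 normalization, which is MATLAB's default for std) is an assumption.

```python
import statistics


def mean_and_std(values):
    """Return (mean, sample standard deviation) of per-run quantities.

    Assumption: the standard deviation uses the N-1 normalization;
    for a single run it is reported as 0.0.
    """
    values = list(values)
    if not values:
        raise ValueError("at least one value is required")
    mean = statistics.fmean(values)
    std = statistics.stdev(values) if len(values) > 1 else 0.0
    return mean, std
```

Applied to the breaking times of all runs sharing a given (Ree,bd), this would give a pair analogous to tau_b; applied to the energy barriers, a pair analogous to dE.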

[1] R P (2022). sigm_fit (https://www.mathworks.com/matlabcentral/fileexchange/42641-sigm_fit), MATLAB Central File Exchange. Retrieved July 2, 2022.

2_DsbDistanceAnalysis contains the scripts that generate the interaction matrices (in subfolder 1_interaction_matrix) that are employed in the subfolder 2_correlation_BreakTime_NucleotidesDistances. 
For additional details about the analysis, refer to the section “Analysis of the residual interactions at the characteristic time of a DSB rupture” of the article.
The scripts “fitting_bd#_DSB_dist_interaction.m” produce “IntMatrix_Ree_bd.mat” data files, which contain i) “interaction_matrix_avg”, the average values of the interactions in the bound state of the DNA molecule, and ii) “interaction_matrix_atBreak”, the relative distances of all nucleotides at the breaking time for all independent MD simulations characterized by a given (Ree,bd).
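
To make the notion of a matrix of relative distances concrete, the Python sketch below computes the pairwise Euclidean distances between two sets of nucleotide positions at a single frame. It is only an illustration under stated assumptions: the function name is hypothetical, the pairing of left-branch against right-branch nucleotides is assumed, and the actual definition of the interaction matrices is the one given in the article.

```python
import math


def distance_matrix(left_xyz, right_xyz):
    """Pairwise distances between two sets of xyz positions.

    Entry [i][j] is the Euclidean distance between the i-th position in
    left_xyz and the j-th position in right_xyz (hypothetical layout).
    """
    return [[math.dist(a, b) for b in right_xyz] for a in left_xyz]
```

For instance, distance_matrix([(0, 0, 0)], [(3, 4, 0), (0, 0, 1)]) returns [[5.0, 1.0]].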


3_DSB_proc_data
The data contained in 1_DSB_raw_data are processed by the scripts in 2_Analysis_scripts and saved in 3_DSB_proc_data. For further details, see the description of 2_Analysis_scripts.

TimeScaling
0_MD_LAMMPS
Here we provide the LAMMPS script employed to compute the diffusion coefficient of the 3855-bp DNA molecule. Specifically, we acquire the mean-squared displacement (MSD), from which the diffusion coefficient can be extracted.
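
One common way to extract a diffusion coefficient from an MSD curve is the Einstein relation MSD(t) ≈ 2·d·D·t in d dimensions, with D obtained from the slope of a linear fit. The Python sketch below follows that route; whether the study uses exactly this procedure (fit window, dimensionality, units) is an assumption, and the function name is hypothetical.

```python
def diffusion_coefficient(times, msd, dim=3):
    """Estimate D from the least-squares slope of MSD versus time.

    Assumption: the data lie in the linear (diffusive) regime, so that
    MSD(t) = 2 * dim * D * t + const.
    """
    n = len(times)
    if n != len(msd) or n < 2:
        raise ValueError("need two or more matching (time, MSD) points")
    t_mean = sum(times) / n
    m_mean = sum(msd) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time values must not all be equal")
    sxy = sum((t - t_mean) * (m - m_mean) for t, m in zip(times, msd))
    slope = sxy / sxx
    return slope / (2 * dim)
```

With the synthetic input times=[0, 1, 2, 3] and msd=[0, 6, 12, 18], the fitted slope is 6 and the function returns 1.0 for dim=3. Averaging the estimate over the independent simulations would then follow the same mean and standard deviation logic used elsewhere in the analysis.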

1_Diffusion_data
Contains the MSDs for each independent simulation.


NB: most data are saved in the .mat format used by MATLAB, a numerical computing environment and proprietary programming language developed by MathWorks.

Notes

RP and MM acknowledge support from the Italian Ministry of Education, University and Research (MIUR) through the FARE grant for the project HAMMOCK (Grant R18ZHWY3NC).

Files

DNA_sequence_LAMMPS.zip

Files (497.0 MB)

Checksum (size):
md5:4c4491e97adec592fbe02d04bf55a68d (2.4 kB)
md5:853c79fbf125d9229012ed610d5cf45d (3.8 MB)
md5:3ec83677a782a08f665813df855ef5f9 (464.2 MB)
md5:86ed7608646b8ca22aa17637d403d70c (3.9 kB)
md5:8895bbcf36744f20acd60c7e4db454cf (29.0 MB)
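
A downloaded archive can be checked against its listed MD5 checksum with a short Python sketch such as the one below. The function names are hypothetical, and the pairing of each checksum with a file name is left to the reader, since the listing here gives only checksums and sizes.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a listed value such as "md5:4c44...".

    The optional "md5:" prefix is stripped and case is ignored.
    """
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use low for the larger archives.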

Additional details

Funding

VARIAMOLS – VAriable ResolutIon Algorithms for macroMOLecular Simulation 758588
European Commission