Published February 6, 2026 | Version v1
Dataset Open

Simulated datasets for LUFNet-CPC2026

  • 1. Agency for Science, Technology and Research
  • 2. National University of Singapore
  • 3. Chosun University
  • 4. Nanyang Technological University
  • 5. International Research Laboratory on Artificial Intelligence
  • 6. Centre for Frontier AI Research

Description

Links to code and LUFnet paper:

1. [LUFnet Code](https://github.com/Gobliu/LUFNet-CPC2026.git)

2. [Scalable neural network driven molecular dynamics simulation](https://doi.org/10.1016/j.cpc.2026.110036)

 

The .zip files contain the full dataset used in our LUFnet paper.

| Category | Count (files) | Total samples |
| --- | --- | --- |
| Training set | 13 files | 180,000 |
| Validation set | 1 file | 20,000 |
| Inference set: input for LUFnet | 3 files (inside 1 zip) | 3,000 |
| Rollout steps for Velocity-Verlet (to compare with LUFnet) | 3 files (inside 1 zip) | 3,000 |

Table 1. Dataset summary

 

The dataset is provided in .pt format, compressed as .zip files, and includes the training and validation datasets for LUFnet, as well as the inference dataset used as input for LUFnet.

The datasets were prepared following the procedure described in the Methods section of the LUFnet paper, using Monte Carlo–generated initial configurations and the Velocity-Verlet algorithm.

 

Training and validation datasets

Due to the large storage requirements, the total training set is split into multiple .pt files and provided in compressed zip format. The training data consist of 13 subsets, which can be merged into a single .pt file to load the full training set. The validation set is stored as a single .pt file.

  • Training set (13 files) : Train_set_part*.zip

  • Validation set (1 file) : Valid_set.zip
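The merge described above could look like the following minimal PyTorch sketch. It assumes each extracted .pt file holds a single tensor with samples along dimension 0 (the record speaks of a "loaded tensor" but does not spell out the on-disk layout). The helper names `part_index` and `merge_parts` are illustrative and are not taken from the LUFnet repository. Note that holding all 13 parts in memory at once requires a correspondingly large amount of RAM.

```python
import re
from pathlib import Path

import torch


def part_index(path):
    """Return the integer N from a '..._partN_...' file name for natural sorting."""
    match = re.search(r"part(\d+)", Path(path).name)
    if match is None:
        raise ValueError(f"no part index in file name: {path}")
    return int(match.group(1))


def merge_parts(part_paths, out_path=None):
    """Concatenate the per-part tensors along the sample axis (dim 0)."""
    tensors = []
    for path in sorted(part_paths, key=part_index):  # part2 comes before part10
        data = torch.load(path, map_location="cpu")
        if not torch.is_tensor(data):
            # The single-tensor assumption does not hold for this file.
            raise TypeError(f"{path} holds {type(data).__name__}, not a tensor")
        tensors.append(data)
    merged = torch.cat(tensors, dim=0)
    if out_path is not None:
        torch.save(merged, out_path)  # optional: write the merged .pt file
    return merged
```

For example, `merge_parts(Path("data").glob("Train_intput_NVE_part*.pt"))`, where `data` is a hypothetical directory holding the 13 extracted files, would be expected to return a tensor with 180,000 samples along dimension 0.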

 

| File name (.pt format) | Purpose | # Samples | # Particles | Phase | Ensemble | Data shape |
| --- | --- | --- | --- | --- | --- | --- |
| Train_intput_NVE_part0_dpt15000.pt | training | 15000 | 64 | liquid | NVE | [15000, 3, 181, 64, 3] |
| Train_intput_NVE_part1_dpt15000.pt | training | 15000 | 64 | liquid | NVE | [15000, 3, 181, 64, 3] |
| Train_intput_NVE_part2_dpt15000.pt | training | 15000 | 64 | liquid | NVE | [15000, 3, 181, 64, 3] |
| Train_intput_NVE_part3_dpt13500.pt | training | 13500 | 64 | liquid | NVE | [13500, 3, 181, 64, 3] |
| Train_intput_NVE_part4_dpt13500.pt | training | 13500 | 64 | liquid | NVE | [13500, 3, 181, 64, 3] |
| Train_intput_NVE_part5_dpt13500.pt | training | 13500 | 64 | liquid | NVE | [13500, 3, 181, 64, 3] |
| Train_intput_NVE_part6_dpt13500.pt | training | 13500 | 64 | liquid | NVE | [13500, 3, 181, 64, 3] |
| Train_intput_NVE_part7_dpt13500.pt | training | 13500 | 64 | liquid | NVE | [13500, 3, 181, 64, 3] |
| Train_intput_NVE_part8_dpt13500.pt | training | 13500 | 64 | liquid | NVE | [13500, 3, 181, 64, 3] |
| Train_intput_NVE_part9_dpt13500.pt | training | 13500 | 64 | liquid | NVE | [13500, 3, 181, 64, 3] |
| Train_intput_NVE_part10_dpt13500.pt | training | 13500 | 64 | liquid | NVE | [13500, 3, 181, 64, 3] |
| Train_intput_NVE_part11_dpt13500.pt | training | 13500 | 64 | liquid | NVE | [13500, 3, 181, 64, 3] |
| Train_intput_NVE_part12_dpt13500.pt | training | 13500 | 64 | liquid | NVE | [13500, 3, 181, 64, 3] |
| Valid_intput_NVE_dpt20000.pt | validation | 20000 | 64 | liquid | NVE | [20000, 3, 181, 64, 3] |

Table 2. Detailed information for Training and validation sets
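As a quick consistency check on Table 2, the sample counts can be read back from the file names. The sketch below assumes the `dptN` suffix encodes the number of samples per file; this matches every row of the table but is not stated explicitly in the record. `samples_in` is an illustrative helper name.

```python
import re

# File names as listed in Table 2 (the "intput" spelling is the dataset's own).
TRAIN_FILES = (
    [f"Train_intput_NVE_part{i}_dpt15000.pt" for i in range(0, 3)]
    + [f"Train_intput_NVE_part{i}_dpt13500.pt" for i in range(3, 13)]
)
VALID_FILE = "Valid_intput_NVE_dpt20000.pt"


def samples_in(filename):
    """Read the sample count from the 'dptN' suffix (assumed meaning)."""
    match = re.search(r"dpt(\d+)\.pt$", filename)
    if match is None:
        raise ValueError(f"no dpt suffix in {filename}")
    return int(match.group(1))


total_train = sum(samples_in(name) for name in TRAIN_FILES)
print(len(TRAIN_FILES), total_train, samples_in(VALID_FILE))
# prints: 13 180000 20000
```

The totals agree with Table 1: 3 × 15,000 + 10 × 13,500 = 180,000 training samples and 20,000 validation samples.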

 

Table 2 gives details of the .pt files for the training and validation sets. Each trajectory in these files consists of the initial phase-space configuration generated by MC simulation, followed by configurations saved every 100 integration steps over a total of 18,000 Velocity-Verlet steps. The effective time step for LUFnet can be adjusted to τ=0.1.
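The number of time points per trajectory follows directly from the saving interval: the initial configuration plus one stored frame per 100 integration steps. A small sketch of that arithmetic (the function name `n_frames` is illustrative):

```python
def n_frames(total_steps, save_every=100):
    """Number of stored configurations: the initial one plus one per save interval."""
    if total_steps % save_every != 0:
        raise ValueError("total_steps must be a multiple of save_every")
    return total_steps // save_every + 1


print(n_frames(18_000))  # training and validation trajectories -> 181
print(n_frames(8_000))   # 8,000-step inference input trajectories -> 81
```

The results match the third dimension of the stored tensors: 181 for the training and validation sets and 81 for the inference inputs.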

After merging the separate training subsets into a single .pt file, the loaded training tensor has the shape [nsamples, 3, time points, nparticles, dim] = [180,000, 3, 181, 64, 3], while the validation tensor has the shape [20,000, 3, 181, 64, 3].

In this study, phase-space points were read off at larger time intervals (τ=0.05 for the 3D LJ system) and used as the training labels.

 

Inference dataset

The inference data are prepared as input sequences for LUFnet and, like the training data, are generated from the initial configurations. The inference datasets cover three system sizes: n = 64, 128, and 256 particles.

  • Input for LUFnet (3 folders inside zip) : Inference_input_LUFnet.zip

 

| File name (.pt format) | Purpose | # Samples | # Particles | Phase | Ensemble | Data shape |
| --- | --- | --- | --- | --- | --- | --- |
| Inference_input_NVE_LUFnet/n64rho0.85T0.9/n64rho0.85T0.9.pt | Input for LUFnet | 1000 | 64 | liquid | NVE | [1000, 3, 81, 64, 3] |
| Inference_input_NVE_LUFnet/n128rho0.85T0.9/n128rho0.85T0.9.pt | Input for LUFnet | 1000 | 128 | liquid | NVE | [1000, 3, 81, 128, 3] |
| Inference_input_NVE_LUFnet/n256rho0.85T0.9/n256rho0.85T0.9.pt | Input for LUFnet | 1000 | 256 | liquid | NVE | [1000, 3, 81, 256, 3] |

Table 3. Detailed information for inference set 

 

Table 3 gives details of the .pt files for the inference set. Each trajectory in these files consists of the initial phase-space configuration generated by MC simulation, followed by configurations saved every 100 integration steps over a total of 8,000 Velocity-Verlet steps. The effective time step for LUFnet can be adjusted to τ=0.1.

The loaded LUFnet input tensor has shape [1000, 3, 81, nparticles, 3] for each condition.
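The file layout in Table 3 follows a regular naming pattern, so the paths and expected shapes can be generated rather than typed by hand. A minimal sketch, assuming the archive is extracted so that `Inference_input_NVE_LUFnet/` sits under a root directory of your choice (`root`, `inference_input_path`, and `expected_shape` are illustrative names):

```python
from pathlib import PurePosixPath

PARTICLE_COUNTS = (64, 128, 256)


def inference_input_path(n, root="."):
    """Path of the LUFnet input file for n particles at rho=0.85, T=0.9 (Table 3)."""
    if n not in PARTICLE_COUNTS:
        raise ValueError(f"no inference set for n={n}")
    tag = f"n{n}rho0.85T0.9"
    return PurePosixPath(root) / "Inference_input_NVE_LUFnet" / tag / f"{tag}.pt"


def expected_shape(n):
    """Shape reported in Table 3: [samples, 3, time points, particles, dim]."""
    return (1000, 3, 81, n, 3)


for n in PARTICLE_COUNTS:
    print(inference_input_path(n), expected_shape(n))
```

Comparing `expected_shape(n)` with the shape of the tensor actually loaded is a cheap way to catch a wrong or partially extracted file before running inference.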

 

Rollout data for the Velocity-Verlet algorithm for comparison with LUFnet

Simulations were performed with both LUFnet and the Velocity-Verlet algorithm up to t = 100. For comparison with LUFnet (time step τ = 0.05), the Velocity-Verlet rollout starts at t = 0.35: the configuration at t = 0.35 is taken from the LUFnet input sequence and defined as index 0. A time step of τ = 0.001 was used for a total of 100,000 rollout steps; these data are provided here.

  • Rollout steps for Velocity-Verlet (3 folders inside zip) : Inference_Rollout_steps_Velocity_verlet.zip

 

| File name (.pt format) | Purpose | # Samples | # Particles | Phase | Ensemble | Data shape |
| --- | --- | --- | --- | --- | --- | --- |
| Rollout_steps_NVT_Velocity_verlet/n64rho0.85T0.9/n64rho0.85T0.9gamma20.pt | Rollout steps for Velocity-Verlet (to compare with LUFnet) | 1000 | 64 | liquid | NVT | [1000, 3, 1001, 64, 3] |
| Rollout_steps_NVT_Velocity_verlet/n128rho0.85T0.9/n128rho0.85T0.9gamma20.pt | Rollout steps for Velocity-Verlet (to compare with LUFnet) | 1000 | 128 | liquid | NVT | [1000, 3, 1001, 128, 3] |
| Rollout_steps_NVT_Velocity_verlet/n256rho0.85T0.9/n256rho0.85T0.9gamma20.pt | Rollout steps for Velocity-Verlet (to compare with LUFnet) | 1000 | 256 | liquid | NVT | [1000, 3, 1001, 256, 3] |

Table 4. Detailed information for rollout steps for Velocity-Verlet (to compare with LUFnet)

 

Table 4 gives details of the .pt files for the Velocity-Verlet rollout steps. Within these files, the loaded tensor has shape [1000, 3, 1001, nparticles, 3] for each condition.

For LUFnet, the time integration step was set to τ = 0.05, corresponding to 2,000 rollout steps in the implementation provided in our GitHub repository.
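To compare the two rollouts, each stored Velocity-Verlet frame has to be matched with the LUFnet step that lands on the same simulation time. The sketch below works through that arithmetic. It assumes the Velocity-Verlet frames were saved every 100 integration steps, which is inferred from 100,000 steps yielding 1,001 stored time points and is not stated explicitly in this record; the constant and function names are illustrative. Step counts are kept as integers so the matching does not depend on floating-point rounding.

```python
T0_STEPS = 350             # index 0 is t = 0.35, i.e. 350 steps of 0.001
VV_DT = 0.001              # Velocity-Verlet integration step (from the text)
VV_SAVE_EVERY = 100        # ASSUMPTION: inferred from 100,000 steps -> 1,001 frames
LUFNET_STEPS_PER_TAU = 50  # LUFnet tau = 0.05, i.e. 50 steps of 0.001


def vv_frame_time(frame):
    """Simulation time of a stored Velocity-Verlet frame (index 0 is t = 0.35)."""
    return (T0_STEPS + frame * VV_SAVE_EVERY) * VV_DT


def lufnet_step_for_vv_frame(frame):
    """LUFnet rollout step that coincides in time with the given VV frame."""
    steps = frame * VV_SAVE_EVERY
    if steps % LUFNET_STEPS_PER_TAU != 0:
        raise ValueError("frame does not coincide with a LUFnet step")
    return steps // LUFNET_STEPS_PER_TAU


print(lufnet_step_for_vv_frame(1000))  # last VV frame -> LUFnet step 2000
```

Under this assumption every stored Velocity-Verlet frame k coincides with LUFnet rollout step 2k, so the final frame (index 1000) lines up with the 2,000th LUFnet step.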

The 1,000 inference samples are split into 5 groups of 200 samples each for the RDF and energy calculations. Metrics are computed for each group, and the mean and standard deviation across the 5 groups are used for the plots and the table (Figures 3 and 4 and Table 2 of the LUFnet paper).
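The grouping described above can be sketched in plain Python. The example assumes the groups are consecutive blocks of 200 samples and that the spread is the sample standard deviation; the record does not specify either choice. `metric` stands in for whatever per-group quantity is being analyzed, and the toy numbers are placeholders, not results from the paper.

```python
from statistics import mean, stdev


def split_into_groups(samples, n_groups=5):
    """Split a sequence into n_groups consecutive, equally sized blocks."""
    if len(samples) % n_groups != 0:
        raise ValueError("number of samples must be divisible by n_groups")
    size = len(samples) // n_groups
    return [samples[i * size:(i + 1) * size] for i in range(n_groups)]


def group_statistics(samples, metric, n_groups=5):
    """Apply `metric` to each group, then return (mean, std) across the groups."""
    values = [metric(group) for group in split_into_groups(samples, n_groups)]
    return mean(values), stdev(values)  # stdev is the sample standard deviation


# Toy data standing in for 1,000 per-sample energies (not real results).
toy_energies = [float(i % 7) for i in range(1000)]
avg, spread = group_statistics(toy_energies, metric=mean)
```

Passing a different `metric` (for example, a function that builds an RDF from the 200 samples of a group and returns a scalar summary) reuses the same grouping logic.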

Files

Files (129.3 GB)

| MD5 checksum | Size |
| --- | --- |
| 191f3098e461d8e9242c3180e7eafd96 | 1.7 GB |
| e04301b11064e7da37924e965842cb7b | 20.8 GB |
| 00919123c9409eec81bcdb36d7efbe3d | 8.0 GB |
| ada4f20f3b1909b58ed8fda14a61b3dc | 8.0 GB |
| 300b48171eb83aa8ae848a3caa6fd499 | 7.2 GB |
| 10ad256c5c1a7cc5bacd7108179a5cda | 7.2 GB |
| 5e3d4f12bf7484a607f1054307f976f6 | 7.2 GB |
| 64b06d6f9ffc9b7cb757f745f1099507 | 8.0 GB |
| fc78407703a4376fb3c681f996ba768d | 7.2 GB |
| ee43b337b0a0f38b226622bfe946d5dc | 7.2 GB |
| 491dcd15f9e4b622f2b3b742a4072d25 | 7.2 GB |
| 143efd0e62b3433722270589d0be2b5a | 7.2 GB |
| 4ee7af36d7b7df34ab6e17b26dea499f | 7.2 GB |
| 3e3722fbb68097e28d2025b81159901c | 7.2 GB |
| fd25574c370080063ded87fbe8329a66 | 7.2 GB |
| 9aac6c8b3eeb7c8488de14a78ae0a74f | 10.7 GB |
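After downloading, each archive can be checked against the MD5 checksums listed above. A minimal sketch using only the Python standard library follows. Because the listing does not pair checksums with file names, the example simply tests whether a file's digest appears in the published set; only two entries are reproduced here for brevity, and `md5_of` and `is_published` are illustrative helper names.

```python
import hashlib

# Two of the checksums from the file listing; extend with the remaining entries.
PUBLISHED_MD5 = {
    "191f3098e461d8e9242c3180e7eafd96",
    "e04301b11064e7da37924e965842cb7b",
}


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks (1 MiB by default)."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_published(path):
    """True if the file's digest matches one of the published checksums."""
    return md5_of(path) in PUBLISHED_MD5
```

Reading in chunks keeps memory use flat even for the multi-gigabyte archives; a call such as `is_published("Valid_set.zip")` returning False would suggest an incomplete or corrupted download.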

Additional details

Funding

Agency for Science, Technology and Research