Simulated datasets for LUFNet-CPC2026

Park, Sojeong; Liu, Wei; Lauw, Simon Julian; Kwak, Wooseop; Verma, Chandra shekhar; Lee, Hwee Kuan

doi:10.5281/zenodo.18241955

Published February 6, 2026 | Version v1

Dataset Open


1. Agency for Science, Technology and Research
2. National University of Singapore
3. Chosun University
4. Nanyang Technological University
5. International Research Laboratory on Artificial Intelligence
6. Centre for Frontier AI Research

Links to code and LUFnet paper:

1. [LUFnet Code](https://github.com/Gobliu/LUFNet-CPC2026.git)

2. [Scalable neural network driven molecular dynamics simulation](https://doi.org/10.1016/j.cpc.2026.110036)

The .zip files contain the full dataset used in our LUFnet paper.

Category	Count (.pt files)	Total samples
Training set	13 files	180,000
Validation set	1 file	20,000
Inference set: input for LUFnet	3 files	3,000
Rollout steps for Velocity-Verlet (to compare with LUFnet)	3 files	3,000

Table 1. Dataset summary

The dataset is provided in .pt format, compressed as .zip files, and includes the training and validation datasets for LUFnet, as well as the inference dataset used as input for LUFnet.

The datasets were prepared following the procedure described in the Methods section of the LUFnet paper, using Monte Carlo–generated initial configurations and the Velocity-Verlet algorithm.

Training and validation datasets

Due to the large storage requirements, the total training set is split into multiple .pt files and provided in compressed zip format. The training data consist of 13 subsets, which can be merged into a single .pt file to load the full training set. The validation set is stored as a single .pt file.

Training set (13 files) : Train_set_part*.zip
Validation set (1 file) : Valid_set.zip

filename (.pt format)	purpose	# Samples	# Particles	Phase	Ensemble	Data shape
Train_intput_NVE_part0_dpt15000.pt	training	15000	64	liquid	NVE	[15000, 3, 181, 64, 3]
Train_intput_NVE_part1_dpt15000.pt	training	15000	64	liquid	NVE	[15000, 3, 181, 64, 3]
Train_intput_NVE_part2_dpt15000.pt	training	15000	64	liquid	NVE	[15000, 3, 181, 64, 3]
Train_intput_NVE_part3_dpt13500.pt	training	13500	64	liquid	NVE	[13500, 3, 181, 64, 3]
Train_intput_NVE_part4_dpt13500.pt	training	13500	64	liquid	NVE	[13500, 3, 181, 64, 3]
Train_intput_NVE_part5_dpt13500.pt	training	13500	64	liquid	NVE	[13500, 3, 181, 64, 3]
Train_intput_NVE_part6_dpt13500.pt	training	13500	64	liquid	NVE	[13500, 3, 181, 64, 3]
Train_intput_NVE_part7_dpt13500.pt	training	13500	64	liquid	NVE	[13500, 3, 181, 64, 3]
Train_intput_NVE_part8_dpt13500.pt	training	13500	64	liquid	NVE	[13500, 3, 181, 64, 3]
Train_intput_NVE_part9_dpt13500.pt	training	13500	64	liquid	NVE	[13500, 3, 181, 64, 3]
Train_intput_NVE_part10_dpt13500.pt	training	13500	64	liquid	NVE	[13500, 3, 181, 64, 3]
Train_intput_NVE_part11_dpt13500.pt	training	13500	64	liquid	NVE	[13500, 3, 181, 64, 3]
Train_intput_NVE_part12_dpt13500.pt	training	13500	64	liquid	NVE	[13500, 3, 181, 64, 3]
Valid_intput_NVE_dpt20000.pt	validation	20000	64	liquid	NVE	[20000, 3, 181, 64, 3]

Table 2. Detailed information for the training and validation sets

Table 2 lists the .pt files for the training and validation sets. Each trajectory contains the initial phase-space configuration generated by MC simulation, followed by configurations saved every 100 integration steps over a total of 18,000 Velocity-Verlet steps, which gives 181 time points. The effective time step for LUFnet can be adjusted to τ=0.1.
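The number of time points in the data shape follows directly from the saving schedule. The short check below is only an illustration of that arithmetic; the function name is hypothetical and does not come from the LUFnet repository.

```python
def saved_time_points(total_steps, save_every):
    """Count stored configurations: one initial frame plus one per save interval."""
    if total_steps % save_every != 0:
        raise ValueError("total_steps must be a multiple of save_every")
    return total_steps // save_every + 1


# Training and validation sets: 18,000 steps, saved every 100 steps.
train_points = saved_time_points(18_000, 100)
# Inference input set: 8,000 steps, saved every 100 steps.
inference_points = saved_time_points(8_000, 100)
print(train_points, inference_points)
```

These values match the third axis of the tensors in the training/validation and inference tables (181 and 81).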

After merging the separate training subsets into a single .pt file, the loaded training tensor has the shape [nsamples, 3, time points, nparticles, dim] = [180,000, 3, 181, 64, 3], while the validation tensor has the shape [20,000, 3, 181, 64, 3].
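One possible way to perform the merge is sketched below. It assumes that the 13 archives have been unzipped into a single flat directory, that each .pt file holds one tensor loadable with `torch.load`, and that enough memory is available to hold the full training set. The helper names and the output file name `Train_merged.pt` are hypothetical; the repository may handle this differently.

```python
from pathlib import Path

# Sample counts per training part, copied from Table 2.
PART_SAMPLES = [15000] * 3 + [13500] * 10


def training_part_names():
    """Return the 13 training file names from Table 2, in part order.

    The spelling "intput" is kept because it is how the files are named.
    """
    return [
        f"Train_intput_NVE_part{i}_dpt{n}.pt"
        for i, n in enumerate(PART_SAMPLES)
    ]


def merge_training_parts(data_dir, out_name="Train_merged.pt"):
    """Concatenate all parts along the sample axis and save one .pt file.

    `out_name` is a hypothetical name chosen for this sketch.
    """
    import torch  # imported here so the name helper works without PyTorch

    data_dir = Path(data_dir)
    parts = [
        torch.load(data_dir / name, map_location="cpu")
        for name in training_part_names()
    ]
    merged = torch.cat(parts, dim=0)  # expected shape: [180000, 3, 181, 64, 3]
    torch.save(merged, data_dir / out_name)
    return merged
```

Calling `merge_training_parts("path/to/unzipped")` would then return the merged tensor and write it next to the parts.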

In this study, phase-space points were read off at larger time intervals (τ=0.05 for the 3D LJ system) and used as the training labels.

Inference dataset

The inference data are prepared as input sequences for LUFnet and, like the training data, are generated from the initial configurations. The inference datasets cover several system sizes: n = 64, 128, and 256 particles.

Input for LUFnet (3 folders inside zip) : Inference_input_LUFnet.zip

filename (.pt format)	purpose	# Samples	# Particles	Phase	Ensemble	Data shape
Inference_input_NVE_LUFnet/n64rho0.85T0.9/n64rho0.85T0.9.pt	Input for LUFnet	1000	64	liquid	NVE	[1000, 3, 81, 64, 3]
Inference_input_NVE_LUFnet/n128rho0.85T0.9/n128rho0.85T0.9.pt	Input for LUFnet	1000	128	liquid	NVE	[1000, 3, 81, 128, 3]
Inference_input_NVE_LUFnet/n256rho0.85T0.9/n256rho0.85T0.9.pt	Input for LUFnet	1000	256	liquid	NVE	[1000, 3, 81, 256, 3]

Table 3. Detailed information for inference set

Table 3 lists the .pt files for the inference set. Each trajectory contains the initial phase-space configuration generated by MC simulation, followed by configurations saved every 100 integration steps over a total of 8,000 Velocity-Verlet steps. The effective time step for LUFnet can be adjusted to τ=0.1.

The loaded LUFnet input tensor has the shape [1000, 3, 81, nparticles, 3] for each condition.

Rollout data for the Velocity-Verlet algorithm for comparison with LUFnet

Simulations were performed using both LUFnet and the Velocity-Verlet algorithm up to t = 100. For comparison with LUFnet (time step τ = 0.05), the Velocity-Verlet rollout starts at t = 0.35. The configuration at t = 0.35 is taken from the LUFnet input sequence and defined as index 0. A time step of τ = 0.001 was used for a total of 100,000 rollout steps; these data are provided here.

Rollout steps for Velocity-Verlet (3 folders inside zip) : Inference_Rollout_steps_Velocity_verlet.zip

filename (.pt format)	purpose	# Samples	# Particles	Phase	Ensemble	Data shape
Rollout_steps_NVT_Velocity_verlet/n64rho0.85T0.9/n64rho0.85T0.9gamma20.pt	Rollout steps for Velocity-Verlet (to compare with LUFnet)	1000	64	liquid	NVT	[1000, 3, 1001, 64, 3]
Rollout_steps_NVT_Velocity_verlet/n128rho0.85T0.9/n128rho0.85T0.9gamma20.pt	Rollout steps for Velocity-Verlet (to compare with LUFnet)	1000	128	liquid	NVT	[1000, 3, 1001, 128, 3]
Rollout_steps_NVT_Velocity_verlet/n256rho0.85T0.9/n256rho0.85T0.9gamma20.pt	Rollout steps for Velocity-Verlet (to compare with LUFnet)	1000	256	liquid	NVT	[1000, 3, 1001, 256, 3]

Table 4. Detailed information for rollout steps for Velocity-Verlet (to compare with LUFnet)

Table 4 lists the .pt files for the Velocity-Verlet rollout steps. Within these files, the loaded tensor for the Velocity-Verlet rollout data has the shape [1000, 3, 1001, nparticles, 3] for each condition.

For LUFnet, the time integration step was set to τ = 0.05, corresponding to 2,000 rollout steps in the implementation provided in our GitHub repository.
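The step counts quoted for the two integrators can be cross-checked with integer arithmetic. The sketch below assumes that the 1001 stored Velocity-Verlet time points are evenly spaced and that both rollouts share index 0 at t = 0.35; neither assumption is stated explicitly in this description, and the function names are illustrative only.

```python
VV_STEPS = 100_000          # Velocity-Verlet rollout steps at tau = 0.001
VV_FRAMES = 1001            # time points stored per sample (Table 4)
LUF_STEP_IN_VV_STEPS = 50   # tau = 0.05 expressed in units of 0.001


def vv_steps_per_frame():
    """Velocity-Verlet steps between stored frames, assuming even spacing."""
    return VV_STEPS // (VV_FRAMES - 1)


def lufnet_step_for_frame(frame):
    """LUFnet rollout step that lands on the same time as a stored VV frame."""
    vv_step = frame * vv_steps_per_frame()
    if vv_step % LUF_STEP_IN_VV_STEPS != 0:
        raise ValueError("frame does not coincide with a LUFnet step")
    return vv_step // LUF_STEP_IN_VV_STEPS


print(vv_steps_per_frame())         # VV steps between stored frames
print(lufnet_step_for_frame(1000))  # LUFnet step matching the last VV frame
```

Under these assumptions the last stored Velocity-Verlet frame coincides with LUFnet step 2,000, consistent with the rollout length quoted above.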

The 1,000 inference samples are split into 5 groups of 200 samples each for RDF and energy calculations. Metrics are computed for each group, and the mean and standard deviation across the 5 groups are used for the plots and the table, as shown in Figures 3 and 4 and Table 2 of the LUFnet paper.
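The grouping scheme can be illustrated with a minimal sketch. It assumes that the groups are consecutive blocks of samples, that the per-group metric is the average of one scalar per sample, and that the sample (n-1) standard deviation is used; the paper may define these differently, and the function name is hypothetical.

```python
from statistics import mean, stdev


def group_statistics(per_sample_values, n_groups=5):
    """Return (mean, standard deviation) of the per-group averages.

    Groups are consecutive, equally sized blocks of the input list.
    """
    n = len(per_sample_values)
    if n % n_groups != 0:
        raise ValueError("number of samples must be divisible by n_groups")
    size = n // n_groups
    group_means = [
        mean(per_sample_values[g * size:(g + 1) * size])
        for g in range(n_groups)
    ]
    return mean(group_means), stdev(group_means)


# Placeholder values standing in for a per-sample metric such as an energy.
values = [float(i) for i in range(1000)]
avg, spread = group_statistics(values)
```

With 1,000 samples and 5 groups, each group average is taken over 200 samples, and the spread is computed from only 5 numbers.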

Files (129.3 GB)

Name	MD5	Size
Inference_input_LUFnet.zip	191f3098e461d8e9242c3180e7eafd96	1.7 GB
Inference_Rollout_steps_Velocity_verlet.zip	e04301b11064e7da37924e965842cb7b	20.8 GB
Train_set_part0.zip	00919123c9409eec81bcdb36d7efbe3d	8.0 GB
Train_set_part1.zip	ada4f20f3b1909b58ed8fda14a61b3dc	8.0 GB
Train_set_part10.zip	300b48171eb83aa8ae848a3caa6fd499	7.2 GB
Train_set_part11.zip	10ad256c5c1a7cc5bacd7108179a5cda	7.2 GB
Train_set_part12.zip	5e3d4f12bf7484a607f1054307f976f6	7.2 GB
Train_set_part2.zip	64b06d6f9ffc9b7cb757f745f1099507	8.0 GB
Train_set_part3.zip	fc78407703a4376fb3c681f996ba768d	7.2 GB
Train_set_part4.zip	ee43b337b0a0f38b226622bfe946d5dc	7.2 GB
Train_set_part5.zip	491dcd15f9e4b622f2b3b742a4072d25	7.2 GB
Train_set_part6.zip	143efd0e62b3433722270589d0be2b5a	7.2 GB
Train_set_part7.zip	4ee7af36d7b7df34ab6e17b26dea499f	7.2 GB
Train_set_part8.zip	3e3722fbb68097e28d2025b81159901c	7.2 GB
Train_set_part9.zip	fd25574c370080063ded87fbe8329a66	7.2 GB
Valid_set.zip	9aac6c8b3eeb7c8488de14a78ae0a74f	10.7 GB
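Because the archives are large, it can be worth checking each download against its MD5 checksum before unzipping. The following is a minimal sketch using Python's standard `hashlib`; the function names are illustrative, and the file path in the usage note is a placeholder for wherever the archive was saved.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Return True if the file's MD5 equals the listed checksum."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, `matches_checksum("Valid_set.zip", "9aac6c8b3eeb7c8488de14a78ae0a74f")` should return `True` for an intact download of the validation archive.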

Additional details

Agency for Science, Technology and Research
