
Preprint Open Access

Single-chain CG Polymers in Simulate Time-integrated Coarse-grained Molecular Dynamics with Geometric Machine Learning

Fu, Xiang; Xie, Tian; Rebello, Nathan; Olsen, Bradley; Jaakkola, Tommi


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam##2200000uu#4500</leader>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">molecular dynamics</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">polymer</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">coarse graining</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">machine learning</subfield>
  </datafield>
  <controlfield tag="005">20220628134925.0</controlfield>
  <controlfield tag="001">6764836</controlfield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">MIT</subfield>
    <subfield code="a">Xie, Tian</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">MIT</subfield>
    <subfield code="a">Rebello, Nathan</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">MIT</subfield>
    <subfield code="a">Olsen, Bradley</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">MIT</subfield>
    <subfield code="a">Jaakkola, Tommi</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">3678717622</subfield>
    <subfield code="z">md5:8c5c003fbcfd48edc75a20ca6cb2a3e3</subfield>
    <subfield code="u">https://zenodo.org/record/6764836/files/polymer_test_5M.tar.gz</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">370981218</subfield>
    <subfield code="z">md5:644ea9f1efbd1df2e355f054cfdab7dd</subfield>
    <subfield code="u">https://zenodo.org/record/6764836/files/polymer_test.tar.gz</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">9482254610</subfield>
    <subfield code="z">md5:cc8279c05a75b267ed000b8b7b2c3e96</subfield>
    <subfield code="u">https://zenodo.org/record/6764836/files/polymer_train.tar.gz</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2022-04-21</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire</subfield>
    <subfield code="o">oai:zenodo.org:6764836</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">MIT</subfield>
    <subfield code="0">(orcid)0000-0001-7480-6312</subfield>
    <subfield code="a">Fu, Xiang</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Single-chain CG Polymers in Simulate Time-integrated Coarse-grained Molecular Dynamics with Geometric Machine Learning</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/licenses/by/4.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;The preprocessed single-chain coarse-grained polymer dataset described in the paper &amp;quot;Simulate Time-integrated Coarse-grained Molecular Dynamics with Geometric Machine Learning&amp;quot;.&lt;/p&gt;

&lt;p&gt;Paper:&amp;nbsp;&lt;a href="https://arxiv.org/abs/2204.10348"&gt;https://arxiv.org/abs/2204.10348&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Code:&amp;nbsp;&lt;a href="https://github.com/kyonofx/mlcgmd/"&gt;https://github.com/kyonofx/mlcgmd/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Website:&amp;nbsp;&lt;a href="https://xiangfu.co/mlcgmd"&gt;https://xiangfu.co/mlcgmd&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Video:&amp;nbsp;&lt;a href="https://youtu.be/l3aGVjQezsc"&gt;https://youtu.be/l3aGVjQezsc&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dataset description:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We preprocessed and down-sampled the data to reduce the otherwise very large dataset size. The uploaded dataset consists of three components:&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;&lt;em&gt;polymer_train.tar.gz&lt;/em&gt;&amp;nbsp;contains MD trajectories for 100 class-I training polymers, each 50k tau&amp;nbsp;long and recorded every 5 tau (10k steps).&lt;/li&gt;
	&lt;li&gt;&lt;em&gt;polymer_test_5M.tar.gz&lt;/em&gt; contains MD trajectories for 40 class-II testing polymers, each 4.95&amp;nbsp;million&amp;nbsp;tau long and recorded every 500 tau (9900 steps). This data is used for final evaluation.&lt;/li&gt;
	&lt;li&gt;&lt;em&gt;polymer_test.tar.gz&lt;/em&gt;&amp;nbsp;contains MD trajectories for 40 class-II testing polymers, each 5k tau long and recorded every 5 tau (1k steps). This data&amp;nbsp;is&amp;nbsp;used for initializing the learned simulator at test time.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Paper Abstract:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Molecular dynamics (MD) simulation is the workhorse of various scientific domains but is limited by high computational cost. Learning-based force fields have made major progress in accelerating ab-initio MD simulation but are still not fast enough for many real-world applications that require long-time MD simulation. In this paper, we adopt a different machine learning approach where we coarse-grain a physical system using graph clustering, and model the system evolution with a very large time-integration step using graph neural networks. A novel score-based GNN refinement module resolves the long-standing challenge of long-time simulation instability. Despite being trained only on short MD trajectory data, our learned simulator can generalize to unseen novel systems and simulate for much longer than the training trajectories. Properties requiring 10-100 ns level long-time dynamics can be accurately recovered at several-orders-of-magnitude higher speed than classical force fields. We demonstrate the effectiveness of our method on two realistic complex systems: (1) single-chain coarse-grained polymers in implicit solvent; (2) multi-component Li-ion polymer electrolyte systems.&lt;/p&gt;

&lt;p&gt;If you find this dataset useful, please consider citing the following in your paper:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;@article{fu2022simulate,
  title={Simulate Time-integrated Coarse-grained Molecular Dynamics with Geometric Machine Learning},
  author={Fu, Xiang and Xie, Tian and Rebello, Nathan J and Olsen, Bradley D and Jaakkola, Tommi},
  journal={arXiv preprint arXiv:2204.10348},
  year={2022}
}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;And:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;@article{webb2020targeted,
  title={Targeted sequence design within the coarse-grained polymer genome},
  author={Webb, Michael A and Jackson, Nicholas E and Gil, Phwey S and de Pablo, Juan J},
  journal={Science Advances},
  volume={6},
  number={43},
  pages={eabc6216},
  year={2020},
  publisher={American Association for the Advancement of Science}
}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">url</subfield>
    <subfield code="i">isDescribedBy</subfield>
    <subfield code="a">https://arxiv.org/abs/2204.10348</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">url</subfield>
    <subfield code="i">isCompiledBy</subfield>
    <subfield code="a">https://github.com/kyonofx/mlcgmd/</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.6764835</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.6764836</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">publication</subfield>
    <subfield code="b">preprint</subfield>
  </datafield>
</record>
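
The 856 fields in the record above list each archive's URL (subfield u), size in bytes (subfield s), and MD5 checksum (subfield z). As a minimal sketch, not an official Zenodo tool, the Python below reads those fields from an abbreviated copy of the record and checks a downloaded archive against them. The function names list_files and verify_file are hypothetical, and the usage loop assumes the archives were saved in the current directory under their original file names.

```python
import hashlib
import os
import xml.etree.ElementTree as ET

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Abbreviated copy of the record: only the three 856 (file) fields are kept.
RECORD_XML = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">3678717622</subfield>
    <subfield code="z">md5:8c5c003fbcfd48edc75a20ca6cb2a3e3</subfield>
    <subfield code="u">https://zenodo.org/record/6764836/files/polymer_test_5M.tar.gz</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">370981218</subfield>
    <subfield code="z">md5:644ea9f1efbd1df2e355f054cfdab7dd</subfield>
    <subfield code="u">https://zenodo.org/record/6764836/files/polymer_test.tar.gz</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">9482254610</subfield>
    <subfield code="z">md5:cc8279c05a75b267ed000b8b7b2c3e96</subfield>
    <subfield code="u">https://zenodo.org/record/6764836/files/polymer_train.tar.gz</subfield>
  </datafield>
</record>"""


def list_files(record_xml):
    """Return one dict per 856 field: url, size in bytes, md5 hex digest."""
    root = ET.fromstring(record_xml)
    files = []
    for field in root.findall("marc:datafield[@tag='856']", MARC_NS):
        # Map subfield code -> text, e.g. {"s": "...", "z": "md5:...", "u": "..."}
        sub = {s.get("code"): s.text for s in field.findall("marc:subfield", MARC_NS)}
        files.append({
            "url": sub["u"],
            "size": int(sub["s"]),
            "md5": sub["z"].split(":", 1)[1],  # drop the "md5:" prefix
        })
    return files


def verify_file(path, expected_size, expected_md5, chunk_size=1 << 20):
    """Compare the file size first (cheap), then stream the file through MD5."""
    if os.path.getsize(path) != expected_size:
        return False
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read in chunks so multi-gigabyte archives never sit fully in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_md5


if __name__ == "__main__":
    for entry in list_files(RECORD_XML):
        local = os.path.basename(entry["url"])
        if not os.path.exists(local):
            print(local, "not downloaded")
        elif verify_file(local, entry["size"], entry["md5"]):
            print(local, "OK")
        else:
            print(local, "MISMATCH")
```

Checking the size before hashing is a design choice: a truncated download is rejected immediately, without reading several gigabytes from disk.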