Preprint Open Access

Single-chain CG Polymers in Simulate Time-integrated Coarse-grained Molecular Dynamics with Geometric Machine Learning

Fu, Xiang; Xie, Tian; Rebello, Nathan; Olsen, Bradley; Jaakkola, Tommi


DataCite XML Export

<?xml version='1.0' encoding='utf-8'?>
<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datacite.org/schema/kernel-4" xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4.1/metadata.xsd">
  <identifier identifierType="DOI">10.5281/zenodo.6764836</identifier>
  <creators>
    <creator>
      <creatorName>Fu, Xiang</creatorName>
      <givenName>Xiang</givenName>
      <familyName>Fu</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0001-7480-6312</nameIdentifier>
      <affiliation>MIT</affiliation>
    </creator>
    <creator>
      <creatorName>Xie, Tian</creatorName>
      <givenName>Tian</givenName>
      <familyName>Xie</familyName>
      <affiliation>MIT</affiliation>
    </creator>
    <creator>
      <creatorName>Rebello, Nathan</creatorName>
      <givenName>Nathan</givenName>
      <familyName>Rebello</familyName>
      <affiliation>MIT</affiliation>
    </creator>
    <creator>
      <creatorName>Olsen, Bradley</creatorName>
      <givenName>Bradley</givenName>
      <familyName>Olsen</familyName>
      <affiliation>MIT</affiliation>
    </creator>
    <creator>
      <creatorName>Jaakkola, Tommi</creatorName>
      <givenName>Tommi</givenName>
      <familyName>Jaakkola</familyName>
      <affiliation>MIT</affiliation>
    </creator>
  </creators>
  <titles>
    <title>Single-chain CG Polymers in Simulate Time-integrated Coarse-grained Molecular Dynamics with Geometric Machine Learning</title>
  </titles>
  <publisher>Zenodo</publisher>
  <publicationYear>2022</publicationYear>
  <subjects>
    <subject>molecular dynamics</subject>
    <subject>polymer</subject>
    <subject>coarse graining</subject>
    <subject>machine learning</subject>
  </subjects>
  <dates>
    <date dateType="Issued">2022-04-21</date>
  </dates>
  <resourceType resourceTypeGeneral="Preprint"/>
  <alternateIdentifiers>
    <alternateIdentifier alternateIdentifierType="url">https://zenodo.org/record/6764836</alternateIdentifier>
  </alternateIdentifiers>
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="URL" relationType="IsDescribedBy" resourceTypeGeneral="Preprint">https://arxiv.org/abs/2204.10348</relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="URL" relationType="IsCompiledBy" resourceTypeGeneral="Software">https://github.com/kyonofx/mlcgmd/</relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsVersionOf">10.5281/zenodo.6764835</relatedIdentifier>
  </relatedIdentifiers>
  <rightsList>
    <rights rightsURI="https://creativecommons.org/licenses/by/4.0/legalcode">Creative Commons Attribution 4.0 International</rights>
    <rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights>
  </rightsList>
  <descriptions>
    <description descriptionType="Abstract">&lt;p&gt;The preprocessed single-chain coarse-grained polymer dataset described in the paper &amp;quot;Simulate Time-integrated Coarse-grained Molecular Dynamics with Geometric Machine Learning&amp;quot;.&lt;/p&gt;

&lt;p&gt;Paper:&amp;nbsp;&lt;a href="https://arxiv.org/abs/2204.10348"&gt;https://arxiv.org/abs/2204.10348&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Code:&amp;nbsp;&lt;a href="https://github.com/kyonofx/mlcgmd/"&gt;https://github.com/kyonofx/mlcgmd/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Website:&amp;nbsp;&lt;a href="https://xiangfu.co/mlcgmd"&gt;https://xiangfu.co/mlcgmd&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Video:&amp;nbsp;&lt;a href="https://youtu.be/l3aGVjQezsc"&gt;https://youtu.be/l3aGVjQezsc&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dataset description:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The raw trajectories have been preprocessed and down-sampled to reduce the dataset size. The uploaded dataset consists of three components:&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;&lt;em&gt;polymer_train.tar.gz&lt;/em&gt;&amp;nbsp;contains MD trajectories for 100 class-I training polymers, each 50k tau&amp;nbsp;long and recorded every 5 tau (10k recorded steps).&lt;/li&gt;
	&lt;li&gt;&lt;em&gt;polymer_test_5M.tar.gz&lt;/em&gt; contains MD trajectories for 40 class-II testing polymers, each 4.95&amp;nbsp;million&amp;nbsp;tau long and recorded every 500 tau (9900 recorded steps). This data is used for final evaluation.&lt;/li&gt;
	&lt;li&gt;&lt;em&gt;polymer_test.tar.gz&lt;/em&gt;&amp;nbsp;contains MD trajectories for the 40 class-II testing polymers, each 5k tau long and recorded every 5 tau (1k recorded steps). This data&amp;nbsp;is&amp;nbsp;used for initializing the learned simulator at test time.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Paper Abstract:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Molecular dynamics (MD) simulation is the workhorse of various scientific domains but is limited by high computational cost. Learning-based force fields have made major progress in accelerating ab-initio MD simulation but are still not fast enough for many real-world applications that require long-time MD simulation. In this paper, we adopt a different machine learning approach where we coarse-grain a physical system using graph clustering, and model the system evolution with a very large time-integration step using graph neural networks. A novel score-based GNN refinement module resolves the long-standing challenge of long-time simulation instability. Despite only trained with short MD trajectory data, our learned simulator can generalize to unseen novel systems and simulate for much longer than the training trajectories. Properties requiring 10-100 ns level long-time dynamics can be accurately recovered at several-orders-of-magnitude higher speed than classical force fields. We demonstrate the effectiveness of our method on two realistic complex systems: (1) single-chain coarse-grained polymers in implicit solvent; (2) multi-component Li-ion polymer electrolyte systems.&lt;/p&gt;

&lt;p&gt;If you find this dataset useful, please consider citing the following paper:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;@article{fu2022simulate,
  title={Simulate Time-integrated Coarse-grained Molecular Dynamics with Geometric Machine Learning},
  author={Fu, Xiang and Xie, Tian and Rebello, Nathan J and Olsen, Bradley D and Jaakkola, Tommi},
  journal={arXiv preprint arXiv:2204.10348},
  year={2022}
}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;And:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;@article{webb2020targeted,
  title={Targeted sequence design within the coarse-grained polymer genome},
  author={Webb, Michael A and Jackson, Nicholas E and Gil, Phwey S and de Pablo, Juan J},
  journal={Science advances},
  volume={6},
  number={43},
  pages={eabc6216},
  year={2020},
  publisher={American Association for the Advancement of Science}
}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;</description>
  </descriptions>
</resource>
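
The step counts quoted in the dataset description follow from dividing each trajectory length by its recording interval. A minimal Python sketch of that check is given below. The dictionary layout and the function names are illustrative assumptions and are not part of the dataset or the accompanying code; only the archive names, polymer counts, lengths, and intervals come from the description above.

```python
# Trajectory lengths and recording intervals as stated in the dataset
# description (units: tau). The dictionary layout and the helper names
# below are hypothetical, chosen only for this illustration.
SPLITS = {
    "polymer_train.tar.gz": {"polymers": 100, "length_tau": 50_000, "interval_tau": 5},
    "polymer_test_5M.tar.gz": {"polymers": 40, "length_tau": 4_950_000, "interval_tau": 500},
    "polymer_test.tar.gz": {"polymers": 40, "length_tau": 5_000, "interval_tau": 5},
}


def recorded_steps(length_tau: int, interval_tau: int) -> int:
    """Recorded steps per trajectory, assuming one record per interval."""
    return length_tau // interval_tau


def summarize(splits=SPLITS) -> dict:
    """Map each archive name to its recorded steps per polymer."""
    return {
        name: recorded_steps(s["length_tau"], s["interval_tau"])
        for name, s in splits.items()
    }


if __name__ == "__main__":
    for name, steps in summarize().items():
        print(f"{name}: {steps} recorded steps per polymer")
```

Whether the archives store exactly this many frames per polymer (for example, whether the initial frame is counted) is not stated in the record, so the result is best treated as a consistency check on the description rather than a guarantee about the file contents.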