Dataset Open Access

Protein Structure Initiative - TargetTrack 2000-2017 - all data files

Helen M. Berman, Margaret J. Gabanyi, Andrei Kouranov, David I. Micallef, John Westbrook; Protein Structure Initiative network of investigators

MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Protein targets</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Protein Structure Initiative</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">high-throughput</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">structural genomics</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">structural biology</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">sequence data</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">targettrack</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">targetdb</subfield>
  </datafield>
  <controlfield tag="005">20200124192609.0</controlfield>
  <controlfield tag="001">821654</controlfield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Protein Structure Initiative network of investigators</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">832912731</subfield>
    <subfield code="z">md5:200012a8a2a11ffd7e370ed142df36c3</subfield>
    <subfield code="u">https://zenodo.org/record/821654/files/TargetTrack-1Jul2017.tar.gz</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2017-07-05</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire_data</subfield>
    <subfield code="p">user-psi</subfield>
    <subfield code="o">oai:zenodo.org:821654</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">Protein Structure Initiative and Rutgers University</subfield>
    <subfield code="a">Helen M. Berman, Margaret J. Gabanyi, Andrei Kouranov, David I. Micallef, John Westbrook</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Protein Structure Initiative - TargetTrack 2000-2017 - all data files</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-psi</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/licenses/by-sa/4.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution Share Alike 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;&lt;strong&gt;Protein Structure Initiative - TargetTrack protein target registration database (795 MB, gzipped tarball)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The Protein Structure Initiative was a high-throughput structural genomics effort that ran from 2000 to 2015 and focused on developing technologies to enable greater coverage of protein structure space. During its 15-year tenure, more than 100 investigators at 35 centers (see ContributingCenters.xls) declared over 350,000 protein sequences (targets) that they would study using state-of-the-art protein production and structure determination methods. Many of these targets were selected through bioinformatics-based methods to serve as representatives for sequence and structure clusters.&lt;/p&gt;

&lt;p&gt;From 2003 to 2010, these selected sequences and some basic identifying metadata were kept in a database called TargetDB, created at the Research Collaboratory for Structural Bioinformatics at Rutgers University. In 2008, a second database named PepcDB was created to track detailed experimental trial history and the standard protocols used by the PSI centers. These two databases became the principal structural genomics target databases and were rolled into the &lt;strong&gt;PSI Structural Biology Knowledgebase&lt;/strong&gt; in 2008.&lt;/p&gt;

&lt;p&gt;As part of the third phase of the PSI, TargetDB and PepcDB were merged into a single resource, &lt;strong&gt;TargetTrack&lt;/strong&gt;, to provide one-stop access to the data and to expand the schema to include new required data items. Participating centers deposited the latest status of their active targets and the protocols that were used (along with any deviations) on a weekly or quarterly basis. TargetTrack also provided a variety of pre-computed data downloads on a weekly basis.&lt;/p&gt;

&lt;p&gt;In July 2017, the Structural Biology Knowledgebase ceased operations.  The files provided in this tarball represent the final datafiles generated by TargetTrack (timestamp June 30, 2017).  &lt;strong&gt;Please read the README included in this dataset for descriptions of each file. &lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The entire TargetTrack datafile in XML format can be found in /TargetTrack XML files/tt.xml.gz&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Key documentation can be found in the /Documentation folder.&lt;br&gt;
TargetTrack schema: targetTrack-v1.4.1.pdf&lt;br&gt;
Spreadsheet with TargetTrack enumerations for relevant fields: targetTrackEnumeratedDataItems-v1.4.1-1.xls&lt;br&gt;
Image depicting the XML data schema: targetTrack-v1.4.1.jpg&lt;/p&gt;

&lt;p&gt;These files are 868 MB in total size, uncompressed. &lt;br&gt;
To open the tarball, use the command 'tar -zxvf TargetTrack-1Jul2017.tar.gz'&lt;/p&gt;

&lt;p&gt;-- created by the PSI Structural Biology Knowledgebase, July 5, 2017&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.821653</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.821654</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">dataset</subfield>
  </datafield>
</record>
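
The record's 856 field carries the tarball's size in bytes (subfield s), its MD5 checksum (subfield z), and its download URL (subfield u). Before running the tar command quoted in the description, it may be worth checking a downloaded copy against those values. The following is a minimal Python sketch under stated assumptions: the function names `file_info_from_marc` and `verify_download` and the constant `EXAMPLE_RECORD` are illustrative and not part of Zenodo or TargetTrack, and the snippet assumes a well-formed MARC21 record in which the 856 field is closed properly.

```python
import hashlib
import os
import xml.etree.ElementTree as ET

# Namespace used by MARC21 "slim" XML records.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# The 856 field of this record, wrapped in a minimal well-formed record
# (illustrative excerpt; a full export contains many more fields).
EXAMPLE_RECORD = """
<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">832912731</subfield>
    <subfield code="z">md5:200012a8a2a11ffd7e370ed142df36c3</subfield>
    <subfield code="u">https://zenodo.org/record/821654/files/TargetTrack-1Jul2017.tar.gz</subfield>
  </datafield>
</record>
"""


def file_info_from_marc(marc_xml):
    """Return size, checksum algorithm, digest, and URL from the 856 field."""
    record = ET.fromstring(marc_xml)
    field = record.find("marc:datafield[@tag='856']", MARC_NS)
    if field is None:
        raise ValueError("record has no 856 datafield")
    subfields = {
        sf.get("code"): (sf.text or "").strip()
        for sf in field.findall("marc:subfield", MARC_NS)
    }
    # Subfield z looks like "md5:<hex digest>".
    algorithm, _, digest = subfields["z"].partition(":")
    return {
        "size": int(subfields["s"]),
        "algorithm": algorithm,
        "digest": digest,
        "url": subfields["u"],
    }


def verify_download(path, expected_size, expected_md5, chunk_size=1 << 20):
    """Return True if the file at `path` matches the expected size and MD5."""
    if os.path.getsize(path) != expected_size:
        return False
    md5 = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so the whole tarball is never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest() == expected_md5.lower()
```

With a local copy of the tarball, a call such as `verify_download("TargetTrack-1Jul2017.tar.gz", info["size"], info["digest"])`, where `info = file_info_from_marc(EXAMPLE_RECORD)`, would report whether the download matches the record. The size is checked first because it is cheap and catches truncated downloads without hashing the whole file.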