High-quality RNA residues: RNA2023
Introduction
--------------------------------------------------------------------------------
This is the RNA2023 dataset by the Richardson Lab at Duke University.
These are high-quality residues from high-quality, low-redundancy RNA chains in the PDB.
For a similar set of quality-filtered protein residues, see the top2018 datasets at:
https://doi.org/10.5281/zenodo.4626149
https://doi.org/10.5281/zenodo.5115232
Corresponding authors
--------------------------------------------------------------------------------
dcrjsr at kinemage.biochem.duke.edu
christopher.sci.williams at gmail.com
Usage recommendations
--------------------------------------------------------------------------------
RNA residues that fail the filtering criteria described below have been removed from the files. As a result, these files can be considered pre-filtered and will return only results for residues of good model quality with supporting experimental data.
Files already contain hydrogens added by Reduce in the context of the original full models.
Two datasets are provided. The standard dataset is rna2023_pruned, which we recommend as the default. RNA backbone conformational space is highly diverse, and some real conformations fall below the statistical threshold for recognition as suites. We therefore do not recommend excluding suite outliers from the dataset except in specialty cases. We also provide an rna2023_nosuiteout dataset, in which no residues with "!!" outlier suite identifications are permitted. This set may be useful in specialist cases where suite outliers are undesirable or where losing some real conformations is an acceptable sacrifice for maximal filtering.
Each dataset also has an mmCIF version.
Note: Chains are named based on author chain ids, except for 8b0x, chain a. To avoid conflicts with 8b0x chain A in file systems that do not support case-sensitive file names, 8b0x chain a has been renamed to chain AB, matching its PDB/mmCIF designation.
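The renaming above guards against file names that differ only by letter case. As a minimal sketch, the following Python function reports such collisions in a list of names. The file-name pattern in the example is hypothetical and is not taken from the dataset.

```python
from collections import defaultdict


def find_case_collisions(names):
    """Return groups of names that are identical when letter case is ignored."""
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    # Only groups with more than one member would collide on a
    # case-insensitive file system.
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Hypothetical file names, for illustration only.
example = ["8b0x_A.pdb", "8b0x_a.pdb", "8b0x_AB.pdb"]
collisions = find_case_collisions(example)
```

With the hypothetical names above, chains A and a collide, while A and AB do not.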
Additional files
--------------------------------------------------------------------------------
rna2023_pdbmetadata.csv contains information on release date, resolution, title, authors, etc. for each source PDB entry.
rna2023_chain_list contains a list of all included chains, plus statistics on the number of residues from the original chain that passed the quality filters.
rna2023_suitename_table.csv and rna2023_suitename_table_nosuiteout.csv contain suitename identifications of rotameric RNA backbone conformations (1a, 1c, 2u, 6d, etc.), precomputed for convenience.
Filtering criteria: Chain level
--------------------------------------------------------------------------------
The chain list was derived from http://rna.bgsu.edu/rna3dhub/nrlist, version 3.150 as of 2020/10/28, with a 1.9Å resolution cutoff.
We added 6ugg chain A and two recent EM ribosome structures: 8a3d and 8b0x.
After residue-level filtering, chains with no complete suites were removed.
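One way to picture the final chain-level step is sketched below. It assumes that a complete suite needs two sequence-adjacent residues to both survive residue-level filtering, and it ignores insertion codes and chain breaks. The function name and the integer residue numbering are illustrative assumptions, not the procedure actually used to build the dataset.

```python
def has_complete_suite(kept_resnums):
    """Return True if any two retained residue numbers are consecutive.

    Assumption: a suite spans residues i-1 and i, so at least one
    adjacent pair must remain for a chain to contain a complete suite.
    """
    kept = set(kept_resnums)
    return any((n + 1) in kept for n in kept)


# A chain whose surviving residues are all isolated would be dropped.
isolated_only = has_complete_suite([1, 3, 5])
has_pair = has_complete_suite([1, 2, 7])
```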
Filtering criteria: Residue level
--------------------------------------------------------------------------------
Even excellent structures usually contain some poorly-resolved regions. Residue-level filtering helps avoid including these regions in otherwise high-quality data.
Residues are required to meet the following validation quality criteria:
No sugar pucker outliers
No steric overlaps ("clashes") of >= 0.5Å, as identified by Probe
No covalent bond or angle geometry outliers
Optionally, no !! suite outliers
Residues from X-ray structures are required to meet the following fit-to-map criteria:
Average of worst 2 atoms' 2Fo-Fc map values >= 1.2
Average of worst 2 atoms' RSCC scores >= 0.7
No atoms modeled at partial occupancy
Residues from EM structures are required to meet the following fit-to-map criteria:
RSCC >= 0.7
Residue inclusion fraction = 1.0 or >= 0.95, depending on structure
No atoms modeled at partial occupancy
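The residue-level criteria above can be summarized as a single predicate. The sketch below is illustrative only: the ResidueStats class, its field names, and the passes_filters function are hypothetical and do not describe the software used to build the dataset. Only the thresholds come from the lists above.

```python
from dataclasses import dataclass


@dataclass
class ResidueStats:
    # All field names are hypothetical, chosen for this sketch.
    method: str                # "xray" or "em"
    pucker_outlier: bool
    geometry_outlier: bool     # covalent bond or angle outlier
    suite_outlier: bool        # "!!" suite identification
    partial_occupancy: bool
    worst_overlap: float       # largest Probe overlap, in Å
    rscc: float                # X-ray: mean of worst 2 atoms; EM: residue RSCC
    map_value: float = 0.0     # X-ray only: mean of worst 2 atoms' 2Fo-Fc values
    inclusion_fraction: float = 0.0  # EM only


def passes_filters(r, exclude_suite_outliers=False, min_inclusion=1.0):
    """Apply the validation and fit-to-map thresholds listed above.

    min_inclusion is 1.0 or 0.95 depending on the structure.
    """
    if r.pucker_outlier or r.geometry_outlier or r.partial_occupancy:
        return False
    if r.worst_overlap >= 0.5:
        return False
    if exclude_suite_outliers and r.suite_outlier:
        return False
    if r.method == "xray":
        return r.map_value >= 1.2 and r.rscc >= 0.7
    if r.method == "em":
        return r.rscc >= 0.7 and r.inclusion_fraction >= min_inclusion
    raise ValueError(f"unknown method: {r.method!r}")
```

Setting exclude_suite_outliers=True corresponds to the optional suite-outlier criterion used for the nosuiteout variant.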
Filtering is documented in each pruned file. See the USER DOC lines in the .pdb files and the data_rna2023_dataset loops in the .cif files.
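To read the per-file filtering documentation from a .pdb file, a short script along the following lines may be enough. It assumes the documentation lines begin with the whitespace-separated tokens USER and DOC. The exact record layout is not specified here, so check a real file before relying on this. The sample text is made up for illustration.

```python
def user_doc_lines(pdb_text):
    """Collect the text of lines whose first two tokens are USER and DOC.

    Assumption: documentation records look like "USER  DOC <text>".
    """
    docs = []
    for line in pdb_text.splitlines():
        fields = line.split(None, 2)
        if len(fields) >= 2 and fields[0] == "USER" and fields[1] == "DOC":
            docs.append(fields[2].rstrip() if len(fields) > 2 else "")
    return docs


# Made-up sample content, not copied from the dataset.
sample = "USER  DOC example filtering note\nATOM      1  P     G A   1\nUSER  MOD other record\n"
notes = user_doc_lines(sample)
```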
Version history
--------------------------------------------------------------------------------
Version 1.0 Jun 30, 2023
Initial version