Structural data used to test a new geometric deep learning RNA scoring function emulating fully de novo modeling conditions
This deposition contains two primary subdirectories, each compressed as a
tarball. (Depositions in the Stanford Digital Repository must have nominally flat
directory structures; tarballs work around that requirement.) The subdirectory
"nonnative_secstruct" corresponds to the originally published version of the
paper. The subdirectory "xtal_secstruct" corresponds to the corrected version of
For 11 of the 16 RNA molecules in this benchmark (benchmark 2 of the paper), all
of the candidate models in "nonnative_secstruct" were constructed using
incorrect Watson-Crick base pairing, in contrast to what would typically happen
in actual blind structure prediction (where modelers might try multiple
secondary structures and pick those that generate realistic 3D models). In
"xtal_secstruct", the candidate models were regenerated using correct
Watson-Crick base pairing.
Each of these primary subdirectories contains 16 .tar.gz compressed directories,
one for each of the 16 distinct RNA molecules, and each of which contains 5,000
structural models of that molecule in PDB format. These models were used to
benchmark a new scoring function for RNA
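
As a quick sanity check after downloading, the layout described above can be verified without unpacking every model. The following is a minimal sketch, assuming one of the two primary tarballs has already been extracted into a local directory. The function names and the local path are hypothetical and are not part of the deposition; only the Python standard library is used.

```python
import tarfile
from pathlib import Path


def count_pdb_models(archive_path):
    """Count regular-file members ending in .pdb inside a .tar.gz archive."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return sum(
            1
            for member in archive
            if member.isfile() and member.name.lower().endswith(".pdb")
        )


def summarize_subdirectory(subdir):
    """Map each model archive in an unpacked primary subdirectory to its PDB count.

    inputs.tar.gz is skipped here on the assumption that it holds rerun
    inputs rather than candidate models.
    """
    counts = {}
    for archive_path in sorted(Path(subdir).glob("*.tar.gz")):
        if archive_path.name == "inputs.tar.gz":
            continue
        counts[archive_path.name] = count_pdb_models(archive_path)
    return counts
```

Calling summarize_subdirectory("xtal_secstruct") (a hypothetical local path) should, if the description above holds, return 16 entries with about 5,000 models each. Any archive with a different count may indicate an incomplete download.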
The first order of business in choosing the cases studied in this benchmark was
ensuring that they did not overlap with any RNA molecules studied previously in
the ARES project. Thus, this benchmark does not represent a perfectly
comprehensive account of RNA structure and was not meant to: it is just one of
several ways you could select a set of structures complementary to those studied
previously in the paper. These models are therefore *likely* to be of archival
value alone; alternatively, these models, or the modeling problems they address,
could make up part of a larger, more comprehensive benchmark.
To that end, in each primary subdirectory we have also included inputs.tar.gz,
which provides all the files necessary for rerunning these benchmark cases
yourself. Models were generated with FARFAR2, code for RNA fragment assembly
documented extensively at https://new.rosettacommons.org/docs/latest/FARFAR2.
That documentation should demystify the executable commands found in each
If you are already familiar with the Das lab's repository for RNA benchmarking
(https://github.com/DasLab/rna_benchmark), you can use that system to set up
replications of the xtal_secstruct simulations using the benchmark definition file
If you are interested in training a new RNA scoring function or sampling method,
consider the FARFAR2-Classics and FARFAR2-Puzzles benchmarks, available at
|Type of resource
|September 8, 2022; October 5, 2022; October 5, 2022; November 1, 2022
|September 2, 2022
|Deep learning (Machine learning)
- Use and reproduction
- User agrees that, where applicable, content will not be used to identify or to otherwise infringe the privacy or confidentiality rights of individuals. Content distributed via the Stanford Digital Repository may be subject to additional license and use restrictions applied by the depositor.
- This work is licensed under a Creative Commons Attribution Share Alike 4.0 International license (CC BY-SA).
- Preferred citation
- Watkins, Andrew M. and Rangan, Ramya and Townshend, Raphael J. L. and Eismann, Stephan and Karelina, Masha and Dror, Ron O. and Das, Rhiju. (2021). Structural data used to test a new geometric deep learning RNA scoring function emulating fully de novo modeling conditions. Stanford Digital Repository. Available at: https://purl.stanford.edu/sq987cc0358 https://doi.org/10.25740/sq987cc0358
Stanford Research Data