Structural data used to test a new geometric deep learning RNA scoring function emulating fully de novo modeling conditions

Abstract/Contents

Abstract

This deposition contains two primary subdirectories, each compressed as a
tarball. (Depositions in the Stanford Digital Repository must have nominally
flat directory structures, a constraint that the tarballs work around.) The
subdirectory "nonnative_secstruct" corresponds to the originally published
version of the paper. The subdirectory "xtal_secstruct" corresponds to the
corrected version of the paper.

For 11 of the 16 RNA molecules in this benchmark (benchmark 2 of the paper), all
of the candidate models in "nonnative_secstruct" were constructed using
incorrect Watson-Crick base pairing, in contrast to what would typically happen
in actual blind structure prediction (where modelers might try multiple
secondary structures and pick those that generate realistic 3D models). In
"xtal_secstruct", the candidate models were regenerated using correct
Watson-Crick base pairing.

Each of these primary subdirectories contains 16 .tar.gz compressed directories,
one for each of the 16 distinct RNA molecules in the benchmark. Each compressed
directory contains 5,000 structural models of its molecule in PDB format. These
models were used to benchmark a new scoring function for RNA structure.
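
As a quick sanity check on a download, the layout described above can be
verified with a short script. This is only a sketch under stated assumptions:
it assumes the primary subdirectory has already been unpacked to a local
folder, that model files end in ".pdb", and that inputs.tar.gz sits alongside
the per-molecule archives. The function names are hypothetical and are not part
of the deposition.

```python
import tarfile
from pathlib import Path


def count_pdb_models(archive_path):
    """Count regular files ending in .pdb inside one .tar.gz archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return sum(
            1
            for member in archive.getmembers()
            # Assumption: candidate models carry a .pdb extension.
            if member.isfile() and member.name.lower().endswith(".pdb")
        )


def summarize_subdirectory(subdir):
    """Map each per-molecule .tar.gz archive in `subdir` to its model count."""
    counts = {}
    for archive_path in sorted(Path(subdir).glob("*.tar.gz")):
        # inputs.tar.gz holds rerun inputs rather than candidate models.
        if archive_path.name == "inputs.tar.gz":
            continue
        counts[archive_path.name] = count_pdb_models(archive_path)
    return counts
```

If the layout matches the description, calling
summarize_subdirectory("xtal_secstruct") should return 16 entries, each with a
count of 5,000.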

The first order of business in choosing the cases studied in this benchmark was
ensuring that they did not overlap with any RNA molecules studied previously in
the ARES project. Thus, this benchmark does not represent a perfectly
comprehensive account of RNA structure and was not meant to: it is just one of
several ways you could select a set of structures complementary to those studied
previously in the paper. It is therefore *likely* that these models will be of
archival value alone; alternatively, these models, or the modeling problems they
address, could make up part of a larger comprehensive benchmark.

To that end, in each primary subdirectory we have also included inputs.tar.gz,
which provides all the files necessary for rerunning these benchmark cases
yourself. Models were generated with FARFAR2, code for RNA fragment assembly
documented extensively at https://new.rosettacommons.org/docs/latest/FARFAR2.
That documentation should demystify the executable commands found in each
README_FARFAR file.
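
To review those commands before rerunning anything, the README_FARFAR files can
be read straight out of the archive. The sketch below assumes that the files
are named exactly "README_FARFAR" and are stored as plain text; the internal
layout of inputs.tar.gz is not described here, so the function (a hypothetical
helper) searches the whole archive rather than a fixed path.

```python
import tarfile
from pathlib import PurePosixPath


def read_farfar_readmes(inputs_archive):
    """Return {member path: text} for every README_FARFAR file in the archive."""
    readmes = {}
    with tarfile.open(inputs_archive, "r:gz") as archive:
        for member in archive.getmembers():
            # Assumption: the file name is exactly README_FARFAR, no extension.
            is_readme = PurePosixPath(member.name).name == "README_FARFAR"
            if member.isfile() and is_readme:
                handle = archive.extractfile(member)
                text = handle.read().decode("utf-8", errors="replace")
                readmes[member.name] = text
    return readmes
```

Printing each entry of read_farfar_readmes("inputs.tar.gz") shows which
benchmark case every command belongs to, since the member path is kept as the
dictionary key.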

If you are already familiar with the Das lab's repository for RNA benchmarking
(https://github.com/DasLab/rna_benchmark), you can use that system to set up
replications of the xtal_secstruct simulations, using the benchmark definition
file ares_benchmark2.txt.

If you are interested in training a new RNA scoring function or sampling method,
consider the FARFAR2-Classics and FARFAR2-Puzzles benchmarks, available at
https://purl.stanford.edu/wn364wz7925.

Description

Type of resource Dataset
Date modified September 8, 2022; October 5, 2022; October 5, 2022; November 1, 2022
Publication date September 2, 2022

Creators/Contributors

Author Watkins, Andrew ORCiD https://orcid.org/0000-0003-1617-1720 (unverified)
Author Rangan, Ramya ORCiD https://orcid.org/0000-0002-0960-0825 (unverified)
Author Townshend, Raphael ORCiD https://orcid.org/0000-0001-6362-1451 (unverified)
Author Eismann, Stephan
Author Karelina, Masha ORCiD https://orcid.org/0000-0003-1880-4536 (unverified)
Author Dror, Ron ORCiD https://orcid.org/0000-0002-6418-2793 (unverified)
Author Das, Rhiju ORCiD https://orcid.org/0000-0001-7497-0972 (unverified)

Subjects

Subject Biochemistry
Subject RNA structure
Subject fragment assembly
Subject blind prediction
Subject Computer science
Subject Deep learning (Machine learning)
Genre Data
Genre Data sets
Genre Dataset

Bibliographic information

Related item
DOI https://doi.org/10.25740/sq987cc0358
Location https://purl.stanford.edu/sq987cc0358

Access conditions

Use and reproduction
User agrees that, where applicable, content will not be used to identify or to otherwise infringe the privacy or confidentiality rights of individuals. Content distributed via the Stanford Digital Repository may be subject to additional license and use restrictions applied by the depositor.
License
This work is licensed under a Creative Commons Attribution Share Alike 4.0 International license (CC BY-SA).

Preferred citation

Watkins, Andrew M. and Rangan, Ramya and Townshend, Raphael J. L. and Eismann, Stephan and Karelina, Masha and Dror, Ron O. and Das, Rhiju. (2021). Structural data used to test a new geometric deep learning RNA scoring function emulating fully de novo modeling conditions. Stanford Digital Repository. Available at: https://purl.stanford.edu/sq987cc0358 https://doi.org/10.25740/sq987cc0358
