1. rPredictorDB User Documentation

1.13. rPredictorDB: the Why & the What

rPredictorDB exists to make bioinformatics over ribosomal RNA easier.

The rPredictorDB website provides access to tools that work over a database of rRNA molecules. We assembled a toolkit for common bioinformatics tasks, which fall into two broad categories:

  • Search
  • Predict secondary structure

Search tools retrieve a set of rRNA molecules from our database, according to both exact criteria (length, organism name or group, type of rRNA, etc.) and similarity criteria (“Find me more sequences like this one”).
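
The difference between the two kinds of criteria can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the record fields, the function names and the sample sequences are hypothetical and do not reflect the actual rData schema or the rTools implementation, and the similarity score is a simple string-matching stand-in rather than a real sequence-alignment method.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class RnaRecord:
    # Hypothetical record layout, not the actual rData schema.
    accession: str
    organism: str
    rrna_type: str
    sequence: str


def exact_search(records, organism=None, rrna_type=None, min_length=0):
    """Keep the records that satisfy every exact criterion that was given."""
    return [
        r for r in records
        if (organism is None or r.organism == organism)
        and (rrna_type is None or r.rrna_type == rrna_type)
        and len(r.sequence) >= min_length
    ]


def similarity_search(records, query, top=1):
    """Rank records by a crude string-similarity score against the query."""
    def score(record):
        return SequenceMatcher(None, query, record.sequence).ratio()
    return sorted(records, key=score, reverse=True)[:top]


# Made-up sample data, for illustration only.
records = [
    RnaRecord("X1", "Escherichia coli", "16S", "GGCUACGUAGCUCAGUUGG"),
    RnaRecord("X2", "Escherichia coli", "5S", "UGCCUGGCGGC"),
    RnaRecord("X3", "Homo sapiens", "18S", "UACCUGGUUGAUCCUGCCAG"),
]

exact_hits = exact_search(records, organism="Escherichia coli", rrna_type="16S")
similar_hits = similarity_search(records, "UACCUGGUUGAUCCUGCCAG")
```

An exact search either accepts or rejects a record, while a similarity search orders all records by how close they are to the query and returns the best ones.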

Prediction tools take an rRNA sequence and suggest base pairs.
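
To make "suggest base pairs" concrete, here is a small Python sketch that reads a secondary structure written in dot-bracket notation, a common text format in which matching parentheses mark paired positions and dots mark unpaired ones. This page does not state which output format the rPredictorDB tools use, so the notation, the function name and the toy sequence are assumptions chosen for illustration.

```python
def pairs_from_dot_bracket(structure):
    """Return the 0-based (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)                # remember the opening position
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            pairs.append((stack.pop(), i)) # pair it with the latest opening
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


# Toy hairpin, made up for illustration: three G-C pairs around a loop.
sequence = "GGGAAAUCCC"
structure = "(((....)))"

pairs = pairs_from_dot_bracket(structure)
paired_bases = [(sequence[i], sequence[j]) for i, j in pairs]
```

Each tuple in `pairs` names two positions in the sequence that are predicted to pair, and `paired_bases` shows which nucleotides those positions hold.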

An overview of the available tools is in The rPredictorDB Toolkit. This overview should help you to select the right tools for your work.

A reference manual of the inputs for each individual tool can be found in The rPredictorDB Toolkit Reference. This should help you understand in detail how to use a tool you chose.

Warning

If there is any confusion about the terminology we use and how we use it, refer to the rPredictorDB glossary.

More accurately and technically, the rPredictorDB infrastructure consists of the following components:

  • rData: a database of rRNA sequences and secondary structures
  • rETL: the infrastructure necessary to populate rData and keep it up to date
  • rTools: a set of tools that perform standard tasks on the data like similarity search or secondary structure prediction
  • CP-predict: a new algorithm dedicated to rRNA secondary structure prediction
  • rWeb: an internet portal and back-end that makes the rData and rTools components accessible to the research community and general public
  • rDoc: thorough documentation, including relevant scientific literature

Probably of most interest to the casual user is the rWeb component, which is the interface through which you will communicate with the other rPredictorDB components.

1.13.5. rDoc

The documentation is split into three parts:

  • rDoc-User, which covers all you need to know to use rPredictorDB (you are reading the main page of rDoc-User right now). Installation instructions are not included in the User documentation, because installation is not a task for users to undertake.
  • rDoc-Technical, which describes how rPredictorDB is built: rPredictorDB Technical Documentation. This includes installation instructions.
  • rDoc-Reference, which is useful if you would like to join the rPredictorDB development team. The reference documentation is here.

Note

Taken from Frequently Asked Questions.

Why build a website for rRNA bioinformatics at all?

While there are several websites dedicated directly to ribosomal RNA, we (the Bioinformatics laboratory of the Microbiological Institute of the Czech Academy of Sciences and our team at the Faculty of Mathematics and Physics at Charles University) feel that a better job could be done.

The main drawbacks of similar bioinformatics websites are a steep learning curve and information that is missing or whose purpose is unclear. In order to use a website such as the Protein Data Bank or the Comparative RNA website, a user first has to have, or obtain, a good general idea of what the site does, why anyone would want such a site, and what much of the terminology means.

Once this background knowledge is obtained, the user finds that certain information is not well curated or made explicit: no one (at least publicly) keeps track of whether an RNA sequence was obtained directly or from a DNA transcription site, information about the type of rRNA is sparse, the phylogeny of a molecule is sometimes missing, a resolved secondary structure is not sufficiently labeled, it is unclear what constitutes a truly unique identifier of an RNA sequence or structure, etc. For instance, the STRAND database of resolved RNA secondary structures contains a number of structures inconsistently labeled for RNA-protein complexes and duplicate sequences.

Another drawback of such sites is that support for mass data retrieval is often limited or missing. When present, it usually takes the form of pre-packaged archives, and the user has little choice over what subset of the data to download. (A notable exception here is the SILVA database of rRNA molecules.)

While such drawbacks are perhaps of less concern to biologists, the fast-growing field of bioinformatics is sensitive to this kind of volatility in data sources. Nobody wants to spend a lot of time finding out what uniquely identifies an RNA sequence or structure, filling in missing fields, etc., let alone downloading a set of several hundred sequences of interest one by one.

The rWeb and rData components of rPredictorDB were designed to overcome these drawbacks. Our goal is first and foremost clarity: answers to questions such as “Why would I ever want to do that?” should be easy to find (and relatively easy to read).

1.13.5.1. What now?

To start using the rPredictorDB website right away, go search or predict.

If you wish to read more about rPredictorDB, continue with the next topic, the rWeb Tutorial.
