The rPredictorDB infrastructure has several major components: rData, rETL, rTools, rWeb and rDoc.
We will now describe the individual components of rPredictorDB.
The rData component is further divided into parts. The major part is the rPredictorDB PostgreSQL database, rDB. The rDB holds the rPredictorDB dataset: all the information about RNA that is available for searching in rPredictorDB. The databases for individual tools are also grouped under the label of rData. These databases do not offer any extra information; they are merely extracted from rData (or directly from its sources) and re-formatted for efficient use by individual tools. The only tool that currently needs this kind of re-formatting of the whole rPredictorDB dataset is Sequence search. The Taxonomy search and Annotations search, on the other hand, query the database directly. (More on this in the section on rWeb.)
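To illustrate what "extracted and re-formatted for efficient use by individual tools" can look like, here is a minimal Python sketch that renders sequence records as FASTA text. It rests on assumptions: this section does not say which format Sequence search actually consumes, so FASTA is used purely as an example, and `RnaRecord`, its fields and `to_fasta` are hypothetical names that do not reflect the real rDB schema.

```python
from typing import Iterable, NamedTuple


class RnaRecord(NamedTuple):
    """Hypothetical stand-in for one row of the rDB dataset."""
    accession: str
    sequence: str


def to_fasta(records: Iterable[RnaRecord], width: int = 60) -> str:
    """Render records as FASTA text, wrapping sequence lines at `width`."""
    lines = []
    for rec in records:
        # Header line: '>' followed by the record identifier.
        lines.append(f">{rec.accession}")
        # Sequence body, split into fixed-width chunks.
        for start in range(0, len(rec.sequence), width):
            lines.append(rec.sequence[start:start + width])
    return "".join(line + "\n" for line in lines)


# Illustrative data only; these are not real rPredictorDB entries.
rows = [RnaRecord("ACC0001", "GGCUAC"), RnaRecord("ACC0002", "AUGC")]
fasta_text = to_fasta(rows, width=4)
```

The derived file carries no information beyond what the records already hold, which mirrors the point above: tool-specific databases are a re-packaging of the dataset, not a second source of truth.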
The dataset is generated by combining information from external sources (SILVA, Rfam, ENA and Taxonomy-NCBI databases). This process is handled by the rETL component.
A full description of rDB itself is found in The Data of rPredictorDB.
The process of generating tool-specific representations of the rPredictorDB dataset is described in the rPredictorDB setup.
A more high-level description of the dataset is available in the User documentation, in rPredictorDB data and database.
The rETL (Extraction - Transformation - Load) component of rPredictorDB handles downloading and processing data from various sources in order to populate rDB with the rPredictorDB dataset. The process has many steps, from automated queries to parallelized processing of secondary structure predictions.
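The three ETL stages can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the rETL implementation: the source names, the record fields (`accession`, `sequence`), the normalisation rule and the dictionary standing in for rDB are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]


def extract(sources: Dict[str, Callable[[], Iterable[Record]]]) -> List[Record]:
    """Pull raw records from every source and tag each with its origin."""
    raw = []
    for name, fetch in sources.items():
        for rec in fetch():
            raw.append({**rec, "source": name})
    return raw


def transform(raw: Iterable[Record]) -> List[Record]:
    """Normalise sequences to upper-case RNA and drop records without one."""
    clean = []
    for rec in raw:
        seq = rec.get("sequence", "").upper().replace("T", "U")
        if seq:
            clean.append({**rec, "sequence": seq})
    return clean


def load(records: Iterable[Record], db: Dict[str, Record]) -> int:
    """Store records in `db` keyed by accession; return the number stored."""
    count = 0
    for rec in records:
        db[rec["accession"]] = rec
        count += 1
    return count


# Hypothetical in-memory sources; the real component downloads external data.
sources = {
    "source_a": lambda: [{"accession": "X1", "sequence": "acgt"}],
    "source_b": lambda: [{"accession": "X2", "sequence": ""}],
}
db: Dict[str, Record] = {}
stored = load(transform(extract(sources)), db)
```

Keeping each stage as a separate function means a failure in one source or one transformation step can be isolated and rerun without repeating the whole process.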
A detailed description of this component can be found in the section The ETL layer of rPredictorDB.
Within rPredictorDB, numerous external tools are integrated. (“External” here means “not a part of rWeb”.) They provide the “useful functionality”, such as various methods of similarity search or secondary structure prediction, as well as auxiliary functions such as gluing various input/output formats together. The rTools component is a label under which this collection of external (and partially internal) tools is kept.
Warning
Note that there is a different perspective on what a “tool” is from the point of view of the rWeb component. In rWeb, a tool is a PHP class that integrates some search or prediction functionality into the rPredictorDB website. rTools, on the other hand, is a collection of programs that stand outside the rWeb component.
Not all tools are third-party: rTools also includes Cppredict, a Matlab program that implements CP-predict, a two-phase algorithm for rRNA structure prediction (along with some additional utilities).
The connections between rTools and the other components (rData, rETL and rWeb tool classes) merit further explanation:
An overview of rTools is practically synonymous with the list of “install a library/tool/package” requirements in rPredictorDB setup. Furthermore, as new functionality is made available through rWeb, the rTools component will grow accordingly.
The rWeb component serves as a presentation layer for collected and generated data. The web application is written in PHP with the Nette Framework.
The rWeb component is the most complex component of rPredictorDB. It is organized into a layered architecture, with client-side scripts on the user end and a pipeline that runs a user’s query through presenters, parsers and finally tools, and then returns the results to the user.
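The flow of a query through presenters, parsers and tools can be sketched as follows. rWeb itself is written in PHP with the Nette Framework; the example below is in Python purely for illustration, and every name in it (`Query`, `parse_request`, `AnnotationTool`, `Presenter`, the request parameters) is a hypothetical stand-in rather than an actual rWeb class.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Query:
    tool_name: str
    term: str


def parse_request(params: Dict[str, str]) -> Query:
    """Parser layer: validate raw request parameters and build a Query."""
    term = params.get("term", "").strip()
    if not term:
        raise ValueError("empty search term")
    return Query(tool_name=params.get("tool", "annotation"), term=term)


class AnnotationTool:
    """Tool layer: hypothetical search over an in-memory list of annotations."""

    def __init__(self, annotations: List[str]) -> None:
        self.annotations = annotations

    def run(self, query: Query) -> List[str]:
        # Case-insensitive substring match, standing in for a real search.
        needle = query.term.lower()
        return [a for a in self.annotations if needle in a.lower()]


class Presenter:
    """Presenter layer: takes the request, dispatches to a tool, returns results."""

    def __init__(self, tools: Dict[str, AnnotationTool]) -> None:
        self.tools = tools

    def handle(self, params: Dict[str, str]) -> List[str]:
        query = parse_request(params)
        tool = self.tools[query.tool_name]
        return tool.run(query)


# Illustrative data only.
presenter = Presenter(
    {"annotation": AnnotationTool(["16S rRNA", "tRNA-Ala", "23S rRNA"])}
)
results = presenter.handle({"tool": "annotation", "term": "rrna"})
```

The point of the layering is that the presenter never inspects raw parameters and the tool never sees the request: each layer hands a narrower, already validated object to the next.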
The detailed description of rWeb design is available in the section rWeb: the rPredictorDB website.
The rDoc component, the rPredictorDB documentation, is split into three major groups: User documentation, Technical documentation and API references. The User and Technical documentation are generated using the Sphinx library from a central repository. The API reference documentation is further split into documentation for individual components, as each component (and in the case of rTools, each sub-component) has its own API reference, often in incompatible formats.
The User and Technical documentation, generated in HTML form, is integrated directly into rWeb. The API reference for rWeb is available as a part of the rPredictorDB site as well.