rPredictorDB¶

Introduction¶

rPredictorDB is a predictive database of secondary structures of individual RNAs and their formatted plots. The structures are generated by template-based prediction of RNA secondary structure, with experimentally identified structures serving as templates. RNAs with large secondary structures are drawn using a template-based visualization method that yields a formatted, readable layout. The rPredictorDB website also offers template-based secondary structure prediction for user-uploaded RNA sequences, using the templates stored in rPredictorDB.

The following is brief usage documentation. More detailed technical documentation is available here.

Citations:

Pánek J, Modrák M, Schwarz M: An Algorithm for Template-Based Prediction of Secondary Structures of Individual RNA Sequences. Front Genet. 2017;8:147. doi:10.3389/fgene.2017.00147
Elias R, Hoksza D: TRAVeLer: a tool for template-based RNA secondary structure visualization. BMC Bioinformatics. 2017 Nov 15;18(1):487. doi:10.1186/s12859-017-1885-4

Search page¶

The Search page lets you search the RNAs stored in rPredictorDB, together with their secondary structure(s) and plot(s), based on various criteria. Each item on the Search page has either a static description or a pop-up description that appears when the mouse pointer hovers over it.

The search criteria are Taxonomy, Annotation and Sequence. Selecting a criterion via its check box makes the corresponding section appear below. Multiple criteria can be selected.

Taxonomy criterion: restricts the search to the chosen taxonomic group(s) or a single organism. A taxonomy browser is available (via a hyperlink) for choosing taxonomic group(s) by browsing a phylogenetic tree.

Annotation criterion: searches the stored RNAs using the annotations stored in rPredictorDB. Each annotation item that can be searched has a corresponding text box or check box. The boxes are pre-filled with example keywords.

Sequence criterion: searches the stored RNAs by sequence similarity. The input is an RNA sequence. RNAs stored in rPredictorDB whose sequences are similar to the query sequence are identified and listed. The similarity is computed by BLAST.
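Before pasting a query into the Sequence field, it can help to tidy it locally. The sketch below is only an illustration of such preparation, not part of rPredictorDB: the function name `clean_rna`, the accepted alphabet (A, C, G, U plus N) and the handling of FASTA header lines are all assumptions.

```python
import re

# Assumption: the query may contain A, C, G, U and N (unknown base).
RNA_ALPHABET = set("ACGUN")


def clean_rna(raw: str) -> str:
    """Turn pasted text (optionally FASTA) into one upper-case RNA string."""
    lines = []
    for line in raw.splitlines():
        line = line.strip()
        # Skip empty lines and FASTA header lines starting with '>'.
        if line and not line.startswith(">"):
            lines.append(line)
    # Drop inner whitespace, upper-case, and rewrite DNA-style T as U.
    seq = re.sub(r"\s+", "", "".join(lines)).upper().replace("T", "U")
    if not seq:
        raise ValueError("empty sequence")
    unexpected = set(seq) - RNA_ALPHABET
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq


query = clean_rna(">my_query\nggga aac\nCCTT\n")
```

The cleaned string in `query` can then be copied into the search form.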

Predict page¶

The Predict page provides template-based RNA secondary structure prediction for user sequences, using templates stored in rPredictorDB.

The page allows sequence upload (by copy and paste) and selection of the template. The selection is either automatic, in which case the template whose sequence is most similar to the query sequence is identified and used for prediction, or manual. Templates are secondary structures extracted from experimentally identified structures, found mostly in PDB.
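The idea behind automatic selection can be sketched in a few lines. This is a toy illustration only: how rPredictorDB actually measures similarity for this step is not described here, so the sketch substitutes Python's `difflib.SequenceMatcher` ratio as a stand-in score. The function names and the two template sequences are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict


def similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1]; not the measure used by rPredictorDB."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def pick_template(query: str, templates: Dict[str, str]) -> str:
    """Return the name of the template whose sequence best matches the query."""
    if not templates:
        raise ValueError("no templates to choose from")
    return max(templates, key=lambda name: similarity(query, templates[name]))


# Hypothetical toy sequences, invented for this example.
templates = {
    "template_A": "GGGAAACCCUUUGGG",
    "template_B": "AUAUAUAUAUAUAUA",
}
best = pick_template("GGGAAACCCUUAGGG", templates)
```

Here the query differs from `template_A` in a single position, so `template_A` is selected. Manual selection corresponds to skipping `pick_template` and naming the template directly.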

Download page¶

The Download page allows you to download the rPredictorDB database in CSV format or as a database dump. The prediction algorithm used is also publicly available.
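A downloaded CSV file can be processed with standard tools. The sketch below uses Python's `csv` module to count records per RNA family. The column layout of the real export is not documented here, so the column names (`id`, `family`, `sequence`) and the three sample rows are placeholders invented for the example.

```python
import csv
import io
from collections import Counter

# Placeholder text standing in for a downloaded file. The column names and
# rows are assumptions made for this example, not the real export schema.
SAMPLE_CSV = """id,family,sequence
rec1,5S rRNA Bacteria,GCCUGGCGGC
rec2,5S rRNA Bacteria,UGCCUGGCGG
rec3,tmRNA,GGGGCUGAUU
"""


def count_by_family(handle) -> Counter:
    """Count rows per value of the (assumed) 'family' column."""
    reader = csv.DictReader(handle)
    return Counter(row["family"] for row in reader)


counts = count_by_family(io.StringIO(SAMPLE_CSV))
```

For a real file, pass `open(path, newline="")` instead of the `io.StringIO` object and adjust the column name to match the actual header row.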

List of families and template structures included in rPredictorDB¶

RNA	Source of sequences	Templates and their source	Template sequence length (nucleotides)
16S rRNA	Silva	E. coli 16S rRNA (PDB ID 2ZM6)¹	1542
18S rRNA Chordata		H. sapiens 18S rRNA (PDB ID 4V6X)¹	1869
18S rRNA Diptera		D. melanogaster 18S rRNA (PDB ID 4V6W)¹	1995
5S rRNA Bacteria	Rfam (RF00001)	E. coli 5S (PDB ID 1C2X)¹	120
5S rRNA Eukarya	Rfam (RF00001)	H. sapiens 5S (PDB ID 6EKO)¹	120
5.8S rRNA	Rfam (RF00002)	Trypanosoma cruzi 5.8S RNA (PDB ID 5T5H)	169
6S RNA	Rfam (RF00013)	E. coli 6S RNA (Wassarman et al. 2000)²,³	184
6S RNA	Rfam (RF00013)	B. subtilis 6S RNA (Ando et al. 2002)²,³	187
9S rRNA	Rfam (RF02545)	Trypanosoma brucei 9S rRNA (PDB ID 6HIY)	621
Aphthovirus internal ribosome entry site	Rfam (RF00210)	PDB ID 2NBX⁴	107
AdoCbl variant	Rfam (RF01689)	Marine metagenome (PDB ID 4FRN)	110
Archaeal signal recognition particle RNA	Rfam (RF01857)	Methanocaldococcus jannaschii (PDB ID 3NDB)	135
Cobalamin riboswitch	Rfam (RF00174)	Symbiobacterium thermophilum (PDB ID 4GXY)	172
C-DI-AMP riboswitch	Rfam (RF00379)	Thermovirga lienii C-DI-AMP riboswitch (PDB ID 4QK9)	123
CRPV-IRES	Rfam (RF00458)	Mammalian CRPV-IRES (PDB ID 6D9J)	190
CSFV IRES	Rfam (RF00209)	Viral CSFV IRES (PDB ID 4C4Q)	233
FMN riboswitch	Rfam (RF00050)	PDB ID 3F2Y⁴	112
Fungi U3	Rfam (RF01846)	S. cerevisiae u3 (PDB ID 5WYK)	333
gcvB	Sharma et al. 2007³	S. typhimurium gcvB (Sharma et al. 2007)³	206
GLMS ribozyme	Rfam (RF00234)	Bacillus anthracis GLMS ribozyme (PDB ID 3L3C)	141
Group I catalytic intron	Rfam (RF00028)	Staphylococcus virus Twort (PDB ID 1Y0Q)	192
Group II catalytic intron D1-D4-3	Rfam (RF02001)	PDB ID 3BWP⁴	172
Group II catalytic intron D1-D4-7	Rfam (RF02012)	Pylaiella littoralis (PDB ID 4R0D)	157
Group II intron lariat	NCBI⁵	Oceanobacillus iheyensis group II intron (PDB ID 5J02)	418
Group II intron lariat in post-catalytic state⁶	NCBI⁵	Pylaiella littoralis (PDB ID 6CIH)	621
IRES HCV	Rfam (RF00061)	H. sapiens IRES HCV (PDB ID 5A2Q)	257
Lariat capping ribozyme	Rfam (RF01807)	Didymium iridis lariat capping ribozyme (PDB ID 4P8Z)	188
Lysine riboswitch	Rfam (RF00168)	T. maritima lysine riboswitch (PDB ID 4ERL)	161
Mammalian CPEB3 ribozyme	Rfam (RF00622)	H. sapiens CPEB3 (Salehi-Ashtiani et al. 2006)³	78
M-box	Rfam (RF00380)	B. subtilis M-box (PDB ID 3PDR)	161
micF	Rfam (RF00033)	E. coli micF (Esterling et al. 1994)³	95
MLV encapsidation signal	Rfam (RF00374)	Viral MLV (PDB ID 1U6P)	101
ms1	Hnilicova et al. 2014³	M. smegmatis ms1 (Panek et al. 2011, Hnilicova et al. 2014)³	304
oxyS	Rfam (RF00035)	E. coli oxyS (Argaman et al. 2000)³	109
PHI29 PROHEAD RNA	Rfam (RF00044)	Bacteriophage PHI29 (PDB ID 1FOQ)	117
RNaseP arch	Rfam (RF00373)	Pyrococcus furiosus RNaseP	347
RNaseP bact a	NCBI⁵	T. tengcongensis RNaseP bact a (PDB ID 3Q1R)	347
RNaseP bact b	Rfam (RF00011)	PDB ID 2A64⁴	414
RNaseP nuc	Rfam (RF00009)	H. sapiens RNaseP (Marquez et al. 2005)³	341
ryhB	Davis et al. 2005³	E. coli ryhB (Davis et al. 2005)³	90
SAM I	Rfam (RF00162)	T. tengcongensis SAM I (PDB ID 2GIS)	94
spot42	Rfam (RF00021)	E. coli spot42 (Moller et al. 2002)	119
SRP bact small	Rfam (RF00169)	E. coli SRP (SRPDB ID esccol3d-97-11-17-stretched.pdb)	114
SRP bact large	Rfam (RF01854)	B. subtilis SRP (PDB ID 4UE4)	266
SRP Metazoa	NCBI⁵	H. sapiens SRP (PDB ID 4P3E)	301
Tetrahymena ribozyme	NCBI⁵	PDB ID 1X8W⁴	247
Tetrahymena telomerase RNA	Rfam (RF00025)	Tetrahymena Telomerase RNA (PDB ID 6D6V)	159
THF riboswitch	Rfam (RF01831)	PDB ID 4LVV⁴	89
tmRNA	Rfam (RF00023)	E. coli tmRNA (PDB ID 3IZ4)	377
TPP riboswitch	NCBI⁵	E. coli TPP (PDB ID 4NYG)	83
tRNA Gly eukaryotic	Rfam (RF00005)	H. sapiens tRNA Gly (PDB ID 5E6M)¹	74
tRNA Gly bacterial	Rfam (RF00005)	G. kaustophilus tRNA Gly (PDB ID 4MGM)¹	75
Trypanosomatid mitochondrial large subunit ribosomal RNA	Rfam (RF02546)	Trypanosoma brucei brucei (PDB ID 6HIV)	560
u1	Rfam (RF00003)	H. sapiens u1 (Nagai et al. 2002)³	163
u2	Rfam (RF00004)	H. sapiens u2 (Nagai et al. 2002)³	188
u4	Rfam (RF00015)	H. sapiens u4 (Krol et al. 1981)³	144
u5	Rfam (RF00020)	H. sapiens u5 (Sievers et al. 2011)³	116
u6	Rfam (RF00026)	H. sapiens u6 (PDB ID 5LQW)	112
vertebrate Telomerase RNA	Rfam (RF00024)	H. sapiens Telomerase RNA (Bentley et al. 2002)³	451
yeast u1	Rfam (RF00488)	S. cerevisiae (PDB ID 5ZWN)	565

¹ The template is applied to sequences according to taxonomy, i.e. a eukaryotic template to eukaryotic sequences, a prokaryotic template to prokaryotic sequences.
² It is impossible to distinguish which template should be used based on taxonomy, as some bacteria, e.g. Firmicutes, contain 6S RNAs of both template types. Therefore the template producing a structure with a better z-score is used for each 6S RNA.
³ Sequences and/or template structure were copied from the paper publishing the template structure.
⁴ Organism not described or a synthetic expression system used.
⁵ The sequences were obtained by an NCBI BLAST search with the "somewhat similar sequences" parameters against the nr database, using query sequences taken from PDB. This was done because the sequences in the corresponding Rfam family seemed incompatible with the PDB structure: they were either short fragments or had very low sequence similarity to the PDB sequence.
⁶ This family contains several very short fragments producing substructures that are hard to match with the template structure. Nevertheless, we included them in rPredictorDB because they had significant BLAST E-values (< 1 × 10⁻¹²) and because they are a good example of RNAs with extremely fragmented sequences.
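The two template-assignment rules in notes ¹ and ² can be sketched as follows. The 5S rRNA PDB IDs come from the table above; everything else is an assumption made for illustration: the function names, the example z-scores, and in particular the direction of "better" (the sketch defaults to lower being better and exposes it as a parameter, since the notes do not say).

```python
from typing import Dict

# 5S rRNA templates as listed in the table above (PDB IDs).
FIVE_S_TEMPLATES = {"Bacteria": "1C2X", "Eukarya": "6EKO"}


def template_by_taxonomy(domain: str) -> str:
    """Note 1: pick the 5S rRNA template matching the sequence's taxonomy."""
    try:
        return FIVE_S_TEMPLATES[domain]
    except KeyError:
        raise ValueError(f"no 5S rRNA template listed for {domain!r}") from None


def template_by_zscore(zscores: Dict[str, float],
                       lower_is_better: bool = True) -> str:
    """Note 2: pick the template whose predicted structure scores better.

    Assumption: which direction counts as 'better' is not stated in the
    notes, so it is left to the caller via lower_is_better.
    """
    if not zscores:
        raise ValueError("no z-scores given")
    pick = min if lower_is_better else max
    return pick(zscores, key=zscores.get)


# Made-up z-scores for one 6S RNA predicted with both templates.
choice = template_by_zscore({"E. coli 6S RNA": -2.1, "B. subtilis 6S RNA": -0.7})
```

With the invented scores above and the default direction, `choice` is the E. coli template; passing `lower_is_better=False` selects the B. subtilis template instead.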
