.. _Technical-rWeb:

rWeb: the rPredictorDB website
********************************

As described in :ref:`Technical-architecture-rweb`, the rWeb component of rPredictorDB serves as a presentation layer for collected and generated data. The web application is written in `PHP <http://php.net/>`_ with the `Nette Framework <http://nette.org/>`_.

.. warning:: 

  Unless you are familiar with Nette, it may be difficult to understand how individual parts of the rWeb PHP code interact. The `Nette documentation is readily available <http://doc.nette.org/en/2.0/>`_.
  
  
.. note::

  Complete reference documentation for rWeb (including the class inheritance tree) can be found at `http://rpredictordb.elixir-czech.cz/reference/ <http://rpredictordb.elixir-czech.cz/reference/>`_.

  
**The core design problem rWeb faces is how to integrate various bioinformatics functions and their respective tools with a database and provide easy access to all of its capabilities through a unified interface, while remaining easily extensible.** To this end, we have designed a layered architecture that processes queries according to the *double dispatch* principle. We first describe the server-side back-end (both the query processing logic and the implementation of the tool classes) and then briefly describe client-side scripting.

This layered architecture has three major parts:

* The query processing back-end, :ref:`Technical-rWeb-query`,

* The tool classes, :ref:`Technical-rWeb-tools`,

* Client-side scripting, :ref:`Technical-rWeb-clientside`


.. _Technical-rWeb-implementation:

Implementation overview
=========================

The whole rWeb application is divided into the following namespaces (also called modules):

* ``BaseModule`` contains the configuration of the entire application and the base classes from which other classes inherit: most importantly ``BasePresenter`` (see the `Nette documentation <http://api.nette.org/2.0.15/Nette.Application.UI.Presenter.html>`_), ``BaseService`` (extends `Nette Object <http://api.nette.org/2.0.15/Nette.Object.html>`_), ``BaseRepository`` (extends Nette Object) and ``Form`` (see `<http://api.nette.org/2.0.15/Nette.Application.UI.Form.html>`_). The ``Form`` and ``BasePresenter`` classes extend the corresponding Nette classes. The module also contains a router, which manages routes for the whole site, and a layout template. (Routing enables us to use cool URIs, that is, URIs that look like `<http://rpredictordb.elixir-czech.cz/search>`_ instead of ``http://rpredictordb.elixir-czech.cz/SearchModule/presenters/SearchPresenter.php``. More information can be found in the `corresponding Nette Documentation <http://doc.nette.org/en/2.0/routing>`_.)

* ``DispatchModule`` is the most important backend module - it contains services for executing proper utilities both for searching and for predicting. This is realized through Parser classes (``SearchParser`` and ``PredictParser``) and tool classes. Tool classes (such as ``DbTool`` or ``BlastTool``) are stored in ``DispatchModule\Tools`` namespace. Helper classes (currently a class for on-the-fly visualization) are stored in the ``DispatchModule\Helpers`` namespace.

  The function of ``SearchParser`` is described in :ref:`Technical-rWeb-query`; the tool classes are described in the section :ref:`Technical-rWeb-tools`.

  The classes ``Sequence`` and ``ResultSet``, which serve as containers for query results, are also part of the module.

* ``PredictModule`` contains the presentation part of prediction logic. It contains the front-end part of the application that shows the prediction input form and prediction results to the user.

* ``SearchModule`` is similar to ``PredictModule``, except it takes care of the search form and search results. It also supports exporting functionality.

* ``AnalyseModule`` is somewhat similar to ``DispatchModule``. It contains presenters - one main presenter (``AnalysePresenter``) and additionally one presenter for each analytical tool. Models are the most important part of this module and they are placed in the ``models`` subdirectory. Each model represents one analytical tool.

A class tree is available `in the reference documentation <http://rpredictordb.elixir-czech.cz/reference/tree.html>`_.

The application uses the `repository pattern <http://msdn.microsoft.com/en-us/library/ff649690.aspx>`_ for manipulating data. The ``BaseRepository`` class provides a default implementation of mechanisms for manipulating the database. The application also keeps a very simple `service oriented design <http://en.wikipedia.org/wiki/Service_layers_pattern>`_, so essentially all classes are used as a service throughout the application. A service typically works with repositories and is used by another service or by a presenter. Injecting a service into another service or a presenter is very simple::

	/**
	 * @var \DispatchModule\SearchParser
	 * @Inject
	 */
	protected $parser;

To inject a service, declare a property, state the full class name of the service (including its PHP namespace) in the ``@var`` annotation and add the ``@Inject`` tag. The service is then injected into that property. This mechanism is provided by the ``postConstruct`` methods in ``BasePresenter`` and ``BaseService``.
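
How such annotation-driven injection can work is sketched below. This is only an illustration built on PHP's reflection API, not the actual ``postConstruct`` code: the ``SketchContainer``, ``SketchParserService`` and ``SketchPresenter`` classes and the ``injectAnnotated`` function are hypothetical names introduced for this example::

    <?php
    // Hypothetical stand-in for a service that would normally be registered
    // in the application's service container.
    class SketchParserService
    {
        public function describe()
        {
            return 'parser service';
        }
    }

    // Hypothetical minimal service locator: maps class names to instances.
    class SketchContainer
    {
        private $services = array();

        public function add($service)
        {
            $this->services[get_class($service)] = $service;
        }

        public function getByType($class)
        {
            // Annotations use a leading backslash, get_class() does not.
            return $this->services[ltrim($class, '\\')];
        }
    }

    // Fills every property whose doc comment carries both @var and @Inject.
    function injectAnnotated($object, SketchContainer $container)
    {
        $reflection = new ReflectionObject($object);
        foreach ($reflection->getProperties() as $property) {
            $doc = (string) $property->getDocComment();
            if (strpos($doc, '@Inject') === false) {
                continue;
            }
            if (!preg_match('/@var\s+(\S+)/', $doc, $match)) {
                continue;
            }
            // Needed on older PHP versions to write a protected property.
            $property->setAccessible(true);
            $property->setValue($object, $container->getByType($match[1]));
        }
    }

    class SketchPresenter
    {
        /**
         * @var \SketchParserService
         * @Inject
         */
        protected $parser;

        public function run()
        {
            return $this->parser->describe();
        }
    }

    $container = new SketchContainer();
    $container->add(new SketchParserService());

    $presenter = new SketchPresenter();
    injectAnnotated($presenter, $container);
    echo $presenter->run(), "\n";

The sketch shows why the annotation has to carry the full class name: it is the only key the injecting code has for looking up the right service.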


.. _Technical-rWeb-query:

Query processing
===========================

A user's query is processed by the following pipeline:

.. figure:: query-process.svg
   :figwidth: 650
   :width: 650
   :align: center

   Search query processing pipeline
   
Both the ``SearchModule`` and ``DispatchModule`` are used. (For a prediction query, ``PredictModule`` and ``PredictPresenter`` would be used instead.) The ``DispatchModule`` does the actual query processing work. The ``SearchModule`` only contains presenters and templates for filling the search form and displaying results - it gathers user input, sends it to ``DispatchModule`` for the "real" processing and displays the results that ``DispatchModule`` sends back to it.
   
.. figure:: web-class.svg
   :figwidth: 650
   :width: 650
   :align: center

   Diagram of classes used for a query

As stated above, the core part of the query processing pipeline is handled by the ``DispatchModule``. This module handles the user request, performs the requested actions and produces results. Central to its function is the ``SearchParser`` class, which manages the whole process. At the disposal of the ``SearchParser`` are several classes called *tools* - classes that actually perform the various search (or other) operations. Each tool represents one search method, e.g. searching the database or running an external tool such as BLAST.

   
.. note:: 

  This text only discusses searching; the workflow is the same for prediction queries, only with ``PredictParser`` instead of ``SearchParser``, etc. However, the prediction results are not returned as a ``ResultSet``, but simply as a :ref:`User-glossary-dot-paren-file-string`.
  
The core query processing component, ``SearchParser``:

* gets data inputs from user,

* initializes all required tools,

* gives each tool the inputs it needs,

* collects the results from each tool,

* merges collected results and passes the merged results to the user through the Presenter.

To be able to do this, ``SearchParser`` needs to know:

* exactly which search tools are present,

* which search tools need which input parameters.

While ``SearchParser`` has access to information about individual tools, it doesn't hold the information directly: it relies on the tools to provide information about themselves. The tools do so through the ``getWantedParameters`` and ``requiredParameter`` methods. This reinforces locality of information: **tool-specific information is kept only in the tool itself and is not duplicated elsewhere.**

In turn, extending the application with a new tool requires only that the tool implements the ``getWantedParameters`` and ``requiredParameter`` methods, is able to return a ``ResultSet`` and meets the remaining requirements of a tool. This is ensured by implementing the ``ToolInterface`` interface. See :ref:`Technical-rWeb-tools-implementation`.

The data that flows through the ``DispatchModule`` from left to right is the input form: key-value pairs that the ``SearchParser`` can send to the appropriate tools based on their keys.
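
The dispatch described above can be sketched in a few lines of plain PHP. This is a simplified illustration, not the real ``SearchParser``: the ``Sketch*`` classes are hypothetical, the ``evalue`` input is invented for the example, the tool-name prefix of form fields is left out and plain arrays stand in for ``ResultSet``. Only the method names ``getWantedParameters``, ``addCriteria`` and ``execute`` are taken from this documentation; their exact signatures are assumed::

    <?php
    // Hypothetical stand-in for a database search tool.
    class SketchDbTool
    {
        private $criteria = array();

        // The tool itself declares which inputs it wants.
        public function getWantedParameters()
        {
            return array(
                'sequence'  => array('text', 'Fill in sequence, or part of sequence'),
                'accession' => array('text', 'Accession number'),
            );
        }

        public function addCriteria($name, $value)
        {
            $this->criteria[$name] = $value;
        }

        // A real tool returns a ResultSet; a plain array keeps the sketch short.
        public function execute()
        {
            return array('db' => $this->criteria);
        }
    }

    // Second hypothetical tool with a different (invented) input.
    class SketchBlastTool
    {
        private $criteria = array();

        public function getWantedParameters()
        {
            return array('evalue' => array('text', 'E-value'));
        }

        public function addCriteria($name, $value)
        {
            $this->criteria[$name] = $value;
        }

        public function execute()
        {
            return array('blast' => $this->criteria);
        }
    }

    // The parser knows nothing tool-specific; it asks each tool what it wants.
    class SketchSearchParser
    {
        private $tools;

        public function __construct(array $tools)
        {
            $this->tools = $tools;
        }

        public function process(array $formValues)
        {
            $results = array();
            foreach ($this->tools as $tool) {
                foreach (array_keys($tool->getWantedParameters()) as $name) {
                    if (isset($formValues[$name]) && $formValues[$name] !== '') {
                        $tool->addCriteria($name, $formValues[$name]);
                    }
                }
                $results = array_merge($results, $tool->execute());
            }
            return $results;
        }
    }

    $parser = new SketchSearchParser(array(new SketchDbTool(), new SketchBlastTool()));
    $results = $parser->process(array(
        'accession' => 'ACC1',
        'evalue'    => '0.001',
        'unrelated' => 'ignored',
    ));
    print_r($results);

Adding a third tool to this sketch only means passing another object to the parser; the parser code itself does not change.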

The same double-dispatch principle is also used to create the *search form* - the ``SearchPresenter`` uses ``SearchParser``'s ``addFormParameters`` method that returns the whole form with appropriate inputs from all the tools (because ``SearchParser`` is "the guy" that acts like he knows all about tools; he doesn't, but he can ask them).


.. _Technical-rWeb-tools:

Tools
==================

Tools are the "working blocks" of rPredictorDB and the backbone of its functionality. From this tool-centric perspective, the whole rWeb is just a pretty interface for the tools.

Tools provide different ways of searching (and predicting) over the rPredictorDB database (rData). In order for rPredictorDB to be easily extensible, all tools have a common architecture.

A tool needs to provide three functionalities:

* a definition of its *inputs* - what information the tool needs to run,

* the ability to *parse* these inputs on request from the ``SearchParser``,

* the *execution*, which runs the tool with the parsed inputs and returns a ``ResultSet``.

A tool doesn't need to worry about:

* how it will be accessed by the user (all HTML generation is handled by the ``SearchModule`` (or ``PredictModule``) via Nette),

* other tools (incl. input naming conflicts, combining results).

You can easily extend the rPredictorDB toolkit with your own tool. See :ref:`Technical-extending`.


.. _Technical-rWeb-tools-implementation:

Tool implementation
---------------------

Each tool is represented by one class that has to be stored either in the ``searchTools`` directory, if it is a tool for searching, or ``predictTools``, if it is a prediction tool (the full path is ``app/DispatchModule/searchTools`` or ``predictTools``). Every tool class has to implement the ``ToolInterface`` interface. Additionally, it is **strongly** recommended that the tool class extends the ``BaseTool`` class, which contains default implementations of several methods.

The tasks of a tool can be divided into three phases:

* :ref:`Technical-rWeb-tools-implementation-input`
 
* :ref:`Technical-rWeb-tools-implementation-parse`

* :ref:`Technical-rWeb-tools-implementation-execute`

.. _Technical-rWeb-tools-implementation-input:

Specifying inputs
^^^^^^^^^^^^^^^^^^^^^

The **input part** defines which parameters the tool wants from the user. This part is utilized when the ``SearchParser`` passes tool information to the ``SearchPresenter`` to create the input form and then when the ``SearchParser`` dispatches individual input values to tools; it is the "public statement" the tool makes about itself to the layers above it in the query processing pipeline.

The inputs are defined in the ``wantedParameters`` array, which has to be structured according to the following example scheme::

	array(
	    'sequence' => array('text', 'Fill in sequence, or part of sequence'),
	    'accession' => array('text', 'Accession number',
	        array('multiplicators' => array('or' => 10))
	    ),
	    'firstpublished' => array('text', 'First published', array(
	        'date' => true,
	        'modifiers' => array(
	            'firstpublished_direction' => array('select', '', array(
	                'items' => array(self::RULE_LT => 'before',
	                                 self::RULE_GT => 'after')
	            ))
	        ),
	        'multiplicators' => array('and' => 2)
	    )),
	    'region_length' => array('text', 'Length', array(
	        'modifiers' => array(
	            'region_length_direction' => array('select', '', array(
	                'items' => array(self::RULE_LT => '<',
	                                 self::RULE_GT => '>')
	            ))
	        ),
	        'multiplicators' => array('and' => 2)
	    )),
	);
	
This example code (simplified from ``DbTool.php``) says the following:

* The tool wants 4 *parameters* called ``sequence``, ``accession``, ``firstpublished`` and ``region_length``. The ``SearchParser``, when parsing the search form filled in by the user, will know that these 4 parameters should be given to this tool.

* Each parameter also has its *properties* defined. The standard way to define the properties of an input parameter is an array containing two elements - ``type`` and ``label``. Types correspond to standard `HTML forms <http://www.w3schools.com/html/html_forms.asp>`_ and a complete list of implemented types is in ``BaseModule/Form`` (see `API reference <http://rpredictordb.elixir-czech.cz/reference/class-BaseModule.Form.html>`_). The ``label`` is a custom text to display next to the input field.

.. note::

  Element names are prefixed with the tool name before the HTML code is generated, so for the ``sequence`` parameter in ``DbTool``, the generated HTML code would look like this: ``<input type="text" name="db_sequence">``.

* Furthermore, each parameter can be customized by additional parameters. These extra parameters are stored in an array which is the third element of the parameter array. Some of them are required by the input type - like the ``items`` element for the ``select`` type in the ``firstpublished_direction`` parameter from the example above.

  Extra parameters are ``items`` for ``select`` (dropdown) and ``value`` for ``hidden`` type.

* Other additional properties can be *modifiers* and *multipliers* (array key ``multiplicators``). Modifiers are elements that directly affect the specified inputs - e.g. whether some value should be searched as "less than" or "greater than", as in the example above in the ``region_length`` input (element ``region_length_direction``) and the ``firstpublished`` input (element ``firstpublished_direction``).

  * A modifier can be any supported element from ``BaseModule/Form`` and it is built by the same rules as any input described in this section (possibly with its own modifiers and multipliers - although this is not recommended).

  * Multipliers serve the purpose of getting multiple values for the same input. A multiplier can be either ``and`` or ``or``. Each multiplier further has a cardinality which limits how many times the multiplier can be applied. Basically, all that multipliers do is copy the input element (including its modifiers).

So, in the example above, we can see that ``accession`` is a text input that can be copied ten times with the ``or`` multiplier - users can add up to 10 accession numbers and the performed search will include results containing any of those accession numbers. The ``firstpublished`` parameter is a text input with a special property ``date``, which displays a calendar on the textbox. It can be multiplied twice and it has a modifier that selects "before" or "after". That way, querying an interval between two dates is possible. The ``region_length`` parameter is a text input with an additional selectbox determining whether search results should include items with a length greater or less than the specified value. It can also be multiplied using the ``and`` rule, which means the user can perform queries like "find everything between X and Y", where X and Y are two values for the multiplied ``region_length`` input.
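
To make the structure of the array more tangible, the following sketch walks over a shortened definition and prints a one-line summary per parameter. The ``describeParameter`` function and the values of the ``RULE_LT`` and ``RULE_GT`` constants are hypothetical; in the real tool the constants are class constants referenced as ``self::RULE_LT`` and ``self::RULE_GT``::

    <?php
    // Hypothetical values; the real constants live in the tool class.
    const RULE_LT = 'lt';
    const RULE_GT = 'gt';

    $wantedParameters = array(
        'accession' => array('text', 'Accession number',
            array('multiplicators' => array('or' => 10))
        ),
        'region_length' => array('text', 'Length', array(
            'modifiers' => array(
                'region_length_direction' => array('select', '', array(
                    'items' => array(RULE_LT => '<', RULE_GT => '>'),
                )),
            ),
            'multiplicators' => array('and' => 2),
        )),
    );

    // Turns one parameter definition into a human-readable summary.
    function describeParameter($name, array $definition)
    {
        // Elements 0 and 1 are the type and the label, element 2 is optional.
        list($type, $label) = $definition;
        $extra = isset($definition[2]) ? $definition[2] : array();
        $parts = array(sprintf('%s: %s input labelled "%s"', $name, $type, $label));

        if (isset($extra['modifiers'])) {
            foreach ($extra['modifiers'] as $modifierName => $modifier) {
                $parts[] = sprintf('modifier %s (%s)', $modifierName, $modifier[0]);
            }
        }
        if (isset($extra['multiplicators'])) {
            foreach ($extra['multiplicators'] as $rule => $cardinality) {
                $parts[] = sprintf('up to %d values combined with "%s"', $cardinality, $rule);
            }
        }
        return implode(', ', $parts);
    }

    foreach ($wantedParameters as $name => $definition) {
        echo describeParameter($name, $definition), "\n";
    }

A walk of this kind is roughly what any consumer of the definition has to do, whether it builds the form or dispatches the submitted values.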

.. seealso:: 

   A more complicated tool: :ref:`Technical-dbtool`

.. seealso:: 

  For a step-by-step tool writing tutorial: :ref:`Technical-extending`


.. _Technical-rWeb-tools-implementation-parse:

Parsing inputs
^^^^^^^^^^^^^^^

The **input parsing** of a tool utilizes the ``addCriteria($name, $value)`` method. This method is called from ``SearchParser``, once for each input field, and its purpose is to pass data input by the user into the tool. Through ``addCriteria``, the tool gets key-value pairs to store for the execution phase. See :ref:`Technical-extending` for a comprehensive example.

.. warning:: 

  Note that if a tool uses multipliers, the multiplied values are not added through this method. This is for security reasons enforced by the Nette Framework: the form does not know in advance how many inputs of each type the user will send, so it cannot perform security checks on these inputs. These added inputs also have different names: the ``_array[]`` suffix is appended to the input name (the ``[]`` brackets tell the server that this input is an array and not a single value).

  The first input is therefore added normally through ``addCriteria``, and that is the best moment to handle all the other elements of the array, which has to be done manually (e.g. load the elements via ``$accessions = $this->completeData['db_accession_array'];``, where ``completeData`` is a helper variable containing the multiplied fields from the POST request).
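
A tool might handle multiplied values along the following lines. This is a sketch under several assumptions: the ``SketchAccessionTool`` class is hypothetical and does not extend the real ``BaseTool``, ``completeData`` is passed in through the constructor only to keep the example self-contained, and whether the name given to ``addCriteria`` carries the tool prefix is not specified here (the sketch assumes it does not)::

    <?php
    // Hypothetical stand-in for a tool class.
    class SketchAccessionTool
    {
        // Assumed to hold the multiplied fields from the POST request.
        protected $completeData = array();

        private $accessions = array();

        public function __construct(array $completeData)
        {
            $this->completeData = $completeData;
        }

        public function addCriteria($name, $value)
        {
            if ($name !== 'accession') {
                return;
            }
            // The first value arrives through addCriteria() ...
            $values = array($value);
            // ... the multiplied ones have to be picked up manually.
            if (isset($this->completeData['db_accession_array'])
                && is_array($this->completeData['db_accession_array'])) {
                $values = array_merge($values, $this->completeData['db_accession_array']);
            }
            // These values bypassed the form checks, so filter them here.
            foreach ($values as $candidate) {
                if (!is_scalar($candidate)) {
                    continue;
                }
                $candidate = trim((string) $candidate);
                if ($candidate !== '') {
                    $this->accessions[] = $candidate;
                }
            }
        }

        public function getAccessions()
        {
            return $this->accessions;
        }
    }

    // Placeholder accession values, not real database entries.
    $tool = new SketchAccessionTool(array(
        'db_accession_array' => array('ACC2', ' ', 'ACC3'),
    ));
    $tool->addCriteria('accession', 'ACC1');
    print_r($tool->getAccessions());

Because the multiplied values never pass through the form's own checks, the tool is the only place left to reject empty or malformed entries.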

.. _Technical-rWeb-tools-implementation-execute:

Execution
^^^^^^^^^^^^^^

The **execution** is, from the architectural point of view, the simplest phase. All that is important is that the ``execute()`` method returns an object of type ``\DispatchModule\ResultSet``, which holds a set of sequences. Otherwise it can do basically anything (e.g. run an external tool through ``exec``).

For prediction tools, the execution method should return simply a string in the **dot-paren format**: a FASTA header, then a sequence and last a dot-paren representation of the secondary structure for the given sequence. There is no container object currently implemented for structures.
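
As an illustration, a prediction tool could assemble its return value with a small helper like the one below. The ``formatDotParen`` function is hypothetical, and the sketch assumes that the header, the sequence and the structure each occupy one line and that only round brackets need to be balanced; the sequence and structure are made-up example values::

    <?php
    // Builds the string from a FASTA header, a sequence and its structure.
    function formatDotParen($identifier, $sequence, $structure)
    {
        if (strlen($sequence) !== strlen($structure)) {
            throw new InvalidArgumentException('Sequence and structure differ in length.');
        }
        // Every opening bracket needs a matching closing one.
        $depth = 0;
        foreach (str_split($structure) as $symbol) {
            if ($symbol === '(') {
                $depth++;
            } elseif ($symbol === ')') {
                $depth--;
            }
            if ($depth < 0) {
                throw new InvalidArgumentException('Closing bracket without an opening one.');
            }
        }
        if ($depth !== 0) {
            throw new InvalidArgumentException('Unclosed opening bracket.');
        }
        return '>' . $identifier . "\n" . $sequence . "\n" . $structure . "\n";
    }

    echo formatDotParen('example', 'GGGAAACCC', '(((...)))');

Validating the string before returning it keeps malformed structures from reaching the presenter, which has no container object to rely on.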

.. note::

    Tools can use additional classes stored in the ``models`` subdirectory of ``DispatchModule``. Those classes are only helpers providing additional functionality for tools. They are kept in their own directory mainly for simplicity and code readability. A typical use case might be writing a separate ``XmlParser`` - it's something that the tool needs for its execution, but it's not a part of the tool itself.


.. _Technical-rWeb-analytical-models:

Analytical models
==================

The analytical module (``AnalyseModule``) is somewhat similar to ``DispatchModule``; however, instead of tools it uses models. The meaning of a model is similar - each model is one block that contains specific functionality. However, because of the varying nature of the different models and because there is no need to make all models work together, the models are classes with essentially no further requirements. The only requirement is that each model should be available as a Nette service. This is not a strict requirement, but it is convenient for all models to meet it so that they can be used in the same way.

Apart from that, models can do whatever is needed. Each model should be used by one presenter (which should have the same - or at least a similar - name as the model) that can have any number of actions (and therefore any number of templates).

To learn more about analytical models and how you can create your own, see :ref:`Technical-extending-analytical-model`.

.. _Technical-rWeb-clientside:

Client side scripting
=====================

Client side scripting is done in JavaScript using the `jQuery scripting library <http://jquery.com>`_. All JavaScript files are placed in the ``webroot/js`` directory (where ``webroot`` is the directory that contains the ``index.php`` entry point file, see :ref:`Technical-setup-rweb`). It contains the following subdirectories:

* ``Classes`` holds all client side logic. (See :ref:`Technical-rWeb-javascript-classes`)

* ``Lib`` contains all libraries that the project uses (mainly jQuery). See: :ref:`Technical-rWeb-javascript-lib`

* ``Langs`` contains language files; these are used in the UI to display text. See: :ref:`Technical-rWeb-javascript-langs`

* ``View`` holds presenter information - most of the time, these files only hook  DOM events to methods implemented in classes. See: :ref:`Technical-rWeb-javascript-views`

.. figure:: client-side-logic.svg
   :figwidth: 650
   :width: 650
   :align: center
   
   *Client-side script logic*


.. _Technical-rWeb-javascript:

JavaScript files
---------------------

All JavaScript files are sorted into separate directories, except external jQuery libraries, which are placed in the main directory. The following sections describe how each file is used and what functionality it contains.

.. _Technical-rWeb-javascript-classes:

Classes
^^^^^^^^^^^^^^

* ``Ajax-sequence.js`` is the main class to visualise the sequence detail. It gets JSON sequence information and regenerates HTML code that is displayed on the screen.

* ``Export.js`` is the key file for exporting sequences. First, the given result set is saved for later exports. Second, it creates a jQuery dialog in which the export result is shown. Finally, the export result is generated. At the moment, three formats are fully working: CSV and JSON (which contain all information visible on the screen) and the FASTA/dot-paren format.

* ``TaxonomyBrowser.js`` displays an interactive dropdown-based browser of the taxonomic tree - it asynchronously loads data for each taxonomy level and displays a new dropdown for each sublevel.

* ``UI.js`` works with the user interface. The UI class has three main responsibilities: working with the form (hiding and showing the form, duplicating and deleting input fields), working with the result set (showing and hiding sequences, loading more results via an HTTP request and sending export requests) and switching tools on and off.


.. _Technical-rWeb-javascript-langs:

Langs
^^^^^^^^^^^^^^

This directory contains two files, ``Search.js`` and ``Predict.js``, which provide help texts for the user. Some sequence attributes might not be clear to a user - in that case a question mark is shown together with an explanatory text (defined in these files).


.. _Technical-rWeb-javascript-lib:

Lib
^^^^^^^^^^^^^^

All used jQuery libraries can be found in this directory.

.. _Technical-rWeb-javascript-views:

Views
^^^^^^^^^^^^^^

* ``Predict.js`` is the main file for the predict page. Its task is to delegate work to the previously mentioned files - form handling, sending HTTP requests and loading tools.

* ``Search.js`` is very similar to ``Predict.js`` and performs the same tasks. In addition, it loads more results for the submitted form input and displays them. More results can be loaded by clicking on the button or by reaching the bottom of the page.

.. _Technical-rWeb-codingstandards:

Coding standards
=====================

The whole application follows the `Nette Coding standards <http://doc.nette.org/en/2.0/coding-standard>`_. However, this is a recommendation rather than a strict requirement. The application is documented according to `PHPDoc <http://www.phpdoc.org/>`_ conventions and uses `ApiGen <http://apigen.org/>`_ to build the reference manual.