INSaFLU | Innovation and Entrepreneurship Report
1. Luís Rita
Innovation and Entrepreneurship Report
Introduction
Alongside my master's thesis, Community Finding with Applications on Phylogenetic Networks, in which a set
of visualization and analysis tools was developed, I completed an internship at Instituto Nacional de
Saúde Doutor Ricardo Jorge (INSA). Some of the tools implemented during the thesis will soon be introduced in
INSaFLU, a web application developed at this institution. The app and the developed modules are
described below.
INSaFLU
INSaFLU (insaflu.insa.pt) is a web-based bioinformatics tool that aims to enhance the laboratory
surveillance of the flu worldwide. It handles primary data (reads) from Next Generation
Sequencing methods and returns influenza type, sub-type, gene and whole-genome consensus
sequences, variant and minor variant annotations, alignments and phylogenetic trees of the influenza
virus (Borges et al., 2018). By allowing advanced, multi-step software analyses in a user-friendly
manner, INSaFLU greatly facilitates whole-genome-based analyses of the influenza virus, thus
helping to reinforce the routine laboratory surveillance of this important human pathogen.
During the internship of the Master in Technological Innovation in Health, the module generating
phylogenetic trees was upgraded with tools that enhance the representation of the strains' evolutionary path
and their respective metadata. A geographical map section was also developed, so that existing patterns in the
spatial and temporal distribution of influenza can now be tracked.
Phylogenetic Tree
The user can represent the evolutionary relationships of different influenza strains, based on each viral segment
or on the whole genome of the virus (8 segments), in a rectangular tree.
In the new module (Figure 1), the user can choose the shape of the tree (radial, rectangular,
diagonal, hierarchical or circular); adjust specific features of the tree to enhance
visual analysis (node size, label size and line width); and visualize metadata by coloring the nodes of the
tree or by displaying colored blocks next to the leaves (with or without the identifying labels).
The colors are generated automatically, but the user can modify them by selecting one from a color palette.
A legend containing all metadata categories, along with the respective values and generated colors,
was added. Finally, additional options were included: reset modifications, keep a dropdown fixed in place to
facilitate analysis, expand hidden fields or collapse visible ones, and a toggle button to hide or show all
controls.
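The automatic color generation and manual override described above could work along the following lines. This is only a minimal sketch under stated assumptions, not the INSaFLU implementation: the function names assign_colors and override_color, the even spacing of hues, and the saturation and brightness values are all chosen here for illustration.

```python
import colorsys


def assign_colors(values):
    """Map each distinct metadata value to a hex color with evenly spaced hues.

    Assumption: hues are spread evenly around the color wheel; the real
    module may generate its colors differently.
    """
    distinct = sorted(set(values))
    colors = {}
    for i, value in enumerate(distinct):
        hue = i / len(distinct)
        r, g, b = colorsys.hsv_to_rgb(hue, 0.65, 0.9)
        colors[value] = "#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)
        )
    return colors


def override_color(colors, value, hex_color):
    """Return a copy of the mapping with one value recolored by the user."""
    updated = dict(colors)
    updated[value] = hex_color
    return updated
```

The resulting mapping can serve both to color the nodes and to fill the legend, so the two always agree.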
Geographical Map
This module is intended to help the user analyze the spatial and/or temporal distribution patterns of influenza (Figure 2).
It uses latitude and longitude values to place the points on the map. If one of the
following fields is present, the timeline is activated: onset date (YYYY-MM-DD), collection date (YYYY-MM-DD),
lab reception date (YYYY-MM-DD) or year (YYYY).
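The rule for activating the timeline can be sketched as follows. The field keys, the order in which the date fields are tried, and the choice of 1 January for year-only samples are assumptions made for this example; the report only states which fields and formats are recognized.

```python
from datetime import datetime

# Hypothetical keys: the report names the fields but not how they are stored.
DATE_FIELDS = ("onset date", "collection date", "lab reception date")
YEAR_FIELD = "year"


def sample_date(sample):
    """Return the first usable date found in a sample, or None."""
    for field in DATE_FIELDS:
        raw = sample.get(field)
        if raw:
            try:
                return datetime.strptime(str(raw), "%Y-%m-%d").date()
            except ValueError:
                continue  # malformed value: try the next field
    raw_year = sample.get(YEAR_FIELD)
    if raw_year:
        try:
            # Assumption: a year-only sample is placed on 1 January.
            return datetime.strptime(str(raw_year), "%Y").date()
        except ValueError:
            return None
    return None


def timeline_enabled(samples):
    """The timeline is shown when at least one sample carries a date."""
    return any(sample_date(s) is not None for s in samples)
```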
In the style button, a scale factor regulating the size of the points (proportional to the number of
cases in a given location) can be adjusted. Logarithmic and constant size options are also available.
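One way to express the three sizing options is shown below. It is a sketch only: the function name point_size, the mode names, and the exact logarithmic formula are assumptions, and the report does not say whether the scaled quantity is the radius or the area of a point.

```python
import math


def point_size(cases, scale=1.0, mode="proportional"):
    """Size of a map point for a location with a given number of cases.

    Assumed modes: 'proportional' (size grows linearly with cases),
    'logarithmic' (growth is damped for large counts) and 'constant'
    (every point has the same size).
    """
    if cases < 1:
        raise ValueError("a plotted location needs at least one case")
    if mode == "proportional":
        return scale * cases
    if mode == "logarithmic":
        # 1 + ln(cases) keeps a single-case location at exactly `scale`.
        return scale * (1 + math.log(cases))
    if mode == "constant":
        return scale
    raise ValueError(f"unknown mode: {mode}")
```

The logarithmic option is useful when a few locations have far more cases than the rest, since their points would otherwise hide their neighbours.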
In the metadata button, the user chooses the category used to represent each node as a pie chart, with the
area of each slice proportional to the number of cases of each metadata value within a given location. The map
button includes several labelled and unlabelled tiles that can be used to personalize the map view. Finally,
a toggle button was added to ensure clean screen captures.
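The pie chart fractions for each location can be computed by counting the values of the chosen category among the samples that share the same coordinates. The sketch below assumes each sample is a dictionary with hypothetical keys "latitude" and "longitude"; the function name and the "unknown" fallback are also assumptions.

```python
from collections import Counter, defaultdict


def pie_fractions(samples, category):
    """Return, per location, the share of cases for each category value."""
    counts = defaultdict(Counter)
    for sample in samples:
        location = (sample["latitude"], sample["longitude"])
        # Samples lacking the category are grouped under "unknown".
        counts[location][sample.get(category, "unknown")] += 1
    fractions = {}
    for location, counter in counts.items():
        total = sum(counter.values())
        fractions[location] = {
            value: n / total for value, n in counter.items()
        }
    return fractions
```

Each inner dictionary sums to 1, so its values can be used directly as slice proportions.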
The Phylogenetic Tree and Geographical Map modules are interconnected (when a branch is selected in the tree, the
respective samples are highlighted on the map), thus enabling more detailed phylogeographical analyses.
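The link between the two modules amounts to collecting the leaves under the selected branch and looking up their coordinates. The following sketch assumes the tree is held as nested dictionaries with "name" and "children" keys and that coordinates are stored per sample name; both representations are hypothetical and chosen only to illustrate the idea.

```python
def leaves_under(node):
    """Collect the names of all leaves below (or at) a tree node."""
    children = node.get("children", [])
    if not children:
        return [node["name"]]
    leaves = []
    for child in children:
        leaves.extend(leaves_under(child))
    return leaves


def highlighted_locations(branch, coordinates):
    """Map locations to highlight when a branch is selected in the tree.

    `coordinates` maps a sample name to a (latitude, longitude) pair;
    samples without known coordinates are skipped.
    """
    return {
        coordinates[name]
        for name in leaves_under(branch)
        if name in coordinates
    }
```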
Conclusion
The INSaFLU Phylogenetic Tree module was upgraded and a new module, Geographical Map, was developed.
The first reinforces the capacity to monitor influenza virus evolution and routes of transmission, while
also enabling a more robust and integrative analysis of the associated metadata. The second enables the
user not only to track the geographical spread of influenza virus, but also to link it with the patterns of virus
diversification (Phylogenetic Tree module) and time distribution.
With the implementation of both modules in INSaFLU, the web application will provide a more detailed
and user-friendly visualization of epidemiological and patient data, which is key to better interpreting and taking
advantage of the genome-scale analysis of influenza virus. In this context, it is expected that INSA and
other public health laboratories performing laboratory surveillance of influenza virus with INSaFLU will
greatly benefit from the implemented tools, to the advantage of public health. From another perspective, the way
the novel tools were designed will facilitate the quick generation of data-rich dynamic figures that can
be used not only for surveillance purposes, but also for scientific communication, for example in
scientific papers.
Finally, it is noteworthy that there are already plans to make these two modules available separately
(as a standalone website), so that they can be used by any surveillance laboratory focused on the
influenza virus or other pathogenic organisms.
Figure 1 Phylogenetic Tree module along with the toggle, shape, style, metadata, and legend buttons. The location of the selected strains (blue nodes) is highlighted in Figure 2.
Figure 2 Geographical Map module along with the timeline, toggle, style, metadata, map and zoom buttons. Locations are highlighted (black circles with an opaque border) based on the influenza strains that were selected in Figure 1 (blue nodes).