1. Visual Analytics in omics - why, what, how?
Prof Jan Aerts
STADIUS - ESAT, Faculty of Engineering, University of Leuven, Belgium
Data Visualization Lab
jan.aerts@esat.kuleuven.be
jan@datavislab.org
creativecommons.org/licenses/by-nc/3.0/
2. • What problem are we trying to solve?
• What is Visual Analytics and how can it help?
• How do we actually do this?
• Some examples
• Challenges
4. hypothesis-driven -> data-driven
Scientific Research Paradigms (Jim Gray, Microsoft)
I have a hypothesis -> need to generate data to (dis)prove it.
I have data -> need to find hypotheses that I can test.
1st paradigm (thousands of years ago): empirical
2nd paradigm (hundreds of years ago): theoretical
3rd paradigm (last few decades): computational
4th paradigm (today): data exploration
5. What does this mean?
• immense re-use of existing datasets
• much of initial analysis is exploratory in nature => what’s my hypothesis?
• biologically interesting signals may be too poorly understood to be analyzed
in an automated fashion
• visualization is very effective in facilitating human reasoning about complex
data
• automated algorithms often act as black boxes => biologists must have blind
faith in the bioinformatician (and the bioinformatician in his/her own skills)
17. Why do we visualize data?
• record information
  • blueprints, photographs, seismographs, ...
• analyze data to support reasoning
  • develop & assess hypotheses
  • discover errors in data
  • expand memory
  • find patterns (see Snow’s cholera map)
• communicate information
  • share & persuade
  • collaborate & revise
19. Stevens’ psychophysical law
= proposed relationship between the magnitude of a physical stimulus and its
perceived intensity or strength
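The relationship is usually written as a power law, perceived = k · intensity^a. A minimal sketch of what that implies for visual encodings follows; the function name and the exponents are assumptions added for illustration (approximate values commonly quoted for length and area), not figures from these slides.

```python
def perceived_magnitude(intensity, exponent, k=1.0):
    """Stevens' power law: perceived = k * intensity ** exponent."""
    return k * intensity ** exponent


# Assumed, approximate exponents per visual channel (illustrative only).
EXPONENTS = {"length": 1.0, "area": 0.7}

for channel, a in EXPONENTS.items():
    # How much larger does a stimulus look when it is physically doubled?
    ratio = perceived_magnitude(2.0, a) / perceived_magnitude(1.0, a)
    print(f"{channel}: doubling the stimulus is perceived as x{ratio:.2f}")
```

Under these assumed exponents, doubling a length reads as a doubling, while doubling an area reads as only roughly 1.6 times larger, which is one reason length is a safer encoding for quantities than area.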
20. Accuracy of quantitative perceptual tasks
Mackinlay
what/where (qualitative) vs. how much (quantitative)
22. Accuracy of quantitative perceptual tasks
Mackinlay
“power of the plane”
what/where (qualitative) vs. how much (quantitative)
23. Pre-attentive vision
= ability of low-level human visual system to rapidly identify certain basic visual
properties
• some features “pop out”
• used for:
  • target detection
  • boundary detection
  • counting/estimation
  • ...
• visual system takes over => all cognitive power available for interpreting the
figure, rather than needing part of it for processing the figure
26. Limitations of pre-attentive vision
1. Combining pre-attentive features does not always work => the viewer has to
resort to “serial search” (most channel pairs; all channel triplets)
e.g. “is there a red square in this picture?”
2. Speed depends on the channel (use one that is good for categorical data;
see the discussion of “accuracy”)
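The red-square example can be reproduced with two small test images. The sketch below uses only the Python standard library to build SVG strings; the function names, colours, and counts are assumptions chosen for illustration. In the first display the target differs from the distractors in colour only and should pop out; in the second it is defined by a conjunction of colour and shape and typically needs a serial search.

```python
import random


def shape(kind, colour, x, y, r=8):
    """Return one SVG element: a square or a circle centred on (x, y)."""
    if kind == "square":
        return (f'<rect x="{x - r}" y="{y - r}" width="{2 * r}" '
                f'height="{2 * r}" fill="{colour}"/>')
    return f'<circle cx="{x}" cy="{y}" r="{r}" fill="{colour}"/>'


def search_display(conjunction, n_distractors=30, size=300, seed=0):
    """Build an SVG with exactly one red square among distractors.

    conjunction=False: distractors are blue squares (colour alone marks the target).
    conjunction=True:  distractors are red circles and blue squares, so the
                       target is only identified by colour AND shape together.
    """
    rng = random.Random(seed)

    def position():
        return rng.randint(10, size - 10), rng.randint(10, size - 10)

    items = []
    for i in range(n_distractors):
        if conjunction and i % 2 == 0:
            kind, colour = "circle", "red"
        else:
            kind, colour = "square", "blue"
        items.append(shape(kind, colour, *position()))
    items.append(shape("square", "red", *position()))  # the target
    rng.shuffle(items)
    body = "\n".join(items)
    return (f'<svg xmlns="http://www.w3.org/2000/svg" '
            f'width="{size}" height="{size}">\n{body}\n</svg>')


popout_svg = search_display(conjunction=False)
conjunction_svg = search_display(conjunction=True)
```

Writing each string to a `.svg` file and opening it in a browser lets you compare how quickly the red square is found in the two displays. Shapes are placed at random and may overlap; that is acceptable for a quick demonstration.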
28. Gestalt laws - interplay between parts and the whole
• simplicity
• proximity
• similarity
• connectedness
• good continuation
• common fate
• familiarity
• symmetry
38. To use vega
• Create the json file
• Create the index.html
• Run “python -m SimpleHTTPServer” (Python 2; with Python 3, run “python -m http.server”)
• Go to http://127.0.0.1:8000/index.html
• Get help at https://github.com/trifacta/vega/wiki
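The first two steps can be scripted. The sketch below writes a small bar-chart spec and a page that loads it. Everything in it is an assumption added for illustration: the file name spec.json, the sample data, the helper name, and in particular the spec syntax, which follows current Vega 5 loaded through vega-embed from a CDN and may differ from the Vega release these slides were written for.

```python
import json
from pathlib import Path

# Illustrative Vega 5 bar-chart spec; the data values are made up.
SPEC = {
    "$schema": "https://vega.github.io/schema/vega/v5.json",
    "width": 400,
    "height": 200,
    "data": [{"name": "table", "values": [
        {"category": "A", "amount": 28},
        {"category": "B", "amount": 55},
        {"category": "C", "amount": 43},
    ]}],
    "scales": [
        {"name": "x", "type": "band", "range": "width", "padding": 0.1,
         "domain": {"data": "table", "field": "category"}},
        {"name": "y", "type": "linear", "range": "height", "nice": True,
         "domain": {"data": "table", "field": "amount"}},
    ],
    "axes": [
        {"orient": "bottom", "scale": "x"},
        {"orient": "left", "scale": "y"},
    ],
    "marks": [{
        "type": "rect",
        "from": {"data": "table"},
        "encode": {"enter": {
            "x": {"scale": "x", "field": "category"},
            "width": {"scale": "x", "band": 1},
            "y": {"scale": "y", "field": "amount"},
            "y2": {"scale": "y", "value": 0},
        }},
    }],
}

# Page that fetches spec.json and renders it with vega-embed (assumed CDN URLs).
INDEX_HTML = """<!DOCTYPE html>
<html>
<head>
  <script src="https://cdn.jsdelivr.net/npm/vega@5"></script>
  <script src="https://cdn.jsdelivr.net/npm/vega-lite@5"></script>
  <script src="https://cdn.jsdelivr.net/npm/vega-embed@6"></script>
</head>
<body>
  <div id="vis"></div>
  <script>vegaEmbed('#vis', 'spec.json');</script>
</body>
</html>
"""


def write_vega_files(directory):
    """Write spec.json and index.html into `directory` and return its path."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    (directory / "spec.json").write_text(json.dumps(SPEC, indent=2))
    (directory / "index.html").write_text(INDEX_HTML)
    return directory
```

After calling `write_vega_files(".")`, start the web server in that directory and open index.html as described above. The page has to be served over HTTP because the browser fetches spec.json at load time.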
43. ParCoord
Boogaerts T et al. IEEE International Conference on
Bioinformatics & Bioengineering (2012)
Thomas Boogaerts
Endeavour gene prioritization
44. Data filtering (visual parameter setting)
TrioVis
Ryo Sakai
Sakai R et al. Bioinformatics (2013)
45. User-guided analysis
Spark
Nielsen et al. Genome Research (2012)
[Figure labels: clustering, chromatin modification, DNA methylation, RNA-Seq, data samples, regions of interest]