O slideshow foi denunciado.
Utilizamos seu perfil e dados de atividades no LinkedIn para personalizar e exibir anúncios mais relevantes. Altere suas preferências de anúncios quando desejar.

Principles of Data Visualization

3.038 visualizações

Publicada em

CERN School of Computing Talk - Eamonn Maguire

Publicada em: Dados e análise

Principles of Data Visualization

  1. 1. Eamonn Maguire, DPhil Principles of Data Visualization iCSC, CERN March 2017
  2. 2. The role of visualization systems is to provide visual representations of datasets that help people carry out tasks more effectively. Visualization Tamara Munzner A Visualization should 1. Save time 2. Have a clear purpose* 3. Include only the relevant content* 4. Encodes data/information appropriately * from Noel Illinsky, http://complexdiagrams.com/
  3. 3. The role of visualization systems is to provide visual representations of datasets that help people carry out tasks more effectively. Visualization is suitable when there is a need to augment human capabilities rather than replace people with computational decision-making methods. Visualization Tamara Munzner A Visualization should 1. Save time 2. Have a clear purpose* 3. Include only the relevant content* 4. Encodes data/information appropriately * from Noel Illinsky, http://complexdiagrams.com/
  4. 4. Course outcomes The what? Major data types and classifications of them The why? Why are we visualising at all? The how? How can we visualize? What archetypes can we use to guide us? Finally: case study? Given some data, how can we go about visualising it?
  5. 5. A lot of the content for this introduction comes from this book from Prof. Tamara Munzner (UBC, Vancouver, Canada) which I created the illustrations for. If you’re interested in learning more, it’s a great book to check out from the CERN library, or buy :)
  6. 6. The role of visualization systems is to provide visual representations of datasets that help people carry out tasks more effectively. External representation: replace cognition with perception Visualization
  7. 7. The role of visualization systems is to provide visual representations of datasets that help people carry out tasks more effectively. External representation: replace cognition with perception Visualization
  8. 8. Cerebral:Visualizing Multiple Experimental Conditions on a Graph with Biological Context. Barsky, Munzner, Gardy, and Kincaid. IEEETVCG (Proc. InfoVis) 14(6):1253-1260, 2008.] The role of visualization systems is to provide visual representations of datasets that help people carry out tasks more effectively. External representation: replace cognition with perception Visualization
  9. 9. The statistics would lead us to believing that everything is the same The role of visualization systems is to provide visual representations of datasets that help people carry out tasks more effectively. Why visualize?
  10. 10. A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009). Analysis framework: Four levels, three questions Data/task abstraction Visual encoding/interaction idiom Algorithm Domain situation
  11. 11. • Domain situation A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009). Analysis framework: Four levels, three questions Data/task abstraction Visual encoding/interaction idiom Algorithm Domain situation
  12. 12. • Domain situation – who are the target users? A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009). Analysis framework: Four levels, three questions Data/task abstraction Visual encoding/interaction idiom Algorithm Domain situation
  13. 13. • Domain situation – who are the target users? • Data/Task Abstraction A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009). Analysis framework: Four levels, three questions Data/task abstraction Visual encoding/interaction idiom Algorithm Domain situation
  14. 14. • Domain situation – who are the target users? • Data/Task Abstraction – translate from specifics of domain to vocabulary of vis A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009). Analysis framework: Four levels, three questions Data/task abstraction Visual encoding/interaction idiom Algorithm Domain situation
  15. 15. • Domain situation – who are the target users? • Data/Task Abstraction – translate from specifics of domain to vocabulary of vis • What is shown? Data abstraction A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009). A Multi-Level Typology of Abstract Visualization Tasks Brehmer and Munzner. IEEE TVCG 19(12):2376-2385, 2013 (Proc. InfoVis 2013). Analysis framework: Four levels, three questions Data/task abstraction Visual encoding/interaction idiom Algorithm Domain situation
  16. 16. • Domain situation – who are the target users? • Data/Task Abstraction – translate from specifics of domain to vocabulary of vis • What is shown? Data abstraction • Why is the user looking at it? Task abstraction A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009). A Multi-Level Typology of Abstract Visualization Tasks Brehmer and Munzner. IEEE TVCG 19(12):2376-2385, 2013 (Proc. InfoVis 2013). Analysis framework: Four levels, three questions Data/task abstraction Visual encoding/interaction idiom Algorithm Domain situation
  17. 17. • Domain situation – who are the target users? • Data/Task Abstraction – translate from specifics of domain to vocabulary of vis • What is shown? Data abstraction • Why is the user looking at it? Task abstraction • Visual Encoding A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009). A Multi-Level Typology of Abstract Visualization Tasks Brehmer and Munzner. IEEE TVCG 19(12):2376-2385, 2013 (Proc. InfoVis 2013). Analysis framework: Four levels, three questions Data/task abstraction Visual encoding/interaction idiom Algorithm Domain situation
  18. 18. • Domain situation – who are the target users? • Data/Task Abstraction – translate from specifics of domain to vocabulary of vis • What is shown? Data abstraction • Why is the user looking at it? Task abstraction • Visual Encoding • How is it shown? A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009). A Multi-Level Typology of Abstract Visualization Tasks Brehmer and Munzner. IEEE TVCG 19(12):2376-2385, 2013 (Proc. InfoVis 2013). Analysis framework: Four levels, three questions Data/task abstraction Visual encoding/interaction idiom Algorithm Domain situation
  19. 19. • Domain situation – who are the target users? • Data/Task Abstraction – translate from specifics of domain to vocabulary of vis • What is shown? Data abstraction • Why is the user looking at it? Task abstraction • Visual Encoding • How is it shown? • visual encoding: how to draw A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009). A Multi-Level Typology of Abstract Visualization Tasks Brehmer and Munzner. IEEE TVCG 19(12):2376-2385, 2013 (Proc. InfoVis 2013). Analysis framework: Four levels, three questions Data/task abstraction Visual encoding/interaction idiom Algorithm Domain situation
  20. 20. • Domain situation – who are the target users? • Data/Task Abstraction – translate from specifics of domain to vocabulary of vis • What is shown? Data abstraction • Why is the user looking at it? Task abstraction • Visual Encoding • How is it shown? • visual encoding: how to draw • interaction: how to manipulate A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009). A Multi-Level Typology of Abstract Visualization Tasks Brehmer and Munzner. IEEE TVCG 19(12):2376-2385, 2013 (Proc. InfoVis 2013). Analysis framework: Four levels, three questions Data/task abstraction Visual encoding/interaction idiom Algorithm Domain situation
  21. 21. • Domain situation – who are the target users? • Data/Task Abstraction – translate from specifics of domain to vocabulary of vis • What is shown? Data abstraction • Why is the user looking at it? Task abstraction • Visual Encoding • How is it shown? • visual encoding: how to draw • interaction: how to manipulate • Algorithm A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009). A Multi-Level Typology of Abstract Visualization Tasks Brehmer and Munzner. IEEE TVCG 19(12):2376-2385, 2013 (Proc. InfoVis 2013). Analysis framework: Four levels, three questions Data/task abstraction Visual encoding/interaction idiom Algorithm Domain situation
  22. 22. • Domain situation – who are the target users? • Data/Task Abstraction – translate from specifics of domain to vocabulary of vis • What is shown? Data abstraction • Why is the user looking at it? Task abstraction • Visual Encoding • How is it shown? • visual encoding: how to draw • interaction: how to manipulate • Algorithm – efficient computation, layout algorithms etc. A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009). A Multi-Level Typology of Abstract Visualization Tasks Brehmer and Munzner. IEEE TVCG 19(12):2376-2385, 2013 (Proc. InfoVis 2013). Analysis framework: Four levels, three questions Data/task abstraction Visual encoding/interaction idiom Algorithm Domain situation
  23. 23. What are you visualising?
  24. 24. The branches of data visualization Information Visualization Scientific Visualization Position is given. Also medical visualizations Position is derived. Incl. GeoVis
  25. 25. The branches of data visualization Information Visualization Scientific Visualization Position is given. Also medical visualizations Position is derived. Incl. GeoVis
  26. 26. Why are you visualising this?
  27. 27. Discover Finding new insights in your data Implies a level of interactivity to query, compare, correlate etc.
  28. 28. Discover Finding new insights in your data This is typically where one should be careful in how information is presented. An erroneous data encoding could bring about wrong conclusions. Implies a level of interactivity to query, compare, correlate etc.
  29. 29. Discover Finding new insights in your data This is typically where one should be careful in how information is presented. An erroneous data encoding could bring about wrong conclusions. Implies a level of interactivity to query, compare, correlate etc.
  30. 30. Present Presenting your results, e.g. for a paper
  31. 31. Present Presenting your results, e.g. for a paper
  32. 32. Present Presenting your results, e.g. for a paper
  33. 33. Present Presenting your results, e.g. for a paper
  34. 34. Present Presenting your results, e.g. for a paper
  35. 35. Present Presenting your results, e.g. for a paper
  36. 36. Enjoy Infographics, art, or superfluous visualizations.
  37. 37. Enjoy Infographics, art, or superfluous visualizations. Bear in mind that this was a non-interactive figure in a paper
  38. 38. Enjoy Infographics, art, or superfluous visualizations. Bear in mind that this was a non-interactive figure in a paper
  39. 39. How can you encode information optimally?
  40. 40. How can you encode information optimally?
  41. 41. How can you encode information optimally? JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 2 10 4 5 6 9 1 3 5 3 4 7
  42. 42. How can you encode information optimally? JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 2 10 4 5 6 9 1 3 5 3 4 7 0 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 5 10 Scatter
  43. 43. How can you encode information optimally? JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 2 10 4 5 6 9 1 3 5 3 4 7 0 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 Scatter Line
  44. 44. How can you encode information optimally? JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 2 10 4 5 6 9 1 3 5 3 4 7 0 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 Scatter Line Histogram
  45. 45. How can you encode information optimally? JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 2 10 4 5 6 9 1 3 5 3 4 7 0 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 Scatter Line Histogram Area JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10
  46. 46. How can you encode information optimally? JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 2 10 4 5 6 9 1 3 5 3 4 7 0 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC Scatter Line Histogram Area Size JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10
  47. 47. How can you encode information optimally? JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 2 10 4 5 6 9 1 3 5 3 4 7 0 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC Scatter Line Histogram Area Size Saturation JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10
  48. 48. How can you encode information optimally? JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 2 10 4 5 6 9 1 3 5 3 4 7 0 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC Scatter Line Histogram Area Size Saturation Size & Saturation JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10
  49. 49. How can you encode information optimally? JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 2 10 4 5 6 9 1 3 5 3 4 7 0 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC Scatter Line Histogram Area Size Saturation Size & Saturation JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 0 5 10 Size, Saturation, & Position JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC 5
  50. 50. And that’s just a really simple low dimensional example Moreover, all of these visualizations encode the information, but the decode error (interpreting, comparing, …) for each is different So, why?
  51. 51. Stevens, 1975 Our perception system does not behave linearly. Some stimuli are perceived less or more than intended.
  52. 52. We have to be careful when mapping data to the visual world Some visual channels are more effective for some data types over others.
  53. 53. We have to be careful when mapping data to the visual world Some visual channels are more effective for some data types over others. Some data has a natural mapping that our brains expect given certain types of data
  54. 54. Natural Mappings
  55. 55. We have to be careful when mapping data to the visual world Some visual channels are more effective for some data types over others. Some data has a natural mapping that our brains expect given certain types of data There are many intricacies of the visual system that must be considered
  56. 56. The pop-out effect • Parallel processing on many individual channels – speed independent of distractor count – speed depends on channel and amount of difference from distractors • Serial search for (almost all) combinations – speed depends on number of distractors We pre-attentively process a scene, and some visual elements stand out more than others.
  57. 57. The pop-out effect • Parallel processing on many individual channels – speed independent of distractor count – speed depends on channel and amount of difference from distractors • Serial search for (almost all) combinations – speed depends on number of distractors We pre-attentively process a scene, and some visual elements stand out more than others.
  58. 58. The pop-out effect • Parallel processing on many individual channels – speed independent of distractor count – speed depends on channel and amount of difference from distractors • Serial search for (almost all) combinations – speed depends on number of distractors We pre-attentively process a scene, and some visual elements stand out more than others.
  59. 59. The pop-out effect • Parallel processing on many individual channels – speed independent of distractor count – speed depends on channel and amount of difference from distractors • Serial search for (almost all) combinations – speed depends on number of distractors We pre-attentively process a scene, and some visual elements stand out more than others.
  60. 60. The pop-out effect • Parallel processing on many individual channels – speed independent of distractor count – speed depends on channel and amount of difference from distractors • Serial search for (almost all) combinations – speed depends on number of distractors We pre-attentively process a scene, and some visual elements stand out more than others.
  61. 61. Not all exhibit the pop-out effect!
  62. 62. Not all exhibit the pop-out effect! Parallel line pairs do not pop out from tilted pairs…
  63. 63. Not all exhibit the pop-out effect! Parallel line pairs do not pop out from tilted pairs… And not all visual channels pop out as quickly as other. E.g. colour is always on top.
  64. 64. Relative Comparison
  65. 65. Relative Comparison
  66. 66. Relative Comparison
  67. 67. 36px Relative Comparison
  68. 68. Relative Comparison 4 values Unordered Unaligned
  69. 69. Relative Comparison 4 values Unordered Unaligned 11 values Unordered Unaligned
  70. 70. 4 values UnorderedAligned Relative Comparison
  71. 71. 4 values UnorderedAligned 8 values UnorderedAligned Relative Comparison
  72. 72. 8 values 20 values Relative Comparison
  73. 73. Known Target Search Colour Variety Unknown Target Search Grouped Random Grouped Random Random Grouped Target shown before hand (known) or not shown (unknown). The unique colour here is the orange square. A) Known and Unknown Target Search C) Response Time and Accuracy Results Known Target Search Colour Variety Unknown Target Search Grouped Random Grouped Random Subitizing Grouped Random Random Grouped Which grid has more colours? 7 8 B) Subitizing (how many colours?) How Capacity Limits of Attention Influence Information Visualization Effectiveness. Haroz S. and Whitney D., IEEE TVCG 2012
  74. 74. Known Target Search Colour Variety Unknown Target Search Grouped Random Grouped Random Random Grouped Target shown before hand (known) or not shown (unknown). The unique colour here is the orange square. A) Known and Unknown Target Search C) Response Time and Accuracy Results Known Target Search Colour Variety Unknown Target Search Grouped Random Grouped Random Subitizing Grouped Random Random Grouped Which grid has more colours? 7 8 B) Subitizing (how many colours?) How Capacity Limits of Attention Influence Information Visualization Effectiveness. Haroz S. and Whitney D., IEEE TVCG 2012
  75. 75. Known Target Search Colour Variety Unknown Target Search Grouped Random Grouped Random Random Grouped Target shown before hand (known) or not shown (unknown). The unique colour here is the orange square. A) Known and Unknown Target Search C) Response Time and Accuracy Results Known Target Search Colour Variety Unknown Target Search Grouped Random Grouped Random Subitizing Grouped Random Random Grouped Which grid has more colours? 7 8 B) Subitizing (how many colours?) How Capacity Limits of Attention Influence Information Visualization Effectiveness. Haroz S. and Whitney D., IEEE TVCG 2012
  76. 76. Known Target Search Colour Variety Unknown Target Search Grouped Random Grouped Random Random Grouped Target shown before hand (known) or not shown (unknown). The unique colour here is the orange square. A) Known and Unknown Target Search C) Response Time and Accuracy Results Known Target Search Colour Variety Unknown Target Search Grouped Random Grouped Random Subitizing Grouped Random Random Grouped Which grid has more colours? 7 8 B) Subitizing (how many colours?) How Capacity Limits of Attention Influence Information Visualization Effectiveness. Haroz S. and Whitney D., IEEE TVCG 2012
  77. 77. A. Law of Closure B. Law of Similarity D. Law of Connectedness E. Law of Symmetry [ ] { } ( ) C. Law of Proximity F. Law of Good Continuation G. Contour Saliency a c b a c b d H. Law of Common Fate I. Law of Past Experience J. Law of Pragnanz K. Figure/Ground Gestalt Laws
  78. 78. Dimension X A and B have the same width. However B and C are perceived more alike even though they are different widths and heights. Width and Height are integral dimensions Dimension X Width Integral/Separable Dimensions
  79. 79. Dimension X A and B have the same width. However B and C are perceived more alike even though they are different widths and heights. Width and Height are integral dimensions Dimension X Width Dimension X Colour A and B have the same colour and are perceived more similar. Colour and Height are Separable dimensions Integral/Separable Dimensions
  80. 80. Fully separableFully Integral Dimension X Width Dimension X Orientation Dimension X Colour Dimension X Motion Dimension X Motion Dimension X Colour Integral/Separable Dimensions
  81. 81. We don’t see in 3D, and we have difficulties interpreting information on the Z-axis. We have to be careful when mapping data to the visual world Some visual channels are more effective for some data types over others. Some data has a natural mapping that our brains expect given certain types of data There are many visual tricks that can be observed due to how the visual system works
  82. 82. 2D always wins… Our visual system is not good at interpreting information on the z-axis. *3D is normally only used for exploration of inherently 3D information, such as medical imaging data…
  83. 83. These options, taken randomly from google image searches so how widely 3D is abused in information visualization. All of these charts are manipulating our perception of the data by using the Z axis to occlude information…it would be avoided in 2D. 2D always wins…
  84. 84. 2D always wins…
  85. 85. 2D always wins…
  86. 86. http://cms-results.web.cern.ch/cms-results/public-results/preliminary-results/BPH-14-008/index.html 2D always wins…
  87. 87. We don’t see in 3D, and we have difficulties interpreting information on the Z-axis. We have to be careful when mapping data to the visual world Some visual channels are more effective for some data types over others. Some data has a natural mapping that our brains expect given certain types of data There are many visual tricks that can be observed due to how the visual system works Colour
  88. 88. The simplest, yet most abused of all visual encodings. The problem is that a smooth step in a value does not equate to a smooth colour transition… Colour
  89. 89. Additionally, colour is not equally binned in reality. We perceive colours differently due to an increased sensitivity to the yellow part of the spectrum… Wavelength (nm) IRUV Visible Spectrum
  90. 90. https://mycarta.wordpress.com/2012/10/06/the-rainbow-is-deadlong-live-the-rainbow-part-3/ Luminosity is also not stable across the colours, meaning some colours will pop out more than others… and not always intentionally.
  91. 91. https://mycarta.wordpress.com/2012/10/06/the-rainbow-is-deadlong-live-the-rainbow-part-3/ Luminosity is also not stable across the colours, meaning some colours will pop out more than others… and not always intentionally.
  92. 92. Gregory compared the wavelength of light with the smallest observable difference in hue (expressed as wavelength difference) And how we perceive changes in hue is also very different.
  93. 93. Is this a good visualization?
  94. 94. Is this a good visualization? But, grayscale would be just as useful and less visually distracting.
  95. 95. Is there a colour palette for scientific visualization that works?
  96. 96. HSL linear L rainbow palette Kindlmann, G. Reinhard, E. and Creem, S., 2002, Face-based Luminance Matching for Perceptual Colormap Generation, IEEE Proceedings of the conference on Visualization ’02 https://mycarta.wordpress.com/2012/10/06/the-rainbow-is-deadlong-live-the-rainbow-part-3/
  97. 97. But there are some in CMS that are already moving away from the rainbow http://cms-results.web.cern.ch/cms-results/public-results/publications/ JME-13-004/index.html
  98. 98. Is there a colour palette for scientific visualization that works?
  99. 99. HSL linear L rainbow palette Kindlmann, G. Reinhard, E. and Creem, S., 2002, Face-based Luminance Matching for Perceptual Colormap Generation, IEEE Proceedings of the conference on Visualization ’02 https://mycarta.wordpress.com/2012/10/06/the-rainbow-is-deadlong-live-the-rainbow-part-3/
  100. 100. Binary Diverging Categorical Sequential Categorical Categorical There are also lots of default colour maps that can be applied to particular data types. http://colorbrewer2.org/
  101. 101. Semantic relevance Or just consistency When there are many colours for example, we find it difficult to remember abstract associations.
  102. 102. Selecting Semantically-Resonant Colors for Data Visualization Sharon Lin, Julie Fortuna, Chinmay Kulkarni, Maureen Stone, Jeffrey Heer Computer Graphics Forum (Proc. EuroVis), 2013 What are semantically resonant colours?
  103. 103. Semantic colouring is a good idea in theory, but there are limited areas where this really works. But, if you are going to use colour, have a consistent colour mapping. That way, the decoding time is less. Saving time… In general, CMS & other experiments are pretty good at this, but there are some examples…
  104. 104. http://cms-results.web.cern.ch/cms-results/public- results/preliminary-results/SUS-16-029/CMS-PAS- SUS-16-029_Figure_003.png
  105. 105. http://cms-results.web.cern.ch/cms-results/public- results/preliminary-results/SUS-16-029/CMS-PAS- SUS-16-029_Figure_003.png
  106. 106. What happens when we have a high number of dimensions? And that was just to represent a low number of dimensions
  107. 107. Parallel Coordinates Scatter Plot Matrices Glyphs Linked Plots Height Weight Cholesterol Multidimensional Visualization
  108. 108. Scatter Plot Matrices … Height Weight CholName 1.76 63 4.5John 1.79 70 4.15Mike 1.61 60 6.7Jim 1.84 90 5.03Francois Multidimensional Visualization
  109. 109. Scatter Plot Matrices … Height Weight CholesterolHeight Weight CholName 1.76 63 4.5John 1.79 70 4.15Mike 1.61 60 6.7Jim 1.84 90 5.03Francois Multidimensional Visualization
  110. 110. Scatter Plot Matrices … Height Weight CholesterolHeight Weight CholName 1.76 63 4.5John 1.79 70 4.15Mike 1.61 60 6.7Jim 1.84 90 5.03Francois 1.76 Multidimensional Visualization
  111. 111. Scatter Plot Matrices … Height Weight CholesterolHeight Weight CholName 1.76 63 4.5John 1.79 70 4.15Mike 1.61 60 6.7Jim 1.84 90 5.03Francois 1.76 63 1.76 4.5 Multidimensional Visualization
  112. 112. Scatter Plot Matrices … Height Weight CholesterolHeight Weight CholName 1.76 63 4.5John 1.79 70 4.15Mike 1.61 60 6.7Jim 1.84 90 5.03Francois 1.76 63 1.76 4.5 Height Weight Cholesterol Multidimensional Visualization
  113. 113. Visual Exploration of Large Structured Datasets. Wills. Proc. New Techniques and Trends in Statistics (NTTS), pp. 237–246. IOS Press, 1995. Linked Plots Multidimensional Visualization
  114. 114. When one visualization won’t cut it… Multidimensional Visualization
  115. 115. With dc.js, crossfilter, and d3.js Multidimensional Visualization
  116. 116. With dc.js, crossfilter, and d3.js Multidimensional Visualization
  117. 117. With dc.js, crossfilter, and d3.js Multidimensional Visualization
  118. 118. With dc.js, crossfilter, and d3.js Multidimensional Visualization
  119. 119. With dc.js, crossfilter, and d3.js Multidimensional Visualization
  120. 120. My Tutorial on Creating Dashboard Visualizations https://thor-project.github.io/dashboard-tutorial/
  121. 121. Parallel Coordinate Plots Positive Correlation Negative (inverse) Correlation No Correlation Multidimensional Visualization
  122. 122. Parallel Coordinate Plots Positive Correlation Negative (inverse) Correlation No Correlation Multidimensional Visualization
  123. 123. Lets take an example where we have many variables to display... Each user is represented by a circle user a user z user a user z 2 Dimensions 3 Dimensions Size indicates number of logins per day Parallel Coordinate Plots
  124. 124. As we get to higher levels of dimensions, we’ll have problems. Our choice of visual encoding will affect the visual availability of each dimension to the user. 4 Dimensions Color indicates users department 5 Dimensions Transparency indicates consistency in logins user a user z user a user z Parallel Coordinate Plots
  125. 125. Parallel coordinates are a visualization technique employed when a large number of dimensions need to be displayed (often without a temporal element) and where each of those dimensions can be equally important in the decision making process. Not so easy to spotEasy to see In the scatter plots here, it’s easy to see correlation between downloads and uploads, but with the other dimensions that’s difficult.
  126. 126. Positive Correlation Negative (inverse) Correlation No Correlation Parallel Coordinate Plots
  127. 127. 2 Dimensions Uploads Downloads Parallel Coordinate Plots
  128. 128. 2 Dimensions Uploads Downloads 1 User Parallel Coordinate Plots
  129. 129. 2 Variables Uploads Downloads 11 Users Parallel Coordinate Plots
  130. 130. 2 Dimensions Uploads Downloads 11 Users user a user z Parallel Coordinate Plots
  131. 131. 3 Dimensions Uploads Downloads Logins per day 11 Users Parallel Coordinate Plots
  132. 132. 4 Dimensions Uploads Downloads Logins per day Std. deviation in logins per day 11 Users 2 -2 Parallel Coordinate Plots
  133. 133. 5 Dimensions Uploads Downloads Logins per day Std. deviation in logins per day Department 11 Users We use color for department since it’s categorical information. 2 -2 We can keep adding more parallel lines, and comfortably have around 20 dimensions for many users displayed at once. Parallel Coordinate Plots
  134. 134. Clustering Outlier DetectionCorrelation Distribution Parallel coordinates provide an efficient way to visualize many variables, along with their associated clusters, anomalies, value distributions and correlations. Parallel Coordinate Plots
  135. 135. 83 • static item aggregation • task: find distribution • data: table • derived data – 4 quantitative attributes • median: central line • lower and upper quartile: boxes • lower upper fences: whiskers – outliers beyond fence cutoffs explicitly shown and kurtosis 20) and a bimo tion 0.31). Richer displays o multi-modality is particularly ! ! !! ! ! ! ! ! n s k mm!2024 !2024 Figure 4: From left to right: box Glyphs Multidimensional Visualization
  136. 136. Glyphs Multidimensional Visualization
  137. 137. A Simple Example | Student Test Results Multidimensional Visualization Math Physics English Religion Math Physics English Religion Table Scatter Plot Matrix Math Physics English Religion 85 90 65 50 40 95 80 50 40 60 71 60 90 95 80 65 50 90 80 90
  138. 138. 86 Math Physics English Religion 85 90 65 50 40 95 80 50 40 60 71 60 90 95 80 65 50 90 80 90 Table Parallel Coordinates Math Physics English Religion 100 90 80 70 60 50 40 30 20 10 0 A Simple Example | Student Test Results Multidimensional Visualization
  139. 139. 87 Math Physics English Religion 85 90 65 50 40 95 80 50 40 60 71 60 90 95 80 65 50 90 80 90 Table Glyph A Simple Example | Student Test Results Multidimensional Visualization Math Physics English Religion
  140. 140. 87 Math Physics English Religion 85 90 65 50 40 95 80 50 40 60 71 60 90 95 80 65 50 90 80 90 Table Glyph A Simple Example | Student Test Results Multidimensional Visualization Math Physics English Religion Teacher Arrange Spatially
  141. 141. What about topological data?
  142. 142. Graphs/Networks
  143. 143. Graphs/Networks Cerebral:Visualizing Multiple Experimental Conditions on a Graph with Biological Context. Barsky, Munzner, Gardy, and Kincaid. IEEETVCG (Proc. InfoVis) 14(6):1253-1260, 2008.]
  144. 144. Graphs/Networks https://cambridge-intelligence.com
  145. 145. Graphs/Networks https://cambridge-intelligence.com
  146. 146. Graphs/Networks Matrix Representations https://bost.ocks.org/mike/miserables/
  147. 147. Graphs/Networks Hive Plots http://jsfiddle.net/7a7b5dwp/
  148. 148. Graphs/Networks Hive Plots http://jsfiddle.net/7a7b5dwp/ http://jsfiddle.net/eamonnmag/vso70qnr/
  149. 149. Trees Cut Level 0.3 4 Clusters View as treemapDendrogram Search
  150. 150. • split by neighbourhood • then by type • colour by price • neighbourhood patterns – where it’s expensive – where you pay much more for detached type 94 Configuring Hierarchical Layouts to Address Research Questions. Slingsby, Dykes, andWood. IEEETransactions on Visualization and Computer Graphics (Proc. InfoVis 2009) 15:6 (2009), 977–984. Treemaps Partitioning
  151. 151. • switch order of splits – type then neighbourhood • switch colour – by price variation • type patterns – within specific type, which neighbourhoods are inconsistent 95 Configuring Hierarchical Layouts to Address Research Questions. Slingsby, Dykes, andWood. IEEETransactions on Visualization and Computer Graphics (Proc. InfoVis 2009) 15:6 (2009), 977–984. Treemaps Partitioning
  152. 152. Validation & User Testing
  153. 153. 97 Validating your visualisation…
  154. 154. 97 Validating your visualisation… Domain situation Observe target users using existing tools Visual encoding/interaction idiom Justify design with respect to alternatives Algorithm Measure system time/memory Analyze computational complexity Observe target users after deployment ( ) Measure adoption Analyze results qualitatively Measure human time with lab experiment (lab study) Data/task abstraction
  155. 155. 98 “Visualization can surprise you, but doesn't scale well. Modelling scales well, but can't surprise you.” Hadley Wickham The elephant in the room… scaling up
  156. 156. 99 One of the biggest challenges in HEP, Biology, chemistry, and business is scale. Our screens have a limited number of pixels. And our data is often much larger. You can do two things to get around this. The elephant in the room… scaling up
  157. 157. 100 GPU usage and WebGL Not yet compatible across all browsers, and not everyone has a dedicated GPU. Reduce the problem space Try and get users to focus queries as soon as possible to reduce the amount of data to be visualised. As done in imMens from Jeffrey Heer’s lab in Washington. Renders billions of data points at 50 fps in the browser. Provide ways to aggregate information in to meaningful overview visualisations… Solutions
  158. 158. 101 Munzner. A K Peters Visualization Series, CRC Press, Visualization Series, 2014. Visualization Analysis and Design. More??
  159. 159. 102 http://antarctic-design.co.uk/biovis-workshop15/ Tutorial on D3 Tutorial on Dashboard Visualizations https://thor-project.github.io/dashboard-tutorial/ Visualization Survey Sites Set Visualization - http://www.cvast.tuwien.ac.at/SetViz Time Series Visualization - http://survey.timeviz.net/ Parallel Coordinates Visualization Periodic Table of Visualizations Data Vis Catalogue Further Links
  160. 160. Eamonn Maguire CERN Data Visualization and suggestions for CMS CERN August 25th 2016 Questions @antarcticdesign eamonnmag@gmail.com

×