Bibliotheca Digitalis
Reconstitution of Early Modern Cultural Networks
From Primary Source to Data
DARIAH / Biblissima Summer School
Le Mans, 4-8 July 2017
Visualisation in Digital Humanities
for Understanding, Cleaning, and
Explaining
5th and last day, July 8th – Digital representation and data accuracy for Humanities
Jean-Daniel Fekete
Research Scientist, INRIA
7/8/2017
Visualisation in Digital Humanities for
Understanding, Cleaning, and Explaining
Jean-Daniel Fekete
INRIA
http://www.aviz.fr/~fekete
Visualization?
Visualization is any technique for creating
images, diagrams, or animations to
communicate a message
[Wikipedia, Visualization, May 2016]
Information visualization is the study of
(interactive) visual representations of abstract
data to reinforce human cognition
[Card, S. and Mackinlay, J. and Shneiderman B., Readings in Information Visualization, 1999]
July 8th 2017 Summer School Le Mans
Visualization and Visual Perception
• Visualization is grounded in the visual and
cognitive capabilities of humans
– Inferring from visual forms
• Relies on visual capabilities of the human eye
and brain
– Preattentive processing
– Ready…is there a red circle in the next slide?
Preattentive Processing
[Figures: visual search demonstration; images not preserved]
Preattentive Processing
• Preattentive processing
– About 200 ms response time (at a glance)
– Effortless
– Reliable estimates
• Many visual features can be perceived preattentively:
– Orientation of line/block, length, width, size, curvature, cardinality, etc.
• Problems:
– Preattentive features interfere with each other
• Except one
– Preattentive features have limitations
• 7 colors max (Healey, 1996)
• 2 or 3 shapes
Where Does Visualization Stand?
[Diagram. Elements: Theory / Law; Model; Descriptive statistics; Facts / Measurements. Relation labels: Support xor Contradict; Induces?; Fits; Describes.]
Example
      I             II            III            IV
      x      y      x      y      x      y      x      y
   10.0   8.04   10.0   9.14   10.0   7.46    8.0   6.58
    8.0   6.95    8.0   8.14    8.0   6.77    8.0   5.76
   13.0   7.58   13.0   8.74   13.0  12.74    8.0   7.71
    9.0   8.81    9.0   8.77    9.0   7.11    8.0   8.84
   11.0   8.33   11.0   9.26   11.0   7.81    8.0   8.47
   14.0   9.96   14.0   8.10   14.0   8.84    8.0   7.04
    6.0   7.24    6.0   6.13    6.0   6.08    8.0   5.25
    4.0   4.26    4.0   3.10    4.0   5.39   19.0  12.50
   12.0  10.84   12.0   9.13   12.0   8.15    8.0   5.56
    7.0   4.82    7.0   7.26    7.0   6.42    8.0   7.91
    5.0   5.68    5.0   4.74    5.0   5.73    8.0   6.89
Raw Data from Anscombe’s Quartet
[Source: Anscombe's quartet, Wikipedia]
Statistical Analysis
Mean of x: 9.0
Variance of x: 11.0
Mean of y: 7.5
Variance of y: 4.12
Correlation between x and y: 0.816
Linear regression line: y = 3 + 0.5x
For all four datasets, the main descriptive statistics are identical to the precision shown
[Source: Anscombe's quartet, Wikipedia]
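The summary statistics of Anscombe's quartet can be recomputed from the raw table. The following is a minimal sketch using only the Python standard library; the names `QUARTET` and `describe` are illustrative choices, and the variances are sample variances (dividing by n - 1), which is an assumption that matches the values quoted on the slide.

```python
from statistics import mean, variance

# Anscombe's quartet, transcribed from the table on the slide.
X_COMMON = [10.0, 8.0, 13.0, 9.0, 11.0, 14.0, 6.0, 4.0, 12.0, 7.0, 5.0]
QUARTET = {
    "I": (X_COMMON, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_COMMON, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_COMMON, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8.0] * 7 + [19.0] + [8.0] * 3,
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def describe(xs, ys):
    """Return the descriptive statistics listed on the slide for one dataset."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx  # least-squares slope of y on x
    return {
        "mean_x": mx,
        "var_x": variance(xs),  # sample variance (n - 1 denominator)
        "mean_y": my,
        "var_y": variance(ys),
        "r": sxy / (sxx * syy) ** 0.5,  # Pearson correlation
        "slope": slope,
        "intercept": my - slope * mx,
    }


for name, (xs, ys) in QUARTET.items():
    stats = describe(xs, ys)
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

The four datasets agree only up to rounding, which is why the numbers alone cannot tell them apart.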
Visual Representation of the Data
Visual representation reveals a different story
[Source: Anscombe's quartet, Wikipedia]
Same Stats, Different Graphs: Generating Datasets with Varied Appearance
and Identical Statistics through Simulated Annealing [CHI17]
https://www.autodeskresearch.com/publications/samestats
Where Does Visualization Stand?
[Diagram. Elements: Theory / Law; Model; Visualization; Descriptive Statistics; Facts / Measurements. Relation labels: Support xor Contradict; Induces?; Fits; Describes.]
Four Scales
• Most DH projects rely on the concept of
collections of documents or artifacts
• Visualization can be effective to make sense of
these collections
– But there is no “one size fits all”
• I will present visualizations to manage the
four scales
• With queries, smaller scales can be extracted
from larger scales
Scale Matters!
• 10^0 – 10^3: Small corpus (Master’s thesis / PhD)
• 10^3 – 10^6: Collaborative project
• 10^6 – 10^9: Institutional project (BnF, LoC) or portal
• > 10^9: Large scale
– Europeana, Google
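As a rough illustration of the four scales above, a collection size can be mapped to its scale with a few comparisons. This is a minimal sketch: the function name `collection_scale` is hypothetical, and the handling of the shared boundaries is an assumption, since the slide does not say which scale a collection of exactly 10^3, 10^6, or 10^9 items belongs to.

```python
def collection_scale(n_items):
    """Map a number of items to one of the four scales from the slide.

    Assumption: 10^3 and 10^6 count as the start of the larger scale,
    and exactly 10^9 still counts as institutional ("> 10^9" is strict).
    """
    if n_items < 1:
        raise ValueError("a collection needs at least one item")
    if n_items < 10**3:
        return "small corpus"
    if n_items < 10**6:
        return "collaborative project"
    if n_items <= 10**9:
        return "institutional project or portal"
    return "large scale"


print(collection_scale(800))         # a thesis-sized corpus
print(collection_scale(2 * 10**9))   # beyond institutional scale
```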
Powers of Ten™ (1977)
https://www.youtube.com/watch?v=0fKBhvDjuy0
10^0 – 10^3: Small Corpus
• A myriad of visualizations is available for small
corpora
– Text, network, genealogy, manuscripts, maps, etc.
• Using these visualizations to explore small
corpora ALWAYS reveals interesting, unexpected
information
• On Web sites dedicated to small corpora,
visualization helps users navigate and understand
the scope of the corpus
10^0: One document
• N. McCurdy, J. Lein, K. Coles, M. Meyer. Poemage: Visualizing the Sonic Topology of
a Poem. IEEE Transactions on Visualization and Computer Graphics (Proceedings of
InfoVis 2015), pages 439-448, January 2016
http://www.sci.utah.edu/~nmccurdy/Poemage/
https://vimeo.com/136205958
http://xkcd.com/657/
10^0: One document
http://vis.cs.ucdavis.edu/~tanahashi/storylines/
10^0 – 10^3: Small(ish) Networks
http://vistorian.net/
10^0 – 10^3: Small Corpus
N. Dufournaud
Thesis
~1000 documents
http://nicole.dufournaud.org/
Genealogical trees
Transfer of the Land « La Fruglaye »
Migration Map
Space&Time: GeoTime
10^0 – 10^3: Archeological Collection
Create a spreadsheet
• 1 row per object found
• 1 column per feature
• 1 black dot at the intersection when an object has a feature
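The spreadsheet described above is a binary object-by-feature matrix. A minimal sketch of building and printing one could look as follows; the object and feature names are hypothetical placeholders, since the excavation data from the slides is not reproduced here, and the function names are illustrative.

```python
# Hypothetical finds: each object is mapped to the set of features it has.
FINDS = {
    "grave 1": {"bronze pin", "pottery"},
    "grave 2": {"pottery"},
    "grave 3": {"bronze pin", "glass bead"},
}


def incidence_matrix(objects):
    """Return (features, rows): one row per object, one column per feature."""
    features = sorted({f for feats in objects.values() for f in feats})
    rows = [[f in feats for f in features] for feats in objects.values()]
    return features, rows


def render(objects, rows):
    """Draw a filled dot where an object has a feature, a small dot otherwise."""
    width = max(len(name) for name in objects)
    lines = []
    for name, row in zip(objects, rows):
        dots = " ".join("●" if cell else "·" for cell in row)
        lines.append(f"{name:<{width}}  {dots}")
    return "\n".join(lines)


features, rows = incidence_matrix(FINDS)
print(features)
print(render(FINDS, rows))
```

Reordering the rows and columns so that similar objects and features end up next to each other is the step that reveals groups; it is left out of this sketch.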
10^0 – 10^3: Bertifier
• Play with our tool online
http://www.aviz.fr/bertifier
https://www.youtube.com/watch?v=tJxAF_a_yBQ
Visualizing an XML Corpus: Compus
• Transform the following XML document:
0         1         2         3         4
012345678901234567890123456789012345678901234567
<A>abcd<B>efgh</B><C>ijkl<D>mnop</D></C>qrst</A>
• into a set of intervals:
A=[0,48[, B=[7,18[, C=[18,40[, D=[25,36[
• One color is given to each element
• Only XML elements are visualized
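The interval computation above can be sketched with a small tag scanner. This is an illustration only, not the Compus implementation: the function name `element_intervals` is hypothetical, and the regular expression assumes simple well-formed XML without comments, CDATA sections, or processing instructions. Offsets count characters of the serialized document, tags included, and each interval is half-open, matching the `[start, end[` notation on the slide.

```python
import re

# Matches an opening, closing, or self-closing tag (simplified XML only).
TAG = re.compile(r"<(/?)([A-Za-z_][\w:.-]*)[^<>]*?(/?)>")


def element_intervals(xml):
    """Return (name, start, end) per element, with end exclusive."""
    stack = []       # currently open elements as (name, start offset)
    intervals = []
    for match in TAG.finditer(xml):
        closing, name, self_closing = match.groups()
        if closing:
            if not stack or stack[-1][0] != name:
                raise ValueError(f"unexpected closing tag </{name}>")
            _, start = stack.pop()
            intervals.append((name, start, match.end()))
        elif self_closing:
            intervals.append((name, match.start(), match.end()))
        else:
            stack.append((name, match.start()))
    if stack:
        raise ValueError(f"unclosed element <{stack[-1][0]}>")
    return sorted(intervals, key=lambda item: item[1])


DOC = "<A>abcd<B>efgh</B><C>ijkl<D>mnop</D></C>qrst</A>"
print(element_intervals(DOC))
```

For the document on the slide, this reproduces the four intervals listed there; each interval can then be drawn as a colored span, one color per element name.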
10^0 – 10^3: Diffamation
(Chevalier et al. CHI 2010, http://www.aviz.fr/diffamation/)
10^0 – 10^3: Multidimensional Data
10^0 – 10^3: Small Corpus
http://multiviz.gforge.inria.fr/scatterdice/oscars/
10^3 – 10^6: Library/Coll. Project
• Too many items to show each of them in detail
• Still need to provide guidance to users
• Many tools exist, but entering data becomes
technical
10^3 – 10^6: Jigsaw
10^3 – 10^6: Parallel Tag Clouds
Parallel Tag Clouds to Explore Faceted Text Corpora (Collins et al., VAST 2009)
http://vialab.science.uoit.ca/portfolio/parallel-tag-clouds-to-explore-faceted-text-corpora
De-duplication
D-Dupe: An Interactive Tool for Entity Resolution in Social Networks (Mustafa Bilgic, Louis Licamele,
Lise Getoor, Ben Shneiderman), In Visual Analytics Science and Technology (VAST), 2006.
• Resolving named entities using the relation network
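One way to read this bullet is that two records are more likely to denote the same entity when their names are similar and they share neighbours in the network. The sketch below illustrates that idea only. It is not D-Dupe's algorithm, and the person names, the correspondence data, and the equal weighting of the two scores are all hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical network: each name record maps to the people it is linked to.
NETWORK = {
    "M. Dupont": {"A. Martin", "B. Leroy"},
    "Marie Dupont": {"A. Martin", "B. Leroy", "C. Petit"},
    "M. Durand": {"D. Moreau"},
}


def name_similarity(a, b):
    """String similarity in [0, 1] from the standard library's SequenceMatcher."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def neighbour_overlap(a, b, network):
    """Jaccard overlap of the two records' neighbour sets."""
    na, nb = network.get(a, set()), network.get(b, set())
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0


def duplicate_score(a, b, network, weight=0.5):
    """Blend name similarity and shared context (the weight is an arbitrary choice)."""
    return (weight * name_similarity(a, b)
            + (1 - weight) * neighbour_overlap(a, b, network))


for other in ("Marie Dupont", "M. Durand"):
    print(other, round(duplicate_score("M. Dupont", other, NETWORK), 3))
```

With this toy data, "Marie Dupont" scores higher than "M. Durand" as a duplicate candidate for "M. Dupont", because the shared correspondents add evidence that the name strings alone do not provide. A person still has to confirm each merge.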
10^3 – 10^6: Genealogies
10^6 – 10^9: Institutional project
• Only aggregated information can be presented
• Faceted browsing / search very useful!
– Use it!
• e.g. Europeana: 53 × 10^6 items
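Faceted browsing rests on aggregated counts per metadata value, which is what makes it usable when individual items can no longer be shown. A minimal sketch, using a handful of hypothetical records in place of a real catalogue (the field names and function names are illustrative), could compute facet counts and narrow the collection as follows:

```python
from collections import Counter

# Hypothetical catalogue records; a real portal would hold millions.
RECORDS = [
    {"type": "book", "language": "fr", "century": 16},
    {"type": "book", "language": "la", "century": 16},
    {"type": "map", "language": "fr", "century": 17},
    {"type": "book", "language": "fr", "century": 17},
]


def facet_counts(records, field):
    """Count how many records carry each value of one metadata field."""
    return Counter(r[field] for r in records if field in r)


def refine(records, **selected):
    """Keep only the records matching every selected facet value."""
    return [r for r in records
            if all(r.get(key) == value for key, value in selected.items())]


print(facet_counts(RECORDS, "type"))
subset = refine(RECORDS, type="book", language="fr")
print(len(subset), facet_counts(subset, "century"))
```

Each refinement extracts a smaller collection from a larger one, and the counts for the remaining facets are recomputed on that subset.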
10^6 – 10^9: Institutional project (HAL)
http://traces1.saclay.inria.fr/inria/
10^6 – 10^9: EU Project Cendari
10^6 – 10^9: EU Project Cendari
https://notes.cendari.dariah.eu/
10^6 – 10^9: Institutional project
• Only aggregated information can be presented
• Faceted browsing / search very useful!
– Use it!
• e.g. Europeana: 53 × 10^6 items
• Problem: metadata quality and semantics
• What is the date of a book?
> 10^9: World Scale
• Few providers
– Google
– Photo collections (Flickr)
– Astronomical databases
• The cost of computing facets is too high for
interactive response times
• No good general solution
> 10^9: Internet Backbone
• Where are you?
• Who cares?
> 10^9: Query Previews
• Query over very large data about the Earth
http://www.cs.umd.edu/hcil/eosdis/
Conclusion
• Larger collections are harder to manage
– Big data problem
• A large collection can always be queried to
extract a smaller collection
– This scales down the results and increases the number of
usable techniques
• Still, current technologies are limited for DH
– No management of uncertainty
– No reasonable model of old geographical concepts
– No good model of time and date
• Still, use the tools and ask for improvements!
References
• Jacques Bertin, Semiology of Graphics: Diagrams, Networks, Maps.
ESRI Press; Nov. 2010. ISBN: 9781589482616
• Edward Tufte. The Visual Display of Quantitative Information.
Cheshire, CT: Graphics Press, 2010 ISBN 0-9613921-4-2
• Tamara Munzner. Visualization Analysis and Design. A K Peters
Visualization Series, CRC Press, 2014. ISBN 9781466508910
• Alberto Cairo. The Truthful Art: Data, Charts, and Maps for
Communication. New Riders, 2016. ISBN 0321934075
• Tableau for Students: https://www.tableau.com/academic/students
• Jänicke, Stefan; Franzini, Greta; Cheema, Muhammad Faisal;
Scheuermann, Gerik. On Close and Distant Reading in Digital
Humanities: A Survey and Future Challenges. Eurographics
Conference on Visualization (EuroVis) – STARs. 2015.
http://dx.doi.org/10.2312/eurovisstar.20151113
July 8th 2017 Summer School Le Mans

Mais conteúdo relacionado

Semelhante a Bibliotheca Digitalis Summer school: Visualisation in Digital Humanities for Understanding, Cleaning, and Explaining - Jean-Daniel Fekete

Team 05 linked data generation
Team 05 linked data generationTeam 05 linked data generation
Team 05 linked data generation
plan4all
 
Eric E Monson, Text->Data 08 Nov 2012
Eric E Monson, Text->Data 08 Nov 2012Eric E Monson, Text->Data 08 Nov 2012
Eric E Monson, Text->Data 08 Nov 2012
emonson
 

Semelhante a Bibliotheca Digitalis Summer school: Visualisation in Digital Humanities for Understanding, Cleaning, and Explaining - Jean-Daniel Fekete (20)

Team 05 linked data generation
Team 05 linked data generationTeam 05 linked data generation
Team 05 linked data generation
 
Education for hybrid society in Industry 4.0
Education for hybrid society in Industry 4.0Education for hybrid society in Industry 4.0
Education for hybrid society in Industry 4.0
 
Listening to the pulse of our cities fusing Social Media Streams and Call Dat...
Listening to the pulse of our cities fusing Social Media Streams and Call Dat...Listening to the pulse of our cities fusing Social Media Streams and Call Dat...
Listening to the pulse of our cities fusing Social Media Streams and Call Dat...
 
Managing international comparative data
Managing international comparative dataManaging international comparative data
Managing international comparative data
 
SSHOC at EOSC-hub Week - ESS in SSHOC - Bodil Agasøster - NSD
SSHOC at EOSC-hub Week - ESS in SSHOC - Bodil Agasøster - NSDSSHOC at EOSC-hub Week - ESS in SSHOC - Bodil Agasøster - NSD
SSHOC at EOSC-hub Week - ESS in SSHOC - Bodil Agasøster - NSD
 
Info vis 4-22-2013-dc-vis-meetup-shneiderman
Info vis 4-22-2013-dc-vis-meetup-shneidermanInfo vis 4-22-2013-dc-vis-meetup-shneiderman
Info vis 4-22-2013-dc-vis-meetup-shneiderman
 
BigData Visualization and Usecase@TDGA-Stelligence-11july2019-share
BigData Visualization and Usecase@TDGA-Stelligence-11july2019-shareBigData Visualization and Usecase@TDGA-Stelligence-11july2019-share
BigData Visualization and Usecase@TDGA-Stelligence-11july2019-share
 
E- Learning
E- LearningE- Learning
E- Learning
 
Visual analytics
Visual analyticsVisual analytics
Visual analytics
 
EGI impact on science and megatrends
EGI impact on science and megatrendsEGI impact on science and megatrends
EGI impact on science and megatrends
 
Eric E Monson, Text->Data 08 Nov 2012
Eric E Monson, Text->Data 08 Nov 2012Eric E Monson, Text->Data 08 Nov 2012
Eric E Monson, Text->Data 08 Nov 2012
 
E learning
E learningE learning
E learning
 
KAIST Web Engineering Lab Introduction (2017 ver.)
KAIST Web Engineering Lab Introduction (2017 ver.)KAIST Web Engineering Lab Introduction (2017 ver.)
KAIST Web Engineering Lab Introduction (2017 ver.)
 
OpenAIRE - Bridging the worlds where science is performed and science is publ...
OpenAIRE - Bridging the worlds where science is performed and science is publ...OpenAIRE - Bridging the worlds where science is performed and science is publ...
OpenAIRE - Bridging the worlds where science is performed and science is publ...
 
Open Data and Cross Disciplinary Research - EUDAT Summer School (Brian Matthe...
Open Data and Cross Disciplinary Research - EUDAT Summer School (Brian Matthe...Open Data and Cross Disciplinary Research - EUDAT Summer School (Brian Matthe...
Open Data and Cross Disciplinary Research - EUDAT Summer School (Brian Matthe...
 
Info vis 12-2012-v17-shneiderman
Info vis 12-2012-v17-shneidermanInfo vis 12-2012-v17-shneiderman
Info vis 12-2012-v17-shneiderman
 
How to become the best datascientist in Europe
How to become the best datascientist in EuropeHow to become the best datascientist in Europe
How to become the best datascientist in Europe
 
Digitalization of Education
Digitalization of EducationDigitalization of Education
Digitalization of Education
 
ICT project idea: the Danube Data Cube
ICT project idea: the Danube Data Cube ICT project idea: the Danube Data Cube
ICT project idea: the Danube Data Cube
 
Telling a Story – or Even Propaganda – Through Data Visualization
Telling a Story – or Even Propaganda – Through Data VisualizationTelling a Story – or Even Propaganda – Through Data Visualization
Telling a Story – or Even Propaganda – Through Data Visualization
 

Mais de Bibliothèques Virtuelles Humanistes - CESR, Université de Tours, UMR 7323

Mais de Bibliothèques Virtuelles Humanistes - CESR, Université de Tours, UMR 7323 (20)

Montaigne : derniers développements sur les travaux éditoriaux
Montaigne : derniers développements sur les travaux éditoriauxMontaigne : derniers développements sur les travaux éditoriaux
Montaigne : derniers développements sur les travaux éditoriaux
 
Les BVH & l’étude des matériels d’imprimerie anciens
 Les BVH & l’étude des matériels d’imprimerie anciens Les BVH & l’étude des matériels d’imprimerie anciens
Les BVH & l’étude des matériels d’imprimerie anciens
 
Évolutions de l’infrastructure & de la bibliothèque numérique
Évolutions de l’infrastructure & de la bibliothèque numériqueÉvolutions de l’infrastructure & de la bibliothèque numérique
Évolutions de l’infrastructure & de la bibliothèque numérique
 
Les « Bibliotheques françoises » (BibFr) – Avancée de l’indexation de La Croi...
Les « Bibliotheques françoises » (BibFr) – Avancée de l’indexation de La Croi...Les « Bibliotheques françoises » (BibFr) – Avancée de l’indexation de La Croi...
Les « Bibliotheques françoises » (BibFr) – Avancée de l’indexation de La Croi...
 
Édition numérique et valorisation du livre de compte de la reine Marguerite d...
Édition numérique et valorisation du livre de compte de la reine Marguerite d...Édition numérique et valorisation du livre de compte de la reine Marguerite d...
Édition numérique et valorisation du livre de compte de la reine Marguerite d...
 
Catalogues régionaux des Incunables des bibliothèques publiques de France
Catalogues régionaux des Incunables des bibliothèques publiques de FranceCatalogues régionaux des Incunables des bibliothèques publiques de France
Catalogues régionaux des Incunables des bibliothèques publiques de France
 
Une nouvelle base de données, Scripta Manent : le “Facebook” des années 1530-...
Une nouvelle base de données, Scripta Manent : le “Facebook” des années 1530-...Une nouvelle base de données, Scripta Manent : le “Facebook” des années 1530-...
Une nouvelle base de données, Scripta Manent : le “Facebook” des années 1530-...
 
Bilan 2022 & perspectives du programme de recherche BVH
Bilan 2022 & perspectives du programme de recherche BVHBilan 2022 & perspectives du programme de recherche BVH
Bilan 2022 & perspectives du programme de recherche BVH
 
Catalogues régionaux des Incunables des bibliothèques publiques de France : S...
Catalogues régionaux des Incunables des bibliothèques publiques de France : S...Catalogues régionaux des Incunables des bibliothèques publiques de France : S...
Catalogues régionaux des Incunables des bibliothèques publiques de France : S...
 
Architecture de la bibliothèque numérique : Déploiement du protocole IIIF - A...
Architecture de la bibliothèque numérique : Déploiement du protocole IIIF - A...Architecture de la bibliothèque numérique : Déploiement du protocole IIIF - A...
Architecture de la bibliothèque numérique : Déploiement du protocole IIIF - A...
 
Autour du projet BiRayMa : "Bibliothèque de Raymond Marcel" (CollEx-Persée) -...
Autour du projet BiRayMa : "Bibliothèque de Raymond Marcel" (CollEx-Persée) -...Autour du projet BiRayMa : "Bibliothèque de Raymond Marcel" (CollEx-Persée) -...
Autour du projet BiRayMa : "Bibliothèque de Raymond Marcel" (CollEx-Persée) -...
 
Rabelais : Les documents de Berne et l'Almanach d'Alessandria - Assemblée gén...
Rabelais : Les documents de Berne et l'Almanach d'Alessandria - Assemblée gén...Rabelais : Les documents de Berne et l'Almanach d'Alessandria - Assemblée gén...
Rabelais : Les documents de Berne et l'Almanach d'Alessandria - Assemblée gén...
 
Projet Scripta Manent : Une nouvelle base de données : les relations sociales...
Projet Scripta Manent : Une nouvelle base de données : les relations sociales...Projet Scripta Manent : Une nouvelle base de données : les relations sociales...
Projet Scripta Manent : Une nouvelle base de données : les relations sociales...
 
Projet Les Bibliotheques françoises de La Croix du Maine et de Du Verdier - A...
Projet Les Bibliotheques françoises de La Croix du Maine et de Du Verdier - A...Projet Les Bibliotheques françoises de La Croix du Maine et de Du Verdier - A...
Projet Les Bibliotheques françoises de La Croix du Maine et de Du Verdier - A...
 
Architecture de la bibliothèque numérique : Modélisation en XML-TEI - Assembl...
Architecture de la bibliothèque numérique : Modélisation en XML-TEI - Assembl...Architecture de la bibliothèque numérique : Modélisation en XML-TEI - Assembl...
Architecture de la bibliothèque numérique : Modélisation en XML-TEI - Assembl...
 
Architecture de la bibliothèque numérique : Veille fonctionnelle et technique...
Architecture de la bibliothèque numérique : Veille fonctionnelle et technique...Architecture de la bibliothèque numérique : Veille fonctionnelle et technique...
Architecture de la bibliothèque numérique : Veille fonctionnelle et technique...
 
Architecture de la bibliothèque numérique : Modélisation et migrations de don...
Architecture de la bibliothèque numérique : Modélisation et migrations de don...Architecture de la bibliothèque numérique : Modélisation et migrations de don...
Architecture de la bibliothèque numérique : Modélisation et migrations de don...
 
Production BVH : Epistemon (éditions numériques TEI-Renaissance) - Assemblée ...
Production BVH : Epistemon (éditions numériques TEI-Renaissance) - Assemblée ...Production BVH : Epistemon (éditions numériques TEI-Renaissance) - Assemblée ...
Production BVH : Epistemon (éditions numériques TEI-Renaissance) - Assemblée ...
 
Production BVH : Fac-similés (Numérisations) - Assemblée générale 2021, Progr...
Production BVH : Fac-similés (Numérisations) - Assemblée générale 2021, Progr...Production BVH : Fac-similés (Numérisations) - Assemblée générale 2021, Progr...
Production BVH : Fac-similés (Numérisations) - Assemblée générale 2021, Progr...
 
Bilan 2020-2021 & perspectives 2022+ Assemblée générale 2021, Programme de re...
Bilan 2020-2021 & perspectives 2022+ Assemblée générale 2021, Programme de re...Bilan 2020-2021 & perspectives 2022+ Assemblée générale 2021, Programme de re...
Bilan 2020-2021 & perspectives 2022+ Assemblée générale 2021, Programme de re...
 

Último

Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
nirzagarg
 
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
gajnagarg
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
ahmedjiabur940
 
Computer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdfComputer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdf
SayantanBiswas37
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
gajnagarg
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
nirzagarg
 
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
gajnagarg
 
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
HyderabadDolls
 

Último (20)

20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf
 
Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Ser...
Top Call Girls in Balaghat  9332606886Call Girls Advance Cash On Delivery Ser...Top Call Girls in Balaghat  9332606886Call Girls Advance Cash On Delivery Ser...
Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Ser...
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
 
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Kings of Saudi Arabia, information about them
Kings of Saudi Arabia, information about themKings of Saudi Arabia, information about them
Kings of Saudi Arabia, information about them
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
 
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
 
Computer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdfComputer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdf
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
 
Statistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbersStatistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbers
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
 
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
 
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
 
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
 
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
 
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
 
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
 

Bibliotheca Digitalis Summer school: Visualisation in Digital Humanities for Understanding, Cleaning, and Explaining - Jean-Daniel Fekete

  • 1. Bibliotheca Digitalis Reconstitution of Early Modern Cultural Networks From Primary Source to Data DARIAH / Biblissima Summer School Le Mans, 4-8 July 2017 Visualisation in Digital Humanities for Understanding, Cleaning, and Explaining 5th and last day, July 8th – Digital representation and data accuracy for Humanities Jean-Daniel Fekete Research Scientist, INRIA
  • 2. 7/8/2017 1 Visualisation in Digital Humanities for Understanding, Cleaning, and Explaining Jean-Daniel Fekete INRIA http://www.aviz.fr/~fekete Visualization? Visualization is any technique for creating images, diagrams, or animations to communicate a message [Wikipedia, Visualization, May 2016] Information visualization is the study of (interactive) visual representations of abstract data to reinforce human cognition [Card, S. and Mackinlay, J. and Shneiderman B., Readings in Information Visualization, 1999] July 8th 2017 Summer School Le Mans
  • 3. 7/8/2017 2 Visualization and Visual Perception • Visualization is grounded in the visual and cognitive capabilities of humans – Inferring from visual forms • Relies on visual capabilities of the human eye and brain – Preattentive processing – Ready…is there a red circle in the next slide? July 8th 2017 Summer School Le Mans Preattentive Processing July 8th 2017 Summer School Le Mans
  • 4. Preattentive Processing (image slide)
       Preattentive Processing
       • Preattentive processing
         – 200 ms response time (at a glance)
         – Effortless
         – Reliable estimates
       • Many visual features can be perceived preattentively:
         – Orientation of line/block, length, width, size, curvature, cardinality, etc.
       • Problems:
         – Preattentive features interfere with each other
           • Except one
         – Preattentive features have limitations
           • 7 colors max (Healey, 96)
           • 2 or 3 shapes
  • 5. Preattentive Processing (image slide)
       Where Does Visualization Stand?
       Diagram relating Theory / Law, Model, Descriptive statistics, and Facts / Measurements, with arrows labelled "Support xor Contradict", "Induces?", "Fits", and "Describes"
  • 6. Example: Raw Data from Anscombe's Quartet [Source: Anscombe's quartet, Wikipedia]

                I              II             III             IV
            x      y       x      y       x      y       x      y
          10.0   8.04    10.0   9.14    10.0   7.46     8.0   6.58
           8.0   6.95     8.0   8.14     8.0   6.77     8.0   5.76
          13.0   7.58    13.0   8.74    13.0  12.74     8.0   7.71
           9.0   8.81     9.0   8.77     9.0   7.11     8.0   8.84
          11.0   8.33    11.0   9.26    11.0   7.81     8.0   8.47
          14.0   9.96    14.0   8.10    14.0   8.84     8.0   7.04
           6.0   7.24     6.0   6.13     6.0   6.08     8.0   5.25
           4.0   4.26     4.0   3.10     4.0   5.39    19.0  12.50
          12.0  10.84    12.0   9.13    12.0   8.15     8.0   5.56
           7.0   4.82     7.0   7.26     7.0   6.42     8.0   7.91
           5.0   5.68     5.0   4.74     5.0   5.73     8.0   6.89

       Statistical Analysis (same data)
         Mean of x: 9.0
         Variance of x: 11.0
         Mean of y: 7.5
         Variance of y: 4.12
         Correlation between x and y: 0.816
         Linear regression line: y = 3 + 0.5x
       For all columns, the main descriptive statistics are identical [Source: Anscombe's quartet, Wikipedia]
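The claim on this slide can be checked directly. The following is a minimal sketch in plain Python that recomputes the listed statistics from the quartet's values; the names QUARTET, summarize, and SUMMARIES are illustrative and do not come from the slides. It assumes the sample variance (division by n - 1), which is the convention that yields the variance of 11.0 quoted for x.

```python
from statistics import mean, variance

# Anscombe's quartet, transcribed from the slide's table.
X_123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # shared by sets I, II, III
QUARTET = {
    "I": (X_123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

def summarize(xs, ys):
    """Return the descriptive statistics listed on the slide for one dataset."""
    mx, my = mean(xs), mean(ys)
    vx, vy = variance(xs), variance(ys)  # sample variance (n - 1)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    slope = cov / vx  # least-squares slope of y on x
    return {
        "mean_x": mx, "var_x": vx, "mean_y": my, "var_y": vy,
        "corr": cov / (vx * vy) ** 0.5,
        "slope": slope, "intercept": my - slope * mx,
    }

SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}
for name, stats in SUMMARIES.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

The four summaries agree to about two decimal places, which is why the numbers alone cannot tell the datasets apart and a plot is needed.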
  • 7. Visual Representation of the Data
       Visual representation reveals a different story [Source: Anscombe's quartet, Wikipedia]
       Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing [CHI17]
       https://www.autodeskresearch.com/publications/samestats
  • 8. Where Does Visualization Stand?
       Diagram relating Theory / Law, Model, Visualization, Descriptive Statistics, and Facts / Measurements, with arrows labelled "Support xor Contradict", "Induces?", "Fits", and "Describes"
       Four Scales
       • Most DH projects rely on the concept of collections of documents or artifacts
       • Visualization can be effective to make sense of these collections
         – But there is no “one size fits all”
       • I will present visualizations to manage the four scales
       • With queries, smaller scales can be extracted from larger scales
  • 9. Scale Matters!
       • 10^0 to 10^3: Small corpus (Master’s thesis / PhD)
       • 10^3 to 10^6: Collaborative project
       • 10^6 to 10^9: Institutional project (BnF, LoC) or portal
       • > 10^9: Large scale
         – Europeana, Google
       Powers of Ten™ (1977): https://www.youtube.com/watch?v=0fKBhvDjuy0
       10^0 to 10^3: Small Corpus
       • A myriad of visualizations are available for small corpora
         – Text, network, genealogy, manuscripts, maps, etc.
       • Using these visualizations to explore small corpora ALWAYS reveals interesting, unexpected information
       • On Web sites dedicated to small corpora, visualization helps users navigate and understand the scope of the corpus
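The four scales can be read as a simple lookup on the number of items in a collection. The sketch below is only an illustration: the function name scale_of is hypothetical, and because the ranges on the slide share their endpoints, it assumes that each upper bound is inclusive (a collection of exactly 10^3 items counts as a small corpus).

```python
# Upper bound (as a power of ten) and label for the first three scales.
SCALES = [
    (3, "small corpus"),
    (6, "collaborative project"),
    (9, "institutional project or portal"),
]

def scale_of(n_items):
    """Return the scale label for a collection of n_items items."""
    if n_items < 1:
        raise ValueError("a collection needs at least one item")
    for exponent, label in SCALES:
        # Assumption: upper bounds are inclusive.
        if n_items <= 10 ** exponent:
            return label
    return "large scale"

print(scale_of(1000))       # a thesis-sized corpus of about 1000 documents
print(scale_of(2_000_000))  # an institutional catalogue
```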
  • 10. 10^0: One document
        • N. McCurdy, J. Lein, K. Coles, M. Meyer. Poemage: Visualizing the Sonic Topology of a Poem. IEEE Transactions on Visualization and Computer Graphics (Proceedings of InfoVis 2015), pages 439-448, January 2016
        http://www.sci.utah.edu/~nmccurdy/Poemage/
        https://vimeo.com/136205958
        http://xkcd.com/657/
        10^0: One document
        http://vis.cs.ucdavis.edu/~tanahashi/storylines/
  • 11. 10^0 to 10^3: Small(ish) Networks
        http://vistorian.net/
        10^0 to 10^3: Small Corpus
        N. Dufournaud thesis, ~1000 documents
        http://nicole.dufournaud.org/
  • 12. Genealogical trees
        Transfer of the Land « La Fruglaye »
  • 13. Migration Map
        Space & Time: GeoTime [link]
  • 14. 10^0 to 10^3: Archeological Collection
        Create a spreadsheet
        • 1 line per object found
        • 1 column per feature
        • 1 black dot at the intersection when an object has a feature
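The spreadsheet recipe can be sketched in a few lines of Python. The finds and feature names below are invented for illustration. The reordering step uses a simple barycenter heuristic, chosen here as an assumption to show why permuting rows and columns makes groups visible; the slides do not say which method is used in practice.

```python
# Hypothetical finds: object name -> set of observed features.
FINDS = {
    "obj1": {"bronze", "engraved"},
    "obj2": {"iron", "blade"},
    "obj3": {"bronze", "engraved", "clasp"},
    "obj4": {"iron", "blade", "handle"},
}

def build_matrix(finds):
    """One row per object, one column per feature, True where a dot goes."""
    rows = sorted(finds)
    cols = sorted(set().union(*finds.values()))
    cells = {(r, c): c in finds[r] for r in rows for c in cols}
    return rows, cols, cells

def barycenter(order, other, has_dot):
    """Sort `order` by the mean position of its dots along `other`."""
    def key(item):
        hits = [i for i, o in enumerate(other) if has_dot(item, o)]
        return sum(hits) / len(hits) if hits else len(other)
    return sorted(order, key=key)

def reorder(rows, cols, cells, rounds=5):
    """Alternately reorder rows and columns so that dots cluster together."""
    for _ in range(rounds):
        rows = barycenter(rows, cols, lambda r, c: cells[r, c])
        cols = barycenter(cols, rows, lambda c, r: cells[r, c])
    return rows, cols

def render(rows, cols, cells):
    """Draw the matrix as text, one line per object."""
    return "\n".join(
        "".join("●" if cells[r, c] else "·" for c in cols) + "  " + r
        for r in rows
    )

ROWS, COLS, CELLS = build_matrix(FINDS)
NEW_ROWS, NEW_COLS = reorder(ROWS, COLS, CELLS)
print(render(ROWS, COLS, CELLS))
print()
print(render(NEW_ROWS, NEW_COLS, CELLS))
```

After reordering, the bronze objects and the iron objects form two separate blocks of dots, which is the kind of grouping the matrix is meant to reveal.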
  • 15. 10^0 to 10^3: Bertifier
        • Play with our tool online
        http://www.aviz.fr/bertifier
        https://www.youtube.com/watch?v=tJxAF_a_yBQ
        Visualizing an XML Corpus: Compus
        • Transform the following XML document:
              0         1         2         3         4
              012345678901234567890123456789012345678901234567
              <A>abcd<B>efgh</B><C>ijkl<D>mnop</D></C>qrst</A>
        • into a set of intervals: A=[0,48[, B=[7,18[, C=[18,40[, D=[25,36[
        • One color is given to each element
        • Only XML elements are visualized
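The interval transformation on the Compus slide can be reproduced with a small tag scanner. This is a toy sketch, not the Compus implementation: the function name element_intervals is hypothetical, and the code assumes well-nested tags, no self-closing elements, and unique element names (a repeated name would overwrite the earlier interval). The slide's notation [a,b[ denotes a half-open interval, represented here as the tuple (a, b).

```python
import re

# Matches an opening or closing tag and captures the element name.
TAG = re.compile(r"<(/?)(\w+)[^>]*>")

def element_intervals(xml):
    """Map each element name to its half-open (start, end) character span."""
    intervals, stack = {}, []
    for match in TAG.finditer(xml):
        closing, name = match.group(1), match.group(2)
        if not closing:
            # Remember where the opening tag starts.
            stack.append((name, match.start()))
        else:
            open_name, start = stack.pop()
            if open_name != name:
                raise ValueError(f"mismatched tag </{name}>")
            # The span ends just after the closing tag.
            intervals[name] = (start, match.end())
    return intervals

DOC = "<A>abcd<B>efgh</B><C>ijkl<D>mnop</D></C>qrst</A>"
INTERVALS = element_intervals(DOC)
print(INTERVALS)
```

For real documents an XML parser that reports source positions would be a safer choice than a regular expression.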
  • 16. 10^0 to 10^3: Diffamation (Chevalier et al., CHI 2010, http://www.aviz.fr/diffamation/)
  • 17. 10^0 to 10^3: Multidimensional Data
  • 18. 10^0 to 10^3: Small Corpus
        http://multiviz.gforge.inria.fr/scatterdice/oscars/
        10^0 to 10^3: Small Corpus
        • A myriad of visualizations are available for small corpora
          – Text, network, genealogy, manuscripts, maps, etc.
        • Using these visualizations to explore small corpora ALWAYS reveals interesting, unexpected information
        • On Web sites dedicated to small corpora, visualization helps users navigate and understand the scope of the corpus
  • 19. 10^3 to 10^6: Library/Coll. Project
        • Too many items to show each of them in detail
        • Still need to provide guidance to users
        • Many tools exist, but entering data becomes technical
        10^3 to 10^6: Jigsaw
  • 20. 10^3 to 10^6: Parallel Tag Clouds
        Parallel Tag Clouds to Explore Faceted Text Corpora (Collins et al., VAST 2009)
        http://vialab.science.uoit.ca/portfolio/parallel-tag-clouds-to-explore-faceted-text-corpora
  • 21. De-duplication
        D-Dupe: An Interactive Tool for Entity Resolution in Social Networks (Mustafa Bilgic, Louis Licamele, Lise Getoor, Ben Shneiderman), in Visual Analytics Science and Technology (VAST), 2006.
        • Resolving named entities using the relation network
        10^3 to 10^6: Genealogies
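The idea of resolving named entities with the help of their relations can be sketched as a score that combines name similarity with the overlap of each entity's neighbours. This is an illustration of the general idea only, not the algorithm used by D-Dupe. The people, the network, and the equal weighting of the two signals are all assumptions made for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical co-occurrence network: person -> people they appear with.
NETWORK = {
    "Jean Martin": {"Anne Roux", "Paul Morel"},
    "J. Martin": {"Anne Roux", "Paul Morel", "Luc Petit"},
    "Jeanne Martin": {"Marc Blanc"},
}

def name_similarity(a, b):
    """String similarity between two names, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def neighbour_overlap(a, b, network):
    """Jaccard overlap of the two entities' neighbour sets, in [0, 1]."""
    na, nb = network[a], network[b]
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0

def duplicate_score(a, b, network):
    """Assumption: both signals are weighted equally."""
    return 0.5 * name_similarity(a, b) + 0.5 * neighbour_overlap(a, b, network)

PAIRS = list(combinations(sorted(NETWORK), 2))
BEST = max(PAIRS, key=lambda pair: duplicate_score(*pair, NETWORK))
for a, b in PAIRS:
    print(f"{a} / {b}: {duplicate_score(a, b, NETWORK):.2f}")
```

In this toy network the names "Jean Martin" and "Jeanne Martin" are the most similar as strings, but the shared neighbours make "Jean Martin" and "J. Martin" the likelier duplicates. A candidate pair would still be shown to a person for confirmation rather than merged automatically.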
  • 22. (image slides, no text)
  • 23. 10^6 to 10^9: Institutional project
        • Only aggregated information can be presented
        • Faceted browsing / search very useful!
          – Use it!
        • e.g. Europeana: 53 × 10^6 items
  • 24. 10^6 to 10^9: Institutional project (HAL)
        http://traces1.saclay.inria.fr/inria/
        10^6 to 10^9: EU Project Cendari
  • 25. 10^6 to 10^9: EU Project Cendari
        https://notes.cendari.dariah.eu/
        10^6 to 10^9: Institutional project
        • Only aggregated information can be presented
        • Faceted browsing / search very useful!
          – Use it!
        • e.g. Europeana: 53 × 10^6 items
        • Problem: metadata quality and semantics
        • What is the date of a book?
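Faceted browsing amounts to counting how many records fall under each value of a metadata field, then narrowing the collection when the user picks a value. The sketch below uses invented records and field names, and the helpers facet_counts and refine are hypothetical; a real portal would compute these counts in its search index rather than in Python lists.

```python
from collections import Counter

# Hypothetical catalogue records; field names are illustrative only.
RECORDS = [
    {"type": "book", "language": "fr", "date": "1532"},
    {"type": "book", "language": "la", "date": "ca. 1540"},
    {"type": "map", "language": "fr", "date": "1532-1535"},
    {"type": "letter", "language": "fr", "date": None},
]

def facet_counts(records, field):
    """Count records per value of `field`, grouping absent values together."""
    return Counter(r.get(field) or "(missing)" for r in records)

def refine(records, **selected):
    """Keep only the records matching every selected facet value."""
    return [r for r in records if all(r.get(k) == v for k, v in selected.items())]

print(facet_counts(RECORDS, "language"))
print(facet_counts(refine(RECORDS, language="fr"), "type"))
print(facet_counts(RECORDS, "date"))
```

The date facet shows the metadata problem raised on the slide: an exact year, an approximate year, a range, and a missing value each end up as a separate facet value, so the counts are of little use until dates are given a common model.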
  • 26. > 10^9: World Scale
        • Few providers
          – Google
          – Photo collections (Flickr)
          – Astronomical databases
        • The cost of computing facets is too high for interactive response times
        • No good general solution
        > 10^9: Internet Backbone
        • Where are you?
        • Who cares?
  • 27. > 10^9: Query Previews
        • Query over very large data about the Earth
        http://www.cs.umd.edu/hcil/eosdis/
        Conclusion
        • Larger collections are harder to manage
          – Big data problem
        • A large collection can always be queried to extract a smaller collection
          – Scaling down the results increases the number of usable techniques
        • Still, current technologies are limited for DH
          – No management of uncertainty
          – No reasonable model of old geographical concepts
          – No good model of time and date
        • Still, use the tools and ask for improvements!
  • 28. References
        • Jacques Bertin. Semiology of Graphics: Diagrams, Networks, Maps. ESRI Press, Nov. 2010. ISBN 9781589482616
        • Edward Tufte. The Visual Display of Quantitative Information. Cheshire, CT: Graphics Press, 2010. ISBN 0-9613921-4-2
        • Tamara Munzner. Visualization Analysis and Design. A K Peters Visualization Series, CRC Press, 2014. ISBN 9781466508910
        • Alberto Cairo. The Truthful Art: Data, Charts, and Maps for Communication. New Riders, 2016. ISBN 0321934075
        • Tableau for Students: https://www.tableau.com/academic/students
        • Stefan Jänicke, Greta Franzini, Muhammad Faisal Cheema, Gerik Scheuermann. On Close and Distant Reading in Digital Humanities: A Survey and Future Challenges. Eurographics Conference on Visualization (EuroVis), STARs, 2015. http://dx.doi.org/10.2312/eurovisstar.20151113