Data visualization and digital humanities research: a survey of available data sets and tools. LITA National Forum 2011, St. Louis, MO. Friday, September 30, 2011. Erik Mitchell, University of Maryland; Susan Sharpless Smith, Wake Forest University
Motivation: “Digital humanities needs gateway drugs. Kudos to the pushers on the Google Books team.” - Dan Cohen, http://www.dancohen.org/2010/12/19/ “Linked open data could have the same leveraging effect that the World Wide Web had on computing, said Micki McGee, an assistant professor of sociology at Fordham University” - Steve Kolowich, The Promise of Digital Humanities, Inside Higher Ed
Birth of a word: “Imagine if you could record your life, everything you said, everything you did available in a perfect memory store at your finger tips.” - Deb Roy, The Birth of a Word, http://www.ted.com/
Overview: Discuss examples of data-focused research tools; Explore tools; Consider roles for librarians; Wrap-up/Q&A
Taxonomy of uses
Searching and Discovery. Examples: BYU Corpora http://corpus.byu.edu/; WOK Citation Mapping
Visualization Free Visualization Tools
Analysis and publishing: NodeXL http://nodexl.codeplex.com/
Tool Comparison - linguistics
Tool exploration. Discover / Search: What kinds of discovery tools exist, and how common are the discovery features across different datasets and systems? Visualization: What visualization features exist? Are there products that are easy to use? Are the skills transferable? Analysis / Annotation: What analytical tools are included? What analysis techniques are common?
Perseus http://www.perseus.tufts.edu
JSTOR Data For Research http://dfr.jstor.org
Wordseer: Aditi Muralidharan, Marti Hearst http://bebop.berkeley.edu/wordseer
Google’s Ngram Viewer: books.google.com/ngrams, culturomics.org. But here's the rub. Google Books, as others point out, wasn't really built for research. . . That means Google Books didn't come with the interfaces scholars need for vast data manipulation . . . http://chronicle.com/article/The-Humanities-Go-Google/65713/
TED talk on Google's Ngram Viewer http://www.ted.com/talks/what_we_learned_from_5_million_books.html
Concordancing: Eric Lease Morgan - http://dh.crc.nd.edu/sandbox/cyl/catalog/
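The slide names concordancing without describing it. As a rough illustration only, a keyword-in-context (KWIC) listing could be sketched in Python as below. The function name `kwic`, the `width` parameter, and the sample sentence are hypothetical and are not taken from Eric Lease Morgan's catalog.

```python
import re


def kwic(text, keyword, width=3):
    """Return (left context, match, right context) rows for each hit.

    Hypothetical helper: tokenizes on word characters and lowercases
    everything, which is a simplification of real concordancers.
    """
    tokens = re.findall(r"\w+", text.lower())
    target = keyword.lower()
    rows = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, tok, right))
    return rows


# Illustrative sample text, not drawn from any corpus named in the slides.
sample = "The library supports research. Research data needs the library."
for left, word, right in kwic(sample, "library", width=2):
    print(f"{left:>20} | {word} | {right}")
```

Each row lines the keyword up in a column with a few words on either side, which is the basic view a concordance offers for comparing usages.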
Google’s public data explorer http://www.google.com/publicdata/
Data analysis - NodeXL http://nodexl.codeplex.com/ Analyzing Social Media Networks with NodeXL: Insights from a Connected World
Data cleaning – Google Refine http://code.google.com/p/google-refine
Data visualization – Google Fusion Tables http://www.google.com/fusiontables/DataSource?dsrcid=332788 http://google.com/fusiontables
Research/teaching need: Researcher needs vary, from advanced linguistic analysis and IT support to basic digital content and infrastructure; Corpus-based research
Librarian contributions: Domain-specific and tool-type-specific comparisons; IT and research support (data analysis, data curation, identification of tools and data sources); Shift from “reference” to “research”, in sync with the move from resource discovery to thematic analysis
Next steps: Build new skills, develop new systems; Create tutorials and guides; Explore whether there is a connection between data curation, publishing, and these tools; Explore the role of library discovery systems and consider implementing new features.
Sites of interest

 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdf
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17
 
Science 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptxScience 7 Quarter 4 Module 2: Natural Resources.pptx
Science 7 Quarter 4 Module 2: Natural Resources.pptx
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 

Data Visualization and Digital Tools for the Humanities

  • 1. Data visualization and digital humanities research:  a survey of available data sets and tools LITA National Forum 2011 St. Louis, MO Friday, September 30, 2011 Erik Mitchell, University of Maryland Susan Sharpless Smith, Wake Forest University
  • 2. Motivation “Digital humanities needs gateway drugs. Kudos to the pushers on the Google Books team.” - Dan Cohen http://www.dancohen.org/2010/12/19/ “Linked open data could have the same leveraging effect that the World Wide Web had on computing, said Micki McGee, an assistant professor of sociology at Fordham University” -Steve Kolowich, The Promise of Digital Humanities, Inside HigherEd
  • 3. Birth of a word “Imagine if you could record your life, everything you said, everything you did available in a perfect memory store at your fingertips.” - Deb Roy – The Birth of a Word http://www.ted.com/
  • 4. Overview Discuss examples of data-focused research tools Explore tools Consider roles for librarians Wrap-up/Q & A
  • 6. Searching and Discovery Examples: BYU Corpora http://corpus.byu.edu/ WOK Citation Mapping
  • 8. Analysis and publishing NodeXL http://nodexl.codeplex.com/
  • 9. Tool Comparison - linguistics
  • 10. Tool exploration Discover / Search What kinds of discovery tools exist and how common are the discovery features across different datasets / systems? Visualization What visualization features exist, are there products that are easy to use, are the skills transferable? Analysis / Annotation What analytical tools are included, what analysis techniques are common?
  • 12. JSTOR Data For Research http://dfr.jstor.org
  • 13. Wordseer Aditi Muralidharan, Marti Hearst http://bebop.berkeley.edu/wordseer
  • 14. Google’s Ngram Viewer books.google.com/ngrams culturomics.org But here's the rub. Google Books, as others point out, wasn't really built for research. . . That means Google Books didn't come with the interfaces scholars need for vast data manipulation . . . http://chronicle.com/article/The-Humanities-Go-Google/65713/
  • 15. Ted talk on Google NGRAM viewer http://www.ted.com/talks/what_we_learned_from_5_million_books.html
  • 16. Concordancing Eric Lease Morgan - http://dh.crc.nd.edu/sandbox/cyl/catalog/
  • 17. Google’s public data explorer http://www.google.com/publicdata/
  • 18. Data analysis - NodeXL http://nodexl.codeplex.com/ Analyzing Social Media Networks with NodeXL: Insights from a Connected World
  • 19. Data cleaning – Google Refine http://code.google.com/p/google-refine
  • 20. Data visualization – Google Fusion Tables http://www.google.com/fusiontables/DataSource?dsrcid=332788 http://google.com/fusiontables
  • 21. Research/teaching need Researcher needs vary from advanced linguistic analysis and IT support to need for basic digital content/infrastructure Corpus-based research
  • 22. Librarian contributions Domain specific, tool-type specific comparisons IT and research support – data analysis, data curation, tool/data sources identification Shift from “reference” to “research” in sync with move from resource discovery to thematic analysis
  • 23. Next steps Build new skills, develop new systems Create tutorials and guides Explore connections between data/curation and publishing and these tools: is there a connection? Explore role of library discovery systems and consider new feature implementation.
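
Slide 16 mentions concordancing. As a rough illustration of what a keyword-in-context (KWIC) concordance produces, here is a minimal Python sketch. It is not the concordancing software referenced on the slide; the function name `kwic`, the sample sentence, and the simple regex tokenization are all assumptions made for illustration.

```python
import re


def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each match.

    Tokenization is a simple assumption: runs of letters and apostrophes.
    """
    tokens = re.findall(r"[A-Za-z']+", text)
    hits = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits


# Hypothetical sample text, chosen only to have a repeated word.
sample = ("It is a truth universally acknowledged, that a single man in "
          "possession of a good fortune, must be in want of a wife.")

# Print each occurrence of the keyword aligned in a column.
for left, word, right in kwic(sample, "a", width=2):
    print(f"{left:>20} | {word} | {right}")
```

A real corpus tool would add lemmatization and part-of-speech filtering on top of this; the sketch only shows the index-with-context idea.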

Editor's Notes

  1. Today we are presenting on a “summer exploration” project completed at WFU. It was wide in scope and exploratory in nature, and we are here today to share what we found. From the article The Promise of Digital Humanities (Inside HigherEd, September 28, 2011): They are building tools that could facilitate insights into history, language, art and culture that human researchers might never have been able to glean on their own. And some say that could help restore public interest in the humanities. Digital humanities is a hot topic this year: the NEH held a symposium on Tuesday for 60 recipients of its 2011 Digital Humanities Start-Up Grants, most of whom were given between $25,000 and $50,000. Digital humanities is a branch of scholarship that takes the computational rigor that has long undergirded the sciences and applies it to the study of history, language, art and culture.
  2. We got interested because WFU faculty were talking about DH research. We saw lots of enthusiasm but little knowledge about what really existed. Story: different definitions, WFU DH Institute, Computational Humanities, linguistics. Point: it is clear that the field has energy and that DH is focusing on the same structures and information tools as libraries.
  3. Discuss how data and computational power are sexy. We pause for a moment to mention this video specifically (Deb Roy, 20 minutes). It focuses on the impact of large-scale data collection and cross analysis: “Imagine if you could record your life, everything you said, everything you did available in a perfect memory store at your fingertips.” The picture shows the connection between a televised moment (Obama's State of the Union speech) at the bottom of the screen and all of the social media conversations happening in real time at the top of the screen. Network graph: a wider view of experience, understanding ideas from more than one perspective. Point: consider the impact if librarians could help students and researchers begin this type of data analysis.
  4. We are going to present some examples of “data”-focused research tools. Definition: databases that allow asking research questions focused on data. We are going to explore tools that fit three functions: searching/discovery, visualization, and analysis/publishing. Consider how these tools could impact teaching and research. Consider the roles that librarians can play in this field.
  5. The goal of this chart is to introduce the types of tools and show how they complement each other.
     Discovery: text searching; citation chaining (tracing citations both forward and backward, something core to academic research; WOK citation mapping gives a visual of this idea, and this data can be exported); concept exploration, facets and contextual metadata.
     Visualization (for both presentation and behind the scenes): mapping; graphing; charting; data cleanup and normalization.
     Analysis/Publishing: dataset publishing; statistical analysis; annotation (tagging text), drilling in, inverse.
     Be aware that there is overlap among the groups.
  6. Discuss types of discovery.
     Corpus (collection of written texts) exploration: full text, linguistic components, concepts (copa, coca, ngram . .). Examples at: http://corpus.byu.edu
     Bibliometrics: citation trees (Web of Knowledge, DFR). Bibliometrics is a set of methods used to study or measure texts and information. Citation analysis and content analysis are commonly used bibliometric methods. Used to study the impact of researchers, papers, journals, and academic output (Eigenfactor recommends); new project coming out.
     Metadata: structured data on any topic (Google public data, GIS).
     Hybrid: JSTOR DFR (Data for Research) is a good example; it includes full-text searching, metadata limiting, and bibliometrics.
  7. Purpose: the main goal of data visualization is to communicate information clearly and effectively through graphical means. Many free tools are available for visualization (link on slide). The purpose of these tools is to provide visualization and data exploration platforms; NodeXL is an Excel plug-in for Windows. Types of visualization: data cleaning; data analysis; graphical representations of data, i.e. table, map, heatmap, line chart, bar graph, pie chart, scatter plot, timeline, storyline, or motion (animation over time). Google Fusion Tables does all of these. One example using GIS: http://inside.uidaho.edu/ Google Fusion Tables: http://www.google.com/fusiontables/Home
  8. These tools allow statistical analysis of data or provide a platform for visualization or publishing. A great, understandable example is the Google Public Data Explorer. We will look at this in a few minutes.
  9. The second thing we did was explore. We tried to compare linguistic tools. Article: Literary & Linguistic Computing (journal), Corpus design criteria, Volume 7, 1992. How we explored: interviews, datasets, tools; focused on linguistics. The goal of this slide is to talk through one comparison exercise: corpus.byu.edu Corpus of Contemporary American English, Google Books, British National Corpus. Findings: lack of consistency, new search features. The need here is for published comparative documents. All familiar but in a different context: word frequency, concordancing, lemmatization (roots), semantic and syntactic relationships, KWIC, sense disambiguation, links, population scope (open/closed), random. Point: librarians already know, to an extent, what these tools can do.
     Word frequency.
     Concordancing: index of words in a text, often shown in context of sentence structure.
     Ability to search / lemmatization: searching words using roots.
     Semantic relationships: derived relationships (e.g. is done by, is described as).
     Syntactic relationships: part-of-speech labeling; sentence decomposition (Stanford parser).
     Collocation: KWIC, sequence of words that are taken together.
     Sense disambiguation (e.g. run, running, ran).
     Link to lexical database: dictionary of words, http://wordnet.princeton.edu/
     How is the population defined? Is the corpus open or closed? Was it a random sample, a limited text source? What impact does that have on generalization?
     Synchronic/diachronic: does the corpus focus on a "point in time" or change over time?
     Monolingual/bilingual/plurilingual: what languages are represented?
  10. Now we are going to explore some tools. We grouped them into three areas: discovery, visualization, and analysis. We have included some questions that we asked as we explored.
  11. A tool that I did not know about until recently is Perseus, mentioned in Project Bamboo. It is a digital humanities research tool at Tufts. I listened to David Mimno from Princeton talk about computational humanities, spanning between distant and close reading. David was the head programmer for Perseus for many years. Features include: direct access to text searching; the ability to explore connections between documents through lexicons and concordances (alphabetical lists of words); seeing the position of a text in the larger collection.
  12. At the talk I attended, David also talked about his work on computational topic modeling using JSTOR data. It is an interesting talk; you can find it at mith.umd.edu under Digital Dialogues. Recap his idea: if you analyze all of the text in a specific set of journals (Classics journals), you can see changes in topics and language over time (he found that in the 1980s the two fields of philology (books) and archaeology converged in some journals); generate topics that show granular ‘aboutness’ (some interesting discussions about the value of human vs. computing models); and explore aboutness not from a qualitative ‘hunch’ but from statistical comparison.
      Demo: I want to see what topics academics have explored with jane austen.
      1. dfr.jstor.org, log in.
      2. You can search, view chart data, or view citations.
      3. You can export, although by default you are limited to 1000 records.
      4. I searched for jane austen, limited to research articles, limited to subject language and literature.
      5. I then downloaded the data: data requests > submit new request.
      6. Download key terms, csv -> janeaustenkeyterms.
      7. Check email, wait, download.
  13. From another presentation at UMD MITH, by Aditi Muralidharan. This is a highly focused corpus database that includes semantic relationship analysis, visualization tools, and data annotation. Neat hybrid system. Wordseer focuses on slave narratives. Discovery, annotation, visualization; semantic relationships. Demo: link to it; Examples > god, point to chart; add bless; click on heat maps, or read/annotate.
  14. Google NGRAM: I expect we are familiar with the Ngram Viewer, http://books.google.com/ngrams. Work by Jean-Baptiste Michel and lots of others. 2009 snapshot, 5.2 million books; English, French, German, Hebrew, Russian, Spanish, Chinese. Best data is between 1800 and 2000. Searching: date, phrase, language, smoothing (average of occurrence over years), ngrams (sequences of n consecutive words, e.g. 2, 3, or 4). Discover trends: for instance, while the concept of “good cats” has remained steady (but limited), there has been diminishing focus on “good dogs” in the 20th century. Does this point to a disturbing trend in dog goodness? But be careful (culturomics.org): what does this data say? Paper by Jean-Baptiste Michel and lots of other folks: “Quantitative analysis of culture using millions of books”.
  15. In fact there was a recent TED talk on the Ngram Viewer. In 15 minutes it gives a good overview of the background and uses of the system.
  16. We found innumerable tools for processing! Eric Lease Morgan at Notre Dame has done some interesting work in this area and has released his Lingua Perl modules for processing. There are other methods; the Stanford parser, for example, offers these tools. He developed concordancing software, available in CPAN. Great iPad demo here. His data is from the Internet Archive, an interesting source of data for harvesting and analysis. You can see he focuses on some other specific search methods. Point of this one: Wordseer and Catholic are both special-collection focused, with different research tools available. Problem: this proves to be very confusing for people trying to practice a research method across multiple data sets.
  17. Google Public Data Explorer: a visualization tool that animates so you can see change over time. You can also embed charts into your website (link icon in upper right corner). Over 40 data sets are currently uploaded and ready to use. It allows simple visualization tools to be applied to any dataset. Quick demo of unemployment rate: do the search, show how you limit the results. Views: line chart, bar chart, map, bubble chart.
  18. NodeXL is a tool to display and analyze data through a network graph. It is open source, Windows only, and is an Excel template. Specifically, NodeXL was designed to facilitate learning the concepts and methods of social network analysis, with visualization as a key component. What can you do? *Easily* customize the graph’s appearance; zoom, scale and pan the graph; dynamically filter vertices and edges; alter the graph’s layout; find clusters of related vertices; and calculate graph metrics. What I like is that I could use it quickly by importing data. Built-in connections for getting networks from Twitter, Flickr, YouTube, and your local email are provided. Additional importers for Exchange Email, Facebook, and Hyperlink networks are available here. There is a 47-page tutorial, which was a good indication that it is not totally intuitive to learn; however, it has good flexibility.
  19. We also found a number of data cleaning tools. There is a great site, digitalresearchtools.pbworks.com, that lists a lot of these tools. Google Refine runs in Chrome; it supports up to 200K rows, which is actually not that much when we get to humanities data.
      1. Go to erikmitchell.dyndns.org:3333 and explain what you are doing.
      2. I downloaded key terms from JSTOR by doing a search for jane austen.
      3. I imported the file using defaults.
      4. It imported weight and key terms.
      5. Weight is the relevance or centrality to the document (e.g. every document has a term with rank 1).
      6. Let's say I just want to see the central words.
      7. weight, facet, numeric facet.
      8. Limit to .98-1.
      9. You can see this drops the matching rows.
      10. Now let's say I want to see how many times each of these key terms is used.
      11. keyterms -> facet -> text facet.
      12. Sort by count.
      13. You can include and exclude, perform other data analysis, etc.
      If this is interesting, there are some good quick video tutorials on the site. Modify for XML or wiki publishing formats.
  20. Google Refine is also designed to work with their visualization tools. We showed Public Data Explorer; there is also Google Fusion Tables. Fusion Tables makes it very easy to connect and explore data. Here is one link.
  21. So what did we find? We found lots of tools, lots of uses, lots of data. We ultimately decided that there is a strong research and teaching need. This slide is to talk about the data-focused research activities that we found researchers engaged in. A second part of our project was to explore research needs. Widely varied: some statistical, some linguistic, some just wanted to digitize stuff. Jerid is actually doing research on movie subtitles and translation . . not sure what this is . . Focus on http://francojc.wordpress.com/ List of publications from corpus-based research: http://corpus.byu.edu/publicationSearch.asp
  22. We also found that there are areas for us to contribute. Conversational. One BYU comparison, http://googlebooks.byu.edu/compare-googleBooks.asp, compares “possible” and “not possible” for the following functionality: exact words and phrases; related words and cultural insights; searching for concepts; changes in meaning; collocates (nearby words) and cultural shifts; function of words; grammatical changes; language change and genre. A tool to locate research data is being developed by Purdue Libraries (Michael Witt) and Penn State: Databib. The goal is “to create a community-driven, annotated bibliography of research data repositories”. http://databib.lib.purdue.edu/
  23. Next steps. First: librarians already understand metadata interoperability and harvesting; we should expand our understanding of these fields to include full-text data and develop toolkits to facilitate harvesting and meshing of research data from different sources. This includes tools like the Stanford NLP parser (nlp.stanford.edu/software/lex-parser.shtml), a tool that facilitates the coding and parsing of text data. Second: librarians understand searching across multiple systems; we need to build on this skill by honing our abilities to perform content analysis and generalize results. Third: we need to better understand the landscape of research data. This means understanding types of data sets and sources of data. It also means having the ability to crosswalk data between databases. It also means getting past resource discovery and into resource analysis. Fourth: we need qualitative and quantitative research skills; we need to be able to help researchers know when they have a representative sample and how to harvest, code, and analyze that data. Fifth: we bring a multi-disciplinary understanding of domains of knowledge; we need to leverage that familiarity with active research agendas. The story here is about the HathiTrust search in Summon and in OCLC. These searching platforms are trying to leverage book full text in a new way, but what else could they do?
  24. Can we add the list of tools? https://digitalresearchtools.pbworks.com/w/page/17801672/FrontPage
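
The Google Refine walkthrough in note 19 (a numeric facet on weight limited to .98-1, then a text facet on key terms sorted by count) can be approximated outside Refine. The following is a minimal Python sketch under stated assumptions: the column names `doc_id`, `keyterm`, and `weight`, the sample rows, and the helper names are hypothetical and do not reflect the actual layout of a JSTOR Data for Research export.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for a key-terms CSV export; columns and values are assumed.
SAMPLE_CSV = """doc_id,keyterm,weight
a1,novel,1.0
a1,marriage,0.62
a2,irony,1.0
a2,novel,0.99
a3,novel,0.98
a3,estate,0.40
"""


def load_rows(csv_text):
    """Parse CSV text into dicts, converting weight to a float."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    for row in rows:
        row["weight"] = float(row["weight"])
    return rows


def numeric_facet(rows, low, high):
    """Keep rows whose weight falls in [low, high], like a numeric facet range."""
    return [r for r in rows if low <= r["weight"] <= high]


def text_facet(rows):
    """Count occurrences of each key term, most frequent first."""
    return Counter(r["keyterm"] for r in rows).most_common()


rows = load_rows(SAMPLE_CSV)
central = numeric_facet(rows, 0.98, 1.0)
print(len(rows), "rows,", len(central), "central")
for term, count in text_facet(central):
    print(term, count)
```

For a real export, `load_rows` would read from a file opened with `newline=""` instead of an inline string, and the column names would need to match whatever the download actually contains.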