Bits of Research
Dr. Michele C. Weigle
Graduate Program Director
Web Sciences and Digital Libraries (WS-DL) Lab
Department of Computer Science
Old Dominion University
June 26, 2014
Outline
• Bits of Research
– digital preservation and web archiving
– information visualization
• Potential Summer Projects
– Blue Button visualization (health informatics)
– visualizing aggregate health data
– exploring large document collections
ODU's WS-DL Group
Web Sciences and Digital Libraries
– digital preservation
– web archiving
– web science (social media analysis, web usage analysis)
• Our recent work has been featured in the popular press
@WebSciDL
http://ws-dl.cs.odu.edu
What is a web archive?
What are some web archives?
How can I access the archives?
http://www.mementoweb.org/
MementoFox
Memento for Chrome
http://ws-dl.blogspot.com/2010/03/2010-03-19-mementofox-add-on-released.html
http://ws-dl.blogspot.com/2013/10/2013-10-14-right-click-to-past-memento.html
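A minimal sketch of what Memento clients such as these do under the hood, assuming the public Memento aggregator exposes a TimeGate at the URL below (an assumption to verify before use, not something stated on the slide). The client asks the TimeGate for a page "as of" a date by sending an `Accept-Datetime` header; the helper name `build_timegate_request` is hypothetical, and the sketch only builds the request rather than sending it.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Assumed TimeGate endpoint of the public Memento aggregator; verify before use.
TIMEGATE = "http://timetravel.mementoweb.org/timegate/"


def build_timegate_request(url, when):
    """Return (request_url, headers) for a Memento TimeGate lookup.

    `url` is the live-web address; `when` is a timezone-aware datetime
    naming the point in the past we would like to see.
    """
    # Accept-Datetime uses the HTTP date format and is always expressed in GMT.
    accept_datetime = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    return TIMEGATE + url, {"Accept-Datetime": accept_datetime}


request_url, headers = build_timegate_request(
    "http://ws-dl.cs.odu.edu",
    datetime(2012, 2, 11, tzinfo=timezone.utc),
)
print(request_url)
print(headers)
```

Sending this request with any HTTP client should, if the endpoint assumption holds, redirect to the archived copy closest to the requested date.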
The Web holds our stories
But webpages can disappear
• Average lifespan of a webpage - 50-100 days
• A year after publication, about 11% of content
shared on social media will be gone.
SalahEldeen and Nelson, "Losing My Revolution: How Many Resources Shared on Social Media Have Been Lost?", TPDL 2012
http://ws-dl.blogspot.com/2012/02/2012-02-11-losing-my-revolution-year.html
But maybe it's archived
Ainsworth, AlSum, SalahEldeen, Weigle, and Nelson, "How Much of the Web is Archived?", JCDL 2011
http://ws-dl.blogspot.com/2011/06/2011-06-23-how-much-of-web-is-archived.html
But social media is hard to archive
Our Research Group Goals
• We believe that web archives are valuable
cultural resources, and we want everyone to
know about them.
• We want to make it easy for people to bridge
the gap between the live web and the archives.
• We believe that replaying the past is more
compelling than reading a summary.
Replaying the past can be more compelling than just a summary
Overview of Current Projects
• What is archived?
– how much of the web is archived?
– how much of the Arabic web is archived?
– what domains does each archive hold?
• How do people use web archives?
– access patterns for humans and robots
– what languages do they look for?
• How well are web pages archived?
– archivability of webpages
– damage of archived pages
– tools to allow users to create their own archives
• Improving access for users
– telling stories with web archives
– archive visualization
How do people use web archives?
• We obtained a year's worth (2012) of requests
to the Internet Archive's Wayback Machine
– client IPs anonymized
How do people use web archives?
• First, there are a lot of robots (aka bots) that
access the archive
– 10 bot sessions for every 1 human session
– maybe people don't know about the archive?
• Typical human sessions are pretty short
– people aren't spending lots of time in the archive
AlNoamany, Weigle, and Nelson, "Access Patterns for Robots and Humans in Web Archives", JCDL 2013
How do people use web archives?
• 65% of the requested archived pages no longer
exist on the live web
• People use the archive because the pages they
are interested in no longer exist
AlNoamany, AlSum, Weigle, and Nelson, "Who and What Links to the Internet Archive", IJDL, to appear, 2013
User Access Patterns
AlNoamany, Weigle, and Nelson, "Access Patterns for Robots and Humans in Web Archives", JCDL 2013
Everybody Dips, Humans Dive, Robots Skim
Robots (34,203 sessions) Humans (3,431 sessions)
AlNoamany, Weigle, and Nelson, "Access Patterns for Robots and Humans in Web Archives", JCDL 2013
What domains does each archive hold?
AlSum, Weigle, Nelson and Van de Sompel, "Profiling Web Archive Coverage for Top-Level Domain and Content Language," TPDL 2013.
Sometimes the live web "leaks" into the archive
[Figure: screenshots labeled "Sept 3, 2008" and "2012"]
http://ws-dl.blogspot.com/2012/10/2012-10-10-zombies-in-archives.html
Archive What I See Now
• Standard web archiving tools are
difficult for non-IT experts.
• "Save Page As" is not suitable
for archiving purposes.
• Pages are behind authentication.
• Pages change quickly, but
current state needs archiving.
http://bit.ly/wc-wail
WARCreate and WAIL
• WARCreate
– generate archive of any
webpage, right from your
browser
• WAIL
– gathers institutional-strength
archiving tools into simple
package/GUI
– run your own Wayback
Machine
http://bit.ly/wc-wail
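To make "generate archive of any webpage" concrete, here is a rough sketch of the kind of record a WARC file contains. It is not WARCreate's implementation (WARCreate is a browser extension); the function name `make_warc_response_record` and the sample HTTP response are hypothetical, and a real archive would also include request and metadata records. The sketch serializes a single WARC/1.0 `response` record using only the standard library.

```python
import uuid
from datetime import datetime, timezone


def make_warc_response_record(target_uri, http_bytes, when):
    """Serialize one WARC/1.0 'response' record holding a raw HTTP response.

    `http_bytes` is the full HTTP response (status line, headers, body);
    `when` is a timezone-aware datetime for the capture time.
    """
    warc_date = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    header_lines = [
        "WARC/1.0",
        "WARC-Type: response",
        "WARC-Target-URI: " + target_uri,
        "WARC-Date: " + warc_date,
        "WARC-Record-ID: <urn:uuid:" + str(uuid.uuid4()) + ">",
        "Content-Type: application/http; msgtype=response",
        "Content-Length: " + str(len(http_bytes)),
    ]
    # Header block, a blank line, the payload, then two CRLFs to close the record.
    head = "\r\n".join(header_lines).encode("utf-8")
    return head + b"\r\n\r\n" + http_bytes + b"\r\n\r\n"


# Hypothetical captured response, used only for illustration.
sample_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<html><body>hello</body></html>"
)
record = make_warc_response_record(
    "http://example.com/",
    sample_response,
    datetime(2014, 6, 26, tzinfo=timezone.utc),
)
print(record.decode("utf-8"))
```

Appending records like this one to a `.warc` file gives replay tools such as a local Wayback Machine something to read, which is the workflow the slide describes.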
Outline
• Bits of Research
– digital preservation and web archiving
– information visualization
• Potential Summer Projects
– Blue Button visualization (health informatics)
– visualizing aggregate health data
– exploring large document collections
What is Information Visualization?
• "The purpose of information
visualization is insight, not
pictures" -Ben Shneiderman
• Interactive representation of
non-physically based data
• Concerned with how people
think, human perception
• Has ties to cognitive science,
psychology, data mining
Info Vis Research Is Collaborative
• Must have data and problems to solve
• Must work with experts in other domains
– medical doctors - hearing data analysis
– social scientists - welfare data record linkage
– technology analysts - trends in scientific research
– web archivists - viewing collections of web
archives
Navy Hearing Conservation Program
FluNet Visualization
Academic Paper Browser
MeSH Viewer
Visualizing Currency Volatility
Outline
• Bits of Research
– digital preservation and web archiving
– information visualization
• Potential Summer Projects
– Blue Button visualization (health informatics)
– visualizing aggregate health data
– exploring large document collections
Blue Button+ Visualization
• Blue Button - allows patients
to download their health data
• Goals:
– perform research into previous
approaches to visualizing Blue
Button data
– begin the development of a
novel visualization of Blue
Button data
http://blue-button.github.io/bbClear/bluebutton.html
Visualizing Aggregate Health Data
• Aggregate health data
– statistics for a large
number of people (by
country, US state, US
county, etc.)
• Goal:
– become familiar with
using public datasets
and web-based
visualization tools
http://www.ers.usda.gov/data-products/food-environment-atlas/go-to-the-atlas.aspx#.U6hHpY1dUhs
Exploring Large Document Collections
• CORE
– API access to a large
number of Open Access
academic publication
repositories
• Goals:
– investigate the use of the
CORE API
– begin the implementation of
tools that could be used in
exploring large document
collections
http://www.dlib.org/dlib/july12/herrmannova/07herrmannova.html
Outline
• Bits of Research
– digital preservation and web archiving
– information visualization
• Potential Summer Projects
– Blue Button visualization (health informatics)
– visualizing aggregate health data
– exploring large document collections
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 

Bits of Research

  • 1. Bits of Research Dr. Michele C. Weigle Graduate Program Director Web Sciences and Digital Libraries (WS-DL) Lab Department of Computer Science Old Dominion University June 26, 2014
  • 2. Outline • Bits of Research – digital preservation and web archiving – information visualization • Potential Summer Projects – Blue Button visualization (health informatics) – visualizing aggregate health data – exploring large document collections
  • 3. ODU's WS-DL Group Web Sciences and Digital Libraries – digital preservation – web archiving – web science (social media analysis, web usage analysis) • Our recent work has been featured in the popular press @WebSciDL http://ws-dl.cs.odu.edu
  • 4. What is a web archive?
  • 5. What are some web archives?
  • 6. How can I access the archives? http://www.mementoweb.org/ MementoFox Memento for Chrome http://ws-dl.blogspot.com/2010/03/2010-03-19-mementofox-add-on-released.html http://ws-dl.blogspot.com/2013/10/2013-10-14-right-click-to-past-memento.html
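Slide 6 points to Memento as the way to reach archived copies. As a rough illustration only, the sketch below builds the kind of request a Memento client sends: the original URL is handed to a TimeGate along with an `Accept-Datetime` header naming the desired date. The TimeGate prefix and the function name `build_timegate_request` are assumptions made for this example and do not appear in the slides; no network call is made.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Assumption: a public aggregator TimeGate prefix; the slides do not name one.
TIMEGATE_PREFIX = "http://timetravel.mementoweb.org/timegate/"


def build_timegate_request(original_url, when):
    """Return (timegate_url, headers) asking for the capture closest to `when`."""
    # Accept-Datetime carries an HTTP-style date expressed in GMT.
    when_utc = when.astimezone(timezone.utc)
    headers = {"Accept-Datetime": format_datetime(when_utc, usegmt=True)}
    return TIMEGATE_PREFIX + original_url, headers


if __name__ == "__main__":
    url, headers = build_timegate_request(
        "http://ws-dl.cs.odu.edu",
        datetime(2014, 6, 26, tzinfo=timezone.utc),
    )
    print(url)
    print(headers)
```

The returned pair could be passed to any HTTP client; a TimeGate would normally answer with a redirect to the archived copy nearest the requested date.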
  • 7. The Web holds our stories
  • 8. But webpages can disappear • Average lifespan of a webpage: 50-100 days • A year after publication, about 11% of content shared on social media will be gone. SalahEldeen and Nelson, "Losing My Revolution: How Many Resources Shared on Social Media Have Been Lost?", TPDL 2012 http://ws-dl.blogspot.com/2012/02/2012-02-11-losing-my-revolution-year.html
  • 9. But maybe it's archived Ainsworth, AlSum, SalahEldeen, Weigle, and Nelson, "How Much of the Web is Archived?", JCDL 2011 http://ws-dl.blogspot.com/2011/06/2011-06-23-how-much-of-web-is-archived.html
  • 10. But social media is hard to archive
  • 11. Our Research Group Goals • We believe that web archives are valuable cultural resources, and we want everyone to know about them. • We want to make it easy for people to bridge the gap between the live web and the archives. • We believe that replaying the past is more compelling than reading a summary.
  • 26. Replaying the past can be more compelling than just a summary
  • 27. Overview of Current Projects • What is archived? – how much of the web is archived? – how much of the Arabic web is archived? – what domains does each archive hold? • How do people use web archives? – access patterns for humans and robots – what languages do they look for? • How well are web pages archived? – archivability of webpages – damage of archived pages – tools to allow users to create their own archives • Improving access for users – telling stories with web archives – archive visualization
  • 28. How do people use web archives? • We obtained a year's worth (2012) of requests to the Internet Archive's Wayback Machine – client IPs anonymized
  • 29. How do people use web archives? • First, a lot of robots (aka bots) access the archive – 10 bot sessions for every 1 human session – maybe people don't know about the archive? • Typical human sessions are pretty short – people aren't spending lots of time in the archive AlNoamany, Weigle, and Nelson, "Access Patterns for Robots and Humans in Web Archives", JCDL 2013
  • 30. How do people use web archives? • 65% of the requested archived pages no longer exist on the live web • People use the archive because the pages they are interested in no longer exist AlNoamany, AlSum, Weigle, and Nelson, "Who and What Links to the Internet Archive", IJDL, to appear, 2013
  • 31. User Access Patterns AlNoamany, Weigle, and Nelson, "Access Patterns for Robots and Humans in Web Archives", JCDL 2013
  • 32. Everybody Dips, Humans Dive, Robots Skim Robots (34,203 sessions) Humans (3,431 sessions) AlNoamany, Weigle, and Nelson, "Access Patterns for Robots and Humans in Web Archives", JCDL 2013
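The session counts on slide 32 can be checked against the "10 bot sessions for every 1 human session" figure from slide 29. This is a quick arithmetic sketch, assuming the two slides describe the same sample; the variable names are chosen only for this example.

```python
# Session counts as printed on slide 32.
robot_sessions = 34_203
human_sessions = 3_431

# Robot sessions per human session.
ratio = robot_sessions / human_sessions

# Share of all sessions that came from humans.
human_share = human_sessions / (robot_sessions + human_sessions)

print(f"{ratio:.2f} robot sessions per human session")
print(f"{human_share:.1%} of sessions are human")
```

The ratio comes out just under 10, which is consistent with the rounded 10-to-1 figure quoted earlier.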
  • 33. What domains does each archive hold? AlSum, Weigle, Nelson and Van de Sompel, "Profiling Web Archive Coverage for Top-Level Domain and Content Language," TPDL 2013.
  • 34. What domains does each archive hold? AlSum, Weigle, Nelson and Van de Sompel, "Profiling Web Archive Coverage for Top-Level Domain and Content Language," TPDL 2013.
  • 36. Archive What I See Now • Standard web archiving tools are difficult for non-IT experts. • "Save Page As" is not suitable for archiving purposes. • Pages are behind authentication. • Pages change quickly, but current state needs archiving. Archive What I See Now http://bit.ly/wc-wail
  • 37. WARCreate and WAIL • WARCreate – generate archive of any webpage, right from your browser • WAIL – gathers institutional-strength archiving tools into simple package/GUI – run your own Wayback Machine http://bit.ly/wc-wail
  • 38. Outline • Bits of Research – digital preservation and web archiving – information visualization • Potential Summer Projects – Blue Button visualization (health informatics) – visualizing aggregate health data – exploring large document collections
  • 39. What is Information Visualization? • "The purpose of information visualization is insight, not pictures" -Ben Shneiderman • Interactive representation of non-physically based data • Concerned with how people think, human perception • Has ties to cognitive science, psychology, data mining
  • 40. Info Vis Research Is Collaborative • Must have data and problems to solve • Must work with experts in other domains – medical doctors - hearing data analysis – social scientists - welfare data record linkage – technology analysts - trends in scientific research – web archivists - viewing collections of web archives
  • 46. Outline • Bits of Research – digital preservation and web archiving – information visualization • Potential Summer Projects – Blue Button visualization (health informatics) – visualizing aggregate health data – exploring large document collections
  • 47. Blue Button+ Visualization • Blue Button - allows patients to download their health data • Goals: – perform research into previous approaches to visualizing Blue Button data – begin the development of a novel visualization of Blue Button data http://blue-button.github.io/bbClear/bluebutton.html
  • 48. Visualizing Aggregate Health Data • Aggregate health data – statistics for a large number of people (by country, US state, US county, etc.) • Goal: – become familiar with using public datasets and web-based visualization tools http://www.ers.usda.gov/data-products/food-environment-atlas/go-to-the-atlas.aspx#.U6hHpY1dUhs
  • 49. Exploring Large Document Collections • CORE – API access to a large number of Open Access academic publication repositories • Goals: – investigate the use of the CORE API – begin the implementation of tools that could be used in exploring large document collections http://www.dlib.org/dlib/july12/herrmannova/07herrmannova.html
  • 50. Outline • Bits of Research – digital preservation and web archiving – information visualization • Potential Summer Projects – Blue Button visualization (health informatics) – visualizing aggregate health data – exploring large document collections