14 August 2013
Data 101:
A Gentle Introduction
Presented by
Kimberly Silk, MLS,
Data Librarian, Martin Prosperity Institute,
Rotman School of Management, University of Toronto
Our Agenda
• Defining data librarianship
• Basic terminology
• Common data sources
• Our challenge: data management, preservation,
discovery and access
• What are “big data”?
• What are data visualizations?
• Sources
• Q & A
Defining Data Librarianship
• Data librarianship is a relatively new area of practice,
emerging with the growth of digital media since the
1970s;
• Data librarians are professional library staff engaged in
managing research data as a resource, and supporting
researchers in these activities;
• We support our institutions and researchers in the
areas of data management, metadata management,
and teaching how to use data as a resource;
• Many of us work in the social sciences, but there is
growth in the natural sciences and humanities as well.
Basic Terminology
• Data – plural! Think: Squirrels!! 
• Microdata – raw data, individual records consisting of rows of
numbers (Excel spreadsheet);
• Statistics – summarized tables and cross-tabulations that have
been formulated from the raw data;
• Aggregate data – statistical summaries organized in a data file
structure (Excel) that permits further analysis;
• PUMF – Public Use Microdata File – raw data that are available for
public use; some data may be filtered and geographies suppressed
to ensure personal privacy;
• Variables – a set of factors, traits or conditions that describes a
unit of analysis; for instance, sex, age, marital status, etc.
• Frequencies – the number of times an observation occurs in the
data;
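The terms above can be illustrated with a minimal Python sketch. The records, variable names and values are invented for illustration and do not come from any real survey: each dictionary stands for one microdata record, each key is a variable, and the counts show frequencies and a simple cross-tabulation.

```python
from collections import Counter

# Hypothetical microdata: one record per respondent (invented values).
microdata = [
    {"sex": "F", "age": 34, "marital_status": "married"},
    {"sex": "M", "age": 51, "marital_status": "single"},
    {"sex": "F", "age": 28, "marital_status": "single"},
    {"sex": "M", "age": 45, "marital_status": "married"},
    {"sex": "F", "age": 62, "marital_status": "married"},
]

# Variables: the traits recorded for each unit of analysis.
variables = sorted(microdata[0].keys())

# Frequencies: how many times each value of one variable occurs.
marital_frequencies = Counter(r["marital_status"] for r in microdata)

# Cross-tabulation: counts for each combination of two variables.
crosstab = Counter((r["sex"], r["marital_status"]) for r in microdata)

print(variables)
print(dict(marital_frequencies))
for (sex, status), count in sorted(crosstab.items()):
    print(sex, status, count)
```

The list of records plays the role of the microdata; the two Counter objects are the kind of summary that would be published as statistics or aggregate data.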
Common Data Sources
• Government-collected surveys
– US Census (American FactFinder)
– Bureau of Labor Statistics, Bureau of Economic Analysis
– Statistics Canada
– International sources such as UK Data Archive, Swedish
National Data Service, Australian Data Archive, etc.
– OECD iLibrary
– World DataBank
– Pew Research Center
– Gallup
– Thomson
Other International Data Sources
• Some countries do not gather data, have not
been gathering data for very long, or else limit
or filter available data
• For instance, Russia, India, China and other
developing countries may not gather, preserve
or release their data;
• The BRICs (Brazil, Russia, India, China) will
struggle with this issue as their economies
grow.
Uncommon Data Sources
• Data can come from everywhere;
• Occasionally, the MPI acquires data from
unusual sources, such as:
– Rolling Stone magazine
– MySpace social media site for bands
– CrunchBase database of technology companies
Data Management, Preservation, Discovery & Access
• We’ve conquered print collections,
but data present a new challenge;
• As with all digital files, data assets need
metadata to describe them;
• Like images, a single data set can
mean many things to many people;
• How do we manage these data to
make sure they are discoverable,
accessible, and preserved?
• Traditionally, data files have been
stored on network drives, and shared
or restricted according to the groups
who need to use them;
• Network drives are difficult to search,
can be hard to share and restrict, and
don’t deal with metadata well;
• Web pages with links have been a
common way to distribute data sets;
• We needed new tools – a new kind of
catalogue that is designed for the
specialized needs of data.
Data Discovery Platforms
• Nesstar – developed in Norway by Norwegian
Social Science Data Services, used by Statistics
Canada, UK Data Archive, NORC at the
University of Chicago
• ODESI – proprietary system developed and
used by Scholars Portal
• Dataverse – Open source system developed by
the Institute for Quantitative Social Science
(IQSS) at Harvard, used by NBER and ICPSR
Dataverse
• We installed an instance
of Dataverse at the
University of Toronto, in
our “cloud”, and I manage
my data collections myself;
• As an open source
solution, it’s cost-effective
and my colleagues at
Scholars Portal support it
for me and other Ontario
universities.
• The data are associated
with studies; several data
sets can be associated
with a single study;
• The world can see the
metadata for each data
collection, but access to
the data sets themselves
is restricted to those
who contact me to get
permission.
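The arrangement described here (studies that hold several data sets, public metadata, restricted files) can be sketched as a small data structure. This is only an illustration under stated assumptions: the class names, fields, file names and e-mail address are hypothetical, and the code does not use or represent Dataverse's own API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class DataSet:
    filename: str
    restricted: bool = True  # files are restricted by default in this sketch


@dataclass
class Study:
    title: str
    metadata: Dict[str, object]
    data_sets: List[DataSet] = field(default_factory=list)
    approved_users: Set[str] = field(default_factory=set)

    def public_view(self) -> Dict[str, object]:
        """Return the metadata that anyone may see."""
        return {"title": self.title, **self.metadata}

    def can_download(self, user: str, data_set: DataSet) -> bool:
        """Allow downloads of open files, or of any file for approved users."""
        return (not data_set.restricted) or (user in self.approved_users)


# Hypothetical study with two data sets and one approved researcher.
study = Study(
    title="Hypothetical Neighbourhood Survey",
    metadata={"year": 2012, "keywords": ["housing", "income"]},
)
study.data_sets.append(DataSet("survey_wave1.csv"))
study.data_sets.append(DataSet("survey_wave2.csv"))
study.approved_users.add("approved.researcher@example.org")

print(study.public_view())
print(study.can_download("anonymous", study.data_sets[0]))
print(study.can_download("approved.researcher@example.org", study.data_sets[0]))
```

Keeping the metadata separate from the access check mirrors the idea that discovery is open to everyone while access to the files is granted on request.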
What are Big Data?
• Big Data are data that are too large for the
average database management tool (Access and
Excel, for instance).
• Examples come from meteorology, genomics and
physics. At MPI we wrestle with large GIS data
sets (maps and satellite data), and deal with data
at the terabyte (1 trillion bytes) level.
• Larger data sets deal with petabytes (1
quadrillion bytes) and exabytes (1 quintillion
bytes).
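The units above can be checked with a little arithmetic. The following Python sketch uses the decimal definitions given on this slide; the function names and the sample sizes are invented for illustration.

```python
# Decimal unit sizes, as stated on the slide.
UNITS = {
    "terabyte": 10**12,  # 1 trillion bytes
    "petabyte": 10**15,  # 1 quadrillion bytes
    "exabyte": 10**18,   # 1 quintillion bytes
}


def to_bytes(amount, unit):
    """Convert an amount in the given unit to bytes."""
    return amount * UNITS[unit]


def describe(n_bytes):
    """Express a byte count in the largest unit that fits."""
    for name, size in sorted(UNITS.items(), key=lambda kv: kv[1], reverse=True):
        if n_bytes >= size:
            return f"{n_bytes / size:g} {name}(s)"
    return f"{n_bytes} bytes"


print(to_bytes(1, "terabyte"))
print(UNITS["exabyte"] // UNITS["terabyte"])
print(describe(2 * 10**15))
```

Each step up (terabyte to petabyte to exabyte) is a factor of one thousand, so an exabyte is one million terabytes.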
Data Visualizations
• The visual representation of data – literally,
a picture can say a thousand [numbers]
• Edward Tufte is a key pioneer:
http://www.edwardtufte.com/tufte/
• Fantastic examples at Flowing Data:
http://flowingdata.com/
• RSA Animate: http://www.thersa.org/
Sources
• International Association for Social Science
Information Services & Technology (IASSIST) -
http://www.iassistdata.org/
• OECD iLibrary - http://www.oecd-ilibrary.org/
• World Bank Data - http://data.worldbank.org/
• UK Data Archive - http://data-archive.ac.uk/
• Nesstar - http://www.nesstar.com/
• Dataverse - http://thedata.org/
17 September 2012
Q & A
(and, Thank You!)
Kimberly Silk, MLS, Data Librarian,
Martin Prosperity Institute, University of Toronto
kimberly.silk@martinprosperity.org

How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 

Último (20)

Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architectures
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 

Data 101: A Gentle Introduction

  • 1. 14 August 2013 Data 101: A Gentle Introduction Presented by Kimberly Silk, MLS, Data Librarian, Martin Prosperity Institute, Rotman School of Management, University of Toronto
  • 2. Our Agenda • Defining data librarianship • Basic terminology • Common data sources • Our challenge: data management, preservation, discovery and access • What are “big data”? • What are data visualizations? • Sources • Q & A
  • 3. Defining Data Librarianship • Data librarianship is a relatively new area of practice, emerging with the growth of digital media since the 1970s; • Data librarians are professional library staff engaged in managing research data as a resource, and supporting researchers in these activities; • We support our institutions and researchers in the areas of data management, metadata management, and teaching how to use data as a resource; • Many of us work in the social sciences, but there is growth in the natural sciences and humanities as well.
  • 4. Basic Terminology • Data – plural! Think: Squirrels!! • Microdata – raw data, individual records consisting of rows of numbers (Excel spreadsheet); • Statistics – summarized tables and cross-tabulations that have been formulated from the raw data; • Aggregate data – statistical summaries organized in a data file structure (Excel) that permits further analysis; • PUMF – Public Use Microdata File – raw data that is available for public use; some data may be filtered and geographies suppressed to ensure personal privacy; • Variables – a set of factors, traits or conditions that describes a unit of analysis; for instance, sex, age, marital status, etc.; • Frequencies – the number of times an observation occurs in the data.
  • 5. Common Data Sources • Government-collected surveys – US Census (American Fact Finder) – Bureau of Labor Statistics, Bureau of Economic Analysis – Statistics Canada – International sources such as UK Data Archive, Swedish National Data Service, Australian Data Archive, etc. – OECD iLibrary – World DataBank – Pew Research Center – Gallup – Thomson
  • 6. Other International Data Sources • Some countries do not gather data, have not been gathering data for very long, or else limit or filter available data; • For instance, Russia, India, China and other developing countries may not gather, preserve or release their data; • The BRICs (Brazil, Russia, India, China) will struggle with this issue as their economies grow.
  • 7. Uncommon Data Sources • Data can come from everywhere; • Occasionally, the MPI acquires data from unusual sources, such as: – Rolling Stone magazine – MySpace social media site for bands – CrunchBase database of technology companies
  • 8. Data Management, Preservation, Discovery & Access • We’ve conquered print collections, but data present a new challenge; • As with all digital files, metadata is necessary to describe data assets; • Like images, a single data set can mean many things to many people; • How do we manage these data to make sure they are discoverable, accessible, and preserved? • Traditionally, data files have been stored on network drives, and shared or restricted according to the groups who need to use them; • Network drives are difficult to search, can be hard to share and restrict, and don’t deal with metadata well; • Web pages with links have been a common way to distribute data sets; • We needed new tools – a new kind of catalogue that is designed for the specialized needs of data.
  • 9. Data Discovery Platforms • Nesstar – developed in Norway by Norwegian Social Science Data Services, used by Statistics Canada, UK Data Archive, NORC at the University of Chicago • ODESI – proprietary system developed and used by Scholars Portal • Dataverse – Open source system developed by the Institute for Quantitative Social Science (IQSS) at Harvard, used by NBER and ICPSR
  • 10. Dataverse • We installed an instance of Dataverse at the University of Toronto, in our “cloud”, and I manage my data collections myself; • As an open source solution, it’s cost-effective, and my colleagues at Scholars Portal support it for me and other Ontario universities; • The data are associated with studies; several data sets can be associated with a single study; • The world can see the metadata for each data collection, but access to the data sets themselves is restricted to those who contact me to get permission.
  • 11.
  • 12. What are Big Data? • Big Data are data that are too large for the average database management tool (Access and Excel, for instance). • Examples come from meteorology, genomics and physics. At MPI we wrestle with large GIS data sets (maps and satellite data), and deal with data at the terabyte (1 trillion bytes) level. • Larger data sets deal with petabytes (1 quadrillion bytes) and exabytes (1 quintillion bytes).
  • 13. Data Visualizations • The visual representation of data: literally, a picture can say a thousand [numbers] • Edward Tufte is a key pioneer: http://www.edwardtufte.com/tufte/ • Fantastic examples at Flowing Data: http://flowingdata.com/ • RSA Animate: http://www.thersa.org/
  • 14. Sources • International Association for Social Science Information Services & Technology (IASSIST) - http://www.iassistdata.org/ • OECD iLibrary - http://www.oecd-ilibrary.org/ • World Bank Data - http://data.worldbank.org/ • UK Data Archive - http://data-archive.ac.uk/ • Nesstar - http://www.nesstar.com/ • Dataverse - http://thedata.org/
  • 15. 17 September 2012 Q & A (and, Thank You!) Kimberly Silk, MLS, Data Librarian, Martin Prosperity Institute, University of Toronto kimberly.silk@martinprosperity.org
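
The Basic Terminology slide distinguishes microdata (individual records), variables, frequencies, and cross-tabulations. A minimal sketch of how those terms relate, using only the Python standard library, might look like the following. The records, the variable names, and the helper functions `frequencies` and `crosstab` are invented for illustration and do not come from the presentation or from any real survey.

```python
from collections import Counter

# Hypothetical microdata: one record per respondent (invented values).
# Each key (sex, age, marital_status) is a variable.
microdata = [
    {"sex": "F", "age": 34, "marital_status": "married"},
    {"sex": "M", "age": 29, "marital_status": "single"},
    {"sex": "F", "age": 41, "marital_status": "single"},
    {"sex": "M", "age": 52, "marital_status": "married"},
    {"sex": "F", "age": 27, "marital_status": "single"},
]


def frequencies(records, variable):
    """Count how many times each value of one variable occurs."""
    return Counter(record[variable] for record in records)


def crosstab(records, row_var, col_var):
    """Count records for each (row value, column value) pair."""
    return Counter((record[row_var], record[col_var]) for record in records)


# Frequencies: a summary of a single variable.
marital_freq = frequencies(microdata, "marital_status")

# Cross-tabulation: a summary of two variables at once.
sex_by_marital = crosstab(microdata, "sex", "marital_status")

print(marital_freq)
print(sex_by_marital)
```

The two summaries are what the slide calls statistics: they are derived from the raw records, and the individual respondents can no longer be recovered from them. Saving such counts in a structured file for further analysis would correspond to aggregate data.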