Lecture on international organizations to a graduate class in the "Fundamentals of Data Curation" at the University of Illinois, Urbana-Champaign Graduate School of Library and Information Science.
Parsons on "Playing in the International Data Space"
1. Playing in the International Data Space
Mark A. Parsons
Rensselaer Polytechnic Institute
24 October 2013
Fundamentals of Data Curation
University of Illinois, Urbana-Champaign
Unless otherwise noted, the slides in this presentation are licensed by Mark A. Parsons under a Creative Commons Attribution-Share Alike 3.0 License
2. Stuff you should learn
• What international organizations are
• Who is organizing
• How they organize and how you can participate
• Why it matters to you and data science writ large and curation more specifically
3. Some history
• Biological classification or taxonomy
• Linnaeus’s Systema Naturae, 1735
• Darwin’s On the Origin of Species, 1859
• Encyclopedia of Life, 2007
• International Map of the World (standardized 1:1,000,000 map)
• Proposed 1891
• Begun 1913
• Never completed
• Metric system
• Introduced 1795
• Convention du Mètre, 1875
• International System of Units (SI) 1960
• One big holdout
• Time zones
• 1st use of standard (“railway”) time 1847
• International Meridian Conference 1884 established GMT but did not alter local times
• Final adoption of “standard offset” from GMT/UTC 1986
• Current number of time zones in China and in India: 1 each
4. Some history
Standards were envisioned and desired, but the evolution was slow, driven by nation states, and never entirely successful.
5. Some more history
1865 — International Telegraph Union
1871 — First International Geographical Congress
1873 — International Meteorological Organization (became WMO within UN in 1951)
1899 — International Association of Academies, later the International Council for Science (ICSU)
early 1900s — Many other physical science unions established
1927 — International Federation of Library Associations and Institutions (IFLA)
1945 — United Nations
1952 — International Social Science Council
1957 — World Data Centers
1966 — ICSU Committee on Data for Science and Technology (CODATA)
1994 — World Wide Web Consortium (W3C) (Web itself c.1990)
2013 — Research Data Alliance (RDA)
6. Scientific professional societies...
• Convene and build communities and thereby validate and promote their field.
• Create and assert community consensus.
• Create standards, ethical guidelines, best practices, and certifications.
• Educate the public and their members.
• Provide a record of the discipline through publications from gray to black.
• Pursue focussed initiatives to further scientific goals.
• Seek to maintain a privileged position—power.
• Seek to grow their membership.
• Can be self-perpetuating and conservative, especially at the international level.
7. Where I belong
Current
Past
• American Geophysical Union (AGU), primary affiliation Earth and Space Science Informatics $
• Association of American Geographers (AAG) $$
• IEEE (Institute of Electrical and Electronics Engineers) $$
• Research Data Alliance (RDA) free
• American Society for Information Science and Technology (ASIS&T) $$
• US Permafrost Association $
• International Permafrost Association*
• Federation of Earth Science Information Partners (ESIP) free
• Digital Curation Centre Associate free
• International Union of Geodesy and Geophysics (IUGG) Union Commission of Data and Information*
• CODATA*
*as an officer (organization does not have individual members)
8. Players in international (data) organizations
• Governments
• agencies—can act but not speak on policy (short term $)
• ambassadors—can influence policy but not programs (sustained $)
• Foundations and charities
• National Academies
• Universities and Research Institutes, especially their libraries
• Professional societies and other NGOs
• UN and other intergovernmental bodies
• Companies: tech. companies (databases, software, info services, etc.); commercial publishers; data re-adapters (weather companies, map makers in the broadest sense)
• Individuals
13. Managing Impacts Across Diverse Boundaries
[Figure labels: Earth System, Law of the Sea, Meteorological, OSPAR, Navigational, NEAFC, Marine Ecosystem, Search and Rescue]
Slide courtesy Paul A. Berkman
15. Some international data organizations
• CODATA
• Mission: to strengthen international science for the benefit of society by promoting improved scientific and technical data management and use.
• Subunit of ICSU but has its own paid membership subscription for Academies and unions
• Individuals participate as representatives from an org. member, as a task force member, or by attending biennial meetings.
• WDS
• Mission: to ensure the long-term stewardship and provision of quality-assessed data and data services to the international science community and other stakeholders.
• Subunit of ICSU but members are certified data repositories and services. No fee but there is a certification process.
• Individuals participate as representatives from an org. member, as a working group member (jointly with RDA), or by attending biennial meetings (joint with CODATA?).
• Open Knowledge Foundation
• Mission: to promote open data and open content in all their forms.
• Non-profit with volunteer participation
• Individuals sign up and participate in working groups, local groups, and task forces. They also attend myriad conferences and “festivals”.
16. More international data organizations
• International Geospatial Society and Global Spatial Data Infrastructure Association
• Mission: to promote international cooperation and collaboration in support of local, national and international spatial data infrastructure developments that will allow nations to better address social, economic, and environmental issues of pressing importance.
• IGS is for individuals, GSDIA is for Organizations with “at least a nation-wide influence.”
• Individuals participate in committees, conferences, and trainings.
• Open Geospatial Consortium
• Mission: to serve as a global forum for the collaboration of developers and users of spatial data products and services, and to advance the development of international standards for geospatial interoperability.
• Paid organizational membership for companies, government agencies, and universities.
• Individuals participate as representatives of their organizations, largely in standards development.
• DataCite
• Mission: to promote and facilitate data citation.
• Paid organizational membership for national library-type organizations.
• Individuals participate as representatives of their organization or attend the annual conference.
17. Some international data-related organizations
• W3C
• Mission: to lead the World Wide Web to its full potential by developing protocols and guidelines that ensure the long-term growth of the Web.
• Paid organizational membership. Many companies, some universities, some agencies.
• Individuals participate as representatives from an org. member or by participating in community groups and discussion fora.
• IEEE
• Mission: to foster technological innovation and excellence for the benefit of humanity.
• Professional society and standards body.
• Individual membership with many types of participation, including local chapters addressing areas well beyond data.
• IFLA
• Mission: to further accessibility, protection, and preservation of documentary cultural heritage and to promote and support libraries.
• Paid membership for associations, institutes, and individuals.
• Individuals participate in myriad specialty groups, sections, and programs and attend annual meetings.
18. Research Data Alliance
• An alliance of individuals, organizations, and associates.
• Mission: “RDA builds the social and technical bridges that enable open sharing of data.”
• A different sort of funding model: informally collaborating, hands-off agency support
• A different sort of operating model recognizing the dynamics and tensions of developing infrastructure.
• Free individual membership, inexpensive organizational membership, affiliation with like-minded organizations.
• Grass roots driven.
• More tactical than strategic.
• Global and regional but independent of nations.
19. Data Citation Case Study
• Initial efforts in the late 90s to early 00s
• Right idea, little traction
• Partially conflated with the citing URLs issue
• A blossoming in the mid-late 00s.
• Multiple disciplines start developing approaches and guidelines
• DOI a big driver, esp. for DataCite, but other identifiers used too (including handles, LSIDs, UNFs, ARKs and good ol’ URI/Ls)
• A slightly competitive atmosphere
• Now in a consensus phase
• CODATA/ICSTI/National Academy report
• Force11 Manifesto
• RDA harmonization effort, which broadens and unites the community
• Implementation phase just started
• Happens locally
• Requires culture change so debates will continue
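As an illustration only (not part of the original slides): a minimal Python sketch of how the identifier types named above might be recognized and turned into a citation string. The function names, the prefix rules, the citation layout, and the sample record are all assumptions made for this example; the resolver hosts (doi.org, hdl.handle.net, n2t.net) are public services, but real identifiers come in more forms than this sketch handles.

```python
import re

def identifier_scheme(identifier):
    """Guess the identifier scheme from its textual form (hypothetical rules)."""
    text = identifier.strip()
    lowered = text.lower()
    # A bare DOI name looks like "10.<registrant>/<suffix>".
    if lowered.startswith("doi:") or re.match(r"^10\.\d{4,9}/\S+$", text):
        return "doi"
    if lowered.startswith("ark:"):
        return "ark"
    if lowered.startswith("urn:lsid:"):
        return "lsid"
    if lowered.startswith("unf:"):
        return "unf"
    if lowered.startswith("hdl:"):
        return "handle"
    if lowered.startswith(("http://", "https://")):
        return "url"
    return "unknown"

def resolver_url(identifier):
    """Return a web link for schemes with a well-known resolver, else None."""
    text = identifier.strip()
    scheme = identifier_scheme(text)
    if scheme == "doi":
        name = text[4:] if text.lower().startswith("doi:") else text
        return "https://doi.org/" + name
    if scheme == "handle":
        return "https://hdl.handle.net/" + text[4:]
    if scheme == "ark":
        return "https://n2t.net/" + text
    if scheme == "url":
        return text
    return None  # LSIDs and UNFs have no single global resolver assumed here

def format_citation(creators, year, title, publisher, identifier):
    """Assumed layout: Creators (Year): Title. Publisher. Identifier."""
    link = resolver_url(identifier) or identifier
    return f"{'; '.join(creators)} ({year}): {title}. {publisher}. {link}"

# Made-up record, for illustration only.
print(format_citation(["Doe, J."], 2013, "Example Dataset",
                      "Example Data Center", "10.1234/abc"))
```

Because implementation "happens locally," a repository would likely swap in its own layout and identifier rules; the point is only that the citation is assembled from a few metadata fields plus a resolvable identifier.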
20. What should you do?
• Join RDA and participate in Interest and Working Groups
• Watch for upcoming student internships and fellowships.
• Become a Digital Curation Centre Associate and attend one of their conferences (publishing opportunity).
• Attend a conference or two of some of the other organizations.
• Join a professional society in your scientific discipline.
• If you don’t have a scientific discipline, get one. Curation requires it.
• Attend their meeting and help develop a data section or focus group (if they don’t have one already).
33. Themes from A. Tsing on Collaboration
Friction: An Ethnography of Global Connection
• “Actually existing universalisms are hybrid, transient, and involved in constant reformulation through dialogue.” They work out through friction.
• “There is no reason to think collaborators have common goals.”
• Unity and diversity cover each other up. Need to remember the local.
34. Where Good Ideas Come From
Steven Johnson
• The Adjacent Possible: the importance of the local
• It’s often not “Eureka!” but rather a slow hunch fading into view over time.
• Hunches need to collide with other hunches, so create that environment. Don’t protect IP; share it. Connecting vs. protecting
• Sharing of failures as well
• Create spaces for that to happen: virtual and real coffee shops
• “Chance favors the connected mind.”
35. Themes on Relationships
(I’m an introvert)
• The central challenge is diversity.
• We address it through variety and myriad interfaces and connections.
• Fostering relationships is central to community and data science.
• they build social capital—success through giving
• they uncover tacit knowledge
• they inform methods
36. Data Science Methods
• User-driven design is not just end user. Engage providers and funders too.
• Case studies not just use cases.
• Ethnography: study relationships, because data are often at the center of that interaction (a boundary object).
• Agile is not just for software (courtesy Bruce Caron).
• Individuals and interactions over processes and tools
• Working volunteers over comprehensive documentation
• Member collaboration over contract negotiation
• Responding to change over following a plan.
37. Summary
• International (data) organizations grew out of the idealistic, deterministic blossoming of science.
• They are virtually infinite in their scope and number.
• They have many different forms and the best are highly adaptive and evolving (while retaining core principles).
• Only diversity absorbs diversity.
• Networking and interconnection are the way to solve complex problems.
• We are in a more global and democratic world, but also a more local world. Coalition politics with new kinds of coalitions because there are new kinds of identity.
• Data science and curation need to focus on relationships, connections, interfaces.
• You must participate “glocally” to succeed.