What business are we in?
Data-centric research, service
requirements and national
responses
Data Keynote, NEIC 2013
Dr Andrew Treloar
Australian National Data Service
Overview
• What business are we really in?
• Service requirements
• Infrastructure responses
• Research Data Alliance
• Conclusions
CC-BY @atreloar 2
Photo CC-BY www.flickr.com/photos/dgjones/7031731377/
Photo CC-BY www.flickr.com/photos/pejmanphotos/1322835717/
What Business are you in?
Theodore Levitt, Marketing Myopia,
Harvard Business Review, July–August 1960
“The railroads did not stop growing because the need for
passenger and freight transportation declined. That grew.
The railroads are in trouble today not because that need
was filled by others (cars, trucks, airplanes, and even
telephones) but because it was not filled by the railroads
themselves. They let others take customers away from
them because they assumed themselves to be in the
railroad business rather than in the transportation
business. The reason they defined their industry
incorrectly was that they were railroad oriented instead of
transportation oriented; they were product oriented
instead of customer oriented....”
Photo CC-BY www.flickr.com/photos/spookman01/4904264919/
Photo CC-BY www.flickr.com/photos/jerryjohn/63351338/
Photo CC-BY www.flickr.com/photos/stiefkind/6454784607/
Photo CC-BY www.flickr.com/photos/torkildr/3462607995/
We are all in the Data business!
• Researchers
– with some exceptions
• Research infrastructure providers
– with no exceptions
• But what about publications?
Journal Literature size in context…
LHC output from 2009-2013 = 100PB
(www.symmetrymagazine.org/article/february-2013/achievement-unlocked-100-petabytes-of-data)
Data-centric view of research data re-use
eResearch infrastructure requirements
• Create/Capture
– automated with capture of associated metadata
• Store
– with appropriate levels of preservation
• Describe
– information for discovery, determination of value, access, re-use
• Identify
– indirection operator to reduce brittleness
eResearch infrastructure requirements (continued)
• Register
– in institutional/national/discipline registries
• Discover
– via general or specialised search interfaces
• Access
– with appropriate levels of control, including humans
• Exploit
– by re-analysis or combination
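The "Identify" requirement describes an identifier as an indirection operator that reduces brittleness. One way to picture that idea is the minimal Python sketch below. It assumes a toy in-memory resolver; the class name, the identifier string and the example.org URLs are all hypothetical and do not represent any ANDS or DataCite API.

```python
from dataclasses import dataclass, field


@dataclass
class IdentifierResolver:
    """Toy persistent-identifier registry (hypothetical, for illustration only)."""

    # Maps a stable identifier to the data's current location.
    _locations: dict = field(default_factory=dict)

    def register(self, identifier: str, location: str) -> None:
        """Record where the data behind an identifier currently lives."""
        self._locations[identifier] = location

    def move(self, identifier: str, new_location: str) -> None:
        """Update the location; the identifier that people cite is unchanged."""
        if identifier not in self._locations:
            raise KeyError(f"unknown identifier: {identifier}")
        self._locations[identifier] = new_location

    def resolve(self, identifier: str) -> str:
        """Follow the indirection from identifier to current location."""
        return self._locations[identifier]


if __name__ == "__main__":
    resolver = IdentifierResolver()
    # Hypothetical identifier and URLs.
    resolver.register("example:dataset-1", "https://old.example.org/dataset-1")
    resolver.move("example:dataset-1", "https://new.example.org/dataset-1")
    print(resolver.resolve("example:dataset-1"))
```

A citation that quotes the identifier rather than the URL keeps working after the data moves, which is the brittleness the slide refers to.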
Photo CC-BY http://www.flickr.com/photos/vintuitive/6855133329/
I come from a land downunder…
AU
• 6 States
• 2 Territories
• 2 islands
• 23M people
NZ
• 2 islands
• 4.5M people
You come from the frozen North…
Nordic Countries
• 5 Countries
• 4 Territories
• So many islands
• 26M people
And yet there are some similarities
• Australia+NZ – 27.5M people
• Nordic countries – 26M people
Australian National Data Service
• An initiative of the Australian Government being conducted as part of the National Collaborative Research Infrastructure Strategy ($A24M) and the Super Science Initiative ($A48M)
• A collaboration between Monash University, the Australian National University and CSIRO
• 30 staff, funded to mid 2015
• More researchers re-using more data more often
• Data as a first-class object
ANDS enables transformation of:
Data that are:
• Unmanaged
• Disconnected
• Invisible
• Single use
To Structured Collections that are:
• Managed
• Connected
• Findable
• Reusable
so that Australian researchers can easily publish, discover, access and use/re-use research data.
Data-centric view of research data re-use
ANDS activities/services
• Plan
– Data management planning tools and resources (N)
• Create/Capture
– 69 Data Capture projects at 23 universities
• Store
– working closely with national Research Data Storage Infrastructure (N)
• Describe
– 25 institutional Metadata Stores projects
– National Vocabulary Services (N)
ANDS activities/services (continued)
• Identify (N)
– DataCite DOIs
• Register (N)
– Repository Interchange Format – Collections and Services (RIF-CS) – based on ISO 2146:2010
• Discover (N)
– Research Data Australia
ANDS activities/services (continued)
• Access
– enforced by underlying data stores
• Exploit
– 25 institutionally-focussed projects to demonstrate value of combining data
• Advocate (N)
– Be the voice for data
– Work with Government and Research Funders to change settings in favour of data sharing
Research Data Alliance
• The Research Data Alliance (RDA) is a new international organization (driven now by EC, US, AU, more soon) forming to facilitate specific, short-term efforts that accelerate the sharing and exchange of research data
• Unofficial motto: rough consensus and exchanged data
• Working groups will run over 12-18 months to produce
– Adopted standards
– Deployed infrastructure
– Adopted policy
– Implemented best practice, etc.
• Second Plenary in Washington DC, September 16-18
Slide by Fran Berman
Research Data Alliance: Working Groups and Interest Groups
• Data Type Registries
• Data Foundation and Terminology
• Practical Policy
• PID Information Types
• Metadata Standards WG
• Community Capability Model
• Working Group on Data Citation: Making Data Citable
• Structural Biology
• Defining Urban Data Exchange for Science
• Marine Data Harmonization
• Repository Audit and Certification
• Big Data Analytics
• Metadata Standards Directory Interest Group (MSDIG)
• The Engagement Group
• Legal Interoperability
• Preservation e-Infrastructure
• UPC Code for Data
• Publishing Data
• Data in Context
• Citation of Dynamic Data
• Agricultural Data Interoperability
Slide by Fran Berman
Conclusion
• We are all in the data business
• Researchers need data services from their infrastructure providers
• A number of services can best be provided at national or regional level
• Research Data Alliance is working to develop international solutions for data interoperability – join us!
Questions?
@atreloar
ands.org.au
rd-alliance.org

Editor's Notes

  1. Let me start with a quotation: “The railroads did not stop growing because the need for passenger and freight transportation declined. That grew. The railroads are in trouble today not because that need was filled by others (cars, trucks, airplanes, and even telephones) but because it was not filled by the railroads themselves. They let others take customers away from them because they assumed themselves to be in the railroad business <CLICK>.” (Bergen Railway)
  2. “…rather than in the transportation business [this is Dubai International Terminal]. The reason they defined their industry incorrectly was that they were railroad oriented instead of transportation oriented; they were product oriented instead of customer oriented....” (Dubai International Terminal)
  3. Talk about the importance of recognising what business you are actually in, as opposed to the business you think you are in.
  4. If the only tool you have is a hammer, then everything looks like a nail (apparently there is no direct Norwegian equivalent, according to my informant, a native speaker and hopefully my future daughter-in-law). Yes, I work for a data organisation, and so I might be biased, but let’s look hard at some e-Research infrastructure businesses.
  5. Networks exist to move what around? Data, and data derivatives (to a first approximation)
  6. Storage exists to store what? Data, and data derivatives
  7. HPC exists to generate and process what? Data. I could go on: Visualisation? Data. Calculation? Data. Etc.
  8. Of course, it’s possible to take this too far. I look at this and see a data-collection instrument ;-)
  9. Of course, researchers generate publications too, but they need the data in order to be able to do so.
  10. So, if we are all in the data business, what does that mean for researchers? How do we support what they need to do as they create, publish and reuse data? Here is one way of thinking about the functions that need to be supported (based on work by me and Dr Adrian Burton from ANDS). NOTE: This is somewhat idealised, and some of the steps are often done poorly or not at all. Publish = Store + Describe + Register + Identify.
  11. And now, let me provide a more Australian flavour to the talk
  12. Recap
  13. Store – we don’t do storage. Describe – 25 of 40 universities.
  14. Discover – quick demo if time. Advocate – new verb.
  15. Before I close, let me talk briefly about the Research Data Alliance. Nordic involvement in Organising Group? Nomination for Council?
  16. And WGs/IGs of course
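
Note 10 treats publishing as a composite of four functions: Publish = Store + Describe + Register + Identify. A minimal Python sketch of that composition follows. It assumes the eight functions from the requirements slides are modelled as flags; the names `Step`, `PUBLISH`, `is_published` and `missing_for_publish` are hypothetical and introduced only for illustration.

```python
from enum import Flag, auto


class Step(Flag):
    """The functions listed on the eResearch infrastructure requirements slides."""

    CREATE = auto()
    STORE = auto()
    DESCRIBE = auto()
    IDENTIFY = auto()
    REGISTER = auto()
    DISCOVER = auto()
    ACCESS = auto()
    EXPLOIT = auto()


# Publish = Store + Describe + Register + Identify (from note 10).
PUBLISH = Step.STORE | Step.DESCRIBE | Step.REGISTER | Step.IDENTIFY


def is_published(done: Step) -> bool:
    """True only when every component of PUBLISH has been completed."""
    return (done & PUBLISH) == PUBLISH


def missing_for_publish(done: Step) -> list:
    """Names of the PUBLISH components not yet completed, in definition order."""
    return [s.name for s in Step if s in PUBLISH and s not in done]


if __name__ == "__main__":
    # A dataset that was captured, stored and described, but never
    # identified or registered.
    done = Step.CREATE | Step.STORE | Step.DESCRIBE
    print(is_published(done))
    print(missing_for_publish(done))
```

Listing the missing components reflects the caveat in the same note that some steps are often done poorly or not at all.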