SlideShare a Scribd company logo
1 of 5
Download to read offline
Software Impact, Metrics,
      and Citation

           Daniel S. Katz
     Program Director, Office of
        Cyberinfrastructure
Measuring Impact – Scenarios
1.  Developer of open source physics simulation
   –  Possible metrics
       •    How many downloads? (easiest to measure, least value)
       •    How many contributors?
       •    How many uses?
       •    How many papers cite it?
       •    How many papers that cite it are cited? (hardest to measure,
            most value)

2.  Developer of open source math library
   –  Possible metrics are similar, but citations are less
      likely
   –  What if users don’t download it?
       •    It’s part of a distro
       •    It’s pre-installed (and optimized) on an HPC system
       •    It’s part of a cloud image
       •    It’s a service
Vision for Metrics & Citation, part 1
•  Products (software, paper, data set) are
   registered
   –  Credit map (weighted list of contributors—people,
      products, etc.) is an input
   –  DOI is an output
   –  Leads to transitive credit
       •  E.g., paper 1 provides 25% credit to software A, and software A
          provides 10% credit to library X -> library X gets 2.5% credit for
          paper 1
       •  Helps developer – “my tools are widely used, give me tenure” or
          “NSF should fund my tool maintenance”
   –  Issues:
       •  Social: Trust in person who registers a product
            –  This seems to work for papers today (without weights) for both
               author lists and for citations
            –  Do weights require more than human memory?
       •  Technological: Registration system
            –  Where is it/them, what are interfaces, how do they work together?
Vision for Metrics & Citation, part 2
•  Product usage is recorded
   –  Where?
       •  Both the developer and user want to track usage
       •  Privacy issues? (legal, competitive, ...)
       •  Via a phone home mechanism?
   –  What does “using” a data set mean? And how could
      trigger a usage record
   –  Can general code be developed for this, to be
      incorporated in software packages?
•  With user input, tie later products to usage
   –  User may not know science outcome when using tool
   –  After science outcome is known, may be hard to
      determine which product usages were involved
Vision for Metrics & Citation, thoughts
•  Can this be done incrementally?
•  Lack of credit is a larger problem than often
   perceived
   –  Lack of credit is a disincentive for sharing software
      and data
   –  Providing credit would both remove disincentive as
      well as adding incentive
   –  See Lewin’s principal of force field analysis (1943)
•  For commercial tools, credit is tracked by $
   –  But this doesn’t help understand what tools were used
      for what outcomes
   –  Does this encourage collaboration?
•  Could a more economic model be used?
   –  NSF gives tokens are part of science grants, users
      distribute tokens while/after using tools

More Related Content

Similar to Software: impact, metrics, and citation

Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...
Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...
Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...
SEAD
 

Similar to Software: impact, metrics, and citation (20)

Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...
Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...
Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...
 
Intoduction to software engineering part 1
Intoduction to software engineering part 1Intoduction to software engineering part 1
Intoduction to software engineering part 1
 
Citation and reproducibility in software
Citation and reproducibility in softwareCitation and reproducibility in software
Citation and reproducibility in software
 
Agile data science
Agile data scienceAgile data science
Agile data science
 
Software engineering
Software engineeringSoftware engineering
Software engineering
 
NISI Agile Software Architecture Slide Deck
NISI Agile Software Architecture Slide DeckNISI Agile Software Architecture Slide Deck
NISI Agile Software Architecture Slide Deck
 
20160607 citation4software panel
20160607 citation4software panel20160607 citation4software panel
20160607 citation4software panel
 
Funding Software in Academia
Funding Software in AcademiaFunding Software in Academia
Funding Software in Academia
 
Inti escem-tours2012-acs
Inti escem-tours2012-acsInti escem-tours2012-acs
Inti escem-tours2012-acs
 
Research software identification - Catherine Jones
Research software identification - Catherine JonesResearch software identification - Catherine Jones
Research software identification - Catherine Jones
 
Software Ecosystems = Big Data
Software Ecosystems = Big DataSoftware Ecosystems = Big Data
Software Ecosystems = Big Data
 
Software Citation: Principles, Implementation, and Impact
Software Citation:  Principles, Implementation, and ImpactSoftware Citation:  Principles, Implementation, and Impact
Software Citation: Principles, Implementation, and Impact
 
unit 1.pptx regasts sthatbabs shshsbsvsbsh
unit 1.pptx regasts sthatbabs shshsbsvsbshunit 1.pptx regasts sthatbabs shshsbsvsbsh
unit 1.pptx regasts sthatbabs shshsbsvsbsh
 
Fundamentals of software sustainability
Fundamentals of software sustainabilityFundamentals of software sustainability
Fundamentals of software sustainability
 
Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...
Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...
Requiring Publicly-Funded Software, Algorithms, and Workflows to be Made Publ...
 
Putting Linked Data to Use in a Large Higher-Education Organisation
Putting Linked Data to Use in a Large Higher-Education OrganisationPutting Linked Data to Use in a Large Higher-Education Organisation
Putting Linked Data to Use in a Large Higher-Education Organisation
 
20160607 citation4software opening
20160607 citation4software opening20160607 citation4software opening
20160607 citation4software opening
 
Project Documentation Student Management System format.pptx
Project Documentation Student Management System format.pptxProject Documentation Student Management System format.pptx
Project Documentation Student Management System format.pptx
 
Introduction
IntroductionIntroduction
Introduction
 
Software Analytics
Software AnalyticsSoftware Analytics
Software Analytics
 

More from Daniel S. Katz

Software Citation in Theory and Practice
Software Citation in Theory and PracticeSoftware Citation in Theory and Practice
Software Citation in Theory and Practice
Daniel S. Katz
 
A Method to Select e-Infrastructure Components to Sustain
A Method to Select e-Infrastructure Components to SustainA Method to Select e-Infrastructure Components to Sustain
A Method to Select e-Infrastructure Components to Sustain
Daniel S. Katz
 

More from Daniel S. Katz (20)

Research software susainability
Research software susainabilityResearch software susainability
Research software susainability
 
Software Professionals (RSEs) at NCSA
Software Professionals (RSEs) at NCSASoftware Professionals (RSEs) at NCSA
Software Professionals (RSEs) at NCSA
 
Parsl: Pervasive Parallel Programming in Python
Parsl: Pervasive Parallel Programming in PythonParsl: Pervasive Parallel Programming in Python
Parsl: Pervasive Parallel Programming in Python
 
What is eScience, and where does it go from here?
What is eScience, and where does it go from here?What is eScience, and where does it go from here?
What is eScience, and where does it go from here?
 
Citation and Research Objects: Toward Active Research Objects
Citation and Research Objects: Toward Active Research ObjectsCitation and Research Objects: Toward Active Research Objects
Citation and Research Objects: Toward Active Research Objects
 
FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...
FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...
FAIR is not Fair Enough, Particularly for Software Citation, Availability, or...
 
Software Citation in Theory and Practice
Software Citation in Theory and PracticeSoftware Citation in Theory and Practice
Software Citation in Theory and Practice
 
URSSI
URSSIURSSI
URSSI
 
Research Software Sustainability: WSSSPE & URSSI
Research Software Sustainability: WSSSPE & URSSIResearch Software Sustainability: WSSSPE & URSSI
Research Software Sustainability: WSSSPE & URSSI
 
Software citation
Software citationSoftware citation
Software citation
 
Expressing and sharing workflows
Expressing and sharing workflowsExpressing and sharing workflows
Expressing and sharing workflows
 
Summary of WSSSPE and its working groups
Summary of WSSSPE and its working groupsSummary of WSSSPE and its working groups
Summary of WSSSPE and its working groups
 
Working towards Sustainable Software for Science: Practice and Experience (WS...
Working towards Sustainable Software for Science: Practice and Experience (WS...Working towards Sustainable Software for Science: Practice and Experience (WS...
Working towards Sustainable Software for Science: Practice and Experience (WS...
 
What do we need beyond a DOI?
What do we need beyond a DOI?What do we need beyond a DOI?
What do we need beyond a DOI?
 
Looking at Software Sustainability and Productivity Challenges from NSF
Looking at Software Sustainability and Productivity Challenges from NSFLooking at Software Sustainability and Productivity Challenges from NSF
Looking at Software Sustainability and Productivity Challenges from NSF
 
Scientific research: What Anna Karenina teaches us about useful negative results
Scientific research: What Anna Karenina teaches us about useful negative resultsScientific research: What Anna Karenina teaches us about useful negative results
Scientific research: What Anna Karenina teaches us about useful negative results
 
Panel: Our Scholarly Recognition System Doesn’t Still Work
Panel: Our Scholarly Recognition System Doesn’t Still WorkPanel: Our Scholarly Recognition System Doesn’t Still Work
Panel: Our Scholarly Recognition System Doesn’t Still Work
 
US University Research Funding, Peer Reviews, and Metrics
US University Research Funding, Peer Reviews, and MetricsUS University Research Funding, Peer Reviews, and Metrics
US University Research Funding, Peer Reviews, and Metrics
 
Swift Parallel Scripting for High-Performance Workflow
Swift Parallel Scripting for High-Performance WorkflowSwift Parallel Scripting for High-Performance Workflow
Swift Parallel Scripting for High-Performance Workflow
 
A Method to Select e-Infrastructure Components to Sustain
A Method to Select e-Infrastructure Components to SustainA Method to Select e-Infrastructure Components to Sustain
A Method to Select e-Infrastructure Components to Sustain
 

Recently uploaded

CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 

Recently uploaded (20)

CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 

Software: impact, metrics, and citation

  • 1. Software Impact, Metrics, and Citation Daniel S. Katz Program Director, Office of Cyberinfrastructure
  • 2. Measuring Impact – Scenarios 1.  Developer of open source physics simulation –  Possible metrics •  How many downloads? (easiest to measure, least value) •  How many contributors? •  How many uses? •  How many papers cite it? •  How many papers that cite it are cited? (hardest to measure, most value) 2.  Developer of open source math library –  Possible metrics are similar, but citations are less likely –  What if users don’t download it? •  It’s part of a distro •  It’s pre-installed (and optimized) on an HPC system •  It’s part of a cloud image •  It’s a service
  • 3. Vision for Metrics & Citation, part 1 •  Products (software, paper, data set) are registered –  Credit map (weighted list of contributors—people, products, etc.) is an input –  DOI is an output –  Leads to transitive credit •  E.g., paper 1 provides 25% credit to software A, and software A provides 10% credit to library X -> library X gets 2.5% credit for paper 1 •  Helps developer – “my tools are widely used, give me tenure” or “NSF should fund my tool maintenance” –  Issues: •  Social: Trust in person who registers a product –  This seems to work for papers today (without weights) for both author lists and for citations –  Do weights require more than human memory? •  Technological: Registration system –  Where is it/them, what are interfaces, how do they work together?
  • 4. Vision for Metrics & Citation, part 2 •  Product usage is recorded –  Where? •  Both the developer and user want to track usage •  Privacy issues? (legal, competitive, ...) •  Via a phone home mechanism? –  What does “using” a data set mean? And how could trigger a usage record –  Can general code be developed for this, to be incorporated in software packages? •  With user input, tie later products to usage –  User may not know science outcome when using tool –  After science outcome is known, may be hard to determine which product usages were involved
  • 5. Vision for Metrics & Citation, thoughts •  Can this be done incrementally? •  Lack of credit is a larger problem than often perceived –  Lack of credit is a disincentive for sharing software and data –  Providing credit would both remove disincentive as well as adding incentive –  See Lewin’s principal of force field analysis (1943) •  For commercial tools, credit is tracked by $ –  But this doesn’t help understand what tools were used for what outcomes –  Does this encourage collaboration? •  Could a more economic model be used? –  NSF gives tokens are part of science grants, users distribute tokens while/after using tools