The Research Software Encyclopedia
1. The Research Software Encyclopedia
Vanessa Sochat, Research Software Engineer, Stanford University
Nicholas May, Software Engineer, RMIT University
Ian Cosden, Director Research Software Engineering, Princeton University
Carlos Martinez-Ortiz, Community manager, Netherlands eScience Center
Sadie Bartholomew, Computational Scientist, National Centre for Atmospheric Science
& University of Reading (UK)
2. The Research Software Encyclopedia
What is Research Software?
3. 2019
Why should we create awareness for the role?
- 1. Help with institutional or project funding.
- 2. Grow an international community.
- 3. Create more training and career opportunities.
13. 2019
What is a research software engineer?
- 1. Help with institutional or project funding.
- 2. Grow an international community.
- 3. Create more training and career opportunities.
14. 2019
Why should we create awareness for the role?
- 1. Help with institutional or project funding.
- 2. Grow an international community.
- 3. Create more training and career opportunities.
26. 2020
- Will my software be considered for this grant?
- Can I publish this work as research software?
- Do I even work on research software?
- How will my institution decide about funding?
- How will my institution decide about me?
31. 2020
How do we tackle answering:
“What is research software?”
- “I’m an expert! I’ll publish a paper!”
- “A committee of experts will figure it out.”
- “Let’s ask the community.”
38. The Research Software Encyclopedia
- Is a community driven, open source effort
- It should not require substantial work/funding to maintain
- We cannot derive a holistic definition to satisfy everyone.
- But we can answer questions about software (criteria) and categorize software (taxonomy)
- Apply a filter for these attributes to determine a yes/no answer for a specific use case.
- It should be fun!
39. The RSEpedia
Components: Criteria and Taxonomy, Tools, Database
- What questions should we ask about research software?
- How do we interact with metadata?
- How do we record and create interfaces to interact with repositories?
77. The Research Software Encyclopedia
- Is agnostic to whether something is or isn’t research software.
- Gives us a means to better communicate about software.
- Empowers the user to decide based on his or her use case.
- The community database and software:
  - Are open source and version controlled
  - Require no long-term investment for automation or hosting
  - Are updated automatically
80. I care about the criteria and/or taxonomy
- Contribute to the taxonomy / criteria repository https://github.com/rseng/rseng
I want to learn about software and annotate!
- Look for weekly “Software survey” posts on Slack and Twitter
- A few clicks is all it takes to annotate the software for the week!
83. I am excited to learn about new software.
- Suggest a repository to be featured for the “Software Survey” or write the post!
- Think of new venues to interact with the community about research software.
I care about the larger project and vision for the RSEpedia
- Share your ideas and comments as a co-author on the paper (link)
I like to build things, let me work on the software!
- The software drives the interfaces, database interaction, and annotation
- Contribute to the repository at https://github.com/rseng/rse
85. 2021?
- I am confident that my work qualifies for the grant.
- My institution understands and values my software.
And we needed to know these points because this idea of research software engineering was a pretty new thing, and we needed to get the word out so that institutions would realize that we’re important to fund, so that projects would make room for our work, and so that we could become a part of this larger, international community.
So in 2019 and earlier, RSEs certainly had specific things to worry about. Things related directly to their work, “How do I optimize that script?”
And also things, slightly more anxiety provoking, that are directly related to their employment.
So then we time warp into the year 2020.
Now, granted, 2020 has presented us with unprecedented new challenges.
The interesting part is that although we still have these worries, because of the state of the economy and all the uncertainty around that, the previous worries about our jobs, and even our lives, are compounded. And what does that prompt? It prompts more self reflection than ever before. It prompts us to think about our role in the larger scheme of our families and our communities.
And the things that we worry about are more directly related to our own sustainability. We look at our work, which very often is software, and say “Is this going to be eligible for that grant?”
Is this thing that I’m working so hard on, am I going to be able to publish it to help further my own career? For people to take me seriously?
And some of us can go into a sort of identity crisis. My god, I thought I was a research software engineer, but someone out there doesn’t think what I work on is research software. Should I jump ship and go to industry?
Is my institution going to value the work of research software engineers for sustainability and all that jazz?
Is my institution going to value me?
So this is a very long-winded way of leading you up to this question that has been constantly on my mind because it is so important for the reasons noted. What the heck is research software? Because guess what, our careers, our growth, depend on how we answer it.
So akin to the previous question, what is a research software engineer, this is something that I wanted to work on. And guess what, these same general strategies still hold true in 2020. The only difference is that we are masked up and ready for anything at this point.
Which led to the creation of the Research Software Encyclopedia, or RSEpedia. The RSEpedia takes the approach that we cannot come up with one holistic definition of research software to satisfy every use case, context, or funding body.
This entire process should be fun! We should be learning about new software, discussing, and not have it be a stressful experience.
They feed into a web interface where you can interactively explore the taxonomy or criteria lists.
And importantly, static APIs that serve the latest versions of the taxonomy and criteria.
This means interactive visualizations.
And this is also where we have the weekly software survey, where we showcase software in the database, meaning directing people to it on Slack, Twitter, or other social media, and then prompting them to annotate it for criteria and taxonomy items. But database, what the heck am I talking about?
So a core component to generate this database that we will discuss shortly is the rse software itself.
So this exposes a command line client to give you a whole slew of commands to interact with a research software database, which by the way, is just a GitHub repository.
Which means you can easily run a command line or interactive annotation session for a software database on your local machine. So if you are in a headless environment, this could actually mean command line annotation.
Or if you have a web browser, it’s a more human friendly interface.
But either way, after you annotate, all of the changes come down to changes in this flat file database. And so you can commit to the software database repository, and either push or open a pull request with your changes. This is super cool because the history of your annotations, your contribution to the database, is the same as it would be for open source software.
Now this database, what in the heck am I talking about?
Now although you can make a software database to maybe showcase your work or a custom set, just as I alluded to, we have a community software database, which is just a repository of software and metadata.
The database is namespaced by the version control system. So for example, the research software encyclopedia software currently has parsers for GitLab and GitHub, and more can be added as needed. We are assuming that a maintained software repository uses one main service and that works as a DOI for it.
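To make the namespacing idea concrete, here is a minimal sketch of how a repository URL could be turned into a namespaced identifier. The function name `repo_uid`, the `KNOWN_HOSTS` table, and the `service/owner/name` layout are assumptions for illustration only, not the actual rse implementation, and nested GitLab groups are ignored for simplicity.

```python
from urllib.parse import urlparse

# Hypothetical mapping from a hosting service domain to a namespace prefix.
KNOWN_HOSTS = {"github.com": "github", "gitlab.com": "gitlab"}


def repo_uid(url):
    """Turn a repository URL into a namespaced identifier.

    Illustrative only: the 'service/owner/name' layout is an assumption.
    """
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if host not in KNOWN_HOSTS:
        raise ValueError(f"No parser for host: {host!r}")
    # Keep the non-empty path segments, e.g. ['rseng', 'rse'].
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"Expected owner/name in path: {parsed.path!r}")
    owner, name = parts[0], parts[1]
    # Drop a trailing '.git' from clone-style URLs.
    if name.endswith(".git"):
        name = name[: -len(".git")]
    return "/".join([KNOWN_HOSTS[host], owner, name])


print(repo_uid("https://github.com/rseng/rse"))  # github/rseng/rse
```

Adding support for another service in a sketch like this would only mean adding one more entry to the host table.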
But it’s not really enough to have a database that you cannot see. For this reason, we also have an automated generation of a static interface that is exactly the same as the one you would run locally, but with slightly different actions when you do annotation.
And guess what, we have these same annotation interfaces, being served statically on GitHub pages.
Except, when you click submit, it opens a new window with a pre-populated issue on GitHub, again associated with your user account so you get credit for the annotation.
Opening the issue automatically triggers a GitHub workflow that will then open a pull request to add the annotations to the database, and reference the issue. And then you’re done!
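The pre-populated issue relies on the fact that GitHub lets you fill in a new issue from URL query parameters such as `title` and `body`. Here is a rough sketch of building such a link. The helper name `issue_url`, the owner and repository placeholders, and the body format are all made up for illustration; the real interface may format its issues differently.

```python
from urllib.parse import urlencode


def issue_url(owner, repo, title, body):
    """Build a GitHub 'new issue' link with the title and body filled in."""
    # urlencode percent-encodes the values so they are safe in a URL.
    query = urlencode({"title": title, "body": body})
    return f"https://github.com/{owner}/{repo}/issues/new?{query}"


url = issue_url(
    "example-org",       # placeholder owner, not a real database
    "example-database",  # placeholder repository
    "Annotation for github/rseng/rse",
    "criterion-a: yes\ncriterion-b: no",  # hypothetical body format
)
print(url)
```

Because the link only opens the issue form, nothing is submitted until the logged-in user clicks the button, which is what ties the annotation to their account.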
Well, you could add in bulk from a text file, or add a repository one off, and then do a pull request. And actually I did this at the onset because I wanted to add the NumFOCUS repos.
But that’s kind of manual and arduous. So instead, the research software encyclopedia software has scrapers that can automatically find and add new repositories each week!
And the scrapers are run as a scheduled job on a weekly basis to discover new repositories in these databases.
And finally, given that we have the database, and we have some specific use case where we can say “these are the criteria and categories that are important to me”
The research software encyclopedia software has an analyze function that lets you summarize the criteria and categories, either based on a “majority wins” strategy, or with a custom threshold that you can set for either criteria or taxonomy. So this here shows a single software repository being analyzed, and as the database grows with annotations we will have functions to do this in bulk and export some final list for your use case.
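To illustrate the difference between “majority wins” and a custom threshold, here is a minimal sketch of the summarizing step. It is not the rse analyze function: the name `summarize`, the criterion names, and the votes are all made up, and it assumes each annotation is a simple yes/no vote.

```python
def summarize(votes, threshold=None):
    """Decide yes/no for one criterion from a list of boolean votes.

    With threshold=None, use 'majority wins': strictly more than half
    of the votes must be yes. Otherwise the fraction of yes votes must
    be at least `threshold` (a value between 0 and 1).
    """
    if not votes:
        return False
    fraction_yes = sum(1 for v in votes if v) / len(votes)
    if threshold is None:
        return fraction_yes > 0.5
    return fraction_yes >= threshold


# Made-up annotations for one repository: criterion -> annotator votes.
annotations = {
    "criterion-a": [True, True, False],
    "criterion-b": [True, False, False, False],
}

majority = {name: summarize(v) for name, v in annotations.items()}
lenient = {name: summarize(v, threshold=0.25) for name, v in annotations.items()}
print(majority)  # {'criterion-a': True, 'criterion-b': False}
print(lenient)   # {'criterion-a': True, 'criterion-b': True}
```

The same votes give a different answer for the second criterion depending on the threshold, which is the point: the use case decides how strict to be.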