SlideShare a Scribd company logo
1 of 12
Download to read offline
Grigori Fursin Joint CGOGrigori Fursin Joint CGO--PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017
Joint CGO-PPoPP’17 Artifact Evaluation Discussion
AE Chairs
CGO’17: Joseph Devietti, University of Pennsylvania
PPoPP’17: Wonsun Ahn, University of Pittsburgh
AE CGO-PPoPP-PACT Steering Committee
Grigori Fursin, dividiti / cTuning foundation
Bruce Childers, University of Pittsburgh
Agenda
• Results and issues
• Awards by NVIDIA and dividiti
• Discussion how to improve
and scale future AE
Fantastic Artifact Evaluators and Supporters
cTuning.org/committee.html
cTuning.org/ae/artifacts.html
http://dividiti.blogspot.com/2017/01/artifact-evaluation-discussion-session.html
Grigori Fursin Joint CGOGrigori Fursin Joint CGO--PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017
How CGO-PPoPP-PACT AE works
time line
paper
accepted
paper
accepted
artifacts
submitted
artifacts
submitted
evaluatorevaluator
bidding
artifacts
assigned
artifacts
assigned
evaluationsevaluations
available
evaluationsevaluations
finalized
artifactartifact
decision
7..12 days
to prepare
artifacts
according to
guidelines:
cTuning.org/
submission.html
2..4 days
for evaluators
to bid on
artifacts
(according
to their
knowledge and
access to
required
SW/HW)
2 days
to assign
artifacts –
ensure at least
3 reviews
per artifact,
reduce risks,
avoid mix ups
minimize
conflicts of
interests
2 weeks
to review
artifacts
according to
guidelines:
cTuning.org/
reviewing.html
3..4 days
for authors to
respond to
reviews and fix
problems
2..3 days
to finalize
reviews
NOTE: we
consider AE
a cooperative
process and try
to help authors
fix artifacts and
pass evaluation
(particularly if
artifacts will be
open-sourced)
Light communication between
authors and reviewers
is allowed via AE chairs
(to preserve anonymity
of the reviewers)
Year PPoPP CGO Total Problems Rejected
2015 10 8 18 7 2 
2016 12 11 23 4 0
2017 14 13 27 7 0
2..3 days
to add
AE stamp
and AE appendix
to a camera-
ready paper
Grigori Fursin Joint CGOGrigori Fursin Joint CGO--PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017
AE: good, bad and ugly …
Good: many interesting and open source artifacts –
authors and evaluators take AE seriously!
Bad: * too many artifacts – need to somehow scale AE while keeping the quality
(41 evaluators, ~120 reviews to handle during 2.5 weeks)
* sometimes difficult to find evaluators with appropriate skills
and access to proprietary SW and rare HW
* very intense schedule and not enough time for rebuttals
* communication between authors and reviewers via AE chairs is a bottleneck
Ugly: * too many ad-hoc scripts to prepare and run experiments
* no common workflow frameworks (in contrast with some other sciences)
* no common formats and APIs (benchmarks, data sets, tools)
* difficult to reproduce empirical results across diverse SW/HW and inputs
time line
paper
accepted
paper
accepted
artifacts
submitted
artifacts
submitted
evaluatorevaluator
bidding
artifacts
assigned
artifacts
assigned
evaluationsevaluations
available
evaluationsevaluations
finalized
artifactartifact
decision
7..12 days 2..4 days 2 days 2 weeks 3..4 days 2..3 days 2..3 days
Grigori Fursin Joint CGOGrigori Fursin Joint CGO--PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017
Joint CGO-PPoPP’17 awards
a) Promote “good” (well documented, consistent and easy to use) artifacts
NVIDIA donated “Pascal Titan X GPGPU card”
for the highest ranked artifact
b) Promote using workflow frameworks to share artifacts and experiments
as customizable and reusable components with common meta description and API
DIVIDITI donated $500 for the highest ranked artifact
shared using Collective Knowledge workflow framework
dividiti.com
cKnowledge.org
Collective Knowledge is being developed by the community to simplify AE process
and improve sharing of artifacts as customizable and reusable Python components
with extensible JSON meta-description and JSON API, assemble cross-platform workflows,
automate and crowdsource empirical experiments, and enable interactive reports.
Joint CGO/PPoPP Artifact Evaluation Award
for the distinguished open-source artifact
shared in the Collective Knowledge format
“Software Prefetching for Indirect Memory Accesses”
Sam Ainsworth, Timothy M. Jones
University of Cambridge
The cTuning foundation and dividiti
are pleased to grant
February 2017
Joint CGO/PPoPP Distinguished Artifact Award
Xiuxia Zhang1, Guangming Tan1, Shuangbai Xue1, Jiajia Li2, Mingyu Chen1
1 Chinese Academy of Sciences 2 Georgia Institute of Technology
February 2017
for
“Demystifying GPU Microarchitecture
to Tune SGEMM Performance”
The cTuning foundation and
NVIDIA are pleased to present
Grigori Fursin Joint CGOGrigori Fursin Joint CGO--PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017
Discussion how to improve/scale future AE
time line
paper
accepted
paper
accepted
artifacts
submitted
artifacts
submitted
evaluatorevaluator
bidding
artifacts
assigned
artifacts
assigned
evaluationsevaluations
available
evaluationsevaluations
finalized
artifactartifact
decision
7..12 days 2..4 days 2 days 2 weeks 3..4 days 2..3 days 2..3 days
1) Introduce two evaluation options: private and public
a) traditional evaluation for private artifacts (for example, from industry, though less and less common)
time line
paper
accepted
paper
accepted
artifacts
submitted
artifacts
submitted
AE chairs announce
public artifacts at
XSEDE/GRID5000/etc
AE chairs announce
public artifacts at
XSEDE/GRID5000/etc
AE chairs monitor open
artifacts are evaluated
AE chairs monitor open
discussions until
artifacts are evaluated
artifactartifact
decision
any time 1..2 days from a few days to 2 weeks 3..4 days 2..3 days
b) open evaluation of public and open-source artifacts (if already avialable at
GitHub, BitBucket, GitLab with “discussion mechanisms” during submission…)
At CGO/PPoPP’17, we have sent out requests to validate several open-source artifacts to the public
mailing lists from the conferences, network of excellence, supercomputer centers, etc.
We found evaluators willing to help and having an access to rare hardware or supercomputers as well
as required software and proprietary benchmarks
Authors quickly fixed issues and answered research questions while AE chairs steered the discussion!
See public reviewing examples at cTuning.org/ae/artifacts.html and adapt-workshop.org
Grigori Fursin Joint CGOGrigori Fursin Joint CGO--PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017
2) Enable public or private discussion channels between authors and reviewers
for each artifact (rather than communicating via AE chairs)
Useful technology: slack.com , reddit.com
Evaluators can still be anonymous if they wish so …
3) Help authors prepare artifacts and workflows for unified evaluation
(community service by volunteers?)
This year we processed more than 120 evaluation reports. Nearly all artifacts had their
own ad-hoc scripts to build and run workflows, process outputs, validate results, etc.
Since it’s a huge burden for evaluators, they ask us to gradually introduce common
workflows and data formats to unify evaluation.
A possible solution is to introduce an optional service (based on distinguished artifacts)
to help authors convert their ad-hoc scripts to some common format and thus scale AE!
Furthermore, it may help researchers easily reuse and customize past artifacts, and build
upon them!
Discussion how to improve/scale future AE
Grigori Fursin Joint CGOGrigori Fursin Joint CGO--PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017
Discussion how to improve/scale future AE
4) Should we update Artifact Appendices?
Two years ago we introduced Artifact Appendix templates to unify
Artifact submissions and let authors add up to two pages of such
appendices to their camera ready paper:
http://cTuning.org/ae/submission.html
http://cTuning.org/ae/submission_extra.html
The idea is to help readers better understand what was evaluated
and let them reproduce published research and build upon it.
We did not receive complaints about our appendices and many
researchers decided to add them to their camera ready papers (see
http://cTuning.org/ae/artifacts.html).
Similar AE appendices are now used by other conferences
(SC,RTSS):
http://sc17.supercomputing.org/submitters/technical-
papers/reproducibility-initiatives-for-technical-papers/artifact-description-
paper-title
We suggest to get in touch with AE chairs from all related conferences to sync on future
AE submission and reviewing procedures to avoid defragmentation!
Grigori Fursin Joint CGOGrigori Fursin Joint CGO--PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017
Discussion how to improve/scale future AE
5) Decide whether to evaluate all experiments or still allow
partial validation or even only artifact sharing
We do not yet have a common methodology to fully validate experimental results from the
research papers in our domain – we also know that full validation of empirical experiments is
very challenging and time consuming.
At the same time, making artifacts available is also extremely valuable to the community (data
sets, predictive models, architecture simulators and their models, benchmarks, tools,
experimental workflows).
Last year we participated in ACM workshop on reproducible research and co-authored the
following ACM Result and Artifact Review and Badging policy (based on our AE experience):
http://www.acm.org/publications/policies/artifact-review-badging
It suggests using several separate badges:
• Artifacts publicly available
• Artifacts evaluated (functional, reusable)
• Results validated (replicated, reproduced)
We consider using above policy and badges for the next AE – feedback is welcome!
Grigori Fursin Joint CGOGrigori Fursin Joint CGO--PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017
Discussion how to improve/scale future AE
6) Evaluate artifacts for “tool” papers during main reviewing
We now discuss the possibility to validate artifacts for so-called tool papers during
main reviewing. Such evaluation will influence acceptance decision.
Similar approach seems to be used at SuperComputing’17
(will be useful to discuss that with SC’17 AE organizers).
Current problems are:
• Artifact Evaluation committee may not be prepared yet (though we have a
joint AEC from last years)
• If we ask PC members to evaluate papers and artifacts at the same time, it’s
an extra burden. Furthermore, PC members may not have required technical
skills (that’s why AEC is usually assembled from postdocs and research
engineers)
• CGO and PPoPP use double blind reviewing. However reviewing artifacts
without revealing authors identity is very non-trivial and places an extra
unnecessary burden on the authors and evaluators (and may kill AE)
(we should check how/if SC’16/SC’17 solve this problem
since they also use double blind reviewing).
Grigori Fursin Joint CGOGrigori Fursin Joint CGO--PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017
We need your feedback! Thank you!!!
We need your feedback - remember that new AE procedures
may affect you at the future conferences
• Contact AE steering committee: http://cTuning.org/committee.html
• Mailing list: https://groups.google.com/forum/#!forum/collective-knowledge
Extra resources
• Artifact Evaluation Website: http://cTuning.org/ae
• ACM Result and Artifact Review and Badging policy:
http://www.acm.org/publications/policies/artifact-review-badging
• CK workflow framework: http://cKnowledge.org
• Community driven artifact/paper evaluation:
http://dl.acm.org/citation.cfm?doid=2618137.2618142

More Related Content

Viewers also liked

CK: from ad hoc computer engineering to collaborative and reproducible data s...
CK: from ad hoc computer engineering to collaborative and reproducible data s...CK: from ad hoc computer engineering to collaborative and reproducible data s...
CK: from ad hoc computer engineering to collaborative and reproducible data s...Grigori Fursin
 
Fichas tecnicas tipologia_cepillos_grupo_207102_19
Fichas tecnicas tipologia_cepillos_grupo_207102_19Fichas tecnicas tipologia_cepillos_grupo_207102_19
Fichas tecnicas tipologia_cepillos_grupo_207102_19jdparraa23
 
Cómo hablar para que los niños escuchen
Cómo hablar para que los niños escuchenCómo hablar para que los niños escuchen
Cómo hablar para que los niños escuchenAna Gabriela
 
La naturaleza del aprendiza Karina Domínguez
La naturaleza del aprendiza Karina DomínguezLa naturaleza del aprendiza Karina Domínguez
La naturaleza del aprendiza Karina DomínguezKarizza Dominguez
 
Cómo hablar para que los niños escuchen y cómo escuchar para que los niños ha...
Cómo hablar para que los niños escuchen y cómo escuchar para que los niños ha...Cómo hablar para que los niños escuchen y cómo escuchar para que los niños ha...
Cómo hablar para que los niños escuchen y cómo escuchar para que los niños ha...Chona18
 
tecnicas monitoreo calidad aire - monitoreo ambiental.com
 tecnicas monitoreo calidad aire - monitoreo ambiental.com tecnicas monitoreo calidad aire - monitoreo ambiental.com
tecnicas monitoreo calidad aire - monitoreo ambiental.comAmparo Elizabeth Robles Zambrano
 
Media research task 6 audience research
Media research task 6 audience researchMedia research task 6 audience research
Media research task 6 audience researchAaron Alderman
 
Pesquisando o ensino de ciências por investigação na educação básica utilizan...
Pesquisando o ensino de ciências por investigação na educação básica utilizan...Pesquisando o ensino de ciências por investigação na educação básica utilizan...
Pesquisando o ensino de ciências por investigação na educação básica utilizan...Ronaldo Santana
 
Formal Assessment of Creativity by Katja Hölttä-Otto (Aalto University)
Formal Assessment of Creativity by Katja Hölttä-Otto (Aalto University)Formal Assessment of Creativity by Katja Hölttä-Otto (Aalto University)
Formal Assessment of Creativity by Katja Hölttä-Otto (Aalto University)EduSkills OECD
 

Viewers also liked (12)

CK: from ad hoc computer engineering to collaborative and reproducible data s...
CK: from ad hoc computer engineering to collaborative and reproducible data s...CK: from ad hoc computer engineering to collaborative and reproducible data s...
CK: from ad hoc computer engineering to collaborative and reproducible data s...
 
Fichas tecnicas tipologia_cepillos_grupo_207102_19
Fichas tecnicas tipologia_cepillos_grupo_207102_19Fichas tecnicas tipologia_cepillos_grupo_207102_19
Fichas tecnicas tipologia_cepillos_grupo_207102_19
 
Cómo hablar para que los niños escuchen
Cómo hablar para que los niños escuchenCómo hablar para que los niños escuchen
Cómo hablar para que los niños escuchen
 
La naturaleza del aprendiza Karina Domínguez
La naturaleza del aprendiza Karina DomínguezLa naturaleza del aprendiza Karina Domínguez
La naturaleza del aprendiza Karina Domínguez
 
Cómo hablar para que los niños escuchen y cómo escuchar para que los niños ha...
Cómo hablar para que los niños escuchen y cómo escuchar para que los niños ha...Cómo hablar para que los niños escuchen y cómo escuchar para que los niños ha...
Cómo hablar para que los niños escuchen y cómo escuchar para que los niños ha...
 
Npc informe final
Npc informe finalNpc informe final
Npc informe final
 
tecnicas monitoreo calidad aire - monitoreo ambiental.com
 tecnicas monitoreo calidad aire - monitoreo ambiental.com tecnicas monitoreo calidad aire - monitoreo ambiental.com
tecnicas monitoreo calidad aire - monitoreo ambiental.com
 
Media research task 6 audience research
Media research task 6 audience researchMedia research task 6 audience research
Media research task 6 audience research
 
Pesquisando o ensino de ciências por investigação na educação básica utilizan...
Pesquisando o ensino de ciências por investigação na educação básica utilizan...Pesquisando o ensino de ciências por investigação na educação básica utilizan...
Pesquisando o ensino de ciências por investigação na educação básica utilizan...
 
Biodiversidad
BiodiversidadBiodiversidad
Biodiversidad
 
Affective Assessment
Affective AssessmentAffective Assessment
Affective Assessment
 
Formal Assessment of Creativity by Katja Hölttä-Otto (Aalto University)
Formal Assessment of Creativity by Katja Hölttä-Otto (Aalto University)Formal Assessment of Creativity by Katja Hölttä-Otto (Aalto University)
Formal Assessment of Creativity by Katja Hölttä-Otto (Aalto University)
 

Similar to Joint CGO-PPoPP'17 Artifact Evaluation Discussion

Artifact Evaluation Experience CGO'15 / PPoPP'15
Artifact Evaluation Experience CGO'15 / PPoPP'15Artifact Evaluation Experience CGO'15 / PPoPP'15
Artifact Evaluation Experience CGO'15 / PPoPP'15Grigori Fursin
 
Research software susainability
Research software susainabilityResearch software susainability
Research software susainabilityDaniel S. Katz
 
Research data spring: streamlining deposit
Research data spring: streamlining depositResearch data spring: streamlining deposit
Research data spring: streamlining depositJisc RDM
 
2017 06-01-eswc2017-ug
2017 06-01-eswc2017-ug2017 06-01-eswc2017-ug
2017 06-01-eswc2017-ugMonika Solanki
 
Archivematica for research data
Archivematica for research dataArchivematica for research data
Archivematica for research dataJisc RDM
 
Link Discovery Tutorial Part III: Benchmarking for Instance Matching Systems
Link Discovery Tutorial Part III: Benchmarking for Instance Matching SystemsLink Discovery Tutorial Part III: Benchmarking for Instance Matching Systems
Link Discovery Tutorial Part III: Benchmarking for Instance Matching SystemsHolistic Benchmarking of Big Linked Data
 
Author's workflow and the role of open access
Author's workflow and the role of open accessAuthor's workflow and the role of open access
Author's workflow and the role of open accessPaola Gargiulo
 
capturing the impact of software AAS 2017
capturing the impact of software AAS 2017capturing the impact of software AAS 2017
capturing the impact of software AAS 2017Heather Piwowar
 
ACT Talk, Giuseppe Totaro: High Performance Computing for Distributed Indexin...
ACT Talk, Giuseppe Totaro: High Performance Computing for Distributed Indexin...ACT Talk, Giuseppe Totaro: High Performance Computing for Distributed Indexin...
ACT Talk, Giuseppe Totaro: High Performance Computing for Distributed Indexin...Advanced-Concepts-Team
 
Fox-Keynote-Now and Now of Data Publishing-nfdp13
Fox-Keynote-Now and Now of Data Publishing-nfdp13Fox-Keynote-Now and Now of Data Publishing-nfdp13
Fox-Keynote-Now and Now of Data Publishing-nfdp13DataDryad
 
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...Martin Klein
 
Resource sync overview and real-world use cases for discovery, harvesting, an...
Resource sync overview and real-world use cases for discovery, harvesting, an...Resource sync overview and real-world use cases for discovery, harvesting, an...
Resource sync overview and real-world use cases for discovery, harvesting, an...openminted_eu
 
Tracking research data footprints - slides
Tracking research data footprints - slidesTracking research data footprints - slides
Tracking research data footprints - slidesARDC
 
HPC I/O for Computational Scientists
HPC I/O for Computational ScientistsHPC I/O for Computational Scientists
HPC I/O for Computational Scientistsinside-BigData.com
 
Commodity Semantic Search: A Case Study of DiscoverEd
Commodity Semantic Search: A Case Study of DiscoverEdCommodity Semantic Search: A Case Study of DiscoverEd
Commodity Semantic Search: A Case Study of DiscoverEdNathan Yergler
 
ADVANCING RESEARCH COMPUTING ON CAMPUSES: BEST PRACTICES WORKSHOP - Facilitat...
ADVANCING RESEARCH COMPUTING ON CAMPUSES: BEST PRACTICES WORKSHOP - Facilitat...ADVANCING RESEARCH COMPUTING ON CAMPUSES: BEST PRACTICES WORKSHOP - Facilitat...
ADVANCING RESEARCH COMPUTING ON CAMPUSES: BEST PRACTICES WORKSHOP - Facilitat...Sean Cleveland Ph.D.
 

Similar to Joint CGO-PPoPP'17 Artifact Evaluation Discussion (20)

Artifact Evaluation Experience CGO'15 / PPoPP'15
Artifact Evaluation Experience CGO'15 / PPoPP'15Artifact Evaluation Experience CGO'15 / PPoPP'15
Artifact Evaluation Experience CGO'15 / PPoPP'15
 
Research software susainability
Research software susainabilityResearch software susainability
Research software susainability
 
Research data spring: streamlining deposit
Research data spring: streamlining depositResearch data spring: streamlining deposit
Research data spring: streamlining deposit
 
2017 06-01-eswc2017-ug
2017 06-01-eswc2017-ug2017 06-01-eswc2017-ug
2017 06-01-eswc2017-ug
 
Kaptur environmental assessment methodology
Kaptur environmental assessment methodologyKaptur environmental assessment methodology
Kaptur environmental assessment methodology
 
Archivematica for research data
Archivematica for research dataArchivematica for research data
Archivematica for research data
 
Link Discovery Tutorial Part III: Benchmarking for Instance Matching Systems
Link Discovery Tutorial Part III: Benchmarking for Instance Matching SystemsLink Discovery Tutorial Part III: Benchmarking for Instance Matching Systems
Link Discovery Tutorial Part III: Benchmarking for Instance Matching Systems
 
Author's workflow and the role of open access
Author's workflow and the role of open accessAuthor's workflow and the role of open access
Author's workflow and the role of open access
 
capturing the impact of software AAS 2017
capturing the impact of software AAS 2017capturing the impact of software AAS 2017
capturing the impact of software AAS 2017
 
ACT Talk, Giuseppe Totaro: High Performance Computing for Distributed Indexin...
ACT Talk, Giuseppe Totaro: High Performance Computing for Distributed Indexin...ACT Talk, Giuseppe Totaro: High Performance Computing for Distributed Indexin...
ACT Talk, Giuseppe Totaro: High Performance Computing for Distributed Indexin...
 
Metadata as Standard: improving Interoperability through the Research Data Al...
Metadata as Standard: improving Interoperability through the Research Data Al...Metadata as Standard: improving Interoperability through the Research Data Al...
Metadata as Standard: improving Interoperability through the Research Data Al...
 
Fox-Keynote-Now and Now of Data Publishing-nfdp13
Fox-Keynote-Now and Now of Data Publishing-nfdp13Fox-Keynote-Now and Now of Data Publishing-nfdp13
Fox-Keynote-Now and Now of Data Publishing-nfdp13
 
2311 EAAMO
2311 EAAMO2311 EAAMO
2311 EAAMO
 
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...
 
Resource sync overview and real-world use cases for discovery, harvesting, an...
Resource sync overview and real-world use cases for discovery, harvesting, an...Resource sync overview and real-world use cases for discovery, harvesting, an...
Resource sync overview and real-world use cases for discovery, harvesting, an...
 
Tracking research data footprints - slides
Tracking research data footprints - slidesTracking research data footprints - slides
Tracking research data footprints - slides
 
Debbie Liang Resume
Debbie Liang ResumeDebbie Liang Resume
Debbie Liang Resume
 
HPC I/O for Computational Scientists
HPC I/O for Computational ScientistsHPC I/O for Computational Scientists
HPC I/O for Computational Scientists
 
Commodity Semantic Search: A Case Study of DiscoverEd
Commodity Semantic Search: A Case Study of DiscoverEdCommodity Semantic Search: A Case Study of DiscoverEd
Commodity Semantic Search: A Case Study of DiscoverEd
 
ADVANCING RESEARCH COMPUTING ON CAMPUSES: BEST PRACTICES WORKSHOP - Facilitat...
ADVANCING RESEARCH COMPUTING ON CAMPUSES: BEST PRACTICES WORKSHOP - Facilitat...ADVANCING RESEARCH COMPUTING ON CAMPUSES: BEST PRACTICES WORKSHOP - Facilitat...
ADVANCING RESEARCH COMPUTING ON CAMPUSES: BEST PRACTICES WORKSHOP - Facilitat...
 

Recently uploaded

Observational constraints on mergers creating magnetism in massive stars
Observational constraints on mergers creating magnetism in massive starsObservational constraints on mergers creating magnetism in massive stars
Observational constraints on mergers creating magnetism in massive starsSérgio Sacani
 
complex analysis best book for solving questions.pdf
complex analysis best book for solving questions.pdfcomplex analysis best book for solving questions.pdf
complex analysis best book for solving questions.pdfSubhamKumar3239
 
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdfKDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdfGABYFIORELAMALPARTID1
 
whole genome sequencing new and its types including shortgun and clone by clone
whole genome sequencing new  and its types including shortgun and clone by clonewhole genome sequencing new  and its types including shortgun and clone by clone
whole genome sequencing new and its types including shortgun and clone by clonechaudhary charan shingh university
 
DNA isolation molecular biology practical.pptx
DNA isolation molecular biology practical.pptxDNA isolation molecular biology practical.pptx
DNA isolation molecular biology practical.pptxGiDMOh
 
Quarter 4_Grade 8_Digestive System Structure and Functions
Quarter 4_Grade 8_Digestive System Structure and FunctionsQuarter 4_Grade 8_Digestive System Structure and Functions
Quarter 4_Grade 8_Digestive System Structure and FunctionsCharlene Llagas
 
DOG BITE management in pediatrics # for Pediatric pgs# topic presentation # f...
DOG BITE management in pediatrics # for Pediatric pgs# topic presentation # f...DOG BITE management in pediatrics # for Pediatric pgs# topic presentation # f...
DOG BITE management in pediatrics # for Pediatric pgs# topic presentation # f...HafsaHussainp
 
Oxo-Acids of Halogens and their Salts.pptx
Oxo-Acids of Halogens and their Salts.pptxOxo-Acids of Halogens and their Salts.pptx
Oxo-Acids of Halogens and their Salts.pptxfarhanvvdk
 
well logging & petrophysical analysis.pptx
well logging & petrophysical analysis.pptxwell logging & petrophysical analysis.pptx
well logging & petrophysical analysis.pptxzaydmeerab121
 
WEEK 4 PHYSICAL SCIENCE QUARTER 3 FOR G11
WEEK 4 PHYSICAL SCIENCE QUARTER 3 FOR G11WEEK 4 PHYSICAL SCIENCE QUARTER 3 FOR G11
WEEK 4 PHYSICAL SCIENCE QUARTER 3 FOR G11GelineAvendao
 
Environmental Acoustics- Speech interference level, acoustics calibrator.pptx
Environmental Acoustics- Speech interference level, acoustics calibrator.pptxEnvironmental Acoustics- Speech interference level, acoustics calibrator.pptx
Environmental Acoustics- Speech interference level, acoustics calibrator.pptxpriyankatabhane
 
Environmental acoustics- noise criteria.pptx
Environmental acoustics- noise criteria.pptxEnvironmental acoustics- noise criteria.pptx
Environmental acoustics- noise criteria.pptxpriyankatabhane
 
Q4-Mod-1c-Quiz-Projectile-333344444.pptx
Q4-Mod-1c-Quiz-Projectile-333344444.pptxQ4-Mod-1c-Quiz-Projectile-333344444.pptx
Q4-Mod-1c-Quiz-Projectile-333344444.pptxtuking87
 
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...D. B. S. College Kanpur
 
How we decide powerpoint presentation.pptx
How we decide powerpoint presentation.pptxHow we decide powerpoint presentation.pptx
How we decide powerpoint presentation.pptxJosielynTars
 
6.1 Pests of Groundnut_Binomics_Identification_Dr.UPR
6.1 Pests of Groundnut_Binomics_Identification_Dr.UPR6.1 Pests of Groundnut_Binomics_Identification_Dr.UPR
6.1 Pests of Groundnut_Binomics_Identification_Dr.UPRPirithiRaju
 
The Sensory Organs, Anatomy and Function
The Sensory Organs, Anatomy and FunctionThe Sensory Organs, Anatomy and Function
The Sensory Organs, Anatomy and FunctionJadeNovelo1
 

Recently uploaded (20)

PLASMODIUM. PPTX
PLASMODIUM. PPTXPLASMODIUM. PPTX
PLASMODIUM. PPTX
 
Observational constraints on mergers creating magnetism in massive stars
Observational constraints on mergers creating magnetism in massive starsObservational constraints on mergers creating magnetism in massive stars
Observational constraints on mergers creating magnetism in massive stars
 
complex analysis best book for solving questions.pdf
complex analysis best book for solving questions.pdfcomplex analysis best book for solving questions.pdf
complex analysis best book for solving questions.pdf
 
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdfKDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
 
whole genome sequencing new and its types including shortgun and clone by clone
whole genome sequencing new  and its types including shortgun and clone by clonewhole genome sequencing new  and its types including shortgun and clone by clone
whole genome sequencing new and its types including shortgun and clone by clone
 
DNA isolation molecular biology practical.pptx
DNA isolation molecular biology practical.pptxDNA isolation molecular biology practical.pptx
DNA isolation molecular biology practical.pptx
 
Quarter 4_Grade 8_Digestive System Structure and Functions
Quarter 4_Grade 8_Digestive System Structure and FunctionsQuarter 4_Grade 8_Digestive System Structure and Functions
Quarter 4_Grade 8_Digestive System Structure and Functions
 
DOG BITE management in pediatrics # for Pediatric pgs# topic presentation # f...
DOG BITE management in pediatrics # for Pediatric pgs# topic presentation # f...DOG BITE management in pediatrics # for Pediatric pgs# topic presentation # f...
DOG BITE management in pediatrics # for Pediatric pgs# topic presentation # f...
 
Let’s Say Someone Did Drop the Bomb. Then What?
Let’s Say Someone Did Drop the Bomb. Then What?Let’s Say Someone Did Drop the Bomb. Then What?
Let’s Say Someone Did Drop the Bomb. Then What?
 
Oxo-Acids of Halogens and their Salts.pptx
Oxo-Acids of Halogens and their Salts.pptxOxo-Acids of Halogens and their Salts.pptx
Oxo-Acids of Halogens and their Salts.pptx
 
well logging & petrophysical analysis.pptx
well logging & petrophysical analysis.pptxwell logging & petrophysical analysis.pptx
well logging & petrophysical analysis.pptx
 
WEEK 4 PHYSICAL SCIENCE QUARTER 3 FOR G11
WEEK 4 PHYSICAL SCIENCE QUARTER 3 FOR G11WEEK 4 PHYSICAL SCIENCE QUARTER 3 FOR G11
WEEK 4 PHYSICAL SCIENCE QUARTER 3 FOR G11
 
Interferons.pptx.
Interferons.pptx.Interferons.pptx.
Interferons.pptx.
 
Environmental Acoustics- Speech interference level, acoustics calibrator.pptx
Environmental Acoustics- Speech interference level, acoustics calibrator.pptxEnvironmental Acoustics- Speech interference level, acoustics calibrator.pptx
Environmental Acoustics- Speech interference level, acoustics calibrator.pptx
 
Environmental acoustics- noise criteria.pptx
Environmental acoustics- noise criteria.pptxEnvironmental acoustics- noise criteria.pptx
Environmental acoustics- noise criteria.pptx
 
Q4-Mod-1c-Quiz-Projectile-333344444.pptx
Q4-Mod-1c-Quiz-Projectile-333344444.pptxQ4-Mod-1c-Quiz-Projectile-333344444.pptx
Q4-Mod-1c-Quiz-Projectile-333344444.pptx
 
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
 
How we decide powerpoint presentation.pptx
How we decide powerpoint presentation.pptxHow we decide powerpoint presentation.pptx
How we decide powerpoint presentation.pptx
 
6.1 Pests of Groundnut_Binomics_Identification_Dr.UPR
6.1 Pests of Groundnut_Binomics_Identification_Dr.UPR6.1 Pests of Groundnut_Binomics_Identification_Dr.UPR
6.1 Pests of Groundnut_Binomics_Identification_Dr.UPR
 
The Sensory Organs, Anatomy and Function
The Sensory Organs, Anatomy and FunctionThe Sensory Organs, Anatomy and Function
The Sensory Organs, Anatomy and Function
 

Joint CGO-PPoPP'17 Artifact Evaluation Discussion

  • 1. Grigori Fursin Joint CGOGrigori Fursin Joint CGO--PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017 Joint CGO-PPoPP’17 Artifact Evaluation Discussion AE Chairs CGO’17: Joseph Devietti, University of Pennsylvania PPoPP’17: Wonsun Ahn, University of Pittsburgh AE CGO-PPoPP-PACT Steering Committee Grigori Fursin, dividiti / cTuning foundation Bruce Childers, University of Pittsburgh Agenda • Results and issues • Awards by NVIDIA and dividiti • Discussion how to improve and scale future AE Fantastic Artifact Evaluators and Supporters cTuning.org/committee.html cTuning.org/ae/artifacts.html http://dividiti.blogspot.com/2017/01/artifact-evaluation-discussion-session.html
  • 2. Grigori Fursin Joint CGOGrigori Fursin Joint CGO--PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017 How CGO-PPoPP-PACT AE works time line paper accepted paper accepted artifacts submitted artifacts submitted evaluatorevaluator bidding artifacts assigned artifacts assigned evaluationsevaluations available evaluationsevaluations finalized artifactartifact decision 7..12 days to prepare artifacts according to guidelines: cTuning.org/ submission.html 2..4 days for evaluators to bid on artifacts (according to their knowledge and access to required SW/HW) 2 days to assign artifacts – ensure at least 3 reviews per artifact, reduce risks, avoid mix ups minimize conflicts of interests 2 weeks to review artifacts according to guidelines: cTuning.org/ reviewing.html 3..4 days for authors to respond to reviews and fix problems 2..3 days to finalize reviews NOTE: we consider AE a cooperative process and try to help authors fix artifacts and pass evaluation (particularly if artifacts will be open-sourced) Light communication between authors and reviewers is allowed via AE chairs (to preserve anonymity of the reviewers) Year PPoPP CGO Total Problems Rejected 2015 10 8 18 7 2  2016 12 11 23 4 0 2017 14 13 27 7 0 2..3 days to add AE stamp and AE appendix to a camera- ready paper
  • 3. Grigori Fursin Joint CGOGrigori Fursin Joint CGO--PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017 AE: good, bad and ugly … Good: many interesting and open source artifacts – authors and evaluators take AE seriously! Bad: * too many artifacts – need to somehow scale AE while keeping the quality (41 evaluators, ~120 reviews to handle during 2.5 weeks) * sometimes difficult to find evaluators with appropriate skills and access to proprietary SW and rare HW * very intense schedule and not enough time for rebuttals * communication between authors and reviewers via AE chairs is a bottleneck Ugly: * too many ad-hoc scripts to prepare and run experiments * no common workflow frameworks (in contrast with some other sciences) * no common formats and APIs (benchmarks, data sets, tools) * difficult to reproduce empirical results across diverse SW/HW and inputs time line paper accepted paper accepted artifacts submitted artifacts submitted evaluatorevaluator bidding artifacts assigned artifacts assigned evaluationsevaluations available evaluationsevaluations finalized artifactartifact decision 7..12 days 2..4 days 2 days 2 weeks 3..4 days 2..3 days 2..3 days
  • 4. Grigori Fursin Joint CGOGrigori Fursin Joint CGO--PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017 Joint CGO-PPoPP’17 awards a) Promote “good” (well documented, consistent and easy to use) artifacts NVIDIA donated “Pascal Titan X GPGPU card” for the highest ranked artifact b) Promote using workflow frameworks to share artifacts and experiments as customizable and reusable components with common meta description and API DIVIDITI donated $500 for the highest ranked artifact shared using Collective Knowledge workflow framework dividiti.com cKnowledge.org Collective Knowledge is being developed by the community to simplify AE process and improve sharing of artifacts as customizable and reusable Python components with extensible JSON meta-description and JSON API, assemble cross-platform workflows, automate and crowdsource empirical experiments, and enable interactive reports.
  • 5. Joint CGO/PPoPP Artifact Evaluation Award for the distinguished open-source artifact shared in the Collective Knowledge format “Software Prefetching for Indirect Memory Accesses” Sam Ainsworth, Timothy M. Jones University of Cambridge The cTuning foundation and dividiti are pleased to grant February 2017
  • 6. Joint CGO/PPoPP Distinguished Artifact Award Xiuxia Zhang1, Guangming Tan1, Shuangbai Xue1, Jiajia Li2, Mingyu Chen1 1 Chinese Academy of Sciences 2 Georgia Institute of Technology February 2017 for “Demystifying GPU Microarchitecture to Tune SGEMM Performance” The cTuning foundation and NVIDIA are pleased to present
  • 7. Grigori Fursin Joint CGOGrigori Fursin Joint CGO--PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017 Discussion how to improve/scale future AE time line paper accepted paper accepted artifacts submitted artifacts submitted evaluatorevaluator bidding artifacts assigned artifacts assigned evaluationsevaluations available evaluationsevaluations finalized artifactartifact decision 7..12 days 2..4 days 2 days 2 weeks 3..4 days 2..3 days 2..3 days 1) Introduce two evaluation options: private and public a) traditional evaluation for private artifacts (for example, from industry, though less and less common) time line paper accepted paper accepted artifacts submitted artifacts submitted AE chairs announce public artifacts at XSEDE/GRID5000/etc AE chairs announce public artifacts at XSEDE/GRID5000/etc AE chairs monitor open artifacts are evaluated AE chairs monitor open discussions until artifacts are evaluated artifactartifact decision any time 1..2 days from a few days to 2 weeks 3..4 days 2..3 days b) open evaluation of public and open-source artifacts (if already avialable at GitHub, BitBucket, GitLab with “discussion mechanisms” during submission…) At CGO/PPoPP’17, we have sent out requests to validate several open-source artifacts to the public mailing lists from the conferences, network of excellence, supercomputer centers, etc. We found evaluators willing to help and having an access to rare hardware or supercomputers as well as required software and proprietary benchmarks Authors quickly fixed issues and answered research questions while AE chairs steered the discussion! See public reviewing examples at cTuning.org/ae/artifacts.html and adapt-workshop.org
  • 8. Grigori Fursin Joint CGOGrigori Fursin Joint CGO--PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017 2) Enable public or private discussion channels between authors and reviewers for each artifact (rather than communicating via AE chairs) Useful technology: slack.com , reddit.com Evaluators can still be anonymous if they wish so … 3) Help authors prepare artifacts and workflows for unified evaluation (community service by volunteers?) This year we processed more than 120 evaluation reports. Nearly all artifacts had their own ad-hoc scripts to build and run workflows, process outputs, validate results, etc. Since it’s a huge burden for evaluators, they ask us to gradually introduce common workflows and data formats to unify evaluation. A possible solution is to introduce an optional service (based on distinguished artifacts) to help authors convert their ad-hoc scripts to some common format and thus scale AE! Furthermore, it may help researchers easily reuse and customize past artifacts, and build upon them! Discussion how to improve/scale future AE
  • 9. Grigori Fursin Joint CGOGrigori Fursin Joint CGO--PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017 Discussion how to improve/scale future AE 4) Should we update Artifact Appendices? Two years ago we introduced Artifact Appendix templates to unify Artifact submissions and let authors add up to two pages of such appendices to their camera ready paper: http://cTuning.org/ae/submission.html http://cTuning.org/ae/submission_extra.html The idea is to help readers better understand what was evaluated and let them reproduce published research and build upon it. We did not receive complaints about our appendices and many researchers decided to add them to their camera ready papers (see http://cTuning.org/ae/artifacts.html). Similar AE appendices are now used by other conferences (SC,RTSS): http://sc17.supercomputing.org/submitters/technical- papers/reproducibility-initiatives-for-technical-papers/artifact-description- paper-title We suggest to get in touch with AE chairs from all related conferences to sync on future AE submission and reviewing procedures to avoid defragmentation!
  • 10. Grigori Fursin Joint CGOGrigori Fursin Joint CGO--PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017 Discussion how to improve/scale future AE 5) Decide whether to evaluate all experiments or still allow partial validation or even only artifact sharing We do not yet have a common methodology to fully validate experimental results from the research papers in our domain – we also know that full validation of empirical experiments is very challenging and time consuming. At the same time, making artifacts available is also extremely valuable to the community (data sets, predictive models, architecture simulators and their models, benchmarks, tools, experimental workflows). Last year we participated in ACM workshop on reproducible research and co-authored the following ACM Result and Artifact Review and Badging policy (based on our AE experience): http://www.acm.org/publications/policies/artifact-review-badging It suggests using several separate badges: • Artifacts publicly available • Artifacts evaluated (functional, reusable) • Results validated (replicated, reproduced) We consider using above policy and badges for the next AE – feedback is welcome!
  • 11. Grigori Fursin Joint CGOGrigori Fursin Joint CGO--PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017 Discussion how to improve/scale future AE 6) Evaluate artifacts for “tool” papers during main reviewing We now discuss the possibility to validate artifacts for so-called tool papers during main reviewing. Such evaluation will influence acceptance decision. Similar approach seems to be used at SuperComputing’17 (will be useful to discuss that with SC’17 AE organizers). Current problems are: • Artifact Evaluation committee may not be prepared yet (though we have a joint AEC from last years) • If we ask PC members to evaluate papers and artifacts at the same time, it’s an extra burden. Furthermore, PC members may not have required technical skills (that’s why AEC is usually assembled from postdocs and research engineers) • CGO and PPoPP use double blind reviewing. However reviewing artifacts without revealing authors identity is very non-trivial and places an extra unnecessary burden on the authors and evaluators (and may kill AE) (we should check how/if SC’16/SC’17 solve this problem since they also use double blind reviewing).
  • 12. Grigori Fursin Joint CGOGrigori Fursin Joint CGO--PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017PPoPP’17 Artifact Evaluation Discussion Austin, TX February 2017 We need your feedback! Thank you!!! We need your feedback - remember that new AE procedures may affect you at the future conferences • Contact AE steering committee: http://cTuning.org/committee.html • Mailing list: https://groups.google.com/forum/#!forum/collective-knowledge Extra resources • Artifact Evaluation Website: http://cTuning.org/ae • ACM Result and Artifact Review and Badging policy: http://www.acm.org/publications/policies/artifact-review-badging • CK workflow framework: http://cKnowledge.org • Community driven artifact/paper evaluation: http://dl.acm.org/citation.cfm?doid=2618137.2618142