PLANETS, OPF & SCAPE

A summary of the tools from these preservation projects, and where their development is heading



www.openplanetsfoundation.org
PLANETS

• A big project to build digital preservation tools...




OPF’s Challenge

• The Open Planets Foundation was set up to sustain
  the PLANETS outputs into the future.
   – But the tools are
          • Numerous, often complex, & of mixed quality/maturity
          • Require complex technology stacks (JEE)
   – So, how do we make the code sustainable?
          • Selection, modularisation, simplification
          • Aim for a flexible suite of modular tools, rather than a
            monolithic system


SCAPE

• http://www.scape-project.eu/
• Many PLANETS partners
   – Including OPF
• Many new partners too
• Driven by data
   – Web archiving, science data, large-scale
• Cluster computing for scale
   – Based on the Hadoop platform

PLATO




The PLANETS Testbed




The PLANETS Testbed: Too Many Good Ideas In One Place
• Designing experiments
   – Web GUI for complex workflows
• Running experiments
   – All services hosted centrally, plus test corpora
• Analysing the results
   – Per-experiment automated & manual analysis
   – Multi-experiment aggregation & data mining
• Sharing all of the above

Re-imagining The PLANETS Testbed: A Modular Approach
• Use separate tools in each role
   – Experiment Design
   – Execution
   – Analysis

• Publish results from each
   – Loosely coupled instead of all-in-one
            • i.e. sharing is built into the design


Experiment Design: SCAPE Workflows In Taverna
• As part of SCAPE




Experiment Design Support: SCAPE Service Registry




Experiment Design Support: OPF Shared Test Corpora
• Simple collections accessed over HTTP
   – No special browser software required
• Publicly hosted by HATII
   – May also be mirrored by OPF members
• Stabilise corpora from Planets
   – Absorb corpora from SCAPE & elsewhere
• Look for Open Source CMS/Annotation tools
   – Layer on top of HTTP collections
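
A minimal sketch of what "simple collections accessed over HTTP" could mean for a client, assuming the corpus is exposed as an ordinary web-server directory listing. The base URL, file names and listing markup below are hypothetical and only illustrate the idea; no network access is needed to run it.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags in a directory-index page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def list_corpus_files(index_html, base_url):
    """Return absolute URLs of the files linked from a directory index.

    Sort links ('?...'), absolute or parent links and sub-directories
    are skipped, so only files in this collection remain.
    """
    parser = LinkCollector()
    parser.feed(index_html)
    files = []
    for href in parser.hrefs:
        if href.startswith(("?", "/", "..")) or href.endswith("/"):
            continue
        files.append(urljoin(base_url, href))
    return files


# Hypothetical corpus location and listing, for illustration only.
BASE_URL = "http://corpora.example.org/planets/images/"
SAMPLE_INDEX = """
<html><body>
<a href="?C=N;O=D">Name</a>
<a href="/planets/">Parent Directory</a>
<a href="scan-001.tif">scan-001.tif</a>
<a href="scan-002.tif">scan-002.tif</a>
<a href="derived/">derived/</a>
</body></html>
"""

corpus_files = list_corpus_files(SAMPLE_INDEX, BASE_URL)
```

Because the collection is plain HTTP, any mirror could be used by changing only the base URL.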

Experiment Design Support: Sharing & Publishing Via myExperiment




Experiment Execution Support: SCAPE’s Lightweight Tool Wrapping
• PIT: Preservation-action Invocation Tool
   – Uses XML ‘tool specification’ documents that
     describe preservation actions
          • Command-line templates, Java classes, PLANETS/SCAPE
            web services, etc.
   – Built to be shared
          • Can be published via, e.g., myExperiment
          • Should lead to more reproducible results
   – Re-using PLANETS interoperability code
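
The command-line template idea can be sketched as follows. This is not the real PIT schema: the element names, the `${input}`/`${output}` placeholders and the choice of ImageMagick's `convert` as the wrapped tool are all assumptions made for illustration.

```python
import shlex
import xml.etree.ElementTree as ET
from string import Template

# Hypothetical tool specification. Element and attribute names are
# invented for this sketch and are not the real PIT format.
TOOL_SPEC = """
<toolspec id="example-convert">
  <action type="migrate" name="tiff-to-png">
    <command>convert ${input} ${output}</command>
  </action>
</toolspec>
""".strip()


def build_command(spec_xml, action_name, **params):
    """Fill the named action's command-line template and return an argv list."""
    root = ET.fromstring(spec_xml)
    for action in root.iter("action"):
        if action.get("name") == action_name:
            template = Template(action.findtext("command").strip())
            # Quote each value so paths containing spaces stay one argument.
            quoted = {key: shlex.quote(value) for key, value in params.items()}
            return shlex.split(template.substitute(quoted))
    raise KeyError(f"no action named {action_name!r} in tool spec")


argv = build_command(TOOL_SPEC, "tiff-to-png",
                     input="my scan.tif", output="out.png")
```

Keeping the template in a shareable document, rather than in code, is what would let the same action be published and re-run elsewhere.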

Experiment Execution: Multi-platform Tool & Workflow Invocation
• Shared tool specifications make multi-platform
  execution easier
   – From the command line
   – From within Taverna
   – From the SCAPE cluster platform
   – From a simplified web interface
• Run local-first, remote/service as needed
• Collect results in a standard form, using Testbed code
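
One way to read "run local-first, remote/service as needed" is sketched below, assuming a tool counts as locally available when its executable is on the PATH. The result dictionary and the `fake_remote` stand-in are hypothetical; they are not the Testbed result format or a real SCAPE service call.

```python
import shutil
import subprocess
import sys


def run_action(argv, remote_fallback):
    """Run argv locally if its executable is installed, else defer to a remote call.

    Both paths return a dictionary with the same keys, so later analysis
    does not need to know where the action ran.
    """
    executable = shutil.which(argv[0])
    if executable is None:
        return {"where": "remote", "argv": argv, **remote_fallback(argv)}
    done = subprocess.run([executable, *argv[1:]],
                          capture_output=True, text=True)
    return {"where": "local", "argv": argv,
            "exit_code": done.returncode, "stdout": done.stdout}


def fake_remote(argv):
    # Stand-in for a web-service invocation (hypothetical).
    return {"exit_code": 0, "stdout": "handled remotely\n"}


# The Python interpreter itself serves as a tool that is certainly installed.
local = run_action([sys.executable, "-c", "print('ok')"], fake_remote)
remote = run_action(["no-such-preservation-tool", "in.tif"], fake_remote)
```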


Experiment Execution: Publishing Experimental Results Via REF
• OPF Results Evaluation Framework: REF
   – Hard-coded experiments of common interest
          • Can run the experiment automatically
   – Publishes results as linked data
          • http://data.openplanetsfoundation.org/ref/extension/
• Built by Dave Tarrant, based on P2 format registry
   – Will come up again in the Identification session
   – SCAPE aims to publish much more data


Analysing Results: Linked Data & Future Plans
• REF allows data to be inspected
   – Concentrating on collecting data at present
• Will expose SPARQL endpoint for data queries
   – Analysis and visualisation can be built upon that

• Please add analysis Issues for your Datasets and
  preservation processes to the wiki!
   – e.g. what graphs and statistics would be useful?
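
A sketch of how analysis might sit on top of a SPARQL endpoint once one exists. The endpoint URL, the `ex:` vocabulary and the sample rows are invented for illustration and are not real REF data. Only the request shape (a `query` parameter on a GET request) and the JSON results layout follow the SPARQL 1.1 specifications.

```python
import json
from collections import Counter
from urllib.parse import urlencode

ENDPOINT = "http://data.example.org/sparql"  # hypothetical endpoint

# Hypothetical vocabulary: one result node per tool run.
QUERY = """
PREFIX ex: <http://example.org/ref#>
SELECT ?tool ?outcome WHERE {
  ?result ex:tool ?tool ;
          ex:outcome ?outcome .
}
""".strip()


def query_url(endpoint, query):
    """Build a SPARQL-protocol GET URL for the given query text."""
    return endpoint + "?" + urlencode({"query": query})


def rows(results_json):
    """Flatten a SPARQL JSON results document into plain dictionaries."""
    doc = json.loads(results_json)
    names = doc["head"]["vars"]
    return [
        {name: binding[name]["value"] for name in names if name in binding}
        for binding in doc["results"]["bindings"]
    ]


# Invented sample response in the standard SPARQL JSON results layout.
SAMPLE_RESULTS = json.dumps({
    "head": {"vars": ["tool", "outcome"]},
    "results": {"bindings": [
        {"tool": {"type": "literal", "value": "tool-a"},
         "outcome": {"type": "literal", "value": "identified"}},
        {"tool": {"type": "literal", "value": "tool-a"},
         "outcome": {"type": "literal", "value": "unknown"}},
        {"tool": {"type": "literal", "value": "tool-b"},
         "outcome": {"type": "literal", "value": "identified"}},
    ]},
})

url = query_url(ENDPOINT, QUERY)
# A simple statistic: how often each tool produced each outcome.
outcome_counts = Counter((r["tool"], r["outcome"]) for r in rows(SAMPLE_RESULTS))
```

Counts like these are the kind of raw material a graph or summary table could be built from.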


Summary

• PLATO
   – SCAPE will add Preservation Watch & more
• The PLANETS Testbed
   – Re-imagined as a gateway to a complementary
     suite of preservation tools and data services
   – SCAPE leveraging work from Taverna, IMPACT
• Development driven by user needs
   – SCAPE Scenarios, AQuA/Hackathon Issues

