iRODS:
Interoperability in
Data Management
Leesa Brieger, RENCI-UNC
Mike Wan, DICE-UCSD
integrated Rule-Oriented Data System
(iRODS)
• Developed by the Data Intensive Cyber Environments (DICE) group, UNC and UCSD
• Follow-on to SRB, the Storage Resource Broker from SDSC
– decade-long development experience, community-driven
• Modular, extensible, customizable
• Open source (BSD license)
• Supported by the Renaissance Computing Institute (RENCI), UNC
– a research unit of UNC Chapel Hill
– state-supported
– governed by the Triangle universities (UNC, NCSU, Duke)
HDF, HDF-EOS Workshop XV, April 17-19, 2012

iRODS
I. Data grid middleware
II. Data management infrastructure
III. Framework for implementing policy-driven data management

The extensibility and modularity of iRODS make it a customizable and
resource-agnostic infrastructure.

iRODS as Data Grid
iRODS View of Distributed Data

[Figure: a user client sees a single collection that spans "My Data" (disk, filesystem, WOS storage unit...), "My Data" (tape, database, filesystem...), and "Partner's Data" (remote disk, tape, filesystem...).]

• iRODS installs over heterogeneous data resources
• Users share & manage distributed data as a single collection
• iCAT metadata catalogue: DB that manages the logical-to-physical mappings (data objects, users, resources)
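
The logical-to-physical mapping described above can be illustrated with a small sketch. This is not the iCAT schema or any iRODS API: the `Catalog` and `Replica` classes, the resource names, and all paths are hypothetical, chosen only to show how one logical name can resolve to replicas on different storage systems while the user browses a single collection.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Replica:
    resource: str       # hypothetical resource name, e.g. "diskResc"
    physical_path: str  # where the bytes live on that resource


@dataclass
class Catalog:
    """Toy stand-in for a metadata catalogue: logical path -> replicas."""
    _replicas: dict = field(default_factory=dict)

    def register(self, logical_path, resource, physical_path):
        # Record one more physical copy of the same logical object.
        self._replicas.setdefault(logical_path, []).append(
            Replica(resource, physical_path))

    def resolve(self, logical_path):
        """Return every physical replica recorded for a logical name."""
        return list(self._replicas.get(logical_path, []))

    def list_collection(self, collection):
        """List logical names under a collection, regardless of storage."""
        prefix = collection.rstrip("/") + "/"
        return sorted(p for p in self._replicas if p.startswith(prefix))


catalog = Catalog()
catalog.register("/zone/home/alice/run1.h5", "diskResc", "/vault/a/run1.h5")
catalog.register("/zone/home/alice/run1.h5", "tapeResc", "/tape/0042/run1.h5")
catalog.register("/zone/home/alice/notes.txt", "partnerResc", "/data/x/notes.txt")

print(catalog.list_collection("/zone/home/alice"))
print(catalog.resolve("/zone/home/alice/run1.h5"))
```

The listing call never mentions a storage system, which is the point of the single-collection view: only `resolve` exposes where the replicas physically live.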
Data Life Cycle
Usage evolves across stages of the
data life cycle; management
policy evolves along with it.

[Figure: data life cycle diagram. Labels: Creation; Active Use; Publication & Sharing; Local Policy; Reference Collection; Service/Use; Distribution; Discovery and Re-purposing; Archival Collection; Deletion; Retention/Preservation.]

iRODS modularity and extensibility allow support for changing
management requirements over the data life cycle.

iRODS Design Goals
• Data grid abstraction for data, users, resources
• Abstract out the data management
– Separate data administration from storage administration
• drivers allow iRODS to speak the local storage protocol
• rule engine runs services and data operations

– Policy-based data management
• Data management: specialized modules of microservices (C
code) and rules for running data-side services
• Policy-based: event-triggered rule execution

– Policy follows data around the grid
• collection management independent of remote storage locations
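
Event-triggered rule execution, as named in the design goals above, can be sketched in a few lines. This is a toy model and not the iRODS rule engine or its rule language: the `RuleEngine` class, the `post_put` event name, and the `archiveResc` resource are assumptions made for illustration, and the rule bodies only stand in for microservices.

```python
from collections import defaultdict


class RuleEngine:
    """Toy policy engine: rules are callables registered per event name."""

    def __init__(self):
        self._rules = defaultdict(list)

    def on(self, event):
        """Decorator that registers a rule for an event (a policy hook)."""
        def register(rule):
            self._rules[event].append(rule)
            return rule
        return register

    def fire(self, event, **context):
        """Run every rule registered for the event, in registration order."""
        return [rule(**context) for rule in self._rules[event]]


engine = RuleEngine()
audit_log = []


@engine.on("post_put")  # hypothetical event: fired after a file is stored
def replicate(path, resource):
    # Stand-in for a replication microservice; it only reports the intent.
    return f"replicate {path} from {resource} to archiveResc"


@engine.on("post_put")
def audit(path, resource):
    # Stand-in for an accounting microservice.
    audit_log.append((path, resource))
    return f"audited {path}"


actions = engine.fire("post_put", path="/zone/home/alice/run1.h5",
                      resource="diskResc")
print(actions)
```

Because the rules are attached to the event rather than to a storage location, the same policy runs wherever the data lands, which is one way to read "policy follows data around the grid".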
Interoperability
• Federation
– Data grids with independent administration can federate and cross-communicate

• Clients
– User-supplied or specialty client interfaces
– Many specialized views of the collections

• iRODS core extensions for resource agnosticism/fitting in with existing infrastructure
– network transport (RBUDP)
– authentication mechanisms (Kerberos, Shibboleth, GSI, etc.)
– external databases (DataBase Resources - DBRs)
– storage drivers (HPSS, WOS, EC2, etc.)
Interoperability Through Microservices
iRODS provides a structure for implementing custom services
– Rules and microservice modules
– Can be user-defined
– Data-side services: format conversion, extraction, visualization, accounting & reporting, …
– Archival: replication, curation procedures, long-term archival
procedures
– Access: access control policy

– Discoverability: metadata organization and management
– Symbolic links: integrate data from other collections into iRODS
repository
• microservice drivers

– Universal mass storage driver – plug in new protocols
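
The idea of plugging new protocols into a common driver layer can be sketched as follows. This is a minimal illustration under stated assumptions, not the iRODS driver interface: `StorageDriver`, `register_driver`, the `memory` scheme, and the path are all hypothetical names.

```python
from abc import ABC, abstractmethod


class StorageDriver(ABC):
    """Hypothetical driver interface: one implementation per protocol."""

    @abstractmethod
    def put(self, path, data): ...

    @abstractmethod
    def get(self, path): ...


DRIVERS = {}


def register_driver(scheme):
    """Class decorator that plugs a driver in under a protocol name."""
    def wrap(cls):
        DRIVERS[scheme] = cls()
        return cls
    return wrap


@register_driver("memory")
class MemoryDriver(StorageDriver):
    """In-memory stand-in for a real back end such as disk or tape."""

    def __init__(self):
        self._blobs = {}

    def put(self, path, data):
        self._blobs[path] = bytes(data)

    def get(self, path):
        return self._blobs[path]


def store(scheme, path, data):
    """Data-management code stays the same whichever driver is selected."""
    DRIVERS[scheme].put(path, data)


def fetch(scheme, path):
    return DRIVERS[scheme].get(path)


store("memory", "/vault/run1.h5", b"\x89HDF")
print(fetch("memory", "/vault/run1.h5"))
```

Adding a new protocol in this sketch means writing one more decorated class; `store` and `fetch` do not change.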
Interoperability Through Integration with
Existing Infrastructure
• Data management integrated with storage management: OSG,
DDN

• Data management integrated with standard interfaces and services:
– Fedora (librarians)
– DataVerse (social scientists)
– HDF5 (cosmologists)
– NetCDF (NASA climate scientists, NSF earth scientists - hydrologists)

Integration with HDF5
Mike Wan and Peter Cao, 2008

Interactive access to HDF5 files on a remote iRODS server –
browsing of metadata and data sharing with services
• Client access to data (subsets) and metadata in HDF5 files stored remotely; only the requested data and metadata are transferred, not full files
• iRODS microservices and APIs created to support HDF5 functionality on HDF5 objects
• islice – extracts a slice from a FLASH (cosmology) file stored on a remote iRODS server
• Remote viewing of HDF5 iRODS data
• HDFView
– iRODS HDF5 Java objects were added to the HDF-Java products
Integration with NetCDF
Mike Wan, 2012
• Add NETCDF functionalities to iRODS:
– wrap NETCDF APIs into iRODS APIs and micro-services

• New iRODS APIs to wrap basic NETCDF APIs (libnetcdf) and a higher-level libcf subsetting function
– Basic: nc_create, nc_open, nc_close
– Inquiry functions: nc_inq_varid, nc_inq_dimid, nc_inq_dim, nc_inq_var
– Subsetting functions:
nc_get_vars_text, nc_get_vars_string, nc_get_vars_int, nc_get_vars_float,
nc_get_vars_double, …
– Higher-level subsetting function of libcf for CF data: nccf_get_vara

• New NETCDF-based iRODS micro-services
– Allow NETCDF workflows to be performed data-side on the iRODS servers
– One for each of the new APIs, for server-side operations
– 5 micro-services for accessing data elements in the new data structures
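
The `nc_get_vars_*` subsetting functions listed above take a start index, a count, and a stride for each dimension. The sketch below reproduces only that selection logic in pure Python on a flat row-major list, so the subsetting can be followed without a NetCDF installation. The `get_vars` helper and the sample variable are illustrative assumptions; this is not the libnetcdf or iRODS API.

```python
from itertools import product


def get_vars(flat, shape, start, count, stride):
    """Return the strided subset of a row-major array stored as a flat list.

    Along each dimension d, the indices start[d] + k * stride[d] are read
    for k in range(count[d]), as in the start/count/stride convention.
    """
    # Row-major step sizes: the last dimension varies fastest.
    steps = [1] * len(shape)
    for d in range(len(shape) - 2, -1, -1):
        steps[d] = steps[d + 1] * shape[d + 1]

    out = []
    for ks in product(*(range(c) for c in count)):
        index = [s + k * st for s, k, st in zip(start, ks, stride)]
        if any(i >= n for i, n in zip(index, shape)):
            raise IndexError(f"index {index} outside shape {shape}")
        out.append(flat[sum(i * step for i, step in zip(index, steps))])
    return out


# A 4 x 6 variable whose value at (row, col) is 10 * row + col.
shape = (4, 6)
data = [10 * r + c for r in range(4) for c in range(6)]

# Every second row starting at row 1, every third column starting at column 0.
subset = get_vars(data, shape, start=(1, 0), count=(2, 2), stride=(2, 3))
print(subset)  # [10, 13, 30, 33]
```

Running this selection on the server and returning only the four selected values, rather than all 24, is the kind of data-side operation the micro-services are meant to enable.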
iRODS for Interoperability – NASA (NCCS)
Separating metadata from the data object
(from NetCDF files into the iCAT)

Using an iRODS FUSE client
to expose data to the ESG
Data Node

In support of discovery, long term curation,
and reuse/repurposing of the data
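
Separating metadata from the data object, as described above, can be sketched by copying header attributes into a catalogue as attribute-value-unit (AVU) triples, the form iRODS uses for user-defined metadata. Everything else here is an assumption for illustration: the helper names, the header contents, and the paths are hypothetical, and no NetCDF file is actually read.

```python
def header_to_avus(attributes):
    """Turn file-header attributes into (attribute, value, unit) triples.

    `attributes` maps a name to either a plain value or a (value, unit) pair.
    Values are stored as strings in this sketch.
    """
    avus = []
    for name in sorted(attributes):
        entry = attributes[name]
        value, unit = entry if isinstance(entry, tuple) else (entry, "")
        avus.append((name, str(value), unit))
    return avus


def find(metadata_catalog, attribute, value):
    """Discover logical paths by metadata alone, without opening any file."""
    return sorted(path for path, avus in metadata_catalog.items()
                  if any(a == attribute and v == value for a, v, _ in avus))


# Hypothetical header contents, as if extracted from two NetCDF files.
metadata_catalog = {
    "/zone/nccs/tas_2001.nc": header_to_avus(
        {"variable": "tas", "resolution": (0.5, "degrees")}),
    "/zone/nccs/pr_2001.nc": header_to_avus(
        {"variable": "pr", "resolution": (0.5, "degrees")}),
}

print(find(metadata_catalog, "variable", "tas"))
```

Once the attributes sit in the catalogue, discovery queries run against it directly, which is what supports the discovery and reuse goals named on the slide.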
E-iRODS from RENCI – the RedHat Model
• Initial release based on iRODS 3.0
– Tracks community code, with a delay
– Download beta release binaries at http://e-irods.com

• Hardened binary release of iRODS
– Passes continuous integration with back-ported bug fixes from
community trunk
– Packaging and signing: initially RPM and DEB

• Certification
• Documentation
• Subscription Support Contracts – leesa@renci.org for information

Editor's Notes

  1. This is the first mention of microservices… defined in the next slide.