SlideShare a Scribd company logo
1 of 31
Data Management at UNC-CH
(DICE Center)
Reagan W. Moore
Arcot Rajasekar
Mike Conway
Antoine de Torcy
Chien-Yi Hou
Leesa Brieger
Jewel Ward
Terrell Russell
Hao Xu
{moore,sekar,mwan}@diceresearch.org
http://irods.diceresearch.org
Wayne Schroeder
Mike Wan
Bing Zhu
Sheau-Yen Chen
Paul Tooby
Data Management Challenges
• TUCASI data grid
– Shared collections between UNC / NCSU / Duke
• RENCI data grid
– Regional data grid for distributing data (NC orthophotos)
• NCSU / RENCI cloud
– Couple cloud storage to institutional repository
• NCDC data grid
– Data processing pipeline between NCDC and RENCI
• SILS Lifetime Library
– Personal digital collection
• Carolina Digital Repository
– Institutional archive for data collections
• ITS Archive
– Replication across tape-based storage
• NARA Persistent Archive
– Preservation prototype
Common Infrastructure
• The same policy-based data management system is
used to support all of the data management
applications.
– Observe that the primary difference between the
applications is the set of policies and procedures that are
enforced.
– Requires the ability to:
• Turn policies into computer executable rules
• Use rules to control execution of procedures
• Track execution of procedures through stae information
4
4
Policy-based Data Management
integrated Rule Oriented Data System
• Purpose - reason a collection is assembled
• Properties - attributes needed to ensure the purpose
• Policies - controls for enforcing desired properties
• Procedures - functions that implement the policies
• State information - results of applying the procedures
• Assessment criteria - validation that state information conforms to
the desired purpose
• Federation - controlled sharing of logical name spaces
These are the essential elements for policy-based data management
Data Life Cycle
Project
Collection
Private
Local
Policy
Data
Grid
Shared
Distribution
Policy
Digital
Library
Published
Description
Policy
Data
Processing
Pipeline
Analyzed
Service
Policy
Reference
Collection
Preserved
Representation
Policy
Federation
Sustained
Re-purposing
Policy
Stages correspond to addition of new policies for a broader community
Virtualize the stages of the data life cycle through policy evolution
The driving purpose changes at each stage of the data life cycle
Policies
• Data distribution
• Data replication
• Extraction of required metadata
• Integrity verification
• Time-dependent access restrictions
• Versioning
• Deletion control
• Retention/disposition
• Creation of derived data products
7
User w/Client
Can Search, Access, Add and
Manage Data
& Metadata
Access distributed data with Web-based Browser or iRODS GUI or Command Line clients.
Overview of iRODS Architecture
iRODS Data
Server
Disk, Tape, etc.
iRODS
Metadata
Catalog
Track information
iRODS Data System
iRODS Rule
Engine
Tracks Policies
Policy-Based Data Management
• Observe that each application uses a different user
interface
– Requires that the access mechanisms be decoupled from
the data management framework
• Observe that each application uses different types of
storage systems
– Requires that the system interoperate with multiple
storage protocols
• Implemented as infrastructure independence
Infrastructure IndependenceInfrastructure Independence
Storage SystemStorage System
Storage ProtocolStorage Protocol
Access InterfaceAccess Interface
Policy Enforcement PointsPolicy Enforcement Points
Standard Micro-servicesStandard Micro-services
Map from the actions
requested by the client to
multiple policy enforcement
points.
Map from policy to standard
micro-services.
Map from micro-services to
standard Posix I/O operations.
Map from standard Posix I/O
operations to the protocol
supported by the storage
system
Standard I/O OperationsStandard I/O Operations
DataGrid
iput ../src/irm.c checks 10 policy enforcement points
srbbrick14:10900:ApplyRule#116:: acChkHostAccessControl
srbbrick14:10900:GotRule#117:: acChkHostAccessControl
srbbrick14:10900:ApplyRule#118:: acSetPublicUserPolicy
srbbrick14:10900:GotRule#119:: acSetPublicUserPolicy
srbbrick14:10900:ApplyRule#120:: acAclPolicy
srbbrick14:10900:GotRule#121:: acAclPolicy
srbbrick14:10900:ApplyRule#122:: acSetRescSchemeForCreate
srbbrick14:10900:GotRule#123:: acSetRescSchemeForCreate
srbbrick14:10900:execMicroSrvc#124:: msiSetDefaultResc(demoResc,null)
srbbrick14:10900:ApplyRule#125:: acRescQuotaPolicy
srbbrick14:10900:GotRule#126:: acRescQuotaPolicy
srbbrick14:10900:execMicroSrvc#127:: msiSetRescQuotaPolicy(off)
srbbrick14:10900:ApplyRule#128:: acSetVaultPathPolicy
srbbrick14:10900:GotRule#129:: acSetVaultPathPolicy
srbbrick14:10900:execMicroSrvc#130:: msiSetGraftPathScheme(no,1)
srbbrick14:10900:ApplyRule#131:: acPreProcForModifyDataObjMeta
srbbrick14:10900:GotRule#132:: acPreProcForModifyDataObjMeta
srbbrick14:10900:ApplyRule#133:: acPostProcForModifyDataObjMeta
srbbrick14:10900:GotRule#134:: acPostProcForModifyDataObjMeta
srbbrick14:10900:ApplyRule#135:: acPostProcForCreate
srbbrick14:10900:GotRule#136:: acPostProcForCreate
srbbrick14:10900:ApplyRule#137:: acPostProcForPut
srbbrick14:10900:GotRule#138:: acPostProcForPut
srbbrick14:10900:GotRule#139:: acPostProcForPut
srbbrick14:10900:GotRule#140:: acPostProcForPut
Lessons Learned (NARA Prototype)
• Scalability / Reliability / Robustness
– Maintain interactivity through appropriate catalog indexing
– Manage network outages at the client level
– Minimize memory used
• Assessment / Audit trails
– Query metadata to validate properties
– Map audit trails to standard events
– Track actions from policy enforcement to procedure execution
• Federation / Extensibility / Ease of Use
– Replicate metadata and records to a deep archive
– Extensible architecture that enables software evolution
– Provide simplest possible mapping from policies to record series
Level of Sophistication
• How many different access mechanisms are
needed?
• How many different policy enforcement
points should be checked?
• How many standard functions are needed to
implement preservation procedures?
• How many preservation metadata attributes
are needed to track preservation properties?
• How many preservation policies are needed?
Data Grid Clients (35)
API Client Developer
Browser
DCAPE UNC
iExplore DICE-Bing Zhu
JUX IN2P3
Peta Web browser PetaShare
Digital Library
Akubra/iRODS DICE
Dspace MIT
Fedora on Fuse IN2P3
Fedora/iRODS module DICE
Islandora DICE
File System
Davis - Webdav ARCS
Dropbox / iDrop DICE-Mike Conway
FUSE IN2P3, DICE,
FUSE optimization PetaShare
OpenDAP ARCS
PetaFS (Fuse) Petashare - LSU
Petashell (Parrot) PetaShare
Grid
GridFTP - Griffin ARCS
Jsaga IN2P3
Parrot Notre Dame-Doug Thain
Saga KEK
API Client Developer
I/O Libraries
PHP - DICE DICE-Bing Zhu
C API DICE-Mike Wan
C I/O library DICE-Wayne Schroeder
Jargon DICE-Mike Conway
Pyrods - Python SHAMAN-Jerome Fusillier
Portal EnginFrame NICE / RENCI
Preservation
Arch DICE-Mike Conway
Tools
Archive tools-NOAO NOAO
Big Board visualization RENCI
File-format-identifier GA Tech
icommands DICE
Pcommands PetaShare
Resource Monitoring IN2P3
Sync-package Academica Sinica
URSpace Academica Sinica
Web Service
VOSpace NVOA
Shibboleth King's College
Workflows Kepler DICE
Stork LSU
Taverna RENCI
Policy Enforcement Points (71)
ACTION
acCreateUser
acDeleteUser
acGetUserbyDN
acTrashPolicy
acAclPolicy
acSetCreateConditions
acDataDeletePolicy
acRenameLocalZone
acSetRescSchemeForCreate
acRescQuotaPolicy
acSetMultiReplPerResc
acSetNumThreads
acVacuum
acSetResourceList
acSetCopyNumber
acVerifyChecksum
acCreateUserZoneCollections
acDeleteUserZoneCollections
acPurgeFiles
acRegisterData
acGetIcatResults
acSetPublicUserPolicy
acCreateDefaultCollections
acDeleteDefaultCollections
POST-ACTION POLICY
acPostProcForCreateUser
acPostProcForDeleteUser
acPostProcForModifyUser
acPostProcForModifyUserGroup
acPostProcForDelete
acPostProcForCollCreate
acPostProcForRmColl
acPostProcForModifyAVUMetadata
acPostProcForModifyCollMeta
acPostProcForModifyDataObjMeta
acPostProcForModifyAccessControl
acPostProcForOpen
acPostProcForObjRename
acPostProcForCreateResource
acPostProcForDeleteResource
acPostProcForModifyResource
acPostProcForModifyResourceGroup
acPostProcForCreateToken
acPostProcForDeleteToken
acPostProcForFilePathReg
acPostProcForGenQuery
acPostProcForPut
acPostProcForCopy
acPostProcForCreate
PRE-ACTION POLICY
acPreProcForCreateUser
acPreProcForDeleteUser
acPreProcForModifyUser
acPreProcForModifyUserGroup
acChkHostAccessControl
acPreProcForCollCreate
acPreProcForRmColl
acPreProcForModifyAVUMetadata
acPreProcForModifyCollMeta
acPreProcForModifyDataObjMeta
acPreProcForModifyAccessControl
acPreprocForDataObjOpen
acPreProcForObjRename
acPreProcForCreateResource
acPreProcForDeleteResource
acPreProcForModifyResource
acPreProcForModifyResourceGroup
acPreProcForCreateToken
acPreProcForDeleteToken
acNoChkFilePathPerm
acPreProcForGenQuery
acSetReServerNumProc
acSetVaultPathPolicy
Micro-services - How many are needed?
print_hello msiOprDisallowed msiRmColl msiGetValByKey
print_hello_arg msiDataObjCreate msiReplColl msiAddKeyVal
msiVacuum msiDataObjOpen msiCollRepl assign
msiQuota msiDataObjClose msiPhyPathReg ifExec
msiDeleteUnusedAVUs msiDataObjLseek msiObjStat break
msiGoodFailure msiDataObjRead msiDataObjRsync applyAllRules
msiSetResource msiDataObjWrite msiCollRsync msiExecStrCondQuery
msiCheckPermission msiDataObjUnlink msiFreeBuffer msiExecStrCondQueryWithOptions
msiCheckOwner msiDataObjRepl msiNoChkFilePathPerm msiExecGenQuery
msiCreateUser msiDataObjCopy msiNoTrashCan msiMakeQuery
msiCreateCollByAdmin msiExtractNaraMetadata msiSetPublicUserOpr msiMakeGenQuery
msiSendMail msiSetMultiReplPerResc whileExec msiGetMoreRows
recover_print_hello msiAdmChangeCoreIRB forExec msiAddSelectFieldToGenQuery
msiCommit msiAdmShowIRB delayExec msiAddConditionToGenQuery
msiRollback msiAdmShowDVM remoteExec msiPrintGenQueryOutToBuffer
msiDeleteCollByAdmin msiAdmShowFNM forEachExec msiExecCmd
msiDeleteUser msiAdmAppendToTopOfCoreIRB msiSleep msiSetGraftPathScheme
msiAddUserToGroup msiAdmClearAppRuleStruct writeString msiSetRandomScheme
msiSetDefaultResc msiAdmAddAppRuleStruct writeLine msiCheckHostAccessControl
msiSetRescSortScheme msiGetObjType writeBytesBuf msiGetIcatTime
msiSysReplDataObj msiAssociateKeyValuePairsToObj writePosInt msiGetTaggedValueFromString
msiStageDataObj msiExtractTemplateMDFromBuf writeKeyValPairs msiXmsgServerConnect
msiSetDataObjPreferredResc msiReadMDTemplateIntoTagStruct msiGetDiffTime msiXmsgCreateStream
msiSetDataObjAvoidResc msiDataObjPut msiGetSystemTime msiCreateXmsgInp
msiSortDataObj msiDataObjGet msiHumanToSystemTime msiSendXmsg
msiSysChksumDataObj msiDataObjChksum msiStrToBytesBuf msiRcvXmsg
msiSetDataTypeFromExt msiDataObjPhymv msiApplyDCMetadataTemplate msiXmsgServerDisConnect
msiSetNoDirectRescInp msiDataObjRename msiListEnabledMS msiString2KeyValPair
msiSetNumThreads msiDataObjTrim msiSendStdoutAsEmail msiStrArray2String
msiDeleteDisallowed msiCollCreate msiPrintKeyValPair msiRdaToStdout
Micro-services (229)
msiRdaToDataObj msiSetACL msiPropertiesGet msiIsData
msiRdaNoResults msiSetRescQuotaPolicy msiPropertiesSet msiGetCollectionContentsReport
msiRdaCommit msiAdmReadRulesFromFileIntoStruct msiPropertiesExists msiGetCollectionSize
msiAW1 msiAdmInsertRulesFromStructIntoDB msiPropertiesToString msiStructFileBundle
msiRdaRollback msiGetRulesFromDBIntoStruct msiPropertiesFromString msiCollectionSpider
msiRenameLocalZone msiAdmWriteRulesFromStructIntoFile msiRecursiveCollCopy msiFlagDataObjwithAVU
msiRenameCollection writeXMsg msiGetDataObjACL msiFlagInfectedObjs
msiAclPolicy readXMsg msiGetCollectionACL
msiRemoveKeyValuePairsFromObj msiSetReplComment msiGetDataObjAVUs
msiDataObjPutWithOptions msiSetBulkPutPostProcPolicy msiGetDataObjPSmeta
msiDataObjReplWithOptions msiVerifyOwner msiGetCollectionPSmeta
msiDataObjChksumWithOptions msiVerifyAVU msiGetDataObjAIP
msiDataObjGetWithOptions msiVerifyACL msiLoadMetadataFromDataObj
msiSetReServerNumProc msiVerifyExpiry msiExportRecursiveCollMeta
msiGetStdoutInExecCmdOut msiVerifyDataType msiCopyAVUMetadata
msiGetStderrInExecCmdOut msiVerifyFileSizeRange msiGetUserInfo
msiAddKeyValToMspStr msiVerifySubCollOwner msiGetUserACL
msiPrintGenQueryInp msiVerifySubCollACL msiCreateUserAccountsFromDataObj
msiTarFileExtract msiVerifySubCollAVU msiLoadUserModsFromDataObj
msiTarFileCreate msiListFields msiDeleteUsersFromDataObj
msiPhyBundleColl msiHiThere msiLoadACLFromDataObj
msiWriteRodsLog msiTestWritePosInt msiGetAuditTrailInfoByUserID
msiServerMonPerf msiListColl msiGetAuditTrailInfoByObjectID
msiFlushMonStat msiTestForEachExec msiGetAuditTrailInfoByActionID
msiDigestMonStat msiGetFormattedSystemTime msiGetAuditTrailInfoByKeywords
msiSplitPath msiPropertiesNew msiGetAuditTrailInfoByTimeStamp
msiGetSessionVarValue msiPropertiesClear msiSetDataType
msiAutoReplicateService msiPropertiesClone msiGuessDataType
msiDataObjAutoMove msiPropertiesAdd msiMergeDataCopies
msiGetContInxFromGenQueryOut msiPropertiesRemove msiIsColl
State Information Attributes - How Many?
ZONE_ID DATA_ID COLL_MAP_ID META_RESC_GROUP_ATTR_VALUE
ZONE_NAME DATA_COLL_ID COLL_INHERITANCE META_RESC_GROUP_ATTR_UNITS
ZONE_TYPE DATA_NAME COLL_COMMENTS META_RESC_GROUP_ATTR_ID
ZONE_CONNECTION DATA_REPL_NUM COLL_CREATE_TIME META_USER_ATTR_NAME
ZONE_COMMENT DATA_VERSION COLL_MODIFY_TIME META_USER_ATTR_VALUE
ZONE_CREATE_TIME DATA_TYPE_NAME COLL_ACCESS_TYPE META_USER_ATTR_UNITS
ZONE_MODIFY_TIME DATA_SIZE COLL_ACCESS_NAME META_USER_ATTR_ID
USER_ID DATA_RESC_GROUP_NAME COLL_TOKEN_NAMESPACE RESC_GROUP_RESC_ID
USER_NAME DATA_RESC_NAME COLL_ACCESS_USER_ID RESC_GROUP_NAME
USER_TYPE DATA_PATH COLL_ACCESS_COLL_ID RESC_GROUP_ID
USER_ZONE DATA_OWNER_NAME META_DATA_ATTR_NAME USER_GROUP_ID
USER_DN DATA_OWNER_ZONE META_DATA_ATTR_VALUE USER_GROUP_NAME
USER_INFO DATA_REPL_STATUS META_DATA_ATTR_UNITS RULE_EXEC_ID
USER_COMMENT DATA_STATUS META_DATA_ATTR_ID RULE_EXEC_NAME
USER_CREATE_TIME DATA_CHECKSUM META_DATA_CREATE_TIME RULE_EXEC_REI_FILE_PATH
USER_MODIFY_TIME DATA_EXPIRY META_DATA_MODIFY_TIME RULE_EXEC_USER_NAME
RESC_ID DATA_MAP_ID META_COLL_ATTR_NAME RULE_EXEC_ADDRESS
RESC_NAME DATA_COMMENTS META_COLL_ATTR_VALUE RULE_EXEC_TIME
RESC_ZONE_NAME DATA_CREATE_TIME META_COLL_ATTR_UNITS RULE_EXEC_FREQUENCY
RESC_TYPE_NAME DATA_MODIFY_TIME META_COLL_ATTR_ID RULE_EXEC_PRIORITY
RESC_CLASS_NAME DATA_ACCESS_TYPE META_NAMESPACE_COLL RULE_EXEC_ESTIMATED_EXE_TIME
RESC_LOC DATA_ACCESS_NAME META_NAMESPACE_DATA RULE_EXEC_NOTIFICATION_ADDR
RESC_VAULT_PATH DATA_TOKEN_NAMESPACE META_NAMESPACE_RESC RULE_EXEC_LAST_EXE_TIME
RESC_FREE_SPACE DATA_ACCESS_USER_ID META_NAMESPACE_USER RULE_EXEC_STATUS
RESC_FREE_SPACE_TIME DATA_ACCESS_DATA_ID META_NAMESPACE_RESC_GROUP TOKEN_NAMESPACE
RESC_INFO COLL_ID META_RESC_ATTR_NAME TOKEN_ID
RESC_COMMENT COLL_NAME META_RESC_ATTR_VALUE TOKEN_NAME
RESC_CREATE_TIME COLL_PARENT_NAME META_RESC_ATTR_UNITS TOKEN_VALUE
RESC_MODIFY_TIME COLL_OWNER_NAME META_RESC_ATTR_ID TOKEN_VALUE2
RESC_STATUS COLL_OWNER_ZONE META_RESC_GROUP_ATTR_NAME TOKEN_VALUE3
State Information Attributes (205)
TOKEN_COMMENT RULE_BASE_NAME DVM_MODIFY_TIME
AUDIT_OBJ_ID RULE_NAME FNM_ID
AUDIT_USER_ID RULE_EVENT FNM_VERSION
AUDIT_ACTION_ID RULE_CONDITION FNM_BASE_NAME
AUDIT_COMMENT RULE_BODY FNM_EXT_FUNC_NAME
AUDIT_CREATE_TIME RULE_RECOVERY FNM_INT_FUNC_NAME
AUDIT_MODIFY_TIME RULE_STATUS FNM_STATUS
SL_HOST_NAME RULE_OWNER_NAME FNM_OWNER_NAME
SL_RESC_NAME RULE_OWNER_ZONE FNM_OWNER_ZONE
SL_CPU_USED RULE_DESCR_1 FNM_COMMENT
SL_MEM_USED RULE_DESCR_2 FNM_CREATE_TIME
SL_SWAP_USED RULE_INPUT_PARAMS FNM_MODIFY_TIME
SL_RUNQ_LOAD RULE_OUTPUT_PARAMS QUOTA_USER_ID
SL_DISK_SPACE RULE_DOLLAR_VARS QUOTA_RESC_ID
SL_NET_INPUT RULE_ICAT_ELEMENTS QUOTA_LIMIT
SL_NET_OUTPUT RULE_SIDEEFFECTS QUOTA_OVER
SL_CREATE_TIME RULE_COMMENT QUOTA_MODIFY_TIME
SLD_RESC_NAME RULE_CREATE_TIME QUOTA_USAGE_USER_ID
SLD_LOAD_FACTOR RULE_MODIFY_TIME QUOTA_USAGE_RESC_ID
SLD_CREATE_TIME DVM_ID QUOTA_USAGE
RULE_BASE_MAP_VERSION DVM_VERSION QUOTA_USAGE_MODIFY_TIME
RULE_BASE_MAP_PRIORITY DVM_BASE_NAME QUOTA_USER_NAME
RULE_BASE_MAP_BASE_NAME DVM_EXT_VAR_NAME QUOTA_USER_ZONE
RULE_BASE_MAP_OWNER_NAME DVM_CONDITION QUOTA_USER_TYPE
RULE_BASE_MAP_OWNER_ZONE DVM_INT_MAP_PATH QUOTA_RESC_NAME
RULE_BASE_MAP_COMMENT DVM_STATUS
RULE_BASE_MAP_CREATE_TIME DVM_OWNER_NAME
RULE_BASE_MAP_MODIFY_TIME DVM_OWNER_ZONE
RULE_ID DVM_COMMENT
RULE_VERSION DVM_CREATE_TIME
Example Disposition Procedure
acPurgeFiles()
{ msiGetIcatTime(*Time,unix);
acGetIcatResults(remove,DATA_EXPIRY < '*Time',*List);
forEachExec(*List)
{ msiDataObjUnlink(*List,*Status);
msiGetValByKey(*List,DATA_NAME,*D);
msiGetValByKey(*List,COLL_NAME,*E);
writeLine(stdout,Purged File *E/*D at *Time );
}
}
Create and Apply Policies
• Maintain a repository of rules in iRODS
– Rules characterized as XML files with input parameters explicitly
characterized as rule parameters
• Build policy template and save in policy repository
– Select which rule sets will be applied
• Map policy template to a record series (collection)
– Add policy parameters as metadata on the record series
• On ingestion of a file into the record series:
– Retrieve policies and parameters from the record series metadata
– Invoke associated rules and apply procedures
• This decouples the choice of access interface from the
execution of your policies
iDrop Client - Ingesting a Record
QuickTime™ and a
decompressor
are needed to see this picture.
Data Grid Record Series
Arch - Defining a Policy
QuickTime™ and a
decompressor
are needed to see this picture.
Policies on a Record Series
QuickTime™ and a
decompressor
are needed to see this picture.
iDrop Interface
QuickTime™ and a
decompressor
are needed to see this picture.
Record Transfer Status
QuickTime™ and a
decompressor
are needed to see this picture.
Tracking Policies on Record Series
QuickTime™ and a
decompressor
are needed to see this picture.
iRODS Distributed Data Management
Supported Storage Systems
• File systems - Windows, Linux, Mac
• Tape archives - HPSS, Sam-QFS
• Repositories - Flickr, Web sites
• Cloud storage- Amazon S3, EC2
• Relational database - PostgreSQL, Oracle, mySQL
• Under development
– Table driven resources
Modular Architecture
• Authentication
– GSI (PKI), Kerberos, Shibboleth, Challenge-response
• Authorization
– Roles, user groups, resource groups, policy constraints, ACLs
• Transport
– TCP/IP (parallel I/O streams), Reliable Blast UDP
• Metadata catalog
– PostgreSQL, mySQL, Oracle
• Distributed rule engine
– Scheduler, messaging system, execution engine, rule base
• Scales to 200 million files, multiple petabytes of data
Enforce NSF Data Management Plan
• Policy for selecting data for organization into a collection
• Policy for choice of descriptive metadata
• Policy for data sharing (how long the data are held proprietary)
• Choice of access services provided for the data
• Policy for creation of a reference collection (with data formats
and analysis procedures)
• Policy for retention and disposition
• Policy for commitment for storage support by project /
department / institution
31
iRODS is a "coordinated NSF/OCI-Nat'l Archives research activity" under the
auspices of the President's NITRD Program and is identified as among the priorities
underlying the President's 2011 Budget Supplement in the area of Human and
Computer Interaction Information Management technology research.
Reagan W. Moore
rwmoore@renci.org
http://irods.diceresearch.org
NSF OCI-0848296 “NARA Transcontinental Persistent Archives Prototype”
NSF SDCI-0721400 “Data Grids for Community Driven Applications”

More Related Content

What's hot

Hitachi NAS Platform 4000 Series Datasheet
Hitachi NAS Platform 4000 Series DatasheetHitachi NAS Platform 4000 Series Datasheet
Hitachi NAS Platform 4000 Series DatasheetHitachi Vantara
 
Hitachi Unified Storage and Hitachi NAS Platform 4000 Series -- Datasheet
Hitachi Unified Storage and Hitachi NAS Platform 4000 Series -- DatasheetHitachi Unified Storage and Hitachi NAS Platform 4000 Series -- Datasheet
Hitachi Unified Storage and Hitachi NAS Platform 4000 Series -- DatasheetHitachi Vantara
 
BFC: High-Performance Distributed Big-File Cloud Storage Based On Key-Value S...
BFC: High-Performance Distributed Big-File Cloud Storage Based On Key-Value S...BFC: High-Performance Distributed Big-File Cloud Storage Based On Key-Value S...
BFC: High-Performance Distributed Big-File Cloud Storage Based On Key-Value S...dbpublications
 
Data Analytics: HDFS with Big Data : Issues and Application
Data Analytics:  HDFS  with  Big Data :  Issues and ApplicationData Analytics:  HDFS  with  Big Data :  Issues and Application
Data Analytics: HDFS with Big Data : Issues and ApplicationDr. Chitra Dhawale
 
iRODS/Dataverse Project by Jonathan Crabtree
iRODS/Dataverse Project by Jonathan CrabtreeiRODS/Dataverse Project by Jonathan Crabtree
iRODS/Dataverse Project by Jonathan Crabtreedatascienceiqss
 
Csci12 report aug18
Csci12 report aug18Csci12 report aug18
Csci12 report aug18karenostil
 
Research methods group accelarating impact by sharing data
Research methods group  accelarating impact by sharing dataResearch methods group  accelarating impact by sharing data
Research methods group accelarating impact by sharing dataWorld Agroforestry (ICRAF)
 
Access HDF-EOS data with OGC Web Coverage Service - Earth Observation Applica...
Access HDF-EOS data with OGC Web Coverage Service - Earth Observation Applica...Access HDF-EOS data with OGC Web Coverage Service - Earth Observation Applica...
Access HDF-EOS data with OGC Web Coverage Service - Earth Observation Applica...The HDF-EOS Tools and Information Center
 

What's hot (10)

238 02 staples
238 02 staples238 02 staples
238 02 staples
 
Hitachi NAS Platform 4000 Series Datasheet
Hitachi NAS Platform 4000 Series DatasheetHitachi NAS Platform 4000 Series Datasheet
Hitachi NAS Platform 4000 Series Datasheet
 
Hitachi Unified Storage and Hitachi NAS Platform 4000 Series -- Datasheet
Hitachi Unified Storage and Hitachi NAS Platform 4000 Series -- DatasheetHitachi Unified Storage and Hitachi NAS Platform 4000 Series -- Datasheet
Hitachi Unified Storage and Hitachi NAS Platform 4000 Series -- Datasheet
 
BFC: High-Performance Distributed Big-File Cloud Storage Based On Key-Value S...
BFC: High-Performance Distributed Big-File Cloud Storage Based On Key-Value S...BFC: High-Performance Distributed Big-File Cloud Storage Based On Key-Value S...
BFC: High-Performance Distributed Big-File Cloud Storage Based On Key-Value S...
 
Data Analytics: HDFS with Big Data : Issues and Application
Data Analytics:  HDFS  with  Big Data :  Issues and ApplicationData Analytics:  HDFS  with  Big Data :  Issues and Application
Data Analytics: HDFS with Big Data : Issues and Application
 
iRODS/Dataverse Project by Jonathan Crabtree
iRODS/Dataverse Project by Jonathan CrabtreeiRODS/Dataverse Project by Jonathan Crabtree
iRODS/Dataverse Project by Jonathan Crabtree
 
Csci12 report aug18
Csci12 report aug18Csci12 report aug18
Csci12 report aug18
 
Research methods group accelarating impact by sharing data
Research methods group  accelarating impact by sharing dataResearch methods group  accelarating impact by sharing data
Research methods group accelarating impact by sharing data
 
IBM System Storage N3000 Express
IBM System Storage N3000 ExpressIBM System Storage N3000 Express
IBM System Storage N3000 Express
 
Access HDF-EOS data with OGC Web Coverage Service - Earth Observation Applica...
Access HDF-EOS data with OGC Web Coverage Service - Earth Observation Applica...Access HDF-EOS data with OGC Web Coverage Service - Earth Observation Applica...
Access HDF-EOS data with OGC Web Coverage Service - Earth Observation Applica...
 

Viewers also liked

Go go gadgets! Implementing a technology collection for staff use
Go go gadgets! Implementing a technology collection for staff useGo go gadgets! Implementing a technology collection for staff use
Go go gadgets! Implementing a technology collection for staff useGary Wilhelm
 
Digital storytelling reflection
Digital storytelling reflectionDigital storytelling reflection
Digital storytelling reflectionMegan Smith
 
Using Dataverse Virtual Archive Technology for Research Data Management
Using Dataverse Virtual Archive Technology for Research Data ManagementUsing Dataverse Virtual Archive Technology for Research Data Management
Using Dataverse Virtual Archive Technology for Research Data ManagementGary Wilhelm
 
Network Attached Storage Initiative
Network Attached Storage InitiativeNetwork Attached Storage Initiative
Network Attached Storage InitiativeGary Wilhelm
 
Transitions and Extensions – What Schools Have Learned from Sakai Migrations
Transitions and Extensions – What Schools Have Learned from Sakai Migrations Transitions and Extensions – What Schools Have Learned from Sakai Migrations
Transitions and Extensions – What Schools Have Learned from Sakai Migrations Gary Wilhelm
 
Supporting your remote clients with bomgar
Supporting your remote clients with bomgarSupporting your remote clients with bomgar
Supporting your remote clients with bomgarGary Wilhelm
 
Network Attached Storage (NAS) Initiative
Network Attached Storage (NAS) Initiative Network Attached Storage (NAS) Initiative
Network Attached Storage (NAS) Initiative Gary Wilhelm
 

Viewers also liked (9)

Go go gadgets! Implementing a technology collection for staff use
Go go gadgets! Implementing a technology collection for staff useGo go gadgets! Implementing a technology collection for staff use
Go go gadgets! Implementing a technology collection for staff use
 
Digital storytelling reflection
Digital storytelling reflectionDigital storytelling reflection
Digital storytelling reflection
 
Using Dataverse Virtual Archive Technology for Research Data Management
Using Dataverse Virtual Archive Technology for Research Data ManagementUsing Dataverse Virtual Archive Technology for Research Data Management
Using Dataverse Virtual Archive Technology for Research Data Management
 
Taller 4 ahiver
Taller 4 ahiverTaller 4 ahiver
Taller 4 ahiver
 
Network Attached Storage Initiative
Network Attached Storage InitiativeNetwork Attached Storage Initiative
Network Attached Storage Initiative
 
Transitions and Extensions – What Schools Have Learned from Sakai Migrations
Transitions and Extensions – What Schools Have Learned from Sakai Migrations Transitions and Extensions – What Schools Have Learned from Sakai Migrations
Transitions and Extensions – What Schools Have Learned from Sakai Migrations
 
Supporting your remote clients with bomgar
Supporting your remote clients with bomgarSupporting your remote clients with bomgar
Supporting your remote clients with bomgar
 
Traits
TraitsTraits
Traits
 
Network Attached Storage (NAS) Initiative
Network Attached Storage (NAS) Initiative Network Attached Storage (NAS) Initiative
Network Attached Storage (NAS) Initiative
 

Similar to Building Cyber-infrastructure at UNC-CH

AWS re:Invent 2016: FINRA: Building a Secure Data Science Platform on AWS (BD...
AWS re:Invent 2016: FINRA: Building a Secure Data Science Platform on AWS (BD...AWS re:Invent 2016: FINRA: Building a Secure Data Science Platform on AWS (BD...
AWS re:Invent 2016: FINRA: Building a Secure Data Science Platform on AWS (BD...Amazon Web Services
 
ERA CoBioTech Data Management Webinar
ERA CoBioTech Data Management WebinarERA CoBioTech Data Management Webinar
ERA CoBioTech Data Management WebinarFAIRDOM
 
Moore RDAP11 Policy-based Data Management
Moore RDAP11 Policy-based Data ManagementMoore RDAP11 Policy-based Data Management
Moore RDAP11 Policy-based Data ManagementASIS&T
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureDatabricks
 
Utilising Cloud Computing for Research through Infrastructure, Software and D...
Utilising Cloud Computing for Research through Infrastructure, Software and D...Utilising Cloud Computing for Research through Infrastructure, Software and D...
Utilising Cloud Computing for Research through Infrastructure, Software and D...David Wallom
 
Colorado Springs Open Source Hadoop/MySQL
Colorado Springs Open Source Hadoop/MySQL Colorado Springs Open Source Hadoop/MySQL
Colorado Springs Open Source Hadoop/MySQL David Smelker
 
Supporting Research through "Desktop as a Service" models of e-infrastructure...
Supporting Research through "Desktop as a Service" models of e-infrastructure...Supporting Research through "Desktop as a Service" models of e-infrastructure...
Supporting Research through "Desktop as a Service" models of e-infrastructure...David Wallom
 
Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?Denodo
 
Big data and cloud computing 9 sep-2017
Big data and cloud computing 9 sep-2017Big data and cloud computing 9 sep-2017
Big data and cloud computing 9 sep-2017Dr. Anita Goel
 
Lecture-1.ppt
Lecture-1.pptLecture-1.ppt
Lecture-1.pptChSheraz3
 
Policy-based Data Management
Policy-based Data Management Policy-based Data Management
Policy-based Data Management Gary Wilhelm
 
#MFSummit2016 Operate: The race for space
#MFSummit2016 Operate: The race for space#MFSummit2016 Operate: The race for space
#MFSummit2016 Operate: The race for spaceMicro Focus
 
Research Data (and Software) Management at Imperial: (Everything you need to ...
Research Data (and Software) Management at Imperial: (Everything you need to ...Research Data (and Software) Management at Imperial: (Everything you need to ...
Research Data (and Software) Management at Imperial: (Everything you need to ...Sarah Anna Stewart
 
Mastering Cloud Data Cost Control: A FinOps Approach
Mastering Cloud Data Cost Control: A FinOps ApproachMastering Cloud Data Cost Control: A FinOps Approach
Mastering Cloud Data Cost Control: A FinOps ApproachDenodo
 
Lunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps Approach
Lunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps ApproachLunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps Approach
Lunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps ApproachDenodo
 
OpenStack Swift In the Enterprise
OpenStack Swift In the EnterpriseOpenStack Swift In the Enterprise
OpenStack Swift In the EnterpriseHostway|HOSTING
 

Similar to Building Cyber-infrastructure at UNC-CH (20)

AWS re:Invent 2016: FINRA: Building a Secure Data Science Platform on AWS (BD...
AWS re:Invent 2016: FINRA: Building a Secure Data Science Platform on AWS (BD...AWS re:Invent 2016: FINRA: Building a Secure Data Science Platform on AWS (BD...
AWS re:Invent 2016: FINRA: Building a Secure Data Science Platform on AWS (BD...
 
ERA CoBioTech Data Management Webinar
ERA CoBioTech Data Management WebinarERA CoBioTech Data Management Webinar
ERA CoBioTech Data Management Webinar
 
Moore RDAP11 Policy-based Data Management
Moore RDAP11 Policy-based Data ManagementMoore RDAP11 Policy-based Data Management
Moore RDAP11 Policy-based Data Management
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh Architecture
 
Presentation 16 may keynote karin bredenberg
Presentation 16 may keynote karin bredenbergPresentation 16 may keynote karin bredenberg
Presentation 16 may keynote karin bredenberg
 
Utilising Cloud Computing for Research through Infrastructure, Software and D...
Utilising Cloud Computing for Research through Infrastructure, Software and D...Utilising Cloud Computing for Research through Infrastructure, Software and D...
Utilising Cloud Computing for Research through Infrastructure, Software and D...
 
Colorado Springs Open Source Hadoop/MySQL
Colorado Springs Open Source Hadoop/MySQL Colorado Springs Open Source Hadoop/MySQL
Colorado Springs Open Source Hadoop/MySQL
 
Supporting Research through "Desktop as a Service" models of e-infrastructure...
Supporting Research through "Desktop as a Service" models of e-infrastructure...Supporting Research through "Desktop as a Service" models of e-infrastructure...
Supporting Research through "Desktop as a Service" models of e-infrastructure...
 
Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?
 
Big data and cloud computing 9 sep-2017
Big data and cloud computing 9 sep-2017Big data and cloud computing 9 sep-2017
Big data and cloud computing 9 sep-2017
 
Lecture-1.ppt
Lecture-1.pptLecture-1.ppt
Lecture-1.ppt
 
Policy-based Data Management
Policy-based Data Management Policy-based Data Management
Policy-based Data Management
 
#MFSummit2016 Operate: The race for space
#MFSummit2016 Operate: The race for space#MFSummit2016 Operate: The race for space
#MFSummit2016 Operate: The race for space
 
Information Systems
Information SystemsInformation Systems
Information Systems
 
Research Data (and Software) Management at Imperial: (Everything you need to ...
Research Data (and Software) Management at Imperial: (Everything you need to ...Research Data (and Software) Management at Imperial: (Everything you need to ...
Research Data (and Software) Management at Imperial: (Everything you need to ...
 
Mastering Cloud Data Cost Control: A FinOps Approach
Mastering Cloud Data Cost Control: A FinOps ApproachMastering Cloud Data Cost Control: A FinOps Approach
Mastering Cloud Data Cost Control: A FinOps Approach
 
Lunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps Approach
Lunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps ApproachLunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps Approach
Lunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps Approach
 
Introduction to data management, terminologies and use of data management pla...
Introduction to data management, terminologies and use of data management pla...Introduction to data management, terminologies and use of data management pla...
Introduction to data management, terminologies and use of data management pla...
 
Data Domain-Driven Design
Data Domain-Driven DesignData Domain-Driven Design
Data Domain-Driven Design
 
OpenStack Swift In the Enterprise
OpenStack Swift In the EnterpriseOpenStack Swift In the Enterprise
OpenStack Swift In the Enterprise
 

More from Gary Wilhelm

IPv6: We Care So You Don't Have To
IPv6: We Care So You Don't Have ToIPv6: We Care So You Don't Have To
IPv6: We Care So You Don't Have ToGary Wilhelm
 
Virtualization and you: where are we?
Virtualization and you: where are we?Virtualization and you: where are we?
Virtualization and you: where are we?Gary Wilhelm
 
Online Copyright Education
Online Copyright EducationOnline Copyright Education
Online Copyright EducationGary Wilhelm
 
Increasing Utilization of Software Site Licenses
Increasing Utilization of Software Site LicensesIncreasing Utilization of Software Site Licenses
Increasing Utilization of Software Site LicensesGary Wilhelm
 
Leveraging Centralized IT Support Services as a First Point of Contact
Leveraging Centralized IT Support Services as a First Point of ContactLeveraging Centralized IT Support Services as a First Point of Contact
Leveraging Centralized IT Support Services as a First Point of ContactGary Wilhelm
 
S#$% My Network Says (CTC Retreat 2010)
S#$% My Network Says (CTC Retreat 2010)S#$% My Network Says (CTC Retreat 2010)
S#$% My Network Says (CTC Retreat 2010)Gary Wilhelm
 

More from Gary Wilhelm (7)

IPv6: We Care So You Don't Have To
IPv6: We Care So You Don't Have ToIPv6: We Care So You Don't Have To
IPv6: We Care So You Don't Have To
 
After the Breach
After the BreachAfter the Breach
After the Breach
 
Virtualization and you: where are we?
Virtualization and you: where are we?Virtualization and you: where are we?
Virtualization and you: where are we?
 
Online Copyright Education
Online Copyright EducationOnline Copyright Education
Online Copyright Education
 
Increasing Utilization of Software Site Licenses
Increasing Utilization of Software Site LicensesIncreasing Utilization of Software Site Licenses
Increasing Utilization of Software Site Licenses
 
Leveraging Centralized IT Support Services as a First Point of Contact
Leveraging Centralized IT Support Services as a First Point of ContactLeveraging Centralized IT Support Services as a First Point of Contact
Leveraging Centralized IT Support Services as a First Point of Contact
 
S#$% My Network Says (CTC Retreat 2010)
S#$% My Network Says (CTC Retreat 2010)S#$% My Network Says (CTC Retreat 2010)
S#$% My Network Says (CTC Retreat 2010)
 

Recently uploaded

Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 

Recently uploaded (20)

Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 

Building Cyber-infrastructure at UNC-CH

  • 1. Data Management at UNC-CH (DICE Center) Reagan W. Moore Arcot Rajasekar Mike Conway Antoine de Torcy Chien-Yi Hou Leesa Brieger Jewel Ward Terrell Russell Hao Xu {moore,sekar,mwan}@diceresearch.org http://irods.diceresearch.org Wayne Schroeder Mike Wan Bing Zhu Sheau-Yen Chen Paul Tooby
  • 2. Data Management Challenges • TUCASI data grid – Shared collections between UNC / NCSU / Duke • RENCI data grid – Regional data grid for distributing data (NC orthophotos) • NCSU / RENCI cloud – Couple cloud storage to institutional repository • NCDC data grid – Data processing pipeline between NCDC and RENCI • SILS Lifetime Library – Personal digital collection • Carolina Digital Repository – Institutional archive for data collections • ITS Archive – Replication across tape-based storage • NARA Persistent Archive – Preservation prototype
  • 3. Common Infrastructure • The same policy-based data management system is used to support all of the data management applications. – Observe that the primary difference between the applications is the set of policies and procedures that are enforced. – Requires the ability to: • Turn policies into computer executable rules • Use rules to control execution of procedures • Track execution of procedures through stae information
  • 4. 4 4 Policy-based Data Management integrated Rule Oriented Data System • Purpose - reason a collection is assembled • Properties - attributes needed to ensure the purpose • Policies - controls for enforcing desired properties • Procedures - functions that implement the policies • State information - results of applying the procedures • Assessment criteria - validation that state information conforms to the desired purpose • Federation - controlled sharing of logical name spaces These are the essential elements for policy-based data management
  • 5. Data Life Cycle Project Collection Private Local Policy Data Grid Shared Distribution Policy Digital Library Published Description Policy Data Processing Pipeline Analyzed Service Policy Reference Collection Preserved Representation Policy Federation Sustained Re-purposing Policy Stages correspond to addition of new policies for a broader community Virtualize the stages of the data life cycle through policy evolution The driving purpose changes at each stage of the data life cycle
  • 6. Policies • Data distribution • Data replication • Extraction of required metadata • Integrity verification • Time-dependent access restrictions • Versioning • Deletion control • Retention/disposition • Creation of derived data products
  • 7. 7 User w/Client Can Search, Access, Add and Manage Data & Metadata Access distributed data with Web-based Browser or iRODS GUI or Command Line clients. Overview of iRODS Architecture iRODS Data Server Disk, Tape, etc. iRODS Metadata Catalog Track information iRODS Data System iRODS Rule Engine Tracks Policies
  • 8. Policy-Based Data Management • Observe that each application uses a different user interface – Requires that the access mechanisms be decoupled from the data management framework • Observe that each application uses different types of storage systems – Requires that the system interoperate with multiple storage protocols • Implemented as infrastructure independence
  • 9. Infrastructure IndependenceInfrastructure Independence Storage SystemStorage System Storage ProtocolStorage Protocol Access InterfaceAccess Interface Policy Enforcement PointsPolicy Enforcement Points Standard Micro-servicesStandard Micro-services Map from the actions requested by the client to multiple policy enforcement points. Map from policy to standard micro-services. Map from micro-services to standard Posix I/O operations. Map from standard Posix I/O operations to the protocol supported by the storage system Standard I/O OperationsStandard I/O Operations DataGrid
  • 10. iput ../src/irm.c checks 10 policy enforcement points srbbrick14:10900:ApplyRule#116:: acChkHostAccessControl srbbrick14:10900:GotRule#117:: acChkHostAccessControl srbbrick14:10900:ApplyRule#118:: acSetPublicUserPolicy srbbrick14:10900:GotRule#119:: acSetPublicUserPolicy srbbrick14:10900:ApplyRule#120:: acAclPolicy srbbrick14:10900:GotRule#121:: acAclPolicy srbbrick14:10900:ApplyRule#122:: acSetRescSchemeForCreate srbbrick14:10900:GotRule#123:: acSetRescSchemeForCreate srbbrick14:10900:execMicroSrvc#124:: msiSetDefaultResc(demoResc,null) srbbrick14:10900:ApplyRule#125:: acRescQuotaPolicy srbbrick14:10900:GotRule#126:: acRescQuotaPolicy srbbrick14:10900:execMicroSrvc#127:: msiSetRescQuotaPolicy(off) srbbrick14:10900:ApplyRule#128:: acSetVaultPathPolicy srbbrick14:10900:GotRule#129:: acSetVaultPathPolicy srbbrick14:10900:execMicroSrvc#130:: msiSetGraftPathScheme(no,1) srbbrick14:10900:ApplyRule#131:: acPreProcForModifyDataObjMeta srbbrick14:10900:GotRule#132:: acPreProcForModifyDataObjMeta srbbrick14:10900:ApplyRule#133:: acPostProcForModifyDataObjMeta srbbrick14:10900:GotRule#134:: acPostProcForModifyDataObjMeta srbbrick14:10900:ApplyRule#135:: acPostProcForCreate srbbrick14:10900:GotRule#136:: acPostProcForCreate srbbrick14:10900:ApplyRule#137:: acPostProcForPut srbbrick14:10900:GotRule#138:: acPostProcForPut srbbrick14:10900:GotRule#139:: acPostProcForPut srbbrick14:10900:GotRule#140:: acPostProcForPut
  • 11. Lessons Learned (NARA Prototype) • Scalability / Reliability / Robustness – Maintain interactivity through appropriate catalog indexing – Manage network outages at the client level – Minimize memory used • Assessment / Audit trails – Query metadata to validate properties – Map audit trails to standard events – Track actions from policy enforcement to procedure execution • Federation / Extensibility / Ease of Use – Replicate metadata and records to a deep archive – Extensible architecture that enables software evolution – Provide simplest possible mapping from policies to record series
  • 12. Level of Sophistication • How many different access mechanisms are needed? • How many different policy enforcement points should be checked? • How many standard functions are needed to implement preservation procedures? • How many preservation metadata attributes are needed to track preservation properties? • How many preservation policies are needed?
  • 13. Data Grid Clients (35) API Client Developer Browser DCAPE UNC iExplore DICE-Bing Zhu JUX IN2P3 Peta Web browser PetaShare Digital Library Akubra/iRODS DICE Dspace MIT Fedora on Fuse IN2P3 Fedora/iRODS module DICE Islandora DICE File System Davis - Webdav ARCS Dropbox / iDrop DICE-Mike Conway FUSE IN2P3, DICE, FUSE optimization PetaShare OpenDAP ARCS PetaFS (Fuse) Petashare - LSU Petashell (Parrot) PetaShare Grid GridFTP - Griffin ARCS Jsaga IN2P3 Parrot Notre Dame-Doug Thain Saga KEK API Client Developer I/O Libraries PHP - DICE DICE-Bing Zhu C API DICE-Mike Wan C I/O library DICE-Wayne Schroeder Jargon DICE-Mike Conway Pyrods - Python SHAMAN-Jerome Fusillier Portal EnginFrame NICE / RENCI Preservation Arch DICE-Mike Conway Tools Archive tools-NOAO NOAO Big Board visualization RENCI File-format-identifier GA Tech icommands DICE Pcommands PetaShare Resource Monitoring IN2P3 Sync-package Academica Sinica URSpace Academica Sinica Web Service VOSpace NVOA Shibboleth King's College Workflows Kepler DICE Stork LSU Taverna RENCI
  • 14. Policy Enforcement Points (71) ACTION acCreateUser acDeleteUser acGetUserbyDN acTrashPolicy acAclPolicy acSetCreateConditions acDataDeletePolicy acRenameLocalZone acSetRescSchemeForCreate acRescQuotaPolicy acSetMultiReplPerResc acSetNumThreads acVacuum acSetResourceList acSetCopyNumber acVerifyChecksum acCreateUserZoneCollections acDeleteUserZoneCollections acPurgeFiles acRegisterData acGetIcatResults acSetPublicUserPolicy acCreateDefaultCollections acDeleteDefaultCollections POST-ACTION POLICY acPostProcForCreateUser acPostProcForDeleteUser acPostProcForModifyUser acPostProcForModifyUserGroup acPostProcForDelete acPostProcForCollCreate acPostProcForRmColl acPostProcForModifyAVUMetadata acPostProcForModifyCollMeta acPostProcForModifyDataObjMeta acPostProcForModifyAccessControl acPostProcForOpen acPostProcForObjRename acPostProcForCreateResource acPostProcForDeleteResource acPostProcForModifyResource acPostProcForModifyResourceGroup acPostProcForCreateToken acPostProcForDeleteToken acPostProcForFilePathReg acPostProcForGenQuery acPostProcForPut acPostProcForCopy acPostProcForCreate PRE-ACTION POLICY acPreProcForCreateUser acPreProcForDeleteUser acPreProcForModifyUser acPreProcForModifyUserGroup acChkHostAccessControl acPreProcForCollCreate acPreProcForRmColl acPreProcForModifyAVUMetadata acPreProcForModifyCollMeta acPreProcForModifyDataObjMeta acPreProcForModifyAccessControl acPreprocForDataObjOpen acPreProcForObjRename acPreProcForCreateResource acPreProcForDeleteResource acPreProcForModifyResource acPreProcForModifyResourceGroup acPreProcForCreateToken acPreProcForDeleteToken acNoChkFilePathPerm acPreProcForGenQuery acSetReServerNumProc acSetVaultPathPolicy
  • 15. Micro-services - How many are needed? print_hello msiOprDisallowed msiRmColl msiGetValByKey print_hello_arg msiDataObjCreate msiReplColl msiAddKeyVal msiVacuum msiDataObjOpen msiCollRepl assign msiQuota msiDataObjClose msiPhyPathReg ifExec msiDeleteUnusedAVUs msiDataObjLseek msiObjStat break msiGoodFailure msiDataObjRead msiDataObjRsync applyAllRules msiSetResource msiDataObjWrite msiCollRsync msiExecStrCondQuery msiCheckPermission msiDataObjUnlink msiFreeBuffer msiExecStrCondQueryWithOptions msiCheckOwner msiDataObjRepl msiNoChkFilePathPerm msiExecGenQuery msiCreateUser msiDataObjCopy msiNoTrashCan msiMakeQuery msiCreateCollByAdmin msiExtractNaraMetadata msiSetPublicUserOpr msiMakeGenQuery msiSendMail msiSetMultiReplPerResc whileExec msiGetMoreRows recover_print_hello msiAdmChangeCoreIRB forExec msiAddSelectFieldToGenQuery msiCommit msiAdmShowIRB delayExec msiAddConditionToGenQuery msiRollback msiAdmShowDVM remoteExec msiPrintGenQueryOutToBuffer msiDeleteCollByAdmin msiAdmShowFNM forEachExec msiExecCmd msiDeleteUser msiAdmAppendToTopOfCoreIRB msiSleep msiSetGraftPathScheme msiAddUserToGroup msiAdmClearAppRuleStruct writeString msiSetRandomScheme msiSetDefaultResc msiAdmAddAppRuleStruct writeLine msiCheckHostAccessControl msiSetRescSortScheme msiGetObjType writeBytesBuf msiGetIcatTime msiSysReplDataObj msiAssociateKeyValuePairsToObj writePosInt msiGetTaggedValueFromString msiStageDataObj msiExtractTemplateMDFromBuf writeKeyValPairs msiXmsgServerConnect msiSetDataObjPreferredResc msiReadMDTemplateIntoTagStruct msiGetDiffTime msiXmsgCreateStream msiSetDataObjAvoidResc msiDataObjPut msiGetSystemTime msiCreateXmsgInp msiSortDataObj msiDataObjGet msiHumanToSystemTime msiSendXmsg msiSysChksumDataObj msiDataObjChksum msiStrToBytesBuf msiRcvXmsg msiSetDataTypeFromExt msiDataObjPhymv msiApplyDCMetadataTemplate msiXmsgServerDisConnect msiSetNoDirectRescInp msiDataObjRename msiListEnabledMS msiString2KeyValPair msiSetNumThreads msiDataObjTrim msiSendStdoutAsEmail msiStrArray2String msiDeleteDisallowed msiCollCreate msiPrintKeyValPair msiRdaToStdout
  • 16. Micro-services (229) msiRdaToDataObj msiSetACL msiPropertiesGet msiIsData msiRdaNoResults msiSetRescQuotaPolicy msiPropertiesSet msiGetCollectionContentsReport msiRdaCommit msiAdmReadRulesFromFileIntoStruct msiPropertiesExists msiGetCollectionSize msiAW1 msiAdmInsertRulesFromStructIntoDB msiPropertiesToString msiStructFileBundle msiRdaRollback msiGetRulesFromDBIntoStruct msiPropertiesFromString msiCollectionSpider msiRenameLocalZone msiAdmWriteRulesFromStructIntoFile msiRecursiveCollCopy msiFlagDataObjwithAVU msiRenameCollection writeXMsg msiGetDataObjACL msiFlagInfectedObjs msiAclPolicy readXMsg msiGetCollectionACL msiRemoveKeyValuePairsFromObj msiSetReplComment msiGetDataObjAVUs msiDataObjPutWithOptions msiSetBulkPutPostProcPolicy msiGetDataObjPSmeta msiDataObjReplWithOptions msiVerifyOwner msiGetCollectionPSmeta msiDataObjChksumWithOptions msiVerifyAVU msiGetDataObjAIP msiDataObjGetWithOptions msiVerifyACL msiLoadMetadataFromDataObj msiSetReServerNumProc msiVerifyExpiry msiExportRecursiveCollMeta msiGetStdoutInExecCmdOut msiVerifyDataType msiCopyAVUMetadata msiGetStderrInExecCmdOut msiVerifyFileSizeRange msiGetUserInfo msiAddKeyValToMspStr msiVerifySubCollOwner msiGetUserACL msiPrintGenQueryInp msiVerifySubCollACL msiCreateUserAccountsFromDataObj msiTarFileExtract msiVerifySubCollAVU msiLoadUserModsFromDataObj msiTarFileCreate msiListFields msiDeleteUsersFromDataObj msiPhyBundleColl msiHiThere msiLoadACLFromDataObj msiWriteRodsLog msiTestWritePosInt msiGetAuditTrailInfoByUserID msiServerMonPerf msiListColl msiGetAuditTrailInfoByObjectID msiFlushMonStat msiTestForEachExec msiGetAuditTrailInfoByActionID msiDigestMonStat msiGetFormattedSystemTime msiGetAuditTrailInfoByKeywords msiSplitPath msiPropertiesNew msiGetAuditTrailInfoByTimeStamp msiGetSessionVarValue msiPropertiesClear msiSetDataType msiAutoReplicateService msiPropertiesClone msiGuessDataType msiDataObjAutoMove msiPropertiesAdd msiMergeDataCopies msiGetContInxFromGenQueryOut msiPropertiesRemove msiIsColl
  • 17. State Information Attributes - How Many? ZONE_ID DATA_ID COLL_MAP_ID META_RESC_GROUP_ATTR_VALUE ZONE_NAME DATA_COLL_ID COLL_INHERITANCE META_RESC_GROUP_ATTR_UNITS ZONE_TYPE DATA_NAME COLL_COMMENTS META_RESC_GROUP_ATTR_ID ZONE_CONNECTION DATA_REPL_NUM COLL_CREATE_TIME META_USER_ATTR_NAME ZONE_COMMENT DATA_VERSION COLL_MODIFY_TIME META_USER_ATTR_VALUE ZONE_CREATE_TIME DATA_TYPE_NAME COLL_ACCESS_TYPE META_USER_ATTR_UNITS ZONE_MODIFY_TIME DATA_SIZE COLL_ACCESS_NAME META_USER_ATTR_ID USER_ID DATA_RESC_GROUP_NAME COLL_TOKEN_NAMESPACE RESC_GROUP_RESC_ID USER_NAME DATA_RESC_NAME COLL_ACCESS_USER_ID RESC_GROUP_NAME USER_TYPE DATA_PATH COLL_ACCESS_COLL_ID RESC_GROUP_ID USER_ZONE DATA_OWNER_NAME META_DATA_ATTR_NAME USER_GROUP_ID USER_DN DATA_OWNER_ZONE META_DATA_ATTR_VALUE USER_GROUP_NAME USER_INFO DATA_REPL_STATUS META_DATA_ATTR_UNITS RULE_EXEC_ID USER_COMMENT DATA_STATUS META_DATA_ATTR_ID RULE_EXEC_NAME USER_CREATE_TIME DATA_CHECKSUM META_DATA_CREATE_TIME RULE_EXEC_REI_FILE_PATH USER_MODIFY_TIME DATA_EXPIRY META_DATA_MODIFY_TIME RULE_EXEC_USER_NAME RESC_ID DATA_MAP_ID META_COLL_ATTR_NAME RULE_EXEC_ADDRESS RESC_NAME DATA_COMMENTS META_COLL_ATTR_VALUE RULE_EXEC_TIME RESC_ZONE_NAME DATA_CREATE_TIME META_COLL_ATTR_UNITS RULE_EXEC_FREQUENCY RESC_TYPE_NAME DATA_MODIFY_TIME META_COLL_ATTR_ID RULE_EXEC_PRIORITY RESC_CLASS_NAME DATA_ACCESS_TYPE META_NAMESPACE_COLL RULE_EXEC_ESTIMATED_EXE_TIME RESC_LOC DATA_ACCESS_NAME META_NAMESPACE_DATA RULE_EXEC_NOTIFICATION_ADDR RESC_VAULT_PATH DATA_TOKEN_NAMESPACE META_NAMESPACE_RESC RULE_EXEC_LAST_EXE_TIME RESC_FREE_SPACE DATA_ACCESS_USER_ID META_NAMESPACE_USER RULE_EXEC_STATUS RESC_FREE_SPACE_TIME DATA_ACCESS_DATA_ID META_NAMESPACE_RESC_GROUP TOKEN_NAMESPACE RESC_INFO COLL_ID META_RESC_ATTR_NAME TOKEN_ID RESC_COMMENT COLL_NAME META_RESC_ATTR_VALUE TOKEN_NAME RESC_CREATE_TIME COLL_PARENT_NAME META_RESC_ATTR_UNITS TOKEN_VALUE RESC_MODIFY_TIME COLL_OWNER_NAME META_RESC_ATTR_ID TOKEN_VALUE2 RESC_STATUS COLL_OWNER_ZONE META_RESC_GROUP_ATTR_NAME TOKEN_VALUE3
  • 18. State Information Attributes (205) TOKEN_COMMENT RULE_BASE_NAME DVM_MODIFY_TIME AUDIT_OBJ_ID RULE_NAME FNM_ID AUDIT_USER_ID RULE_EVENT FNM_VERSION AUDIT_ACTION_ID RULE_CONDITION FNM_BASE_NAME AUDIT_COMMENT RULE_BODY FNM_EXT_FUNC_NAME AUDIT_CREATE_TIME RULE_RECOVERY FNM_INT_FUNC_NAME AUDIT_MODIFY_TIME RULE_STATUS FNM_STATUS SL_HOST_NAME RULE_OWNER_NAME FNM_OWNER_NAME SL_RESC_NAME RULE_OWNER_ZONE FNM_OWNER_ZONE SL_CPU_USED RULE_DESCR_1 FNM_COMMENT SL_MEM_USED RULE_DESCR_2 FNM_CREATE_TIME SL_SWAP_USED RULE_INPUT_PARAMS FNM_MODIFY_TIME SL_RUNQ_LOAD RULE_OUTPUT_PARAMS QUOTA_USER_ID SL_DISK_SPACE RULE_DOLLAR_VARS QUOTA_RESC_ID SL_NET_INPUT RULE_ICAT_ELEMENTS QUOTA_LIMIT SL_NET_OUTPUT RULE_SIDEEFFECTS QUOTA_OVER SL_CREATE_TIME RULE_COMMENT QUOTA_MODIFY_TIME SLD_RESC_NAME RULE_CREATE_TIME QUOTA_USAGE_USER_ID SLD_LOAD_FACTOR RULE_MODIFY_TIME QUOTA_USAGE_RESC_ID SLD_CREATE_TIME DVM_ID QUOTA_USAGE RULE_BASE_MAP_VERSION DVM_VERSION QUOTA_USAGE_MODIFY_TIME RULE_BASE_MAP_PRIORITY DVM_BASE_NAME QUOTA_USER_NAME RULE_BASE_MAP_BASE_NAME DVM_EXT_VAR_NAME QUOTA_USER_ZONE RULE_BASE_MAP_OWNER_NAME DVM_CONDITION QUOTA_USER_TYPE RULE_BASE_MAP_OWNER_ZONE DVM_INT_MAP_PATH QUOTA_RESC_NAME RULE_BASE_MAP_COMMENT DVM_STATUS RULE_BASE_MAP_CREATE_TIME DVM_OWNER_NAME RULE_BASE_MAP_MODIFY_TIME DVM_OWNER_ZONE RULE_ID DVM_COMMENT RULE_VERSION DVM_CREATE_TIME
  • 19. Example Disposition Procedure acPurgeFiles() { msiGetIcatTime(*Time,unix); acGetIcatResults(remove,DATA_EXPIRY < '*Time',*List); forEachExec(*List) { msiDataObjUnlink(*List,*Status); msiGetValByKey(*List,DATA_NAME,*D); msiGetValByKey(*List,COLL_NAME,*E); writeLine(stdout,Purged File *E/*D at *Time ); } }
  • 20. Create and Apply Policies • Maintain a repository of rules in iRODS – Rules characterized as XML files with input parameters explicitly characterized as rule parameters • Build policy template and save in policy repository – Select which rule sets will be applied • Map policy template to a record series (collection) – Add policy parameters as metadata on the record series • On ingestion of a file into the record series: – Retrieve policies and parameters from the record series metadata – Invoke associated rules and apply procedures • This decouples the choice of access interface from the execution of your policies
  • 21. iDrop Client - Ingesting a Record QuickTime™ and a decompressor are needed to see this picture. Data Grid Record Series
  • 22. Arch - Defining a Policy QuickTime™ and a decompressor are needed to see this picture.
  • 23. Policies on a Record Series QuickTime™ and a decompressor are needed to see this picture.
  • 24. iDrop Interface QuickTime™ and a decompressor are needed to see this picture.
  • 25. Record Transfer Status QuickTime™ and a decompressor are needed to see this picture.
  • 26. Tracking Policies on Record Series QuickTime™ and a decompressor are needed to see this picture.
  • 28. Supported Storage Systems • File systems - Windows, Linux, Mac • Tape archives - HPSS, Sam-QFS • Repositories - Flickr, Web sites • Cloud storage- Amazon S3, EC2 • Relational database - PostgreSQL, Oracle, mySQL • Under development – Table driven resources
  • 29. Modular Architecture • Authentication – GSI (PKI), Kerberos, Shibboleth, Challenge-response • Authorization – Roles, user groups, resource groups, policy constraints, ACLs • Transport – TCP/IP (parallel I/O streams), Reliable Blast UDP • Metadata catalog – PostgreSQL, mySQL, Oracle • Distributed rule engine – Scheduler, messaging system, execution engine, rule base • Scales to 200 million files, multiple petabytes of data
  • 30. Enforce NSF Data Management Plan • Policy for selecting data for organization into a collection • Policy for choice of descriptive metadata • Policy for data sharing (how long the data are held proprietary) • Choice of access services provided for the data • Policy for creation of a reference collection (with data formats and analysis procedures) • Policy for retention and disposition • Policy for commitment for storage support by project / department / institution
  • 31. 31 iRODS is a "coordinated NSF/OCI-Nat'l Archives research activity" under the auspices of the President's NITRD Program and is identified as among the priorities underlying the President's 2011 Budget Supplement in the area of Human and Computer Interaction Information Management technology research. Reagan W. Moore rwmoore@renci.org http://irods.diceresearch.org NSF OCI-0848296 “NARA Transcontinental Persistent Archives Prototype” NSF SDCI-0721400 “Data Grids for Community Driven Applications”