Coping Strategies
for the Death of Unlimited Storage
GlobusWorld 2022
Panelists
Sarah Bailey, UC Berkeley
Christopher Clements, San Diego State University
Jim Leous, The Pennsylvania State University
Charles McClary, Indiana University
Hellen Zziwa, Harvard University
Moderator
Bob Flynn, Internet2
History of Cloud Storage Quotas/Licenses/Account Limits
Microsoft
• September 2013: 7 GB/user
• June 2014: 1 TB/user; ??/enterprise
• October 2014: Unlimited
• November 2015: 1 TB/user; ??/enterprise
• 2019: Up to 25 TB/user, upon request
• 2019: Many universities move to license certain products for only “knowledge workers”
Google Drive
• April 2012: 5 GB/user
• May 2013: 30 GB/user
• August 2014: Unlimited
• December 2019: Researching charges for accounts and unlimited storage
• February 2021: End of unlimited storage. Change to tiered pricing model
Box
• 2012: 50 GB/user; # users x 2 GB/enterprise
• 2013: 100 GB/user; # users x 4 GB/enterprise
• August 2015: Unlimited
• December 2019: Change to $820/TB/year pricing model
• Spring 2020: Change to $130/TB/year
Slide Credit: Ian Crew, UC Berkeley
Data won’t stop growing
Because we LOVE our Data
We used to have all the room we needed for our Data
Not so much anymore
How do you solve a problem like your Data?
Bob Flynn
Your Moderator
Program Manager, Cloud Infrastructure & Platform Services
Your Panel
Sarah Bailey, University of California, Berkeley
Chris Clements, San Diego State University
Jim Leous, The Pennsylvania State University
Charles McClary, Indiana University
Hellen Zziwa, Harvard University
UC Berkeley
Sarah Bailey
Content Collaboration Service Lead
Background
● Available services - Box, Google Drive, and SharePoint
● OneDrive is not currently available at UCB
● SharePoint is our secure storage solution
● Priorities: effective change management and communication with the campus community and
transparent and balanced management of services in the portfolio.
What issues have emerged related to migrating data?
● Tools for migration are scoped too broadly (Globus)
● Would require secure certification for server running Globus, and high level of vendor trust
● Data throughput is too low
● Tools for monitoring service usage are inadequate for completing the requirements established by
service providers
LETUS
Learning Environments, Technologies
& User Services
User Services Director Chris Clements
San Diego State University
User Services Portfolio
Enterprise Application Support
• Google Workspace
• O365/Azure
• ServiceNow
• Zoom
• Canvas
• Mediasite
• Adobe Acrobat Sign
• Adobe Creative Cloud
• Globus
• Duo Security
• Slack
Support Services
• Help Desk Services
• Desktop Services
• Identity Management Support (SDSUid)
• Duo Multi-Factor (MFA) Support
• Network and Wireless Troubleshooting
• Software Distribution
• ZoomCorps
• ServiceNow Corps
Cloud Storage Today
Google Workspace
• 76,000 Active Accounts
• 1 petabyte of data
• Primary repository for PL data
• Campus standard for communication and collaboration (faculty, staff, and students)
Microsoft OneDrive
• Used for PC backup and folder redirection
SharePoint
• Departmental use only
• The long-term strategy is to phase this out
Azure
• 356 terabytes of primary and backup storage
• Replaced traditional tape backup
Amazon S3
• Projects requiring GovCloud
Preparing For Limited Storage
What we are doing right
• Automation (provisioning and timely deprovisioning of accounts)
• Staff, volunteer, and consultant accounts are deleted 90 days after separation
• Faculty and students who are graduating have a 1-year grace period
• Accounts for life are not offered to our alumni
• Routine auditing of accounts
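The account lifecycle rules above could be automated along the following lines. This is a minimal sketch under stated assumptions, not SDSU's actual tooling: the role names, the `deletion_date` and `is_due_for_deletion` helpers, and the choice to treat the 1-year grace period as 365 days are all hypothetical and chosen only for illustration.

```python
from datetime import date, timedelta

# Grace periods as described on the slide. The role labels are assumptions,
# and "1 year" is approximated as 365 days.
GRACE_PERIODS = {
    "staff": timedelta(days=90),
    "volunteer": timedelta(days=90),
    "consultant": timedelta(days=90),
    "faculty": timedelta(days=365),
    "student": timedelta(days=365),
}


def deletion_date(role: str, separated_on: date) -> date:
    """Return the first day on which an account of this role may be deleted."""
    return separated_on + GRACE_PERIODS[role]


def is_due_for_deletion(role: str, separated_on: date, today: date) -> bool:
    """True once the grace period for the role has fully elapsed."""
    return today >= deletion_date(role, separated_on)
```

A routine audit job could call `is_due_for_deletion` for every separated account and queue the matches for review rather than deleting them outright.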
What we are working on
• Appropriate retention policies
• Considering how Google’s new storage tools can be applied
• Training up and expanding our enterprise platform administrators
• Increasing user base training and documentation
What Does The Future Hold?
Future direction
• SDSU prefers the Google Workspace platform
• Our users prefer the file sharing interface of Google Workspace. Moving data to cold storage isn’t
a desirable option
• Continue to use Globus with our Google and Amazon connectors for sharing research data
• Moving forward, we will consider all storage options
What we need from our cloud partners
• Additional add-on storage that is competitively priced and easily acquired
• Tools for administrators and end-users to help manage their storage
• Better communication about changes, and assistance with implementation strategies (e.g., training materials, sample communications…)
HUIT
Harvard University IT
Hellen Zziwa, Director of Strategy & Engineering, Technology
Partner Services
0.4% of users account for 81.6% of Google storage use
Harvard’s Google Use Cases
■ Medium term storage
■ Sharing externally
■ Archival storage
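A concentration figure like the one on this slide (a small fraction of users holding most of the storage) can be derived from a per-user usage export. The sketch below shows one way to compute it; the `top_share` function and its input format (a mapping from user to bytes used) are assumptions for illustration, not a description of how Harvard produced its number.

```python
from typing import Mapping


def top_share(usage_by_user: Mapping[str, float], top_fraction: float) -> float:
    """Return the share of total storage held by the heaviest users.

    `top_fraction` is the fraction of users to count (for example 0.004
    for the top 0.4%). At least one user is always counted.
    """
    sizes = sorted(usage_by_user.values(), reverse=True)
    total = sum(sizes)
    if not sizes or total == 0:
        return 0.0
    n_top = max(1, round(len(sizes) * top_fraction))
    return sum(sizes[:n_top]) / total
```

Running this over a usage report with a few candidate fractions gives a quick picture of how much a small set of heavy users would need to move before any quota takes effect.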
Current state: Sample Workflow
Google Drive (Sync) used to Collect, Share and Archive large datasets/files (cryo-EM/video)
[Workflow diagram. Stages: Create / Capture, Collect, Process, Share, Archive, Destroy? Storage locations: Local Storage, Central Storage, Google Drive]
An unmet storage need: Medium-term storage
As a faculty member in the History of Art and Architecture, I want to store and manage a large image collection, with various permissions, so that I can use them for teaching, research and publication.
As a special collections curator, I want to temporarily store a large collection of audio/visual files, so that I can review and process them before depositing into the DRS.
As a digital archivist, I want to store large amounts of digital content transferred by donors, so that I can securely appraise and describe it before it is archived.
Future state: Sample Workflow
Leverage Globus to expand options for data collection / sharing
[Workflow diagram. Stages: Create / Capture, Collect, Process, Share, Archive, Destroy? Storage locations: Local Storage, Central Storage, Google Drive, AWS S3, OneDrive, Tape Library, AWS S3 Glacier]
Open Questions
■ How do we protect users against inadvertent retrieval penalties from AWS S3 Glacier Deep Archive?
■ What happens to shared data/files when the owner leaves the University?
■ How do we provide insights into the data to help with data lifecycle management?
Globus@IU
Research Technologies Storage
Advanced Cyberinfrastructure
Research Technologies
Indiana University
Globus@IU
Overview of IU on-prem Storage and data transfer
1. Storage
   1. Redundant HPSS Tape Library archive storage service (total 354 PB)
   2. Redundant GPFS storage service (total 7.2 PB)
   3. Lustre storage service (total 11.6 PB)
2. Data transfer methods
   1. HPSS – HSI, SFTP, Globus
   2. GPFS – test native client (HPC), Samba, SFTP, Globus
   3. Lustre – native client (HPC), Globus
Cloud storage @IU
In the beginning – Box.com
• Well adopted by users
• Significant price increase
• Major IU project with vended migration tool to move to Google & MS
• Prepped Globus Box connector to assist but…
• Vended migration tool with movement to Google and MS was the focus
• Serious concern for well-being of Tape Library service
• Drag-n-drop “many small files”
Cloud storage @IU (cont.)
Now - Google Drive & MS OneDrive/SharePoint/Teams
• Reasonably well adopted but split usage
• Google with price increase
• Major IU project to coordinate reduction of Google use and data migration
• Concerns for what data should go where (wrt…data classifications)
• Effort is in progress
• Prepped Globus connectors for Google Drive & MS to assist
• Serious concern for well-being of Tape Library service
• Premium connectors are set to “invite only”
• Plan to “white glove” research users (i.e. protect the tape library)
Tape Library and Globus
• It's WAY too easy to ingest "many small files"
• Retrieving "many small files" is a challenge
• Accessing each file has overhead and can beat up tape drives and robot pickers
• Users perceive slow performance
Best Practice: encourage users to aggregate small files
Feature Request: Provide a way to auto-aggregate small files or block/throttle the upload of small files.
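The best practice above, aggregating small files before they reach tape, can be done on the user side with a plain tar archive. The following is a minimal sketch only: the `bundle_small_files` function and the 1 MiB default threshold are assumptions for illustration, not an IU or Globus feature.

```python
import tarfile
from pathlib import Path
from typing import List

# Hypothetical cutoff: files below this size are considered "small".
SMALL_FILE_BYTES = 1024 * 1024


def bundle_small_files(src_dir, archive_path, threshold: int = SMALL_FILE_BYTES) -> List[str]:
    """Pack every file under src_dir smaller than `threshold` into one tar file.

    Returns the relative paths that were bundled. Source files are left in
    place; the caller decides whether to remove them after verification.
    """
    src = Path(src_dir)
    archive = Path(archive_path).resolve()
    bundled = []
    with tarfile.open(archive, "w") as tar:
        for path in sorted(src.rglob("*")):
            # Skip directories and the archive itself if it lives under src_dir.
            if not path.is_file() or path.resolve() == archive:
                continue
            if path.stat().st_size < threshold:
                rel = path.relative_to(src).as_posix()
                tar.add(path, arcname=rel)
                bundled.append(rel)
    return bundled
```

Transferring the single archive instead of the individual files means the tape system sees one large object rather than many small ones, at the cost of having to retrieve the whole bundle to read any one file.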
OACIOR
Cloud Storage Transitions
Jim Leous
Office of the Associate CIO for Research
Penn State
Box to O365 Transition
• "We're just moving people from one cloud storage to another, right?"
• We started with 143k accounts and are down to a couple of hundred left to migrate.
• Office 365 hasn't been the answer to everything
• The "tough stuff" is left...
What else is there?
What remains in the "Clean-up List" -- a list of mostly researchers
where O365 is not the solution (currently ~250 accounts):
• AWS S3, Glacier
• Azure
• On-prem
• Wasabi
• Dropbox
• Google Drive
Workflows Matter!
• We've been telling people to move research data into the cloud for 6-7 years
• "It's FREE!" we said…
• Researchers designed ways to get data from instruments into Box
• Researchers designed ways to edit video in the cloud
• What we realized is that workflows matter!
Cloud Limitations Limit Science
Some current solutions, notably Office 365, have constraints or limits
that make them less than useful for Big Data studies:
• Limits on individual file size
• Limits on total "volume" of data
• Limits on pathname length
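Limits like these are easier to deal with when they are found before a migration starts. The sketch below scans a local directory tree against a set of limits; the `Limits` class, the `find_violations` function, and any numbers passed in are hypothetical. Actual limits differ by service and change over time, so they should be taken from the provider's current documentation.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass
class Limits:
    """Caller-supplied limits for the target service (values are not real defaults)."""
    max_file_bytes: int
    max_total_bytes: int
    max_path_chars: int


def find_violations(root, limits: Limits) -> List[str]:
    """Return human-readable descriptions of everything that breaks a limit."""
    root = Path(root)
    problems = []
    total = 0
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        size = path.stat().st_size
        total += size
        if size > limits.max_file_bytes:
            problems.append(f"file too large: {rel}")
        if len(rel) > limits.max_path_chars:
            problems.append(f"path too long: {rel}")
    if total > limits.max_total_bytes:
        problems.append("total volume exceeds limit")
    return problems
```

An empty result means the tree fits within the supplied limits; anything else is a list a researcher can act on before the transfer is attempted.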
Questions?
