RCUK Cloud Workshop


Simon Woodman

How the Digital Institute is enabling the Long Tail of Science


Supporting research in the Cloud:
Our experiences and future directions
Simon Woodman
Digital Institute
Newcastle University

Digital Institute
• In 2001, NEReSC was formed to help researchers
– £17M funding
• Bioinformatics, Neuroscience, Aging & Health, Chemical Engineering, Transport, Video archiving
– Core UK e-Science funding programme
• Became the Digital Institute
– Similar remit but more diverse: Humanities, Medical Science, BIM
– £40M funding
• Provided computing aspect of many research projects
• Frequently same requirements

Digital Institute: Innovation Cycle
• Research (world-leading)
– World-leading research in cloud computing & data analytics
• Teaching (specialist skills & knowledge)
– EPSRC Centre for Doctoral Training in Cloud Computing for Big Data
– Developing the next generation of leaders
• Engagement (impact on industry & public sector)
– Cloud Innovation Centre
– Events, engagement & partnerships

Having 100s of machines available to process data doesn’t automatically help

Building a data management and processing platform
• An environment to store, manage and process data
– Every project needed this, volumes growing
– Open Source
• A platform that can operate in a number of different locations
– Laptop
– On Premise
– On a cloud provider (Amazon and MS)
• An expandable system
– API to connect other software
– Data processing code can be added
• A platform for our academic research
– Scalability, data management, provenance

Features of e-SC
• Data storage
– Cloud: infinite scale
• Data processing
– Best-of-breed open source tools (R / Octave)
• Provenance
– Audit of everything performed
• Scalability
– Easy to run at large scale
• External communications
– Rich APIs
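
The "audit of everything performed" idea can be illustrated with a minimal sketch. This is not the e-SC API: the `audited` decorator, the `PROVENANCE_LOG` list and the two processing steps are hypothetical names chosen for illustration, and a real platform would persist the records rather than keep them in memory.

```python
import hashlib
import json
from datetime import datetime, timezone
from functools import wraps

# Hypothetical in-memory audit log; a real platform would persist this.
PROVENANCE_LOG = []


def _digest(value):
    """Return a short, stable SHA-256 digest of a JSON-serialisable value."""
    payload = json.dumps(value, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def audited(step):
    """Record which step ran, on which input, producing which output."""
    @wraps(step)
    def wrapper(data):
        result = step(data)
        PROVENANCE_LOG.append({
            "step": step.__name__,
            "input": _digest(data),
            "output": _digest(result),
            "at": datetime.now(timezone.utc).isoformat(),
        })
        return result
    return wrapper


@audited
def remove_negatives(readings):
    # Illustrative cleaning step: drop negative sensor readings.
    return [r for r in readings if r >= 0]


@audited
def mean(readings):
    # Illustrative analysis step: arithmetic mean of the readings.
    return sum(readings) / len(readings)


if __name__ == "__main__":
    cleaned = remove_negatives([3, -1, 5, 4])
    print(mean(cleaned))
    for record in PROVENANCE_LOG:
        print(record["step"], record["input"], "->", record["output"])
```

Because the output digest of one step equals the input digest of the next, the log can be read as a chain from raw data to result, which is the property an audit trail needs.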

All our projects have significant storage, processing and collaboration needs
Use of the platform
• Movement monitoring
• Collaboration
• Modelling
• Stroke rehabilitation
• Care home design

Learning from Industry - DevOps
The Phoenix Project: A Novel About IT, DevOps, and Helping Your Business Win.
Gene Kim, Kevin Behr, George Spafford. ISBN 0988262509

Cloud Innovation Centre
• £1M funding from DCMS
• Offerings
– Training
– Architecture Reviews
– Consultancy
– KTPs
– Cloud time
• £700k Private and Public Cloud infrastructure

Cloud Computing for Big Data CDT
• £7M funding from EPSRC
• 60 Students
• 5 years
• High level of industry engagement
• Programme elements: Cloud Computing, Bayesian Statistics, Group Projects, Internships, Thesis, Retreats, Professional Skills

Challenges
• Past
– Cost of on-boarding
– Scaling
• Present
– Licensing
– Researchers doing Sys Admin
• Future
– Manage the relationship with Central IT
– Staff costs/time

Questions?
Simon Woodman
simon.woodman@ncl.ac.uk



10. Provenance


14. Cap Ex v Op Ex

15. Service Provision v Research

16. One eScience Central to rule them all.


Editor's Notes

Research influences the teaching. Teaching generates the talent which benefits industry. Engagement informs the research.
Having the platform allows us to collaborate with many people around the University to do novel research into real-world problems. There is obviously other research in the group too, such as machine learning, simulation and social media analysis, but it is the e-SC platform and similar technologies that are the biggest enabler. That in turn influences teaching within the CDT: what the students are taught is based on our research. The teaching develops talent which is acquired by industry, and the engagement informs the research, as we are solving problems that really exist rather than made-up ones.
Ultimately this is one recipe for making Cloud Computing work for a research group:
• Cloud Agnostic Platform
• Drives academic collaborations
• Vehicle for CS research
• Industrial Links
• Learn best practice (DevOps etc.)
• Train and Influence
• Education
• Develop new talent
• They drive the research
Wider remit than most research groups. We sit in the middle of what is hopefully a virtuous circle.
Today I’m going to focus on some of the research, particularly around Cloud and what we’ve done on Windows Azure. I’ll talk about some work that we did to port an existing application to Azure, then look at some of the non-functional requirements that came out of that and how we’ve used them to drive our research forwards.
Hard to make use of large-scale computing resources
• Managing machines
• Keeping track of data and results
People more accustomed to programming for their own problem
• Distributed systems development background
Not many tools around to help
• Mainly targeted at business/consumer applications
• Getting better with new tools available, but these still require low-level programming skills that application scientists often don’t have.
One of the often-touted benefits of the Cloud is the transfer of CapEx to OpEx. Whilst in business this is considered preferable, is it the same in academia? In academia this has some interesting side effects when it comes to the end of the project: who pays for machines to be kept running? The reality is that this cost was previously hidden by central IT, who would keep machines running free of charge for a period after the end of the project, usually until they failed. That is no longer the case. In some projects this is less of an issue: classical research projects run, generate data, analyse the data, publish the results and then move on to another project. For many of our projects it is an issue for two reasons. Firstly, the members of the project often wish to demo the project after the official end, for example when trying to secure more funding or enhancing publications. The bigger issue is the mandate to store research data for N years after the end of the project. Who pays for this? The project can’t, as the budget has ended. I don’t have any answers but it’s one of the interesting features of the OpEx model.
One of the challenges is that we, as a group, are walking a fine line between providing a service and performing our own research. Ultimately, we’re measured by both of these things: the number of collaborations that we engage in and also the number of papers we write. Therefore, e-SC has two functions:
• Provide facilities to research projects so that they don’t have to be built from scratch
• Act as a vehicle for our academic research
Both of these functions are important and necessary for success. One of the downsides of these types of collaborations is that you can be seen as having a service role within the project rather than being a fully fledged research partner. The likelihood of this depends on the project, the PI and so on.
One of the other things that we have learnt is that not every assumption you make will turn out to be right. For instance, when we started we envisaged one e-SC to rule them all. The most I’ve had to manage is about 15 at the same time, which is not far off a full-time job. You end up doing a lot of system administration. You may be able to get central IT to do some of the work for you, but in our experience it’s not that easy.
Over the past few years many of the large software companies have been making large strides, often out of necessity, in the ways they manage their infrastructure and services. My view is that academia often hasn’t followed their lead because of the lack of need, but we can learn a great deal from them. One of the keys is automating the process from development to a new version going live. A few years ago this would be measured in months or even years: "Version Next of the software will be available on X date." With SaaS provision this can be vastly sped up. For instance, on a weekday, Amazon updates its deployments every 12 seconds. They are able to do this by leveraging practices such as:
• Continuous integration and delivery. Every check-in to VCS is automatically checked out and tested using automated tools.
• Configuration management. Make sure that the developer environment exactly matches the production one and that any changes are managed, usually using tools such as Chef/Puppet/Ansible.
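
As a rough illustration of the configuration-management point, the sketch below compares two lists of pinned package versions and reports where a developer environment has drifted from production. It is only a sketch under stated assumptions: the `parse_pins` and `drift` helpers and the example pins are hypothetical, and tools such as Chef, Puppet or Ansible manage far more than package versions.

```python
def parse_pins(text):
    """Parse 'name==version' lines into a dict, skipping blanks and comments."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, version = line.partition("==")
        pins[name.strip().lower()] = version.strip()
    return pins


def drift(dev, prod):
    """Return {package: (dev_version, prod_version)} for every mismatch."""
    names = sorted(set(dev) | set(prod))
    return {n: (dev.get(n), prod.get(n)) for n in names if dev.get(n) != prod.get(n)}


if __name__ == "__main__":
    # Hypothetical pins for illustration only.
    dev_pins = parse_pins("numpy==1.26.4\nrequests==2.31.0\n")
    prod_pins = parse_pins("numpy==1.26.4\nrequests==2.28.0\nflask==3.0.0\n")
    for package, (dev_v, prod_v) in drift(dev_pins, prod_pins).items():
        print(f"{package}: dev={dev_v} prod={prod_v}")
```

A check like this could run on every check-in, failing the build when the two environments disagree, so that drift is caught before a release rather than after it.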
Engagement arm: links to industry, both small and large; we aim to learn from them and inform them of new research.
Cloud agnostic platform important: OpenStack private cloud and Azure public cloud.
