The cloud not only helps organizations do things better, cheaper, and faster; it also drives breakthroughs that transform mission delivery. This session will feature a panel of international government and university leaders who are using the cloud to take on big data challenges, and innovating in the “white space” between data silos to deliver impact.
AWS Public Sector Symposium 2014 Canberra | Big Data in the Cloud: Accelerating Innovation in the Public Sector
1. AWS Government, Education, & Nonprofits Symposium
Canberra, Australia | May 20, 2014
Big Data in the Cloud: Accelerating Innovation in the Public Sector
Russell Nash, Solutions Architect, Amazon Web Services, APAC
18. Instance Types
[Chart: EC2 instance types plotted by Memory (GB), 1 to 256, against EC2 Compute Units, 1 to 128, both axes on doubling scales. Instance families: Standard, 2nd Gen Standard, Micro, High-Memory, High-CPU, Cluster Compute, Cluster GPU, High I/O, High-Storage, Cluster High-Mem.]
hi1.4xlarge: 60.5 GB of memory; 35 EC2 Compute Units; 2 x 1024 GB SSD instance storage; 64-bit platform
cc1.4xlarge: 23 GB of memory; 33.5 EC2 Compute Units; 1690 GB of instance storage; 64-bit platform
c1.xlarge: 7 GB of memory; 20 EC2 Compute Units; 1690 GB of instance storage; 64-bit platform
m1.small: 1.7 GB memory; 1 EC2 Compute Unit; 160 GB instance storage; 32-bit or 64-bit
m1.medium: 3.75 GB memory; 2 EC2 Compute Units; 410 GB instance storage; 32-bit or 64-bit platform
m1.large: EBS Optimizable; 7.5 GB memory; 4 EC2 Compute Units; 850 GB instance storage; 64-bit platform
m1.xlarge: EBS Optimizable; 15 GB memory; 8 EC2 Compute Units; 1,690 GB instance storage; 64-bit platform
m2.xlarge: 17.1 GB of memory; 6.5 EC2 Compute Units; 420 GB of instance storage; 64-bit platform
m2.2xlarge: 34.2 GB of memory; 13 EC2 Compute Units; 850 GB of instance storage; 64-bit platform
m2.4xlarge: EBS Optimizable; 68.4 GB of memory; 26 EC2 Compute Units; 1690 GB of instance storage; 64-bit platform
t1.micro: 613 MB memory; up to 2 EC2 Compute Units; EBS storage only; 32-bit or 64-bit platform
c1.medium: 1.7 GB of memory; 5 EC2 Compute Units; 350 GB of instance storage; 32-bit or 64-bit platform
cg1.4xlarge: 22 GB of memory; 33.5 EC2 Compute Units; 2 x NVIDIA Tesla “Fermi” M2050 GPUs; 1690 GB of instance storage; 64-bit platform
cc2.8xlarge: 60.5 GB of memory; 88 EC2 Compute Units; 3370 GB of instance storage; 64-bit platform
m3.xlarge: 15 GB of memory; 13 EC2 Compute Units
m3.2xlarge: EBS Optimizable; 30 GB of memory; 26 EC2 Compute Units
hs1.8xlarge: 117 GB of memory; 35 EC2 Compute Units; 24 x 2 TB instance storage; 64-bit platform
cr1.8xlarge: 244 GB of memory; 88 EC2 Compute Units; 2 x 120 GB SSD instance storage; 64-bit platform
19. • ON A SINGLE INSTANCE
COST: 4h x $2.1 = $8.4
COMPUTE TIME: 4h
20. • ON MULTIPLE INSTANCES
COST: 2 x 2h x $2.1 = $8.4
COMPUTE TIME: 2h
23. Metric: Count
Compute Hours of Work: 109,927 hours
Compute Days of Work: 4,580 days
Compute Years of Work: 12.55 years
Ligand Count: ~21 million ligands
Using Cycle Computing and Amazon Web Services
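The day and year figures on this slide follow from the hour count by simple division. A minimal sketch of that conversion, assuming 24-hour days and 365-day years (the slide does not state which year length was used):

```python
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365  # assumption: the slide does not say which year length it used


def compute_time(hours: float) -> tuple[float, float]:
    """Convert total compute hours into (days, years) of serial work."""
    days = hours / HOURS_PER_DAY
    years = days / DAYS_PER_YEAR
    return days, years


days, years = compute_time(109_927)  # hour count taken from the slide
print(f"{days:,.0f} days, {years:.2f} years")
```

Under these assumptions the result rounds to the 4,580 days and 12.55 years quoted above.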
35. What are Spot Instances?
[Diagram: a Region containing two Availability Zones. Unused capacity in each zone is sold as Spot Instances at discounts of 50%, 56%, 66%, 59%, 54%, and 63%.]
36. • ON MULTIPLE INSTANCES
COST: 2 x 2h x $2.1 = $8.4
COMPUTE TIME: 2h
37. • ON MULTIPLE SPOT INSTANCES
COST: 4 x 1h x $0.35 = $1.4
COMPUTE TIME: 1h
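The three cost slides share one formula: cost = instances x hours per instance x hourly price. The sketch below reproduces them, assuming a perfectly parallel job so that wall-clock time equals the hours each instance runs. The `Run` class is a hypothetical helper for illustration; the prices ($2.10 on-demand, $0.35 spot) are the ones shown on the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One way of running the same job (hypothetical helper)."""
    instances: int
    hours_each: float
    price_per_hour: float

    @property
    def cost(self) -> float:
        # Total bill: every instance is charged for every hour it runs.
        return self.instances * self.hours_each * self.price_per_hour

    @property
    def wall_clock_hours(self) -> float:
        # Assumes the job splits evenly and all instances run concurrently.
        return self.hours_each


single = Run(instances=1, hours_each=4, price_per_hour=2.10)
pair = Run(instances=2, hours_each=2, price_per_hour=2.10)
spot = Run(instances=4, hours_each=1, price_per_hour=0.35)

for name, run in [("single", single), ("multiple", pair), ("spot", spot)]:
    print(f"{name}: ${run.cost:.2f} in {run.wall_clock_hours:g} h")

# Fraction saved by the spot run relative to the on-demand runs.
savings = 1 - spot.cost / single.cost
print(f"spot saving: {savings:.0%}")
```

Splitting the job across on-demand instances leaves the cost unchanged and shortens the run; moving to spot pricing is what lowers the bill.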
38. SEC MIDAS & Tradeworx
Real-time analysis of 20 billion messages/day
Reconstruct any market, any day in history
39. “For the growing team of quant types now employed at the SEC, MIDAS is becoming the world’s greatest data sandbox. The staff is planning to use it to make the SEC a leader in its use of market data”
Elisse B. Walter, Chairman of the SEC
44. • The Research Data Storage Infrastructure (RDSI) Project, an Australian Government initiative, is funded from the Education Investment Fund under the Super Science (Future Industries) initiative.
• RDSI is a $50m federally funded project, for which UQ is the lead agent and was awarded up to $10m in NCRIS 2013.
26 May 2014
45. • Andrew Goodchild, QCIF
• Andrew Reay, AWS
• Paul Campbell, RDSI
• Shane Youl, Intersect
• Mark Terrill, O2
46. • Researchers will be able to use and manipulate significant collections of data that were previously either unavailable or difficult to access
48. • Evaluation and testing facility
• Working with Partners
• Implemented through two nodes (initially)
§ QCIF - Brisbane
§ Intersect - Sydney
• Act as gateways for public cloud access
• Will host a series of projects
49. Gateway to the Public Cloud through nodes
Peering through AARNet (Sydney/Oregon)
Volume Aggregation through CAUDIT/Test Platform
Removing barriers to use, such as egress charges for researchers
50. RDSI use cases are designed to test boundary conditions involving performance, capability and cost-effectiveness
51. RDSI is partnering with AWS, QCIF, Intersect
and O2 networks to undertake testing of:
§ Integration
§ Data movement
§ Data storage
§ Services over data
52. § AWS Identity and Access Management (IAM) for billing aggregation
§ Australian Access Federation (AAF) based authentication of researchers
§ O2 networks integration of IAM and AAF via SAML
§ Extending authentication and authorization through reX
54. QBI needs to visualize neural tracts for ~1000 MRI image sets
Each image set:
§ Input: 17 TB
§ Compute: 300 - 900 cores for 1 week
§ Output: 50 TB
Status:
§ QCIF is working with QBI on the first run
55. Compute Services over Data
§ Challenge: Running 900 cores for a week becomes expensive ($15K)
§ Solution: MIT StarCluster + AWS Spot
§ Why: StarCluster expands elastically based on spot price and queue size
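The slide names the two inputs to the scaling decision, spot price and queue size, but not the rule that combines them. The sketch below is one hypothetical policy for illustration, not StarCluster's actual logic: add nodes for queued jobs only while the spot price stays at or below a bid ceiling, up to a cluster cap. The function name and all parameters are assumptions.

```python
import math


def nodes_to_add(queued_jobs: int, running_nodes: int, spot_price: float,
                 max_bid: float, jobs_per_node: int = 1,
                 max_nodes: int = 100) -> int:
    """Hypothetical scale-up rule driven by queue size and spot price."""
    if spot_price > max_bid:
        # Too expensive right now: leave the queue waiting.
        return 0
    # Nodes needed to start every queued job immediately.
    wanted = math.ceil(queued_jobs / jobs_per_node)
    # Never grow past the configured cluster cap.
    headroom = max_nodes - running_nodes
    return max(0, min(wanted, headroom))


# Illustrative values only; none of these come from the slides.
print(nodes_to_add(queued_jobs=10, running_nodes=5, spot_price=0.30, max_bid=0.35))
print(nodes_to_add(queued_jobs=10, running_nodes=5, spot_price=0.40, max_bid=0.35))
```

A real deployment would also need a scale-down rule and handling for spot instances being reclaimed mid-job, neither of which is sketched here.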
56. Storage
– Challenge: Need 67 TB of volume storage, but AWS provides 1 TB volumes
– Solution: GlusterFS + hs1.8xlarge servers
– Why: GlusterFS gets faster with more servers, and hs1.8xlarge instances cost less per TB than SSDs (which are hard to saturate)
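A rough sizing sketch for the storage challenge above. It uses the 67 TB requirement and 1 TB volume size from this slide and the 24 x 2 TB instance storage listed for hs1.8xlarge earlier in the deck. It assumes raw capacity with no filesystem overhead; the `replicas` parameter is a hypothetical knob, since the slides do not say whether the GlusterFS volume was replicated.

```python
import math

REQUIRED_TB = 67           # from the slide
VOLUME_TB = 1              # per-volume size stated on the slide
HS1_DISKS = 24             # hs1.8xlarge: 24 x 2 TB instance storage
HS1_DISK_TB = 2


def volumes_needed(required_tb: float, volume_tb: float = VOLUME_TB) -> int:
    """Number of fixed-size volumes needed to hold required_tb."""
    return math.ceil(required_tb / volume_tb)


def hs1_servers_needed(required_tb: float, replicas: int = 1) -> int:
    """hs1.8xlarge servers needed, counting raw disk capacity only."""
    raw_per_server_tb = HS1_DISKS * HS1_DISK_TB  # 48 TB per server
    return math.ceil(required_tb * replicas / raw_per_server_tb)


print(volumes_needed(REQUIRED_TB))                   # 1 TB volumes to stitch together
print(hs1_servers_needed(REQUIRED_TB))               # servers, unreplicated
print(hs1_servers_needed(REQUIRED_TB, replicas=2))   # servers with two copies
```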
57. Network
– Challenge: Moving 17 TB through the eye of a needle (the campus network)
– Solution: NAS sneakernet + Aspera
– Why: Tried compression, but it adds significant time overheads
– Would have been better if campus had a “data transfer network”
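To see why the campus network is the bottleneck, it helps to estimate how long 17 TB takes at different sustained rates. The link speeds below are hypothetical examples, not measurements from the project, and the sketch assumes decimal terabytes (10^12 bytes) and an ideal link unless an `efficiency` factor is supplied.

```python
def transfer_hours(terabytes: float, gigabits_per_second: float,
                   efficiency: float = 1.0) -> float:
    """Hours to move `terabytes` at a sustained link rate.

    `efficiency` is the fraction of the nominal rate actually achieved.
    """
    bits = terabytes * 1e12 * 8
    seconds = bits / (gigabits_per_second * 1e9 * efficiency)
    return seconds / 3600


# Hypothetical link speeds for comparison.
for gbps in (0.1, 1, 10):
    print(f"{gbps:>4} Gbit/s: {transfer_hours(17, gbps):7.1f} h")
```

At a sustained 1 Gbit/s the transfer takes about a day and a half; at a tenth of that it takes more than two weeks, which is the range where physically shipping a NAS becomes competitive.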
58. Effective use of the Cloud, whether public or private, requires thought and planning
The Public Cloud provides a significant opportunity to extend the capability and capacity of existing research infrastructure
The ability to scale very rapidly with few constraints is attractive for Data Intensive Research
The Public Cloud offers access to significant infrastructure as large upfront capital investments become scarce
59. RDSI is establishing a blog to document the journey rather than wait until the end to write a report
https://www.rdsi.edu.au/aws-test
60. THANK YOU
Please give us your feedback by filling out the Feedback Forms