© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Deep Dive on Amazon EFS
Edward Naim, Head of Product, Amazon EFS
Darryl Osborne, Storage Specialist Solutions Architect
David Green, Enterprise Solutions Architect
February 23rd, 2017
What to expect from this webinar
• Learn why and when to use Amazon EFS
• Understand key technical & security concepts
• Discover how to leverage EFS’s performance
• See EFS in action: Hands-on demos
• Review EFS’s economics
• Answer your questions (Q&A)
Why & When to Use Amazon EFS
Cloud Data Migration
• Direct Connect
• Snow* data transport family
• 3rd Party Connectors
• Transfer Acceleration
• Storage Gateway
• Kinesis Firehose
AWS Storage Platform and Solutions
The AWS Storage Portfolio
• Object: Amazon S3, Amazon Glacier
• Block: Amazon EBS (persistent), Amazon EC2 Instance Store (ephemeral)
• File: Amazon EFS
Amazon EFS attributes
1) Standard file system interface & semantics
2) Shared storage
3) Highly available
4) Highly durable
5) Consistent, low latencies
6) Scalable (storage & throughput)
7) Elastic capacity
8) Fully managed
We focused on changing the game
1) Simple
2) Elastic
3) Scalable
Highly durable
Highly available
Amazon EFS is Simple
• Fully managed
- No hardware, network, file layer
- Create a scalable file system in seconds!
• Seamless integration with existing tools and apps
- NFS v4.1—widespread, open
- Standard file system access semantics
- Works with standard OS file system APIs
• Simple pricing = simple forecasting
Amazon EFS is Elastic
• File systems grow and shrink automatically as
you add and remove files
• No need to provision storage capacity or
performance
• You pay only for the storage space you use,
with no minimum fee
Amazon EFS is Scalable
• File systems can grow to petabytes of capacity
• Throughput scales automatically as file systems grow
• Consistent low latencies regardless of file system size
• Support for thousands of concurrent NFS connections
High Durability & High Availability
• Every file system object is redundantly stored across multiple Availability Zones in a Region
• Designed to sustain Availability Zone offline conditions
• Superior to traditional NAS availability models
• Appropriate for production/tier 0 applications
In which Regions can I use EFS today?
• US West (Oregon)
• US East (N. Virginia)
• US East (Ohio)
• EU (Ireland)
More coming soon!
Do you need an EFS file system?
If you have an application (EC2 or on-premises) or use
case that requires a file system AND
• Requires multi-attach OR
• GBs/s throughput OR
• Multi-AZ availability/durability OR
• Requires automatic scaling (grow/shrink) of storage
What customers are using EFS for today
• Web serving
• Content management
• Analytics
• Media and Entertainment workflows
• Workflow management
• Home directories
• Container storage
• Database backups
Understand Key Technical and Security Concepts
What is a file system?
• The primary resource in EFS
• Where you store files and directories
• Can create 125 file systems per account
What is a mount target?
• To access your file system within
a VPC, you create mount targets
in the VPC
• A mount target is an NFS endpoint
that lives in your VPC
• A mount target has an IP address
and a DNS name you use in your
mount command
• A mount target is highly available
[Diagram: a VPC in a Region spanning Availability Zones 1, 2, and 3, with EC2 instances and a mount target]
How to access a file system from an instance
• You “mount” a file system on an Amazon EC2 instance (standard command); the file system then appears like a local set of directories and files
• An NFS v4.1 client is standard on Linux distributions
mount -t nfs4 -o nfsvers=4.1 [file system DNS name]:/ /[user’s target directory]
How does it all fit together?
[Diagram: a VPC in a Region spanning Availability Zones 1, 2, and 3, with EC2 instances and the file system]
Data can be accessed from any AZ in the Region while maintaining full consistency
Several security mechanisms
• Control network traffic to and from file systems (mount targets) by using VPC security groups and network ACLs
• Control file and directory access by using POSIX permissions
• Control administrative access (API access) to file systems by using AWS Identity and Access Management (IAM)
• EFS supports action-level and resource-level permissions
Access your EFS file system via AWS Direct Connect
[Diagram: on-premises servers connected through Direct Connect to EFS in your Amazon VPC]
Direct Connect support addresses three of four hybrid scenarios
• Bursting
• Migration
• Tiering
• Backup / DR
Learn How to Leverage EFS’s Performance
Amazon EFS is designed for a wide spectrum of performance needs
• High throughput and parallel I/O
- Genomics
- Big data analytics
- Scale-out jobs
• Low latency and serial I/O
- Home directories
- Content management
- Web serving
- Metadata-intensive jobs
Choose the performance mode best suited to your workload
• General purpose (default)
- What’s it for? Latency-sensitive applications and general-purpose workloads
- Advantages: Lowest latencies for file operations
- Tradeoffs: Limit of 7,000 ops/sec
- When to use: Best choice for most workloads
• Max I/O
- What’s it for? Large-scale and data-heavy applications
- Advantages: Virtually unlimited ability to scale out throughput/IOPS
- Tradeoffs: Slightly higher latencies
- When to use: Consider if 10s (or more) instances access your file system concurrently
Use the PercentIOLimit CloudWatch metric to determine if you’re constrained by General Purpose mode
Amazon EFS has a distributed data storage design
• File systems distributed across
unconstrained number of servers
• Avoids bottlenecks/constraints of
traditional file servers
• Enables high levels of aggregate
IOPS/throughput
• Data also distributed across
Availability Zones (durability,
availability)
How to think about EFS perf relative to EBS
Performance
• Per-operation latency
- Amazon EFS: Low, consistent
- Amazon EBS PIOPS: Lowest, consistent
• Throughput scale
- Amazon EFS: Multiple GBs per second
- Amazon EBS PIOPS: Single GB per second
Characteristics
• Data availability / durability
- Amazon EFS: Stored redundantly across multiple AZs
- Amazon EBS PIOPS: Stored redundantly in a single AZ
• Access
- Amazon EFS: 1 to 1000s of EC2 instances, from multiple AZs, concurrently
- Amazon EBS PIOPS: Single EC2 instance in a single AZ
• Use cases
- Amazon EFS: Big Data and analytics, media processing workflows, content management, web serving, home directories
- Amazon EBS PIOPS: Boot volumes, transactional and NoSQL databases, data warehousing & ETL
An implication of per-operation latency: I/O size
impacts throughput of serialized operations
[Chart: throughput (y-axis) versus I/O size (x-axis: 4 KB, 32 KB, 256 KB, 2 MB, 16 MB)]
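One rough way to see why I/O size matters: when operations are issued one at a time, each one pays the per-operation latency before the next can start, so small I/Os spend most of their time waiting. The sketch below models the time per operation as latency plus transfer time. The function name, the 5 ms latency, and the 100 MB/s transfer rate are illustrative assumptions, not EFS measurements.

```shell
# serial_throughput_mbps SIZE_KB LATENCY_MS LINK_MBPS
# Estimated MB/s for strictly serialized I/O under a simple model:
#   time per op = fixed latency + size / transfer rate
# All three inputs are hypothetical values supplied by the caller.
serial_throughput_mbps() {
  LC_ALL=C awk -v kb="$1" -v ms="$2" -v link="$3" 'BEGIN {
    size_mb = kb / 1024
    secs    = ms / 1000 + size_mb / link
    printf "%.2f\n", size_mb / secs
  }'
}

# Walk the I/O sizes shown on the slide with the assumed latency and rate.
for size_kb in 4 32 256 2048 16384; do
  printf '%6s KB -> %s MB/s\n' "$size_kb" \
    "$(serial_throughput_mbps "$size_kb" 5 100)"
done
```

Under this model, throughput climbs steeply with I/O size and then flattens as transfer time starts to dominate latency.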
How to take advantage of EFS’s distributed architecture:
Parallelize
Parallelize via multiple threads and/or multiple instances
[Chart: Aggregate IOPS of parallel writes using 10 m4.xlarge instances; x-axis: # of total threads (0 to 160), y-axis: IOPS (0 to 30,000)]
Use CloudWatch for a number of views of file system performance
• DataReadIOBytes, DataWriteIOBytes, MetadataIOBytes, TotalIOBytes: Measure throughput (‘Sum’ of bytes divided by seconds in time period) or ops/sec (‘Data Samples’ divided by seconds in time period)
• BurstCreditBalance: Monitor your burst credit usage over time to ensure sufficient throughput capacity
• PermittedThroughput: Compare to actual throughput to determine whether you’re being constrained by the burst model
• ClientConnections: View the number of clients connected to your file system
• PercentIOLimit: Determine whether you’re being constrained by General Purpose mode (PercentIOLimit at or near 100%)
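The throughput and ops/sec arithmetic described for the IOBytes metrics can be sketched as follows. The function names and the sample numbers are hypothetical; in practice the ‘Sum’ and ‘Data Samples’ values would be read from CloudWatch for the chosen period.

```shell
# avg_throughput_mib_s SUM_BYTES PERIOD_SECONDS
# Average throughput in MiB/s: 'Sum' of bytes divided by seconds in the period.
avg_throughput_mib_s() {
  LC_ALL=C awk -v sum="$1" -v secs="$2" \
    'BEGIN { printf "%.2f\n", sum / secs / 1048576 }'
}

# avg_ops_per_sec SAMPLE_COUNT PERIOD_SECONDS
# Average operations per second: 'Data Samples' divided by seconds in the period.
avg_ops_per_sec() {
  LC_ALL=C awk -v n="$1" -v secs="$2" \
    'BEGIN { printf "%.1f\n", n / secs }'
}

# Hypothetical one-minute datapoint for TotalIOBytes.
sum_bytes=6291456000
samples=42000
period=60
echo "throughput: $(avg_throughput_mib_s "$sum_bytes" "$period") MiB/s"
echo "ops/sec:    $(avg_ops_per_sec "$samples" "$period")"
```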
Recommended kernel version and NFS mount options
• Kernel version
- Use Linux kernel 4.0+ (e.g., Amazon Linux 2016.03.0, Ubuntu 15.10 or 16.04)
• Mount options
- Mount via NFSv4.1
- Specify 1 MB read/write buffers (“rsize”/“wsize”)
- Ensure operations are asynchronous
Recommend the following mount options:
-o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,async
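As a sketch of how the recommended options might be applied, the helper below assembles the full mount command and prints it rather than running it, so it can be reviewed first. The function name, the example DNS name, and the /mnt/efs mount point are placeholders, not values from the webinar.

```shell
# Recommended options from the slide, kept in one place.
EFS_MOUNT_OPTS="nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,async"

# efs_mount_cmd FS_DNS_NAME MOUNT_POINT
# Prints the mount command; it does not execute it.
efs_mount_cmd() {
  printf 'mount -t nfs4 -o %s %s:/ %s\n' "$EFS_MOUNT_OPTS" "$1" "$2"
}

# Placeholder DNS name and mount point; substitute your own.
efs_mount_cmd "fs-12345678.efs.us-west-2.amazonaws.com" "/mnt/efs"
```

Running the printed command typically requires root privileges and an existing mount point directory.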
See EFS in Action: Move Data
Goal: Move Data Quickly!!
Two Scenarios:
Transferring media assets to EFS
• Size ranges from a few GB to
100+GB per file
• Data sources:
• Amazon S3
• Amazon EBS
Transferring many small files to EFS
• Size ranges from 64K to 256K
• Data sources:
• Amazon S3
• Amazon EBS
Serial vs Parallel
Serial file transfer
Parallel file transfer
How do we do this?
GNU parallel
• Tool for executing jobs in parallel
• Similar to xargs
• Replace loops in shell scripts
• GNU parallel makes sure output
from the commands is the same
output as you would get if you had
run the commands sequentially
https://www.gnu.org/software/parallel/
For people who live life in the parallel lane
Use parallel threads – GNU parallel
# Create destination directory tree from source
find . -type d -print0 | parallel -j "$N_THREADS" -0 "mkdir -p ${DST_DIR}/{}" > /dev/null 2>&1
# Copy files
find . ! \( -type d \) -print0 | parallel -j "$N_THREADS" -0 "cp -f {} ${DST_DIR}/{}"
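Where GNU parallel is not installed, a similar two-pass copy can be approximated with find and xargs -P, the tool the deck compares GNU parallel to. This is a sketch of that alternative, not the method used in the demo; the function name parallel_copy and its default of 4 workers are assumptions.

```shell
# parallel_copy SRC_DIR DST_DIR [WORKERS]
# Copies a directory tree using WORKERS concurrent processes (default 4).
parallel_copy() {
  local src="$1" dst="$2" workers="${3:-4}"
  mkdir -p "$dst" || return 1
  dst="$(cd "$dst" && pwd)" || return 1   # absolute path, because we cd into src below
  (
    cd "$src" || exit 1
    # Pass 1: recreate the directory tree at the destination.
    find . -type d -print0 |
      xargs -0 -P "$workers" -I{} mkdir -p "$dst/{}"
    # Pass 2: copy everything that is not a directory, one cp per file.
    find . ! -type d -print0 |
      xargs -0 -P "$workers" -I{} cp -f {} "$dst/{}"
  )
}

# Example (hypothetical paths): parallel_copy /data/assets /mnt/efs/assets 32
```

Creating the directories in a separate first pass avoids a race where a file copy runs before its parent directory exists.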
Optimizing Transfers
Monitoring performance
• Data-driven results
• Repeatable outcomes
• Optimize for costs
Benchmark different instance types
• Determine the optimal instance size
• What is best? T2, C3, C4, M3, M4, R3, X?
• Transfer test set of 1000 small files
• Increase thread count from 1 to 1024 concurrent threads
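A thread-count sweep like the one described could be scripted roughly as below. The doubling schedule (1, 2, 4, and so on up to 1024), the function names, and the use of xargs -P as the worker pool are assumptions for illustration; the webinar does not say which steps or tooling were used.

```shell
# thread_counts [MAX]
# Prints 1, 2, 4, ... up to MAX (default 1024), one value per line.
thread_counts() {
  local n=1 max="${1:-1024}"
  while [ "$n" -le "$max" ]; do
    echo "$n"
    n=$(( n * 2 ))
  done
}

# time_copy SRC_DIR DST_DIR THREADS
# Copies the regular files directly under SRC_DIR into DST_DIR using THREADS
# concurrent cp processes and prints the elapsed wall-clock time in whole seconds.
time_copy() {
  local src="$1" dst="$2" threads="$3" start end
  mkdir -p "$dst" || return 1
  start=$(date +%s)
  find "$src" -maxdepth 1 -type f -print0 |
    xargs -0 -P "$threads" -I{} cp -f {} "$dst/"
  end=$(date +%s)
  echo $(( end - start ))
}

# Example sweep (hypothetical paths):
#   for t in $(thread_counts 1024); do
#     echo "$t threads: $(time_copy ./testset "/mnt/efs/run-$t" "$t") s"
#   done
```

Whole-second timing is coarse for a small test set, so a finer-grained timer would be a reasonable substitution when the runs are short.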
Tools
• Command orchestration
• Instance configuration
• Log collection
• Visualization
• Instance performance
Test Results – Large Files
Large Files: Four Instances
Adding Additional Instances
Large File: 50 Instances
Test Results – Small Files
Small File Performance - Instance Family Test
~200 threads
c3.large – 5,342 files per minute @ 200 threads
Increase Instance Count
• Using optimal instance size
• c3.large
• Using optimal thread counts
• ~200 per instance
• Increase instance count
• 300 instances
• Optimize for costs
• EC2 Spot Market
EC2 Spot
c3.large – 300 instances
Summary / tl;dr
Results
Small files – 300 instances
Large files – 50 instances
Demo
Summary / tl;dr
• Parallelize everything
• Threads
• Instances
• Test, test, test
• Capture & analyze test data
• Less than $5/hr for 300 instances
See EFS in Action: Web Serving
Content Management & Web Serving
Web-based applications for creating and managing website content.
• wikis
• blogs
• discussion boards
Free and open-source content management system hosted
on a web platform
Web software to create beautiful websites, blogs, or apps
“Free and priceless at the same time” – WordPress.org
CODE IS POETRY
WordPress is Popular
• 27% of all websites (November 2016) – Web Technology Surveys
• Easiest and most popular blogging system in use on the Web – CMS Usage Statistics
• Supporting more than 60 million websites – Forbes
How are people running WordPress today?
Available as:
• Managed Web Hosting Service
• Software package from WordPress.org installed on a self-provisioned web platform, like AWS
• Amazon RDS: Structured data (posts, pages, comments, categories, tags, etc.)
• Amazon EFS: Unstructured data (directories, PHP files, config, themes, plugins, etc.)
• Amazon EC2: Web server (Amazon Linux, Apache, PHP, OPCache)
WordPress Demo
Reference Architecture https://aws.amazon.com/architecture/
Coming Soon
Economics
Simple and predictable pricing
• With Amazon EFS, you pay only for the storage space you use
- No minimum commitments or up-front fees
- No need to provision storage in advance
- No other fees, charges, or billing dimensions
• EFS price: $0.30/GB-month (US Regions), $0.33/GB-month (EU Ireland)
Typical multi-AZ file system setup without EFS
[Diagram: in a Region with Availability Zones 1, 2, and 3, an EC2 NFS client accesses the file system over NFS; EC2 compute nodes manage a 3rd-party file system layer on replicated EBS storage volumes, with inter-AZ traffic for replication]
TCO example
Let’s say you need to store ~500 GB and require high availability and durability
Using a shared file layer on top of EBS, you might provision 600 GB (with ~85% utilization)
and fully replicate the data to a second Availability Zone for availability/durability
Example comparative cost:
Storage (2x 600 GB EBS gp2 volumes): $120 per month
Compute (2x m4.xlarge instances): $350 per month
Inter-AZ data transfer costs (est.): $129 per month
Total $599 per month
EFS cost is (500GB * $0.30/GB-month) = $150 per month, with no additional charges
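The comparison above is plain arithmetic and can be checked with a few lines of shell. The dollar figures are the slide's own monthly estimates; the function names are illustrative.

```shell
# diy_monthly_cost
# Sums the slide's monthly estimates for the shared file layer on EBS.
diy_monthly_cost() {
  local storage=120 compute=350 inter_az=129   # USD per month, from the slide
  echo $(( storage + compute + inter_az ))
}

# efs_monthly_cost GB CENTS_PER_GB_MONTH
# Price is passed in cents so the calculation stays in integer arithmetic;
# the result is truncated to whole dollars.
efs_monthly_cost() {
  echo $(( $1 * $2 / 100 ))
}

diy=$(diy_monthly_cost)
efs=$(efs_monthly_cost 500 30)
echo "Shared file layer on EBS: \$${diy}/month"
echo "Amazon EFS:               \$${efs}/month"
echo "Difference:               \$$(( diy - efs ))/month"
```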
Summary
Key Recommendations
• Test your application!
• Use General Purpose mode for lowest latency, Max I/O mode for scale-out
• Use Linux kernel version 4.0 or newer, mount via NFSv4.1
• To optimize, look for opportunities to:
• Aggregate I/O
• Perform async operations
• Parallelize (demo later)
• Cache (demo later)
• Don’t forget to check your burst credit earn/spend rate when
testing – ensure sufficient amount of storage
Coming Soon: Encryption of data at rest
• Integrated with AWS Key Management Service
• Encryption/decryption handled transparently
• No extra cost
Additional Resources
Amazon EFS Site
- https://aws.amazon.com/efs/
Amazon EFS User Guide
- https://docs.aws.amazon.com/efs/latest/ug/whatisefs.html
AWS 10-Minute Tutorials
- https://aws.amazon.com/getting-started/tutorials/
Reference Architecture - WordPress on EFS coming soon
- https://aws.amazon.com/architecture/
qwikLABS
- https://aws.qwiklabs.com/
YouTube: Amazon Web Services Channel
Thank you!

Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 

Deep Dive on Elastic File System - February 2017 AWS Online Tech Talks

  • 1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Edward Naim, Head of Product, Amazon EFS Darryl Osborne, Storage Specialist Solutions Architect David Green, Enterprise Solutions Architect February 23rd, 2017 Deep Dive on Amazon EFS
  • 2. Learn why and when to use Amazon EFS Understand key technical & security concepts Discover how to leverage EFS’s performance See EFS in action: Hands-on demos Review EFS’s economics Answer your questions (Q&A) What to expect from this webinar
  • 3. Why & When to Use Amazon EFS
  • 4. The AWS Storage Portfolio (AWS Storage Platform and Solutions). Cloud Data Migration: Direct Connect, Snow* data transport family, 3rd Party Connectors, Transfer Acceleration, Storage Gateway, Kinesis Firehose. Object: Amazon S3, Amazon Glacier. Block: Amazon EBS (persistent), Amazon EC2 Instance Store (ephemeral). File: Amazon EFS
  • 5. Amazon EFS attributes 1) Standard file system interface & semantics 2) Shared storage 3) Highly available 4) Highly durable 5) Consistent, low latencies 6) Scalable (storage & throughput) 7) Elastic capacity 8) Fully managed
  • 6. We focused on changing the game Simple Elastic Scalable 1 2 3 Highly durable Highly available
  • 7. Amazon EFS is Simple • Fully managed - No hardware, network, file layer - Create a scalable file system in seconds! • Seamless integration with existing tools and apps - NFS v4.1—widespread, open - Standard file system access semantics - Works with standard OS file system APIs • Simple pricing = simple forecasting 1
  • 8. Amazon EFS is Elastic • File systems grow and shrink automatically as you add and remove files • No need to provision storage capacity or performance • You pay only for the storage space you use, with no minimum fee 2
  • 9. • File systems can grow to petabytes of capacity • Throughput scales automatically as file systems grow • Consistent low latencies regardless of file system size • Support for thousands of concurrent NFS connections Amazon EFS is Scalable 3
  • 10. • Every file system object is redundantly stored across multiple Availability Zones in a Region • Designed to sustain Availability Zone offline conditions • Superior to traditional NAS availability models • Appropriate for production/tier 0 applications High Durability & High Availability
  • 11. In which Regions can I use EFS today? • US West (Oregon) • US East (N. Virginia) • US East (Ohio) • EU (Ireland) More coming soon!
  • 12. Do you need an EFS file system? If you have an application (EC2 or on-premises) or use case that requires a file system AND • Requires multi-attach OR • GBs/s throughput OR • Multi-AZ availability/durability OR • Requires automatic scaling (grow/shrink) of storage
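The AND/OR rule on this slide can be read as a small predicate. The following is only a sketch of that reading; the function and parameter names are made up for illustration and are not part of any AWS API.

```python
def needs_efs(requires_file_system,
              multi_attach=False,
              gbs_per_second_throughput=False,
              multi_az_availability=False,
              automatic_storage_scaling=False):
    """Return True when the slide's rule suggests EFS is worth considering.

    The rule: the application needs a file system AND at least one of
    the four additional requirements applies.
    """
    any_extra_requirement = any((
        multi_attach,
        gbs_per_second_throughput,
        multi_az_availability,
        automatic_storage_scaling,
    ))
    return bool(requires_file_system and any_extra_requirement)


if __name__ == "__main__":
    # A web fleet sharing content across several instances.
    print(needs_efs(True, multi_attach=True))
    # A single instance with a fixed-size scratch volume.
    print(needs_efs(True))
```

Note that a file system requirement alone does not satisfy the rule; at least one of the four additional conditions has to hold as well.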
  • 13. What customers are using EFS for today Web serving Content management Analytics Media and Entertainment workflows Workflow management Home directories Container storage Database backups
  • 14. Understand Key Technical and Security Concepts
  • 15. What is a file system? • The primary resource in EFS • Where you store files and directories • Can create 125 file systems per account
  • 16. What is a mount target? • To access your file system within a VPC, you create mount targets in the VPC • A mount target is an NFS endpoint that lives in your VPC • A mount target has an IP address and a DNS name you use in your mount command • A mount target is highly available [Diagram labels: Region; VPC; Availability Zones 1, 2, and 3; EC2 instances; mount target]
  • 17. How to access a file system from an instance • You “mount” a file system on an Amazon EC2 instance (standard command); the file system then appears like a local set of directories and files • An NFS v4.1 client is standard on Linux distributions
      mount -t nfs4 -o nfsvers=4.1 [file system DNS name]:/ /[user’s target directory]
  • 18. How does it all fit together? Data can be accessed from any AZ in the Region while maintaining full consistency [Diagram labels: Region; VPC; Availability Zones 1, 2, and 3; EC2 instances; file system]
  • 19. Several security mechanisms • Control network traffic to and from file systems (mount targets) by using VPC security groups and network ACLs • Control file and directory access by using POSIX permissions • Control administrative access (API access) to file systems by using AWS Identity and Access Management (IAM) • EFS supports action-level and resource-level permissions
  • 20. Access your EFS file system via AWS Direct Connect [Diagram labels: On-premises servers; Direct Connect; EFS in your Amazon VPC]
  • 21. Direct Connect support addresses three of four hybrid scenarios: Bursting, Migration, Tiering, Backup / DR
  • 22. Learn How to Leverage EFS’s Performance
  • 23. Amazon EFS is designed for a wide spectrum of performance needs, from high throughput and parallel I/O to low latency and serial I/O: Genomics, Big data analytics, Scale-out jobs, Home directories, Content management, Web serving, Metadata-intensive jobs
  • 24. Choose the performance mode best suited to your workload
      Mode | What’s it for? | Advantages | Tradeoffs | When to use
      General purpose (default) | Latency-sensitive applications and general-purpose workloads | Lowest latencies for file operations | Limit of 7,000 ops/sec | Best choice for most workloads
      Max I/O | Large-scale and data-heavy applications | Virtually unlimited ability to scale out throughput/IOPS | Slightly higher latencies | Consider if 10s (or more) instances access your file system concurrently
  • 25. Use the PercentIOLimit CloudWatch metric to determine if you’re constrained by General Purpose mode
  • 26. Amazon EFS has a distributed data storage design • File systems distributed across unconstrained number of servers • Avoids bottlenecks/constraints of traditional file servers • Enables high levels of aggregate IOPS/throughput • Data also distributed across Availability Zones (durability, availability) [Diagram labels: groups of EC2 instances]
  • 27. How to think about EFS perf relative to EBS
      Attribute | Amazon EFS | Amazon EBS PIOPS
      Performance: per-operation latency | Low, consistent | Lowest, consistent
      Performance: throughput scale | Multiple GBs per second | Single GB per second
      Characteristics: data availability / durability | Stored redundantly across multiple AZs | Stored redundantly in a single AZ
      Characteristics: access | 1 to 1000s of EC2 instances, from multiple AZs, concurrently | Single EC2 instance in a single AZ
      Characteristics: use cases | Big Data and analytics, media processing workflows, content management, web serving, home directories | Boot volumes, transactional and NoSQL databases, data warehousing & ETL
  • 28. An implication of per-operation latency: I/O size impacts throughput of serialized operations [Chart: throughput versus I/O size, for I/O sizes of 4 KB, 32 KB, 256 KB, 2 MB, and 16 MB]
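The effect on this slide can be illustrated with a simple model: when one thread issues operations back to back, each operation pays a fixed latency before its data moves, so small I/Os spend most of their time waiting. The sketch below assumes a 3 ms per-operation latency and a 100 MiB/s transfer rate; both numbers are illustrative assumptions, not figures from the talk.

```python
KIB = 1024
MIB = 1024 * KIB

# Illustrative assumptions only, not measured EFS figures.
ASSUMED_LATENCY_SECONDS = 0.003
ASSUMED_TRANSFER_BYTES_PER_SECOND = 100 * MIB


def serialized_throughput(io_size_bytes,
                          latency_seconds=ASSUMED_LATENCY_SECONDS,
                          transfer_rate=ASSUMED_TRANSFER_BYTES_PER_SECOND):
    """Bytes per second for one thread issuing operations one after another.

    Each operation costs a fixed latency plus the time to move its bytes.
    """
    seconds_per_operation = latency_seconds + io_size_bytes / transfer_rate
    return io_size_bytes / seconds_per_operation


if __name__ == "__main__":
    # The same I/O sizes that appear on the slide's chart.
    for size in (4 * KIB, 32 * KIB, 256 * KIB, 2 * MIB, 16 * MIB):
        rate = serialized_throughput(size) / MIB
        print(f"{size // KIB:>6} KiB per operation -> {rate:6.1f} MiB/s")
```

Under these assumptions throughput rises with I/O size and approaches, but never reaches, the transfer rate, which is why aggregating I/O helps serialized workloads.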
  • 29. How to take advantage of EFS’s distributed architecture: Parallelize. Parallelize via multiple threads and/or multiple instances [Chart: aggregate IOPS of parallel writes using 10 m4.xlarge instances; x-axis: # of total threads (0 to 160); y-axis: IOPS (0 to 30,000)]
  • 30. Use CloudWatch for a number of views of file system performance
      DataReadIOBytes, DataWriteIOBytes, MetadataIOBytes, TotalIOBytes: Measure throughput (‘Sum’ of bytes divided by seconds in time period) or ops/sec (‘Data Samples’ divided by seconds in time period)
      BurstCreditBalance: Monitor your burst credit usage over time to ensure sufficient throughput capacity
      PermittedThroughput: Compare to actual throughput to determine whether you’re being constrained by the burst model
      ClientConnections: View the number of clients connected to your file system
      PercentIOLimit: Determine whether you’re being constrained by General Purpose mode (PercentIOLimit at or near 100%)
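The arithmetic described on this slide can be sketched as a few helper functions. The datapoint values, the 5% tolerance, and the 95% threshold below are hypothetical choices for illustration; fetching the datapoints from CloudWatch is left out.

```python
def throughput_bytes_per_second(sum_bytes, period_seconds):
    """'Sum' of an *IOBytes metric divided by the seconds in the period."""
    return sum_bytes / period_seconds


def operations_per_second(sample_count, period_seconds):
    """Number of data samples divided by the seconds in the period."""
    return sample_count / period_seconds


def constrained_by_burst_model(actual_bps, permitted_bps, tolerance=0.05):
    """True when actual throughput sits within `tolerance` of PermittedThroughput.

    The 5% default is an assumed cutoff, not an AWS recommendation.
    """
    return actual_bps >= permitted_bps * (1 - tolerance)


def constrained_by_general_purpose(percent_io_limit, threshold=95.0):
    """True when PercentIOLimit is at or near 100% (threshold is assumed)."""
    return percent_io_limit >= threshold


if __name__ == "__main__":
    # Hypothetical one-minute TotalIOBytes datapoint.
    datapoint = {"Sum": 6_291_456_000, "SampleCount": 120_000}
    period_seconds = 60
    bps = throughput_bytes_per_second(datapoint["Sum"], period_seconds)
    ops = operations_per_second(datapoint["SampleCount"], period_seconds)
    print(f"{bps / 1024 ** 2:.1f} MiB/s, {ops:.0f} ops/s")
```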
  • 31. Recommended kernel version and NFS mount options. Kernel version • Use Linux kernel 4.0+ (e.g., Amazon Linux 2016.03.0, Ubuntu 15.10 or 16.04) Mount options • Mount via NFSv4.1 • Specify 1MB read/write buffers (“rsize”/“wsize”) • Ensure operations are asynchronous. Recommend the following mount options:
      -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,async
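Mount options must be passed as a single comma-separated string with no spaces, which is easy to get wrong when copying from a slide. The sketch below assembles the recommended options into a full mount command; the DNS name and mount point are placeholders, and the use of `sudo` is an assumption about the environment.

```python
import shlex

# Options recommended on the slide; None marks a flag with no value.
RECOMMENDED_OPTIONS = {
    "nfsvers": "4.1",
    "rsize": "1048576",
    "wsize": "1048576",
    "hard": None,
    "timeo": "600",
    "retrans": "2",
    "async": None,
}


def mount_options(options):
    """Join options into the comma-separated form expected by `mount -o`."""
    return ",".join(
        key if value is None else f"{key}={value}"
        for key, value in options.items()
    )


def mount_command(dns_name, mount_point, options=RECOMMENDED_OPTIONS):
    """Build the argument list for mounting the file system root."""
    return ["sudo", "mount", "-t", "nfs4", "-o", mount_options(options),
            f"{dns_name}:/", mount_point]


if __name__ == "__main__":
    # Placeholder values; substitute your own file system DNS name and path.
    command = mount_command("fs-12345678.efs.us-west-2.amazonaws.com", "/mnt/efs")
    print(" ".join(shlex.quote(part) for part in command))
```

Building the command as a list keeps each argument separate, so it can be passed to `subprocess.run` without shell quoting concerns.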
  • 32. See EFS in Action: Move Data
  • 33. Goal: Move Data Quickly!!
  • 35. Transferring media assets to EFS • Size ranges from a few GB to 100+ GB per file • Data sources: Amazon S3, Amazon EBS
  • 36. Transferring many small files to EFS • Size ranges from 64K to 256K • Data sources: Amazon S3, Amazon EBS
  • 40. How do we do this?
  • 41. GNU parallel • Tool for executing jobs in parallel • Similar to xargs • Replace loops in shell scripts • GNU parallel makes sure output from the commands is the same output as you would get if you had run the commands sequentially https://www.gnu.org/software/parallel/ For people who live life in the parallel lane
  • 42. Use parallel threads – GNU parallel
      # Create destination directory tree from source
      find . -type d -print0 | parallel -j $N_THREADS -0 "mkdir -p ${DST_DIR}/{}" > /dev/null 2>&1
      # Copy files
      find . ! \( -type d \) -print0 | parallel -j $N_THREADS -0 "cp -f {} ${DST_DIR}/{}"
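The two-step pattern on this slide (recreate the directory tree first, then copy files concurrently) can also be sketched with a thread pool. This is not the presenters' tooling, which was GNU parallel; the function name and the default of 32 threads are assumptions for illustration.

```python
import os
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor


def copy_tree_parallel(src_dir, dst_dir, n_threads=32):
    """Copy src_dir into dst_dir using n_threads concurrent file copies.

    Returns the number of files copied.
    """
    relative_dirs, relative_files = [], []
    for root, _subdirs, names in os.walk(src_dir):
        rel_root = os.path.relpath(root, src_dir)
        relative_dirs.append(rel_root)
        relative_files.extend(os.path.join(rel_root, name) for name in names)

    # Step 1: create the destination directory tree up front.
    for rel_dir in relative_dirs:
        os.makedirs(os.path.normpath(os.path.join(dst_dir, rel_dir)),
                    exist_ok=True)

    # Step 2: copy files concurrently; each copy is independent.
    def copy_one(rel_path):
        shutil.copy2(os.path.join(src_dir, rel_path),
                     os.path.normpath(os.path.join(dst_dir, rel_path)))
        return rel_path

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return len(list(pool.map(copy_one, relative_files)))


if __name__ == "__main__":
    # Small self-contained demonstration using temporary directories.
    with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
        os.makedirs(os.path.join(src, "media"))
        with open(os.path.join(src, "media", "clip.bin"), "wb") as f:
            f.write(b"\0" * 4096)
        print(copy_tree_parallel(src, os.path.join(dst, "copy"), n_threads=4))
```

Creating the directories before starting the copies avoids races between threads that would otherwise try to create the same parent directory.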
  • 44. Monitoring performance • Data-driven results • Repeatable outcomes • Optimize for costs
  • 45. Benchmark different instance types • Determine the optimal instance size • What is best? T2, C3, C4, M3, M4, R3, X? • Transfer test set of 1000 small files • Increase thread count from 1-1024 concurrent threads
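A thread-count sweep like the one described here can be sketched as follows: write a fixed set of small files at each thread count and record files per minute. The 64 KiB file size and the 1000-file test set mirror the slides; the function names and the directory layout are assumptions, and the target directory would need to sit on the mounted file system to measure EFS rather than local disk.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def write_small_file(path, size_bytes):
    """Write one file of size_bytes zero bytes."""
    with open(path, "wb") as f:
        f.write(b"\0" * size_bytes)


def files_per_minute(target_dir, n_files, n_threads, file_size=64 * 1024):
    """Write n_files with n_threads workers and return the files/minute rate."""
    paths = [os.path.join(target_dir, f"file_{i:05d}.bin") for i in range(n_files)]
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        list(pool.map(lambda p: write_small_file(p, file_size), paths))
    elapsed = time.perf_counter() - start
    return n_files / elapsed * 60


def sweep(base_dir, n_files=1000, thread_counts=(1, 2, 4, 8, 16, 32, 64)):
    """Run the same write test at each thread count; returns {threads: rate}."""
    results = {}
    for n_threads in thread_counts:
        run_dir = os.path.join(base_dir, f"threads_{n_threads}")
        os.makedirs(run_dir)
        results[n_threads] = files_per_minute(run_dir, n_files, n_threads)
    return results


if __name__ == "__main__":
    # Uses a temporary local directory; point base_dir at an EFS mount to test EFS.
    with tempfile.TemporaryDirectory() as base:
        for threads, rate in sweep(base, n_files=200).items():
            print(f"{threads:>4} threads: {rate:,.0f} files/minute")
```

Recording the rate per thread count is what makes the result data-driven and repeatable: the same sweep can be rerun on each instance type under comparison.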
  • 46. Tools • Command orchestration • Instance configuration • Log collection • Visualization • Instance performance
  • 47. Test Results – Large Files
  • 48. Large Files: Four Instances
  • 49. Large Files: Four Instances
  • 51. Large File: 50 Instances
  • 52. Test Results – Small Files
  • 53. Small File Performance - Instance Family Test ~200 threads
  • 54. c3.large – 5,342 files per minute @ 200 threads
  • 55. Increase Instance Count • Using optimal instance size: c3.large • Using optimal thread counts: ~200 per instance • Increase instance count: 300 instances • Optimize for costs: EC2 Spot Market
  • 57. c3.large – 300 instances
  • 59. Results: Small files – 300 instances; Large files – 50 instances
  • 60. Demo
  • 61. Summary / tl;dr • Parallelize everything: threads, instances • Test, test, test • Capture & analyze test data • Less than $5/hr for 300 instances
  • 62. See EFS in Action: Web Serving
  • 63. Content Management & Web Serving Web-based applications for creating and managing website content. wikis blogs discussion boards
  • 64. Free and open-source content management system hosted on a web platform Web software to create beautiful websites, blogs, or apps “Free and priceless at the same time” – WordPress.org CODE IS POETRY
  • 65. WordPress is Popular • 27% of all websites (November 2016) – Web Technology Surveys • Easiest and most popular blogging system in use on the Web – CMS Usage Statistics • Supporting more than 60 million websites – Forbes
  • 66. How are people running WordPress today? Available as: • Managed Web Hosting Service • Software package from WordPress.org installed on a self-provisioned web platform, like AWS
  • 67. Amazon EC2: Web server (Amazon Linux, Apache, PHP, OPCache) • Amazon RDS: Structured data (posts, pages, comments, categories, tags, etc.) • Amazon EFS: Unstructured data (directories, php files, config, themes, plugins, etc.)
  • 71. Simple and predictable pricing • With Amazon EFS, you pay only for the storage space you use • No minimum commitments or up-front fees • No need to provision storage in advance • No other fees, charges, or billing dimensions • EFS price: $0.30/GB-month (US Regions), $0.33/GB-month (EU Ireland)
  • 72. Typical multi-AZ file system setup without EFS [Diagram labels: Region; Availability Zones 1, 2, and 3; EC2 compute nodes to manage 3rd-party file system layer; EBS replicated storage volumes; inter-AZ traffic for replication; EC2 NFS client accessing file system over NFS]
  • 73. TCO example. Let’s say you need to store ~500 GB and require high availability and durability. Using a shared file layer on top of EBS, you might provision 600 GB (with ~85% utilization) and fully replicate the data to a second Availability Zone for availability/durability.
      Example comparative cost:
      Storage (2x 600 GB EBS gp2 volumes): $120 per month
      Compute (2x m4.xlarge instances): $350 per month
      Inter-AZ data transfer costs (est.): $129 per month
      Total: $599 per month
      EFS cost is (500 GB * $0.30/GB-month) = $150 per month, with no additional charges
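The comparison on this slide is straightforward arithmetic, sketched below so the inputs can be changed. All prices are the 2017 figures quoted in the deck; the function and variable names are made up for illustration.

```python
# EFS prices quoted in the deck (USD per GB-month, 2017).
EFS_PRICE_PER_GB_MONTH = {"US": 0.30, "EU (Ireland)": 0.33}

# Monthly line items for the self-managed setup, as quoted on the slide.
SELF_MANAGED_MONTHLY = {
    "storage (2x 600 GB EBS gp2 volumes)": 120,
    "compute (2x m4.xlarge instances)": 350,
    "inter-AZ data transfer (est.)": 129,
}


def self_managed_total(line_items):
    """Sum the monthly line items of the EBS-based file layer."""
    return sum(line_items.values())


def efs_total(stored_gb, region="US"):
    """EFS bills only for stored data: GB stored times the regional price."""
    return stored_gb * EFS_PRICE_PER_GB_MONTH[region]


if __name__ == "__main__":
    stored_gb = 500
    diy = self_managed_total(SELF_MANAGED_MONTHLY)
    efs = efs_total(stored_gb)
    print(f"Self-managed: ${diy}/month, EFS: ${efs:.0f}/month, "
          f"difference: ${diy - efs:.0f}/month")
    # prints: Self-managed: $599/month, EFS: $150/month, difference: $449/month
```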
  • 75. Key Recommendations • Test your application! • Use General Purpose mode for lowest latency, Max I/O mode for scale-out • Use Linux kernel version 4.0 or newer, mount via NFSv4.1 • To optimize, look for opportunities to: aggregate I/O, perform async operations, parallelize, cache • Don’t forget to check your burst credit earn/spend rate when testing, and ensure you have a sufficient amount of storage
  • 76. Coming Soon: Encryption of data at rest • Integrated with AWS Key Management Service • Encryption/decryption handled transparently • No extra cost
  • 77. Additional Resources
      Amazon EFS Site - https://aws.amazon.com/efs/
      Amazon EFS User Guide - https://docs.aws.amazon.com/efs/latest/ug/whatisefs.html
      AWS 10-Minute Tutorials - https://aws.amazon.com/getting-started/tutorials/
      Reference Architecture - WordPress on EFS coming soon - https://aws.amazon.com/architecture/
      qwikLABS - https://aws.qwiklabs.com/
      YouTube: Amazon Web Services Channel