A review of how AWS EC2 storage options have evolved and how to make the right selection for your workload, covering Amazon Elastic Block Store (EBS) and Amazon Elastic File System (EFS).
2. AWS Storage is a Platform
[Diagram: AWS storage services grouped by interface – File: Amazon EFS; Block: Amazon EBS, Amazon EC2 Instance Store; Object: Amazon S3, Amazon Glacier – surrounded by data transfer options: AWS Direct Connect, ISV Connectors, Amazon Kinesis Firehose, AWS Storage Gateway, Amazon S3 Transfer Acceleration, AWS Snowball, Amazon CloudFront, Internet/VPN]
3. AWS Storage Choices
• Amazon S3 – durable object storage for all types of data
• Amazon EBS – block storage for use with Amazon EC2
• Amazon Glacier – archival storage for infrequently accessed data
• Amazon EFS – simple, scalable, and reliable file storage
4. Amazon EBS
Highly available block storage for all types of data
• Internet-scale storage – grow without limits
• Benefit from AWS's massive security investments
• Built-in redundancy – designed for 99.999% availability
• Low price per GB per month
• No commitment, no up-front cost
5. Amazon EBS use cases
• Persistent block storage for Amazon EC2
• Boot Volumes
• Enterprise Applications – Oracle, SAP, Microsoft Exchange, and
Microsoft Sharepoint
• Relational Databases – SQL, Oracle, MySQL, and PostgreSQL
• NoSQL – Cassandra, MongoDB, and Couchbase.
• Big Data – Hadoop/EMR, Kafka/stream processing, Splunk/logs,
ETL, Data Warehousing, Media Streaming
• File system for an instance - NTFS, ExtFS etc.
• Dev and Test Environments
6. AWS EBS Features
• Durable and available – designed for five 9's reliability; redundant storage across multiple devices within an AZ; designed for an annual failure rate (AFR) of 0.1%–0.2%
• Secure – identity and access policies; encryption
• Performance – low-latency SSD; consistent I/O performance; stripe multiple volumes for higher I/O performance
• Scalable – unlimited capacity when you need it; easily scale up and down
• Backup – point-in-time snapshots; copy snapshots across AZs and Regions
7. 2015–2016 Recap
Larger & faster EBS SSD volumes
• 16 TB, up to 20,000 IOPS
Performance improvements
• Eliminated pre-warming of new EBS volumes
• Eliminated the performance penalty while snapshotting
Cross-region encrypted snapshot copy
New throughput-intensive magnetic volumes
• ST1 (max 500 MB/s) – ~$0.045/GB-month
• SC1 (max 250 MB/s) – ~$0.025/GB-month
GP2 burst bucket metrics
9. EBS Volume Types: IOPS
General Purpose SSD (gp2)
• Capacity: 1 GB to 16 TB
• Baseline: 3 IOPS per GB, up to 10,000 IOPS
• Burst: 3,000 IOPS (for volumes up to 1 TB)
• Throughput: 160 MB/s
• Latency: single-digit ms
Great for boot volumes, low-latency applications, and bursty databases
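To make the gp2 numbers concrete, here is a small sketch of the baseline and burst-bucket arithmetic. The helper names are illustrative; the 5.4-million-credit bucket size and the drain model follow AWS's published description of gp2 bursting at the time, so treat them as assumptions to verify against current documentation.

```python
def gp2_baseline_iops(size_gb):
    """Baseline IOPS for a gp2 volume: 3 IOPS per GB, floor 100, cap 10,000."""
    return min(max(3 * size_gb, 100), 10_000)

def gp2_full_burst_minutes(size_gb, bucket_credits=5_400_000, burst_iops=3_000):
    """Minutes a volume can sustain the 3,000 IOPS burst from a full credit
    bucket, draining at (burst - baseline) credits per second.
    bucket_credits and the drain model are assumptions based on AWS's
    published gp2 burst-bucket description of the era."""
    baseline = gp2_baseline_iops(size_gb)
    if baseline >= burst_iops:
        return float("inf")  # at >= 1 TB, baseline meets or exceeds the burst rate
    return bucket_credits / (burst_iops - baseline) / 60

print(gp2_baseline_iops(100))               # 300 IOPS for a 100 GB volume
print(round(gp2_full_burst_minutes(100)))   # ~33 minutes at 3,000 IOPS
```

This matches the speaker note later in the deck that sub-1,000 GB volumes can burst to 3,000 IOPS for roughly half an hour.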
10. EBS Volume Types: IOPS
Provisioned IOPS SSD (io1)
• Capacity: 4 GB to 16 TB
• Baseline: 100 to 20,000 IOPS
• Throughput: 320 MB/s
• Latency: single-digit ms
Ideal for critical applications and databases with sustained IOPS
11. EBS Volume Types: Throughput
Throughput Optimized HDD (st1)
• Capacity: 500 GB to 16 TB
• Baseline: 40 MB/s per TB, up to 500 MB/s
• Burst: 250 MB/s per TB, up to 500 MB/s
Ideal for large-block, high-throughput sequential workloads
12. EBS Volume Types: Throughput
Cold HDD (sc1)
• Capacity: 500 GB to 16 TB
• Baseline: 12 MB/s per TB, up to 192 MB/s
• Burst: 80 MB/s per TB, up to 250 MB/s
Ideal for sequential throughput workloads such as logging and backup
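The st1 and sc1 throughput rules above are linear in volume size with a cap, so they are easy to sketch. The helper functions below are hypothetical names; the per-TB rates and caps come from the two slides above, with 1 TB treated as 1,000 GB (matching the speaker notes' "20 MB/s for a 500 GB volume").

```python
def hdd_throughput_mbps(size_gb, per_tb, cap):
    """Baseline or burst throughput for HDD volume types: a per-TB rate,
    capped at the per-volume maximum. Assumes 1 TB = 1,000 GB."""
    return min(size_gb / 1000 * per_tb, cap)

def st1_baseline(size_gb): return hdd_throughput_mbps(size_gb, 40, 500)
def st1_burst(size_gb):    return hdd_throughput_mbps(size_gb, 250, 500)
def sc1_baseline(size_gb): return hdd_throughput_mbps(size_gb, 12, 192)
def sc1_burst(size_gb):    return hdd_throughput_mbps(size_gb, 80, 250)

print(st1_baseline(500), sc1_baseline(500))  # 20.0 6.0 (MB/s for a 500 GB volume)
```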
13. Customer Example SSD (gp2): CrowdStrike
Cassandra on EBS vs. ephemeral storage
"Amazon EBS offered the performance we needed, at a third of the cost of the SSD-backed instance storage."
Goal: 1 million writes per second on 60 nodes with EBS
14. Customer Example: CrowdStrike, cont.
They used to believe they could never run Cassandra on EBS because of:
• Noisy neighbors (jitter)
• A single point of failure in a region
• Cost (too expensive)
• Bad volumes (spin up ten, run tests, and pick the best one)
15. Customer Example: CrowdStrike, cont.
Disadvantages of running Cassandra on ephemeral storage:
• Fewer EC2 instances offer local instance storage
• No data persistence if you stop/start the EC2 instance to resize
• I2 instances are expensive, especially when you need three of them per node for replication
• You can't snapshot the data using EBS snapshots
• No EBS volume monitoring
18. Customer Example: CrowdStrike
CrowdStrike today:
• In the past 12 months, zero Amazon EBS-related failures
• Thousands of gp2 data volumes (~2 PB of data)
• Transitioning all systems to Amazon EBS root drives
• Moved all data stores to EBS
Benefits of EBS:
• EBS volume monitoring
• Scheduled snapshots for consistent backups
• Stop/start and resize
• Half the cost (using a reserved pricing comparison)
20. Agenda
1. Overview of Amazon EFS
2. Examples of Amazon EFS use cases
3. Review of Amazon EFS technical concepts
4. Walk through experience of creating a file system
5. Discuss file system security mechanisms
6. Review the Amazon EFS performance model
7. Explore the Amazon EFS regional availability and durability model
21. AWS Storage is a Platform
[Diagram: repeat of the storage platform overview from slide 2 – file (Amazon EFS), block (Amazon EBS, EC2 Instance Store), and object (Amazon S3, Amazon Glacier) services, plus the data transfer options]
22. What is Amazon EFS?
• A fully managed file system for Amazon EC2 instances
• A file system interface with file system access semantics that works with standard operating system APIs
• Sharable across thousands of instances
• Designed to grow elastically to petabyte scale
• Built for performance across a wide variety of workloads
• Highly available and durable
• Strong consistency
23. We focused on changing the game
1. Simple
2. Elastic
3. Scalable
Highly durable, highly available
24. Amazon EFS is Simple
Fully managed
- No hardware, network, or file layer to manage
- Create a scalable file system in seconds
Seamless integration with existing tools and apps
- NFSv4.1 – widespread, open
- Standard file system access semantics
- Works with standard OS file system APIs
Simple pricing = simple forecasting
25. Amazon EFS is Elastic
File systems grow and shrink automatically as you add and remove files
No need to provision storage capacity or performance
You pay only for the storage space you use, with no minimum fee
26. Amazon EFS is Scalable
File systems can grow to petabyte scale
Throughput and IOPS scale automatically as file systems grow
Consistent low latencies regardless of file system size
Support for thousands of concurrent NFS connections
27. Highly Durable and Highly Available
Designed to sustain AZ offline conditions
Resources aggregated across multiple AZs
Superior to traditional NAS availability models
Appropriate for production / Tier 0 applications
29. Example use cases
Big Data Analytics
Media Workflow Processing
Web Serving
Content Management
Home Directories
30. Use case: Enterprise Applications
"We are growing by leaps and bounds, and our core offering is all about better support delivery. During the course of developing our next-generation internal support system, we never wanted to worry about scale again, yet we had existing architectural commitments that meant a distributed file solution was required. We chose Amazon EFS because it was the only option available that scaled both capacity and performance – without the up-front payments or the management overhead of traditional models. Now we can just focus on supporting our customers to help them support their customers."
- Sam Caldwell, Sr. Systems Engineer
31. Use case: Enterprise Applications
"Arcesium is a financial services SaaS platform that requires resilient, secure, and scalable file storage. Amazon EFS offers us a powerful way to operate and scale file storage for our Amazon EC2 instances and is just the kind of solution we need to build out our platform quickly and without compromising quality."
- Gaurav Suri, CEO
32. Use case: Web Serving/Content Management
"Seeking Alpha subscribers rely on us for accurate trading information. Amazon EFS helps us manage the thousands of web pages on our site. It also helps us manage the traffic spikes generated by financial market events, and ensure that every bit of content requested during these spikes is quickly and reliably delivered to our subscribers."
- Asi Segal, CTO
33. Use case: Web Serving/Content Management
"Our customers are the Internet. They build over 80% of the world's websites using our PHP technology. Our testing has helped us conclude that the only logical way to share files in the cloud is on Amazon EFS. There's no easier way to get out of the file server management business and focus on web development."
- Boaz Ziniman, Sr. Dir Cloud Strategy
35. What is a file system?
The primary resource in Amazon EFS
Where you store files and directories
Can create multiple file systems per account
36. How to access a file system from an instance
You "mount" a file system on an Amazon EC2 instance using the standard mount command – the file system then appears like a local set of directories and files.
An NFSv4.1 client is standard on Linux distributions.

mount -t nfs4 -o nfsvers=4.1 [file system DNS name]:/ /[user's target directory]
37. What is a mount target?
To access your file system from instances in an Amazon VPC, you create mount targets in the VPC.
A mount target is an NFS endpoint in your VPC.
A mount target has an IP address and a DNS name you use in your mount command.
[Diagram: a region with a VPC spanning three Availability Zones; EC2 instances in each AZ connect to the file system through a mount target]
38. How does it all fit together?
[Diagram: a region with a VPC spanning three Availability Zones, with EC2 instances in each AZ mounting the file system through per-AZ mount targets]
39. There are three ways to set up and manage a file system:
AWS Management Console
AWS Command Line Interface (CLI)
AWS Software Development Kit (SDK)
40. The AWS Management Console, CLI, and SDK each allow you to perform a variety of management tasks:
Create a file system
Create and manage mount targets
Tag a file system
Delete a file system
View details on file systems in your AWS account
41. Setting up and mounting a file system takes under a minute
1. Create a file system
2. Create a mount target in each Availability Zone from which you want to access the file system
3. Enable the NFS client on your instances
4. Run the mount command
46. Security controls
Control network traffic to and from file systems by using VPC security groups and network ACLs
Manage file and directory access by using standard Linux directory and file-level permissions
Grant administrative access (via API) to file systems by using AWS Identity and Access Management (IAM)
47. Only EC2 instances in the VPC you specify can access your Amazon EFS file system
[Diagram: two VPCs, each with EC2 instances; only instances in the designated VPC can reach the customer's file system]
48. Security groups control which instances in your VPC can connect to your mount targets
[Diagram: within a VPC, the mount target's security group permits inbound traffic from "sg-allowed", so an instance in security group "sg-allowed" can reach the customer's file system while an instance in "sg-not-allowed" cannot]
49. Amazon EFS supports the POSIX permissions model
Set file/directory permissions to specify read-write-execute permissions for users and groups
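Because EFS honors standard POSIX permission bits, the usual chmod semantics apply exactly as on a local Linux file system. A minimal local illustration using only the Python standard library (nothing here is EFS-specific; on EFS you would chmod a path under the mounted directory instead):

```python
import os
import stat
import tempfile

# Create a scratch file, then restrict it so only the owner can read/write.
fd, path = tempfile.mkstemp()
os.close(fd)
os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)  # rw------- (0o600)

# EFS enforces these same bits: a client whose user ID doesn't match the
# owner will be denied write access to this file.
mode = stat.S_IMODE(os.stat(path).st_mode)
print(oct(mode))  # 0o600
os.remove(path)
```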
50. Integration with AWS IAM provides administrative security
Use IAM policies to control who can use the administrative APIs to create, manage, and delete file systems
Amazon EFS supports action-level and resource-level permissions
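As a sketch, an IAM policy granting a subset of the EFS administrative actions might look like the following. The action names shown are EFS API actions as I understand them, but verify them against the current EFS API reference, and in practice scope `Resource` to specific file system ARNs rather than `*`.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "elasticfilesystem:CreateFileSystem",
        "elasticfilesystem:CreateMountTarget",
        "elasticfilesystem:DescribeFileSystems"
      ],
      "Resource": "*"
    }
  ]
}
```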
52. EFS provides a throughput bursting model that scales as a file system grows
As a file system gets larger, it needs access to more throughput.
Many file workloads are spiky, with peak throughput well above average levels.
Amazon EFS's scalable bursting model is designed to make performance available when you need it.
53. Throughput bursting model based on earning and spending "bursting credits"
• File systems earn credits at a "baseline rate" of 0.05 MB/s per GB stored and spend credits by performing file system operations; file systems can drive throughput at the baseline rate indefinitely
• File systems with a positive bursting credit balance can "burst" to higher levels for periods of time: 100 MB/s for file systems 1 TB or smaller, 100 MB/s per TB for file systems larger than 1 TB
• New file systems start with a full credit balance
[Chart: baseline rate and burst rate (MB/s) as a function of storage size (GB), from 0 to 3,000 GB]
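The chart's two curves follow directly from the rules above. A quick sketch (function names are illustrative; the rates are the slide's stated 0.05 MB/s per GB baseline and 100 MB/s-per-TB burst):

```python
def efs_baseline_mbps(size_gb):
    """Baseline throughput: 0.05 MB/s per GB stored."""
    return 0.05 * size_gb

def efs_burst_mbps(size_gb):
    """Burst throughput: 100 MB/s up to 1 TB, then 100 MB/s per TB."""
    return max(100.0, 100.0 * size_gb / 1000)

# The two curves cross at 2 TB, where baseline (100 MB/s) meets the
# per-TB burst scaling.
print(round(efs_baseline_mbps(1000), 1), efs_burst_mbps(1000))    # 50.0 100.0
print(round(efs_baseline_mbps(10000), 1), efs_burst_mbps(10000))  # 500.0 1000.0
```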
54. Bursting model examples
File system size | Read/write throughput
1 TB | Drive up to 50 MB/s continuously, or burst to 100 MB/s for up to 12 hours each day*
10 TB | Drive up to 500 MB/s continuously, or burst to 1 GB/s for up to 12 hours each day*
100 GB | Drive up to 5 MB/s continuously, or burst to 100 MB/s for up to 72 minutes each day*
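The "up to 12 hours" and "72 minutes" figures follow from a simple budget argument: credits earned all day at the baseline rate can be spent bursting for roughly baseline/burst of each day. A sketch assuming that simplified model (the real credit mechanics are more detailed):

```python
def burst_hours_per_day(size_gb):
    """Hours per day a file system can burst, assuming credits earned at the
    baseline rate (0.05 MB/s per GB) are all spent bursting at the burst
    rate (100 MB/s up to 1 TB, else 100 MB/s per TB)."""
    baseline = 0.05 * size_gb                   # MB/s earned continuously
    burst = max(100.0, 100.0 * size_gb / 1000)  # MB/s while bursting
    return 24 * min(baseline / burst, 1.0)

print(round(burst_hours_per_day(1000), 2))  # 12.0 hours (1 TB)
print(round(burst_hours_per_day(100), 2))   # 1.2 hours = 72 minutes (100 GB)
```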
55. Amazon EFS is designed for a wide spectrum of use cases
High throughput and parallel I/O: genomics, big data analytics, scale-out jobs
Low latency and serial I/O: home directories, content management, web serving, metadata-intensive jobs
56. Two performance modes designed to support this broad spectrum of use cases
General Purpose mode (default, recommended for most use cases): optimized for latency-sensitive applications and general-purpose file-based workloads – the best option for the majority of use cases
Max I/O mode: can scale to higher levels of aggregate throughput, with a tradeoff of slightly higher latencies for file operations
Use CloudWatch to determine whether your application may benefit from Max I/O mode; if not, you'll get the best performance in General Purpose mode
57. CloudWatch metrics provide visibility into file system performance
• View your bursting credit balance and determine the throughput permitted by the bursting model
• Monitor I/O metrics to understand the throughput your file system is driving
• Determine whether your workload may benefit from Max I/O mode
59. In what regions can I use Amazon EFS?
US-West-2 (Oregon)
US-East-1 (Northern Virginia)
EU-West-1 (Ireland)
60. Data is stored in multiple Availability Zones for high availability and durability
Every file system object (directory, file, and link) is redundantly stored across multiple Availability Zones in a region.
[Diagram: Amazon EFS spanning three Availability Zones within a region]
61. Data can be accessed from any Availability Zone in the region while maintaining strong consistency
Your EC2 instances can connect to your EFS file system from any Availability Zone in a region.
Reads and writes are strongly consistent in all Availability Zones.
[Diagram: EC2 instances in different Availability Zones within a VPC reading from and writing to the same file system]
63. Simple and predictable pricing
With EFS, you pay only for the storage space you use
• No minimum commitments or up-front fees
• No need to provision storage in advance
• No other fees, charges, or billing dimensions
EFS price: $0.30/GB-month (US-East-1)
Customers eligible for the AWS Free Tier (12 months following AWS sign-up date) get 5 GB of free storage per month
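Since storage is the only billing dimension, forecasting is a single multiplication. A sketch, using the US-East-1 rate quoted above (the function name is illustrative):

```python
def efs_monthly_cost(storage_gb, price_per_gb=0.30, free_tier_gb=0):
    """Monthly EFS bill in dollars: metered storage times the per-GB-month
    price, minus any free-tier allowance."""
    billable = max(storage_gb - free_tier_gb, 0)
    return billable * price_per_gb

print(round(efs_monthly_cost(250), 2))                  # 75.0
print(round(efs_monthly_cost(250, free_tier_gb=5), 2))  # 73.5
```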
64. Frequently asked questions
Q. Does Amazon EFS offer encryption of data at rest?
A. Not currently, but we will offer this feature soon.
Q. What AMIs work with Amazon EFS?
A. Amazon EFS is accessible from all Linux-based AMIs.
Q. How do I back up a file system?
A. Amazon EFS is designed to be highly durable. If you want to be able to revert to earlier versions to undo changes, you can use standard backup software.
Q. Can I mount Amazon EFS using NFSv4.0, or only NFSv4.1?
A. Amazon EFS supports either protocol, but we recommend mounting via NFSv4.1 for best performance.
Deals are more competitive; we can use EBS to differentiate.
EBS is an oft-overlooked piece of the puzzle that is very important and integral to winning deals and retaining customers.
Key takeaway: if we're not taking the time to size block storage appropriately (understand the workload and the disk I/O and bandwidth requirements), we risk losing the deal or the customer because they feel they are paying too much. Case in point:
We need to be a trusted advisor and add value with volume selection.
A really well-architected application is one that has been designed on top of a solid, stable storage system. Getting it right is very important.
Storage is more than just the protocol or interface. It’s the lifeblood of application design and renewed architectures. Our customers have taught us that they need two things: scale and trust. 1. Make sure I can grow. 2. Make sure I can access what I need when I need it, (and of course help me keep costs down). Today we’re focusing on the latter - data migration options but I wanted to give you a quick taste on the different storage services we provide once you transfer the data over.
Data comes in all shapes and sizes, with different requirements for performance, cost, and access interface. AWS provides several services that address those different needs – S3 is a globally accessible and highly scalable object storage service. Glacier has a similar object architecture but caters to deep archives and long-term backup. EBS is high-performance block storage dedicated to EC2, and EFS is a high-performance NFS-based file system dedicated to EC2.
What’s common to all these services is that they’re all run on AWS which means there’s no need for you to put up any upfront investment or commitment, you pay only for what you use and don’t have to perform any complex or risky capacity planning modeling. It’s easy to use, all of them were designed to very high levels of availability and durability and allow you to reduce time to market by easily provisioning PBs of capacity in minutes without ever worrying about running out of space or hitting performance bottlenecks.
The suite of transfer services that support customers in their migrations means more choice. Large batches, incremental changes, constant streams or seamless integration are all part of the storage offering. Today we’re going to talk about all 6 of these data migration strategies, including two of the newest ways to do cloud data migration, Snowball and S3 Transfer Acceleration.
Note to presenters: Disk Transfer service is not EOL but has been deprecated out of the transfer services story in favor of Snowball. Snowball has already surpassed the amount of data imported over the lifetime of the disk transfer service.
EFS is in preview and due before the end of the year
You use EBS when you need a persistent block device for EC2 instances, when you have transactional workloads and when you want to deploy a file system on top of EC2, such as NTFS or a Linux file system. Which leads us nicely to our next topic...
Amazon Web Services gives you reliable, durable backup storage without the up-front capital expenditures and complex capacity-planning burden of on-premises storage. Amazon storage services remove the need for complex and time-consuming capacity planning, ongoing negotiations with multiple hardware and software vendors, specialized training, and maintenance of offsite facilities or transportation of storage media to third-party offsite locations.
EBS volumes are designed for an annual failure rate (AFR) of between 0.1% - 0.2%, where failure refers to a complete or partial loss of the volume, depending on the size and performance of the volume. This makes EBS volumes 20 times more reliable than typical commodity disk drives, which fail with an AFR of around 4%. For example, if you have 1,000 EBS volumes running for 1 year, you should expect 1 to 2 will have a failure. EBS also supports a snapshot feature, which is a good way to take point-in-time backups of your data.
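The AFR note above translates into expected failures with a single multiplication. A sketch assuming independent volume failures (the function name is illustrative):

```python
def expected_volume_failures(num_volumes, afr_low=0.001, afr_high=0.002):
    """Expected annual failures for a fleet of EBS volumes, given the
    0.1%-0.2% AFR range quoted in the note above."""
    return num_volumes * afr_low, num_volumes * afr_high

low, high = expected_volume_failures(1000)
print(low, high)  # matches the "1 to 2 failures per 1,000 volumes" estimate
```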
Focus on 2 types: SSD vs. Magnetic
SSD for databases, boot drives, and high-performance random I/O workloads such as NoSQL and relational databases. Think SQL Server, MySQL, PostgreSQL, Oracle, SAP, Cassandra, MongoDB, etc.
Magnetic for cold data and Big Data: high throughput, sequential I/O workloads. Think Hadoop/EMR (Cloudera and Hortonworks), Kafka, Log Processing (Splunk, Sumo Logic), Media Streaming, Data Warehousing, ETL.
Focus on the workload, gather throughput and disk I/O requirements, then make careful selection of drive types
Designed for 99.999% SLA
.1% - .2% AFR (expect to lose 1-2 volumes out of 1000 in a year)
GP2 = JACK OF ALL TRADES, workhorse, go-to volume type
Now the standard boot volume type
SSD, so low, single-digit ms latency; it will excel at random read/write
Simple provisioning model, choose the size drive you need and you will get baseline performance of 3 IOPS per GB
These are ALL PER VOLUME metrics
100 IOPS BASELINE PER VOLUME
Volumes with less than 1,000 GB of storage also have the capability of bursting to 3,000 IOPS for up to 30 minutes
Hamilton:
“GP2 says listen, I will give you burst capability. Not good for a constant load that never changes like a database, but for almost every [other] workload it's an ideal choice. And will give you much better value for that.”
Again, per volume metrics and pretty much double what we had with GP2
PIOPS = when you absolutely must have every IOP in the room...
The Hot Rod
Mission Critical, I/O intensive workloads such as transactional DBs
Performance Consistency SLA
Workloads that demand consistent level of IOPS – No BURST bucket
Instead there is a performance consistency guarantee that 99.9% of the time you will get within 10% of what you provision
[ONLY if using EBS-optimized, so save for later?]
Hamilton:
“PIOPS... Which is really Ideal for db workloads. We sign up with 3 * 9’s of reliability to absolutely give you the number of IOPS you bought”
All performance characteristics measured as MB/s throughput – NOT IOPS
Mention 20 MB/s for a 500 GB volume
Data only volumes, cannot be a boot / root volume
Mention Latency model for HDD based volume types:
Will include both seek time and access latency
Double digit ms latency vs single in SSD
high throughput sequential IO workloads:
Log processing
ETL
Data warehousing
Hadoop, Kafka, Vertica
Any workload that thrives on large block, sequential throughput performance
Same platform for both new volume types – just the baseline and burst performance thresholds are different (oh and price, of course)
Cost Optimized throughput
Logging, Backup, Archive
6 MB/s for 500GB volume
Generally speaking, workloads with heavy writes and low reads are great candidates for C4+EBS. This is because Cassandra is heavily write-optimized, with the foreground writes happening in memory; EBS only has to do background work related to compaction and fsync. Workloads with 50-50 reads/writes could benefit from R3 or M4 with EBS. If you are running DataStax Solr clusters, then i2 is a better choice.
Cassandra is usually set up to replicate itself across three nodes so that if any one node fails, the other two still have the data. This is very different from relational databases, where you can rely on one node with a SAN using RAID to keep the data redundant across multiple drives, and then set up read replicas for further redundancy. The main benefit of Cassandra doing it this way in the on-premises world is that it allows you to use commodity hardware rather than relying on expensive SAN solutions. However, this changes with the cloud (leads into next slide).
On paper, EBS looked much cheaper for a 1 PB cluster at the performance they needed.
On-demand price comparison: 350 c4.2xlarge + 4 TB EBS gp2 volumes/node (~$4 million/year) vs. 1,750 i2.2xlarge (~$14 million/year), assuming 40% compaction overhead.
Reserved pricing comparison: ~$8 million/year with reserved i2.2xlarge vs. under $4 million/year with reserved c4.2xlarge plus EBS.
As of Re:Invent 2015
Managed service – no hardware to maintain, no file software layer to maintain
Standard file system access semantics – you get what you’d expect from a file system. Read-after-write consistency, file locking, ability to have a hierarchical directory structure, file operations like appends, atomic renames, the ability to write to a particular block in the middle of a file
Standard OS APIs – EFS appears like any other file system to your operating system. So applications that leverage standard OS APIs to work with files will work with EFS.
Sharable – a common data source for applications and workloads across EC2 instances
Elastically grows to PB scale – don’t specify a provisioned size upfront. You just create a file system, and it grows and shrinks automatically as you add and remove data.
Performance for a wide variety of workloads – SSD-based. Consistent latencies, low latency vs. high throughput.
Available/durable – Replicate data across AZs within a region. Means that your files are highly available, accessible from multiple AZs, and also well-protected from data loss.
NFSv4.1 based – NFS is a network protocol for interacting with file systems. In the background, when you connect an instance to EFS, the data is sent over the network via NFS.
Open standard, widely adopted. NFS support is available on virtually all Linux distributions.
The “-t” for file system type
Target directory – local directory that it’s attached to
[Explain you have a region, VPC spanning multiple AZs, EC2 instances in your VPC in AZs, mount targets in each acting as the endpoint to the file system]
AWS management console – our browser-based management tool
AWS command line interface – a tool you download on your instance or local machine, provides scripts that allow you to control multiple AWS services from the Windows or Linux/Unix/MacOS shell
AWS SDK – Provides APIs for Java, Python, PHP, .NET, and others
You create mount targets in only the VPC you specify. Mount targets are the endpoints through which the FS is accessed.
Security groups, which are these firewall-like rules, are added to mount targets – specify what can send traffic to the mount targets and therefore to your file systems.
Security groups can have rules like:
Only accept traffic from an individual IP address
Only accept traffic from instances that are within a particular IP address range
Only accept traffic from instances that are in the same SG
In the example here, we have added to the mount target a SG rule to allow only traffic to/from instances that are part of that SG.
EFS supports the standard permission model you’re used to when working with local file systems on Linux or Windows – enforces user and group rights at the file and directory level.
For example, you can set permission for a file such that only the owner of a file can write to it. So only if the user account you’re logged in with on your instance matches the user ID of the owner of that file will EFS allow that client to write to the file.
Our goal for EFS is to serve the vast majority of file workloads, covering a wide spectrum of performance needs – from big data applications that are massively parallelized and require the highest possible throughput, to single-threaded, latency-sensitive workloads.
To support this broad spectrum, EFS has two different performance modes, which trade off latency for throughput scalability. Most customers will be best served by the “general purpose” mode, which is why it’s the default for EFS file systems. This mode is appropriate for most applications, including web serving, content management, software build systems, home directories, and general purpose file-based workloads.
“Max throughput mode” is designed for large-scale and data-heavy applications in which hundreds or thousands of instances actively access files within a single file system, but which are not as sensitive to per-operation latency. Note that when we talk about scale here, we’re not talking about how large your file system is. “Max throughput” mode is optimized to deliver the highest levels of aggregate throughput, rather than the lowest-possible latencies, and workloads like big data analysis, video rendering, and genomics analysis may benefit from this mode.
To determine whether your file system in “general purpose” mode might benefit from being in “max scale” mode, we’re providing a CloudWatch metric (GeneralPurposeScale?) that tells you whether you may be running into a scale limit.
Finally, let’s talk about how we store data in a region
In the event of an AZ issue (e.g., perhaps there’s a natural disaster that impacts a particular DC), EFS file systems will continue to be available from other AZs
You can design your application to take advantage of this; for example:
Use Elastic Load Balancing (ELB) to run your application in multiple AZs
If one AZ is out, your application will continue to run and have access to the EFS file system