1) The document discusses AWS options for migrating databases, including Amazon RDS, AWS Database Migration Service, and the AWS Schema Conversion Tool.
2) It details how to migrate relational databases to AWS using Amazon RDS and AWS DMS, including how to keep the application running during the migration.
3) It also covers features such as replication, schema conversion, and support for multiple sources and targets for database migration.
2. Migrating Your Database to AWS
• Databases on Amazon;
• Deep Dive on Amazon RDS;
• Deep Dive on Amazon DMS and SCT;
3. Relational Databases
Amazon EC2
Amazon RDS
Supported DB engines, on Linux or Windows
4. Your Data Center
Services you manage:
Power, HVAC, networking
Rack and cabling
Server maintenance
OS patches
DB software patches
Database backups
Scaling
High availability
DB software installs
OS installation
App optimization
5. Amazon RDS – Relational Database Service
Services:
Power, HVAC, networking
Rack and cabling
Server maintenance
OS installation
App optimization
OS patches
DB software patches
Database backups
Scaling
High availability
DB software installs
Amazon RDS
7. Performance
                        Amazon EC2            Amazon RDS
Compute capacity        1 vCPU to 128 vCPUs   1 vCPU to 40 vCPUs
Memory (GB of RAM)      1 GB to 1,952 GB      1 GB to 244 GB
Network (throughput)    Low to 20 Gbps        Low to 10 Gbps
Storage I/O throughput  48,000 IOPS           30,000 IOPS
Instance families: T2, M3, M4; R3, R4 instance support
8. Availability
Diagram: four nodes (Node 1 to Node 4) and three storage volumes (Storage 1 to Storage 3) that share common failure domains:
Same rack
Same appliance
Same data center
Same power feed
Same geographic location
13. Compliance
MySQL and Oracle:
• SOC 1, 2, and 3
• ISO 27001/9001
• ISO 27017/27018
• PCI DSS
• FedRAMP
• HIPAA BAA
• UK government programs
• Singapore MTCS
SQL Server and PostgreSQL:
• SOC 1, 2, and 3
• ISO 27001/9001
• ISO 27017/27018
• PCI DSS
• UK government programs
• Singapore MTCS
14. SSL
Available for all six database engines
Encryption of database traffic
17. Standard monitoring
Amazon CloudWatch metrics for Amazon RDS
CPU utilization
Storage
Memory
Swap usage
DB connections
I/O (read and write)
Latency (read and write)
Throughput (read and write)
Replica lag
Many more
Amazon CloudWatch Alarms
Similar to on-premises custom
monitoring tools
18. Granular monitoring
More than 50 new metrics covering CPU, memory, file system, and disk I/O, at intervals as fine as 1 second
19. Notifications
• Use Amazon Simple Notification Service (Amazon SNS) to notify users
• 17 different event categories (availability, backup, configuration change, and so on)
21. Read Replicas
Bring your data closer to your customers
Reduce pressure on your master node with read replica support
Promote a read replica to master for fast disaster recovery
22. Read Replicas
Within the region
• MySQL
• MariaDB
• PostgreSQL
• Aurora
Cross-region
• MySQL
• MariaDB
• PostgreSQL
• Aurora
23. Read Replicas for Amazon Aurora
Diagram: an Aurora primary node with read replica nodes spread across AZ 1, AZ 2, and AZ 3.
24. Read Replicas: Oracle and SQL Server
Options:
• Oracle GoldenGate
• Third-party products
• Snapshots
• AWS DMS
28. Scaling: automation
AWS CLI
Scheduled CLI via cron
aws rds modify-db-instance --db-instance-identifier sg-cli-test --db-instance-class db.m4.large --apply-immediately
# Scale down at 8:00 PM on Friday
0 20 * * 5 /home/ec2-user/scripts/scale_down_rds.sh
# Scale up at 4:00 AM on Monday
0 4 * * 1 /home/ec2-user/scripts/scale_up_rds.sh
29. Scaling: automation
Scheduled via AWS Lambda
No server, but it still runs on a schedule!
import boto3

client = boto3.client('rds')

def lambda_handler(event, context):
    response = client.modify_db_instance(
        DBInstanceIdentifier='sg-cli-test',
        DBInstanceClass='db.m4.xlarge',
        ApplyImmediately=True)
    print(response)
32. A relational database, MySQL-compatible
The performance and availability of commercial databases
The simplicity and cost-effectiveness of an open-source database
What is Amazon Aurora?
33. Scaling with data size
SYSBENCH WRITE-ONLY
DB Size   Amazon Aurora   RDS MySQL, 30K IOPS (single AZ)
1 GB      107,000         8,400
10 GB     107,000         2,400
100 GB    101,000         1,500
1 TB      26,000          1,200
UP TO 67x FASTER

CLOUDHARMONY TPC-C
DB Size   Amazon Aurora   RDS MySQL, 30K IOPS (single AZ)
80 GB     12,582          585
800 GB    9,406           69
UP TO 136x FASTER
34. Working with read replicas
SysBench write-only workload, 250 tables
Updates per second   Amazon Aurora   RDS MySQL, 30K IOPS (single AZ)
1,000                2.62 ms         0 s
2,000                3.42 ms         1 s
5,000                3.94 ms         60 s
10,000               5.38 ms         300 s
UP TO 500x LOWER LAG
35. How did we achieve these results?
DO LESS WORK
Do fewer I/Os
Minimize network packets
Cache prior results
Offload work from the database engine
BE MORE EFFICIENT
Process asynchronously
Shorten the latency path
Use lock-free data structures
Batch operations together
DATABASES do a lot of I/O
NETWORK-ATTACHED STORAGE uses a lot of PACKETS/SECOND
HIGH-THROUGHPUT PROCESSING DOES NOT ALLOW CONTEXT SWITCHES
37. Aurora cluster with replicas
Diagram: an Aurora primary instance and two Aurora Replicas spread across AZ 1, AZ 2, and AZ 3; the cluster volume spans all 3 AZs and is backed up to Amazon S3.
38. I/O traffic in RDS MySQL
MYSQL WITH STANDBY
Types of write: log, binlog, data, double-write, FRM files
Diagram: primary instance in AZ 1 and standby instance in AZ 2, each on a mirrored Amazon Elastic Block Store (EBS) volume, with backups to Amazon S3.
I/O FLOW
Issue the write to Amazon EBS; EBS writes to its mirror and returns status
Issue the write to the standby using storage-level replication
Issue the write to the standby instance's EBS (and its mirror)
Steps 1, 3, and 5 are sequential and synchronous
OBSERVATIONS
This amplifies latency
Many types of write for each interaction
Data blocks are written twice to avoid write loss
PERFORMANCE
780 K transactions
7,388 K I/Os per million transactions (excludes mirroring and standby)
Average 7.4 I/Os per transaction
30-minute SysBench write-only workload, 100 GB dataset, RDS Single-AZ, 30 K PIOPS
39. I/O traffic in Aurora (database)
AMAZON AURORA
Diagram: primary instance in AZ 1 and replica instances in AZ 2 and AZ 3; asynchronous 4/6-quorum distributed writes; backups to Amazon S3.
Types of write: log, binlog, data, double-write, FRM files
I/O FLOW
Only the redo log is written; the other steps are asynchronous
No data-block writes (checkpoint, cache replacement)
6x more log writes, but 9x less network traffic
Tolerant of latency spikes in network and storage
Ship redo log records, ordered by LSN
Shuffle them to the appropriate segments, on demand
Batch-ship the logs to the storage nodes and perform the writes
OBSERVATIONS
27,378 K transactions (35x MORE)
950 K I/Os per 1M transactions, a 6x amplification (7.7x LESS)
PERFORMANCE
30-minute SysBench write-only workload, 100 GB dataset
40. Migrating Relational Databases to AWS – Amazon RDS
Diagram: logical/physical backup, synchronization, and replication from the customer premises over the Internet/VPN into AWS (Amazon RDS, Amazon S3, Amazon EC2).
41. Start a migration in just a few minutes
Keep the application running while you migrate
Replicate between, to, and from Amazon EC2 or Amazon RDS
Move data to the same database engine or a different one
AWS Database Migration Service (AWS DMS)
Amazon Aurora
42. Application running while you migrate
Diagram: application users at the customer premises connect to AWS over the Internet/VPN; AWS DMS sits between the source and target databases.
Start a replication instance
Connect to the source and target databases
Select the tables, schemas, or databases
AWS DMS can create the tables, load the data, and keep them in sync
Switch the application over to the target database
43. The load runs table by table
Diagram: replication instance between the source and the target.
44. Change data capture (CDC)
Diagram: the replication instance captures ongoing transactions (t1, t2) from the source and applies them to the target after the bulk load.
45. Multi-AZ option for high availability
Diagram: AWS DMS replication instances deployed across two Availability Zones, replicating from the customer premises (or AWS) over the Internet/VPN.
48. Resources available to customers: AWS SCT
User guide: documentation at aws.amazon.com/documentation/SchemaConversionTool/ (or download it).
Download area: installation files.
Support forums: ask questions and review how-to guides.
https://forums.aws.amazon.com/forum.jspa?forumID=208
49. AWS Schema Conversion Tool
The AWS Schema Conversion Tool helps automate the conversion of database schemas and code for migrations between database engines or to data warehouses.
Features:
Schema conversion from Oracle and Microsoft SQL Server to MySQL, Amazon Aurora, MariaDB, and PostgreSQL
Or convert your schema between PostgreSQL and any MySQL engine
A database migration assessment report to help choose the best target engine
A code browser that highlights the places where manual edits are required
Secure connections to your databases with SSL
Cloud-native code optimization
50. Sources and Targets with AWS DMS
Sources:
On-premises and Amazon EC2 instance databases:
• Oracle Database 10g – 12c
• Microsoft SQL Server 2005 – 2014
• MySQL 5.5 – 5.7
• MariaDB (MySQL-compatible data source)
• PostgreSQL 9.4 – 9.5
• SAP ASE 15.7+
RDS instance databases:
• Oracle Database 11g – 12c
• Microsoft SQL Server 2008R2 - 2014. CDC
operations are not supported yet.
• MySQL versions 5.5 – 5.7
• MariaDB (MySQL-compatible data source)
• PostgreSQL 9.4 – 9.5. CDC operations are not
supported yet.
• Amazon Aurora (MySQL-compatible data
source)
Targets:
On-premises and EC2 instance databases:
• Oracle Database 10g – 12c
• Microsoft SQL Server 2005 – 2014
• MySQL 5.5 – 5.7
• MariaDB (MySQL-compatible data source)
• PostgreSQL 9.3 – 9.5
• SAP ASE 15.7+
RDS instance databases:
• Oracle Database 11g – 12c
• Microsoft SQL Server 2008 R2 - 2014
• MySQL 5.5 – 5.7
• MariaDB (MySQL-compatible data source)
• PostgreSQL 9.3 – 9.5
• Amazon Aurora (MySQL-compatible data
source)
Amazon Redshift
51. SCT helps convert tables, views, and code
Sequences
User-defined types
Synonyms
Packages
Stored procedures
Functions
Triggers
Schemas
Tables
Indexes
Views
Sort and distribution keys
55. "With AWS, we were able to carry out a whole set of changes at once"
"With the AWS cloud we gained dynamism, efficiency, and speed. Not everything that is fast brings efficiency along with it. Doing something big and efficient in a short time is very good."
- Juliano Souza
• Dr.Consulta is revolutionizing healthcare in Brazil.
• We work so that all of us can have access to excellent healthcare at the lowest possible cost. We are doing this through our efficient medical centers, sophisticated data analysis, and the intelligent use of technology.
56. The Challenges
1. Changing cloud providers.
2. Changing technology.
3. A new mindset.
4. Focus on what really matters.
5. Partnership.
6. A better BI structure.
58. The migration and DMS
RDS:
• Savings
• Administration
• Security
• Availability
DMS:
• Replication
• Real-time BI
• Focus on application and query performance
59. Amazon Relational Database Service (Amazon RDS)
No infrastructure management
Scale up/down
Cost-effective
Instant provisioning
Compatibility
So it was with all these factors in mind that we developed the database migration service. We designed it to be simple - you can get started in less than ten minutes. We designed it to enable near-zero-downtime migration. And we designed it to be a kind of replication Swiss army knife. To replicate data between on-premises systems, RDS, EC2, and across database engine types.
The first thing you want to do is to tightly control access to your data resources.
You should start with database-level access restrictions. Create users that can read and write but cannot delete tables. Don't allow users to DROP TABLE or TRUNCATE.
Next, secure your network.
Security Groups will allow you to define which hosts can access your databases. You should define separate Security Groups for your Database hosts and Application hosts. Use the Application Security Group names in the egress/ingress rules for your Database instances instead of CIDRs.
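As a sketch of that pattern, the database security group's ingress rule can reference the application security group instead of a CIDR range. The group IDs and port below are hypothetical; the dict mirrors the parameter shape of EC2's authorize_security_group_ingress call.

```python
# Sketch: build parameters for EC2's authorize_security_group_ingress so the
# database security group admits traffic only from the application security
# group, not from an IP range. Group IDs below are hypothetical.
def db_ingress_from_app_sg(db_sg_id, app_sg_id, port=3306):
    """Allow the application SG to reach the database SG on the DB port."""
    return {
        "GroupId": db_sg_id,  # security group attached to the DB instances
        "IpPermissions": [{
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            # Reference the app SG instead of a CIDR:
            "UserIdGroupPairs": [{"GroupId": app_sg_id}],
        }],
    }

params = db_ingress_from_app_sg("sg-0db00000", "sg-0app0000")
# pass to boto3.client("ec2").authorize_security_group_ingress(**params)
```

Referencing the group rather than a CIDR means new application hosts gain database access automatically when they join the application security group.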
The next layer of protection is the VPC (Virtual Private Cloud). VPCs partition the AWS network into logically separate sections.
You can further subdivide VPCs into subnets.
You should define separate subnets for your databases, application and web servers. The first one will be a private subnet not accessible to the internet. You should deploy your database and application server there. The second one, where you deploy web servers, will be public and accessible to internet hosts.
You can use network access control lists (ACLs) to define egress/ingress rules for subnets.
Lastly, use the Identity and Access Management (IAM for short) to manage permissions for your users.
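To make that concrete, here is a minimal sketch of an IAM policy document that lets an operator describe, modify, and reboot a single RDS instance without being able to delete it. The account ID, region, and instance name in the ARN are hypothetical.

```python
import json

# Sketch of an IAM policy (resource ARN is hypothetical) granting scoped
# RDS permissions: the operator can inspect, modify, and reboot one
# instance, but rds:DeleteDBInstance is deliberately not granted.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [
            "rds:DescribeDBInstances",
            "rds:ModifyDBInstance",
            "rds:RebootDBInstance",
        ],
        "Resource": "arn:aws:rds:us-east-1:123456789012:db:sg-cli-test",
    }],
}
policy_document = json.dumps(policy)  # attach via IAM's create_policy
```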
One of the reasons that customers are concerned with security is that they have certain compliance requirements that they have to meet.
AWS has lots of different customers in the startup, enterprise, and public sector space running a wide range of applications and workloads across many different industries. Some of these applications and workloads require that they meet certain regulatory requirements. To support the audit and compliance requirements of these customers AWS has worked to achieve certification that you can run your workloads on RDS and be able to achieve a full compliant application.
So for RDS we currently offer 9 different compliance attestations, including SOC 1, 2, and 3, FedRAMP, PCI DSS, and HIPAA.
With these compliance certifications for RDS it means that you can build and run your applications and workloads, related to these compliance bodies, on RDS. However, it does not relieve you of the responsibility of making sure that your application meets the appropriate compliance requirements. AWS takes responsibility for the compliance of the RDS service and the infrastructure related to it and you, the customer, take care of the application that you have built on top of AWS. With this model you can take the audit findings from your application and combine them with the appropriate attestation from AWS’s third party auditors and have a complete verification of compliance for your application running on AWS.
----------------
Chart of what is certified and what is not is here: https://w.amazon.com/index.php/AWS_Security_Assurance/Compliance/Scope
Here is a listing of our compliance certifications and which database engines they currently apply to on RDS. You can see here that we have a wide range of certifications covering MySQL, Oracle, SQL Server, and PostgreSQL. As you can see there is a lot of coverage among these four different engines around key areas of compliance.
You may be asking where Aurora is in this. What I can say to that is: compliance is mostly a certification process that takes time with the agencies that do the certifying, so we are working with them on Aurora, but the certification process simply takes time.
With some of the compliance bodies that I just mentioned in the previous slides there is a requirement that all traffic between compute nodes and external users be encrypted. HIPAA is a great example of this. At times, even just for internal reasons some companies will make this a requirement as well.
With RDS we have the ability to encrypt the traffic to and from your database using SSL. Each of the six engines supports the ability to configure an SSL connection into your database. The way that you implement the appropriate SSL certificates and configurations on each database engine may be a little different from each other and this is primarily due to how each of these engines has built their support for SSL.
The main thing to keep in mind here is that you are able to protect your data in traffic via a standard SSL connection to your database.
Which SSL versions are supported?
---------
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.SSL.html
MySQL, Aurora, SQL Server, MariaDB, PostgreSQL – RDS creates an SSL certificate on the DB instance when the instance is first provisioned. You then configure the client for your engine to reference the DB instance endpoint common name and do the setup that is specific to your client. MySQL, Aurora, and MariaDB are very similar because they are all MySQL or MySQL-based.
Oracle – You enable SSL as an option when you launch the instance, and RDS uses a second port for SSL connections. Then you configure your Oracle client to use SSL.
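As a sketch of the client-side setup for the MySQL-family engines, the connection arguments below point a client at the RDS CA bundle so the server certificate can be verified. The endpoint, user, and bundle path are hypothetical; the dict is shaped for a typical Python MySQL client such as PyMySQL.

```python
# Sketch: connection arguments for a MySQL-family RDS endpoint over SSL.
# Endpoint and CA bundle path are hypothetical; RDS publishes a CA bundle
# that you download and point your client at.
def mysql_ssl_conn_args(endpoint, user,
                        ca_bundle="rds-combined-ca-bundle.pem"):
    """Keyword arguments for a MySQL client (e.g. PyMySQL's connect())."""
    return {
        "host": endpoint,          # DB instance endpoint common name
        "port": 3306,
        "user": user,
        "ssl": {"ca": ca_bundle},  # verify the server against the RDS CA
    }

args = mysql_ssl_conn_args(
    "sg-cli-test.abc123.us-east-1.rds.amazonaws.com", "app")
```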
After you have secured access to your database, you have additional mechanisms for further securing your data.
You can encrypt data in transit using NNE – native network encryption.
You can audit access to your database using the native Oracle audit logs, which are accessible to you through the API and the AWS Management Console.
And you can also encrypt your data at rest.
I want to dive deep into this topic.
Volume encryption:
- Works with all editions of Oracle
- Use your own AWS KMS keys for more control, or the default key managed by RDS for ease of use
Using TDE with Enterprise Edition:
- TDE_HSM and TDE database options
- TDE_HSM integrates with AWS CloudHSM
- Share a CloudHSM partition between databases and accounts
Now let’s move into metrics and monitoring on AWS and what you can get out of that.
Being able to monitor your database and know how it is performing is a very critical part of managing a database that is supporting your application or workloads.
For everyone using RDS you get a collection of 15-18 metrics that are automatically made available to you, depending on the type of instance that you are using. RDS sends the necessary information for these metrics to Amazon CloudWatch and then you are able to view the metrics in the RDS console, in the CloudWatch console, or via the CloudWatch APIs. These metrics are made available to you at one minute intervals.
With this monitoring you can keep an eye on the performance of your database around key items like CPU utilization, memory, storage, latency, and any lag between your master and read replica databases. You can view this information in individual graphs, multiple graphs, or even pull them into your own monitoring tool.
Additionally, you can also take these metrics and build alarms based on thresholds that are meaningful to you. Then whenever you go above that threshold you can have a notification or other action take place that helps to respond to that metric being outside of its normal boundaries. We will actually take a look at an example of what you can do there a bit later in this presentation.
---------------
Basic cloudwatch metrics - http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/rds-metricscollected.html
When it comes to monitoring a database there are a lot of different measurements and metrics that you can examine. With no access to the host operating system of the RDS instance getting at all the monitoring information you need could be a challenge because there are certain operating system metrics that you want to see that would better help you determine the health of your RDS instance.
To help with this problem the RDS service gives you the option to turn on Enhanced Monitoring. When you enable enhanced monitoring you get access to around 37 metrics that you can combine with the metrics that you get from basic monitoring to better help you understand what is going on with your database instance. When you enable enhanced monitoring you can define the granularity at which you can see the metrics from sixty seconds all the way to one second.
A few of the metrics that you get with enhanced monitoring are: Free Memory, Active Memory, Load Average, and how much of the file system you have used.
You might be asking what the difference is between basic monitoring and enhanced monitoring, and why they are not delivered the same way. The reason is that standard monitoring is based on what the hypervisor running your database instance can see. There are OS-level metrics that the hypervisor cannot see, so we use an agent running on your RDS instance to collect the metrics needed for enhanced monitoring.
Enhanced monitoring is available for: all six engines
-------
Discussed in a section on this page: http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_Monitoring.html
Available on all classes except t1.micro and m1.small
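A minimal sketch of turning enhanced monitoring on: the MonitoringInterval and MonitoringRoleArn parameters of modify_db_instance control it, where the role ARN (hypothetical below) must allow RDS to publish to CloudWatch Logs.

```python
# Sketch: parameters to enable Enhanced Monitoring at 1-second granularity
# via RDS modify_db_instance. The role ARN below is hypothetical.
VALID_INTERVALS = (0, 1, 5, 10, 15, 30, 60)  # seconds; 0 disables

def enable_enhanced_monitoring(instance_id, role_arn, interval=1):
    """Build the modify_db_instance parameters for enhanced monitoring."""
    if interval not in VALID_INTERVALS:
        raise ValueError("interval must be one of %r" % (VALID_INTERVALS,))
    return {
        "DBInstanceIdentifier": instance_id,
        "MonitoringInterval": interval,
        "MonitoringRoleArn": role_arn,
    }

params = enable_enhanced_monitoring(
    "sg-cli-test", "arn:aws:iam::123456789012:role/rds-monitoring-role")
# pass to boto3.client("rds").modify_db_instance(**params)
```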
To give you even more visibility into what is going on with your RDS instance you can also use Event Notifications. These Event Notifications allow you to get notifications, via Amazon SNS, when certain events within your RDS instance occur.
You can subscribe to events around the DB Instance, DB Snapshots, DB Security Groups, or DB parameter groups.
A few examples of what types of event notifications you can receive:
Configuration change – you can get notified any time a configuration change such as a security group modification occurs, or when the storage settings for the instance have changed.
Low Storage – You could be notified when the database is getting low on storage so that you can take action before you run out of storage.
Failover – you could be notified when your master database is failing over to its standby configuration.
Overall Event notifications are something to give you extra insight into what is going on with your RDS instance and giving you the opportunity to take action when something does happen. These event notifications not only give you visibility into what might naturally be going on with your RDS instance but also can give you visibility into changes happening that might impact the security of your database such as making a change to the security group for the database.
------------
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_Events.html
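A sketch of wiring this up programmatically, assuming a pre-existing SNS topic (the ARN below is hypothetical): the parameters are shaped for RDS's create_event_subscription call, covering the low-storage, failover, and configuration-change examples above.

```python
# Sketch: parameters for RDS create_event_subscription, subscribing an SNS
# topic (ARN is hypothetical) to selected event categories on one instance.
def db_health_subscription(topic_arn, instance_id):
    """Notify the SNS topic about health-related events on the instance."""
    return {
        "SubscriptionName": "db-health-events",
        "SnsTopicArn": topic_arn,
        "SourceType": "db-instance",
        "SourceIds": [instance_id],
        "EventCategories": [
            "low storage",
            "failover",
            "configuration change",
        ],
    }

params = db_health_subscription(
    "arn:aws:sns:us-east-1:123456789012:db-alerts", "sg-cli-test")
# pass to boto3.client("rds").create_event_subscription(**params)
```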
Future deep dive item – We should show action happening when an Event Notification happens. Maybe adding storage when we get a low storage notification. This way we are not paying for storage before we need it. Maybe notifying when a security group change occurs.
Being able to scale your database to keep up with demand is an important part of operating a database for any application or workload. Let’s dive into the ways that the RDS service allows you to scale your database.
When your master database becomes overloaded in trying to handle read and write requests you can either try and scale the master itself or you can take the path of adding read replicas to handle the read workload.
With RDS, read replicas do a couple of things. First, they allow you to create read-only copies of your master database that are kept in sync with it. You can then point read queries from your applications or end users at these read replicas, keeping the read workload on your master database low. Depending on the database engine, you can also place your read replica in a different region from your master, which may let you serve reads closer to a particular locality. Finally, read replicas give you another avenue for failover when there is a problem with the master, giving you coverage in the event of a disaster.
-----------
Benefits
Offload reads that would normally happen against your master database, allowing the master database to be able to focus on serving up writes and maintaining the integrity of the transactions in your database.
Support read intensive applications for specific business units
Promote a read replica to be your production database in the case of failure.
For read replicas there are two deployment options.
Intra-region, which allows you to create additional read replicas within the same AWS region as your master database, in the same or different Availability Zones. This functionality is supported by MySQL, MariaDB, PostgreSQL, and Aurora.
Cross Region which allows you to deploy the read replica into an AWS region that is different from the region that your master is located in. This functionality is supported by MySQL, MariaDB, PostgreSQL, and Aurora
The approach you take for these options is going to depend on the database engine that you are using as well as the ultimate approach that you are trying to take in regards to why you want the read replicas and what your disaster recovery plans and high availability plans are.
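As a sketch of the cross-region case (identifiers below are hypothetical): you call the RDS API in the destination region and pass the source instance's full ARN.

```python
# Sketch: parameters for RDS create_db_instance_read_replica. For a
# cross-region replica you call the API in the *destination* region and
# identify the source by its full ARN. Values below are hypothetical.
def cross_region_replica_params(replica_id, source_arn):
    """Build the parameters for a cross-region read replica."""
    return {
        "DBInstanceIdentifier": replica_id,
        # Full ARN is required when the source lives in another region:
        "SourceDBInstanceIdentifier": source_arn,
    }

params = cross_region_replica_params(
    "sg-cli-test-replica-eu",
    "arn:aws:rds:us-east-1:123456789012:db:sg-cli-test")
# pass to boto3.client("rds", region_name="eu-west-1")
#           .create_db_instance_read_replica(**params)
```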
This is just the same as with our High Availability discussion for Aurora a few slides back.
So with Aurora I want to call out that there are a few differences with how read replicas are handled vs the read replicas that are offered for the other engines.
With the other engines there is going to be some lag between the master and the read replica in regards to getting updated data synched from the master to the read replica.
When it comes to Aurora there is much lower lag between the master and the read replica because in the background they are sharing the same storage layer so there is no need to replicate data between the storage layers of the master and read replica.
With the other engines you can create around five read replicas before you start to put a lot of pressure on the master database in regards to supporting replication to all the read replicas. With Aurora you can actually have up to fifteen read replicas all tied into the master and the shared storage related to your aurora cluster.
For Aurora these read replicas are actually your failover option as well for when there are problems with your master database. With this approach you have the ability to failover to your read replica with nearly no data loss from the failure of the master due to the shared storage that both instances would use.
Finally, if you have a need to have multiple read replicas and want to balance traffic across those nodes we have partnered with the team at MariaDB and there is a driver that you can actually use in your applications that will allow you to balance traffic across your read replicas in order to ensure that you are not overloading any one read replica.
Up to this point I have been talking about read replicas for four of the engines that are supported on RDS.
RDS does not currently support read replicas for Oracle and SQL Server. However, you can still accomplish this on RDS.
To start out with there are several AWS partner products such as Attunity and SnapLogic that you can use to replicate data between two RDS instances of Oracle or SQL Server.
You can also use the Oracle GoldenGate product to replicate data between your databases. Although I list it in the Oracle section, you can actually use GoldenGate to replicate data from a SQL Server source as well.
Finally, you can use snapshots to take a copy of your database and create a new database from that point-in-time copy. We will talk more about snapshots in a bit.
Here is a quick example of what it would take to scale your RDS database instance up or down.
Here we have screen shots of what you do when you want to change the instance size.
Choose the Modify option from the Instance Actions menu of the RDS console.
Then choose what you want your new DB Instance Class to be.
Finally, determine if you want to apply the change immediately or not. If you do not apply the change immediately then the change will be scheduled to occur during the preferred maintenance window that you defined when creating the database.
Keep in mind that when you are applying the change immediately you could incur some downtime as the instance size is changed so be aware of what your applications that are accessing the database can tolerate in regards to downtime.
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Overview.DBInstance.Modifying.html
Here is an overview of what happens if you are running a scaling operation on a database that is running in a single availability zone configuration.
Final piece of animation is what the RDS console shows in regards to the events that just occurred.
Here is an example of how scaling works when you are running your RDS instance in a multi availability zone configuration.
Final screen shot. You can see the RDS recording of the sequence of events for the multi AZ failover.
You can also see what I do at 6:20 AM on a Saturday.
So I showed you how easy it is to scale your RDS instance via the management console. It is also just as easy to scale your RDS instance via the AWS APIs.
The top example shows re-sizing an RDS instance with the command line. You can see that I set the instance class to db.m4.large and set the apply-immediately option so that the scaling happens right away.
Recall the example I talked about a few slides ago where I talked about a scenario where you might want to scale your database down on the weekend when there is lower usage.
What I have on the bottom is an example of a cron job that could be scheduled to perform the scaling actions in the above CLI command but on a schedule. So at 8:00 PM on Friday I call the scale down script to set the RDS instance to a lower size and then at 4:00 AM on Monday I call the scale up script to set the instance size back to what I need it to be during the week.
Now with the cron job you have to have some sort of compute up and running to support cron and you need to grant that compute access to be able to modify your RDS instance. So you are either running some very cheap compute that is just for this or you have an instance that you are comfortable with giving this level of access.
Still on the subject of scaling…
Another option to accomplish the same scheduled scaling activities is a scheduled Lambda function. For those who are not familiar with Lambda, it is an AWS service that allows you to run code on demand based on events; in this case we create a scheduled event. On this screen is an example of some Python code that I created as a scheduled Lambda function that calls the modify_db_instance API to scale my database instance. For the weekend scaling example, you could have two scheduled Lambda functions: one to handle the Friday scale-down and one for the Monday scale-up.
One important thing to call out here is that when you are doing these scaling actions you need to take into consideration how your application is going to react to the changes in the database. It may be necessary to have some additional code that pauses the application while the scale action takes place.
-----------
Lambda creation using python: http://docs.aws.amazon.com/lambda/latest/dg/lambda-python-how-to-create-deployment-package.html
So with RDS there is not a service, like AutoScaling on EC2, that allows you to automatically re-size your database based on metrics.
However, there is still a way to implement something like AutoScaling with RDS using a lot of the same services.
So you can utilize an approach where RDS sends metric information to CloudWatch. You then have alarms firing in CloudWatch based on thresholds that matter to you. Those alarms send notifications to the Simple Notification Service, and a Lambda function subscribed to that notification takes the actual action.
So with this approach let’s say that you care about CPU utilization of your RDS instance. You could have an alarm based on CPU that fires when utilization is above 80%. That alarm would then call a notification that would then call a lambda function to scale up your RDS instance.
Here is an example of a Lambda function that gets fired by an SNS notification. It reads the message sent by the notification, pulls out the name of the database instance, and then calls the modify API to scale up the database.
This is a super simple example. A more thorough piece of code might pull out the current instance type and determine what the next appropriate step up level would be.
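A sketch of that more thorough piece of code: a small step-up ladder (illustrative only, not a complete list of RDS instance classes) that picks the next class before calling modify_db_instance.

```python
# Sketch: given the current instance class, pick the next size up before
# calling modify_db_instance. The ladder below is illustrative, not an
# exhaustive list of RDS instance classes.
SIZE_LADDER = ["db.m4.large", "db.m4.xlarge", "db.m4.2xlarge", "db.m4.4xlarge"]

def next_instance_class(current):
    """Return the next step up, or the current class if already at the top."""
    try:
        i = SIZE_LADDER.index(current)
    except ValueError:
        raise ValueError("unknown instance class: %s" % current)
    return SIZE_LADDER[min(i + 1, len(SIZE_LADDER) - 1)]

def scale_up_params(instance_id, current_class):
    """Build modify_db_instance parameters for a one-step scale-up."""
    return {
        "DBInstanceIdentifier": instance_id,
        "DBInstanceClass": next_instance_class(current_class),
        "ApplyImmediately": True,
    }
```

A Lambda handler triggered by the SNS notification would extract the instance name from the message, look up its current class with describe_db_instances, and pass both through scale_up_params.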
The point of all of this is that you have some options when it comes to scaling your database in an automated fashion.
Amazon Aurora is a MySQL-compatible, ACID compliant enterprise-grade relational database engine built for the cloud.
In many ways, Aurora is a game changer and helps overcome the limitations of traditional relational database engines. Most traditional databases have been built and optimized for on-premises environments, and the fundamentals of their architecture have not changed much in the last few decades.
One of the key objectives of Aurora is to overcome the performance, scalability, and availability limitations of traditional databases in a cost-effective manner similar to open-source databases.
Key properties: durable, fault tolerant, self-healing, auto-scaling storage, automatic backups, high performance, read replicas, instant crash recovery, survivable buffer cache, and security.
Let's look at data that does not fit in cache.
Here we scale the database size up from 1 GB, which fits comfortably in almost any cache, to 1 TB.
Aurora delivers roughly the same performance as long as the data fits in cache.
Then, as you would expect, throughput drops by about 4x once the data no longer fits in cache.
What's interesting with MySQL is that it starts declining well before that point, while the data is still fully cached.
In this case we are running 67x faster, which is a crazy number, so we decided to run something else: the CloudHarmony TPC-C-like benchmark, which is pretty much like the standard TPC-C benchmark the TPC council runs. Here Aurora runs 136x faster when the data does not fit in cache.
Transactions are on the right and statements on the left, because those are the two standard measures for these benchmarks.
If you want high availability or high-throughput read queries, you will probably use read replicas.
MySQL read replicas start to degrade: the lag between the master and the slaves grows as you increase the updates per second through the system.
In Aurora we also use the read replicas as failover targets.
Going from 1,000 to 10,000 updates per second, Aurora stays relatively stable at about 5 ms of replica lag, whereas MySQL goes from a few milliseconds to 300 seconds.
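Replica lag like the figures quoted above is something you can watch yourself: Aurora publishes it as a CloudWatch metric. A minimal sketch, assuming the `AuroraReplicaLag` metric in the `AWS/RDS` namespace (reported in milliseconds) and a hypothetical instance id:

```python
import datetime

def worst_lag_ms(datapoints):
    """Return the highest average lag among CloudWatch datapoints (ms)."""
    return max((p["Average"] for p in datapoints), default=0.0)

def fetch_replica_lag(db_id):
    import boto3  # assumed available in the execution environment

    cw = boto3.client("cloudwatch")
    now = datetime.datetime.utcnow()
    resp = cw.get_metric_statistics(
        Namespace="AWS/RDS",
        MetricName="AuroraReplicaLag",   # milliseconds
        Dimensions=[{"Name": "DBInstanceIdentifier", "Value": db_id}],
        StartTime=now - datetime.timedelta(minutes=15),
        EndTime=now,
        Period=60,                       # one datapoint per minute
        Statistics=["Average"],
    )
    return worst_lag_ms(resp["Datapoints"])
```

You could alarm on this metric the same way the CPU-based scaling example alarms on utilization.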
For the bulk of this talk we want to talk about HOW.
The core of this comes down to improving your algorithms, sharding your locks, and removing contention. That is good work, but anyone can do it; it is not architectural work.
Once you are done with that, you are at the point where you are hitting I/O, and well-designed databases are all about I/O.
With network-attached storage it is all about packets per second. You can push somewhere between 1 and 2 million packets per second on high-end hardware such as EC2 with enhanced networking, but that is still a fairly constrained number.
The other important point: if you want to operate at high throughput, you cannot afford system calls or context switches.
So a lot of our architectural work has been about reducing I/Os, minimizing network packets, and moving toward asynchronous processing.
Here we have a single Aurora instance shown with its storage layer; together these form an Aurora cluster. The cluster consists of one or more Aurora instances and an underlying shared storage layer that spans three AZs. We show Aurora with only a primary instance here, but you can have multiple replicas, up to 15, across those three AZs.
Let's take a look at the storage layer. Aurora maintains six copies of your data across three Availability Zones by default, whether or not you have chosen to add more Aurora instances; the storage is a virtual disk that spans the three AZs.
Because Aurora maintains multiple copies of your data in three Availability Zones, the chance of losing data as a result of a disk failure is minimized. Aurora automatically detects failures in the disk volumes that make up the cluster volume. When a segment of a disk volume fails, Aurora immediately repairs the segment. When Aurora repairs the disk segment, it uses the data in the other volumes that make up the cluster volume to make sure that the data in the repaired segment is current. As a result, Aurora avoids data loss and reduces the need to perform a point-in-time restore to recover from a disk failure.
And by the way, at the storage layer you are billed for only one of those copies, not six. (You are charged based on the storage your database consumes at the database layer, not the storage consumed in this virtualized storage layer.)
For comparison, this is what your cluster would look like with additional Aurora replicas provisioned, here one in each AZ
Let's compare and contrast. Here is the I/O traffic in RDS MySQL with a standby instance.
You issue a write; it goes to EBS, which issues a write to a mirror asynchronously in parallel. Once both writes have happened, we synchronously go to another AZ to do another write with DRBD. The standby then has to issue its own write, and the result propagates back.
The interesting thing to point out is that steps 1, 3, and 5 are sequential and synchronous. So if there is any network "weather" going on, it amplifies both latency and jitter; at a minimum you are looking at 3x, and possibly much more than that.
The other thing you see is that there are many different types of write operations: log, binlog, data, the double write to avoid torn pages, and the frm files for metadata.
What is particularly difficult here is that the data blocks are a good deal larger than the log, about 40x, and you are writing them twice, so that is 80x. That is a big deal, but like any solid database, MySQL does a lot of batching under the covers to mitigate that overhead.
So we ran a 30-minute write-only SysBench workload, with a 100 GB data set, Single-AZ, 30K PIOPS, and we ran 780K transactions.
We saw over 7.4 million I/O operations, and that is only the work the primary instance sees: step 1. Steps 2, 3, 4, and 5 are not reflected. Focus on the network throughput of that single instance, then compare it to Aurora.
Aurora: 27,378,519 transactions, 35x more; 158,323 operations, 46x fewer.
Observations on the I/O flow: there are multiple sequential steps, so latency aggregates and jitter is amplified; both log and data are written, with double-writes because whole pages are overwritten.
So let’s compare it to Aurora.
So in Aurora we ship only log records. When a bunch of log records come to us, we boxcar them together.
SysBench write-only workload, 100 GB data set, Single-AZ, 30K PIOPS, 30 minutes: 779,664 transactions, 7,387,798 operations.
So it was with all these factors in mind that we developed the Database Migration Service. We designed it to be simple: you can get started in less than ten minutes. We designed it to enable near-zero-downtime migration. And we designed it to be a kind of replication Swiss army knife, replicating data between on-premises systems, RDS, EC2, and across database engine types.
Mention why no older engines as sources
+Sybase v15+
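The "get started in less than ten minutes" claim usually means the console, but the same migration task can be defined through the SDK. A minimal sketch, assuming a replication instance and source/target endpoints already exist; every ARN below is a placeholder, and the table-mapping document shown selects all tables in one schema as a simplifying assumption:

```python
import json

def table_mappings(schema):
    """Build a minimal DMS table-mapping document selecting every
    table in one schema (a simplifying assumption)."""
    return json.dumps({
        "rules": [{
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-all",
            "object-locator": {"schema-name": schema, "table-name": "%"},
            "rule-action": "include",
        }]
    })

def create_migration_task():
    import boto3  # assumed available in the execution environment

    dms = boto3.client("dms")
    # All ARNs are placeholders for resources you create beforehand.
    dms.create_replication_task(
        ReplicationTaskIdentifier="my-migration-task",
        SourceEndpointArn="arn:aws:dms:...:endpoint:source",
        TargetEndpointArn="arn:aws:dms:...:endpoint:target",
        ReplicationInstanceArn="arn:aws:dms:...:rep:instance",
        MigrationType="full-load-and-cdc",  # initial load + ongoing replication
        TableMappings=table_mappings("mydb"),
    )
```

`full-load-and-cdc` is what gives the near-zero-downtime property: the full load seeds the target, then change data capture keeps it in sync until you cut over.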
30 seconds to introduce the company, quickly.
The 4 (at most) biggest challenges of the project that were solved by using the AWS cloud.
Solution diagram; explain the solution, its advantages, etc.
Quick overview of what RDS is.
This is a Deep Dive so there are some assumptions that some of the basics with RDS and the benefits are already understood.
We are going to touch on many of these in more depth throughout the presentation.
RDS is a managed database service. It gives you more time to focus on your application: schema design, query construction, query optimization, and building features.
Infra Mgmt
AWS does patching
AWS Handles backup and replication
AWS manages the Infrastructure and making sure that it is healthy
You focus on your application
HA and automated failover management
High-end features that you could build on your own, but that you get automatically.
Instant Provisioning
Simple and Fast to deploy
When you need to launch a new database or change an existing one, you can do so at any time, with no need to wait for infrastructure to be ordered or configured.
Scale up/Down
Simple and fast to scale. You can change your configuration to meet your needs when you want to.
Cost-effective
No Cost to get started
Pay only for what you consume
Application Compatibility
* Six different engines to choose from
* Many popular applications, or even your own custom code, that you run on your own infrastructure today can still work on RDS. If you are using one of the currently supported engines, there is a good chance you can get it working on RDS.
When you think about all that it takes to get new database infrastructure and an actual database up and running, there is a lot that an expert DBA and an infrastructure person would have to do. With RDS you get this with just a few clicks and are up and running in a matter of minutes.
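Those "few clicks" map to a single API call in the SDK. A minimal sketch; every value below is an illustrative assumption, not a recommendation:

```python
def launch_params(identifier, engine):
    """Assemble minimal arguments for a new RDS instance; all values
    here are illustrative assumptions."""
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": engine,                  # e.g. "mysql", "postgres"
        "DBInstanceClass": "db.m4.large",
        "AllocatedStorage": 100,           # GB
        "MasterUsername": "admin",
        "MasterUserPassword": "change-me", # placeholder credential
        "MultiAZ": True,                   # synchronous standby + automatic failover
        "BackupRetentionPeriod": 7,        # days of automated backups
    }

def create_instance():
    import boto3  # assumed available in the execution environment
    boto3.client("rds").create_db_instance(**launch_params("my-db", "mysql"))
```

Note how the "high-end features you get automatically" from the earlier slides (Multi-AZ failover, automated backups) are just parameters here rather than projects.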