Aerospike 3
Configuration:
Basic Configuration
Part 2
Young Paik
Director of Sales Engineering
young@aerospike.com
Aerospike: aer·o·spike [air-oh-spahyk]
noun, 1. tip of a rocket that enhances speed and stability
Aerospike
Configuration
Data Storage
Curriculum
This training module is an overview of the
Aerospike Database and covers the basic
configuration of a single Aerospike Database
Cluster.

This course has been split into 3 areas:
1. Overview*
2. Database Service Configuration*
3. Database Storage Configuration
* These parts were covered in the first part of the webinar series.
© 2014 Aerospike. All rights reserved. Confidential

Pg. 3
Agenda
• Data Hierarchy
• How Aerospike reads/writes/updates data
• Defragmentation
• Data Hygiene
• Configuring Storage


Data Hierarchy
[Diagram: a Cluster is made up of Nodes (Node 1, Node 2, Node 3). A Namespace contains Sets, a Set contains Records, and a Record contains Bins.]

Database Hierarchy

Cluster
Definition: An Aerospike cluster services a single database service.
Notes: While a company may deploy multiple clusters, applications will only connect to a single cluster.

Node
Definition: A single instance of an Aerospike database.
Notes: For production deployments, a host should only have a single node. For development, you may place more than one node on a host.

Namespace
Definition: An area of storage related to the media. Can be either RAM or SSD based.
Notes: Similar to a “database” or “tablespace” in relational databases.

Set
Definition: An unstructured grouping of data that have some commonality.
Notes: Similar to “tables” in a relational database, but do not require a schema.

Record
Definition: A key and all data related to that key.
Notes: Similar to a “row” in a relational database.

Bin
Definition: One part of data related to a key.
Notes: Bins in Aerospike are typed, but the same bin in different records can have different types. Bins are not required. Single bin optimizations are allowed.

Cluster
• Will be distributed on different nodes.
• Management of cluster is automated, so
no manual rebalancing or reconfiguration
is necessary.
• Will contain one or more namespaces.
Adding/removing namespaces requires a
cluster-wide restart.

Nodes
• Each node is assumed to be identical.
• Data (and their associated traffic) will be
evenly balanced across the nodes.
• Big differences between nodes imply a
problem.
• Node capacity should take into account
node failure patterns.

Namespaces
• Are associated with the storage media:
– Hybrid (RAM for index and SSD for data)
– RAM + disk for persistence only
– RAM only

• Each can be configured with its own:
– replication factor (change requires a cluster-wide restart)
– RAM and disk configuration
– settings for high-watermark
– default TTL (if you have data that must never be automatically deleted, you must set this to “0”)

Sets
• Similar to “tables” in relational
databases.
• Sets are optional.
• Schema does not have to be pre-defined.
• In order to request a record, you must
know its set.
• Scans can be done across a set

Records
• Similar to a row in a relational database.
• All data for a record will be stored on the
same node.
• Any change to a record will result in a
complete write of the entire record.
• Sometimes referred to as “objects”

Bins
• Values are typed. Current types are:
– Simple (integer, string, blob [language specific])
– Complex (list, map)

• A single bin may be updated by the client.
– Increment
– Replacement
– User Defined Function (UDF)

Data Hierarchy
[Diagram repeated: Cluster, Nodes 1 to 3, Namespace, Set, Record, Bin.]

Data Access Patterns
How does Aerospike handle the following operations?
– Read
– Write*
– Update*

* The following slides do not illustrate how replicas are written.

Accessing An Object In Aerospike
Reading A Standard Data Type With SSDs

[Diagram: Client; Master Node with DRAM (index) holding an index reference into SSD (data); block size 128 KB by default.]
1) Client finds Master Node from partition map.
2) Client makes read request to Master Node.
3) Master Node finds data location from index in DRAM.
4) Master Node reads entire object from SSD. This is true even when reading only a single bin.
5) Master Node returns value.

Accessing An Object In Aerospike
Writing A New Standard Data Type Record With SSDs

[Diagram: Client; Master Node with DRAM (index) and SSD (data); asynchronous write to SSD; block size 128 KB by default.]
1) Client finds Master Node from partition map.
2) Client makes write request to Master Node.
3) Master Node makes an entry in the index (in DRAM) and queues the write in a temporary write buffer.
4) Master Node coordinates write with replica nodes (not shown).
5) Master Node returns success to client.
6) Master Node asynchronously writes data in blocks.
7) Index in DRAM points to location on SSD.

Accessing An Object In Aerospike
Updating A Standard Data Type Record With SSDs

[Diagram: Client; Master Node with DRAM (index) and SSD (data) holding an old and a new copy of the record; asynchronous write to SSD; block size 128 KB by default.]
1) Client finds Master Node from partition map.
2) Client makes update request to Master Node.
3) Master Node reads the existing record (if using multiple bins).
4) Master Node queues write of updated record in a temporary write buffer.
5) Master Node coordinates write with replica nodes (not shown).
6) Master Node returns success to client.
7) Master Node asynchronously writes data in blocks.
8) Index in DRAM points to new location on SSD.
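The write and update steps above can be sketched as a toy Python model. This is an illustration only, not Aerospike's implementation: ToyNode and its methods are hypothetical names, replica coordination is not modeled, and the flush is called explicitly here whereas the slides describe it as asynchronous.

```python
class ToyNode:
    """Hypothetical in-memory stand-in for one master node."""

    def __init__(self):
        self.index = {}    # "DRAM" index: key -> (location, slot)
        self.blocks = []   # "SSD": flushed blocks, each a list of whole records
        self.buffer = []   # temporary write buffer, not yet on "SSD"

    def read(self, key):
        # Index lookup in "DRAM", then fetch the whole record.
        location, slot = self.index[key]
        store = self.buffer if location == "buffer" else self.blocks[location]
        return dict(store[slot])

    def write(self, key, bins):
        # Update step 3: read the existing record, if any, and merge the new bins.
        record = self.read(key) if key in self.index else {}
        record.update(bins)
        # Update step 4: queue the complete record in the write buffer.
        self.buffer.append(record)
        self.index[key] = ("buffer", len(self.buffer) - 1)
        return "success"   # step 6; replica coordination (step 5) is not modeled

    def flush(self):
        # Steps 7-8: write the buffer out as one new block and repoint the index.
        if not self.buffer:
            return
        block_no = len(self.blocks)
        self.blocks.append(self.buffer)
        for key, (location, slot) in list(self.index.items()):
            if location == "buffer":
                self.index[key] = (block_no, slot)
        self.buffer = []


node = ToyNode()
node.write("user1", {"name": "Ann", "visits": 1})
node.flush()
node.write("user1", {"visits": 2})   # update: the whole record is rewritten
node.flush()
print(node.index["user1"])           # index now points at the new block
print(node.read("user1"))
print(node.blocks[0][0])             # stale copy left behind in the old block
```

The stale copy left in the old block is the unused space that the defragmenter later reclaims.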

Accessing An Object In Aerospike
How Space Is Freed Up
Aerospike writes the data in large data blocks.
[Diagram: SSD (data) shown as blocks 1 to 8; block size 128 KB by default.]

Accessing An Object In Aerospike
How Space Is Freed Up
As new data is added to the disk, new blocks will be continually written to the SSD.
[Diagram: SSD (data) shown as blocks 1 to 8; block size 128 KB by default.]

Accessing An Object In Aerospike
How Space Is Freed Up
Over time, some records will be deleted or updated, resulting in fragmented usage on the flash/SSD disk. This unused space must be freed up.
[Diagram: SSD (data) shown as blocks 1 to 8.]

Accessing An Object In Aerospike
How Space Is Freed Up
Some databases use a nightly process called “compaction,” which is an intensive process. Aerospike runs a regular process (every few minutes) that looks for blocks below some level of use (called the high watermark).

In this example, if the high watermark is 50%, blocks 1 and 3 in the diagram are below 50% occupied. The defragmenter will take the data in these blocks and merge them into another block.
[Diagram: SSD (data) shown as blocks 1 to 8.]

Accessing An Object In Aerospike
How Space Is Freed Up
The defragmenter will write the new block (block 7) and clear up blocks 1 and 3 for new writes.

Because this runs constantly, there is no special time where the performance of the database is bad.

This algorithm operates best when the SSD is less than 50% occupied. As disk use grows above this, the performance of the defragmenter will decrease.
[Diagram: SSD (data) shown as blocks 1 to 8.]
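A minimal Python sketch of the defragmentation idea described on these slides, assuming a simplified model in which each block holds a fixed number of equal-sized records. The function name, the capacity parameter and the free-block list are hypothetical and are not taken from Aerospike.

```python
def defragment(blocks, free_ids, capacity, threshold_pct=50):
    """Merge under-used blocks into free blocks.

    blocks:    dict of block id -> list of live records in that block
    free_ids:  ids of blocks that are currently empty
    capacity:  records per block (toy assumption: all records are the same size)
    Returns (blocks after defragmentation, sorted list of free block ids).
    Assumes at least one free block is available when there is data to move.
    """
    # Blocks whose occupancy is below the threshold are defragmentation candidates.
    candidates = sorted(
        bid for bid, recs in blocks.items()
        if 100 * len(recs) / capacity < threshold_pct
    )
    # Collect the live records from the candidates.
    live = [rec for bid in candidates for rec in blocks[bid]]
    remaining = {bid: recs for bid, recs in blocks.items() if bid not in candidates}
    # Pack the live records into free blocks, one full block at a time.
    free = sorted(free_ids)
    for start in range(0, len(live), capacity):
        remaining[free.pop(0)] = live[start:start + capacity]
    # The candidate blocks are now empty and available for new writes.
    return remaining, sorted(free + candidates)


# Mirrors the slide example: blocks 1 and 3 are under 50% full, block 7 is free.
blocks = {
    1: ["a"],
    2: ["b", "c", "d"],
    3: ["e"],
    4: ["f", "g", "h", "i"],
    5: ["j", "k", "l"],
    6: ["m", "n"],
}
after, free = defragment(blocks, free_ids=[7, 8], capacity=4)
print(after[7])   # records from blocks 1 and 3, merged
print(free)       # blocks 1 and 3 are free again, plus the unused block 8
```

In this model the more blocks that sit above the threshold, the fewer candidates there are to merge, which is one way to picture why the slides recommend keeping the SSD less than 50% occupied.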

Data Hygiene
Every database has a maximum capacity.
Aerospike provides you with a few tools to
manage the behavior of the database.
• TTL (time-to-live) and Expiration
• High watermark and Eviction
• Stop-writes


TTL and Expiration
Aerospike has a mechanism for automatically
expiring data if it has not been changed in a
while.
In the configuration of the namespace you can set
a TTL (time-to-live). When the data has reached
its TTL limit, it will be automatically deleted
from the database.

TTL and Expiration
This works using the following logic:
• When a record is first written, the server will take the current
time and add the TTL. The server will use the default TTL as
configured on the server or, if supplied, an overriding TTL from the
client.
• If the record is updated the TTL will be extended as if it were a
new write. You can override this from the client.
• A “touch” from the client has the same impact, but does not
change the data.
• Reads do not affect the expiration time.
• When the expiration time has been reached a background job will
delete the data.
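The expiration logic above can be sketched in Python as follows. This is a toy model under stated assumptions, not the server's code: ToyNamespace and its methods are hypothetical, time is passed in explicitly as seconds, and a client-supplied TTL is assumed to be a positive number.

```python
class ToyNamespace:
    """Hypothetical model of TTL handling for one namespace."""

    def __init__(self, default_ttl):
        self.default_ttl = default_ttl   # seconds; 0 means records never expire
        self.records = {}                # key -> (value, expires_at or None)

    def _expiry(self, now, ttl):
        if ttl is not None and ttl <= 0:
            raise ValueError("client TTL override must be positive in this sketch")
        ttl = self.default_ttl if ttl is None else ttl
        return None if ttl == 0 else now + ttl

    def write(self, key, value, now, ttl=None):
        # New writes and updates both reset the expiration time.
        self.records[key] = (value, self._expiry(now, ttl))

    def touch(self, key, now, ttl=None):
        # Extends the expiration time without changing the data.
        value, _ = self.records[key]
        self.records[key] = (value, self._expiry(now, ttl))

    def read(self, key):
        # Reads do not affect the expiration time.
        return self.records[key][0]

    def expire(self, now):
        # Background job: delete every record whose expiration time has passed.
        dead = [k for k, (_, exp) in self.records.items()
                if exp is not None and exp <= now]
        for k in dead:
            del self.records[k]
        return dead


ns = ToyNamespace(default_ttl=100)
ns.write("k", "v", now=0)      # expires at t=100
ns.read("k")                   # no effect on expiration
ns.touch("k", now=50)          # expires at t=150
print(ns.expire(now=120))      # nothing has expired yet
print(ns.expire(now=150))      # "k" is removed
```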

TTL and Expiration
Special Note:
If you do not want data to expire, but want to keep all data, you can set
the TTL value to 0.
However, if you have set the TTL to any non-zero value, it is not possible
to set any object to not expire.
As a workaround, you can set a very long TTL from the client.

High Watermarks and Eviction
One of the most dangerous things that can happen to a
database is to reach maximum capacity.
Aerospike has a set of safety measures that will allow you
to minimize the chances of hitting that limit and to take
corrective action before then.

High Watermarks and Eviction
A high watermark is a threshold for RAM or disk use. When the
watermark has been breached, the server will begin to evict data. It
will start by evicting data closest to its TTL.
For example:
A namespace has a 90 day TTL. The server automatically deletes data
that has not been written in the past 90 days. The server hits the high
watermark for SSD usage. It will begin to delete the data closest to expiring:
data last written 89 days ago, then 88 days ago, etc.
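As a rough Python illustration of the eviction order in this example, assuming a simplified model where each record has a size and an expiration time. The function and its parameters are hypothetical, and the real server's eviction mechanics may differ in detail.

```python
def evict(records, capacity, high_water_pct):
    """Evict records closest to expiration until usage is back at the watermark.

    records:  dict of key -> (size, expires_at); modified in place
    capacity: total storage available, in the same units as size
    Returns the evicted keys, in eviction order.
    """
    limit = capacity * high_water_pct / 100
    used = sum(size for size, _ in records.values())
    evicted = []
    # Visit records in order of expiration time, soonest first.
    for key in sorted(records, key=lambda k: records[k][1]):
        if used <= limit:
            break
        used -= records[key][0]
        evicted.append(key)
    for key in evicted:
        del records[key]
    return evicted


# Three records of size 30 on a device of size 100 with a 50% watermark.
records = {"a": (30, 10), "b": (30, 5), "c": (30, 20)}
print(evict(records, capacity=100, high_water_pct=50))   # "b" goes first, then "a"
print(sorted(records))                                   # only "c" remains
```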

Stop Writes
Suppose data is being written/changed in the
database at such a rate that the evictions cannot
keep up.
It is possible for the database to use more and
more space until it hits 100% usage. In order to
prevent data loss, Aerospike has a mark called
“stop-writes.” When the database usage hits this
level, the database will stop writing new data.
The server will respond with an error to new
write requests.

Configuring Storage
Configuration File
There are 7 major stanzas in an Aerospike configuration file:
• service (required, covered in Part 1)
• logging (required, covered in Part 1)
• network (required, covered in Part 1)
• mod-lua (required for 3.x, covered in Part 1)
• cluster (optional, not covered)
• namespace (at least 1 required)
• xdr (optional, not covered)

These will look like this:
namespace <NAMESPACE NAME> {
...
}

Special Note

Parameters that are most commonly problematic
are denoted in RED. Pay special attention to
these, since the ramifications of improperly
setting these variables may take months to show
up or be difficult to fix once set.

Namespace Configuration Parameters
There are 2 different kinds of namespaces:
• RAM (with or without HDD for persistence)
• Flash/SSD

Each namespace type has specific parameters,
but some are common to all of them.

Namespace Parameters: Common Parameters
Some of the parameters for namespaces are
common for all types.
Parameters covered:
• Replication factor
• Watermarks
• DRAM allocation
• Time-to-live (TTL)

Replication Factor
Description

The total (master + replica) copies of data in the database
cluster.

Stanza location

namespace

Config parameters
(defaults)

replication-factor

Notes

Some databases refer to the replication factor as being only the
number of copies, without the master. Aerospike refers to it as
the total number of copies.

Change dynamically

No

Best practices

For almost all use cases, the best replication factor is “2”.
“1” would not give any replication and means that the loss of a
node means a loss of data.
“3” or more would mean you would need additional storage to
accommodate the additional copies.

Storage Watermarks
Description

Aerospike has built in a set of watermarks that trigger actions
meant to act as safety valves.

Stanza location

namespace

Config parameters
(defaults)

high-water-memory-pct
high-water-disk-pct
stop-writes-pct

Notes

These features do not exist in most other databases, so some care should
be taken to understand how these work. If usage exceeds either the high-water-memory-pct
or the high-water-disk-pct, the server will begin to evict data closest to its
time-to-live (TTL). If either RAM or disk usage should exceed the stop-writes-pct,
the server will no longer accept write requests for new records
(note that it will accept updates to existing records).

Change dynamically

Yes

Best practices

• Recommended settings are:
– high-water-memory-pct 60
– high-water-disk-pct 50
– stop-writes-pct 90
• You should always account for changes in node count within the cluster. So consider what happens if you were to lose a node.
• These settings are dynamically changeable. If you do want to increase these values temporarily, make sure to take the appropriate corrective action.
• In order to properly defragment Flash/SSD namespaces, keep the high-water-disk-pct at 50.
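To make the "what if you lose a node" advice concrete, here is a small Python sketch. It assumes data is evenly balanced and is redistributed evenly across the surviving nodes; the function names are hypothetical, and the thresholds default to the recommended disk settings listed above.

```python
def usage_after_node_loss(used_pct, nodes, lost=1):
    """Per-node usage (%) once the data of the lost nodes is spread over the rest."""
    if lost >= nodes:
        raise ValueError("at least one node must survive")
    return used_pct * nodes / (nodes - lost)


def storage_state(used_pct, high_water_pct=50, stop_writes_pct=90):
    """Classify usage against the watermarks, following the slide's description."""
    if used_pct > stop_writes_pct:
        return "stop-writes"
    if used_pct > high_water_pct:
        return "evicting"
    return "normal"


# A 4-node cluster at 40% disk use is below the high watermark...
print(storage_state(40))
# ...but after losing one node, each survivor holds 4/3 of what it held before.
after = usage_after_node_loss(40, nodes=4)
print(round(after, 1), storage_state(after))
```

A cluster that looks comfortably below the watermark can therefore start evicting as soon as a single node fails.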

RAM Allocation
Description

All namespaces (whether RAM or Flash/SSD) require RAM for at
least the index. This controls how much RAM is allocated.

Stanza location

namespace

Config parameters
(defaults)

memory-size (4G)

Notes

The minimum size for this is 1 GB. Aerospike uses exactly 64
bytes of memory for each record. This will be multiplied by the
replication factor.
If you are using a RAM namespace, all data will also be stored
in RAM. Aerospike does not cache data.

Change dynamically

Yes

Best practices

Since each namespace uses at least 1 GB, do not configure
unused namespaces.
Always be aware that the amount of RAM in use will increase if
the cluster loses a node.
The amount of memory set here should also take into account
the high water marks for memory above.
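The index arithmetic above can be worked through in Python. This is a back-of-the-envelope sketch that uses only the figures from this slide (64 bytes per record, multiplied by the replication factor) and assumes records are spread evenly over the nodes. The function names are hypothetical, and data held in RAM namespaces is not counted.

```python
INDEX_BYTES_PER_RECORD = 64   # per-record index cost stated on the slide
GIB = 2 ** 30


def index_ram_per_node(records, replication_factor, nodes):
    """Bytes of index RAM each node needs, assuming an even spread of records."""
    total = records * INDEX_BYTES_PER_RECORD * replication_factor
    return total / nodes


def fits(records, replication_factor, nodes, memory_size_bytes,
         high_water_memory_pct=60):
    """True if the index stays at or below the memory high watermark."""
    budget = memory_size_bytes * high_water_memory_pct / 100
    return index_ram_per_node(records, replication_factor, nodes) <= budget


# 50 million records, replication factor 2, memory-size 4G per node.
print(index_ram_per_node(50_000_000, 2, 4) / GIB)   # index size per node, in GiB
print(fits(50_000_000, 2, 4, 4 * GIB))              # healthy 4-node cluster
print(fits(50_000_000, 2, 3, 4 * GIB))              # same data after losing a node
print(fits(100_000_000, 2, 4, 4 * GIB))             # twice the records
```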

Namespace Configuration: Common Parameters
The general configuration parameters for every namespace are:
namespace test_namespace {
    replication-factor 2
    high-water-memory-pct 60
    high-water-disk-pct 50
    stop-writes-pct 90
    memory-size 4G
    default-ttl 2592000    # TODO: Default 30 days expiration (1 day=86400)
    ...
}

Data Storage In RAM (No persistence)
RAM namespaces with no persistence store all
data and indexes in RAM.
The configuration parameters are:
➤ Replication factor*
➤ Watermarks*
➤ RAM allocation*
➤ Time-to-live (TTL)*
• Storage Engine
*See the Data Storage Common Parameters above.

Storage Engine for RAM (No Persistence)
Description

Sets how data will be stored.

Stanza location

namespace

Config parameters
(defaults)

storage-engine memory

Notes

Aerospike does not cache. So all data (index + data) will be
stored in RAM.

Change dynamically

No

Best practices

Be sure to take into consideration what will happen if you lose
one or more nodes. See the Storage Watermark section above.

Namespace Configuration: RAM (No Persistence)
The configuration parameters for a RAM namespace with no persistence are:
namespace test_namespace {
    replication-factor 2
    high-water-memory-pct 60
    high-water-disk-pct 50
    stop-writes-pct 90
    memory-size 4G
    default-ttl 2592000    # TODO: Default 30 days expiration (1 day=86400)
    storage-engine memory
}

Data Storage In RAM + HDD
RAM namespaces with persistence (RAM + HDD)
store all data and indexes in RAM, and also persist the
data on disk. Note that although we say “HDD,”
you may also use SSDs.
The configuration parameters are:
➤ All the values from Data Storage in RAM above
• Storage Engine
• Persistence File
• Whether or not to load at startup
• Whether or not to store data in memory

Storage Engine for RAM + HDD (With Persistence)
Description

Sets how data will be stored.

Stanza location

namespace

Config parameters
(defaults)

storage-engine device

Notes

Aerospike does not cache. So all data (index + data) will be
stored in RAM. The device will only be used for persistence and
will only be used when starting up the node. The server will use
this to rebuild the index in memory.

Change dynamically

No

Best practices

Be sure to take into consideration what will happen if you lose
one or more nodes. See the Storage Watermark section above.

RAM + HDD Persistence File
Description

These are the settings for the persistence file

Stanza location

namespace:storage-engine

Config parameters
(defaults)

file or device
filesize

Notes

When using a file, you specify the path of the file. When using
the device, you specify the device (such as “/dev/sdb1”) of the
disk or partition. You must also specify the filesize.
When using device, the server assumes it will use the entire
disk, so please make sure there is no data on this partition.

Change dynamically

No

Best practices

• Aerospike recommends setting the size of the persistence file at 8x the amount of RAM allocated in the “memory-size” of the namespace.
• You may use more than one file or device; the server will balance data across them.
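The 8x sizing rule above is simple arithmetic; a small Python helper makes it explicit. This is a sketch: the function names are hypothetical, and the size suffixes (K, M, G, T as powers of 1024) are an assumption based on the values such as 4G and 128K used in this deck's configuration examples.

```python
UNITS = {"K": 2 ** 10, "M": 2 ** 20, "G": 2 ** 30, "T": 2 ** 40}


def parse_size(text):
    """Turn a size such as '4G' into a number of bytes."""
    text = text.strip().upper()
    if text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def recommended_filesize(memory_size, factor=8):
    """Persistence file size suggested by the 8x rule, using the largest exact unit."""
    size = parse_size(memory_size) * factor
    for suffix in "TGMK":
        if size % UNITS[suffix] == 0:
            return f"{size // UNITS[suffix]}{suffix}"
    return str(size)


print(recommended_filesize("4G"))     # for memory-size 4G
print(recommended_filesize("512M"))   # for memory-size 512M
```

For a memory-size of 4G this gives 32G, which is the filesize used in the RAM + HDD example configuration in this deck.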

Load At Startup Option
Description

Determines whether or not the server should read from
persistence to load data.

Stanza location

namespace:storage-engine

Config parameters
(defaults)

load-at-startup (false)

Notes

When set to true, the server will read from the persistence
store and build up the primary index in RAM.
When set to false, the server will disregard any data in the
persistence store and get data from the other nodes.

Change dynamically

No

Best practices

• Aerospike recommends setting this to be true.
• If the node has been offline for long enough (many hours or longer), the server will have to deal with data conflicts. It will take less time to get the data from the other nodes. In this case, you should set the value to “false”. However, after the server has started, change the value in the config file to “true” again to deal with short-term outages.

Store Data In Memory
Description

Tells the server to store data in memory, rather than on
storage.

Stanza location

namespace:storage-engine

Config parameters
(defaults)

data-in-memory

Notes

When set to “true”, the server will store data in RAM and use
disk only for persistence.
If set to “false”, the server will only store the index in RAM and
use the disk for data storage.

Change dynamically

No

Best practices

• In principle it is possible to use rotational disk to store the data and use RAM only for the index. Aerospike has found that performance with this configuration is very poor. Aerospike does not recommend or support this configuration.

Namespace Configuration: RAM + HDD
The configuration parameters for RAM + HDD are:
namespace RAM_persist_namespace {
    replication-factor 2
    high-water-memory-pct 60
    high-water-disk-pct 50
    stop-writes-pct 90
    memory-size 4G
    default-ttl 2592000    # TODO: Default 30 days expiration (1 day=86400)
    storage-engine device {
        file /opt/aerospike/data/test.data
        filesize 32G
        load-at-startup true
        data-in-memory true
    }
}

Data Storage In Flash/SSD
Flash/SSD namespaces store indexes in RAM and
all data on Flash/SSD.
The configuration parameters are:
➤ Common Data Storage Parameters*
• Storage Engine
• Flash/SSD Devices
• Whether or not to load at startup
• Whether or not to store data in memory
*See the Data Storage Common Parameters above.

Storage Engine for Flash/SSD
Description

Sets how data will be stored.

Stanza location

namespace

Config parameters
(defaults)

storage-engine device

Notes

The server will use RAM for index only and Flash/SSD for data.

Change dynamically

No

Best practices

• Be sure to take into consideration what will happen if you lose one or more nodes. See the Storage Watermark section above.
• Flash/SSDs should be no more than 50% full to allow for efficient defragmentation. Usage above this level is fine for short periods of time.

Flash/SSD Devices
Description

These are the settings for the Flash/SSD devices.

Stanza location

namespace:storage-engine

Config parameters
(defaults)

device

Notes

You must specify the device (such as /dev/sdb) of the SSDs.

Change dynamically

No

Best practices

• You may use more than one file or device; the server will balance data across them. Do not use heterogeneously sized devices.
• You may use SATA or PCIe devices.

Load At Startup Option
Description

Determines whether or not the server should read from
persistence to load data.

Stanza location

namespace:storage-engine

Config parameters
(defaults)

load-at-startup (false)

Notes

When set to true, the server will read from the persistence
store and build up the primary index in RAM.
When set to false, the server will disregard any data in the
persistence store and get data from the other nodes.

Change dynamically

No

Best practices

• Aerospike recommends setting this to be true.
• If the node has been offline for long enough (many hours or longer), the server will have to deal with data conflicts. It will take less time to get the data from the other nodes. In this case, you should set the value to “false”. However, after the server has started, change the value in the config file to “true” again to deal with short-term outages.

Store Data In Memory
Description

Tells the server to store data in memory, rather than on
storage.

Stanza location

namespace:storage-engine

Config parameters
(defaults)

data-in-memory

Notes

When set to “true”, the server will store data in RAM and use
disk only for persistence.
If set to “false”, the server will only store the index in RAM and
use the disk for data storage.

Change dynamically

No

Best practices

• In principle it is possible to use Flash/SSD as persistence only. However, most modern rotational hard drives are fast enough so that using Flash/SSD is not necessary.
• The drives must be cleared or “dd”ed prior to use in the database.
• Be sure to test the drives in the servers using the Aerospike ACT test tool (http://aerospike.github.io/act/).

Write Block Size
Description

Tells the server what block size to use when writing data.

Stanza location

namespace:storage-engine

Config parameters
(defaults)

write-block-size

Notes

The value for this must be a multiple of 128 KB. Aerospike has
tested these up to 1 MB. No single record can be larger than
this block size, so size this according to the maximum size in
the database.

Change dynamically

No

Best practices

Size these according to the largest size of any object in your
database.
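A short Python sketch of how one might pick a write-block-size from the largest expected record, following only the constraints stated on this slide (a multiple of 128 KB, tested up to 1 MB). The function name is hypothetical, and any per-record storage overhead is ignored in this sketch.

```python
KIB = 1024
STEP = 128 * KIB          # block size must be a multiple of this (per the slide)
TESTED_MAX = 1024 * KIB   # largest block size the slide says has been tested


def choose_write_block_size(max_record_bytes):
    """Smallest multiple of 128 KB that can hold the largest record."""
    if max_record_bytes <= 0:
        raise ValueError("record size must be positive")
    multiples = -(-max_record_bytes // STEP)   # ceiling division
    size = multiples * STEP
    if size > TESTED_MAX:
        raise ValueError("record needs a block larger than the tested 1 MB")
    return size


print(choose_write_block_size(100 * KIB) // KIB)   # block size in KB for a 100 KB record
print(choose_write_block_size(200_000) // KIB)     # block size in KB for a 200,000 byte record
```

Records larger than the chosen block size cannot be stored, so the input should be the largest object the application will ever write, not the average.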

Namespace Configuration: Flash/SSD
The configuration parameters for Flash/SSD are:
namespace ssd_namespace {
    replication-factor 2
    high-water-memory-pct 60
    high-water-disk-pct 50
    stop-writes-pct 90
    memory-size 4G
    default-ttl 2592000    # TODO: Default 30 days expiration (1 day=86400)
    storage-engine device {
        device /dev/sdb
        device /dev/sdc
        load-at-startup true
        data-in-memory false
        write-block-size 128K
    }
}

Agenda






• Data Hierarchy
• How Aerospike reads/writes/updates data
• Defragmentation
• Data Hygiene
• Configuring Storage

Q&A
Thank You
Send any questions/comments/complaints to

Young Paik
young@aerospike.com

Mais conteúdo relacionado

Mais procurados

Ceph RBD Update - June 2021
Ceph RBD Update - June 2021Ceph RBD Update - June 2021
Ceph RBD Update - June 2021Ceph Community
 
SQL-on-Hadoop Tutorial
SQL-on-Hadoop TutorialSQL-on-Hadoop Tutorial
SQL-on-Hadoop TutorialDaniel Abadi
 
QEMU Disk IO Which performs Better: Native or threads?
QEMU Disk IO Which performs Better: Native or threads?QEMU Disk IO Which performs Better: Native or threads?
QEMU Disk IO Which performs Better: Native or threads?Pradeep Kumar
 
Crimson: Ceph for the Age of NVMe and Persistent Memory
Crimson: Ceph for the Age of NVMe and Persistent MemoryCrimson: Ceph for the Age of NVMe and Persistent Memory
Crimson: Ceph for the Age of NVMe and Persistent MemoryScyllaDB
 
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiHow to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiFlink Forward
 
A Technical Introduction to WiredTiger
A Technical Introduction to WiredTigerA Technical Introduction to WiredTiger
A Technical Introduction to WiredTigerMongoDB
 
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveDataWorks Summit
 
Under the Hood of a Shard-per-Core Database Architecture
Under the Hood of a Shard-per-Core Database ArchitectureUnder the Hood of a Shard-per-Core Database Architecture
Under the Hood of a Shard-per-Core Database ArchitectureScyllaDB
 
RocksDB compaction
RocksDB compactionRocksDB compaction
RocksDB compactionMIJIN AN
 
Performance optimization for all flash based on aarch64 v2.0
Performance optimization for all flash based on aarch64 v2.0Performance optimization for all flash based on aarch64 v2.0
Performance optimization for all flash based on aarch64 v2.0Ceph Community
 
The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...
The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...
The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...Dremio Corporation
 
InfluxDB IOx Tech Talks: The Impossible Dream: Easy-to-Use, Super Fast Softw...
InfluxDB IOx Tech Talks: The Impossible Dream:  Easy-to-Use, Super Fast Softw...InfluxDB IOx Tech Talks: The Impossible Dream:  Easy-to-Use, Super Fast Softw...
InfluxDB IOx Tech Talks: The Impossible Dream: Easy-to-Use, Super Fast Softw...InfluxData
 
XPDS16: Keeping coherency on ARM - Julien Grall, ARM
XPDS16: Keeping coherency on ARM - Julien Grall, ARMXPDS16: Keeping coherency on ARM - Julien Grall, ARM
XPDS16: Keeping coherency on ARM - Julien Grall, ARMThe Linux Foundation
 
分布式存储的元数据设计
分布式存储的元数据设计分布式存储的元数据设计
分布式存储的元数据设计LI Daobing
 
A crash course in CRUSH
A crash course in CRUSHA crash course in CRUSH
A crash course in CRUSHSage Weil
 
Ceph: Open Source Storage Software Optimizations on Intel® Architecture for C...
Ceph: Open Source Storage Software Optimizations on Intel® Architecture for C...Ceph: Open Source Storage Software Optimizations on Intel® Architecture for C...
Ceph: Open Source Storage Software Optimizations on Intel® Architecture for C...Odinot Stanislas
 
2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific Dashboard2021.02 new in Ceph Pacific Dashboard
2021.02 new in Ceph Pacific DashboardCeph Community
 
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DBDistributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DB
Distributed Databases Deconstructed: CockroachDB, TiDB and YugaByte DBYugabyteDB
 
Implementing distributed mclock in ceph
Implementing distributed mclock in cephImplementing distributed mclock in ceph
Implementing distributed mclock in ceph병수 박
 
Performance Analysis of Apache Spark and Presto in Cloud Environments
Performance Analysis of Apache Spark and Presto in Cloud EnvironmentsPerformance Analysis of Apache Spark and Presto in Cloud Environments
Performance Analysis of Apache Spark and Presto in Cloud EnvironmentsDatabricks
 

Mais procurados (20)

Ceph RBD Update - June 2021
Ceph RBD Update - June 2021Ceph RBD Update - June 2021
Ceph RBD Update - June 2021
 
SQL-on-Hadoop Tutorial
SQL-on-Hadoop TutorialSQL-on-Hadoop Tutorial
SQL-on-Hadoop Tutorial
 
QEMU Disk IO Which performs Better: Native or threads?
QEMU Disk IO Which performs Better: Native or threads?QEMU Disk IO Which performs Better: Native or threads?
QEMU Disk IO Which performs Better: Native or threads?
 
Configuring Aerospike - Part 2

  • 1. Aerospike 3 Configuration: Basic Configuration, Part 2. Young Paik, Director of Sales Engineering, young@aerospike.com. Aerospike (aer·o·spike) [air-oh-spahyk], noun: 1. tip of a rocket that enhances speed and stability
  • 3. Curriculum This training module is an overview of the Aerospike Database and covers the basic configuration of a single Aerospike Database Cluster. This course has been split into 3 areas: 1. Overview* 2. Database Service Configuration* 3. Database Storage Configuration * These parts were covered in the first part of the webinar series. © 2014 Aerospike. All rights reserved. Confidential Pg. 3
  • 4. Agenda: Data Hierarchy; How Aerospike reads/writes/updates data; Defragmentation; Data Hygiene; Configuring Storage
  • 5. Data Hierarchy (diagram): Cluster (Node 1, Node 2, Node 3), Namespace, Set, Record, Bin
  • 6. Database Hierarchy (Term: Definition. Notes.)
      – Cluster: An Aerospike cluster services a single database service. Notes: While a company may deploy multiple clusters, applications will only connect to a single cluster.
      – Node: A single instance of an Aerospike database. Notes: For production deployments, a host should only have a single node. For development, you may place more than one node on a host.
      – Namespace: An area of storage related to the media. Can be either RAM or SSD based. Notes: Similar to a “database” or “tablespaces” in relational databases.
      – Set: An unstructured grouping of data that have some commonality. Notes: Similar to “tables” in a relational database, but do not require a schema.
      – Record: A key and all data related to that key. Notes: Similar to a “row” in a relational database.
      – Bin: One part of data related to a key. Notes: Bins in Aerospike are typed, but the same bin in different records can have different types. Bins are not required. Single bin optimizations are allowed.
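
The hierarchy in the table above can be sketched as a small in-memory model. This is only an illustration, not Aerospike's actual data structures: the class names `Record` and `Namespace` and the methods `put`, `get`, and `scan` are hypothetical. It shows that sets need no schema, that sets are optional, and that the same bin name may hold different types in different records.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class Record:
    """A key plus all data (bins) related to that key."""
    key: str
    bins: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Namespace:
    """Hypothetical model: records are addressed by (set name, key)."""
    name: str
    records: Dict[Tuple[Optional[str], str], Record] = field(default_factory=dict)

    def put(self, set_name: Optional[str], key: str, bins: Dict[str, Any]) -> Record:
        # No schema check: any bin name and any value type is accepted.
        rec = self.records.setdefault((set_name, key), Record(key))
        rec.bins.update(bins)
        return rec

    def get(self, set_name: Optional[str], key: str) -> Optional[Record]:
        # The set is part of the address, so the caller must know it.
        return self.records.get((set_name, key))

    def scan(self, set_name: Optional[str]) -> List[Record]:
        # A scan walks every record that belongs to one set.
        return [r for (s, _), r in self.records.items() if s == set_name]


ns = Namespace("test")
ns.put("users", "u1", {"age": 30})          # integer bin
ns.put("users", "u2", {"age": "thirty"})    # same bin name, string type
ns.put(None, "counter", {"n": 1})           # record outside any set
```

Looking up `ns.get(None, "u1")` returns nothing here, because the record was stored under the set `users`.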
  • 7. Cluster • Will be distributed on different nodes. • Management of cluster is automated, so no manual rebalancing or reconfiguration is necessary. • Will contain one or more namespaces. Adding/removing namespaces requires a cluster-wide restart.
  • 8. Nodes • Each node is assumed to be identical. • Data (and their associated traffic) will be evenly balanced across the nodes. • Big differences between nodes imply a problem. • Node capacity should take into account node failure patterns.
  • 9. Namespaces • Are associated with the storage media: – Hybrid (RAM for index and SSD for data) – RAM + disk for persistence only – RAM only • Each can be configured with their own: – replication factor (change requires a cluster-wide restart) – RAM and disk configuration – settings for high-watermark – default TTL (if you have data that must never be automatically deleted, you must set this to “0”)
  • 10. Sets • Similar to “tables” in relational databases. • Sets are optional. • Schema does not have to be pre-defined. • In order to request a record, you must know its set. • Scans can be done across a set.
  • 11. Records • Similar to a row in a relational database. • All data for a record will be stored on the same node. • Any change to a record will result in a complete write of the entire record. • Sometimes referred to as “objects”.
  • 12. Bins • Values are typed. Current types are: – Simple (integer, string, blob [language specific]) – Complex (list, map) • A single bin may be updated by the client. – Increment – Replacement – User Defined Function (UDF)
  • 13. Data Hierarchy (diagram, repeated): Cluster (Node 1, Node 2, Node 3), Namespace, Set, Record, Bin
  • 14. Data Access Patterns: How does Aerospike handle the following operations? – Read – Write* – Update* (* The following slides do not illustrate how replicas are written.)
  • 15. Accessing An Object In Aerospike: Reading A Standard Data Type With SSDs. (Diagram: Client; Master Node with DRAM (Index) holding an index reference into SSD (DATA); block size 128 KB by default.) 1) Client finds Master Node from partition map. 2) Client makes read request to Master Node. 3) Master Node finds data location from index in DRAM. 4) Master Node reads entire object from SSD. This is true even if only a single bin is read. 5) Master Node returns value.
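
The five read steps above can be sketched as follows. This is a toy model under stated assumptions, not Aerospike's implementation: the partition count of 4096, the SHA-1 stand-in hash, the round-robin partition map, and the names `Node`, `partition_id`, `build_partition_map`, `client_write`, and `client_read` are all chosen for illustration. The "SSD" is a byte array and the "DRAM index" is a dict.

```python
import hashlib
import json

N_PARTITIONS = 4096  # assumption for illustration only


def partition_id(key: str) -> int:
    # Stand-in hash; the slides do not say which hash is used.
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:2], "little") % N_PARTITIONS


class Node:
    def __init__(self, name: str):
        self.name = name
        self.index = {}         # "DRAM": key -> (offset, length)
        self.ssd = bytearray()  # "SSD": raw record bytes

    def store(self, key: str, bins: dict) -> None:
        # Helper to populate the node; the real write path is more involved.
        blob = json.dumps(bins).encode("utf-8")
        self.index[key] = (len(self.ssd), len(blob))
        self.ssd += blob

    def read(self, key: str, bin_name=None):
        offset, length = self.index[key]                     # step 3: index lookup in DRAM
        record = json.loads(bytes(self.ssd[offset:offset + length]))  # step 4: whole record
        return record if bin_name is None else record[bin_name]       # step 5: return value


def build_partition_map(nodes):
    # Hypothetical assignment: partitions dealt round-robin to nodes.
    return [nodes[p % len(nodes)] for p in range(N_PARTITIONS)]


def client_write(partition_map, key, bins):
    partition_map[partition_id(key)].store(key, bins)


def client_read(partition_map, key, bin_name=None):
    master = partition_map[partition_id(key)]  # step 1: find master from partition map
    return master.read(key, bin_name)          # step 2: send request to master


nodes = [Node("node1"), Node("node2"), Node("node3")]
pmap = build_partition_map(nodes)
client_write(pmap, "user:1", {"name": "Ann", "visits": 3})
```

Note that `read` deserializes the whole record even when only `bin_name` is requested, which mirrors step 4.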
  • 16. Accessing An Object In Aerospike Writing A New Standard Data Type Record With SSDs Client Master Node DRAM (Index) SSD (DATA) 1) Client finds Master Node from partition map. 2) Client makes write request to Master Node. 3) Master Node make an entry indo index (in DRAM) and queues write in temporary write buffer. 4) Master Node coordinates write with replica nodes (not shown). 5) Master Node returns success to client. 6) Master Node asynchronously writes data in blocks. 7) Index in DRAM points to location on SSD. Asynchronous write Block size (128 KB by default) © 2014 Aerospike. All rights reserved. Confidential Pg. 16
  • 17. Accessing An Object In Aerospike Updating A Standard Data Type Record With SSDs Client Master Node DRAM (Index) New SSD (DATA) 1) Client finds Master Node from partition map. 2) Client makes update request to Master Node. 3) Master Node reads the existing record (if using multiple bins) 4) Master Node queues write of updated record in a temporary write buffer 5) Master Node coordinates write with replica nodes (not shown). 6) Master Node returns success to client. 7) Master Node asynchronously writes data in blocks. 8) Index in DRAM points to new location on SSD. Old New Asynchronous write Block size (128 KB by default) © 2014 Aerospike. All rights reserved. Confidential Pg. 17
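The read, write, and update steps on the last three slides can be sketched as a toy model. This is only an illustration under simplifying assumptions: the class `ToyNode` and all of its methods are hypothetical names (not Aerospike APIs), replicas are ignored, and the index is updated at flush time rather than when the write is queued.

```python
class ToyNode:
    """Toy model of a master node: index in "DRAM", blocks on "SSD"."""

    def __init__(self):
        self.index = {}    # key -> (block number, slot within block)
        self.blocks = []   # flushed blocks; each block is a list of records
        self.buffer = []   # temporary write buffer of (key, record) pairs

    def write(self, key, bins):
        # Queue the whole record; success is returned before the flush.
        self.buffer.append((key, dict(bins)))
        return "ok"

    def read(self, key):
        # Serve from the write buffer if the record has not been flushed yet.
        for buffered_key, record in reversed(self.buffer):
            if buffered_key == key:
                return record
        block_no, slot = self.index[key]
        return self.blocks[block_no][slot]   # the entire record is read

    def update(self, key, bin_name, value):
        # Read the existing record, change one bin, rewrite the whole record.
        record = dict(self.read(key))
        record[bin_name] = value
        return self.write(key, record)

    def flush(self):
        # Asynchronous step: write buffered records as one new block and
        # point the index at the new locations. Old copies stay in place.
        if not self.buffer:
            return
        block_no = len(self.blocks)
        self.blocks.append([record for _, record in self.buffer])
        for slot, (key, _) in enumerate(self.buffer):
            self.index[key] = (block_no, slot)
        self.buffer = []


node = ToyNode()
node.write("user1", {"name": "a", "visits": 1})
node.flush()
node.update("user1", "visits", 2)
node.flush()
```

After the second flush the index points at the new block, while the old copy of the record is still sitting in the first block. That leftover copy is the unused space the defragmentation slides go on to describe.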
  • 18. Accessing An Object In Aerospike: How Space Is Freed Up Aerospike writes the data in large data blocks. [Diagram: SSD (data) divided into blocks 1 to 8. Block size: 128 KB by default.]
  • 19. Accessing An Object In Aerospike: How Space Is Freed Up As new data is added to the disk, new blocks will be continually written to the SSD. [Diagram: SSD (data) divided into blocks 1 to 8. Block size: 128 KB by default.]
  • 20. Accessing An Object In Aerospike: How Space Is Freed Up Over time, some records will be deleted or updated, resulting in fragmented usage on the flash/SSD disk. This unused space must be freed up. [Diagram: SSD (data) divided into blocks 1 to 8, some partly empty.]
  • 21. Accessing An Object In Aerospike: How Space Is Freed Up Some databases use a nightly process called “compaction,” which is an intensive process. Aerospike instead runs a regular process (every few minutes) that looks for blocks below some level of use (called the high watermark). In this example, if the high watermark is 50%, blocks 1 and 3 are below 50% occupied. The defragmenter will take the data in these blocks and merge them into another block. [Diagram: SSD (data) divided into blocks 1 to 8.]
  • 22. Accessing An Object In Aerospike: How Space Is Freed Up The defragmenter will write the new block (block 7) and clear blocks 1 and 3 for new writes. Because this runs constantly, there is no special time when the performance of the database is bad. This algorithm operates best when the SSD is less than 50% occupied. As disk use grows above this, the performance of the defragmenter will decrease. [Diagram: SSD (data) divided into blocks 1 to 8.]
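The defragmentation example above can be sketched in a few lines. This is a minimal illustration, not the server's algorithm: the function names and the fill percentages are made up, `None` marks a free block, and the sketch only handles the case where all the live data fits into a single free block.

```python
def defrag_candidates(fill_pct, threshold_pct=50):
    """Return 1-based numbers of in-use blocks filled below the threshold."""
    return [
        number
        for number, pct in enumerate(fill_pct, start=1)
        if pct is not None and pct < threshold_pct
    ]


def defragment(fill_pct, threshold_pct=50):
    """Merge the live data of all candidate blocks into the first free block."""
    blocks = list(fill_pct)
    candidates = defrag_candidates(blocks, threshold_pct)
    live = sum(blocks[number - 1] for number in candidates)
    # Simplification: do nothing unless one free block can hold all live data.
    if not candidates or None not in blocks or live > 100:
        return blocks
    target = blocks.index(None)
    for number in candidates:
        blocks[number - 1] = None      # block is now free for new writes
    blocks[target] = live
    return blocks


# Hypothetical fill levels for blocks 1..8; blocks 7 and 8 are free.
fill = [30, 80, 20, 90, 75, 60, None, None]
candidates = defrag_candidates(fill)
after = defragment(fill)
```

With these numbers, blocks 1 and 3 are the candidates, their data lands in block 7, and blocks 1 and 3 become free, which mirrors the slide's example.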
  • 23. Data Hygiene Every database has a maximum capacity. Aerospike provides you with a few tools to manage the behavior of the database: • TTL (time-to-live) and Expiration • High watermark and Eviction • Stop-writes
  • 24. TTL and Expiration Aerospike has a mechanism for automatically expiring data if it has not been changed in a while. In the configuration of the namespace you can set a TTL (time-to-live). When the data has reached its TTL limit, it will be automatically deleted from the database.
  • 25. TTL and Expiration This works using the following logic: • When a record is first written, the server will take the current time and add the TTL. The server will use the default TTL as configured on the server or, if supplied, an overriding TTL from the client. • If the record is updated the TTL will be extended as if it were a new write. You can override this from the client. • A “touch” from the client has the same impact, but does not change the data. • Reads do not affect the expiration time. • When the expiration time has been reached a background job will delete the data.
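The expiration logic above can be sketched as follows. This is an illustrative model only: `Record`, `write`, `touch`, `read`, and `is_expired` are hypothetical names, times are plain integer seconds, and the special TTL value 0 (never expire) is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Record:
    value: object
    expires_at: int   # absolute time in seconds


def expiration_time(now, default_ttl, client_ttl=None):
    """Current time plus the client's TTL if supplied, else the default TTL."""
    ttl = default_ttl if client_ttl is None else client_ttl
    return now + ttl


def write(now, value, default_ttl, client_ttl=None):
    # A first write and an update both reset the expiration time.
    return Record(value, expiration_time(now, default_ttl, client_ttl))


def touch(record, now, default_ttl, client_ttl=None):
    # Same effect on expiration as a write, but the data is unchanged.
    record.expires_at = expiration_time(now, default_ttl, client_ttl)


def read(record):
    # Reads do not affect the expiration time.
    return record.value


def is_expired(record, now):
    # What the background job would check before deleting the record.
    return now >= record.expires_at


DEFAULT_TTL = 30 * 86400                      # 30 days in seconds
record = write(now=0, value="v1", default_ttl=DEFAULT_TTL)
first_expiry = record.expires_at
read(record)
touch(record, now=86400, default_ttl=DEFAULT_TTL)
```

Here the touch one day after the write pushes the expiration out by one day, while the read in between leaves it alone.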
  • 26. TTL and Expiration Special Note: If you do not want data to expire, but want to keep all data, you can set the TTL value to 0. However, if you have set the TTL to any non-zero value, it is not possible to set any object to not expire. As a workaround, you can set a very long TTL from the client.
  • 27. High Watermarks and Eviction One of the most dangerous things that can happen to a database is to reach maximum capacity. Aerospike has a set of safety measures that allow you to minimize the chances of hitting that limit and to take corrective action before then.
  • 28. High Watermarks and Eviction A high watermark is a threshold for RAM or disk use. When the watermark has been breached, the server will begin to evict data. It will start by evicting data closest to its TTL. For example: A namespace has a 90 day TTL. The server automatically deletes data that has not been written in the past 90 days. The server hits the high watermark for SSD usage. It will begin to delete data that was last written 89 days ago (the data closest to expiring), then 88 days ago, etc.
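The eviction order described above (data closest to expiring goes first) can be sketched like this. It is a simplified model rather than the server's eviction algorithm: the function name, the tuple layout, and the sample sizes are all assumptions made for illustration.

```python
def evict_until_below(records, capacity, high_water_pct):
    """Evict soonest-to-expire records until usage is at or below the watermark.

    records: list of (key, size, expires_at) tuples.
    Returns the keys that were evicted, in eviction order.
    """
    limit = capacity * high_water_pct / 100
    used = sum(size for _, size, _ in records)
    # Records with the least time left before expiration come first.
    queue = sorted(records, key=lambda record: record[2])
    evicted = []
    while queue and used > limit:
        key, size, _ = queue.pop(0)
        used -= size
        evicted.append(key)
    return evicted


# Hypothetical records: (key, size, expiration time).
sample = [("a", 40, 300), ("b", 30, 100), ("c", 20, 200)]
evicted = evict_until_below(sample, capacity=100, high_water_pct=50)
```

In this run usage starts at 90 of 100 units, so "b" and then "c" are evicted, which brings usage down to 40 and below the 50% watermark.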
  • 29. Stop Writes Suppose data is being written/changed in the database at such a rate that the evictions cannot keep up. It is possible for the database to use more and more space until it hits 100% usage. In order to prevent data loss, Aerospike has a mark called “stop-writes.” When the database usage hits this level, the database will stop writing new data. The server will respond with an error to new write requests.
  • 31. Configuration File There are 7 major stanzas in an Aerospike configuration file:
      • service (required, covered in Part 1)
      • logging (required, covered in Part 1)
      • network (required, covered in Part 1)
      • mod-lua (required for 3.x, covered in Part 1)
      • cluster (optional, not covered)
      • namespace (at least 1 required)
      • xdr (optional, not covered)
      These will look like this:
      namespace <NAMESPACE NAME> {
        ...
      }
  • 32. Special Note Parameters that are most commonly problematic are denoted in RED. Pay special attention to these, since the ramifications of improperly setting these variables may take months to show up or be difficult to fix once set.
  • 33. Namespace Configuration Parameters There are 2 different kinds of namespaces: • RAM (with or without HDD for persistence) • Flash/SSD Each namespace type has specific parameters, but some are common to all of them.
  • 34. Namespace Parameters: Common Parameters Some of the parameters for namespaces are common for all types. Parameters covered: • Replication factor • Watermarks • DRAM allocation • Time-to-live (TTL)
  • 35. Replication Factor
      Description: The total number of copies (master + replica) of data in the database cluster.
      Stanza location: namespace
      Config parameters (defaults): replication-factor
      Notes: Some databases refer to the replication factor as being only the number of copies, without the master. Aerospike refers to it as the total number of copies.
      Change dynamically: No
      Best practices: For almost all use cases, the best replication factor is “2”. “1” would not give any replication and means that the loss of a node means a loss of data. “3” or more would mean you would need additional storage to accommodate the additional copies.
  • 36. Storage Watermarks
      Description: Aerospike has built in a set of watermarks that trigger actions meant to act as safety valves.
      Stanza location: namespace
      Config parameters (defaults): high-water-memory-pct, high-water-disk-pct, stop-writes-pct
      Notes: These features do not exist in most other databases, so some care should be taken to understand how they work. If either high-water-memory-pct or high-water-disk-pct is exceeded, the server will begin to evict data closest to its time-to-live (TTL). If either RAM or disk use exceeds stop-writes-pct, the server will no longer accept write requests for new records (note that it will accept updates to existing records).
      Change dynamically: Yes
      Best practices:
        • Recommended settings are: high-water-memory-pct 60, high-water-disk-pct 50, stop-writes-pct 90.
        • You should always account for changes in node count within the cluster. So consider what happens if you were to lose a node.
        • These settings are dynamically changeable. If you do want to increase these values temporarily, make sure to take the appropriate corrective action.
        • In order to properly defragment Flash/SSD namespaces, keep high-water-disk-pct at 50.
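The watermark rules on this slide can be expressed as a small check. This sketch follows the slide's wording, including the statement that either RAM or disk use can trigger stop-writes; the function name and return shape are hypothetical, and the defaults are the slide's recommended settings.

```python
def namespace_state(
    memory_used_pct,
    disk_used_pct,
    high_water_memory_pct=60,
    high_water_disk_pct=50,
    stop_writes_pct=90,
):
    """Report which safety valves would be active at the given usage levels."""
    evicting = (
        memory_used_pct > high_water_memory_pct
        or disk_used_pct > high_water_disk_pct
    )
    stop_writes = (
        memory_used_pct > stop_writes_pct
        or disk_used_pct > stop_writes_pct
    )
    return {"evicting": evicting, "stop_writes": stop_writes}


healthy = namespace_state(memory_used_pct=55, disk_used_pct=40)
disk_pressure = namespace_state(memory_used_pct=55, disk_used_pct=52)
memory_full = namespace_state(memory_used_pct=95, disk_used_pct=40)
```

A check like this is also a convenient way to ask "what if we lose a node": recompute the usage percentages for the smaller cluster and see which valves would open.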
  • 37. RAM Allocation
      Description: All namespaces (whether RAM or Flash/SSD) require RAM for at least the index. This controls how much RAM is allocated.
      Stanza location: namespace
      Config parameters (defaults): memory-size (4G)
      Notes: The minimum size for this is 1 GB. Aerospike uses exactly 64 bytes of memory for each record. This will be multiplied by the replication factor. If you are using a RAM namespace, all data will also be stored in RAM. Aerospike does not cache data.
      Change dynamically: Yes
      Best practices: Since each namespace uses at least 1 GB, do not configure unused namespaces. Always be aware that the amount of RAM in use will increase if the cluster loses a node. The amount of memory set here should also take into account the high watermarks for memory above.
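The index sizing rule above (64 bytes per record, multiplied by the replication factor) lends itself to a quick estimate. The function names and the record and node counts below are hypothetical, and the sketch assumes the index is spread evenly across nodes.

```python
INDEX_BYTES_PER_RECORD = 64   # per-record index cost stated on the slide


def index_memory_bytes(records, replication_factor=2):
    """Cluster-wide RAM needed for the index alone."""
    return records * INDEX_BYTES_PER_RECORD * replication_factor


def index_memory_per_node(records, replication_factor, nodes):
    """Index RAM per node, assuming an even spread across nodes."""
    return index_memory_bytes(records, replication_factor) / nodes


# Hypothetical cluster: 1 billion records, replication factor 2.
total = index_memory_bytes(1_000_000_000, replication_factor=2)
per_node_with_4 = index_memory_per_node(1_000_000_000, 2, nodes=4)
per_node_with_3 = index_memory_per_node(1_000_000_000, 2, nodes=3)
```

In this example the index needs 128 billion bytes across the cluster, which is 32 billion bytes per node with four nodes and roughly 42.7 billion bytes per node if one node is lost. That increase is why memory-size and the memory high watermark should leave headroom.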
  • 38. Namespace Configuration: Common Parameters The general configuration parameters for every namespace are:
      namespace test_namespace {
        replication-factor 2
        high-water-memory-pct 60
        high-water-disk-pct 50
        stop-writes-pct 90
        memory-size 4G
        default-ttl 2592000  # Default 30 days expiration (1 day = 86400 seconds)
        ...
      }
  • 39. Data Storage In RAM (No Persistence) RAM namespaces with no persistence store all data and indexes in RAM. The configuration parameters are: • Replication factor* • Watermarks* • RAM allocation* • Time-to-live (TTL)* • Storage Engine *See the Data Storage Common Parameters above.
  • 40. Storage Engine for RAM (No Persistence)
      Description: Sets how data will be stored.
      Stanza location: namespace
      Config parameters (defaults): storage-engine memory
      Notes: Aerospike does not cache. So all data (index + data) will be stored in RAM.
      Change dynamically: No
      Best practices: Be sure to take into consideration what will happen if you lose one or more nodes. See the Storage Watermarks section above.
  • 41. Namespace Configuration: Common Parameters The configuration parameters for RAM (no persistence) are:
      namespace test_namespace {
        replication-factor 2
        high-water-memory-pct 60
        high-water-disk-pct 50
        stop-writes-pct 90
        memory-size 4G
        default-ttl 2592000  # Default 30 days expiration (1 day = 86400 seconds)
        storage-engine memory
      }
  • 42. Data Storage In RAM + HDD RAM namespaces with persistence (RAM + HDD) store all data and indexes in RAM, but also store the data on disk for persistence. Note that although we say “HDD,” you may also use SSDs. The configuration parameters are: • All the values from Data Storage In RAM above • Storage Engine • Persistence File • Whether or not to load at startup • Whether or not to store data in memory
  • 43. Storage Engine for RAM + HDD (With Persistence)
      Description: Sets how data will be stored.
      Stanza location: namespace
      Config parameters (defaults): storage-engine device
      Notes: Aerospike does not cache. So all data (index + data) will be stored in RAM. The device is used only for persistence and is only read when starting up the node. The server will use it to rebuild the index in memory.
      Change dynamically: No
      Best practices: Be sure to take into consideration what will happen if you lose one or more nodes. See the Storage Watermarks section above.
  • 44. RAM + HDD Persistence File
      Description: These are the settings for the persistence file.
      Stanza location: namespace:storage-engine
      Config parameters (defaults): file or device, filesize
      Notes: When using a file, you specify the path of the file. When using a device, you specify the device (such as “/dev/sdb1”) of the disk or partition. You must also specify the filesize. When using a device, the server assumes it will use the entire disk, so please make sure there is no data on this partition.
      Change dynamically: No
      Best practices:
        • Aerospike recommends setting the size of the persistence file at 8x the amount of RAM allocated in the “memory-size” of the namespace.
        • You may use more than one file or device; the server will balance between them.
  • 45. Load At Startup Option
      Description: Determines whether or not the server should read from persistence to load data.
      Stanza location: namespace:storage-engine
      Config parameters (defaults): load-at-startup (false)
      Notes: When set to true, the server will read from the persistence store and build up the primary index in RAM. When set to false, the server will disregard any data in the persistence store and get data from the other nodes.
      Change dynamically: No
      Best practices:
        • Aerospike recommends setting this to be true.
        • If the node has been offline for long enough (many hours or longer), the server will have to deal with data conflicts. It will take less time to get the data from the other nodes. In this case, you should set the value to “false”. However, after the server has started, change the value in the config file to “true” again to deal with short-term outages.
  • 46. Store Data In Memory
      Description: Tells the server to store data in memory, rather than on storage.
      Stanza location: namespace:storage-engine
      Config parameters (defaults): data-in-memory
      Notes: When set to “true”, the server will store data in RAM and use disk only for persistence. If set to “false”, the server will only store the index in RAM and use the disk for data storage.
      Change dynamically: No
      Best practices:
        • In principle it is possible to use rotational disk to store the data and use RAM only for the index. Aerospike has found that performance with this configuration is very poor. Aerospike does not recommend or support this configuration.
  • 47. Namespace Configuration: Common Parameters The configuration parameters for RAM + HDD are:
      namespace RAM_persist_namespace {
        replication-factor 2
        high-water-memory-pct 60
        high-water-disk-pct 50
        stop-writes-pct 90
        memory-size 4G
        default-ttl 2592000  # Default 30 days expiration (1 day = 86400 seconds)
        storage-engine device {
          file /opt/aerospike/data/test.data
          filesize 32G
          load-at-startup true
          data-in-memory true
        }
      }
  • 48. Data Storage In Flash/SSD Flash/SSD namespaces store indexes in RAM and all data on Flash/SSD. The configuration parameters are: • Common Data Storage Parameters* • Storage Engine • Flash/SSD Devices • Whether or not to load at startup • Whether or not to store data in memory *See the Data Storage Common Parameters above.
  • 49. Storage Engine for Flash/SSD
      Description: Sets how data will be stored.
      Stanza location: namespace
      Config parameters (defaults): storage-engine device
      Notes: The server will use RAM for index only and Flash/SSD for data.
      Change dynamically: No
      Best practices:
        • Be sure to take into consideration what will happen if you lose one or more nodes. See the Storage Watermarks section above.
        • Flash/SSDs should be no more than 50% full to allow for efficient defragmentation. Usage above this level is fine for short periods of time.
  • 50. Flash/SSD Devices
      Description: These are the settings for the Flash/SSD devices.
      Stanza location: namespace:storage-engine
      Config parameters (defaults): device
      Notes: You must specify the device (such as /dev/sdb) of the SSDs.
      Change dynamically: No
      Best practices:
        • You may use more than one file or device; the server will balance between them. Do not use heterogeneously sized devices.
        • You may use SATA or PCIe devices.
  • 51. Load At Startup Option
      Description: Determines whether or not the server should read from persistence to load data.
      Stanza location: namespace:storage-engine
      Config parameters (defaults): load-at-startup (false)
      Notes: When set to true, the server will read from the persistence store and build up the primary index in RAM. When set to false, the server will disregard any data in the persistence store and get data from the other nodes.
      Change dynamically: No
      Best practices:
        • Aerospike recommends setting this to be true.
        • If the node has been offline for long enough (many hours or longer), the server will have to deal with data conflicts. It will take less time to get the data from the other nodes. In this case, you should set the value to “false”. However, after the server has started, change the value in the config file to “true” again to deal with short-term outages.
  • 52. Store Data In Memory
      Description: Tells the server to store data in memory, rather than on storage.
      Stanza location: namespace:storage-engine
      Config parameters (defaults): data-in-memory
      Notes: When set to “true”, the server will store data in RAM and use disk only for persistence. If set to “false”, the server will only store the index in RAM and use the disk for data storage.
      Change dynamically: No
      Best practices:
        • In principle it is possible to use Flash/SSD as persistence only. However, most modern rotational hard drives are fast enough so that using Flash/SSD is not necessary.
        • The drives must be cleared or “dd”ed prior to use in the database.
        • Be sure to test the drives in the servers using the Aerospike ACT test tool (http://aerospike.github.io/act/).
  • 53. Write Block Size
      Description: Tells the server what block size to use when writing data.
      Stanza location: namespace:storage-engine
      Config parameters (defaults): write-block-size
      Notes: The value for this must be a multiple of 128 KB. Aerospike has tested these up to 1 MB. No single record can be larger than this block size, so size this according to the maximum record size in the database.
      Change dynamically: No
      Best practices: Size these according to the largest size of any object in your database.
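The sizing rule above can be turned into a small helper. This is a sketch based only on the constraints stated on the slide (multiples of 128 KB, tested up to 1 MB, no record larger than a block); the function name is hypothetical, and any per-record storage overhead is ignored.

```python
KIB = 1024
BLOCK_STEP = 128 * KIB       # write-block-size must be a multiple of 128 KB
MAX_TESTED = 1024 * KIB      # largest size the slide says has been tested


def pick_write_block_size(largest_record_bytes):
    """Smallest multiple of 128 KB that can hold the largest record."""
    steps = -(-largest_record_bytes // BLOCK_STEP)   # ceiling division
    size = max(1, steps) * BLOCK_STEP
    if size > MAX_TESTED:
        raise ValueError("record is larger than the largest tested block size")
    return size


small = pick_write_block_size(100_000)           # fits in one 128 KB block
medium = pick_write_block_size(128 * KIB + 1)    # needs the next step up
```

Since the slides mark this parameter as not dynamically changeable, it is worth running a check like this against the largest record you expect before the namespace goes into production.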
  • 54. Namespace Configuration: Common Parameters The configuration parameters for Flash/SSD are:
      namespace ssd_namespace {
        replication-factor 2
        high-water-memory-pct 60
        high-water-disk-pct 50
        stop-writes-pct 90
        memory-size 4G
        default-ttl 2592000  # Default 30 days expiration (1 day = 86400 seconds)
        storage-engine device {
          device /dev/sdb
          device /dev/sdc
          load-at-startup true
          data-in-memory false
          write-block-size 128K
        }
      }
  • 55. Agenda • Data Hierarchy • How Aerospike reads/writes/updates data • Defragmentation • Data Hygiene • Configuring Storage
  • 56. Q&A
  • 57. Thank You Send any questions/comments/complaints to Young Paik young@aerospike.com

Editor's Notes

  1. Fastest; best uptime; predictable performance; consistency