Presented at JAX London 2013
The need to operate terabyte-size databases has become very common these days. Unless you have already built architectures around NoSQL databases and frameworks for data-intensive distributed applications, the many technology options available are probably something of an enigma. This session focuses on real-world, successful attempts to benchmark four of the most popular NoSQL databases side by side. The base tool selected for this research is the Yahoo! Cloud Serving Benchmark (YCSB), and benchmarking is performed on Amazon Elastic Compute Cloud (EC2) instances.
Often referred to as NoSQL, non-relational databases feature elasticity and scalability. In addition, they can store big data and work with cloud computing systems. All of these factors make them extremely popular.
Why did NoSQL data stores appear? Mostly because relational databases (RDBMS) have a number of disadvantages when you have to work with large datasets. RDBMS are hard to scale, and their architecture is designed to work on a single machine.
- Scaling write operations is hard, expensive, or impossible.
- Vertical scaling (upgrading hardware) is limited or very expensive. Unfortunately, it is often the only way you can scale.
- Horizontal scaling (adding new nodes to the cluster) is either unavailable or can only be implemented partially. There are solutions from Oracle and Microsoft that make it possible to run computing instances on several servers, but the database itself remains in shared storage.
In addition to poor scalability, RDBMS have strict data models. The schema is created together with the database, and changing this structure takes a lot of time and effort; in most cases it is an extremely complex task. Apart from that, RDBMS have difficulties with semi-structured data.
NoSQL solutions address many of these problems.
POINT 1: In 2013, the number of NoSQL products reached 150+, and the figure is still growing. That variety makes it difficult to select the best tool for a particular case.
POINT 2: They come in many types: key-value, columnar, document-oriented, and graph.
POINT 3: One thing is common to all NoSQL databases: they don't use the relational data model, which means they do not use the SQL query language.
POINT 4: NoSQL data management systems are inherently schema-free (with no obsessive complexity and a flexible data model) and eventually consistent (complying with BASE rather than ACID).
POINT 5: They provide APIs to perform various operations. Some NoSQL data stores support query-language operations, for example Cassandra and HBase, but there is no standard. This is another difference between NoSQL databases and traditional RDBMS.
POINT 6: RDBMS usually offer strong data consistency. In contrast, NoSQL data stores operate with eventual consistency: when you add data to the system, it becomes consistent after some time.
POINT 7: NoSQL architectures are designed to run in clusters that consist of several nodes, which makes it possible to scale them horizontally by increasing the number of nodes. In addition, NoSQL data stores serve huge amounts of data and provide high throughput.
POINT 1: NoSQL databases differ from RDBMS in their data models. These systems can be divided into four groups:
A. Key-value stores
Key-value stores are similar to maps or dictionaries where data is addressed by a unique key.
B. Document stores
Document stores encapsulate key-value pairs in JSON or JSON-like documents. Within documents, keys have to be unique. In contrast to key-value stores, values are not opaque to the system and can be queried as well.
C. Column family stores
Column family stores are also known as column-oriented stores, extensible record stores, and wide columnar stores.
D. Graph databases
Key-value stores, document stores, and column family stores have a common feature: they store denormalized data in order to gain advantages in distribution. In contrast to relational databases and the key-oriented NoSQL databases introduced above, graph databases specialize in the efficient management of heavily linked data.
POINT 2: All NoSQL data stores have an API to work with data. Some databases support certain SQL operations; others support MapReduce aggregation.
POINT 3: Multiversion concurrency control (MVCC) relaxes strict consistency in favor of performance. In order to support transactions without reserving multiple datasets for exclusive access, many stores provide optimistic locking: before changed data is committed, each transaction checks whether another transaction has made any conflicting modifications to the same datasets.
POINT 4: NoSQL databases differ in the way they distribute data across multiple machines. Since the data models of key-value stores, document stores, and column family stores are key-oriented, the two common partition strategies are based on keys, too. The first strategy distributes datasets by the range of their keys. A routing server splits the whole keyset into blocks and allocates these blocks to different nodes; each node is then responsible for storage and request handling of its specific key ranges.
In order to find a certain key, clients have to contact the routing server to get the partition table. Higher availability and a much simpler cluster architecture can be achieved with the second distribution strategy, called consistent hashing. In contrast to range-based partitioning, keys are distributed using hash functions. Since every server is responsible for a certain hash region, the location of a given key within the cluster can be calculated very quickly. In addition to better read performance through load balancing, replication also brings better availability and durability, because failing nodes can be replaced by other servers. If all replicas of a master server were updated synchronously, the system would not be available until all slaves had committed a write operation; if messages got lost due to network problems, the system would not be available for a longer period of time. This approach is not suitable for platforms that rely on high availability, because even a few milliseconds of latency can have a big influence on user behavior.
POINT 5 (PERFORMANCE: TYPICAL WORKLOADS): Obviously, performance is a very important factor. The performance of data storage solutions can be evaluated using typical scenarios. These scenarios simulate the most common operations performed by applications that use the data store, also known as typical workloads. The tests we performed to compare the performance of several NoSQL data stores also used typical workloads.
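The consistent hashing scheme described above can be sketched in a few lines of Python. This is a toy illustration under simplified assumptions (the class and names are ours; production stores add virtual nodes and replication on top of this idea):

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Toy consistent-hash ring: each node owns a region of the hash space."""

    def __init__(self, nodes):
        # Hash every node onto the ring and keep the positions sorted.
        self._ring = sorted((self._hash(n), n) for n in nodes)

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def node_for(self, key):
        # A key belongs to the first node at or after its hash position,
        # wrapping around the ring -- no routing server lookup needed.
        h = self._hash(key)
        positions = [p for p, _ in self._ring]
        i = bisect.bisect_left(positions, h) % len(self._ring)
        return self._ring[i][1]

ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
owner = ring.node_for("user234123")  # any client can compute this locally
```

Because the mapping is a pure function of the key and the node list, every client can locate data without consulting a central partition table, which is exactly the availability advantage described above.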
Database vendors usually measure the productivity of their products with custom hardware and software settings designed to demonstrate the advantages of their solutions. In our tests, we tried to see how NoSQL data stores perform under the same conditions.
POINT 1: For benchmarking, we used the Yahoo! Cloud Serving Benchmark (YCSB). The kernel of YCSB is a framework with a workload generator that creates the test workload, plus a set of workload scenarios.
POINT 2: Developers describe the workload scenario by operation type: what operations are performed on what types of records.
POINT 3: Supported operations include insert, update (change one of the fields), read (one random field or all the fields of one record), and scan (read records in key order, starting from a randomly selected record). We define a workload by the data that will be loaded into the database during the loading phase and the operations that will be executed against the data set during the transaction phase.
Each workload was targeted at a table of 100,000,000 records; each record was 1,000 bytes in size and contained 10 fields. Each record was identified by a primary key, which was a string such as “user234123”. Each field was named field0, field1, and so on. The values in each field were random strings of ASCII characters, 100 bytes each. Database performance was defined by the speed at which a database computed basic operations. A basic operation is an action performed by the workload executor, which drives multiple client threads. Each thread executes a sequential series of operations by making calls to the database interface layer, both to load the database (the load phase) and to execute the workload (the transaction phase). The threads throttle the rate at which they generate requests, so that we can directly control the offered load against the database. In addition, the threads measure the latency and achieved throughput of their operations and report these measurements to the statistics module.
The tests: We defined the following values for the workload executor:
- the number of threads
- the types of operations in the workload and the desired number of operations per second (target throughput)
Then we measured the time it took to perform these transactions (latency).
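The record layout above (10 fields of 100 random ASCII bytes, keys like “user234123”) can be reproduced with a short sketch. This is our own illustration of the shape of the data; YCSB's actual generator is implemented in Java:

```python
import random
import string

def make_record(key_number, fields=10, field_len=100):
    """Build a YCSB-style record: key "user<N>" plus field0..field9."""
    key = "user%d" % key_number          # e.g. "user234123"
    alphabet = string.ascii_letters + string.digits  # printable ASCII
    record = {
        "field%d" % i: "".join(random.choices(alphabet, k=field_len))
        for i in range(fields)
    }
    return key, record                   # 10 fields x 100 bytes = 1,000-byte payload

key, record = make_record(234123)
```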
This is a component diagram of the YCSB framework; it consists of several modules. The workload executor applies the workload to the data store. For each session, when the client accesses the database, a client thread is initiated. Each thread performs a set of operations from the workload. The results, in the form of statistics, are then sent to the statistics module, which prints the output of the test to the console where the benchmark was started. These tests are subsequently repeated for all the selected solutions. The YCSB framework has connectors for a wide range of databases. For each database tested with YCSB, a developer needs to specify the type of database, the target throughput, the number of concurrent threads on the client side, and how many operations to perform. This is necessary to create and start a test.
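The throttling-and-measurement loop that each client thread runs can be sketched as follows. This is a minimal illustration in the spirit of YCSB's workload executor, not its actual code; the function names and the `do_operation` stub are our own:

```python
import time

def run_throttled(do_operation, ops, target_ops_per_sec):
    """Issue `ops` operations at a fixed offered load, recording latency."""
    interval = 1.0 / target_ops_per_sec
    start = time.perf_counter()
    latencies = []
    for i in range(ops):
        t0 = time.perf_counter()
        do_operation()                       # e.g. a read or update call
        latencies.append(time.perf_counter() - t0)
        # Sleep until the next scheduled slot to hold the offered load steady.
        delay = start + (i + 1) * interval - time.perf_counter()
        if delay > 0:
            time.sleep(delay)
    elapsed = time.perf_counter() - start
    return ops / elapsed, latencies          # achieved throughput, latencies

throughput, latencies = run_throttled(lambda: None, ops=50, target_ops_per_sec=500)
```

Controlling the offered load this way is what lets the benchmark report latency as a function of target throughput, as in the graphs discussed below.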
Now let's take a look at the NoSQL data stores that we tested.
Cassandra 2.0: This is a column family data store. We ran it on virtual machines with Java 1.7.0_40 installed. The transactions were performed with a non-default configuration. In particular, we used a random partitioner to distribute data across nodes. The key cache was 1 GB, the row cache was 6 GB, and the JVM heap size was 6 GB. Data was not replicated (there were no copies). This was intentional: we wanted to test the performance, not the failure tolerance, of the cluster.
MongoDB: This is a document-oriented database. Here, we didn't do much additional configuration or tuning. As I mentioned before, for MongoDB we added two VMs that served as routers, because according to the official documentation, the mongos router process should run on a separate machine. However, if you need to simplify the model, the router may run on the same machine as the YCSB client. In one of our earlier tests, we discovered that it uses a lot of CPU power, which is why it should be placed on a separate machine. Data sharding for MongoDB was based on the document key.
We used the following workloads:
Workload A: Workload A is an update-heavy scenario that simulates how a database works when recording the typical actions of an e-commerce user. Settings for the workload: 50/50 read/update ratio; Zipfian request distribution.
Workload B: Workload B is a read-mostly workload with a 95/5 read/update ratio. It models content tagging, where adding a tag is an update, but most operations involve reading tags.
Workload C: Workload C is a read-only workload that simulates a data caching layer, for example a user profile cache.
Workload D: Workload D has a 95/5 read/insert ratio. It simulates access to the latest data, such as user status updates, or working with the newest inbox messages first.
Workload E: Workload E is a scan-short-ranges workload with a 95/5 scan/insert ratio. It corresponds to threaded conversations that are clustered by thread ID; each scan retrieves the posts of a given thread.
Workload F: Workload F has read-modify-write and read operations in a 50/50 proportion. It simulates a user database, where user records are read and modified by the user, and user activity is also recorded.
Workload G: Workload G has a 10/90 read/insert ratio. It simulates a data migration process or highly intensive data creation.
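In YCSB, a workload like Workload A is described by a small parameter file. The fragment below is a sketch based on the property names of YCSB's CoreWorkload; check the `workloads/workloada` file bundled with your YCSB version for the exact defaults:

```properties
# Workload A: update-heavy, 50/50 read/update, Zipfian key popularity
workload=com.yahoo.ycsb.workloads.CoreWorkload
recordcount=100000000
fieldcount=10
fieldlength=100
readproportion=0.5
updateproportion=0.5
requestdistribution=zipfian
```

The other workloads differ only in these proportion and distribution settings, which is what makes side-by-side comparisons across databases straightforward.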
For this benchmark, we generated a number of workloads based on target throughputs and then measured the number of operations per second, or the actual throughput, of each database. We also measured their latencies, that is, how long it took to perform each operation. During the first stage of the test, we uploaded 100 million records of 1 KB each to each data store. YCSB measured average throughput in operations per second and latency in milliseconds. HBase demonstrated the lowest throughput, probably because we turned on the auto-flush mode. This mode ensures that each operation that creates a record is sent from the client to the server and then persisted to the database. HBase also supports an alternative mode that buffers writes on the client side; when the buffer is full, the client sends it to the server, and HBase saves the data to disk in batches. As we expected, Couchbase and Cassandra demonstrated excellent results. Cassandra simultaneously updates data in memory and writes it to the transaction journal on disk. Couchbase writes data to memory and then asynchronously persists it to disk; the result of the transaction returns after everything has been saved to memory. In this particular test, all data was loaded in a single iteration, but the insert, update, and read operations that I will describe later were performed in several iterations.
The last workload, G, mostly consists of insert operations; it simulates the process of data migration or inserting a lot of data. The results are similar to what we saw on the previous graph: HBase, Cassandra, and Couchbase demonstrate low latencies and high throughput. MongoDB’s performance starts to decline at about 4,000 ops per second, with an average latency up to five times greater than in the other databases.
This is the last diagram; it shows the results of read operations, which make up 10% of workload G. Here we can see that latencies vary across solutions, possibly because the data resides in network storage on the cloud. Couchbase and Cassandra show a maximum throughput of up to 6,000 operations per second.
1. What you choose depends on your needs. Before making the decision, answer the following questions:
- Determine what your data sets and your data model will be like; the data model will depend on the data sets and the typical operations your app will perform.
- Determine your requirements for transaction support; decide whether you need transactions.
- Choose whether you need replication, and decide on your requirements for data consistency.
- Determine your performance requirements (how fast your database should be).
- If your project is based on an existing solution, check whether it is possible to migrate your data.
2. Then, taking all these factors into account, evaluate different solutions and test their performance (that's what this presentation is all about). It is very useful to build a prototype and perform a proof of concept; based on this prototype, you can select the best solution for your system.
3. Prototyping makes it possible to see how the solution will work in a real-life project. If it doesn't work well enough, you need to review the architecture and the components, and build a new prototype.
4. There are no perfect solutions, and there are no bad NoSQL or RDBMS data stores. The database and its implementation depend on the particular use case. The tests we performed show that in different scenarios, different solutions produce very different results. Your final choice might be a compromise; the main determinant is what you want to achieve and which properties you need most. The final system may combine several different solutions, including relational and NoSQL databases.