OrientDB Distributed Architecture v2.0


Orient Technologies

OrientDB Distributed Architecture Overview

Technology

Short history
- In 2012, we had Master/Slave replication.
- While it scaled up well on reads, users complained of a single Master node bottleneck.
- It's quite easy to scale up reads; the hard part is to scale up both reads and writes.
Copyright (c) - Orient Technologies LTD 2

How Master/Slave works
Diagram: three clients (C), one Master Node, and two Slave Nodes. Writes go to the Master node, so the Master node is the bottleneck.

Master/Slave
PROS:
- Relatively easy to develop
CONS:
- The master is the bottleneck for writes
- No matter how many servers you have, the throughput is limited by the Master node
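As a rough illustration of the bottleneck described above, the following sketch models ideal throughput for a Master/Slave cluster. It is a back-of-the-envelope model, not a measurement: the function name, the per-node rates, and the assumption that reads scale perfectly across nodes are all hypothetical.

```python
def master_slave_throughput(nodes: int, reads_per_node: int, writes_per_node: int) -> dict:
    """Idealized operations per second for one master plus (nodes - 1) slaves."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return {
        # Assumption: every node, master included, can serve reads.
        "reads": nodes * reads_per_node,
        # Only the master accepts writes, so adding slaves does not help.
        "writes": writes_per_node,
    }
```

Calling the function with 3 nodes and then with 10 nodes changes only the `reads` figure; `writes` stays at whatever a single master can handle.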

What happened to OrientDB's M/S architecture?
This is the old MASTER/SLAVE replication

2012: new architectural goals
- Multi-Master: all the nodes must accept writes
- Sharding: split data in multiple partitions
- Better Fail-Over
- Simplified configuration with Auto-Discovery

Auto-Discovery
Diagram: one client (C) and a single Master Node, which reports "I'm the only one!"

Auto-Discovery
Diagram: one client (C) and two Master Nodes, now marked "Connected!"

Clients see the distributed configuration
The updated distributed configuration is broadcast to all the connected clients.
Diagram: one client (C) and two Master Nodes.

Auto-reconnect in case of failure
In case of failure, the clients auto-reconnect to the available nodes.
Diagram: two clients (C) and two Master Nodes.
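The auto-reconnect behaviour above can be sketched as a simple failover loop. This is a minimal illustration, not OrientDB's client code: `connect_with_failover`, the node list, and the `is_alive` health check are hypothetical names, and trying nodes in list order is an assumption.

```python
from typing import Callable, Sequence

def connect_with_failover(nodes: Sequence[str], is_alive: Callable[[str], bool]) -> str:
    """Return the first node from the known configuration that is reachable."""
    for node in nodes:
        if is_alive(node):
            return node
    # Every node in the distributed configuration is down.
    raise ConnectionError("no available nodes")
```

The node list would come from the distributed configuration that the servers push to connected clients, so a client already knows where to go when its current node fails.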

Auto-deploy of databases
Databases (DB) are automatically deployed to the new joining nodes.
Diagram: two Master Nodes, each with a DB, and four clients (C).

Classes rely on Cluster to store records
By default: 1 class -> 1 cluster.
Diagram: class Customer is stored in cluster customer.

Classes can be split into more clusters
Define multiple clusters and assign them to each node.
Diagram: class Customer is split into three clusters: customer_usa, customer_europe, customer_china.

Assign 1 cluster per Node
Diagram: class Customer spans three Master Nodes, each holding one cluster:
- First Master Node: customer_usa
- Second Master Node: customer_europe
- Third Master Node: customer_china
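The one-cluster-per-node layout above can be sketched as a small routing table. This is an illustrative model only, not OrientDB's API: the `CLUSTER_OWNER` map, the node names, and both functions are hypothetical, and choosing the cluster from a region value is an assumption suggested by the cluster names.

```python
# Hypothetical model: class Customer split into three clusters, one per node.
CLUSTER_OWNER = {
    "customer_usa": "node1",
    "customer_europe": "node2",
    "customer_china": "node3",
}

def cluster_for(class_name: str, region: str) -> str:
    """Build the cluster name for a record, e.g. Customer + usa -> customer_usa."""
    return f"{class_name.lower()}_{region.lower()}"

def route_write(class_name: str, region: str) -> str:
    """Return the node that owns the cluster a new record belongs to."""
    cluster = cluster_for(class_name, region)
    if cluster not in CLUSTER_OWNER:
        raise KeyError(f"no cluster named {cluster!r}")
    return CLUSTER_OWNER[cluster]
```

With this layout, writes for different regions land on different nodes, which is what spreads the write load across the cluster.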

What about sharding + replication?
We used a solution similar to RAID for hard drives.

RAID for databases
Replica factor = 2
Diagram: class Customer spans three Master Nodes, each holding two clusters:
- First Master Node: customer_usa, customer_china
- Second Master Node: customer_europe, customer_usa
- Third Master Node: customer_china, customer_europe

RAID for databases
Replica factor = 3
Each node owns all customers.
Diagram: three Master Nodes, each holding all three clusters:
- First Master Node: customer_usa, customer_china, customer_europe
- Second Master Node: customer_europe, customer_usa, customer_china
- Third Master Node: customer_china, customer_europe, customer_usa
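The two RAID-like layouts above follow a rotation pattern: each additional copy of a cluster sits one node further along. The sketch below reproduces that pattern under stated assumptions; `place_replicas` and the node names are hypothetical, and the rotation rule is inferred from the diagrams rather than taken from OrientDB's implementation.

```python
def place_replicas(clusters, nodes, replica_factor):
    """Assign each cluster to `replica_factor` nodes by rotating one node per copy."""
    if not 1 <= replica_factor <= len(nodes):
        raise ValueError("replica_factor must be between 1 and the number of nodes")
    layout = {node: [] for node in nodes}
    for copy in range(replica_factor):
        for index, cluster in enumerate(clusters):
            # Copy 0 goes to the cluster's own node, copy 1 to the next node, and so on.
            node = nodes[(index + copy) % len(nodes)]
            layout[node].append(cluster)
    return layout
```

For three clusters on three nodes, a replica factor of 2 leaves every cluster on two nodes, and a replica factor of 3 leaves every node with a full copy of the class.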

Replication: under the hood
Client sends an INSERT request.
Diagram: a client (C) sends an INSERT; each of the three Master Nodes has a Hazelcast (HZ) queue for requests.

Replication: under the hood
Response handling
WriteQuorum = 2: sends OK to the client (C).
Diagram: three Master Nodes, each with an HZ queue for requests and an HZ queue for responses.
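The response handling above can be sketched as a quorum check over replies in arrival order. This is a simplified model: `await_quorum` is a hypothetical name, and requiring the quorum nodes to agree on the same result is an assumption; the slide states only that WriteQuorum = 2 and that OK is then sent.

```python
from collections import Counter

def await_quorum(responses, write_quorum):
    """Consume (node, result) pairs in arrival order.

    Returns the result as soon as `write_quorum` nodes have reported it,
    or None if the responses run out before any result reaches the quorum.
    """
    counts = Counter()
    for _node, result in responses:
        counts[result] += 1
        if counts[result] >= write_quorum:
            return result
    return None
```

With three nodes and a write quorum of 2, the client can be answered after the second matching reply, without waiting for the slowest node.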

Replication: under the hood
Fix the unaligned node.
Diagram: three Master Nodes, each with an HZ queue for requests and an HZ queue for responses; a Fix is sent to the unaligned node.
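One way to picture the fix step above is to compare the nodes' responses and flag those that disagree with the majority. This is a hedged sketch, not the actual repair protocol: `find_unaligned` is a hypothetical name, and treating the majority response as the correct one is an assumption.

```python
from collections import Counter

def find_unaligned(responses):
    """Given {node: result}, return (majority_result, sorted list of nodes that differ)."""
    if not responses:
        raise ValueError("no responses to compare")
    # most_common(1) yields the result reported by the largest number of nodes.
    majority, _count = Counter(responses.values()).most_common(1)[0]
    unaligned = sorted(node for node, result in responses.items() if result != majority)
    return majority, unaligned
```

Each node returned in the second element would then be sent a fix request so that it ends up with the majority result.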

Linear and Elastic scalability
on both reads & writes!
Diagram: seven Master Nodes, each serving its own clients (C).

Hazelcast’s role
- Auto-Discovery (Multicast/TCP-IP/Amazon)
- Queues for requests and responses
- Store metadata in distributed Maps
- Distributed Locks

OrientDB’s Future Roadmap
- OrientDB 2.0 (Sept 2014) has even better performance: +300% improvement on all the distributed operations
- Pluggable conflict resolution strategy
- Auto-discovery also by Clients

