SlideShare uma empresa Scribd logo
1 de 25
Baixar para ler offline
Satoshi Konno
Cassandra: Now and the Future
@ Yahoo! JAPAN
10 October 2017
whoami
2
Satoshi Konno
http://www.cybergarage.org
Engineering Manager of NoSQL and NewSQL Teams @ Yahoo! Japan
Open Source Software Developer for
Virtual Reality, IoT and Cloud Computing
Doctor's Course Student @ JAIST
Défago Lab: The φ accrual failure detector
Agenda
3
• What is the NoSQL Team?
• Why did we choose Cassandra?
• What is NewSQL?
• NewSQL with Cassandra
Copyrig ht © 2017 Yahoo Japan Corporation. All Rig hts Reserved.
NoSQL Team
What is Yahoo! JAPAN?
5
Many Strong Services
Media
US
Search Video Answer Mail
JP
US
JP
Membership C2C Payment C2C EC B2C EC Local
Search Knowledge search MailNews
YAHUOKU!Premium Wallet Loco
What is the NoSQL Team?
6
300+
Systems
100+
Services
NoSQL Team
Cassandra @ Yahoo! JAPAN
7
2010 2012 2014 2016 2018
Service
Departments
NoSQL
Team
0.5 0.8 1.x
0.8 1.x 2.x 3.xNoSQL
Team
Cassandra @ Yahoo! JAPAN
8
50
Clusters
50TB
Usages
2000+
Nodes
500,000
Read/sec
500,000
Write/sec
2017
10
Nodes /
Cluster
200
Nodes /
Cluster
…
1
Shared
Cluster
50
Special
Clusters
50
Systems
50
Systems
3
DCs
Copyrig ht © 2017 Yahoo Japan Corporation. All Rig hts Reserved.
NoSQL Team with
Cassandra
Key Value Store
Team
Search Engine
Team
Before Cassandra
10
Services
Search Engine
Key Value Store
• Problems: Inappropriate usage by internal platforms
and new demands for more big data.
Don’t store
any big key
data!!
Don’t use it
like a key
value store!!
2012
We store data of
our services on
your platforms.
We want to store
more big data of
our new services
easily.
Key Value Store
Team
Search Engine
Team
NoSQL Team
11
• Launched NoSQL Team in 2012
We should build
new centralized
platform for more
big data!!
2012
NoSQL
Team
Service
Departments
Join
Join
Join
However, many open source NoSQL
databases have been released already,
so we have to evaluate these.
New
NoSQL Team
12
• NoSQL team selected Cassandra as our
first centralized NoSQL database.
Services
2012
• High Availability
• Performance
• Persistence
• Scalability
• …..
• Maintainability
• Appropriate
Open Source
License
• …..
Function Point Analysis
No1
NoSQL
Team
NoSQL 2012 2014 2016 2018
State of NoSQL Databases
13
0.x 1.x 2.x 3.x 4.x
2.x 3.x 4.x
0.x 1.x 2.x
2.x 3.x
1.x 2.x
1.x 2.x
0.9.7.x
Copyrig ht © 2017 Yahoo Japan Corporation. All Rig hts Reserved.
NewSQL
NewSQL Trends
• NewSQL
= NoSQL (Scalability) + RDBMS (SQL, ACID)
= Scalable RDBMS like NoSQL
15
NoSQL
Team
Services
Requests for NewSQL
16
NoSQL
Team
We have big on-
premises data centers
and, we can’t use the
NewSQL platforms in
our private cloud.
Public Cloud OSS
We want to make use
of our knowledge
experience with
Cassandra.
Private Cloud Knowledge
Experience
17
NewSQL with Cassandra
Function
Google Amazon
MariaDB Cockroachdb
Spanner Aurora
Logging
Query
Engine
Transaction
Schema
Store
Storage
NoSQL
Team
Could we use Cassandra
for storage layer of
NewSQL databases?
Copyrig ht © 2017 Yahoo Japan Corporation. All Rig hts Reserved.
Trial for
NewSQL with
Cassandra
(OSS SQL Engines with Cassandra)
Trial Concept
19
• OSS SQL Engine + Distributed Storage
= PostgreSQL + Cassandra or
SQLite + Cassandra
Function Traial
Logging
Query Engine
Transaction
Schema
Store
Storage
NoSQL
Team
Could we replace storage
layer of SQL databases
with Cassandra?
Study Implementation
20
NoSQL
Team
PostgreSQL’s storage is
abstracted as the storage
manager, but …..
Storage Manager
SQLite’s storage is
abstracted as the virtual file
system too, but …..
Virtual File System
NoSQL
Team
To implement the abstract
functions directly is hard to
debug ….
POSIX Emulation
21
#define open(path, flags, mode) posix_vfs_cassandra_open(path, flags, mode)
#define close(fd) posix_vfs_cassandra_close(fd)
#define read(fd, buf, nbytes) posix_vfs_cassandra_read(fd, buf, nbytes)
#define write(fd, buf, nbytes) posix_vfs_cassandra_write(fd, buf, nbytes)
#define access(path, mode) posix_vfs_cassandra_access(path, mode)
#define unlink(path) posix_vfs_cassandra_unlink(path)
#define fstat(fd, buf) posix_vfs_cassandra_fstat(fd, buf)
#define fsync(fd) posix_vfs_cassandra_fsync(fd)
#define lseek(fd, offset, whence) posix_vfs_cassandra_lseek(fd, offset, whence)
NoSQL
Team
Develop compliant library with Cassandra for POSX file I/O functions,
and replace the POSIX functions with Cassandra compliant functions
The storage layers of PostgreSQL and SQLite are
implemented using POSIX file I/O functions.
Storage Manager Virtual File System
POSIX file I/O
functions
POSIX file I/O
functions
POSIX file I/O
functions
Compliant file I/O
functions
NoSQL
Team
This implementation method
easy to write the unit test,
and it is easy to debug too.
Cassandra
File Management
22
CREATE TABLE IF NOT EXISTS posix.storage (
path varchar,
block_no bigint,
block blob,
PRIMARY KEY (path, block_no));
SQL Engines
Storage Manager
Virtual File System
• File A
• File B
• .....
• .....
• .....
• File N
Block 0
File
Block 1 Block 2 ….. ….. ….. ….. …..
File
Benchmark
23
0
5
10
15
20
25
30
35
INSERT SELECT UPDATE
SQLite (Disk) SQLite+C* (1KB) SQLite+C* (4KB) SQLite+C* (8KB)
• Naive Implementation
X : Multi-threads
X : Async Requests
• Don’t care
X : Only Storage Layer
X : Access Coflict
(v3.20.1 + speedtest.tcl)
This is very a naive
and rough
implementation of a
distributed database
now, but ….
Yahoo! JAPAN ブース
24
A
連絡先 : nosql-info@mail.yahoo.co.jp
Copyrig ht © 2017 Yahoo Japan Corporation. All Rig hts Reserved.
References

Mais conteúdo relacionado

Mais procurados

FDW-based Sharding Update and Future
FDW-based Sharding Update and FutureFDW-based Sharding Update and Future
FDW-based Sharding Update and FutureMasahiko Sawada
 
Full Stack Monitoring with Prometheus and Grafana
Full Stack Monitoring with Prometheus and GrafanaFull Stack Monitoring with Prometheus and Grafana
Full Stack Monitoring with Prometheus and GrafanaJazz Yao-Tsung Wang
 
The Evolution and Future of Hadoop Storage (Hadoop Conference Japan 2016キーノート...
The Evolution and Future of Hadoop Storage (Hadoop Conference Japan 2016キーノート...The Evolution and Future of Hadoop Storage (Hadoop Conference Japan 2016キーノート...
The Evolution and Future of Hadoop Storage (Hadoop Conference Japan 2016キーノート...Hadoop / Spark Conference Japan
 
Big Data Tools : PAST, NOW and FUTURE
Big Data Tools : PAST, NOW and FUTUREBig Data Tools : PAST, NOW and FUTURE
Big Data Tools : PAST, NOW and FUTUREJazz Yao-Tsung Wang
 
リアルタイム分析サービス『たべみる』を支える高可用性アーキテクチャ
リアルタイム分析サービス『たべみる』を支える高可用性アーキテクチャリアルタイム分析サービス『たべみる』を支える高可用性アーキテクチャ
リアルタイム分析サービス『たべみる』を支える高可用性アーキテクチャHiroyuki Inoue
 
A Peek in the Elephant's Trunk
A Peek in the Elephant's TrunkA Peek in the Elephant's Trunk
A Peek in the Elephant's TrunkEDB
 
A Non-Standard use Case of Hadoop: High Scale Image Processing and Analytics
A Non-Standard use Case of Hadoop: High Scale Image Processing and AnalyticsA Non-Standard use Case of Hadoop: High Scale Image Processing and Analytics
A Non-Standard use Case of Hadoop: High Scale Image Processing and AnalyticsDataWorks Summit
 
DAT316_Report from the field on Aurora PostgreSQL Performance
DAT316_Report from the field on Aurora PostgreSQL PerformanceDAT316_Report from the field on Aurora PostgreSQL Performance
DAT316_Report from the field on Aurora PostgreSQL PerformanceAmazon Web Services
 
Transparent Data Encryption in PostgreSQL and Integration with Key Management...
Transparent Data Encryption in PostgreSQL and Integration with Key Management...Transparent Data Encryption in PostgreSQL and Integration with Key Management...
Transparent Data Encryption in PostgreSQL and Integration with Key Management...Masahiko Sawada
 
Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...
Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...
Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...Hadoop / Spark Conference Japan
 
Presto: SQL-on-Anything. Netherlands Hadoop User Group Meetup
Presto: SQL-on-Anything. Netherlands Hadoop User Group MeetupPresto: SQL-on-Anything. Netherlands Hadoop User Group Meetup
Presto: SQL-on-Anything. Netherlands Hadoop User Group MeetupWojciech Biela
 
Living with SQL and NoSQL at craigslist, a Pragmatic Approach
Living with SQL and NoSQL at craigslist, a Pragmatic ApproachLiving with SQL and NoSQL at craigslist, a Pragmatic Approach
Living with SQL and NoSQL at craigslist, a Pragmatic ApproachJeremy Zawodny
 
Online Upgrade Using Logical Replication
 Online Upgrade Using Logical Replication Online Upgrade Using Logical Replication
Online Upgrade Using Logical ReplicationEDB
 
Case Study: Sprinklr Uses Amazon EBS to Maximize Its NoSQL Deployment - DAT33...
Case Study: Sprinklr Uses Amazon EBS to Maximize Its NoSQL Deployment - DAT33...Case Study: Sprinklr Uses Amazon EBS to Maximize Its NoSQL Deployment - DAT33...
Case Study: Sprinklr Uses Amazon EBS to Maximize Its NoSQL Deployment - DAT33...Amazon Web Services
 
Scaling HDFS to Manage Billions of Files
Scaling HDFS to Manage Billions of FilesScaling HDFS to Manage Billions of Files
Scaling HDFS to Manage Billions of FilesHaohui Mai
 
MongoDB Capacity Planning
MongoDB Capacity PlanningMongoDB Capacity Planning
MongoDB Capacity PlanningNorberto Leite
 

Mais procurados (20)

FDW-based Sharding Update and Future
FDW-based Sharding Update and FutureFDW-based Sharding Update and Future
FDW-based Sharding Update and Future
 
Ventajas de IPv6
Ventajas de IPv6Ventajas de IPv6
Ventajas de IPv6
 
Introduction to HCFS
Introduction to HCFSIntroduction to HCFS
Introduction to HCFS
 
Full Stack Monitoring with Prometheus and Grafana
Full Stack Monitoring with Prometheus and GrafanaFull Stack Monitoring with Prometheus and Grafana
Full Stack Monitoring with Prometheus and Grafana
 
The Evolution and Future of Hadoop Storage (Hadoop Conference Japan 2016キーノート...
The Evolution and Future of Hadoop Storage (Hadoop Conference Japan 2016キーノート...The Evolution and Future of Hadoop Storage (Hadoop Conference Japan 2016キーノート...
The Evolution and Future of Hadoop Storage (Hadoop Conference Japan 2016キーノート...
 
Big Data Tools : PAST, NOW and FUTURE
Big Data Tools : PAST, NOW and FUTUREBig Data Tools : PAST, NOW and FUTURE
Big Data Tools : PAST, NOW and FUTURE
 
リアルタイム分析サービス『たべみる』を支える高可用性アーキテクチャ
リアルタイム分析サービス『たべみる』を支える高可用性アーキテクチャリアルタイム分析サービス『たべみる』を支える高可用性アーキテクチャ
リアルタイム分析サービス『たべみる』を支える高可用性アーキテクチャ
 
A Peek in the Elephant's Trunk
A Peek in the Elephant's TrunkA Peek in the Elephant's Trunk
A Peek in the Elephant's Trunk
 
A Non-Standard use Case of Hadoop: High Scale Image Processing and Analytics
A Non-Standard use Case of Hadoop: High Scale Image Processing and AnalyticsA Non-Standard use Case of Hadoop: High Scale Image Processing and Analytics
A Non-Standard use Case of Hadoop: High Scale Image Processing and Analytics
 
DAT316_Report from the field on Aurora PostgreSQL Performance
DAT316_Report from the field on Aurora PostgreSQL PerformanceDAT316_Report from the field on Aurora PostgreSQL Performance
DAT316_Report from the field on Aurora PostgreSQL Performance
 
Transparent Data Encryption in PostgreSQL and Integration with Key Management...
Transparent Data Encryption in PostgreSQL and Integration with Key Management...Transparent Data Encryption in PostgreSQL and Integration with Key Management...
Transparent Data Encryption in PostgreSQL and Integration with Key Management...
 
Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...
Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...
Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...
 
Presto: SQL-on-Anything. Netherlands Hadoop User Group Meetup
Presto: SQL-on-Anything. Netherlands Hadoop User Group MeetupPresto: SQL-on-Anything. Netherlands Hadoop User Group Meetup
Presto: SQL-on-Anything. Netherlands Hadoop User Group Meetup
 
Living with SQL and NoSQL at craigslist, a Pragmatic Approach
Living with SQL and NoSQL at craigslist, a Pragmatic ApproachLiving with SQL and NoSQL at craigslist, a Pragmatic Approach
Living with SQL and NoSQL at craigslist, a Pragmatic Approach
 
Online Upgrade Using Logical Replication
 Online Upgrade Using Logical Replication Online Upgrade Using Logical Replication
Online Upgrade Using Logical Replication
 
Case Study: Sprinklr Uses Amazon EBS to Maximize Its NoSQL Deployment - DAT33...
Case Study: Sprinklr Uses Amazon EBS to Maximize Its NoSQL Deployment - DAT33...Case Study: Sprinklr Uses Amazon EBS to Maximize Its NoSQL Deployment - DAT33...
Case Study: Sprinklr Uses Amazon EBS to Maximize Its NoSQL Deployment - DAT33...
 
Scaling HDFS to Manage Billions of Files
Scaling HDFS to Manage Billions of FilesScaling HDFS to Manage Billions of Files
Scaling HDFS to Manage Billions of Files
 
Treasure Data Mobile SDK
Treasure Data Mobile SDKTreasure Data Mobile SDK
Treasure Data Mobile SDK
 
MongoDB Capacity Planning
MongoDB Capacity PlanningMongoDB Capacity Planning
MongoDB Capacity Planning
 
ElastiCache & Redis
ElastiCache & RedisElastiCache & Redis
ElastiCache & Redis
 

Destaque (9)

ICML2017 参加報告会 山本康生
ICML2017 参加報告会 山本康生ICML2017 参加報告会 山本康生
ICML2017 参加報告会 山本康生
 
広告における機械学習の適用例とシステムについて
広告における機械学習の適用例とシステムについて広告における機械学習の適用例とシステムについて
広告における機械学習の適用例とシステムについて
 
データ利活用を促進するメタデータ
データ利活用を促進するメタデータデータ利活用を促進するメタデータ
データ利活用を促進するメタデータ
 
Yahoo! JAPANのOSS Cassandra貢献の今までとこれから
Yahoo! JAPANのOSS Cassandra貢献の今までとこれからYahoo! JAPANのOSS Cassandra貢献の今までとこれから
Yahoo! JAPANのOSS Cassandra貢献の今までとこれから
 
第4回 NIPS+読み会・関西 発表資料 山本
第4回 NIPS+読み会・関西 発表資料 山本第4回 NIPS+読み会・関西 発表資料 山本
第4回 NIPS+読み会・関西 発表資料 山本
 
JavaOne2017参加報告 Microservices topic & approach #jjug
JavaOne2017参加報告 Microservices topic & approach #jjugJavaOne2017参加報告 Microservices topic & approach #jjug
JavaOne2017参加報告 Microservices topic & approach #jjug
 
Design pattern in presto source code
Design pattern in presto source codeDesign pattern in presto source code
Design pattern in presto source code
 
データテクノロジースペシャル:Yahoo! JAPANにおけるメタデータ管理の試み
データテクノロジースペシャル:Yahoo! JAPANにおけるメタデータ管理の試みデータテクノロジースペシャル:Yahoo! JAPANにおけるメタデータ管理の試み
データテクノロジースペシャル:Yahoo! JAPANにおけるメタデータ管理の試み
 
#ibis2017 Description: IBIS2017の企画セッションでの発表資料
#ibis2017 Description: IBIS2017の企画セッションでの発表資料#ibis2017 Description: IBIS2017の企画セッションでの発表資料
#ibis2017 Description: IBIS2017の企画セッションでの発表資料
 

Semelhante a Cassandra: Now and the Future @ Yahoo! JAPAN

Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014
Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014
Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014NoSQLmatters
 
20171104 hk-py con-mysql-documentstore_v1
20171104 hk-py con-mysql-documentstore_v120171104 hk-py con-mysql-documentstore_v1
20171104 hk-py con-mysql-documentstore_v1Ivan Ma
 
MySQL Document Store
MySQL Document StoreMySQL Document Store
MySQL Document StoreMario Beck
 
Vote NO for MySQL
Vote NO for MySQLVote NO for MySQL
Vote NO for MySQLUlf Wendel
 
Cassandra at Pollfish
Cassandra at PollfishCassandra at Pollfish
Cassandra at PollfishPollfish
 
Real-Time Analytics with Apache Cassandra and Apache Spark
Real-Time Analytics with Apache Cassandra and Apache SparkReal-Time Analytics with Apache Cassandra and Apache Spark
Real-Time Analytics with Apache Cassandra and Apache SparkGuido Schmutz
 
Big Data, NoSQL with MongoDB and Cassasdra
Big Data, NoSQL with MongoDB and CassasdraBig Data, NoSQL with MongoDB and Cassasdra
Big Data, NoSQL with MongoDB and CassasdraBrian Enochson
 
20140614 introduction to spark-ben white
20140614 introduction to spark-ben white20140614 introduction to spark-ben white
20140614 introduction to spark-ben whiteData Con LA
 
A Microservices approach with Cassandra and Quarkus | DevNation Tech Talk
A Microservices approach with Cassandra and Quarkus | DevNation Tech TalkA Microservices approach with Cassandra and Quarkus | DevNation Tech Talk
A Microservices approach with Cassandra and Quarkus | DevNation Tech TalkRed Hat Developers
 
Real time data processing with spark & cassandra @ NoSQLMatters 2015 Paris
Real time data processing with spark & cassandra @ NoSQLMatters 2015 ParisReal time data processing with spark & cassandra @ NoSQLMatters 2015 Paris
Real time data processing with spark & cassandra @ NoSQLMatters 2015 ParisDuyhai Doan
 
DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...
DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...
DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...NoSQLmatters
 
NoSQL Intro with cassandra
NoSQL Intro with cassandraNoSQL Intro with cassandra
NoSQL Intro with cassandraBrian Enochson
 
Squeak DBX
Squeak DBXSqueak DBX
Squeak DBXESUG
 
Percona Live 4/14/15: Leveraging open stack cinder for peak application perfo...
Percona Live 4/14/15: Leveraging open stack cinder for peak application perfo...Percona Live 4/14/15: Leveraging open stack cinder for peak application perfo...
Percona Live 4/14/15: Leveraging open stack cinder for peak application perfo...Tesora
 
Jump Start on Apache Spark 2.2 with Databricks
Jump Start on Apache Spark 2.2 with DatabricksJump Start on Apache Spark 2.2 with Databricks
Jump Start on Apache Spark 2.2 with DatabricksAnyscale
 
Spark cassandra integration, theory and practice
Spark cassandra integration, theory and practiceSpark cassandra integration, theory and practice
Spark cassandra integration, theory and practiceDuyhai Doan
 
Polyglot Database - Linuxcon North America 2016
Polyglot Database - Linuxcon North America 2016Polyglot Database - Linuxcon North America 2016
Polyglot Database - Linuxcon North America 2016Dave Stokes
 

Semelhante a Cassandra: Now and the Future @ Yahoo! JAPAN (20)

Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014
Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014
Johnny Miller – Cassandra + Spark = Awesome- NoSQL matters Barcelona 2014
 
20171104 hk-py con-mysql-documentstore_v1
20171104 hk-py con-mysql-documentstore_v120171104 hk-py con-mysql-documentstore_v1
20171104 hk-py con-mysql-documentstore_v1
 
MySQL Document Store
MySQL Document StoreMySQL Document Store
MySQL Document Store
 
Vote NO for MySQL
Vote NO for MySQLVote NO for MySQL
Vote NO for MySQL
 
Cassandra at Pollfish
Cassandra at PollfishCassandra at Pollfish
Cassandra at Pollfish
 
Cassandra at Pollfish
Cassandra at PollfishCassandra at Pollfish
Cassandra at Pollfish
 
Real-Time Analytics with Apache Cassandra and Apache Spark,
Real-Time Analytics with Apache Cassandra and Apache Spark,Real-Time Analytics with Apache Cassandra and Apache Spark,
Real-Time Analytics with Apache Cassandra and Apache Spark,
 
Real-Time Analytics with Apache Cassandra and Apache Spark
Real-Time Analytics with Apache Cassandra and Apache SparkReal-Time Analytics with Apache Cassandra and Apache Spark
Real-Time Analytics with Apache Cassandra and Apache Spark
 
Big Data, NoSQL with MongoDB and Cassasdra
Big Data, NoSQL with MongoDB and CassasdraBig Data, NoSQL with MongoDB and Cassasdra
Big Data, NoSQL with MongoDB and Cassasdra
 
20140614 introduction to spark-ben white
20140614 introduction to spark-ben white20140614 introduction to spark-ben white
20140614 introduction to spark-ben white
 
A Microservices approach with Cassandra and Quarkus | DevNation Tech Talk
A Microservices approach with Cassandra and Quarkus | DevNation Tech TalkA Microservices approach with Cassandra and Quarkus | DevNation Tech Talk
A Microservices approach with Cassandra and Quarkus | DevNation Tech Talk
 
Real time data processing with spark & cassandra @ NoSQLMatters 2015 Paris
Real time data processing with spark & cassandra @ NoSQLMatters 2015 ParisReal time data processing with spark & cassandra @ NoSQLMatters 2015 Paris
Real time data processing with spark & cassandra @ NoSQLMatters 2015 Paris
 
DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...
DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...
DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...
 
Dask: Scaling Python
Dask: Scaling PythonDask: Scaling Python
Dask: Scaling Python
 
NoSQL Intro with cassandra
NoSQL Intro with cassandraNoSQL Intro with cassandra
NoSQL Intro with cassandra
 
Squeak DBX
Squeak DBXSqueak DBX
Squeak DBX
 
Percona Live 4/14/15: Leveraging open stack cinder for peak application perfo...
Percona Live 4/14/15: Leveraging open stack cinder for peak application perfo...Percona Live 4/14/15: Leveraging open stack cinder for peak application perfo...
Percona Live 4/14/15: Leveraging open stack cinder for peak application perfo...
 
Jump Start on Apache Spark 2.2 with Databricks
Jump Start on Apache Spark 2.2 with DatabricksJump Start on Apache Spark 2.2 with Databricks
Jump Start on Apache Spark 2.2 with Databricks
 
Spark cassandra integration, theory and practice
Spark cassandra integration, theory and practiceSpark cassandra integration, theory and practice
Spark cassandra integration, theory and practice
 
Polyglot Database - Linuxcon North America 2016
Polyglot Database - Linuxcon North America 2016Polyglot Database - Linuxcon North America 2016
Polyglot Database - Linuxcon North America 2016
 

Mais de Yahoo!デベロッパーネットワーク

ヤフーでは開発迅速性と品質のバランスをどう取ってるか
ヤフーでは開発迅速性と品質のバランスをどう取ってるかヤフーでは開発迅速性と品質のバランスをどう取ってるか
ヤフーでは開発迅速性と品質のバランスをどう取ってるかYahoo!デベロッパーネットワーク
 
データの価値を最大化させるためのデザイン~データビジュアライゼーションの方法~ #devsumi 17-E-2
データの価値を最大化させるためのデザイン~データビジュアライゼーションの方法~ #devsumi 17-E-2データの価値を最大化させるためのデザイン~データビジュアライゼーションの方法~ #devsumi 17-E-2
データの価値を最大化させるためのデザイン~データビジュアライゼーションの方法~ #devsumi 17-E-2Yahoo!デベロッパーネットワーク
 
ヤフーを支えるセキュリティ ~サイバー攻撃を防ぐエンジニアの仕事とは~ #yjtc
ヤフーを支えるセキュリティ ~サイバー攻撃を防ぐエンジニアの仕事とは~ #yjtcヤフーを支えるセキュリティ ~サイバー攻撃を防ぐエンジニアの仕事とは~ #yjtc
ヤフーを支えるセキュリティ ~サイバー攻撃を防ぐエンジニアの仕事とは~ #yjtcYahoo!デベロッパーネットワーク
 
Yahoo! JAPANのIaaSを支えるKubernetesクラスタ、アップデート自動化への挑戦 #yjtc
Yahoo! JAPANのIaaSを支えるKubernetesクラスタ、アップデート自動化への挑戦 #yjtcYahoo! JAPANのIaaSを支えるKubernetesクラスタ、アップデート自動化への挑戦 #yjtc
Yahoo! JAPANのIaaSを支えるKubernetesクラスタ、アップデート自動化への挑戦 #yjtcYahoo!デベロッパーネットワーク
 
ヤフーのAIプラットフォーム紹介 ~AIテックカンパニーを支えるデータ基盤~ #yjtc
ヤフーのAIプラットフォーム紹介 ~AIテックカンパニーを支えるデータ基盤~ #yjtcヤフーのAIプラットフォーム紹介 ~AIテックカンパニーを支えるデータ基盤~ #yjtc
ヤフーのAIプラットフォーム紹介 ~AIテックカンパニーを支えるデータ基盤~ #yjtcYahoo!デベロッパーネットワーク
 
新技術を使った次世代の商品の見せ方 ~ヤフオク!のマルチビュー機能~ #yjtc
新技術を使った次世代の商品の見せ方 ~ヤフオク!のマルチビュー機能~ #yjtc新技術を使った次世代の商品の見せ方 ~ヤフオク!のマルチビュー機能~ #yjtc
新技術を使った次世代の商品の見せ方 ~ヤフオク!のマルチビュー機能~ #yjtcYahoo!デベロッパーネットワーク
 
PC版Yahoo!メールリニューアル ~サービスのUI/UX統合と改善プロセス~ #yjtc
PC版Yahoo!メールリニューアル ~サービスのUI/UX統合と改善プロセス~ #yjtcPC版Yahoo!メールリニューアル ~サービスのUI/UX統合と改善プロセス~ #yjtc
PC版Yahoo!メールリニューアル ~サービスのUI/UX統合と改善プロセス~ #yjtcYahoo!デベロッパーネットワーク
 
モブデザインによる多職種チームのコミュニケーション改善 #yjtc
モブデザインによる多職種チームのコミュニケーション改善 #yjtcモブデザインによる多職種チームのコミュニケーション改善 #yjtc
モブデザインによる多職種チームのコミュニケーション改善 #yjtcYahoo!デベロッパーネットワーク
 
ユーザーの地域を考慮した検索入力補助機能の改善の試み #yjtc
ユーザーの地域を考慮した検索入力補助機能の改善の試み #yjtcユーザーの地域を考慮した検索入力補助機能の改善の試み #yjtc
ユーザーの地域を考慮した検索入力補助機能の改善の試み #yjtcYahoo!デベロッパーネットワーク
 

Mais de Yahoo!デベロッパーネットワーク (20)

ゼロから始める転移学習
ゼロから始める転移学習ゼロから始める転移学習
ゼロから始める転移学習
 
継続的なモデルモニタリングを実現するKubernetes Operator
継続的なモデルモニタリングを実現するKubernetes Operator継続的なモデルモニタリングを実現するKubernetes Operator
継続的なモデルモニタリングを実現するKubernetes Operator
 
ヤフーでは開発迅速性と品質のバランスをどう取ってるか
ヤフーでは開発迅速性と品質のバランスをどう取ってるかヤフーでは開発迅速性と品質のバランスをどう取ってるか
ヤフーでは開発迅速性と品質のバランスをどう取ってるか
 
オンプレML基盤on Kubernetes パネルディスカッション
オンプレML基盤on Kubernetes パネルディスカッションオンプレML基盤on Kubernetes パネルディスカッション
オンプレML基盤on Kubernetes パネルディスカッション
 
LakeTahoe
LakeTahoeLakeTahoe
LakeTahoe
 
オンプレML基盤on Kubernetes 〜Yahoo! JAPAN AIPF〜
オンプレML基盤on Kubernetes 〜Yahoo! JAPAN AIPF〜オンプレML基盤on Kubernetes 〜Yahoo! JAPAN AIPF〜
オンプレML基盤on Kubernetes 〜Yahoo! JAPAN AIPF〜
 
Persistent-memory-native Database High-availability Feature
Persistent-memory-native Database High-availability FeaturePersistent-memory-native Database High-availability Feature
Persistent-memory-native Database High-availability Feature
 
データの価値を最大化させるためのデザイン~データビジュアライゼーションの方法~ #devsumi 17-E-2
データの価値を最大化させるためのデザイン~データビジュアライゼーションの方法~ #devsumi 17-E-2データの価値を最大化させるためのデザイン~データビジュアライゼーションの方法~ #devsumi 17-E-2
データの価値を最大化させるためのデザイン~データビジュアライゼーションの方法~ #devsumi 17-E-2
 
eコマースと実店舗の相互利益を目指したデザイン #yjtc
eコマースと実店舗の相互利益を目指したデザイン #yjtceコマースと実店舗の相互利益を目指したデザイン #yjtc
eコマースと実店舗の相互利益を目指したデザイン #yjtc
 
ヤフーを支えるセキュリティ ~サイバー攻撃を防ぐエンジニアの仕事とは~ #yjtc
ヤフーを支えるセキュリティ ~サイバー攻撃を防ぐエンジニアの仕事とは~ #yjtcヤフーを支えるセキュリティ ~サイバー攻撃を防ぐエンジニアの仕事とは~ #yjtc
ヤフーを支えるセキュリティ ~サイバー攻撃を防ぐエンジニアの仕事とは~ #yjtc
 
Yahoo! JAPANのIaaSを支えるKubernetesクラスタ、アップデート自動化への挑戦 #yjtc
Yahoo! JAPANのIaaSを支えるKubernetesクラスタ、アップデート自動化への挑戦 #yjtcYahoo! JAPANのIaaSを支えるKubernetesクラスタ、アップデート自動化への挑戦 #yjtc
Yahoo! JAPANのIaaSを支えるKubernetesクラスタ、アップデート自動化への挑戦 #yjtc
 
ビッグデータから人々のムードを捉える #yjtc
ビッグデータから人々のムードを捉える #yjtcビッグデータから人々のムードを捉える #yjtc
ビッグデータから人々のムードを捉える #yjtc
 
サイエンス領域におけるMLOpsの取り組み #yjtc
サイエンス領域におけるMLOpsの取り組み #yjtcサイエンス領域におけるMLOpsの取り組み #yjtc
サイエンス領域におけるMLOpsの取り組み #yjtc
 
ヤフーのAIプラットフォーム紹介 ~AIテックカンパニーを支えるデータ基盤~ #yjtc
ヤフーのAIプラットフォーム紹介 ~AIテックカンパニーを支えるデータ基盤~ #yjtcヤフーのAIプラットフォーム紹介 ~AIテックカンパニーを支えるデータ基盤~ #yjtc
ヤフーのAIプラットフォーム紹介 ~AIテックカンパニーを支えるデータ基盤~ #yjtc
 
Yahoo! JAPAN Tech Conference 2022 Day2 Keynote #yjtc
Yahoo! JAPAN Tech Conference 2022 Day2 Keynote #yjtcYahoo! JAPAN Tech Conference 2022 Day2 Keynote #yjtc
Yahoo! JAPAN Tech Conference 2022 Day2 Keynote #yjtc
 
新技術を使った次世代の商品の見せ方 ~ヤフオク!のマルチビュー機能~ #yjtc
新技術を使った次世代の商品の見せ方 ~ヤフオク!のマルチビュー機能~ #yjtc新技術を使った次世代の商品の見せ方 ~ヤフオク!のマルチビュー機能~ #yjtc
新技術を使った次世代の商品の見せ方 ~ヤフオク!のマルチビュー機能~ #yjtc
 
PC版Yahoo!メールリニューアル ~サービスのUI/UX統合と改善プロセス~ #yjtc
PC版Yahoo!メールリニューアル ~サービスのUI/UX統合と改善プロセス~ #yjtcPC版Yahoo!メールリニューアル ~サービスのUI/UX統合と改善プロセス~ #yjtc
PC版Yahoo!メールリニューアル ~サービスのUI/UX統合と改善プロセス~ #yjtc
 
モブデザインによる多職種チームのコミュニケーション改善 #yjtc
モブデザインによる多職種チームのコミュニケーション改善 #yjtcモブデザインによる多職種チームのコミュニケーション改善 #yjtc
モブデザインによる多職種チームのコミュニケーション改善 #yjtc
 
「新しいおうち探し」のためのAIアシスト検索 #yjtc
「新しいおうち探し」のためのAIアシスト検索 #yjtc「新しいおうち探し」のためのAIアシスト検索 #yjtc
「新しいおうち探し」のためのAIアシスト検索 #yjtc
 
ユーザーの地域を考慮した検索入力補助機能の改善の試み #yjtc
ユーザーの地域を考慮した検索入力補助機能の改善の試み #yjtcユーザーの地域を考慮した検索入力補助機能の改善の試み #yjtc
ユーザーの地域を考慮した検索入力補助機能の改善の試み #yjtc
 

Último

Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 

Último (20)

Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 

Cassandra: Now and the Future @ Yahoo! JAPAN

  • 1. Satoshi Konno Cassandra: Now and the Future @ Yahoo! JAPAN 10 October 2017
  • 2. whoami 2 Satoshi Konno http://www.cybergarage.org Engineering Manager of NoSQL and NewSQL Teams @ Yahoo! Japan Open Source Software Developer for Virtual Reality, IoT and Cloud Computing Doctor's Course Student @ JAIST Défago Lab: The φ accrual failure detector
  • 3. Agenda 3 • What is the NoSQL Team? • Why did we choose Cassandra? • What is NewSQL? • NewSQL with Cassandra
  • 4. Copyrig ht © 2017 Yahoo Japan Corporation. All Rig hts Reserved. NoSQL Team
  • 5. What is Yahoo! JAPAN? 5 Many Strong Services Media US Search Video Answer Mail JP US JP Membership C2C Payment C2C EC B2C EC Local Search Knowledge search MailNews YAHUOKU!Premium Wallet Loco
  • 6. What is the NoSQL Team? 6 300+ Systems 100+ Services NoSQL Team
  • 7. Cassandra @ Yahoo! JAPAN 7 2010 2012 2014 2016 2018 Service Departments NoSQL Team 0.5 0.8 1.x 0.8 1.x 2.x 3.xNoSQL Team
  • 8. Cassandra @ Yahoo! JAPAN 8 50 Clusters 50TB Usages 2000+ Nodes 500,000 Read/sec 500,000 Write/sec 2017 10 Nodes / Cluster 200 Nodes / Cluster … 1 Shared Cluster 50 Special Clusters 50 Systems 50 Systems 3 DCs
  • 9. Copyrig ht © 2017 Yahoo Japan Corporation. All Rig hts Reserved. NoSQL Team with Cassandra
  • 10. Key Value Store Team Search Engine Team Before Cassandra 10 Services Search Engine Key Value Store • Problems: Inappropriate usage by internal platforms and new demands for more big data. Don’t store any big key data!! Don’t use it like a key value store!! 2012 We store data of our services on your platforms. We want to store more big data of our new services easily.
  • 11. Key Value Store Team Search Engine Team NoSQL Team 11 • Launched NoSQL Team in 2012 We should build new centralized platform for more big data!! 2012 NoSQL Team Service Departments Join Join Join However, many open source NoSQL databases have been released already, so we have to evaluate these. New
  • 12. NoSQL Team 12 • NoSQL team selected Cassandra as our first centralized NoSQL database. Services 2012 • High Availability • Performance • Persistence • Scalability • ….. • Maintainability • Appropriate Open Source License • ….. Function Point Analysis No1 NoSQL Team
  • 13. NoSQL 2012 2014 2016 2018 State of NoSQL Databases 13 0.x 1.x 2.x 3.x 4.x 2.x 3.x 4.x 0.x 1.x 2.x 2.x 3.x 1.x 2.x 1.x 2.x 0.9.7.x
  • 14. Copyrig ht © 2017 Yahoo Japan Corporation. All Rig hts Reserved. NewSQL
  • 15. NewSQL Trends • NewSQL = NoSQL (Scalability) + RDBMS (SQL, ACID) = Scalable RDBMS like NoSQL 15 NoSQL Team
  • 16. Services Requests for NewSQL 16 NoSQL Team We have big on- premises data centers and, we can’t use the NewSQL platforms in our private cloud. Public Cloud OSS We want to make use of our knowledge experience with Cassandra. Private Cloud Knowledge Experience
  • 17. 17 NewSQL with Cassandra Function Google Amazon MariaDB Cockroachdb Spanner Aurora Logging Query Engine Transaction Schema Store Storage NoSQL Team Could we use Cassandra for storage layer of NewSQL databases?
  • 18. Copyrig ht © 2017 Yahoo Japan Corporation. All Rig hts Reserved. Trial for NewSQL with Cassandra (OSS SQL Engines with Cassandra)
  • 19. Trial Concept 19 • OSS SQL Engine + Distributed Storage = PostgreSQL + Cassandra or SQLite + Cassandra Function Traial Logging Query Engine Transaction Schema Store Storage NoSQL Team Could we replace storage layer of SQL databases with Cassandra?
  • 20. Study Implementation 20 NoSQL Team PostgreSQL’s storage is abstracted as the storage manager, but ….. Storage Manager SQLite’s storage is abstracted as the virtual file system too, but ….. Virtual File System NoSQL Team To implement the abstract functions directly is hard to debug ….
  • 21. POSIX Emulation 21 #define open(path, flags, mode) posix_vfs_cassandra_open(path, flags, mode) #define close(fd) posix_vfs_cassandra_close(fd) #define read(fd, buf, nbytes) posix_vfs_cassandra_read(fd, buf, nbytes) #define write(fd, buf, nbytes) posix_vfs_cassandra_write(fd, buf, nbytes) #define access(path, mode) posix_vfs_cassandra_access(path, mode) #define unlink(path) posix_vfs_cassandra_unlink(path) #define fstat(fd, buf) posix_vfs_cassandra_fstat(fd, buf) #define fsync(fd) posix_vfs_cassandra_fsync(fd) #define lseek(fd, offset, whence) posix_vfs_cassandra_lseek(fd, offset, whence) NoSQL Team Develop compliant library with Cassandra for POSX file I/O functions, and replace the POSIX functions with Cassandra compliant functions The storage layers of PostgreSQL and SQLite are implemented using POSIX file I/O functions. Storage Manager Virtual File System POSIX file I/O functions POSIX file I/O functions POSIX file I/O functions Compliant file I/O functions NoSQL Team This implementation method easy to write the unit test, and it is easy to debug too.
  • 22. Cassandra File Management 22 CREATE TABLE IF NOT EXISTS posix.storage ( path varchar, block_no bigint, block blob, PRIMARY KEY (path, block_no)); SQL Engines Storage Manager Virtual File System • File A • File B • ..... • ..... • ..... • File N Block 0 File Block 1 Block 2 ….. ….. ….. ….. ….. File
  • 23. Benchmark 23 0 5 10 15 20 25 30 35 INSERT SELECT UPDATE SQLite (Disk) SQLite+C* (1KB) SQLite+C* (4KB) SQLite+C* (8KB) • Naive Implementation X : Multi-threads X : Async Requests • Don’t care X : Only Storage Layer X : Access Coflict (v3.20.1 + speedtest.tcl) This is very a naive and rough implementation of a distributed database now, but ….
  • 24. Yahoo! JAPAN ブース 24 A 連絡先 : nosql-info@mail.yahoo.co.jp
  • 25. Copyrig ht © 2017 Yahoo Japan Corporation. All Rig hts Reserved. References