In Memory Databases
An Overview
By John Sullivan
john@inmemory.net
Row Store
Features
• Data is stored sequentially by Row
• Essentially an Array / List Structure
• Easy to Add / Update / Insert /Delete
• Need to read entire Row to get to
one Column’s Data
Column Store
Features
• Data is stored by Column
• Faster to Read a few Columns
• Very Hard to Update / Insert
• Reading Data Sequentially from
Column, CPU Cache Friendly
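The contrast between the two layouts can be sketched in a few lines of Python. This is only an illustration and is not taken from the slides; the table contents and the names row_store and column_store are made up.

```python
# Hypothetical four-row table (id, name, amount), used only for illustration.
row_store = [
    (1, "Alice", 30.0),
    (2, "Bob", 12.5),
    (3, "Carol", 7.5),
]

# Row store: adding a record is a single append to one list.
row_store.append((4, "Dave", 50.0))

# Column store: one array per column, so the same insert touches every array.
column_store = {
    "id": [r[0] for r in row_store],
    "name": [r[1] for r in row_store],
    "amount": [r[2] for r in row_store],
}

# Summing one column from the row store walks over every full record...
total_from_rows = sum(record[2] for record in row_store)
# ...while the column store scans a single sequential array.
total_from_columns = sum(column_store["amount"])
```

Both totals are the same; the difference is how much unrelated data has to be read to compute them.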
Compressed Column Store
• Column Array is converted into 2 arrays
–One array contains a list of sorted
Unique Values
–Another array containing an integer
index to the values
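A minimal sketch of this two-array encoding in Python, assuming the column fits in memory and its values are sortable. The function names encode_column and decode_column are illustrative and do not come from any product mentioned here.

```python
from bisect import bisect_left


def encode_column(values):
    """Split a column into sorted unique values plus one integer index per row."""
    unique_values = sorted(set(values))
    # Because unique_values is sorted, a binary search finds each value's index.
    indexes = [bisect_left(unique_values, v) for v in values]
    return unique_values, indexes


def decode_column(unique_values, indexes):
    """Rebuild the original column from the two arrays."""
    return [unique_values[i] for i in indexes]


column = ["Cork", "Dublin", "Cork", "Galway", "Dublin", "Cork"]
unique_values, indexes = encode_column(column)
```

Repeated strings are stored once, and each row costs only one small integer.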
SQLite
• Opened in memory via the special filename :memory:
• Designed for a single process / single file
• Great for embedded systems / mobile
devices, e.g. iOS apps
• Row store, no column store
• One writer only. Not server based.
• Free & Open Source
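As a quick illustration of the :memory: filename, here is a short sketch using Python's standard sqlite3 module. The orders table and its contents are invented for the example.

```python
import sqlite3

# The special filename ":memory:" creates a database that lives only in RAM
# and disappears when the connection is closed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO orders (customer, qty) VALUES (?, ?)",
    [("acme", 3), ("acme", 5), ("globex", 2)],
)
totals = conn.execute(
    "SELECT customer, SUM(qty) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
conn.close()
```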
Excel
• Power Pivot, Introduced in Excel 2010
• Non SQL Query Language
• Data Analysis Expressions (DAX)
• Syntax similar to Excel Formulae
• Requires Pro version of Office or Excel
Tableau
• Primarily a Visualization Tool
• Tableau Data Extracts (TDE)
• Compressed Column Store
• Generates a single flat-table Extract from the Source
(which may involve joins)
• Uses ODBC / OLEDB For Extraction
• Only loads required columns from Extract
Qlik
• One of the Original Developers in Compressed
Columnar In Memory Analytics
• Nice Dashboards
• Incremental Updates
• Autojoins Fields based on Field Name
• Scripting Language for Generating QVD Files
Qlik Load Script Example
Companies:
LOAD id AS COMPANY_ID,
name as COMPANY_NAME,
postcode AS COMPANY_POSTCODE,
address AS COMPANY_ADDRESS,
If(id > 100, 1, 0) AS FLAG_NATIONAL;
SQL SELECT id, name, postcode, address
FROM database.Companies;
MonetDB
• Pioneer in Columnar Databases
• Research Focussed out of the Netherlands
• Open Source
• Can Cache Expensive Computations and Reuse
• An early version was used by Data Distilleries,
which was later bought by SPSS
• R Integration
SQL Server Enterprise
• ColumnStore Indexes
–Data is stored by column.
–Blocks of 1,048,576 Values
• In-Memory OLTP
–WITH (MEMORY_OPTIMIZED = ON) after CREATE TABLE
–Data/Delta files of 128 MB
Oracle
• TimesTen
– Works with Oracle Database as a Cache
– Telecoms and Financial Companies
• Oracle 12 Enterprise
– Row & Column Formats
– In Memory Columnstore
• Exalytics
SAP Hana
• Pure In Memory Database
• In Memory OLTP Rowstore
• In Memory Columnstore
– Up to 2^31 rows per block
• Cluster Large Fact tables across nodes
• Hana One Available on EC2 & IBM
SAP Hana Architecture
MemSQL
• Pure In Memory Database
• MySQL Wire Protocol Compatible
• Lockfree Linked Lists and Skiplists
• SQL Queries compiled into C++
• Split Large Tables Across Nodes
• Column Store Aimed at Analytics
• Apache Spark Integration
Skiplists
Clustered Databases
• Amazon Redshift
• EMC Greenplum
• IBM Netezza
• HP Vertica
• Teradata
Other In Memory Players
• Sisense: BI Focussed
• Parstream: Cisco Owned
• Domo: SaaS BI Company. Omniture Founder
• Iri
• InsightSquared: BI Focussed
• VoltDB: Java Stored Procedure as Unit of Execution
• Infobright: Open Sourced, based on MySQL
• KDB: Focussed on HFT / Terse
InMemory.Net
public static void testDoublePerformance() {
double total = 0;
for (int kk = 0; kk < 1000000000; kk++) {
total += kk;
}
Console.WriteLine(total);
}
Results
• Ran in about 2.5 seconds for a billion rows
• 400 million rows per second on a single core
• About 50% of the performance of the equivalent C++ program
• 1.6 billion / second when running on 4 cores
• 2.0 billion / second when running with Hyper-Threading (HT)
cores
Initial Version
InMemoryColumn<T> {
    Dictionary <T,int> initialValuesDict;
    List <int> initialIndexes;
    T [] finalValues;
    int [] finalIndexes;
}
Next Version
InMemoryColumn<T> {
    Dictionary <T,int> initialValuesDict;
    int [][] initialIndexes;
    T [] finalValues;
    int [] finalIndexes;
}
Final Version
InMemoryColumn<T> {
    Dictionary <T,int> initialValuesDict;
    byte/ushort/int [][] initialIndexes;
    T [] finalValues;
    byte/ushort/int [] finalIndexes;
}
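The byte/ushort/int choice in the final version can be illustrated with Python's array module. This is a sketch under the assumption that the index width is picked from the number of unique values; the helper names pick_typecode and pack_indexes are hypothetical and are not part of InMemory.Net.

```python
from array import array


def pick_typecode(unique_count):
    """Choose the narrowest unsigned integer type that can hold every index."""
    if unique_count <= 256:
        return "B"  # 1 byte per index, like byte
    if unique_count <= 65536:
        return "H"  # 2 bytes per index, like ushort
    return "I"      # unsigned int (4 bytes on common platforms)


def pack_indexes(indexes, unique_count):
    """Store dictionary indexes in a compact typed array."""
    return array(pick_typecode(unique_count), indexes)


small = pack_indexes([0, 1, 2, 1, 0], unique_count=3)
medium = pack_indexes([0, 300, 65535], unique_count=65536)
```

A column with only a handful of unique values then costs one byte per row instead of four.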
ANTLR to Parse Queries
grammar Expr;
prog: (expr NEWLINE)* ;
expr: expr ('*'|'/') expr |
expr ('+'|'-') expr |
INT | '(' expr ')' ;
NEWLINE : [\r\n]+ ;
INT : [0-9]+ ;
Example Rule from Grammar
mainquery [ImpVars vars] returns [InMemoryQuery query ] :
{ $query = new InMemoryQuery(); }
SELECT1
(CACHE {$query.setCache();} )?
(NOCACHE {$query.setNoCache();} )?
(DISTINCT {$query.setDistinct();} )?
fieldclause [$query,$vars]
(
(INTO label { $query.setInto ($label.text2 ) ;})?
FROM tableclause [$query,$vars]
( (COMMA|CROSS JOIN ) tableclause [$query,$vars] ) *
(WHERE whereclause [$query,$vars])?
(GROUP BY groupclause [$query,$vars])?
(HAVING havingclause [$query,$vars])?
(ORDER BY orderclause [$query,$vars])?
(LIMIT limitclause [$query,$vars])?
)? ;
Code Generation
• Generate C# To Evaluate Query
• Compiled Code undergoes JIT for fast exec
• Parameterize Constants
– Simplify complex Constant Expressions
• Generic Table / Column Naming
• Reuse Generated Code
Detail Queries
• Detail Query
–Initial List Algorithm
–Improved by using Arrays of Arrays
–Only one thread works on one
Array
SELECT customerid FROM Orders
for (int tab1_counter = rowStart; tab1_counter < rowEnd; tab1_counter++)
{ groupRowD1 = groupRowCount >> 14;
groupRowD2 = groupRowCount & 16383;
if (groupRowD2 == 0)
{
if (groupRowD1 > 0)
{
blockCounts[groupRowD1 - 1] = 16384;
}
lock (lock_newBlockObject)
{
groupRowCount = nextRecordD1 << 14;
nextRecordD1++;
}
groupRowD1 = groupRowCount >> 14;
t_total0[groupRowD1] = new byte[16384];
total0 = t_total0[groupRowD1];
};
total0[groupRowD2] = val_t1_c1[tab1_counter];
groupRowCount++;
if ((groupRowCount & 16383) == 0)
{
blockCounts[groupRowD1] = 16384;
}
}
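The shifts and masks in the generated loop split a running row count into a block number (count >> 14) and an offset inside a 16,384-entry block (count & 16383). Below is a single-threaded Python sketch of that arrays-of-arrays output buffer. The class name BlockedOutput is made up, and the locking that hands each new block to one thread is deliberately left out.

```python
BLOCK_BITS = 14
BLOCK_SIZE = 1 << BLOCK_BITS  # 16384 entries per block
BLOCK_MASK = BLOCK_SIZE - 1   # 16383


class BlockedOutput:
    """Append-only result buffer stored as a list of fixed-size byte blocks."""

    def __init__(self):
        self.blocks = []
        self.count = 0

    def append(self, value):
        block_no = self.count >> BLOCK_BITS
        offset = self.count & BLOCK_MASK
        if offset == 0:
            # The previous block is full (or this is the first row):
            # allocate a fresh block rather than resizing one big array.
            self.blocks.append(bytearray(BLOCK_SIZE))
        self.blocks[block_no][offset] = value
        self.count += 1

    def get(self, position):
        return self.blocks[position >> BLOCK_BITS][position & BLOCK_MASK]


out = BlockedOutput()
for i in range(40000):
    out.append(i % 256)
```

Growing by whole blocks avoids copying existing results, which is the cost a single resizable list pays each time it expands.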
Aggregative Queries
• Group Cardinality =1
• Group Cardinality < 500k
– Use Arrays of Arrays,
– Lookup Key being Group Index
• Group Cardinality > 500k
– Use Dictionaries to Correlate Group Index ->
Storage
– Arrays of Arrays
SELECT customer, SUM(1) FROM orders
WHERE employee=1 GROUP BY customer
for (int tab1_counter = rowStart; tab1_counter < rowEnd;
tab1_counter++, newRow = false) {
if ((val_t1_c2[tab1_counter] == const_0_t1_c2)) {
rowIndex = val_t1_c1[tab1_counter];
if (groupRowExists[rowIndex] == 0) newRow = true;
groupRowExists[rowIndex] = 1;
total1[rowIndex] += const_0;
if (newRow) {
total0[rowIndex]=val_t1_c1[tab1_counter];
}
}
}
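For the low-cardinality case, the group index itself can serve as the array position, as in the generated loop above. A Python sketch of the same query follows; it assumes both columns are already dictionary encoded, and the function and variable names are illustrative.

```python
def count_by_customer(customer_idx, employee_idx, employee_const, customer_cardinality):
    """SELECT customer, SUM(1) FROM orders WHERE employee = ? GROUP BY customer.

    customer_idx and employee_idx hold dictionary indexes, one per row.
    """
    group_exists = bytearray(customer_cardinality)
    totals = [0] * customer_cardinality
    for row in range(len(customer_idx)):
        if employee_idx[row] == employee_const:
            group = customer_idx[row]  # the dictionary index is the array slot
            group_exists[group] = 1
            totals[group] += 1
    # Only groups that received at least one row appear in the result.
    return [(g, totals[g]) for g in range(customer_cardinality) if group_exists[g]]


customer_idx = [0, 2, 0, 1, 2, 0]
employee_idx = [1, 1, 0, 0, 1, 1]
result = count_by_customer(customer_idx, employee_idx, employee_const=1, customer_cardinality=3)
```

No hashing is needed in the inner loop, which is why this path suits small group counts.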
COUNT DISTINCT
• Initial algorithm used byte []
• Used a lot of memory on machines with many cores
• Upgraded to one shared [] across all cores
• Interlocked.CompareExchange to set each bit
• Hashmap for the initial values
• Then switch to byte []
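A single-threaded Python sketch of counting distinct values with one bit per dictionary index. The slides describe setting bits atomically with Interlocked.CompareExchange across cores; that part is not reproduced here, and the function name count_distinct is an assumption.

```python
def count_distinct(indexes, cardinality):
    """Count distinct dictionary indexes using one bit per possible value."""
    bits = bytearray((cardinality + 7) // 8)
    distinct = 0
    for i in indexes:
        byte_no = i >> 3
        mask = 1 << (i & 7)
        if not bits[byte_no] & mask:
            # First time this value is seen: mark it and count it.
            bits[byte_no] |= mask
            distinct += 1
    return distinct


distinct_customers = count_distinct([0, 3, 3, 9, 0, 9, 1], cardinality=10)
```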
Subqueries
• Subquery in Table clause can be materialized
into temp table ( CACHE )
• Simplify Subquery ( NOCACHE)
– Keep only the fields the parent SELECT requires
– Pass through the parent WHERE clause
JOINS
• LEFT & INNER JOIN SUPPORT
• Merge Parent & Child Column Values
• Parent Value -> Child Indexes
• ONE to ONE
– Join becomes an Array Lookup
• ONE to Many
– Join Becomes for Loop
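A small Python sketch of the "parent value -> child indexes" idea, assuming the join key values of both tables are available as plain lists. All names here are illustrative. With a one-to-one relationship every list in the mapping has a single entry, so the join reduces to a lookup; with one-to-many it becomes a loop over the child rows.

```python
def build_child_index(child_keys):
    """Map each join key value to the list of child row numbers that hold it."""
    mapping = {}
    for child_row, key in enumerate(child_keys):
        mapping.setdefault(key, []).append(child_row)
    return mapping


def left_join(parent_keys, child_keys):
    """Return (parent_row, child_row) pairs; child_row is None when unmatched."""
    mapping = build_child_index(child_keys)
    pairs = []
    for parent_row, key in enumerate(parent_keys):
        matches = mapping.get(key)
        if not matches:
            pairs.append((parent_row, None))
        else:
            for child_row in matches:  # one-to-many: loop over the child rows
                pairs.append((parent_row, child_row))
    return pairs


def inner_join(parent_keys, child_keys):
    """Same as left_join, without the unmatched parent rows."""
    return [pair for pair in left_join(parent_keys, child_keys) if pair[1] is not None]


customers = ["acme", "globex", "initech"]
order_customers = ["globex", "acme", "globex"]
left = left_join(customers, order_customers)
inner = inner_join(customers, order_customers)
```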
Query Simplification
• Rewrite Aggregate Queries with Expressions
SELECT SUM(1) / SUM (qty ) FROM Orders
SELECT SUM(1) as A, SUM(QTY) as B from
Orders
SELECT A/B FROM TEMP_QUERY
More Simplifications
• Group Expressions with 1 Database Field
e.g. Group by Month ( OrderDate )
Inner Join OrderDate to Table of Its Unique
Values and Month ( OrderDate )
• Remove Redundant Group By Parts
GROUP BY OrderDate, Month ( OrderDate )
becomes GROUP BY OrderDate
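The first simplification, evaluating Month ( OrderDate ) once per unique date rather than once per row, can be sketched in Python as follows. It assumes the OrderDate column is dictionary encoded; the function name month_counts is made up.

```python
from datetime import date


def month_counts(unique_dates, indexes):
    """Count rows per month, calling the month function once per unique date."""
    # Small lookup table: dictionary index -> month of that unique date.
    month_of_value = [d.month for d in unique_dates]
    counts = {}
    for i in indexes:
        month = month_of_value[i]  # per row this is only an array lookup
        counts[month] = counts.get(month, 0) + 1
    return counts


unique_dates = [date(2015, 1, 5), date(2015, 1, 20), date(2015, 2, 3)]
indexes = [0, 0, 1, 2, 2, 2, 1]
result = month_counts(unique_dates, indexes)
```

With millions of rows and only a few thousand distinct dates, the expression is evaluated a few thousand times instead of millions.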
HAVING Clause
• Convert to two Queries
• One Query without Having Clause
• Having Clause becomes Where of Second
Query
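The HAVING rewrite can be checked on a toy table with Python's sqlite3 module. SQLite is used here only as a convenient SQL engine to show that the two forms return the same rows; it is not how InMemory.Net runs the query, and the table and column names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("acme", 3), ("acme", 5), ("globex", 2), ("initech", 6)],
)

# Original form: a single query with a HAVING clause.
with_having = conn.execute(
    "SELECT customer, SUM(qty) AS total FROM orders "
    "GROUP BY customer HAVING SUM(qty) > 5 ORDER BY customer"
).fetchall()

# Rewritten form: first aggregate without HAVING...
conn.execute(
    "CREATE TEMP TABLE temp_query AS "
    "SELECT customer, SUM(qty) AS total FROM orders GROUP BY customer"
)
# ...then apply the HAVING condition as the WHERE of a second query.
two_step = conn.execute(
    "SELECT customer, total FROM temp_query WHERE total > 5 ORDER BY customer"
).fetchall()
conn.close()
```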
Function List
String Functions
CAST | CAST_STR_AS_INT | CAST_STR_AS_DECIMAL | CHAR | CHARINDEX | COALESCE | CONCAT | CSTR | ENDSWITH | INSERT | ISNULL | ISNULLOREMPTY
| LEFT | LEN | LCASE | LTRIM | REMOVE | REPLACE | REVERSE | RIGHT | RTRIM | SUBSTRING | STARTSWITH | TRIM | UCASE
Date Functions
CDATE | DATEADD | DATEDIFF | DATEDIFFMILLISECOND | DATEPART | DATESERIAL | DAY | DAYOFWEEK | MONTH | TRUNC | YEAR
Math
ABS | CAST_NUM_AS_BYTE | CAST_NUM_AS_DECIMAL | CAST_NUM_AS_DOUBLE | CAST_NUM_AS_INT | CAST_NUM_AS_LONG | CAST_NUM_AS_SHORT
| CAST_NUM_AS_SINGLE | FLOOR | LOG | MAX | MAXLIST | MIN | MINLIST | POWER | RAND | ROUND | SIGN | SQRT
Trigonometric
ASIN | ACOS | ATAN | ATAN2 | COS | COSH | SIN | SINH | TAN | TANH
Aggregate Functions
MIN | MAX | COUNT | AVG | SUM | COUNT ( DISTINCT() ) | MINLIST | MAXLIST
Statistical Functions
STDEV| STDEVP | VAR | VARP
Special Cases
• SELECT COUNT ( DISTINCT CUSTOMER )
FROM ORDERS
Answer is the number of unique Customer values
• SELECT DISTINCT CUSTOMER FROM ORDERS
Answer is the list of unique Customer values
Importing Data
DATASOURCE a1=ODBC 'dsn=ir_northwind'
IMPORT Customers=a1.customers
IMPORT Products=a1.{SELECT * FROM Products}
IMPORT orders=a1.'somequery.sql'
SAVE
Importing Data II
• ODBC / OLEDB / DOT NET Providers
• Special ME Datasource
• Existing In Memory Databases
• UNION ALL Between Sources
• SLURP Command
• Variables, Expressions & IF
Interfacing to the Database
• Native Dot Net API
• Dot Net Data Provider
• COM/ ACTIVEX API
• ODBC Driver
– C / C++ IO
– Licensed ODBC Kit
– Parameterized Queries + Cursor Support
Hard Learned Lessons
• Allocate and store the variables relating to one
thread sequentially. Don’t intermix them with other threads' data
• Xeon servers with maxed-out memory can
have slower memory access speeds
– 1 Rank 1,866 MHz
– 2 Ranks 1,600 MHz
– 3 Ranks 1,333 MHz
Bitcoin Mining / HFT
• CPUs
• GPUs
• FPGAs
• Dedicated Mining Chip
GPU & InMemory Databases
• GPUDB, MAPD
– Good for Visualising Billions of Points
– GPUs can run thousands of Cores on Data
– GPU to Main Memory Bottleneck
– Potentially more Data Reduction
• Blazegraph, Graphsql
Fast Graph Database that can use GPU
FPGA Potential
• Field-Programmable Gate Array
– is an integrated circuit designed to be configured
by a customer or a designer after manufacturing
– Programmable Integrated Circuit
• Could be used to enhance In Memory DBs
• Intel bought Altera back in June 2015
– Will roll technology out into Data Center
Hardware Transactional Memory
• Simplifies Concurrent Programming
– Group of Load & Store Instructions
– Can Execute Atomically
• Hardware or Software Transactional Memory
• Intel TSX
– Transactional Synchronization Extensions
– Available in some Skylake Processors
– Added to Haswell/Broadwell but Disabled
3D XPoint Memory
• Announced by Intel & Micron in July 2015
• 1000 times more Durable than Flash
• Like DRAM that has Permanence
• Latency 10 times faster than NAND SSD
• 4-6 Times slower than DRAM
Thanks for help with Market Research
• Dan Khasis
• Niall Dalton
• Jeff Cordova – Wavefront
• SapHanaTutorial.com

Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档208367051
 

Último (20)

1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docx
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
 

In memory databases presentation

  • 1. In Memory Databases An Overview By John Sullivan john@inmemory.net
  • 3. Features • Data is stored sequentially by Row • Essentially an Array / List Structure • Easy to Add / Update / Insert /Delete • Need to read entire Row to get to one Column’s Data
  • 5. Features • Data is stored by Column • Faster to Read a few Columns • Very Hard to Update / Insert • Reading Data Sequentially from Column, CPU Cache Friendly
  • 7. Compressed Column Store • Column Array is converted into 2 arrays –One array contains a list of sorted Unique Values –Another array containing an integer index to the values
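The two-array layout described on this slide can be sketched in a few lines of Python. This is only an illustration of the idea (dictionary encoding); the function names and sample data are assumptions and do not come from the presentation.

```python
def dictionary_encode(column):
    """Split a column into (sorted unique values, one integer index per row)."""
    values = sorted(set(column))                      # array 1: sorted unique values
    position = {v: i for i, v in enumerate(values)}   # value -> slot in array 1
    indexes = [position[v] for v in column]           # array 2: index per row
    return values, indexes


def dictionary_decode(values, indexes):
    """Rebuild the original column from the two arrays."""
    return [values[i] for i in indexes]


# Hypothetical sample column with repeated values.
cities = ["Cork", "Dublin", "Cork", "Galway", "Dublin", "Cork"]
values, indexes = dictionary_encode(cities)
print(values)   # ['Cork', 'Dublin', 'Galway']
print(indexes)  # [0, 1, 0, 2, 1, 0]
```

The more a column repeats itself, the smaller the unique-values array becomes relative to the row count, which is where the compression comes from.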
  • 8. SQLite • Opened by Special Filename :memory: • Designed for Single Process / File • Great for embedded systems / mobile devices, e.g. iOS Apps • Row Store, No Column Store • One Writer only. Non Server Based. • Free & Open Source
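As a minimal sketch of the special filename mentioned above, Python's standard `sqlite3` module opens an in-memory database when given `:memory:`. The table and rows are made up for illustration.

```python
import sqlite3

# ":memory:" creates a private database that lives only in RAM
# and disappears when the connection is closed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 3), ("bob", 5), ("alice", 2)],  # hypothetical sample rows
)
totals = conn.execute(
    "SELECT customer, SUM(qty) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
conn.close()
print(totals)  # [('alice', 5), ('bob', 5)]
```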
  • 9. Excel • Power Pivot, Introduced in Excel 2010 • Non SQL Query Language • Data Analysis Expressions (DAX) • Syntax similar to Excel Formulae • Requires Pro version of Office or Excel
  • 10. Tableau • Primarily a Visualization Tool • Tableau Data Extracts (TDE) • Compressed Column Store • Generates one table flat Extract from Source ( that may involve joins ) • Uses ODBC / OLEDB For Extraction • Only loads required columns from Extract
  • 11. Qlik • One of the Original Developers in Compressed Columnar In Memory Analytics • Nice Dashboards • Incremental Updates • Autojoins Fields based on Field Name • Scripting Language for Generating QVD Files
  • 12. Qlik Load Script Example Companies: LOAD id AS COMPANY_ID, name as COMPANY_NAME, postcode AS COMPANY_POSTCODE, address AS COMPANY_ADDRESS, If(id > 100, 1, 0) AS FLAG_NATIONAL; SQL SELECT id, name, postcode, address FROM database.Companies;
  • 13. Monet DB • Pioneer in Columnar Databases • Research Focussed out of the Netherlands • Open Source • Can Cache Expensive Computations and Reuse • Early versions were used by Data Distilleries, which got bought out by SPSS • R Integration
  • 14. SQL Server Enterprise • ColumnStore Indexes –Data is stored by column. –Blocks of 1,048,576 Values • InMemory OLTP (MEMORY_OPTIMIZED=ON) after Create Table Data/Delta files of 128 MB
  • 15. Oracle • TimesTen – Works with Oracle Database as a Cache – Telecoms and Financial Companies • Oracle 12 Enterprise – Row & Column Formats – In Memory Columnstore • Exalytics
  • 16. SAP Hana • Pure In Memory Database • In Memory OLTP Rowstore • In Memory Columnstore – Up to 2^31 rows per block • Cluster Large Fact tables across nodes • Hana One Available on EC2 & IBM
  • 18. Memsql • Pure In Memory Database • Mysql Wire Protocol Compatible • Lockfree Linked Lists and Skiplists • SQL Queries compiled into C++ • Split Large Tables Across Nodes • Column Store Aimed at Analytics • Apache Spark Integration
  • 20. Clustered Databases • Amazon Redshift • EMC Greenplum • IBM Netezza • HP Vertica • Teradata
  • 21. Other In Memory Players • Sisense BI Focussed • Parstream Cisco Owned • Domo SAAS BI Company. Omniture Founder • Iri • InsightSquared BI Focussed • VoltDB Java Stored Procedure Unit of Exec • Infobright Open Sourced based on Mysql • KDB Focussed on HFT / Terse
  • 22. InMemory.Net public static void testDoublePerformance() { double total = 0; for (int kk = 0; kk < 1000000000; kk++) { total += kk; } Console.WriteLine(total); }
  • 23. Results • Ran in about 2.5 seconds for a billion Rows • 400 million rows per second on a Single Core • About 50% of the performance of a C++ Program • 1.6 billion / second when running on 4 Cores • 2.0 billion / second when running with HT Cores
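The throughput figures on this slide follow from simple arithmetic, worked through below. The variable names are illustrative; the numbers are the ones quoted on the slide.

```python
rows = 1_000_000_000           # one billion loop iterations
single_core_seconds = 2.5      # measured run time quoted on the slide

single_core_rate = rows / single_core_seconds   # rows per second on one core
four_core_rate = 1.6e9                          # quoted rate on 4 cores
ht_rate = 2.0e9                                 # quoted rate with hyper-threading

speedup_4_cores = four_core_rate / single_core_rate
speedup_ht = ht_rate / single_core_rate

print(single_core_rate)   # 400000000.0
print(speedup_4_cores)    # 4.0
print(speedup_ht)         # 5.0
```

On these numbers the four-core run scales linearly, and hyper-threading adds roughly another 25% on top.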
  • 24. Initial Version • InMemoryColumn<T> { Dictionary <T,int> initialValuesDict; List <int> initialIndexes; T [] finalValues; int [] finalIndexes; • }
  • 25. Next Version • InMemoryColumn<T> { Dictionary <T,int> initialValuesDict; int [][] initialIndexes; T [] finalValues; int [] finalIndexes; • }
  • 26. Final Version • InMemoryColumn<T> { Dictionary <T,int> initialValuesDict; byte/ushort/int [][] initialIndexes; T [] finalValues; byte/ushort/int [] finalIndexes; • }
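The move from `int` indexes to `byte/ushort/int` indexes suggests picking the narrowest integer type that can address every unique value. A rough Python sketch of that choice, using the standard `array` module, is shown here; the helper names and thresholds are assumptions rather than the product's actual code.

```python
from array import array


def index_typecode(distinct_count):
    """Pick the narrowest array typecode able to hold indexes 0..distinct_count-1."""
    if distinct_count <= 2 ** 8:
        return "B"   # unsigned 8-bit, like C# byte
    if distinct_count <= 2 ** 16:
        return "H"   # unsigned 16-bit, like C# ushort
    return "i"       # signed int, typically 32-bit


def pack_indexes(indexes, distinct_count):
    """Store the per-row indexes in the narrowest suitable array."""
    return array(index_typecode(distinct_count), indexes)


packed = pack_indexes([0, 1, 2, 1, 0], distinct_count=3)
print(packed.typecode, packed.itemsize)  # B 1
```

For a low-cardinality column this cuts index storage to a quarter of what a 32-bit index would need.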
  • 27. ANTLR to Parse Queries grammar Expr; prog: (expr NEWLINE)* ; expr: expr ('*'|'/') expr | expr ('+'|'-') expr | INT | '(' expr ')' ; NEWLINE : [\r\n]+ ; INT : [0-9]+ ;
  • 28. Example Rule from Grammar mainquery [ImpVars vars] returns [InMemoryQuery query ] : { $query = new InMemoryQuery(); } SELECT1 (CACHE {$query.setCache();} )? (NOCACHE {$query.setNoCache();} )? (DISTINCT {$query.setDistinct();} )? fieldclause [$query,$vars] ( (INTO label { $query.setInto ($label.text2 ) ;})? FROM tableclause [$query,$vars] ( (COMMA|CROSS JOIN ) tableclause [$query,$vars] ) * (WHERE whereclause [$query,$vars])? (GROUP BY groupclause [$query,$vars])? (HAVING havingclause [$query,$vars])? (ORDER BY orderclause [$query,$vars])? (LIMIT limitclause [$query,$vars])? )? ;
  • 29. Code Generation • Generate C# To Evaluate Query • Compiled Code undergoes JIT for fast exec • Parameterize Constants – Simplify complex Constant Expressions • Generic Table / Column Naming • Reuse Generated Code
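A loose Python analogue of the code generation points above: source is generated once per query shape, constants are passed in as parameters instead of being baked into the text, and the compiled function is cached for reuse. The presentation generates C#; everything named here (`compile_filter`, `scan`, `const_0`) is a hypothetical stand-in.

```python
_compiled = {}  # cache: query shape -> compiled scan function


def compile_filter(operator):
    """Return a scan function for 'column <operator> constant', compiling it once."""
    if operator not in ("==", "<", ">"):
        raise ValueError(f"unsupported operator: {operator}")
    if operator not in _compiled:
        # The constant is a parameter (const_0), so the same generated code
        # serves every query that differs only in its constant value.
        source = (
            "def scan(column, const_0):\n"
            f"    return [row for row, v in enumerate(column) if v {operator} const_0]\n"
        )
        namespace = {}
        exec(source, namespace)
        _compiled[operator] = namespace["scan"]
    return _compiled[operator]


greater_than = compile_filter(">")
print(greater_than([5, 1, 9], 4))  # [0, 2]
print(greater_than([5, 1, 9], 8))  # [2]
```

Because the generated text never mentions a real table or column name, two queries with the same shape map to the same cache entry.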
  • 30. Detail Queries • Detail Query –Initial List Algorithm –Improved by using Arrays of Arrays –Only one thread works on one Array
  • 31. SELECT customerid FROM Orders for (int tab1_counter = rowStart; tab1_counter < rowEnd; tab1_counter++) { groupRowD1 = groupRowCount >> 14; groupRowD2 = groupRowCount & 16383; if (groupRowD2 == 0) { if (groupRowD1 > 0) { blockCounts[groupRowD1 - 1] = 16384; } lock (lock_newBlockObject) { groupRowCount = nextRecordD1 << 14; nextRecordD1++; } groupRowD1 = groupRowCount >> 14; t_total0[groupRowD1] = new byte[16384]; total0 = t_total0[groupRowD1]; }; total0[groupRowD2] = val_t1_c1[tab1_counter]; groupRowCount++; if ((groupRowCount & 16383) == 0) { blockCounts[groupRowD1] = 16384; } }
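The shift and mask arithmetic in the generated loop above (`>> 14` and `& 16383`) addresses a result stored as an array of 16,384-element arrays. The following single-threaded Python sketch shows only that addressing scheme; it leaves out the locking and per-thread block ownership of the original, and the class name is an assumption.

```python
BLOCK_BITS = 14
BLOCK_SIZE = 1 << BLOCK_BITS   # 16384 rows per block
BLOCK_MASK = BLOCK_SIZE - 1    # 16383


class BlockedColumn:
    """Append-only byte column stored as a list of fixed-size blocks."""

    def __init__(self):
        self.blocks = []
        self.count = 0

    def append(self, value):
        block_no = self.count >> BLOCK_BITS   # which block
        offset = self.count & BLOCK_MASK      # position inside the block
        if offset == 0:
            # First row of a new block: allocate it, existing blocks never move.
            self.blocks.append(bytearray(BLOCK_SIZE))
        self.blocks[block_no][offset] = value
        self.count += 1

    def __getitem__(self, row):
        if not 0 <= row < self.count:
            raise IndexError(row)
        return self.blocks[row >> BLOCK_BITS][row & BLOCK_MASK]


col = BlockedColumn()
for i in range(40000):
    col.append(i % 256)        # byte-sized dictionary indexes
print(len(col.blocks))         # 3
```

Growing by whole blocks avoids the copy-on-resize cost of a single list, which is presumably why the slides describe arrays of arrays as an improvement over the initial list algorithm.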
  • 32. Aggregative Queries • Group Cardinality =1 • Group Cardinality < 500k – Use Arrays of Arrays, – Lookup Key being Group Index • Group Cardinality > 500k – Use Dictionaries to Correlate Group Index -> Storage – Arrays of Arrays
  • 33. SELECT customer, SUM(1) FROM orders WHERE employee=1 GROUP BY customer for (int tab1_counter = rowStart; tab1_counter < rowEnd; tab1_counter++, newRow = false) { if ((val_t1_c2[tab1_counter] == const_0_t1_c2)) { rowIndex = val_t1_c1[tab1_counter]; if (groupRowExists[rowIndex] == 0) newRow = true; groupRowExists[rowIndex] = 1; total1[rowIndex] += const_0; if (newRow) { total0[rowIndex]=val_t1_c1[tab1_counter]; } } }
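The generated loop above aggregates directly into arrays indexed by the customer column's dictionary index, so no hash lookup is needed per row. A simplified Python sketch of the same idea follows; the function and parameter names are illustrative, and it returns group indexes rather than decoded customer values.

```python
def count_by_customer(customer_idx, employee_idx, employee_const, customer_cardinality):
    """SELECT customer, SUM(1) ... WHERE employee = const GROUP BY customer,
    computed over dictionary-encoded columns."""
    group_exists = bytearray(customer_cardinality)  # 1 if the group has any row
    totals = [0] * customer_cardinality             # SUM(1) per group index
    for row in range(len(customer_idx)):
        if employee_idx[row] == employee_const:     # WHERE employee = const
            group = customer_idx[row]               # the group index is the array slot
            group_exists[group] = 1
            totals[group] += 1
    return [(g, totals[g]) for g in range(customer_cardinality) if group_exists[g]]


# Hypothetical encoded columns: 3 distinct customers, filter on employee index 1.
result = count_by_customer(
    customer_idx=[0, 1, 0, 2, 1, 0],
    employee_idx=[1, 1, 2, 1, 1, 1],
    employee_const=1,
    customer_cardinality=3,
)
print(result)  # [(0, 2), (1, 2), (2, 1)]
```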
  • 34. COUNT DISTINCT • Initial Algorithm used Byte [] • Used lots of Memory on Large Cores • Upgraded to 1 [] across all Cores • Interlocked.CompareExchange to set Bit • Hashmap for initial Values • Then switch to byte []
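One way to picture the bit-setting approach to COUNT DISTINCT is a bitmap with one bit per dictionary index. The sketch below is single-threaded Python; the original sets bits atomically with `Interlocked.CompareExchange` on a shared array, which is not reproduced here.

```python
def count_distinct(indexes, cardinality):
    """Count distinct dictionary indexes using one bit per possible value."""
    bitmap = bytearray((cardinality + 7) // 8)
    distinct = 0
    for i in indexes:
        byte_no = i >> 3          # which byte holds this value's bit
        bit = 1 << (i & 7)        # which bit inside that byte
        if not bitmap[byte_no] & bit:
            bitmap[byte_no] |= bit
            distinct += 1         # first time this value has been seen
    return distinct


print(count_distinct([0, 3, 3, 9, 0, 9, 1], cardinality=10))  # 4
```

A bitmap needs one eighth of the memory of a byte-per-value array, which matters when every core would otherwise hold its own copy.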
  • 35. Subqueries • Subquery in Table clause can be materialized into temp table ( CACHE ) • Simplify Subquery ( NOCACHE ) – Only Fields Parent SELECT Requires – Pass Through Parent WHERE Clause
  • 36. JOINS • LEFT & INNER JOIN SUPPORT • Merge Parent & Child Column Values • Parent Value -> Child Indexes • ONE to ONE – Join becomes an Array Lookup • ONE to Many – Join Becomes for Loop
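The "Parent Value -> Child Indexes" idea can be sketched as a lookup from each join key to the child rows holding it, after which the join is a loop over matching rows. This Python version uses a plain dictionary for brevity, whereas the slide implies arrays over merged dictionary indexes; all names are hypothetical.

```python
def build_child_lookup(child_keys):
    """Map each join key to the list of child row numbers that carry it."""
    lookup = {}
    for row, key in enumerate(child_keys):
        lookup.setdefault(key, []).append(row)
    return lookup


def inner_join(parent_keys, child_keys):
    """Return (parent_row, child_row) pairs for matching keys."""
    lookup = build_child_lookup(child_keys)
    return [(p, c) for p, key in enumerate(parent_keys) for c in lookup.get(key, ())]


def left_join(parent_keys, child_keys):
    """Like inner_join, but unmatched parent rows are kept with child_row None."""
    lookup = build_child_lookup(child_keys)
    pairs = []
    for p, key in enumerate(parent_keys):
        matches = lookup.get(key)
        if matches:
            pairs.extend((p, c) for c in matches)  # one-to-many: loop over children
        else:
            pairs.append((p, None))
    return pairs


parents = ["a", "b", "c"]
children = ["b", "a", "b"]
print(inner_join(parents, children))  # [(0, 1), (1, 0), (1, 2)]
print(left_join(parents, children))   # [(0, 1), (1, 0), (1, 2), (2, None)]
```

When every key has at most one child row, the inner list collapses to a single entry and the join reduces to the array lookup the slide describes for the one-to-one case.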
  • 37. Query Simplification • Rewrite Aggregate Queries with Expressions SELECT SUM(1) / SUM (qty ) FROM Orders SELECT SUM(1) as A, SUM(QTY) as B from Orders SELECT A/B FROM TEMP_QUERY
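The rewrite above can be checked with any SQL engine. The sketch below uses Python's `sqlite3` module with made-up data; the `* 1.0` is added only because SQLite would otherwise perform integer division, and is not part of the slide's query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (qty INTEGER)")
conn.executemany("INSERT INTO Orders VALUES (?)", [(2,), (3,), (5,)])

# Original form: an expression over two aggregates.
direct = conn.execute(
    "SELECT SUM(1) * 1.0 / SUM(qty) FROM Orders"
).fetchone()[0]

# Rewritten form: plain aggregates first, then the expression over their results.
staged = conn.execute(
    "SELECT A * 1.0 / B FROM "
    "(SELECT SUM(1) AS A, SUM(qty) AS B FROM Orders) AS TEMP_QUERY"
).fetchone()[0]
conn.close()

print(direct, staged)  # 0.3 0.3
```

The inner query only ever computes simple aggregates, so the aggregation code path does not need to understand arbitrary expressions.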
  • 38. More Simplifications • Group Expressions with 1 Database Field e.g. Group by Month ( OrderDate ) Inner Join OrderDate to Table of Its Unique Values and Month ( OrderDate ) • Remove Redundant Group By Parts Group BY OrderDate , Month ( OrderDate ) becomes Group BY OrderDate
  • 39. HAVING Clause • Convert to two Queries • One Query without Having Clause • Having Clause becomes Where of Second Query
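The two-query rewrite of HAVING can be demonstrated with `sqlite3`; the table, data, and threshold are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 3), ("bob", 5), ("alice", 4), ("carol", 1)],
)

# Single query with a HAVING clause.
with_having = conn.execute(
    "SELECT customer, SUM(qty) AS total FROM orders "
    "GROUP BY customer HAVING SUM(qty) > 4 ORDER BY customer"
).fetchall()

# Equivalent two-step form: aggregate without HAVING, then filter with WHERE.
two_step = conn.execute(
    "SELECT customer, total FROM "
    "(SELECT customer, SUM(qty) AS total FROM orders GROUP BY customer) AS grouped "
    "WHERE total > 4 ORDER BY customer"
).fetchall()
conn.close()

print(with_having)  # [('alice', 7), ('bob', 5)]
print(two_step)     # [('alice', 7), ('bob', 5)]
```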
  • 40. Function List String Functions CAST | CAST_STR_AS_INT | CAST_STR_AS_DECIMAL | CHAR | CHARINDEX | COALESCE | CONCAT | CSTR | ENDSWITH | INSERT | ISNULL | ISNULLOREMPTY | LEFT | LEN | LCASE | LTRIM | REMOVE | REPLACE | REVERSE | RIGHT | RTRIM | SUBSTRING | STARTSWITH | TRIM | UCASE Date Functions CDATE | DATEADD | DATEDIFF | DATEDIFFMILLISECOND | DATEPART | DATESERIAL | DAY | DAYOFWEEK | MONTH | TRUNC | YEAR Math ABS | CAST_NUM_AS_BYTE | CAST_NUM_AS_DECIMAL | CAST_NUM_AS_DOUBLE | CAST_NUM_AS_INT | CAST_NUM_AS_LONG | CAST_NUM_AS_SHORT | CAST_NUM_AS_SINGLE | FLOOR | LOG | MAX | MAXLIST | MIN | MINLIST | POWER | RAND | ROUND | SIGN | SQRT Trigonometric ASIN | ACOS | ATAN | ATAN2 | COS | COSH | SIN | SINH | TAN | TANH Aggregate Functions MIN | MAX | COUNT | AVG | SUM | COUNT ( DISTINCT() ) | MINLIST | MAXLIST Statistical Functions STDEV| STDEVP | VAR | VARP
  • 41. Special Cases • SELECT COUNT ( DISTINCT CUSTOMER ) FROM ORDERS • Answer is the Number of Unique Customer Values • SELECT DISTINCT CUSTOMER FROM ORDERS • Answer is the List of Unique Customer Values
  • 42. Importing Data DATASOURCE a1=ODBC 'dsn=ir_northwind' IMPORT Customers=a1.customers IMPORT Products=a1.{SELECT * FROM Products} IMPORT orders=a1.'somequery.sql' SAVE
  • 43. Importing Data II • ODBC / OLEDB / DOT NET Providers • Special ME Datasource • Existing In Memory Databases • UNION ALL Between Sources • SLURP Command • Variables, Expressions & IF
  • 44. Interfacing to the Database • Native Dot Net API • Dot Net Data Provider • COM/ ACTIVEX API • ODBC Driver C / C++ IO Licensed ODBC Kit Parameterized Queries + Cursor Support
  • 45. Hard Learned Lessons • Allocate and Store Variables Relating to One Thread Sequentially. Don’t intermix • Xeon Servers with Maxed out memory can have slower memory access speed – 1 Rank 1,866 MHz – 2 Ranks 1,600 MHz – 3 Ranks 1,333 MHz
  • 46. Bitcoin Mining / HFT • CPUS • GPUs • FPGAs • Dedicated Mining Chip
  • 47. GPU & InMemory Databases • GPUDB, MAPD – Good for Visualising Billions of Points – GPUs can run thousands of Cores on Data – GPU to Main Memory Bottleneck – Potentially more Data Reduction • Blazegraph, Graphsql Fast Graph Database that can use GPU
  • 48. FPGA Potential • Field-Programmable Gate Array – is an integrated circuit designed to be configured by a customer or a designer after manufacturing – Programmable Integrated Circuit • Could be used to enhance In Memory DBs • Intel bought Altera back in June 2015 – Will roll technology out into Data Center
  • 49. Hardware Transactional Memory • Simplifies Concurrent Programming – Group of Load & Store Instructions – Can Execute Atomically • Hardware or Software Transactional Memory • Intel TSX – Transactional Synchronization Extensions – Available in some Skylake Processors – Added to Haswell/Broadwell but Disabled
  • 50. 3D XPoint Memory • Announced by Intel & Micron June 2015 • 1000 times more Durable than Flash • Like DRAM that has Permanence • Latency 10 times faster than NAND SSD • 4-6 Times slower than DRAM
  • 51. Thanks for help with Market Research • Dan Khasis • Niall Dalton • Jeff Cordova – Wavefront • SapHanaTutorial.com