Acunu: Understanding Massive Data.
We are witnessing at least two revolutions in storage: (1) massive datasets and workloads, and (2) the rise of
scale-out commodity hardware. This whitepaper describes the Acunu Data Platform, and how Acunu is allowing
massive data workloads to take full advantage of today's hardware.

Acunu is rewriting the storage stack in the Linux ker-
nel for Massive Data thanks to world-class engineer-
ing and algorithms research.

Massive Data Workloads.

How have workloads changed? The workloads that massive
datasets demand of hardware typically exhibit three main
features:

•   Continuously high ingest rates (many thousands of
    updates/s, typically high-entropy, random updates)

•   Individual pieces of data are small, and aren't valuable
    in isolation (for example, stock ticks or session IDs)

•   Continual range queries are important for analytics
    (such as those demanded by Apache Hadoop)

This is in stark contrast to the 'load, then query'
regimes of more traditional databases.

Understanding massive data means being able to
extract features and trends while the data is
continually updated. Existing platforms and
solutions cannot do this at scale with predictably
high performance. This is where Acunu comes in.

The ļ¬rst revolution is the rise of non-relational, or
ā€˜nosqlā€™ data bases such as Cassandra, and analyt-
ics frameworks and tools such as Hadoop. The driving force is using clusters of commodity machines to ingest large
volumes of data, process it, and serve it. Previous technologies such as mysql are traditionally cumbersome to operate
at the scales needed here. For many deployments in both enterprise and non-enterprise settings, these technologies
are likely to account for the majority of data stored where features such as high availability at low cost are more impor-
tant than transactional durability.

The second revolution is a hardware one. Commodity machines now typically possess many cores, and bear closer
resemblance to a supercomputer of the 90s than a desktop of the same era. Hard drive capacity and sequential
bandwidth have been doubling every 18 months, as predicted; yet random IO performance has not improved. Solid-state
drives (SSDs) offer 2-3 orders of magnitude better random IO performance than hard drives. Clearly these have huge
potential to revolutionize the database world, if only the software stack can harness and utilize their performance.
Acunu's proposition - reengineering the stack for massive data.

These two revolutions expose a new problem. The "storage stack", which
abstracts away details of the hardware and allows applications to
communicate with it, is now a serious bottleneck. It was built for the
needs of databases and hardware of the 90s. The result is twofold.
First, it presents fundamentally the wrong abstraction for Massive Data
applications, which developers either work around or accept. Second, it
simply cannot be easily modified to take advantage of new storage
technologies - the assumptions underlying rotational drives are implicit
throughout it.

Acunu is taking the difficult, but fundamental, step of reengineering the storage stack for the age of Massive
Data. We've thrown almost 30 world-class engineers, including over 10 PhDs, mathematicians, and Cambridge, Oxford and
Stanford academics at the problem. The result is a set of core components, rearchitected from the ground up.

Why is this important?

          "It's disruptive if it's a 10x benefit, because that's a platform for creating opportunities for new ecosystems."
                                         - Reid Hoffman, Data as Web 3.0 (SXSW 2011)

By revisiting the core storage stack, Acunu is able to provide a
platform for Massive Data applications. This allows us to do things
such as improve Apache Cassandra performance by almost 100x for heavy
workloads, give it predictable performance (removing memory garbage
collection problems), support SSDs with high write rates and guaranteed
endurance, let multiple Massive Data stores interoperate (do you want to
ingest via memcached and analyze via Cassandra?), offer fundamentally
new features (such as full versioning - snapshots and clones - while
doing fast inserts) via patented algorithms, and lots more.

We don't need yet another database - we need a firm foundation on which to understand massive data.
Acunu Data Platform.
The Acunu Data Platform is a powerful storage solution that brings simpler, faster and more predictable performance to
NoSQL stores like Apache Cassandra.

Our view is that the new data-intensive workloads that are increasingly
common are a poor match for the legacy storage systems they tend to run
on. These systems are built on a set of assumptions about the capacity
and performance of hardware that are simply no longer true. The Acunu
Data Platform is the product of a radical rethink of those assumptions;
the result is high performance from low-cost commodity hardware.

Open Storage Core.

The Acunu Storage Core is an open-source, in-kernel, industrial-strength, write-optimized, multi-dimensional,
fully-versioned key-value store. It contains the majority of our techniques that provide extremely high, predictable
performance. It is open-source under GPLv2, and can be downloaded for free from www.acunu.com.

Interoperability of multiple data stores.

By running on the Acunu Data Platform, we are able to allow multiple data stores to interoperate. For example,
applications can write to the store using memcached (running on Acunu), and then perform analysis on the same data using
Apache Cassandra, or the Hadoop framework (running on Acunu). Using Acunu's versioning and advanced isolation
tools, views of large data sets can be updated atomically and isolated from one another.

Powered by Acunu.

We provide user-level client libraries to allow applications to run on the Storage Core. Typically, a small patch or plugin
is required in order for the application to use the Acunu client libraries. Version 1 ships with the Acunu Distribution for
Apache Cassandra, and a large object store that talks the same protocol as Amazon's S3 store, based on Project
Voldemort. As time goes on, we will release more patches, and we will look to the community to contribute patches for
various projects. We will make all of these open and freely available.
Monitor and control the entire stack, over
the whole cluster.

To make all of this easier to use, we have also
produced some snazzy management tools.
These are web-based and follow the same
decentralized model as Cassandra: simply point
your web browser at any of the boxes running
Acunu's software and you will be able to create
a cluster, do snapshots and clones, or see what
is happening across your Acunu storage nodes.

Since the Acunu platform replaces the file system and page cache, it has direct hardware
access and unprecedented hardware visibility. This means that Acunu's monitoring tools can
observe and directly control such things as disk queues, latencies throughout the stack, and
much more. One can quickly diagnose hardware bottlenecks and inefficiencies up and down the
stack, across the entire cluster.
Fundamental research = new possibilities.
The Acunu Storage Core is based on fundamental, patent-pending algorithms and engineering research. This isn't just
a better implementation of an existing idea, or about a shinier UI or management console (although our management
stack is also pretty cool). We are doing world-class research, engineering and patenting, and we publish at top
conferences. Why? This allows us to do things simply not possible before. Here are some examples.

Fast, full versioning.

Versioning of large data sets is an incredibly powerful
tool. Not just low-performance snapshots for backups,
but high-performance, concurrently accessible
clones and snapshots of live datasets for test and
development, offering many users different, writeable
views of the same large dataset, going back in time,
and much more.

Traditionally, the state of the art in algorithms for versioning large data sets is based on a data structure
known as the 'copy-on-write B-tree' (CoW B-tree) - this is ubiquitous in file systems and databases
including ZFS, WAFL, Btrfs, and more. The CoW B-tree (and most of its variants, such as append-only trees, log file
systems, redirect-on-write, etc.) has three fundamental problems - (1) it is space-inefficient (and thus requires frequent
garbage collection); (2) it relies on random IO to scale (and thus performs poorly on rotational drives); and (3) it cannot
perform fast updates, even on SSDs.
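
To make the space cost of copy-on-write concrete, here is a minimal sketch of path copying. It is an illustration only: it uses a plain binary search tree rather than a real B-tree, the names `Node`, `insert` and `lookup` are hypothetical, and it is not the implementation used by Acunu or by any of the file systems named above. Every update leaves older versions readable, but allocates a fresh copy of each node on the root-to-leaf path.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Immutable tree node; updates never modify an existing node."""
    key: int
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root, key, value):
    """Return (new_root, nodes_allocated). The old root remains a valid snapshot."""
    if root is None:
        return Node(key, value), 1
    if key == root.key:
        return Node(key, value, root.left, root.right), 1
    if key < root.key:
        child, n = insert(root.left, key, value)
        return Node(root.key, root.value, child, root.right), n + 1
    child, n = insert(root.right, key, value)
    return Node(root.key, root.value, root.left, child), n + 1


def lookup(root, key):
    """Ordinary binary-search lookup within one version of the tree."""
    while root is not None:
        if key == root.key:
            return root.value
        root = root.left if key < root.key else root.right
    return None


# Build version 1, keep it as a snapshot, then update one key to get version 2.
v1 = None
for k in (50, 30, 70, 20, 40):
    v1, _ = insert(v1, k, f"v1-{k}")

v2, copied = insert(v1, 20, "v2-20")

print(lookup(v1, 20), lookup(v2, 20))  # old and new versions coexist
print("nodes copied for one update:", copied)
print("untouched subtree shared:", v2.right is v1.right)
```

In this run the single-key update allocates three new nodes (the whole path from the root down to key 20), while the untouched subtrees are shared between the two versions. The superseded copies are what later has to be garbage-collected, which is the space overhead described in point (1).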

Acunu has invented a fundamentally new data structure - the Stratified B-tree - that addresses all the above problems.
Some details of this revolutionary data structure have been published: see [Twigg, Byde - Stratified B-trees and
versioned dictionaries, USENIX HotStorage'11].

Designed for SSDs

Existing storage schemes do not address the fact that SSDs require addressing in a fundamentally different way. Al-
though they present a SATA/SAS interface and are sector-addressed, this is only to allow them to be a drop-in replace-
ment for hard drives. Extracting maximum performance and lifetimes requires two things: (1) the storage stack to un-
derstand how they operate; and (2) new data structures and algorithms that exploit their design characteristics.

By understanding how SSDs fundamentally work, Acunu has been able to engineer data structures that allow unprece-
dented long-term write performance, while guaranteeing device endurance.

Not just peak performance, but predictable performance.

By eliminating JVM-based garbage collection and memory management issues, and carefully controlling hardware ac-
cess from within the Linux kernel, Acunu is able to offer predictably high performance, even under sustained high loads,
with both ingest and analytic range queries - the perfect ingredients for any real-time analytics platform. Watch carefully
in future versions as Acunu begins to deploy fundamentally new offerings here, exploiting our back-end algorithmic
advantage.
V1: Supercharging Apache Cassandra with Acunu.
A major feature of the Acunu Storage Core is its predictably high performance, even under sustained heavy load. Often,
this is more important than absolute peak performance figures - if you know what to expect from a node, then you can
add nodes to get the desired performance level. On the other hand, if performance is unpredictable, how many nodes
should you use?
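
When per-node throughput is predictable, the sizing question above reduces to a single division. The sketch below assumes throughput adds up linearly across nodes, which is an idealization; `nodes_needed` is a hypothetical helper and the figures are illustrative, not measurements from this paper.

```python
import math


def nodes_needed(target_ops_per_s: float, per_node_ops_per_s: float) -> int:
    """Smallest node count whose combined throughput meets the target,
    assuming throughput scales linearly with the number of nodes."""
    if per_node_ops_per_s <= 0:
        raise ValueError("per-node throughput must be positive")
    return math.ceil(target_ops_per_s / per_node_ops_per_s)


# Illustrative numbers only.
print(nodes_needed(200_000, 50_000))  # 4
print(nodes_needed(210_000, 50_000))  # 5
```

If the per-node figure swings widely under load, there is no single value to plug into the denominator, which is the difficulty the paragraph above points to.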

The graphs below show the difference between Acunu's Distribution for Cassandra and vanilla Apache Cassandra,
under a sustained heavy load of 50k inserts/second. It is easy to see the advantage of Acunu - the worst-case latency is
never worse than 18ms, whereas for Apache Cassandra it often exceeds 10,000ms.




Next, we consider the performance of range
queries under sustained insert load. Immedi-
ately after performing the inserts above, we
attempted to perform a large sequence of small
range queries, simulating a real-time analytics
workload.

The graph on the right shows the result. With
Acunu, Cassandra was able to sustain over 40
range queries per second (this is an area we
have not optimized for V1, and will dramatically
improve in a later release). Apache Cassandra,
by contrast, manages about 0.3 queries per
second. After about 1 hour, this improves
slightly since we manually triggered a 'major
compaction' (in practice, this is not possible
during sustained inserts).
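
For a sense of scale, the two throughput figures quoted above can be compared directly. The snippet below only does that arithmetic, using the approximate numbers from the text ("over 40" and "about 0.3" queries per second), so the results are rough.

```python
acunu_qps = 40.0    # "over 40" range queries/s with Acunu (a lower bound)
vanilla_qps = 0.3   # "about 0.3" range queries/s with Apache Cassandra

speedup = acunu_qps / vanilla_qps
seconds_per_query_vanilla = 1 / vanilla_qps

print(f"speedup: roughly {speedup:.0f}x")
print(f"vanilla: about {seconds_per_query_vanilla:.1f} s per range query")
```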
Licensing, Pricing, Support.
At launch, the Acunu Data Platform will come in two flavors:

       Enterprise Edition: The full Acunu stack, with either regular (5x8) or premium (24x7) support via phone, email
       and web at support.acunu.com. Please contact sales@acunu.com for details.

       Standard Edition: Same as Enterprise Edition, but limited to 2 nodes, with mailing list / community support.
       Free for production use.

Tested and supported.

Whatever edition and level of support you opt for, we are committed to making sure the product you use is rock-solid,
and ready for prime-time production use. Unlike other vendors of open-source software, the free version and enterprise
version of the Acunu Storage Core are the same thing, both builds subjected to the same rigorous testing and QA, in-
volving over 300 machine-hours of tests per build. Even if you use the Standard Edition, we provide detailed support
through user and developer mailing lists. For the Enterprise Edition, we offer unparalleled access to our team of support
engineers, and world-class engineers and PhDs via support.acunu.com.

Open-source.

We recognize the importance of the open source community in developing, maintaining, innovating and educating
around complex and fundamental software projects. We also recognize that, in order to become strongly adopted, our
most fundamental code should be open for anyone to examine and improve. That's why we're making the Acunu Storage
Core open-source, under the GPLv2. All of our contributions to Apache Cassandra and other open-source projects
will be released under the appropriate licenses, too. The rest of the Acunu Data Platform, including the
enterprise-grade management and monitoring tools, and additional performance packs, will be released in due course.

Community.

Acunu is committed to contributing back to the open-source communities for the products we use, and to leverage
their ability to strengthen and develop our own open-source projects (such as the Acunu Storage Core and others com-
ing in the future). We welcome all contributions and developments from the community.
About Acunu.
Acunu is reengineering the storage stack from the ground up for the age of Massive Data. Based on fundamental
algorithms research and world-class engineering, the Acunu Platform allows applications such as Apache Cassandra and
Hadoop, along with many others, to (1) drive today's commodity hardware harder than ever before, including many-core
architectures, SSDs and large SATA drives; (2) exploit new features in the Acunu Core (such as fast cloning and
versioning); and (3) obtain predictable, reliable high performance. Storage is the key to understanding Massive Data, and
gaining competitive advantage. The Acunu Open Platform lets companies do this more quickly, easily and cheaply.




Acunu was founded in 2009 by researchers and engineers from Cambridge, Oxford, and several well-known high-tech
companies. We are backed by some of Europe's top VCs, with total funding over $5.0M. We are based in London and
California.

Founders.

Dr Tim Moreton, CEO: Tim is an expert in distributed file systems. He holds a PhD from Cambridge, where he built a
distributed file system for the Xen project. He was previously at Tideway (now BMC), where he was lead engineer on a
number of data center projects.

Dr Andy Twigg, CTO: Andy has an outstanding track record of theoretical and applied computing research. He has
held positions at Cambridge University, Microsoft Research, Thomson Research and Oxford University. His PhD in 2006
on compact routing algorithms was nominated for the BCS Best Dissertation Award. He holds a Junior Research Fel-
lowship at Oxford University, where he is a member of the CS department.

Tom Wilkie, VP Engineering: Tom was one of the first UK employees at XenSource before its acquisition by Citrix in
2007. He worked on the XenCenter management stack and numerous customer projects. He has a BA in Computer
Science from Cambridge.

Dr John Wilkes, Technical Advisor: John is an advisor to Acunu. John led the Storage Systems group at HP Labs for
15 years, before moving to Google in 2008. John received his PhD from Cambridge in 1984, an Outstanding Contribu-
tion award from SNIA in 2001 and was made an ACM Fellow in 2002.

More Related Content

What's hot

FAQ on Dedupe NetApp
FAQ on Dedupe NetAppFAQ on Dedupe NetApp
FAQ on Dedupe NetAppAshwin Pawar
Ā 
Whitepaper_Cassandra_Datastax_Final
Whitepaper_Cassandra_Datastax_FinalWhitepaper_Cassandra_Datastax_Final
Whitepaper_Cassandra_Datastax_FinalMichele Hunter
Ā 
EMC Isilon Best Practices for Hadoop Data Storage
EMC Isilon Best Practices for Hadoop Data StorageEMC Isilon Best Practices for Hadoop Data Storage
EMC Isilon Best Practices for Hadoop Data StorageEMC
Ā 
Virtual SAN - A Deep Dive into Converged Storage (technical whitepaper)
Virtual SAN - A Deep Dive into Converged Storage (technical whitepaper)Virtual SAN - A Deep Dive into Converged Storage (technical whitepaper)
Virtual SAN - A Deep Dive into Converged Storage (technical whitepaper)DataCore APAC
Ā 
Big Data SSD Architecture: Digging Deep to Discover Where SSD Performance Pay...
Big Data SSD Architecture: Digging Deep to Discover Where SSD Performance Pay...Big Data SSD Architecture: Digging Deep to Discover Where SSD Performance Pay...
Big Data SSD Architecture: Digging Deep to Discover Where SSD Performance Pay...Samsung Business USA
Ā 
Hadoop and WANdisco: The Future of Big Data
Hadoop and WANdisco: The Future of Big DataHadoop and WANdisco: The Future of Big Data
Hadoop and WANdisco: The Future of Big DataWANdisco Plc
Ā 
Scaling Up vs. Scaling-out
Scaling Up vs. Scaling-outScaling Up vs. Scaling-out
Scaling Up vs. Scaling-outChristopher Nadeau
Ā 
Virtual SAN- Deep Dive Into Converged Storage
Virtual SAN- Deep Dive Into Converged StorageVirtual SAN- Deep Dive Into Converged Storage
Virtual SAN- Deep Dive Into Converged StorageDataCore Software
Ā 
Ddn 2017 10_dse_primer
Ddn 2017 10_dse_primerDdn 2017 10_dse_primer
Ddn 2017 10_dse_primerDaniel M. Farrell
Ā 
Cloud File System with GFS and HDFS
Cloud File System with GFS and HDFS  Cloud File System with GFS and HDFS
Cloud File System with GFS and HDFS Dr Neelesh Jain
Ā 
Data Domain Architecture
Data Domain ArchitectureData Domain Architecture
Data Domain Architecturekoesteruk22
Ā 
Oracle 11gR2 plain servers vs Exadata - 2013
Oracle 11gR2 plain servers vs Exadata - 2013Oracle 11gR2 plain servers vs Exadata - 2013
Oracle 11gR2 plain servers vs Exadata - 2013Connor McDonald
Ā 
NetAppā€™s Open Solution for Hadoop
NetAppā€™s Open Solution for HadoopNetAppā€™s Open Solution for Hadoop
NetAppā€™s Open Solution for HadoopNetApp
Ā 
Hadoop World Vertica
Hadoop World VerticaHadoop World Vertica
Hadoop World VerticaOmer Trajman
Ā 
SURVEY ON IMPLEMANTATION OF COLUMN ORIENTED NOSQL DATA STORES ( BIGTABLE & CA...
SURVEY ON IMPLEMANTATION OF COLUMN ORIENTED NOSQL DATA STORES ( BIGTABLE & CA...SURVEY ON IMPLEMANTATION OF COLUMN ORIENTED NOSQL DATA STORES ( BIGTABLE & CA...
SURVEY ON IMPLEMANTATION OF COLUMN ORIENTED NOSQL DATA STORES ( BIGTABLE & CA...IJCERT JOURNAL
Ā 
Hadoop and Mapreduce Introduction
Hadoop and Mapreduce IntroductionHadoop and Mapreduce Introduction
Hadoop and Mapreduce Introductionrajsandhu1989
Ā 
Netezza Deep Dives
Netezza Deep DivesNetezza Deep Dives
Netezza Deep DivesRush Shah
Ā 
IBM Pure Data System for Analytics (Netezza)
IBM Pure Data System for Analytics (Netezza)IBM Pure Data System for Analytics (Netezza)
IBM Pure Data System for Analytics (Netezza)Girish Srivastava
Ā 
Netezza vs teradata
Netezza vs teradataNetezza vs teradata
Netezza vs teradataAsis Mohanty
Ā 

What's hot (20)

FAQ on Dedupe NetApp
FAQ on Dedupe NetAppFAQ on Dedupe NetApp
FAQ on Dedupe NetApp
Ā 
Whitepaper_Cassandra_Datastax_Final
Whitepaper_Cassandra_Datastax_FinalWhitepaper_Cassandra_Datastax_Final
Whitepaper_Cassandra_Datastax_Final
Ā 
EMC Isilon Best Practices for Hadoop Data Storage
EMC Isilon Best Practices for Hadoop Data StorageEMC Isilon Best Practices for Hadoop Data Storage
EMC Isilon Best Practices for Hadoop Data Storage
Ā 
Virtual SAN - A Deep Dive into Converged Storage (technical whitepaper)
Virtual SAN - A Deep Dive into Converged Storage (technical whitepaper)Virtual SAN - A Deep Dive into Converged Storage (technical whitepaper)
Virtual SAN - A Deep Dive into Converged Storage (technical whitepaper)
Ā 
Big Data SSD Architecture: Digging Deep to Discover Where SSD Performance Pay...
Big Data SSD Architecture: Digging Deep to Discover Where SSD Performance Pay...Big Data SSD Architecture: Digging Deep to Discover Where SSD Performance Pay...
Big Data SSD Architecture: Digging Deep to Discover Where SSD Performance Pay...
Ā 
Hadoop and WANdisco: The Future of Big Data
Hadoop and WANdisco: The Future of Big DataHadoop and WANdisco: The Future of Big Data
Hadoop and WANdisco: The Future of Big Data
Ā 
Scaling Up vs. Scaling-out
Scaling Up vs. Scaling-outScaling Up vs. Scaling-out
Scaling Up vs. Scaling-out
Ā 
Virtual SAN- Deep Dive Into Converged Storage
Virtual SAN- Deep Dive Into Converged StorageVirtual SAN- Deep Dive Into Converged Storage
Virtual SAN- Deep Dive Into Converged Storage
Ā 
Ddn 2017 10_dse_primer
Ddn 2017 10_dse_primerDdn 2017 10_dse_primer
Ddn 2017 10_dse_primer
Ā 
Cloud File System with GFS and HDFS
Cloud File System with GFS and HDFS  Cloud File System with GFS and HDFS
Cloud File System with GFS and HDFS
Ā 
Data Domain Architecture
Data Domain ArchitectureData Domain Architecture
Data Domain Architecture
Ā 
Oracle 11gR2 plain servers vs Exadata - 2013
Oracle 11gR2 plain servers vs Exadata - 2013Oracle 11gR2 plain servers vs Exadata - 2013
Oracle 11gR2 plain servers vs Exadata - 2013
Ā 
Introduction to OpenStack (2012)
Introduction to OpenStack (2012)Introduction to OpenStack (2012)
Introduction to OpenStack (2012)
Ā 
NetAppā€™s Open Solution for Hadoop
NetAppā€™s Open Solution for HadoopNetAppā€™s Open Solution for Hadoop
NetAppā€™s Open Solution for Hadoop
Ā 
Hadoop World Vertica
Hadoop World VerticaHadoop World Vertica
Hadoop World Vertica
Ā 
SURVEY ON IMPLEMANTATION OF COLUMN ORIENTED NOSQL DATA STORES ( BIGTABLE & CA...
SURVEY ON IMPLEMANTATION OF COLUMN ORIENTED NOSQL DATA STORES ( BIGTABLE & CA...SURVEY ON IMPLEMANTATION OF COLUMN ORIENTED NOSQL DATA STORES ( BIGTABLE & CA...
SURVEY ON IMPLEMANTATION OF COLUMN ORIENTED NOSQL DATA STORES ( BIGTABLE & CA...
Ā 
Hadoop and Mapreduce Introduction
Hadoop and Mapreduce IntroductionHadoop and Mapreduce Introduction
Hadoop and Mapreduce Introduction
Ā 
Netezza Deep Dives
Netezza Deep DivesNetezza Deep Dives
Netezza Deep Dives
Ā 
IBM Pure Data System for Analytics (Netezza)
IBM Pure Data System for Analytics (Netezza)IBM Pure Data System for Analytics (Netezza)
IBM Pure Data System for Analytics (Netezza)
Ā 
Netezza vs teradata
Netezza vs teradataNetezza vs teradata
Netezza vs teradata
Ā 

Viewers also liked

Storage on EC2 (& Cassandra), Cassandra Workshop, Berlin Buzzwords
Storage on EC2 (& Cassandra), Cassandra Workshop, Berlin BuzzwordsStorage on EC2 (& Cassandra), Cassandra Workshop, Berlin Buzzwords
Storage on EC2 (& Cassandra), Cassandra Workshop, Berlin BuzzwordsAcunu
Ā 
Cassandra EU 2012 - Putting the X Factor into Cassandra
Cassandra EU 2012 - Putting the X Factor into CassandraCassandra EU 2012 - Putting the X Factor into Cassandra
Cassandra EU 2012 - Putting the X Factor into CassandraAcunu
Ā 
Jk Infra Mar 2010 from kspcg.research
Jk Infra   Mar 2010 from kspcg.researchJk Infra   Mar 2010 from kspcg.research
Jk Infra Mar 2010 from kspcg.researchguest96fa6181
Ā 
Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 R...
Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 R...Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 R...
Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 R...Acunu
Ā 
Acunu Analytics: Simpler Real-Time Cassandra Apps
Acunu Analytics: Simpler Real-Time Cassandra AppsAcunu Analytics: Simpler Real-Time Cassandra Apps
Acunu Analytics: Simpler Real-Time Cassandra AppsAcunu
Ā 
Cassandra internals
Cassandra internalsCassandra internals
Cassandra internalsAcunu
Ā 

Viewers also liked (7)

Storage on EC2 (& Cassandra), Cassandra Workshop, Berlin Buzzwords
Storage on EC2 (& Cassandra), Cassandra Workshop, Berlin BuzzwordsStorage on EC2 (& Cassandra), Cassandra Workshop, Berlin Buzzwords
Storage on EC2 (& Cassandra), Cassandra Workshop, Berlin Buzzwords
Ā 
Cassandra EU 2012 - Putting the X Factor into Cassandra
Cassandra EU 2012 - Putting the X Factor into CassandraCassandra EU 2012 - Putting the X Factor into Cassandra
Cassandra EU 2012 - Putting the X Factor into Cassandra
Ā 
MOKA SAP HCL
MOKA SAP HCLMOKA SAP HCL
MOKA SAP HCL
Ā 
Jk Infra Mar 2010 from kspcg.research
Jk Infra   Mar 2010 from kspcg.researchJk Infra   Mar 2010 from kspcg.research
Jk Infra Mar 2010 from kspcg.research
Ā 
Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 R...
Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 R...Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 R...
Cassandra EU 2012 - Overview of Case Studies and State of the Market by 451 R...
Ā 
Acunu Analytics: Simpler Real-Time Cassandra Apps
Acunu Analytics: Simpler Real-Time Cassandra AppsAcunu Analytics: Simpler Real-Time Cassandra Apps
Acunu Analytics: Simpler Real-Time Cassandra Apps
Ā 
Cassandra internals
Cassandra internalsCassandra internals
Cassandra internals
Ā 

Similar to Acunu Whitepaper v1

TechDay - Toronto 2016 - Hyperconvergence and OpenNebula
TechDay - Toronto 2016 - Hyperconvergence and OpenNebulaTechDay - Toronto 2016 - Hyperconvergence and OpenNebula
TechDay - Toronto 2016 - Hyperconvergence and OpenNebulaOpenNebula Project
Ā 
At the Crossroads of HPC and Cloud Computing with Openstack
At the Crossroads of HPC and Cloud Computing with OpenstackAt the Crossroads of HPC and Cloud Computing with Openstack
At the Crossroads of HPC and Cloud Computing with OpenstackRyan Aydelott
Ā 
EOUG95 - Client Server Very Large Databases - Paper
EOUG95 - Client Server Very Large Databases - PaperEOUG95 - Client Server Very Large Databases - Paper
EOUG95 - Client Server Very Large Databases - PaperDavid Walker
Ā 
PowerAlluxio
PowerAlluxioPowerAlluxio
PowerAlluxioChi-fan Chu
Ā 
Big Data and its emergence
Big Data and its emergenceBig Data and its emergence
Big Data and its emergencekoolkalpz
Ā 
From Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computersFrom Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computersRyousei Takano
Ā 
IMCSummit 2015 - Day 2 IT Business Track - Drive IMC Efficiency with Flash E...
IMCSummit 2015 - Day 2  IT Business Track - Drive IMC Efficiency with Flash E...IMCSummit 2015 - Day 2  IT Business Track - Drive IMC Efficiency with Flash E...
IMCSummit 2015 - Day 2 IT Business Track - Drive IMC Efficiency with Flash E...In-Memory Computing Summit
Ā 
Technical Report NetApp Clustered Data ONTAP 8.2: An Introduction
Technical Report NetApp Clustered Data ONTAP 8.2: An IntroductionTechnical Report NetApp Clustered Data ONTAP 8.2: An Introduction
Technical Report NetApp Clustered Data ONTAP 8.2: An IntroductionNetApp
Ā 
Sybase IQ ile Analitik Platform
Sybase IQ ile Analitik PlatformSybase IQ ile Analitik Platform
Sybase IQ ile Analitik PlatformSybase TĆ¼rkiye
Ā 
Building an analytical platform
Building an analytical platformBuilding an analytical platform
Building an analytical platformDavid Walker
Ā 
Big Data Glossary of terms
Big Data Glossary of termsBig Data Glossary of terms
Big Data Glossary of termsKognitio
Ā 
Scale-on-Scale : Part 1 of 3 - Production Environment
Scale-on-Scale : Part 1 of 3 - Production EnvironmentScale-on-Scale : Part 1 of 3 - Production Environment
Scale-on-Scale : Part 1 of 3 - Production EnvironmentScale Computing
Ā 
Setting up repositories
Setting up repositoriesSetting up repositories
Setting up repositoriesIryna Kuchma
Ā 
Red Hat Ceph Storage: Past, Present and Future
Red Hat Ceph Storage: Past, Present and FutureRed Hat Ceph Storage: Past, Present and Future
Red Hat Ceph Storage: Past, Present and FutureRed_Hat_Storage
Ā 
HPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journeyHPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journeyPeter Clapham
Ā 
OpenStack - An Overview
OpenStack - An OverviewOpenStack - An Overview
OpenStack - An Overviewgraziol
Ā 
Red hat ceph storage customer presentation
Red hat ceph storage customer presentationRed hat ceph storage customer presentation
Red hat ceph storage customer presentationRodrigo Missiaggia
Ā 

Similar to Acunu Whitepaper v1 (20)

TechDay - Toronto 2016 - Hyperconvergence and OpenNebula
TechDay - Toronto 2016 - Hyperconvergence and OpenNebulaTechDay - Toronto 2016 - Hyperconvergence and OpenNebula
TechDay - Toronto 2016 - Hyperconvergence and OpenNebula
Ā 
At the Crossroads of HPC and Cloud Computing with Openstack
At the Crossroads of HPC and Cloud Computing with OpenstackAt the Crossroads of HPC and Cloud Computing with Openstack
At the Crossroads of HPC and Cloud Computing with Openstack
Ā 
EOUG95 - Client Server Very Large Databases - Paper
EOUG95 - Client Server Very Large Databases - PaperEOUG95 - Client Server Very Large Databases - Paper
EOUG95 - Client Server Very Large Databases - Paper
Ā 
PowerAlluxio
PowerAlluxioPowerAlluxio
PowerAlluxio
Ā 
Big Data and its emergence
Big Data and its emergenceBig Data and its emergence
Big Data and its emergence
Ā 
From Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computersFrom Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computers
Ā 
IMCSummit 2015 - Day 2 IT Business Track - Drive IMC Efficiency with Flash E...
IMCSummit 2015 - Day 2  IT Business Track - Drive IMC Efficiency with Flash E...IMCSummit 2015 - Day 2  IT Business Track - Drive IMC Efficiency with Flash E...
Acunu Whitepaper v1

  • 2. Acunu: Understanding Massive Data.

    We are witnessing at least two revolutions in storage: (1) massive datasets and workloads, and (2) the rise of scale-out commodity hardware. This whitepaper describes the Acunu Data Platform, and how Acunu is allowing massive data workloads to take full advantage of today's hardware.

    Acunu is rewriting the storage stack in the Linux kernel for Massive Data thanks to world-class engineering and algorithms research.

    Massive Data Workloads.

    How have workloads changed? The workloads that massive datasets demand of hardware typically exhibit three main features:

      • Continuously high ingest rates (many thousands of updates/s, typically high-entropy, random updates)
      • Individual pieces of data are small, and aren't valuable in isolation (for example, stock ticks or session IDs)
      • Continual range queries are important for analytics (such as those demanded by Apache Hadoop)

    This is in stark contrast to the 'load, then query' regimes of more traditional databases.

    Understanding massive data means being able to extract features and trends while the data is continually updated. Existing platforms and solutions cannot do this at scale, with predictably high performance. This is where Acunu comes in.

    The first revolution is the rise of non-relational, or 'nosql', databases such as Cassandra, and analytics frameworks and tools such as Hadoop. The driving force is using clusters of commodity machines to ingest large volumes of data, process it, and serve it. Previous technologies such as MySQL are traditionally cumbersome to operate at the scales needed here. For many deployments in both enterprise and non-enterprise settings, these technologies are likely to account for the majority of data stored where features such as high availability at low cost are more important than transactional durability.

    The second revolution is a hardware one. Commodity machines now typically possess many cores, and bear closer resemblance to a supercomputer of the 90s than to a desktop of the same era. Hard drive capacity and sequential bandwidth have been doubling every 18 months, as predicted; yet random IO performance has not improved. Solid-state drives (SSDs) offer 2-3 orders of magnitude better random IO performance than hard drives. Clearly these have huge potential to revolutionize the database world, if only the software stack can harness and utilize their performance.
  • 3. Acunu's proposition - reengineering the stack for massive data.

    These two revolutions expose a new problem. The "storage stack" that abstracts away details of the hardware and allows applications to communicate with it is now a serious bottleneck. It was built for the needs of databases and hardware of the 90s. The result is twofold: first, it presents fundamentally the wrong abstraction for Massive Data applications, which developers either work around or accept; second, it cannot easily be modified to take advantage of new storage technologies, because the assumptions underlying rotational drives are implicit throughout it.

    Acunu is taking the difficult, but fundamental, step of reengineering the storage stack for the age of Massive Data. We've thrown almost 30 world-class engineers, including over 10 PhDs, mathematicians, and Cambridge, Oxford and Stanford academics, at the problem. The result is a set of core components, rearchitected from the ground up.

    Why is this important?

    "It's disruptive if it's a 10x benefit, because that's a platform for creating opportunities for new ecosystems." - Reid Hoffman, Data as Web 3.0 (SXSW 2011)

    By revisiting the core storage stack, Acunu is able to provide a platform for Massive Data applications. This allows us to do things such as: improve Apache Cassandra performance by almost 100x for heavy workloads; give it predictable performance (removing memory garbage collection problems); support SSDs with high write rates and guaranteed endurance; interoperate simultaneous Massive Data stores (do you want to ingest via memcached and analyze via Cassandra?); offer fundamentally new features (such as full versioning - snapshots and clones - while doing fast inserts) via patented algorithms; and lots more.

    We don't need yet another database - we need a firm foundation on which to understand massive data.
  • 4. Acunu Data Platform.

    The Acunu Data Platform is a powerful storage solution that brings simpler, faster and more predictable performance to NOSQL stores like Apache Cassandra. Our view is that the increasingly common data-intensive workloads are a poor match for the legacy storage systems they tend to run on. These systems are built on a set of assumptions about the capacity and performance of hardware that are simply no longer true. The Acunu Data Platform is the product of a radical rethink of those assumptions; the result is high performance from low-cost commodity hardware.

    Open Storage Core.

    The Acunu Storage Core is an open-source, in-kernel, industrial-strength, write-optimized, multi-dimensional, fully-versioned, key-value store. It contains the majority of our techniques that provide extremely high, predictable performance. It is open-source under GPLv2, and can be downloaded for free from www.acunu.com.

    Interoperability of multiple data stores.

    By running on the Acunu Data Platform, multiple data stores can interoperate. For example, applications can write to the store using memcached (running on Acunu), and then perform analysis on the same data using Apache Cassandra, or the Hadoop framework (running on Acunu). Using Acunu's versioning and advanced isolation tools, views of large data sets can be updated atomically and isolated from one another.

    Powered by Acunu.

    We provide user-level client libraries to allow applications to run on the Storage Core. Typically, a small patch or plugin is required in order for the application to use the Acunu client libraries. Version 1 ships with the Acunu Distribution for Apache Cassandra, and a large object store that talks the same protocol as Amazon's S3 store, based on Project Voldemort. As time goes on, we will release more patches, and we will look to the community to contribute patches for various projects. We will make all of these open and freely available.
  • 5. Monitor and control the entire stack, over the whole cluster.

    To make all of this easier to use, we have also produced some snazzy management tools. These are web-based and follow the same decentralized model as Cassandra: simply point your web browser at any of the boxes running Acunu's software and you will be able to create a cluster, take snapshots and clones, or see what is happening across your Acunu storage nodes.

    Since the Acunu platform replaces the file system and page cache, it has direct hardware access and unprecedented hardware visibility. This means that Acunu's monitoring tools can observe and directly control such things as disk queues, latencies throughout the stack, and much more. One can quickly diagnose hardware bottlenecks and inefficiencies up and down the stack, across the entire cluster.
  • 6. Fundamental research = new possibilities.

    The Acunu Storage Core is based on fundamental, patent-pending algorithms and engineering research. This isn't just a better implementation of an existing idea, or about a shinier UI or management console (although our management stack is also pretty cool). We are doing world-class research, engineering and patenting, and we publish at top conferences. Why? This allows us to do things simply not possible before. Here are some examples.

    Fast, full versioning.

    Versioning of large data sets is an incredibly powerful tool. Not just low-performance snapshots for backups, but high-performance, concurrently accessible clones and snapshots of live datasets for test and development, offering many users different, writeable views of the same large dataset, going back in time, and much more.

    Traditionally, the state of the art in algorithms for versioning large data sets is based on a data structure known as the 'copy-on-write B-tree' (CoW B-tree) - this is ubiquitous in file systems and databases including ZFS, WAFL, Btrfs, and more. The CoW B-tree (and most of its variants, such as append-only trees, log file systems, redirect-on-write, etc.) has three fundamental problems: (1) it is space-inefficient (and thus requires frequent garbage collection); (2) it relies on random IO to scale (and thus performs poorly on rotational drives); and (3) it cannot perform fast updates, even on SSDs.

    Acunu has invented a fundamentally new data structure - the Stratified B-tree - that addresses all the above problems. Some details of this data structure have been published: see [Twigg, Byde - Stratified B-trees and versioned dictionaries, USENIX HotStorage'11].

    Designed for SSDs.

    Existing storage schemes do not address the fact that SSDs require addressing in a fundamentally different way. Although they present a SATA/SAS interface and are sector-addressed, this is only to allow them to be a drop-in replacement for hard drives. Extracting maximum performance and lifetime requires two things: (1) a storage stack that understands how they operate; and (2) new data structures and algorithms that exploit their design characteristics.

    By understanding how SSDs fundamentally work, Acunu has been able to engineer data structures that allow unprecedented long-term write performance, while guaranteeing device endurance.

    Not just peak performance, but predictable performance.

    By eliminating JVM-based garbage collection and memory management issues, and carefully controlling hardware access from within the Linux kernel, Acunu is able to offer predictably high performance, even under sustained high loads, with both ingest and analytic range queries - the perfect ingredients for any real-time analytics platform. Watch carefully in future versions as Acunu begins to deploy fundamentally new offerings here, exploiting our back-end algorithmic advantage.
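
The space cost of copy-on-write versioning mentioned in slide 6 can be illustrated with a toy example. The sketch below is an assumption-laden illustration, not Acunu's code: it uses a plain binary search tree rather than a B-tree, it is not the Stratified B-tree, and the names `Node`, `cow_insert` and `lookup` are hypothetical. It shows that changing one key in a new version rewrites every node on the root-to-leaf path, while the old version stays readable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Immutable tree node; a new version never modifies an old node."""
    key: int
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def cow_insert(root, key, value, stats):
    """Return the root of a new version; the old root remains a valid snapshot."""
    stats["nodes_written"] += 1  # every node on the search path is rewritten
    if root is None:
        return Node(key, value)
    if key < root.key:
        return Node(root.key, root.value,
                    cow_insert(root.left, key, value, stats), root.right)
    if key > root.key:
        return Node(root.key, root.value,
                    root.left, cow_insert(root.right, key, value, stats))
    return Node(key, value, root.left, root.right)  # overwrite existing key


def lookup(root, key):
    """Standard binary-search lookup in one version of the tree."""
    while root is not None:
        if key == root.key:
            return root.value
        root = root.left if key < root.key else root.right
    return None


# Build version 1 with seven keys (insertion order gives a balanced tree).
v1 = None
for k in (4, 2, 6, 1, 3, 5, 7):
    v1 = cow_insert(v1, k, f"v{k}", {"nodes_written": 0})

# Version 2 changes a single key; count the nodes that had to be rewritten.
stats = {"nodes_written": 0}
v2 = cow_insert(v1, 3, "new", stats)
```

Here a one-key update writes three nodes (the path 4, 2, 3), and the untouched right subtree is shared between both versions. In an on-disk CoW B-tree the same effect applies to whole blocks, which is one way to read the space-inefficiency point above.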
  • 7. V1: Supercharging Apache Cassandra with Acunu.

    A major feature of the Acunu Storage Core is its predictably high performance, even under sustained heavy load. Often, this is more important than absolute peak performance figures - if you know what to expect from a node, then you can add nodes to get the desired performance level. On the other hand, if performance is unpredictable, how many nodes should you use?

    The graphs below show the difference between Acunu's Distribution for Cassandra and vanilla Apache Cassandra, under a sustained heavy load of 50k inserts/second. It is easy to see the advantage of Acunu - the worst-case latency never exceeds 18ms, whereas for Apache Cassandra it often exceeds 10,000ms.

    Next, we consider the performance of range queries under sustained insert load. Immediately after performing the inserts above, we attempted to perform a large sequence of small range queries, simulating a real-time analytics workload. The graph on the right shows the result. With Acunu, Cassandra was able to sustain over 40 range queries per second (this is an area we have not optimized for V1, and will dramatically improve in a later release). Apache Cassandra, by contrast, manages about 0.3 queries per second. After about 1 hour, this improves slightly since we manually triggered a 'major compaction' (in practice, this is not possible during sustained inserts).
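
The sizing argument in slide 7 (predictable per-node throughput lets you compute how many nodes to add) can be written as a short calculation. This is a sketch under stated assumptions: the function `nodes_needed`, the 200,000 inserts/s target and the 25% headroom are hypothetical, and it assumes the 50k inserts/second figure is a per-node rate that scales linearly with node count, which the text does not state.

```python
import math


def nodes_needed(target_ops_per_s, per_node_ops_per_s, headroom=0.0):
    """Nodes required to hit a target rate, assuming linear scaling (an assumption)."""
    if per_node_ops_per_s <= 0:
        raise ValueError("per-node throughput must be positive")
    return math.ceil(target_ops_per_s * (1 + headroom) / per_node_ops_per_s)


# Hypothetical target of 200,000 inserts/s with 50,000 inserts/s per node.
exact_fit = nodes_needed(200_000, 50_000)
with_headroom = nodes_needed(200_000, 50_000, headroom=0.25)

# Ratio of the quoted range-query rates (over 40/s versus about 0.3/s).
range_query_ratio = 40 / 0.3
```

With these inputs the exact fit is 4 nodes and 5 with headroom, and the quoted range-query rates differ by a factor of roughly 130. The calculation only works when the per-node figure holds under sustained load, which is the point the slide is making.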
  • 8. Licensing, Pricing, Support. At launch, the Acunu Data Platform will come in two ļ¬‚avors: Enterprise Edition: The full Acunu stack, with either regular (5x8) or premium (24x7) support via phone, email and web at support.acunu.com. Please contact sales@acunu.com for details. Standard Edition: Same as Enterprise Edition, but limited to 2 nodes, with mailing list / community support. Free for production use. Tested and supported. Whatever edition and level of support you opt for, we are committed to making sure the product you use is rock-solid, and ready for prime-time production use. Unlike other vendors of open-source software, the free version and enterprise version of the Acunu Storage Core are the same thing, both builds subjected to the same rigorous testing and QA, in- volving over 300 machine-hours of tests per build. Even if you use the Standard Edition, we provide detailed support through user and developer mailing lists. For the Enterprise Edition, we offer unparalleled access to our team of support engineers, and world-class engineers and PhDs via support.acunu.com. Open-source. We recognize the importance of the open source community in developing, maintaining, innovating and educating around complex and fundamental software projects. We also recognize that, in order to become strongly adopted, our most fundamental code should be open for anyone to examine and improve. Thatā€™s why weā€™re making the Acunu Stor- age Core open-source, under the GPLv2. All our our contributions to Apache Cassandra and other open-source pro- jects will be released under the appropriate licenses, too. The rest of the Acunu Data Platform, including the enterprise- grade management and monitoring tools, and additional performance packs, will be released in due course. Community. 
Acunu is committed to contributing back to the open-source communities for the products we use, and to leveraging their ability to strengthen and develop our own open-source projects (such as the Acunu Storage Core and others coming in the future). We welcome all contributions and developments from the community.
  • 9. About Acunu. Acunu is reengineering the storage stack from the ground up for the age of Massive Data. Based on fundamental algorithms research and world-class engineering, the Acunu Platform allows applications such as Apache Cassandra and Hadoop, along with many others, to (1) drive today's commodity hardware harder than ever before, including many-core architectures, SSDs and large SATA drives; (2) exploit new features in the Acunu Core (such as fast cloning and versioning); and (3) obtain predictable, reliable high performance. Storage is the key to understanding Massive Data, and gaining competitive advantage. The Acunu Open Platform lets companies do this quicker, easier and cheaper. Acunu was founded in 2009 by researchers and engineers from Cambridge, Oxford, and several well-known high-tech companies. We are backed by some of Europe's top VCs, with total funding over $5.0M. We are based in London and California. Founders. Dr Tim Moreton, CEO: Tim is an expert in distributed file systems. He holds a PhD from Cambridge, where he built a distributed file system for the Xen project. He was previously at Tideway (now BMC), where he was lead engineer on a number of data center projects. Dr Andy Twigg, CTO: Andy has an outstanding track record of theoretical and applied computing research. He has held positions at Cambridge University, Microsoft Research, Thomson Research and Oxford University. His PhD in 2006 on compact routing algorithms was nominated for the BCS Best Dissertation Award. He holds a Junior Research Fellowship at Oxford University, where he is a member of the CS department. Tom Wilkie, VP Engineering: Tom was one of the first UK employees at XenSource before its acquisition by Citrix in 2007. He worked on the XenCenter management stack and numerous customer projects. He has a BA in Computer Science from Cambridge. Dr John Wilkes, Technical Advisor: John is an advisor to Acunu. 
John led the Storage Systems group at HP Labs for 15 years, before moving to Google in 2008. John received his PhD from Cambridge in 1984 and an Outstanding Contribution award from SNIA in 2001, and was made an ACM Fellow in 2002.