An Overview of NoSQL Databases
Rich Perry
CIS 264
October 25, 2014
An Overview of NoSQL Databases | Rich Perry Page 1 of 13
Executive Summary
NoSQL refers to a relatively new, next-generation class of database systems that are not
relational database management systems (DBMS). The name is also commonly read as "Not Only SQL".
However, according to Carlo Strozzi, it would be more accurate to call it NoREL for No
Relations, or Not Relational. The term was first introduced in 1998 by Carlo Strozzi,
and then re-introduced in 2009 by Eric Evans. NoSQL offers radically different choices and
options for data storage compared to conventional relational databases.
This generation of DBMS offers much more flexibility, higher performance, higher levels
of scalability, less complexity, and different choices of functionality.
It’s about more than just rows in tables. NoSQL database systems allow data storage
and retrieval using many different formats such as key-value, column family, document,
and graph databases.
Unlike relational databases, most NoSQL databases do not support joins. Instead of joins, these
systems allow users to extract data using simple interfaces. They also have little or no database
schema and do not strictly enforce ACID transaction standards as relational databases do.
Many NoSQL systems aim for linear scalability: if you add more processors, you get a roughly
proportional increase in performance. Horizontal scalability (dividing the system among multiple
servers) is also a benefit of most NoSQL database systems.
There are more choices for storing, retrieving, and manipulating data. It's called "Not
Only SQL" because one can use SQL as well as many other query languages.
NoSQL is not only available as open source. There are many open source NoSQL
products and many commercial products that use NoSQL concepts.
It's not just for use with Big Data problems. Many people have a common misconception
that NoSQL is just for use in solving Big Data problems. NoSQL database systems are
commonly used for Big Data situations. However, NoSQL also provides alternative
solutions when flexibility, performance, and scalability are important aside from Big Data
problems.
Categories of NoSQL Databases
There are four main types of NoSQL databases. Each has its own unique advantages
and disadvantages. The four main types are:
1. Column family (Bigtable) stores
2. Key-Value stores
3. Graph stores
4. Document stores
Column family (Bigtable) stores
Column family databases can scale to manage large amounts of data. They use both row
and column identifiers as keys to find data, and are often referred to as "data stores"
rather than databases because they lack features normally found in most DBMSs. For
example, column family databases lack typed columns, secondary indexes, triggers, and
query languages.
Spreadsheets serve well as a comparison model for this type of database. Data values
are addressed by the combination of the row and column much like in an Excel
spreadsheet. Cells can contain any type of data. A cell can be populated with data at
any time or left empty.
Most, but not all, column family databases use a timestamp and a "column family" in
addition to the column name and row identifier as a multi-part key. Column family stores
are often called Bigtable stores because the tables can be enormous with billions of
rows or more.
Most rows utilize few columns out of the many possible columns. This results in most
cells in the table being empty, and is known as a sparse matrix. This type of data
structure works very inefficiently in relational databases, but column family stores are
made for this type of data storage.
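The multi-part key and the sparse layout described above can be illustrated with a small in-memory sketch. This is not how any particular column family product is implemented; the class name `ColumnFamilyStore` and its methods are hypothetical, chosen only to show that empty cells take no space and that a timestamp selects among versions of a cell.

```python
from collections import defaultdict


class ColumnFamilyStore:
    """Toy sparse store (hypothetical). Cells are keyed by
    (row, column family, column) and versioned by timestamp."""

    def __init__(self):
        # Only populated cells are stored, so empty cells cost nothing.
        self._cells = defaultdict(dict)  # key -> {timestamp: value}

    def put(self, row, family, column, value, timestamp):
        self._cells[(row, family, column)][timestamp] = value

    def get(self, row, family, column):
        """Return the newest value of a cell, or None if the cell is empty."""
        versions = self._cells.get((row, family, column))
        if not versions:
            return None
        return versions[max(versions)]

    def populated_cells(self):
        return len(self._cells)


store = ColumnFamilyStore()
store.put("user:1", "profile", "name", "Ada", timestamp=1)
store.put("user:1", "profile", "name", "Ada L.", timestamp=2)
store.put("user:2", "activity", "last_login", "2014-10-25", timestamp=1)
```

Here `store.get("user:1", "profile", "name")` returns the value written with the later timestamp, and asking for a cell that was never written simply returns `None`.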
Advantages of Column family stores
1. Can manage very large amounts of data efficiently.
2. High scalability -- Column family databases do not use joins, so they scale very
easily in distributed systems.
3. High availability. Column family systems are usually configured to store data on
multiple nodes in different geographic areas with automatic failover.
Disadvantages of Column family stores
1. Not as flexible as other NoSQL database types.
2. Minimal functionality available with most column family databases.
Key-Value Stores
Key-Value stores are some of the simplest of NoSQL databases. This type of NoSQL
database system is sometimes called the Swiss Army knife of databases, because it can
be used in many different situations.
Key-Value stores have no query language and work like dictionaries. Keys are paired
with values. Application programming interfaces (APIs) are used to add new key-value
pairs (a.k.a. put), delete key-value pairs, and retrieve a value when given a key (a.k.a.
get). In a dictionary, words are paired with definitions. The words are keys and the
definitions are the values. If the user gives the API a word (key), the API returns a
definition (value).
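The put, get, and delete operations described above map directly onto a dictionary. The following is a minimal sketch, assuming an in-memory store; the class name `KeyValueStore` is hypothetical and does not correspond to any specific product's API.

```python
class KeyValueStore:
    """Minimal in-memory key-value store (illustrative only)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Adding a pair under an existing key replaces the old value,
        # which keeps every key unique.
        self._data[key] = value

    def get(self, key):
        # Returns exactly one value for the key, or None if it is absent.
        return self._data.get(key)

    def delete(self, key):
        # Removing a missing key is treated as a no-op in this sketch.
        self._data.pop(key, None)


dictionary = KeyValueStore()
dictionary.put("database", "an organized collection of data")
dictionary.put("key", "a unique identifier used to look up a value")
definition = dictionary.get("database")
dictionary.delete("key")
```

There is no query language involved: the caller supplies a key and receives the single value stored under it.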
Keys are flexible and can be many different types of data. Some examples of types of
keys are:
• Name of an image.
• File path.
• Hash code.
• URL.
• SQL query.
• REST web service call.
Values are also flexible. They can be almost anything. Common values could be
documents, images, web pages, text, etc.
Common Attributes of Key-Value Store databases
1. All keys must be unique.
2. Keys are indexed but values are not.
3. Values can be of any data type, and different values can have different data types. In a
relational database, by contrast, values in a single column must be homogeneous.
4. Queries return one and only one value.
5. Queries must search for a key, but cannot search for a value.
Advantages of Key-Value Store databases
1. Precision service levels.
2. Precision service monitoring and notification.
3. Scalability and reliability.
4. Portability and lower operational cost.
5. Speed -- queries tend to run very quickly.
6. Simplicity -- it does not get much simpler than key-value pairs.
7. Can be used for many different applications and data storage problems.
Disadvantages of Key-Value Store databases
1. Cannot query values. Only the key can be queried. This means that the user
must know the key.
2. Does not establish relationships between data. If relationships are important,
other NoSQL types may be more appropriate.
3. Cannot return lists of values. Queries return one and only one value.
4. Values may contain any data type. This is only a problem if the user expects a
certain data type but receives another, or if the user expects the data type to
always be the same when no such guarantee is provided.
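Two of the disadvantages listed above can be seen in a few lines of code. In this sketch a plain dictionary stands in for the store, and the helper names `get_typed` and `keys_with_value` are hypothetical. Because values are not indexed, finding a value means scanning every pair, and because values are untyped, the caller has to check the type itself.

```python
# Sample data only: values of three different types under one store.
store = {
    "img:logo": b"\x89PNG",
    "page:home": "<html>...</html>",
    "count:visits": 42,
}


def get_typed(store, key, expected_type):
    """Fetch a value and fail loudly if it is not of the expected type."""
    value = store[key]
    if not isinstance(value, expected_type):
        raise TypeError(
            f"{key!r} holds {type(value).__name__}, "
            f"expected {expected_type.__name__}"
        )
    return value


def keys_with_value(store, wanted):
    """Search by value. Without a value index this is a full scan."""
    return [key for key, value in store.items() if value == wanted]


visits = get_typed(store, "count:visits", int)
matches = keys_with_value(store, 42)
```

The type check is the application's responsibility here; the store itself offers no such guarantee.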
Document Stores
This is one of the most flexible, powerful and popular types of databases in the NoSQL
movement. Key-Value and column family stores work by searching for a key and they
return a value associated with that key. These two types of databases do not index
values, or allow searching on values. Document stores work in a different manner.
They allow searching on any content within documents.
Document stores automatically index all content inside a document when it is added to
the database. This makes indexes large, but everything is searchable. A document
store API can provide a list of documents, find a single document, or find any subsection
of any document. A key-value store can store an entire document in the value area and
return that document if you search for its key, but a document store can return just a
sentence or paragraph from a large document (e.g., a book) without loading the entire
document into memory.
Tree structures are used in document store databases. The tree structures begin with a
root node that has branches. The branches can have sub-branches and those can be
divided into sub-branches indefinitely until they terminate at a leaf. The values are
stored at the leaf level.
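The automatic indexing and the tree structure described above can be sketched together. This is a simplified illustration, assuming documents are nested dictionaries whose leaf values are hashable; the class `DocumentStore`, its methods, and the slash-separated path syntax are all hypothetical rather than any product's API.

```python
from collections import defaultdict


class DocumentStore:
    """Toy document store (hypothetical) that indexes every leaf value."""

    def __init__(self):
        self._docs = {}
        self._index = defaultdict(set)  # (path, leaf value) -> document ids

    def add(self, doc_id, document):
        self._docs[doc_id] = document
        # Index all content at the moment the document is added.
        for path, value in self._leaves(document):
            self._index[(path, value)].add(doc_id)

    def _leaves(self, node, prefix=""):
        # Walk from the root through the branches down to each leaf.
        if isinstance(node, dict):
            for key, child in node.items():
                yield from self._leaves(child, f"{prefix}/{key}")
        else:
            yield prefix, node

    def find(self, path, value):
        """Return the ids of all documents with this value at this path."""
        return sorted(self._index.get((path, value), set()))

    def get_fragment(self, doc_id, path):
        """Return only the subtree or leaf at the given path."""
        node = self._docs[doc_id]
        for part in path.strip("/").split("/"):
            node = node[part]
        return node


docs = DocumentStore()
docs.add("book1", {"title": "Sample A", "chapter": {"number": 1, "summary": "Why NoSQL"}})
docs.add("book2", {"title": "Sample B", "chapter": {"number": 1, "summary": "Data models"}})
```

A search such as `docs.find("/chapter/number", 1)` matches on content rather than on a key, and `docs.get_fragment("book1", "/chapter/summary")` returns one leaf instead of the whole document. A real document store would also avoid loading the full document into memory, which this in-memory sketch does not attempt to show.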
Most document store databases also use collections to manage large numbers of
documents. Collections can be used for different purposes such as navigation, grouping
similar types of documents, and applying business rules to set different permissions,
indexes, and triggers. Collections can contain other collections and trees can have
sub-trees.
Advantages of Document Store databases
1. Very flexible.
2. High performance.
3. Variable but usually high scalability.
4. Relatively simple, but powerful APIs.
Disadvantages of Document Store databases
1. Overkill if searching inside documents is not necessary. If a user only needs
the whole document, a key-value store might be better.
2. Only well suited for storing documents. If the data is not part of a document, it
should probably be put in a different type of database.
Graph Stores
Graph stores are optimized to efficiently store nodes and links, and allow users to query
those graphs. Graph Store databases are useful for any business that has complex
relationships between objects such as social networking, rules-based engines, mashups,
and systems that must analyze complex network structures.
A graph store is a system that contains a set of nodes and relationships that
form a graph when combined. Key-Value stores have two data fields,
the key and its value. Graph stores have three data fields -- node, relationship, and
property.
Graph nodes are nouns and often represent real world objects such as people,
organizations, websites, computers on a network, or cities on routes (e.g., highways,
railways, or air routes). The relationships are the connections between the nodes.
Graph store database queries essentially traverse the nodes on the graph. They can
return information such as:
1. Shortest path between two nodes on a graph.
2. Neighboring nodes that have specific properties.
3. Similarities of neighboring nodes between two nodes.
Relationships are handled differently compared to relational databases. Graph store
databases store related nodes together and assign internal identifiers to nodes, so that they
can join networks.
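The three kinds of traversal query listed above can be sketched over a small graph. This assumes an unweighted graph held in memory as adjacency lists; the node names, properties, and function names are hypothetical sample data, and breadth-first search is used here as one common way to find a shortest path by number of hops.

```python
from collections import deque

# Hypothetical sample data: nodes with properties, relationships as adjacency lists.
nodes = {
    "A": {"kind": "city", "airport": True},
    "B": {"kind": "city", "airport": False},
    "C": {"kind": "city", "airport": True},
    "D": {"kind": "city", "airport": False},
    "E": {"kind": "city", "airport": True},
}
routes = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}


def shortest_path(routes, start, goal):
    """Breadth-first search: fewest hops from start to goal, or None."""
    if start == goal:
        return [start]
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in routes.get(node, []):
            if neighbor in previous:
                continue
            previous[neighbor] = node
            if neighbor == goal:
                # Walk the chain of predecessors back to the start.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


def neighbors_with(nodes, routes, node, **wanted):
    """Neighboring nodes whose properties match all the given values."""
    return [
        n for n in routes.get(node, [])
        if all(nodes[n].get(key) == value for key, value in wanted.items())
    ]


def common_neighbors(routes, a, b):
    """Neighbors shared by two nodes, a simple similarity measure."""
    return sorted(set(routes.get(a, [])) & set(routes.get(b, [])))
```

Each function follows relationships outward from a node rather than joining tables, which is the access pattern graph stores are built around.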
Advantages of Graph Store databases
1. Better performance compared to relational databases.
2. Designed to handle complex relationships between data.
Disadvantages of Graph Store databases
1. Difficult to scale horizontally to multiple servers. Data can be replicated on
multiple servers to enhance read performance, but writing to multiple servers that
span multiple nodes is complicated to implement.
Overall Advantages of NoSQL Databases
Scalability
More specifically, horizontal scaling is a major advantage of NoSQL database systems.
This allows organizations to distribute the database across multiple servers and nodes
rather than just buying bigger and better servers.
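One simple way to distribute a database across servers is to hash each key and use the result to pick a server. The sketch below assumes that approach, with plain dictionaries standing in for servers; `shard_for` and `ShardedStore` are hypothetical names, and real systems vary in how they partition data.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a string key to a shard number using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy key-value store split across several in-memory 'servers'."""

    def __init__(self, num_shards):
        self._shards = [{} for _ in range(num_shards)]

    def put(self, key, value):
        self._shards[shard_for(key, len(self._shards))][key] = value

    def get(self, key):
        return self._shards[shard_for(key, len(self._shards))].get(key)

    def shard_sizes(self):
        return [len(shard) for shard in self._shards]


cluster = ShardedStore(num_shards=4)
for i in range(100):
    cluster.put(f"user:{i}", {"id": i})
```

Each read or write touches only one shard, so adding servers adds capacity. With plain modulo hashing, changing the number of shards moves most keys, which is why production systems often use a scheme such as consistent hashing instead.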
Low maintenance in the future
Relational databases require highly trained and experienced DBAs and developers.
Although DBAs are probably not losing their jobs any time in the near future, NoSQL
database systems will require less maintenance, less management, less support, and
either fewer DBAs or IT professionals to do a similar job with less extensive training.
Cost
Admins can distribute a NoSQL database system across multiple low-cost machines
rather than just buying more expensive servers. Many NoSQL systems are also open
source, which usually translates into no-cost software. Relational database systems tend
to rely heavily on costly, proprietary software and hardware.
Flexibility
There are different NoSQL data models available to developers. This means that
organizations have some good choices.
Performance
Most NoSQL databases provide much higher performance compared to relational
databases. The main reason for this is that more choices are available to the IT
professionals and management. The IT department can pick which type of database is
best suited for the purpose. Key-value store databases are simple and return one value.
This means that the performance of a NoSQL database system is primarily related to its
focus. With the exception of graph type data stores, performance is also helped by
their low level of complexity.
Overall Disadvantages of NoSQL Databases
Maturity
NoSQL database systems are not mature and therefore not appropriate for all purposes
and situations. According to Herman Mehling of Database Journal, NoSQL, in general,
lacks credibility in the IT world; whereas some relational databases are known for their
rich functionality, stability, reliability, vendor support, and wealth of expertise available in
the employment pool. Relational databases have a lot of credibility with many IT
professionals, such as DBAs, developers, data architects, and IT managers and
executives.
Support
Relational databases may be expensive but their vendors provide a very high level of
support. Most NoSQL systems are open source, which means the basic software is free,
which in turn means that there is little or no support from outside your organization.
Analytics and Business Intelligence (BI)
Most business intelligence tools simply do not have interfaces for or connectivity to
NoSQL databases. Quest Software has developed Toad for Cloud Databases, which
provides limited ad hoc query support for some NoSQL systems. However, there is a
significant overall shortage of BI tools available for NoSQL.
Expertise
There is a wide and deep pool of IT professionals skilled, trained, and/or experienced
with relational databases, but not a lot who know how to use, develop, or maintain
NoSQL systems. Over time, the laws of supply and demand with education and
employment will address this problem, but it won’t happen immediately.
Compatibility
Relational databases have many standards. However, NoSQL databases have very
few, if any, standards. The lack of standards could make it very difficult to switch from
one vendor to another (assuming the organization buys vendor software rather than just
downloading open source) if the organization becomes displeased with the service.
NoSQL vs. Relational Databases
Structure
RDBMSs are designed to use only highly structured data. NoSQL databases are
designed to use unstructured or semi-structured data.
ACID
ACID stands for Atomicity, Consistency, Isolation, and Durability. It is a set of transaction
properties that ensure data integrity and keep transactions reliable. Relational databases
strictly enforce ACID, but NoSQL databases usually do not. This is generally considered
an advantage of relational databases. If your application depends on a high volume of
reliable transactions, a relational database is probably a better choice.
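Atomicity, the first of the ACID properties, can be demonstrated with SQLite through Python's standard `sqlite3` module. This is a sketch under assumptions: the `accounts` table, the `transfer` function, and the sample balances are all hypothetical. It shows a relational database undoing the first half of a transfer when the second half violates a constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move money between accounts as one all-or-nothing transaction."""
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit was undone too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice
```

After the failed transfer, the credit to the second account has been rolled back along with the rejected debit, so no money is created or lost.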
Flexibility
Relational databases are not known for being very flexible. As previously mentioned,
they structure data to a high degree and also have rigid schemas. Most NoSQL
databases are very flexible because they have little or no schema, or have a flexible
schema and use data that is unstructured or structured to a much lower degree than
relational databases.
Normalization
Relational databases need data to be normalized to at least third normal form (3NF)
in order to fully function in a relational manner and make efficient use of
joins and indexes. NoSQL databases are often designed to support queries and do so
by denormalizing data based on anticipated queries for the given database.
Denormalizing improves query performance, so this is one of several reasons that
NoSQL databases usually provide higher performance, at least for reading.
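The trade-off can be illustrated with a small example that uses plain Python data structures in place of a database. The customer and order records and the function names are hypothetical. The normalized version looks up the customer every time an order is read, while the denormalized version stores the customer name inside each order ahead of time.

```python
# Normalized: customer details live in one place and orders refer to them by id.
customers = {
    1: {"name": "Ada", "city": "London"},
    2: {"name": "Grace", "city": "New York"},
}
orders = [
    {"order_id": 10, "customer_id": 1, "total": 25.0},
    {"order_id": 11, "customer_id": 2, "total": 40.0},
    {"order_id": 12, "customer_id": 1, "total": 15.0},
]


def order_view_normalized(order_id):
    """Read an order, then 'join' to the customer record for the name."""
    order = next(o for o in orders if o["order_id"] == order_id)
    customer = customers[order["customer_id"]]
    return {
        "order_id": order["order_id"],
        "customer_name": customer["name"],
        "total": order["total"],
    }


# Denormalized: each order is stored in the shape the anticipated query needs.
orders_denormalized = {
    o["order_id"]: {
        "order_id": o["order_id"],
        "customer_name": customers[o["customer_id"]]["name"],
        "total": o["total"],
    }
    for o in orders
}


def order_view_denormalized(order_id):
    """One lookup by key, with no join at read time."""
    return orders_denormalized[order_id]
```

Both functions return the same result, but the denormalized read is a single key lookup. The cost is on the write side: if a customer's name changes, every order that copied it has to be updated.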
Conclusions
Any IT professional or manager must carefully consider the purpose before one can
intelligently choose the correct NoSQL database or even decide whether transferring
data from a relational database is appropriate.
NoSQL databases can offer significant benefits, but they are not necessarily an
appropriate solution for every data storage problem. Any organization
considering converting from a relational DBMS to a NoSQL database system should
carefully note the limitations and other issues associated with these types of databases,
the risks of having no vendor support if using open source software, and the lack of
experienced IT professionals.
Both relational databases and NoSQL databases have their place in the IT world.
NoSQL may be a better alternative for some situations. Relational databases may also
be the correct solution for some types of data storage problems. Some situations may
call for a combined solution, where both relational databases and NoSQL databases are
used.
References
McCreary, Dan and Kelly, Ann. 2014. Making Sense of NoSQL. Manning.
Brooks, Charlie. 2014. Enterprise NoSQL for Dummies. MarkLogic.
Scofield, Ben. 2010. NoSQL Death to Relational Databases(?). SlideShare.
http://www.slideshare.net/bscofield/nosql-codemash-2010 (accessed 10/17/2014).
Lith, Adam and Mattson, Jakob. 2010. "Investigating storage solutions for large data: A
comparison of well performing and scalable data storage solutions for real time
extraction and batch insertion of data" (PDF). Göteborg: Department of Computer
Science and Engineering, Chalmers University of Technology. p. 70 (accessed 10/05/2014).
"Carlo Strozzi first used the term NoSQL in 1998 as a name for his open source
relational database that did not offer a SQL interface[...]"
"NoSQL 2009". Blog.sym-link.com. 12 May 2009 (accessed 10/05/2014).
Harrison, Guy. 2010. 10 things you should know about NoSQL databases. TechRepublic.
http://www.techrepublic.com/blog/10-things/10-things-you-should-know-about-nosql-databases/
(accessed 10/17/2014).
Mehling, Herman. 2010. 10 things you Need to Know About NoSQL Databases. Database Journal.
http://www.databasejournal.com/features/article.php/3905531/10-things-you-Need-to-Know-About-NoSQL-Databases.htm
(accessed 10/17/2014).

Mais conteúdo relacionado

Mais procurados

5 Data Modeling for NoSQL 1/2
5 Data Modeling for NoSQL 1/25 Data Modeling for NoSQL 1/2
5 Data Modeling for NoSQL 1/2
Fabio Fumarola
 

Mais procurados (18)

Comparative study of no sql document, column store databases and evaluation o...
Comparative study of no sql document, column store databases and evaluation o...Comparative study of no sql document, column store databases and evaluation o...
Comparative study of no sql document, column store databases and evaluation o...
 
Chapter 5 design of keyvalue databses from nosql for mere mortals
Chapter 5 design of keyvalue databses from nosql for mere mortalsChapter 5 design of keyvalue databses from nosql for mere mortals
Chapter 5 design of keyvalue databses from nosql for mere mortals
 
Nosql
NosqlNosql
Nosql
 
MongoDB NoSQL database a deep dive -MyWhitePaper
MongoDB  NoSQL database a deep dive -MyWhitePaperMongoDB  NoSQL database a deep dive -MyWhitePaper
MongoDB NoSQL database a deep dive -MyWhitePaper
 
Introduction to NoSQL
Introduction to NoSQLIntroduction to NoSQL
Introduction to NoSQL
 
Nosql
NosqlNosql
Nosql
 
MS Sql Server: Introduction To Database Concepts
MS Sql Server: Introduction To Database ConceptsMS Sql Server: Introduction To Database Concepts
MS Sql Server: Introduction To Database Concepts
 
NoSQL Basics and MongDB
NoSQL Basics and  MongDBNoSQL Basics and  MongDB
NoSQL Basics and MongDB
 
NoSQL Basics - a quick tour
NoSQL Basics - a quick tourNoSQL Basics - a quick tour
NoSQL Basics - a quick tour
 
All About Database v1.1
All About Database  v1.1All About Database  v1.1
All About Database v1.1
 
Chapter 8(designing of documnt databases)no sql for mere mortals
Chapter 8(designing of documnt databases)no sql for mere mortalsChapter 8(designing of documnt databases)no sql for mere mortals
Chapter 8(designing of documnt databases)no sql for mere mortals
 
5 Data Modeling for NoSQL 1/2
5 Data Modeling for NoSQL 1/25 Data Modeling for NoSQL 1/2
5 Data Modeling for NoSQL 1/2
 
A STUDY ON GRAPH STORAGE DATABASE OF NOSQL
A STUDY ON GRAPH STORAGE DATABASE OF NOSQLA STUDY ON GRAPH STORAGE DATABASE OF NOSQL
A STUDY ON GRAPH STORAGE DATABASE OF NOSQL
 
NOSQL Databases types and Uses
NOSQL Databases types and UsesNOSQL Databases types and Uses
NOSQL Databases types and Uses
 
NoSQL databases
NoSQL databasesNoSQL databases
NoSQL databases
 
What is difference between dbms and rdbms
What is difference between dbms and rdbmsWhat is difference between dbms and rdbms
What is difference between dbms and rdbms
 
Non relational databases-no sql
Non relational databases-no sqlNon relational databases-no sql
Non relational databases-no sql
 
Sql Server Basics
Sql Server BasicsSql Server Basics
Sql Server Basics
 

Destaque

Программа Инновационно-инвестиционного форума
Программа Инновационно-инвестиционного форумаПрограмма Инновационно-инвестиционного форума
Программа Инновационно-инвестиционного форума
BDA
 
M4 budaya kerja dan etika
M4 budaya kerja dan etikaM4 budaya kerja dan etika
M4 budaya kerja dan etika
Illiani Fazrien
 
мост выступление
мост выступлениемост выступление
мост выступление
BDA
 
Harirayaaidilfitri 111031060126-phpapp01
Harirayaaidilfitri 111031060126-phpapp01Harirayaaidilfitri 111031060126-phpapp01
Harirayaaidilfitri 111031060126-phpapp01
Adib Danial
 
Thi thử toán hồng quang hd 2012 lần 2 k a
Thi thử toán hồng quang hd 2012 lần 2 k aThi thử toán hồng quang hd 2012 lần 2 k a
Thi thử toán hồng quang hd 2012 lần 2 k a
Thế Giới Tinh Hoa
 
Thi thử toán lương ngọc quyến tn 2012 k d
Thi thử toán lương ngọc quyến tn 2012 k dThi thử toán lương ngọc quyến tn 2012 k d
Thi thử toán lương ngọc quyến tn 2012 k d
Thế Giới Tinh Hoa
 

Destaque (20)

Lancha pop pop
Lancha pop popLancha pop pop
Lancha pop pop
 
First draft script
First draft scriptFirst draft script
First draft script
 
-11031501
-11031501-11031501
-11031501
 
Программа Инновационно-инвестиционного форума
Программа Инновационно-инвестиционного форумаПрограмма Инновационно-инвестиционного форума
Программа Инновационно-инвестиционного форума
 
Бухгалтерия в Модульбанке
Бухгалтерия в МодульбанкеБухгалтерия в Модульбанке
Бухгалтерия в Модульбанке
 
M4 budaya kerja dan etika
M4 budaya kerja dan etikaM4 budaya kerja dan etika
M4 budaya kerja dan etika
 
Battula CV
Battula CVBattula CV
Battula CV
 
4475-03 1
4475-03 14475-03 1
4475-03 1
 
Instructivo sumergete
Instructivo sumergete Instructivo sumergete
Instructivo sumergete
 
мост выступление
мост выступлениемост выступление
мост выступление
 
Thaiartist
ThaiartistThaiartist
Thaiartist
 
Harirayaaidilfitri 111031060126-phpapp01
Harirayaaidilfitri 111031060126-phpapp01Harirayaaidilfitri 111031060126-phpapp01
Harirayaaidilfitri 111031060126-phpapp01
 
учим слова – советы
учим слова – советыучим слова – советы
учим слова – советы
 
о бизнесе Пётр Ерёменко (3)
о бизнесе Пётр Ерёменко (3)о бизнесе Пётр Ерёменко (3)
о бизнесе Пётр Ерёменко (3)
 
Programacion java
Programacion   javaProgramacion   java
Programacion java
 
ELS Recommendation ENG
ELS Recommendation ENGELS Recommendation ENG
ELS Recommendation ENG
 
Test
TestTest
Test
 
Thi thử toán hồng quang hd 2012 lần 2 k a
Thi thử toán hồng quang hd 2012 lần 2 k aThi thử toán hồng quang hd 2012 lần 2 k a
Thi thử toán hồng quang hd 2012 lần 2 k a
 
Thi thử toán lương ngọc quyến tn 2012 k d
Thi thử toán lương ngọc quyến tn 2012 k dThi thử toán lương ngọc quyến tn 2012 k d
Thi thử toán lương ngọc quyến tn 2012 k d
 
Prevision
PrevisionPrevision
Prevision
 

Semelhante a NoSQL_Databases

Assignment_4
Assignment_4Assignment_4
Assignment_4
Kirti J
 

Semelhante a NoSQL_Databases (20)

unit2-ppt1.pptx
unit2-ppt1.pptxunit2-ppt1.pptx
unit2-ppt1.pptx
 
Datastores
DatastoresDatastores
Datastores
 
2.Introduction to NOSQL (Core concepts).pptx
2.Introduction to NOSQL (Core concepts).pptx2.Introduction to NOSQL (Core concepts).pptx
2.Introduction to NOSQL (Core concepts).pptx
 
Artigo no sql x relational
Artigo no sql x relationalArtigo no sql x relational
Artigo no sql x relational
 
Unit-10.pptx
Unit-10.pptxUnit-10.pptx
Unit-10.pptx
 
Brief introduction to NoSQL by fas mosleh
Brief introduction to NoSQL by fas moslehBrief introduction to NoSQL by fas mosleh
Brief introduction to NoSQL by fas mosleh
 
nosql.pptx
nosql.pptxnosql.pptx
nosql.pptx
 
A Study on Graph Storage Database of NOSQL
A Study on Graph Storage Database of NOSQLA Study on Graph Storage Database of NOSQL
A Study on Graph Storage Database of NOSQL
 
A STUDY ON GRAPH STORAGE DATABASE OF NOSQL
A STUDY ON GRAPH STORAGE DATABASE OF NOSQLA STUDY ON GRAPH STORAGE DATABASE OF NOSQL
A STUDY ON GRAPH STORAGE DATABASE OF NOSQL
 
A Study on Graph Storage Database of NOSQL
A Study on Graph Storage Database of NOSQLA Study on Graph Storage Database of NOSQL
A Study on Graph Storage Database of NOSQL
 
Unit II -BIG DATA ANALYTICS.docx
Unit II -BIG DATA ANALYTICS.docxUnit II -BIG DATA ANALYTICS.docx
Unit II -BIG DATA ANALYTICS.docx
 
NoSQL powerpoint presentation difference with rdbms
NoSQL powerpoint presentation difference with rdbmsNoSQL powerpoint presentation difference with rdbms
NoSQL powerpoint presentation difference with rdbms
 
the rising no sql technology
the rising no sql technologythe rising no sql technology
the rising no sql technology
 
A Comparison between Relational Databases and NoSQL Databases
A Comparison between Relational Databases and NoSQL DatabasesA Comparison between Relational Databases and NoSQL Databases
A Comparison between Relational Databases and NoSQL Databases
 
Assignment_4
Assignment_4Assignment_4
Assignment_4
 
A Survey And Comparison Of Relational And Non-Relational Database
A Survey And Comparison Of Relational And Non-Relational DatabaseA Survey And Comparison Of Relational And Non-Relational Database
A Survey And Comparison Of Relational And Non-Relational Database
 
WEB_DATABASE_chapter_4.pptx
WEB_DATABASE_chapter_4.pptxWEB_DATABASE_chapter_4.pptx
WEB_DATABASE_chapter_4.pptx
 
Know what is NOSQL
Know what is NOSQL Know what is NOSQL
Know what is NOSQL
 
SURVEY ON IMPLEMANTATION OF COLUMN ORIENTED NOSQL DATA STORES ( BIGTABLE & CA...
SURVEY ON IMPLEMANTATION OF COLUMN ORIENTED NOSQL DATA STORES ( BIGTABLE & CA...SURVEY ON IMPLEMANTATION OF COLUMN ORIENTED NOSQL DATA STORES ( BIGTABLE & CA...
SURVEY ON IMPLEMANTATION OF COLUMN ORIENTED NOSQL DATA STORES ( BIGTABLE & CA...
 
No sq lv2
No sq lv2No sq lv2
No sq lv2
 

NoSQL_Databases

  • 1. An Overview of NoSQL Databases RichPerry CIS 264 October25, 2014
  • 2. An Overview of NoSQL Databases | Rich Perry Page 1 of 13 Executive Summary NoSQL is a relatively new type of next generation database system, but not relational database management systems (DBMS). It's also commonly known as "Not Only SQL". However, according to Carlo Struzzi, it would be more accurate to call it NoREL for No Relations, or Not Relational. The concept was first introduced in 1998 by Carlo Struzzi, and then re-introduced in 2009 by Eric Evans. It offers radically different choices and options for data storage compared to conventional relational databases. This generation of DBMS offers much more flexibility, higher performance, higher levels of scalability, less complexity, and different choices of functionality. It’s about more than just rows in tables. NoSQL database systems allow data storage and retrieval using many different formats such as key-value, column family, document, and graph databases. NoSQL databases have no joins like relational databases. Instead of joins, systems allow users to extract data using simple interfaces. They also have little or no database schema and do not strictly enforce ACID transaction standards like relational databases. It supports linear scalability. If you add more processors, you get a proportional increase in performance. Horizontal scalability (dividing the system among multiple servers) is also a benefit of most NoSQL database systems. There are more choices for storing, retrieving, and manipulating data. It's called "Not Only SQL" because one can use SQL as well as many other query languages. NoSQL is not only available as open source. There are many open source NoSQL products and many commercial products that use NoSQL concepts.
  • 3. An Overview of NoSQL Databases | Rich Perry Page 2 of 13 It's not just for use with Big Data problems. Many people have a common misconception that NoSQL is just for use in solving Big Data problems. NoSQL database systems are commonly used for Big Data situations. However, NoSQL also provides alternative solutions when flexibility, performance, and scalability are important aside from Big Data problems. Categories ofNoSQLDatabases There four main types of NoSQL databases. Each has their own unique advantages and disadvantages. The four main types are: 1. Column family (Bigtable) stores 2. Key-Value stores 3. Graph stores 4. Document stores Columnfamily (Bigtable) stores Column family databases can scale to manage large amounts data. They use both row and column identifiers as keys to find data, and are often referred to as "data stores" rather than databases because they lack features normally found in most DBMSs. For example, column family databases lack typed columns, secondary indexes, triggers, and query languages. Spreadsheets serve well as a comparison model for this type of database. Data values are addressed by the combination of the row and column much like in an Excel
  • 4. An Overview of NoSQL Databases | Rich Perry Page 3 of 13 spreadsheet. Cells can contain any type of data. A cell can be populated with data at any time or left empty. Most, but not all, column family databases use a timestamp and a "column family" in addition to the column name and row identifier as a multi-part key. Column family stores are often called Bigtable stores because the tables can be enormous with billions of rows or more. Most rows utilize few columns out of the many possible columns. This results in most cells in the table being empty, and is known as a sparse matrix. This type of data structure works very inefficiently in relational databases, but column family stores are made for this type of data storage. AdvantagesofColumnfamilystores 1. Can manage very large amounts of data efficiently. 2. High scalability -- Column family databases do not use join so they scale very easily in distributed systems. 3. High availability. Column family systems are usually configured to store data on multiple nodes in different geographic areas with automatic failover. DisadvantagesforColumnfamilystores 1. Not as flexible as other NoSQL database types. 2. Minimal functionality available with most column family databases.
  • 5. An Overview of NoSQL Databases | Rich Perry Page 4 of 13 Key-Value Stores Key-Value stores are some of the simplest of NoSQL databases. This type of NoSQL database system is sometimes called the Swiss Army knife of databases, because it can be used in many different situations. Key-Value stores have no query language and work like dictionaries. Keys are paired with values. Application programmer's interfaces (APIs) are used to add new key-value pairs (a.k.a., put), delete key-value pairs, and retrieve a value when given a key (a.k.a., get). In a dictionary, words are paired with definitions. The words are keys and the definitions are the values. If the user gives the API a word (key), the API returns a definition (value). Keys are flexible and can be many different types of data. Some examples of types of keys are:  Name of an image.  File path.  Hash code.  URL.  SQL query.  REST web service call. Values are also flexible. They can be almost anything. Common values could be documents, images, web pages, text, etc. CommonAttributes ofKey-ValueStoredatabases 1. All keys must be unique. 2. Keys are indexed but values are not. 3. Values can be any data type and/or different data types. Whereas in a relational database, values in a single column must be homogenous. 4. Queries return one and only one value.
  • 6.
    5. Queries must search for a key, but cannot search for a value.
    Advantages of Key-Value Store Databases
    1. Precision service levels.
    2. Precision service monitoring and notification.
    3. Scalability and reliability.
    4. Portability and lower operational cost.
    5. Speed -- queries tend to run very quickly.
    6. Simplicity -- it does not get much simpler than key-value pairs.
    7. Can be used for many different applications and data storage problems.
    Disadvantages of Key-Value Store Databases
    1. Cannot query values. Only the key can be queried, which means that the user must know the key.
    2. Does not establish relationships between data. If relationships are important, other NoSQL types may be more appropriate.
    3. Cannot return lists of values. Queries return one and only one value.
    4. Values may contain any data type. This is only a problem if the user expects a certain data type but receives another, or expects the data type to always be the same when no such guarantee is provided.
    Document Stores
    This is one of the most flexible, powerful, and popular types of databases in the NoSQL movement. Key-value and column family stores work by searching for a key and returning the value associated with that key. These two types of databases do not index
  • 7. values, or allow searching on values. Document stores work in a different manner. They allow searching on any content within documents. Document stores automatically index all content inside a document when it is added to the database. This makes indexes large, but everything is searchable.
    A document store API can provide a list of documents, find a single document, or find any subsection of any document. A key-value store can store an entire document in the value area and return that document if you search for its key, but a document store can return just a sentence or paragraph from a large document (e.g., a book) without loading the entire document into memory.
    Tree structures are used in document store databases. The tree structures begin with a root node that has branches. The branches can have sub-branches, and those can be divided into further sub-branches indefinitely until they terminate at a leaf. The values are stored at the leaf level.
    Most document store databases also use collections to manage large numbers of documents. Collections can be used for different purposes such as navigation, grouping similar types of documents, and applying business rules to set different permissions, indexes, and triggers. Collections can contain other collections, and trees can have sub-trees.
    Advantages of Document Store Databases
    1. Very flexible.
    2. High performance.
    3. Variable but usually high scalability.
    4. Relatively simple, but powerful APIs.
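    The contrast between the two models can be sketched as follows. This is a minimal illustration, assuming in-memory Python dictionaries stand in for the storage engine. `KeyValueStore`, `DocumentStore`, the dotted path syntax, and the sample document are all hypothetical and do not correspond to any specific product's API. The key-value store only supports put, get, and delete by key, while the document store walks each document's tree, indexes every leaf value on insert, and can return a subsection by path.

```python
class KeyValueStore:
    """put/get/delete by key only; the value is opaque and not indexed."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]

    def delete(self, key):
        del self._data[key]


def leaves(node, prefix=""):
    """Walk a nested dict (the document tree) and yield (path, leaf value)."""
    if isinstance(node, dict):
        for key, child in node.items():
            path = f"{prefix}.{key}" if prefix else key
            yield from leaves(child, path)
    else:
        yield prefix, node


class DocumentStore:
    """Indexes every leaf on insert, so any content is searchable."""

    def __init__(self):
        self._docs = {}
        self._index = {}  # (path, leaf value) -> set of document ids

    def add(self, doc_id, document):
        self._docs[doc_id] = document
        # Assumes leaf values are hashable (strings, numbers).
        for path, value in leaves(document):
            self._index.setdefault((path, value), set()).add(doc_id)

    def find(self, path, value):
        """Return ids of all documents whose leaf at `path` equals `value`."""
        return sorted(self._index.get((path, value), set()))

    def get_subsection(self, doc_id, path):
        """Return only the branch or leaf at `path`, not the whole document."""
        node = self._docs[doc_id]
        for part in path.split("."):
            node = node[part]
        return node


book = {"title": "Example Book", "chapter1": {"heading": "Introduction"}}

kv = KeyValueStore()
kv.put("book:1", book)          # retrievable only by its key, as a whole

docs = DocumentStore()
docs.add("book:1", book)
print(docs.find("chapter1.heading", "Introduction"))
print(docs.get_subsection("book:1", "chapter1.heading"))
```

    The index in `DocumentStore` grows with every leaf of every document, which mirrors the trade-off noted above: large indexes in exchange for searching on any content.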
  • 8. Disadvantages of Document Store Databases
    1. Overkill if searching inside documents is not necessary. If a user only needs the whole document, a key-value store might be better.
    2. Only well suited for storing documents. If the data is not part of a document, it should probably be put in a different type of database.
    Graph Stores
    Graph stores are optimized to efficiently store nodes and links, and allow users to query those graphs. Graph store databases are useful for any business that has complex relationships between objects, such as in social networking, rules-based engines, mashups, and systems that must analyze complex network structures.
    A graph store is a system that contains a sequence of nodes and relationships which, when combined, form a graph. Key-value stores have two data fields, the key and its value. Graph stores have three data fields: node, relationship, and property.
    Graph nodes are nouns and often represent real-world objects such as people, organizations, websites, computers on a network, or cities on routes (e.g., highways, railways, or air routes). The relationships are the connections between the nodes.
    Graph store database queries essentially traverse the nodes on the graph. They can return information such as:
    1. Shortest path between two nodes on a graph.
    2. Neighboring nodes that have specific properties.
    3. Similarities of neighboring nodes between two nodes.
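    The three kinds of query listed above can be sketched over a small in-memory graph. This is an illustrative assumption rather than the query interface of any real graph database: the node names, the `city` property, and the function names are all hypothetical, and the shortest path is found with a plain breadth-first search over unweighted relationships.

```python
from collections import deque

# Nodes with properties, and undirected "knows" relationships (sample data).
nodes = {
    "Ann": {"city": "Oslo"},
    "Bob": {"city": "Paris"},
    "Cy": {"city": "Oslo"},
    "Dee": {"city": "Paris"},
    "Eve": {"city": "Oslo"},
}
edges = {
    "Ann": ["Bob", "Cy"],
    "Bob": ["Ann", "Dee"],
    "Cy": ["Ann"],
    "Dee": ["Bob", "Eve"],
    "Eve": ["Dee"],
}


def shortest_path(edges, start, goal):
    """Breadth-first search: returns a path with the fewest hops, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in edges.get(path[-1], []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


def neighbors_with(nodes, edges, node, **props):
    """Neighbors of `node` whose properties match all the given values."""
    return [
        n for n in edges.get(node, [])
        if all(nodes[n].get(k) == v for k, v in props.items())
    ]


def common_neighbors(edges, a, b):
    """Nodes directly connected to both `a` and `b`."""
    return set(edges.get(a, [])) & set(edges.get(b, []))


print(shortest_path(edges, "Ann", "Eve"))
print(neighbors_with(nodes, edges, "Ann", city="Oslo"))
print(common_neighbors(edges, "Ann", "Dee"))
```

    Every query here is a traversal that starts from a node and follows relationships, which is why graph stores keep related nodes close together instead of computing joins at query time.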
  • 9. Relationships are handled differently compared to relational databases. Graph store databases store related nodes together and assign internal identifiers to nodes, so that they can join networks.
    Advantages of Graph Store Databases
    1. Better performance compared to relational databases.
    2. Designed to handle complex relationships between data.
    Disadvantages of Graph Store Databases
    1. Difficult to scale horizontally to multiple servers. Data can be replicated on multiple servers to enhance read performance, but writing to multiple servers that span multiple nodes is complicated to implement.
    Overall Advantages of NoSQL Databases
    Scalability
    More specifically, horizontal scaling is a major advantage of NoSQL database systems. This allows organizations to distribute the database across multiple servers and nodes rather than just buying bigger and better servers.
    Low maintenance in the future
    Relational databases require highly trained and experienced DBAs and developers. Although DBAs are probably not losing their jobs any time in the near future, NoSQL database systems will require less maintenance, less management, less support, and either fewer DBAs or IT professionals who can do a similar job with less extensive training.
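    To make the horizontal scaling described under Scalability more concrete, the sketch below spreads keys across several servers by hashing each key. It is a simplified assumption, not the scheme of any particular NoSQL product: the server names, `shard_for`, and `ShardedStore` are hypothetical, and each "server" is just a dictionary in the same process.

```python
import hashlib


def shard_for(key, servers):
    """Pick a server for a key by hashing the key (hypothetical scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


class ShardedStore:
    """Key-value store whose data is divided among several servers."""

    def __init__(self, servers):
        self.servers = list(servers)
        # One dictionary per server stands in for a separate machine.
        self.shards = {name: {} for name in self.servers}

    def put(self, key, value):
        self.shards[shard_for(key, self.servers)][key] = value

    def get(self, key):
        return self.shards[shard_for(key, self.servers)].get(key)


store = ShardedStore(["server-a", "server-b", "server-c"])
for i in range(1000):
    store.put(f"user:{i}", {"id": i})

print(store.get("user:42"))
print({name: len(shard) for name, shard in store.shards.items()})
```

    Because a lookup touches only the one server that owns the key, adding servers adds capacity. With this simple modulo scheme, changing the number of servers moves most keys to a different server, so production systems often use techniques such as consistent hashing instead.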
  • 10. Cost
    Admins can distribute a NoSQL database system across multiple low-cost machines rather than just buying more expensive servers. Many NoSQL systems are also open source, which usually translates into no-cost software. Relational database systems tend to rely heavily on costly, proprietary software and hardware.
    Flexibility
    There are different NoSQL data models available to developers. This means that organizations have some good choices.
    Performance
    Most NoSQL databases provide much higher performance than relational databases. The main reason is that IT professionals and management have more choices: the IT department can pick the type of database best suited to the purpose. Key-value stores, for example, are simple and return one value. The performance of a NoSQL database system is therefore primarily related to its focus. With the exception of graph stores, performance is also helped by the low level of complexity.
    Overall Disadvantages of NoSQL Databases
    Maturity
    NoSQL database systems are not mature and therefore not appropriate for all purposes and situations. According to Herman Mehling of Database Journal, NoSQL, in general, lacks credibility in the IT world, whereas some relational databases are known for their
  • 11. rich functionality, stability, reliability, vendor support, and wealth of expertise available in the employment pool. Relational databases have a lot of credibility with many IT professionals, such as DBAs, developers, data architects, and IT managers and executives.
    Support
    Relational databases may be expensive, but their vendors provide a very high level of support. Most NoSQL systems are open source, which means the basic software is free, which in turn means that there is little or no support from outside your organization.
    Analytics and Business Intelligence (BI)
    Most business intelligence tools simply do not have interfaces for or connectivity to NoSQL databases. Quest Software has developed Toad for Cloud Databases, which provides limited ad hoc query support for some NoSQL systems. However, there is a significant overall shortage of BI tools available for NoSQL.
    Expertise
    There is a wide and deep pool of IT professionals skilled, trained, and/or experienced with relational databases, but not many who know how to use, develop, or maintain NoSQL systems. Over time, the laws of supply and demand in education and employment will address this problem, but it won't happen immediately.
    Compatibility
    Relational databases have many standards. NoSQL databases, however, have very few, if any. The lack of standards could make it very difficult to switch from
  • 12. one vendor to another (assuming the organization is buying vendor software, not just downloading open source) if an organization becomes displeased with the service.
    NoSQL vs. Relational Databases
    Structure
    RDBMSs are designed to use only highly structured data. NoSQL databases are designed to use unstructured or semi-structured data.
    ACID
    ACID stands for Atomic, Consistent, Isolated, and Durable. It is a way to structure a database to ensure data integrity and keep transactions reliable. Relational databases strictly enforce ACID, but NoSQL databases usually do not. This is generally considered an advantage of relational databases. If your data involves a large number of transactions, a relational database is probably a better choice.
    Flexibility
    Relational databases are not known for being very flexible. As previously mentioned, they structure data to a high degree and also have rigid schemas. Most NoSQL databases are very flexible because they have little or no schema, or have a flexible schema, and use data that is unstructured or structured to a much lower degree than in relational databases.
  • 13. Normalization
    Relational databases need data to be normalized to at least third normal form in order to fully function in a relational manner and make efficient use of joins and indexes. NoSQL databases are often designed around the queries they must support, and do so by denormalizing data based on the queries anticipated for the given database. Denormalizing improves query performance, so this is one of several reasons that NoSQL databases usually provide higher performance, at least for reading.
    Conclusions
    Any IT professional or manager must carefully consider the purpose before choosing a NoSQL database, or even deciding whether transferring data from a relational database is appropriate. NoSQL databases can offer significant benefits, but they are not necessarily an appropriate solution for every data storage problem. Any organization considering converting from a relational DBMS to a NoSQL database system should carefully note the limitations and other issues associated with these types of databases, the risk of having no vendor support if using open source software, and the lack of experienced IT professionals.
    Both relational databases and NoSQL databases have their place in the IT world. NoSQL may be a better alternative for some situations. Relational databases may be the correct solution for other types of data storage problems. Some situations may call for a combined solution, where both relational databases and NoSQL databases are used.
  • 14. References
    McCreary, Dan and Kelly, Ann (2014). Making Sense of NoSQL. Manning.
    Brooks, Charlie (2014). Enterprise NoSQL for Dummies. MarkLogic.
    Scofield, Ben (2010). NoSQL Death to Relational Databases(?). SlideShare. http://www.slideshare.net/bscofield/nosql-codemash-2010 (accessed 10/17/2014).
    Lith, Adam and Mattson, Jakob (2010). "Investigating storage solutions for large data: A comparison of well performing and scalable data storage solutions for real time extraction and batch insertion of data" (PDF). Göteborg: Department of Computer Science and Engineering, Chalmers University of Technology. p. 70. Retrieved 05 Oct 2014. "Carlo Strozzi first used the term NoSQL in 1998 as a name for his open source relational database that did not offer a SQL interface[...]"
    "NoSQL 2009". Blog.sym-link.com. 12 May 2009. Retrieved 05 October 2014.
    Harrison, Guy (2010). 10 things you should know about NoSQL databases. TechRepublic. http://www.techrepublic.com/blog/10-things/10-things-you-should-know-about-nosql-databases/ (accessed 10/17/2014).
    Mehling, Herman (2010). 10 things you Need to Know About NoSQL Databases. Database Journal. http://www.databasejournal.com/features/article.php/3905531/10-things-you-Need-to-Know-About-NoSQL-Databases.htm (accessed 10/17/2014).