6. Why use NoSQL? (1/2)
• Managing large-scale data: NoSQL databases can readily handle heavy read/write cycles, large numbers of users, and data measured in petabytes.
(1 petabyte = 2^50 bytes: 1024 terabytes, or roughly a million gigabytes.)
• No database schema required: NoSQL databases leave a wide range of choices in how a schema is structured, so stored data maps easily onto objects.
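• Developer friendliness: NoSQL databases provide simple APIs for the major programming languages, so a complex ORM framework is no longer needed. When no API is available for a particular language, data can still be accessed over HTTP through a simple RESTful API, using XML or JSON.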
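As a minimal sketch of the schema-free and object-mapping points above, assume records arrive as JSON text (as they might from a RESTful endpoint). The records and field names below are invented for illustration, and only Python's standard library is used.

```python
import json
from types import SimpleNamespace

# Two hypothetical records with different fields: no shared schema is declared.
raw_records = [
    '{"id": 1, "name": "Alice", "email": "alice@example.com"}',
    '{"id": 2, "name": "Bob", "tags": ["admin", "beta"]}',
]

def to_object(text):
    """Parse one JSON document into an attribute-style object."""
    return json.loads(text, object_hook=lambda d: SimpleNamespace(**d))

users = [to_object(r) for r in raw_records]

for user in users:
    # A field that a record lacks is handled at read time, not by a schema.
    tags = getattr(user, "tags", [])
    print(user.id, user.name, tags)
```

The trade-off is that the application, rather than the database, now has to cope with missing or unexpected fields.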
7. Why use NoSQL? (2/2)
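• Availability: most distributed NoSQL databases provide easy data replication, so the failure of a single node is less likely to affect data availability.
• Scalability: NoSQL databases do not need dedicated high-performance servers and run readily on clusters of commodity hardware.
• Low latency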
Relational features are not fully supported:
• no join operations (except within partitions)
• no referential integrity constraints across partitions
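The availability point on this slide can be illustrated with a toy sketch. It assumes a hypothetical `ReplicatedStore` class (not any real product's API) that hashes each key to a node and copies the value to the next node as well, so one node can fail without losing access to the data.

```python
import hashlib

class ReplicatedStore:
    """Toy key-value store that copies each key to several nodes."""

    def __init__(self, node_count=3, replicas=2):
        self.nodes = [dict() for _ in range(node_count)]
        self.alive = [True] * node_count
        self.replicas = replicas

    def _owners(self, key):
        # Hash the key to pick a starting node, then take the next ones in order.
        start = int(hashlib.md5(key.encode()).hexdigest(), 16) % len(self.nodes)
        return [(start + i) % len(self.nodes) for i in range(self.replicas)]

    def put(self, key, value):
        for n in self._owners(key):
            if self.alive[n]:
                self.nodes[n][key] = value

    def get(self, key):
        # Read from the first live replica that holds the key.
        for n in self._owners(key):
            if self.alive[n] and key in self.nodes[n]:
                return self.nodes[n][key]
        raise KeyError(key)

store = ReplicatedStore()
store.put("cart:42", ["book", "pen"])
first_owner = store._owners("cart:42")[0]
store.alive[first_owner] = False      # simulate one node failing
print(store.get("cart:42"))           # still served by the other replica
```

Real systems add much more (failure detection, re-replication, conflict handling); the sketch only shows why a single failure need not make a key unreadable.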
8. How did we get here?
• Explosion of social media and Web sites (Google, Facebook, Twitter) with large data needs
• Rise of cloud-based solutions such as Amazon S3 (Simple Storage Service)
• Open-source community
10. Dynamo and BigTable
• Three major papers were the seeds of the NoSQL movement:
  • BigTable (Google)
  • Dynamo (Amazon)
  • CAP Theorem
11. CAP Theorem
• Also known as Brewer's theorem.
• Brewer's CAP "Theorem": for any system sharing data, it is impossible to guarantee all three of these properties simultaneously:
  • Consistency: all nodes see the same data at the same time
  • Availability: a guarantee that every request receives a response about whether it succeeded or failed
  • Partition tolerance: the system continues to operate despite arbitrary message loss or failure of part of the system
12. CAP Theorem
• Very large systems will partition at some point, so it is necessary to decide between C and A.
• Traditional DBMSs prefer C over A and P.
• Most Web applications choose A (except in specific applications such as order processing).
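The choice between C and A during a partition can be sketched with a toy two-replica system. The `TwoNodeSystem` class and its `mode` flag are hypothetical names used only for illustration; in "CP" mode a write is refused when the replicas cannot talk to each other, while in "AP" mode it is accepted locally and the replicas diverge.

```python
class Replica:
    def __init__(self):
        self.data = {}

class TwoNodeSystem:
    """Two replicas; mode is "CP" (prefer consistency) or "AP" (prefer availability)."""

    def __init__(self, mode):
        self.mode = mode
        self.a, self.b = Replica(), Replica()
        self.partitioned = False

    def write(self, replica, key, value):
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # Cannot reach the other replica: refuse rather than diverge.
                return False
            replica.data[key] = value   # AP: accept locally, replicas now differ
            return True
        replica.data[key] = value
        other.data[key] = value
        return True

def run(mode):
    system = TwoNodeSystem(mode)
    system.write(system.a, "stock", 10)
    system.partitioned = True
    accepted = system.write(system.a, "stock", 9)
    consistent = system.a.data == system.b.data
    return accepted, consistent

print("CP:", run("CP"))   # write refused, replicas still agree
print("AP:", run("AP"))   # write accepted, replicas disagree
```

An order-processing application would typically want the first behaviour, which matches the exception noted on the slide.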
16. NoSQL data models (2/2)
• Hierarchical: these databases store data according to hierarchical relationships, that is, as a tree or parent-child structure.
Example: well-known implementations of hierarchical databases include Microsoft's Windows Registry and IBM's IMS database.
• Triple stores: triple stores save data as subject-predicate-object triples, with the predicate linking the subject to the object. They support the Semantic Web and RDF storage.
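The subject-predicate-object model can be sketched as a set of 3-tuples with wildcard matching. The `TripleStore` class and the sample facts are assumptions made for illustration; this is not an RDF library.

```python
class TripleStore:
    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        pattern = (subject, predicate, obj)
        return sorted(
            t for t in self.triples
            if all(p is None or p == v for p, v in zip(pattern, t))
        )

store = TripleStore()
store.add("alice", "knows", "bob")
store.add("alice", "worksAt", "acme")
store.add("bob", "worksAt", "acme")

# Who works at acme? The predicate links each subject to the object.
print(store.match(predicate="worksAt", obj="acme"))
```

A production triple store would index the triples by each position instead of scanning the whole set.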
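20. Suitable or Not suitable
Suitable use cases:
• Storing Web session information
• User preference settings
• Shopping cart data
Unsuitable use cases:
• Retrieving relationships among different data items
• Operations on multiple keys
• Querying by data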
21. Document stores
Like key-value stores, except that the value is a document
Data model: (key, document) pairs
Document: JSON, XML, other semistructured formats
Basic operations:
Insert(key,document), Fetch(key),
Update(key), Delete(key)
Also Fetch based on document contents.
Example systems
CouchDB, MongoDB, SimpleDB, …
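The operations listed on this slide can be sketched with an in-memory class. `DocumentStore` is a hypothetical name, not the API of CouchDB, MongoDB, or SimpleDB, and it assumes that an update merges top-level fields into the stored document, which the slide does not specify.

```python
import copy

class DocumentStore:
    def __init__(self):
        self._docs = {}

    def insert(self, key, document):
        if key in self._docs:
            raise KeyError(f"duplicate key: {key}")
        self._docs[key] = copy.deepcopy(document)

    def fetch(self, key):
        return copy.deepcopy(self._docs[key])

    def update(self, key, changes):
        # Assumption: update merges top-level fields into the stored document.
        self._docs[key].update(copy.deepcopy(changes))

    def delete(self, key):
        del self._docs[key]

    def find(self, **criteria):
        """Fetch by contents: keys of documents whose top-level fields equal the criteria."""
        return sorted(
            key for key, doc in self._docs.items()
            if all(doc.get(field) == value for field, value in criteria.items())
        )

db = DocumentStore()
db.insert("post:1", {"title": "Hello", "author": "alice", "tags": ["intro"]})
db.insert("post:2", {"title": "CAP notes", "author": "bob"})
db.update("post:2", {"author": "alice"})
print(db.find(author="alice"))
```

The `find` method is what separates this from a plain key-value store: the store can look inside the value instead of treating it as opaque.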
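27. Suitable or Not suitable
Suitable use cases:
• Event history logging
• Content management systems and blogging platforms
• Time-limited usage (e.g. ad pushes)
Unsuitable use cases:
• Systems that need ACID transactions for reads or writes
• Early stages of development
• Changing queries (costly)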
28. Graph stores
• Data model: nodes and edges
• Nodes may have properties (including ID)
• Edges may have labels or roles
• Interfaces and query languages vary
• Example systems: Neo4j, FlockDB, Pregel, …
• RDF "triple stores" can map to graph databases
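The node-and-edge data model can be sketched as follows. The `Graph` class, the node ids, and the "follows" label are assumptions for illustration and do not reflect the interface of Neo4j or any other system named on the slide.

```python
class Graph:
    def __init__(self):
        self.nodes = {}   # node id -> property dict
        self.edges = []   # (source id, label, target id)

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties

    def add_edge(self, source, label, target):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist")
        self.edges.append((source, label, target))

    def neighbors(self, node_id, label):
        """Ids reachable from node_id over one outgoing edge with the given label."""
        return [t for s, l, t in self.edges if s == node_id and l == label]

g = Graph()
g.add_node("u1", name="Alice")
g.add_node("u2", name="Bob")
g.add_node("u3", name="Carol")
g.add_edge("u1", "follows", "u2")
g.add_edge("u2", "follows", "u3")

# Friend-of-friend: two "follows" hops starting from Alice.
two_hops = [n2 for n1 in g.neighbors("u1", "follows")
            for n2 in g.neighbors(n1, "follows")]
print([g.nodes[n]["name"] for n in two_hops])
```

Traversals like this two-hop query are the kind of relationship lookup that graph stores are built for.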
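31. Suitable or Not suitable
Suitable use cases:
• Social network publishing
• Routing, dispatch, and location-based services
• Recommendation
Unsuitable use cases:
• Updating all entities or a subset of them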
32. Conclusion and Discussion
NoSQL databases cover only a part of data-intensive cloud applications (mainly Web applications).
Problems with cloud computing:
• SaaS applications require enterprise-level functionality, including ACID transactions, security, and other features associated with commercial RDBMS technology, i.e.
NoSQL should not be the only option in the cloud.
33. Conclusion and Discussion
Hybrid solutions:
• Voldemort with MySQL as one of its storage backends
• dealing with NoSQL data as semistructured data
• integrating RDBMS and NoSQL via SQL/XML
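One way to picture the hybrid idea is a key-value interface layered over a relational table. The sketch below is an assumption-laden illustration: it uses Python's built-in `sqlite3` as a stand-in for MySQL, the `SqlBackedKV` class and the `kv` table are hypothetical, and none of it is Voldemort's actual API.

```python
import json
import sqlite3

class SqlBackedKV:
    """Key-value interface whose storage backend is a relational table."""

    def __init__(self, connection):
        self.conn = connection
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT NOT NULL)"
        )

    def put(self, key, value):
        # Values are stored as JSON text, i.e. as semistructured data.
        self.conn.execute(
            "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)",
            (key, json.dumps(value)),
        )
        self.conn.commit()

    def get(self, key):
        row = self.conn.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
        if row is None:
            raise KeyError(key)
        return json.loads(row[0])

conn = sqlite3.connect(":memory:")   # stand-in for a MySQL connection
store = SqlBackedKV(conn)
store.put("user:7", {"name": "Alice", "prefs": {"theme": "dark"}})
print(store.get("user:7")["prefs"]["theme"])

# The same rows remain reachable through plain SQL.
print(conn.execute("SELECT COUNT(*) FROM kv").fetchone()[0])
```

Applications see a simple put/get interface, while the relational engine underneath still provides its own durability and transaction handling.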