This document discusses distributed programming and data consistency. It defines consistency as how systems and observers perceive the state of a system over time: consistency has a time aspect, in which expected and unexpected sequences of states can occur. Distributed techniques such as caching and replication introduce inconsistencies when data is copied across servers. The CAP theorem states that a distributed system cannot simultaneously guarantee consistency, availability, and partition tolerance. Eventual consistency trades strong consistency for availability: replicas may briefly disagree, but they converge once updates propagate.
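The eventual-consistency model described above can be made concrete with a small sketch. The following Python example is illustrative only (the `Replica` class and anti-entropy exchange are simplified stand-ins for real replication protocols): two replicas accept writes independently, temporarily disagree, then converge via last-write-wins synchronization.

```python
class Replica:
    """A single replica using last-write-wins (LWW) conflict resolution."""
    def __init__(self):
        self.store = {}  # key -> (timestamp, value)

    def write(self, key, value, ts):
        current = self.store.get(key)
        # Keep whichever write carries the newest timestamp.
        if current is None or ts > current[0]:
            self.store[key] = (ts, value)

    def read(self, key):
        entry = self.store.get(key)
        return entry[1] if entry else None

def anti_entropy(a, b):
    """Exchange state so both replicas converge (gossip-style sync)."""
    for key, (ts, value) in list(a.store.items()):
        b.write(key, value, ts)
    for key, (ts, value) in list(b.store.items()):
        a.write(key, value, ts)

r1, r2 = Replica(), Replica()
r1.write("user:42", "alice", ts=1)   # this write lands on replica 1 only
r2.write("user:42", "alicia", ts=2)  # a newer write lands on replica 2 only
print(r1.read("user:42"), r2.read("user:42"))  # inconsistent: alice / alicia
anti_entropy(r1, r2)
print(r1.read("user:42"), r2.read("user:42"))  # converged: alicia / alicia
```

The window between the two `print` calls is exactly the "time aspect" of consistency: both observers eventually see the same value, but not at every instant.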
Distributed Programming and Data Consistency w/ Notes - June 2010 update - Paulo Gaspar
June 2010 update: several URLs in the notes were refreshed.
Presentation with NOTES.
Tuning Data Consistency to obtain efficient Distributed Computing solutions: the solutions that the academic world and the new NoSQL trend are making available to the IT industry in general.
7 Common mistakes in Go and when to avoid them - Steven Francia
I've spent the past two years developing some of the most popular libraries and applications written in Go, and I've made a lot of mistakes along the way. Recognizing that "the only real mistake is the one from which we learn nothing" (John Powell), I would like to share the mistakes I have made on my journey with Go and how you can avoid them.
This document contains frequently asked questions (FAQs) about big data technologies like Hadoop, MongoDB, and related topics. Key topics covered include using Hadoop for processing large datasets, MongoDB features and administration, optimizing web crawlers, performing clustering on large datasets, and comparing algorithms like logistic regression, decision trees, and neural networks. Configuration parameters for Hadoop like dfs.name.dir and dfs.data.dir are also discussed.
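For readers unfamiliar with the Hadoop parameters mentioned, `dfs.name.dir` and `dfs.data.dir` are set in `hdfs-site.xml`. A minimal fragment might look like the following; the paths are placeholders, and note that Hadoop 2.x renames these properties to `dfs.namenode.name.dir` and `dfs.datanode.data.dir`.

```xml
<!-- hdfs-site.xml (Hadoop 1.x names); paths are illustrative -->
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <!-- Where the NameNode stores filesystem metadata (fsimage, edits) -->
    <value>/data/hadoop/namenode</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <!-- Where DataNodes store blocks; a comma-separated list spreads I/O across disks -->
    <value>/disk1/hdfs,/disk2/hdfs</value>
  </property>
</configuration>
```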
The document discusses using Storm, Cassandra, and in-memory computing for real-time big data analytics. It describes Storm as a framework for real-time stream processing and Cassandra as a database for handling large volumes of data. The document proposes using an in-memory data grid to provide a high-performance interface between Storm and Cassandra for real-time analytics of streaming data.
This document provides a high-level summary of streaming data processing and the Lambda architecture. It begins with a brief history of batch and streaming systems for big data. It then introduces the Lambda architecture as a way to handle both batch and streaming data using separate batch and speed layers. The document discusses advantages and disadvantages of the Lambda architecture, as well as use cases, implementation tips, and approaches that have emerged beyond the Lambda architecture like Kappa and FastData architectures.
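The core idea of the Lambda architecture, merging a precomputed batch view with an incremental real-time view at query time, can be sketched in a few lines of Python. The page-view counts below are made up for illustration.

```python
# Batch layer: a view precomputed over all historical data (recomputed periodically).
batch_view = {"page:/home": 10_000, "page:/about": 1_200}

# Speed layer: an incremental view covering only events since the last batch run.
realtime_view = {"page:/home": 37, "page:/pricing": 5}

def query(key):
    """Serving layer: merge the batch and real-time views at query time."""
    return batch_view.get(key, 0) + realtime_view.get(key, 0)

print(query("page:/home"))     # 10037
print(query("page:/pricing"))  # 5
```

When a new batch run completes, `batch_view` absorbs the events the speed layer was covering and `realtime_view` is reset, which is also where much of the architecture's operational complexity lives.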
Streaming analytics on Google Cloud Platform - Javier Ramirez, teowaki
Do you think you can write a system to get data from sensors across the world, do real time analytics, and display the data on a dashboard in under 100 lines of code? Would you like to add some monitoring and autoscaling too? And what about serverless? In this talk I'll show you all the technologies GCP offers to build such a system reliably and at scale.
Non-Relational Databases: This hurts. I like it. - Onyxfish
The document discusses non-relational databases, providing an overview of their characteristics and comparing them to relational databases. It outlines some popular non-relational database platforms, and uses the example of an open government project to demonstrate how CouchDB could be used to store and query schema-less data in a scalable way.
Skynet project: Monitor, analyze, scale, and maintain a system in the Cloud - Sylvain Kalache
The goal of Skynet is to stop humans doing repetitive tasks by having a system do them better. System automation should be the default for any system management, so that humans can focus on work that really matters.
Related blog post for more information: https://engineering.linkedin.com/slideshare/skynet-project-_-monitor-scale-and-auto-heal-system-cloud
HBase is an open-source, distributed, scalable big data store modeled after Google's Bigtable. It allows storage and retrieval of large amounts of data across clusters of commodity servers. HBase provides a key-value data model, uses Hadoop HDFS for storage, and supports fast random reads and writes across billions of rows and millions of columns.
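HBase's data model can be approximated as a sorted, multi-versioned map. The toy Python class below is not the real HBase API; it only mimics the shape of the model: row keys, `family:qualifier` columns, timestamped versions, and lexicographic range scans.

```python
from bisect import insort

class MiniTable:
    """Toy sketch of the Bigtable/HBase data model: a sorted map from
    (row key, column, timestamp) to value. Not the real HBase API."""
    def __init__(self):
        self.cells = {}      # (row, column) -> list of (timestamp, value)
        self.row_index = []  # row keys kept sorted, enabling range scans

    def put(self, row, column, value, ts):
        self.cells.setdefault((row, column), []).append((ts, value))
        if row not in self.row_index:
            insort(self.row_index, row)

    def get(self, row, column):
        versions = self.cells.get((row, column), [])
        return max(versions)[1] if versions else None  # newest version wins

    def scan(self, start_row, stop_row):
        """Range scan over lexicographically sorted row keys."""
        return [r for r in self.row_index if start_row <= r < stop_row]

t = MiniTable()
t.put("user#001", "info:name", "Ada", ts=1)
t.put("user#001", "info:name", "Ada L.", ts=2)  # a newer version of the same cell
t.put("user#002", "info:name", "Grace", ts=1)
print(t.get("user#001", "info:name"))   # Ada L.
print(t.scan("user#000", "user#002"))   # ['user#001']
```

The sorted row index is the key point: because row keys are ordered, scans over a key prefix are cheap, which is why HBase schema design revolves around choosing good row keys.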
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...) - Confluent
Tinder’s Quickfire Pipeline powers all things data at Tinder. It was originally built using AWS Kinesis Firehoses and has since been extended to use both Kafka and other event buses. It is the core of Tinder’s data infrastructure. This rich data flow of both client and backend data has been extended to service a variety of needs at Tinder, including Experimentation, ML, CRM, and Observability, allowing backend developers easier access to shared client side data. We perform this using many systems, including Kafka, Spark, Flink, Kubernetes, and Prometheus. Many of Tinder’s systems were natively designed in an RPC first architecture.
Things we’ll discuss about decoupling your system at scale via event-driven architectures include:
– Powering ML, backend, observability, and analytical applications at scale, including an end to end walk through of our processes that allow non-programmers to write and deploy event-driven data flows.
– Showing, end to end, the usage of dynamic event processing that creates other stream processes, via a dynamic control plane topology pattern and the broadcast state pattern
– How to manage the unavailability of cached data that would normally come from repeated API calls for data that’s being backfilled into Kafka, all online! (and why this is not necessarily a “good” idea)
– Integrating common OSS frameworks and libraries like Kafka Streams, Flink, Spark and friends to encourage the best design patterns for developers coming from traditional service oriented architectures, including pitfalls and lessons learned along the way.
– Why and how to avoid overloading microservices with excessive RPC calls from event-driven streaming systems
– Best practices in common data flow patterns, such as shared state via RocksDB + Kafka Streams as well as the complementary tools in the Apache Ecosystem.
– The simplicity and power of streaming SQL with microservices
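One of the points above, avoiding excessive RPC calls from event-driven systems, often comes down to caching enrichment lookups in the consumer so a burst of events does not translate into a burst of service calls. A minimal Python sketch follows; the user service, its `fetch_user_tier` call, and the event shape are all hypothetical.

```python
from functools import lru_cache

RPC_CALLS = 0

def fetch_user_tier(user_id):
    """Stand-in for an RPC to a (hypothetical) user service."""
    global RPC_CALLS
    RPC_CALLS += 1          # count how often we actually hit the service
    return "free"           # pretend network call

@lru_cache(maxsize=10_000)  # cache by user_id; repeated lookups are served locally
def cached_user_tier(user_id):
    return fetch_user_tier(user_id)

# A burst of events referencing the same few users: only unique IDs hit the RPC.
events = [{"user": "u1"}, {"user": "u2"}, {"user": "u1"}, {"user": "u1"}]
enriched = [{**e, "tier": cached_user_tier(e["user"])} for e in events]
print(RPC_CALLS)  # 2, not 4
```

Real deployments would add TTLs or invalidation (a plain LRU can serve stale data), which is the same cache-consistency trade-off the talk's backfill bullet warns about.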
Video: https://youtu.be/LuVT0jsIrZk
This presentation is focused on the architecture, scalability concerns, performance bottlenecks, operational characteristics and lessons learned while designing and implementing Yammer distributed real-time search system. Yammer is an enterprise social network SaaS offering with over 100,000 networks (including 85% of the Fortune 100) and nearly 2 million users. The search system we developed scales well up to 1B messages and serves a foundation of knowledge base analysis services Yammer is developing.
Real-time Search at Yammer - By Aleksandrovsky Boris - lucenerevolution
See conference video - http://www.lucidimagination.com/devzone/events/conferences/revolution/2011
This document discusses NoSQL databases and their advantages over SQL databases for storing large amounts of data. It describes key features of NoSQL databases like Dynamo and Bigtable that allow for horizontal scaling, including partitioning data across many machines, replicating data for availability, and using an eventually consistent approach. The document also explains techniques used in NoSQL databases like virtual nodes for partitioning, gossip protocols for failure detection and data synchronization between nodes.
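The partitioning technique described, consistent hashing with virtual nodes, is easy to sketch. The Python below is illustrative (node names, the hash choice, and the vnode count are all arbitrary): each physical node is placed on the ring many times, and a key is owned by the first virtual node clockwise from its hash, so adding or removing a node only moves a small fraction of keys.

```python
import hashlib
from bisect import bisect_right, insort

def _hash(key):
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class HashRing:
    """Consistent hash ring with virtual nodes, Dynamo-style."""
    def __init__(self, nodes, vnodes=100):
        self.hashes = []   # sorted positions on the ring
        self.owners = {}   # position -> physical node
        for node in nodes:
            for i in range(vnodes):  # each node appears vnodes times
                h = _hash(f"{node}#{i}")
                insort(self.hashes, h)
                self.owners[h] = node

    def owner(self, key):
        """First virtual node clockwise from the key's hash position."""
        h = _hash(key)
        idx = bisect_right(self.hashes, h) % len(self.hashes)
        return self.owners[self.hashes[idx]]

ring = HashRing(["node-a", "node-b", "node-c"])
counts = {}
for i in range(1000):
    n = ring.owner(f"user:{i}")
    counts[n] = counts.get(n, 0) + 1
print(counts)  # roughly even split across the three nodes
```

Raising `vnodes` smooths the distribution further, at the cost of a larger ring; real systems such as Cassandra default to hundreds of tokens per node for this reason.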
The document talks about the motivation behind the need and drive for NoSQL databases. It also mentions some of the most popular NoSQL databases on the market.
The document summarizes the history and evolution of non-relational databases, known as NoSQL databases. It discusses early database systems like MUMPS and IMS, the development of the relational model in the 1970s, and more recent NoSQL databases developed by companies like Google, Amazon, Facebook to handle large, dynamic datasets across many servers. Pioneering systems like Google's Bigtable and Amazon's Dynamo used techniques like distributed indexing, versioning, and eventual consistency that influenced many open-source NoSQL databases today.
Hive + Amazon EMR + S3 = Elastic big data SQL analytics processing in the cloud - Jaipaul Agonus
This presentation is a real-world case study about moving a large portfolio of batch analytical programs that process 30 billion or more transactions every day, from a proprietary MPP database appliance architecture to the Hadoop ecosystem in the cloud, leveraging Hive, Amazon EMR, and S3.
Beyond the RTOS: A Better Way to Design Real-Time Embedded Software - Quantum Leaps, LLC
Embedded software developers from different industries are independently re-discovering patterns for building concurrent software that is safer, more responsive and easier to understand than naked threads of a Real-Time Operating System (RTOS). These best practices universally favor event-driven, asynchronous, non-blocking, encapsulated state machines instead of naked, blocking RTOS threads. This presentation explains the concepts related to this increasingly popular "reactive approach", and specifically how they apply to real-time embedded systems.
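The pattern this talk advocates, an encapsulated event-driven state machine instead of a blocking thread, can be illustrated with a short sketch. Real embedded implementations would be written in C on top of an active-object framework; the Python below only shows the shape of the pattern: events are dispatched to the current state's handler, and nothing ever blocks waiting for input.

```python
class BlinkySM:
    """Encapsulated, non-blocking state machine: the current state is a
    handler function, and each event is dispatched to it. No handler blocks."""
    def __init__(self):
        self.state = self.off  # initial state

    def dispatch(self, event):
        self.state(event)      # run-to-completion event handling

    def off(self, event):
        if event == "BUTTON_PRESS":
            self.state = self.on   # state transition, no blocking wait

    def on(self, event):
        if event in ("BUTTON_PRESS", "TIMEOUT"):
            self.state = self.off

sm = BlinkySM()
for ev in ["BUTTON_PRESS", "TIMEOUT"]:
    sm.dispatch(ev)
print(sm.state.__name__)  # off
```

The contrast with an RTOS thread is that there is no `wait_for_button()` call holding a stack hostage; the machine is quiescent between events, which is what makes such designs easier to reason about and compose.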
This is the course that was presented by James Liddle and Adam Vile for Waters in September 2008.
The book of this course can be found at: http://www.lulu.com/content/4334860
UnConference for Georgia Southern Computer Science March 31, 2015 - Christopher Curtin
I presented to the Georgia Southern Computer Science ACM group. Rather than one topic for 90 minutes, I decided to do an UnConference. I presented them a list of 8-9 topics, let them vote on what to talk about, then repeated.
Each presentation was ~8 minutes (except Career) and was by no means an attempt to explain the full concept or technology, only to spark their interest.
The functional paradigm is not only applicable to programming. There is even more reason for using functional patterns at an architectural level. MapReduce is the most famous example of such a pattern. In this talk, we will go through a few other architectural patterns, and their corresponding stateful anti-patterns.
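MapReduce itself, the pattern named above, is easy to sketch: a pure map phase, a shuffle that groups intermediate pairs by key, and an independent reduce per group. A minimal Python word count (the phases here run in-process; the point is that each is a pure function over its input, which is what makes the pattern distributable):

```python
from collections import defaultdict
from itertools import chain

def map_phase(docs):
    # Mapper: emit (word, 1) pairs; a pure function with no shared state.
    return chain.from_iterable(((w, 1) for w in doc.split()) for doc in docs)

def shuffle(pairs):
    # Group intermediate pairs by key, as the framework would between phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reducer: fold each key's group independently; trivially parallel.
    return {key: sum(values) for key, values in groups.items()}

docs = ["to be or not to be", "to do"]
print(reduce_phase(shuffle(map_phase(docs))))
# {'to': 3, 'be': 2, 'or': 1, 'not': 1, 'do': 1}
```

Because mappers and reducers are stateless, the framework can rerun any failed task without coordination, which is precisely the "stateful anti-pattern" avoidance the talk refers to.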
CouchBase The Complete NoSql Solution for Big Data - Debajani Mohanty
Couchbase is a complete NoSQL database solution for big data. It provides a distributed database that can scale horizontally. Couchbase uses a document-oriented data model and, in CAP-theorem terms, sacrifices strong consistency to achieve high availability and partition tolerance. Couchbase is used by many large companies for applications that involve large, complex datasets with high user volumes and real-time requirements.
Building collaborative HTML5 apps using a backend-as-a-service (HTML5DevConf ...) - João Parreira
Slide deck for talk at the 2013 HTML5 Developers Conference in San Francisco. Covers the main BaaS critical success factors: SMART (Scalable, Mobile-ready, Available, Real-time enabled and Truly secure)
[Meet Magento 2015, Germany] In this presentation I'll show some pure evil bad practices that somehow made it into way too many Magento modules out there, making it hard to integrate, adapt, scale, debug, secure, or extend your project. Join this presentation and help make the Magento module ecosystem a better place by spotting these "code smells" in your own modules or the modules you're using.
Kostas Tzoumas - Stream Processing with Apache Flink® - Ververica
In this talk the basics of Apache Flink are covered: why the project exists, where it came from, what gap it fills, how it differs from the other stream processing projects, what it is being used for, and where it is headed. In short, streaming data is the new trend, and for very good reasons: most data is produced continuously, and it makes sense that it is processed and analysed continuously. Whether the need is for more real-time products, adopting micro-services, or building continuous applications, stream processing technology offers to simplify the data infrastructure stack and reduce the latency to decisions.
Debunking Common Myths in Stream Processing - Kostas Tzoumas
This document discusses stream processing with Apache Flink. It begins by defining streaming as the continuous processing of never-ending data streams. It then debunks four common myths about stream processing: 1) that there is always a throughput/latency tradeoff, showing that Flink can achieve high throughput and low latency; 2) that exactly-once processing is not possible, but Flink provides exactly-once state guarantees with checkpoints; 3) that streaming is only for real-time applications, whereas it can also be used for historical data; and 4) that streaming is too hard, whereas most data problems are actually streaming problems. The document concludes by discussing Flink's community and examples of companies using Flink in production.
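The checkpoint mechanism behind exactly-once state guarantees can be caricatured in a few lines: snapshot the input offset together with the state, and on failure restore both and replay. The Python below is a toy, greatly simplified relative to Flink's distributed snapshots (one operator, one input, synchronous checkpoints), but it shows why state restored with its offset is never double-counted.

```python
import copy

class CheckpointingCounter:
    """Sketch of checkpoint-based recovery: state and input offset are
    snapshotted together, so replay after a crash never double-counts."""
    def __init__(self):
        self.offset = 0
        self.state = {}
        self.checkpoint = (0, {})

    def process(self, events):
        for event in events[self.offset:]:  # resume from the current offset
            self.state[event] = self.state.get(event, 0) + 1
            self.offset += 1

    def snapshot(self):
        self.checkpoint = (self.offset, copy.deepcopy(self.state))

    def restore(self):
        off, st = self.checkpoint
        self.offset, self.state = off, copy.deepcopy(st)

stream = ["a", "b", "a", "c"]
c = CheckpointingCounter()
c.process(stream[:2])   # process events 0..1
c.snapshot()            # checkpoint: offset 2 plus the counts so far
c.process(stream)       # process events 2..3
c.restore()             # simulate a crash: roll back to the checkpoint
c.process(stream)       # replay from offset 2; 'a' and 'b' are not recounted
print(c.state)  # {'a': 2, 'b': 1, 'c': 1}
```

The counts come out right because offset and state always move in lockstep; the myth being debunked is that this kind of guarantee is impossible for streaming state.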
Monitoring and Managing Anomaly Detection on OpenShift.pdf - Tosin Akinosho
Monitoring and Managing Anomaly Detection on OpenShift
Overview
Dive into the world of anomaly detection on edge devices with our comprehensive hands-on tutorial. This SlideShare presentation will guide you through the entire process, from data collection and model training to edge deployment and real-time monitoring. Perfect for those looking to implement robust anomaly detection systems on resource-constrained IoT/edge devices.
Key Topics Covered
1. Introduction to Anomaly Detection
- Understand the fundamentals of anomaly detection and its importance in identifying unusual behavior or failures in systems.
2. Understanding Edge (IoT)
- Learn about edge computing and IoT, and how they enable real-time data processing and decision-making at the source.
3. What is ArgoCD?
- Discover ArgoCD, a declarative, GitOps continuous delivery tool for Kubernetes, and its role in deploying applications on edge devices.
4. Deployment Using ArgoCD for Edge Devices
- Step-by-step guide on deploying anomaly detection models on edge devices using ArgoCD.
5. Introduction to Apache Kafka and S3
- Explore Apache Kafka for real-time data streaming and Amazon S3 for scalable storage solutions.
6. Viewing Kafka Messages in the Data Lake
- Learn how to view and analyze Kafka messages stored in a data lake for better insights.
7. What is Prometheus?
- Get to know Prometheus, an open-source monitoring and alerting toolkit, and its application in monitoring edge devices.
8. Monitoring Application Metrics with Prometheus
- Detailed instructions on setting up Prometheus to monitor the performance and health of your anomaly detection system.
9. What is Camel K?
- Introduction to Camel K, a lightweight integration framework built on Apache Camel, designed for Kubernetes.
10. Configuring Camel K Integrations for Data Pipelines
- Learn how to configure Camel K for seamless data pipeline integrations in your anomaly detection workflow.
11. What is a Jupyter Notebook?
- Overview of Jupyter Notebooks, an open-source web application for creating and sharing documents with live code, equations, visualizations, and narrative text.
12. Jupyter Notebooks with Code Examples
- Hands-on examples and code snippets in Jupyter Notebooks to help you implement and test anomaly detection models.
Taking AI to the Next Level in Manufacturing.pdfssuserfac0301
Read Taking AI to the Next Level in Manufacturing to gain insights on AI adoption in the manufacturing industry, such as:
1. How quickly AI is being implemented in manufacturing.
2. Which barriers stand in the way of AI adoption.
3. How data quality and governance form the backbone of AI.
4. Organizational processes and structures that may inhibit effective AI adoption.
6. Ideas and approaches to help build your organization's AI strategy.
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor IvaniukFwdays
At this talk we will discuss DDoS protection tools and best practices, discuss network architectures and what AWS has to offer. Also, we will look into one of the largest DDoS attacks on Ukrainian infrastructure that happened in February 2022. We'll see, what techniques helped to keep the web resources available for Ukrainians and how AWS improved DDoS protection for all customers based on Ukraine experience
Conversational agents, or chatbots, are increasingly used to access all sorts of services using natural language. While open-domain chatbots - like ChatGPT - can converse on any topic, task-oriented chatbots - the focus of this paper - are designed for specific tasks, like booking a flight, obtaining customer support, or setting an appointment. Like any other software, task-oriented chatbots need to be properly tested, usually by defining and executing test scenarios (i.e., sequences of user-chatbot interactions). However, there is currently a lack of methods to quantify the completeness and strength of such test scenarios, which can lead to low-quality tests, and hence to buggy chatbots.
To fill this gap, we propose adapting mutation testing (MuT) for task-oriented chatbots. To this end, we introduce a set of mutation operators that emulate faults in chatbot designs, an architecture that enables MuT on chatbots built using heterogeneous technologies, and a practical realisation as an Eclipse plugin. Moreover, we evaluate the applicability, effectiveness and efficiency of our approach on open-source chatbots, with promising results.
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...Jason Yip
The typical problem in product engineering is not bad strategy, so much as “no strategy”. This leads to confusion, lack of motivation, and incoherent action. The next time you look for a strategy and find an empty space, instead of waiting for it to be filled, I will show you how to fill it in yourself. If you’re wrong, it forces a correction. If you’re right, it helps create focus. I’ll share how I’ve approached this in the past, both what works and lessons for what didn’t work so well.
Skybuffer SAM4U tool for SAP license adoptionTatiana Kojar
Manage and optimize your license adoption and consumption with SAM4U, an SAP free customer software asset management tool.
SAM4U, an SAP complimentary software asset management tool for customers, delivers a detailed and well-structured overview of license inventory and usage with a user-friendly interface. We offer a hosted, cost-effective, and performance-optimized SAM4U setup in the Skybuffer Cloud environment. You retain ownership of the system and data, while we manage the ABAP 7.58 infrastructure, ensuring fixed Total Cost of Ownership (TCO) and exceptional services through the SAP Fiori interface.
Digital Banking in the Cloud: How Citizens Bank Unlocked Their MainframePrecisely
Inconsistent user experience and siloed data, high costs, and changing customer expectations – Citizens Bank was experiencing these challenges while it was attempting to deliver a superior digital banking experience for its clients. Its core banking applications run on the mainframe and Citizens was using legacy utilities to get the critical mainframe data to feed customer-facing channels, like call centers, web, and mobile. Ultimately, this led to higher operating costs (MIPS), delayed response times, and longer time to market.
Ever-changing customer expectations demand more modern digital experiences, and the bank needed to find a solution that could provide real-time data to its customer channels with low latency and operating costs. Join this session to learn how Citizens is leveraging Precisely to replicate mainframe data to its customer channels and deliver on their “modern digital bank” experiences.
How information systems are built or acquired puts information, which is what they should be about, in a secondary place. Our language adapted accordingly, and we no longer talk about information systems but applications. Applications evolved in a way to break data into diverse fragments, tightly coupled with applications and expensive to integrate. The result is technical debt, which is re-paid by taking even bigger "loans", resulting in an ever-increasing technical debt. Software engineering and procurement practices work in sync with market forces to maintain this trend. This talk demonstrates how natural this situation is. The question is: can something be done to reverse the trend?
Ivanti’s Patch Tuesday breakdown goes beyond patching your applications and brings you the intelligence and guidance needed to prioritize where to focus your attention first. Catch early analysis on our Ivanti blog, then join industry expert Chris Goettl for the Patch Tuesday Webinar Event. There we’ll do a deep dive into each of the bulletins and give guidance on the risks associated with the newly-identified vulnerabilities.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/temporal-event-neural-networks-a-more-efficient-alternative-to-the-transformer-a-presentation-from-brainchip/
Chris Jones, Director of Product Management at BrainChip , presents the “Temporal Event Neural Networks: A More Efficient Alternative to the Transformer” tutorial at the May 2024 Embedded Vision Summit.
The expansion of AI services necessitates enhanced computational capabilities on edge devices. Temporal Event Neural Networks (TENNs), developed by BrainChip, represent a novel and highly efficient state-space network. TENNs demonstrate exceptional proficiency in handling multi-dimensional streaming data, facilitating advancements in object detection, action recognition, speech enhancement and language model/sequence generation. Through the utilization of polynomial-based continuous convolutions, TENNs streamline models, expedite training processes and significantly diminish memory requirements, achieving notable reductions of up to 50x in parameters and 5,000x in energy consumption compared to prevailing methodologies like transformers.
Integration with BrainChip’s Akida neuromorphic hardware IP further enhances TENNs’ capabilities, enabling the realization of highly capable, portable and passively cooled edge devices. This presentation delves into the technical innovations underlying TENNs, presents real-world benchmarks, and elucidates how this cutting-edge approach is positioned to revolutionize edge AI across diverse applications.
HCL Notes and Domino License Cost Reduction in the World of DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-and-domino-license-cost-reduction-in-the-world-of-dlau/
The introduction of DLAU and the CCB & CCX licensing model caused quite a stir in the HCL community. As a Notes and Domino customer, you may have faced challenges with unexpected user counts and license costs. You probably have questions on how this new licensing approach works and how to benefit from it. Most importantly, you likely have budget constraints and want to save money where possible. Don’t worry, we can help with all of this!
We’ll show you how to fix common misconfigurations that cause higher-than-expected user counts, and how to identify accounts which you can deactivate to save money. There are also frequent patterns that can cause unnecessary cost, like using a person document instead of a mail-in for shared mailboxes. We’ll provide examples and solutions for those as well. And naturally we’ll explain the new licensing model.
Join HCL Ambassador Marc Thomas in this webinar with a special guest appearance from Franz Walder. It will give you the tools and know-how to stay on top of what is going on with Domino licensing. You will be able lower your cost through an optimized configuration and keep it low going forward.
These topics will be covered
- Reducing license cost by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how to best utilize it
- Tips for common problem areas, like team mailboxes, functional/test users, etc
- Practical examples and best practices to implement right away
Main news related to the CCS TSI 2023 (2023/1695)Jakub Marek
An English 🇬🇧 translation of a presentation to the speech I gave about the main changes brought by CCS TSI 2023 at the biggest Czech conference on Communications and signalling systems on Railways, which was held in Clarion Hotel Olomouc from 7th to 9th November 2023 (konferenceszt.cz). Attended by around 500 participants and 200 on-line followers.
The original Czech 🇨🇿 version of the presentation can be found here: https://www.slideshare.net/slideshow/hlavni-novinky-souvisejici-s-ccs-tsi-2023-2023-1695/269688092 .
The videorecording (in Czech) from the presentation is available here: https://youtu.be/WzjJWm4IyPk?si=SImb06tuXGb30BEH .
What is an RPA CoE? Session 1 – CoE VisionDianaGray10
In the first session, we will review the organization's vision and how this has an impact on the COE Structure.
Topics covered:
• The role of a steering committee
• How do the organization’s priorities determine CoE Structure?
Speaker:
Chris Bolin, Senior Intelligent Automation Architect Anika Systems
Your One-Stop Shop for Python Success: Top 10 US Python Development Providersakankshawande
Simplify your search for a reliable Python development partner! This list presents the top 10 trusted US providers offering comprehensive Python development services, ensuring your project's success from conception to completion.
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Distributed Programming and Data Consistency w/ Notes
1. Distributed Programming
and Data Consistency
by Paulo Gaspar
@paulogaspar7
1
Twitter: @paulogaspar7 - http://twitter.com/paulogaspar7
Blog: http://paulogaspar7.blogspot.com/
3. What is Consistency?
3
Our perception of consistency is related to what we know about the system and its state. That is how we judge
what might fit...
4. What isn’t?
4
...and what does not fit. Obviously a person will have a different degree of precision and tolerance than an
automated system.
5. Consistency across time
5
Consistency also has a time axis, with state sequences that make sense...
1 of 3=> Expected event sequence (3 slide animation which SlideShare won’t handle)
6. Consistency across time
6
2 of 3=> Expected event sequence (3 slide animation which SlideShare won’t handle)
7. Consistency across time
7
3 of 3=> Expected event sequence (3 slide animation which SlideShare won’t handle)
8. Inconsistency across time
8
...and state sequences that do NOT make sense.
1 of 3=> UNexpected event sequence (3 slide animation which SlideShare won’t handle)
9. Inconsistency across time
9
2 of 3=> UNexpected event sequence (3 slide animation which SlideShare won’t handle)
10. Inconsistency across time
10
3 of 3=> UNexpected event sequence (3 slide animation which SlideShare won’t handle)
11. Consistency is perception
...and time matters...
11
Again, each (type of) observer will have a different degree of evaluation precision and tolerance to inconsistencies.
13. Data Caching Consistency
Multi-layer caching
The 3 second cache for a “LIVE” site
(e.g.: BBC News live soccer reports)
User changing cached data
Schrodinger’s Cache?
13
Even on a “live” site you can use a short-lived cache. If the user can NOT observe the exact time of each server state
change, are any server-to-client delays (due to caching) really there?
Moreover, it is often a matter of having small update-until-view delays due to caching, or really big ones (or the site
down) due to overload.
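The short-lived cache idea can be sketched as a minimal time-to-live wrapper (illustrative Python only; the class and the 3-second figure are just the example from the slide):

```python
import time

class TTLCache:
    """Minimal expiration-based cache: an entry is served for ttl seconds,
    then refetched from the backend on the next request."""
    def __init__(self, ttl_seconds, fetch):
        self.ttl = ttl_seconds
        self.fetch = fetch          # callable that hits the real backend
        self.store = {}             # key -> (expires_at, value)

    def get(self, key, now=None):
        now = time.time() if now is None else now   # injectable for tests
        entry = self.store.get(key)
        if entry and entry[0] > now:
            return entry[1]         # still fresh: the backend is not touched
        value = self.fetch(key)     # stale or missing: one backend hit
        self.store[key] = (now + self.ttl, value)
        return value

# Even at thousands of requests per second, the backend sees at most one
# fetch per key every 3 seconds; clients may see data up to 3 seconds old.
calls = []
cache = TTLCache(3.0, lambda k: calls.append(k) or ("report-" + k))
cache.get("match-42", now=100.0)   # miss: fetches from backend
cache.get("match-42", now=101.0)   # hit: served from cache
cache.get("match-42", now=104.0)   # expired: fetches again
```

This is the AP-style trade mentioned later in the deck: bounded staleness bought in exchange for availability under load.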
14. Memcached at FB:
You HAVE TO Replicate to Scale-Out
14
An example of how you still might have to replicate in order to scale, even with a very high performance store.
The reason for FB’s issue (might lack some detail):
http://highscalability.com/blog/2009/10/26/facebooks-memcached-multiget-hole-more-machines-more-capacit.html
15. So, now it “Loadbalances”...
15
...and with load balancing, inconsistencies along the time axis can happen (e.g. by reading from alternate out-of-sync
backends)
16. ...but then you can have...
16
With the possibility of state sequences that do NOT make sense.
1 of 3=> UNexpected event sequence (3 slide animation which SlideShare won’t handle)
17. Inconsistency across time
17
2 of 3=> UNexpected event sequence (3 slide animation which SlideShare won’t handle)
18. Inconsistency across time
18
3 of 3=> UNexpected event sequence (3 slide animation which SlideShare won’t handle)
19. ...now it can pick >1 versions!
19
Why you can have inconsistencies along the time axis.
20. Slow and Big Consistency
(The Higher Latency - BigData)
20
21. MapReduce is for embarrassingly
parallel problems with some time...
21
Consistency scenarios, starting with the most “sexy” (the Web, petabytes of data):
* MapReduce works like vote counting - vote mapped to voting tables, counted, “reduced” to stats;
* MR is appropriate for "embarrassingly parallel" tasks, like indexing the Internet and other huge processing tasks;
* We should use it whenever possible;
* There is a lot to be learned about Map Reduce:
- Evaluation and expression of candidate problems;
- Build and manage its infrastructure;
- etc.
* Even MR has coordination needs;
* Even MR should have SLAs (Service Level Agreements).
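The vote-counting analogy can be sketched as a toy map/shuffle/reduce pipeline (illustrative Python only; a real Hadoop job would implement Mapper and Reducer classes against the framework):

```python
from collections import defaultdict

def map_phase(ballots):
    # Map: each ballot is emitted as a (candidate, 1) pair,
    # like a vote being routed to a counting table.
    return [(candidate, 1) for candidate in ballots]

def shuffle(pairs):
    # Shuffle: group all pairs by key -- in a real cluster this is the
    # network move of each key's pairs to the reducer that owns it.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: collapse each group to a total, yielding the final stats.
    return {key: sum(values) for key, values in groups.items()}

ballots = ["ada", "bob", "ada", "ada", "bob"]
totals = reduce_phase(shuffle(map_phase(ballots)))
# totals == {"ada": 3, "bob": 2}
```

The map and reduce steps are embarrassingly parallel; the shuffle in between is exactly the coordination cost the notes warn about.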
22. MapReduce Implementations
(& Cia.)
Google, coordination by Chubby using Paxos.
Used only at Google;
Google BigTable is a Wide Column Store which works
on top of GoogleFS. Used only at Google;
Hadoop, used at Amazon, Facebook, Rackspace,
Twitter, Yahoo!, etc.;
Hadoop ZooKeeper implements a Paxos variation and
is used at Rackspace, Yahoo!, etc.;
Hadoop HBase is a Wide Column Store, on top of
HDFS and now uses ZooKeeper. Used at Yahoo! etc.
22
Parallel between Google’s internally developed systems and their Hadoop counterparts.
http://hadoop.apache.org/
http://labs.google.com/papers/
The very interesting “coordinators”:
http://labs.google.com/papers/chubby.html
http://hadoop.apache.org/zookeeper/
Zookeeper sure looks like a very interesting and reusable piece of software.
Curiosity: HBase has been faster since it started using ZooKeeper... is that also because of ZooKeeper???
http://hadoop.apache.org/hbase/
24. Two “High”/Sexy reasons for
Distributing Data Storage
(not just cache)
High Performance Data Access
(Read / Write)
High Availability (HA)
24
25. Why care about HA?
1.7% HDDs fail in the 1st year, 8.6% in the 3rd (Google)
Unrecoverable RAM errors/year: 1.3% machines,
0.22% DIMM (Google)
Router, Rack, PDU, misc. network failures
Over 4 nines only through redundancy, best hardware
never good enough (James Hamilton-MS and Amazon)
25
Sources:
For Google’s numbers check the slideware at:
http://videolectures.net/wsdm09_dean_cblirs/
For the James Hamilton quote:
http://mvdirona.com/jrh/TalksAndPapers/JamesRH_Ladis2008.pdf
Another very quoted paper with Google’s DRAM failure stats and patterns:
http://research.google.com/pubs/pub35162.html
You can find other HA and Systems related papers from Google and James Hamilton at:
http://mvdirona.com/jrh/work/
http://research.google.com/pubs/DistributedSystemsandParallelComputing.html
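The “over 4 nines only through redundancy” point is simple arithmetic: assuming independent failures, n replicas each available with probability a give 1 - (1 - a)^n. A back-of-the-envelope sketch (independence rarely holds in practice — racks share PDUs and switches, as the slide lists):

```python
def combined_availability(a, n):
    """Availability of n independent replicas, each up with probability a:
    the system is down only when all n are down at once."""
    return 1 - (1 - a) ** n

# A single 99%-available machine gives two nines; a second replica
# already pushes past four nines, a third past six -- IF failures
# really are independent, which correlated rack/PDU failures break.
for n in (1, 2, 3):
    print(n, combined_availability(0.99, n))   # ~0.99, ~0.9999, ~0.999999
```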
26. Why care about Latency?
Google: Half a second delay caused a 20% drop in
traffic (30 results instead of 10, via Marissa Mayer);
Amazon found every 100ms of latency costs 1% sales
(via Greg Linden);
A broker could lose $4 million in revenues per
millisecond if their electronic trading platform is 5 ms
behind the competition (via NYT).
26
You can find all these references through this page (if you follow the links):
http://highscalability.com/latency-everywhere-and-it-costs-you-sales-how-crush-it
Including these:
http://glinden.blogspot.com/2006/11/marissa-mayer-at-web-20.html
http://perspectives.mvdirona.com/2009/10/31/TheCostOfLatency.aspx
http://www.nytimes.com/2009/07/24/business/24trading.html?_r=2&hp
27. Other Distributed Data Contexts
(the less sexy daily stuff)
EAI / B2B / Systems Integration
Geographic Distribution (e.g.:Health System+Hospitals)
Systems with n-tier / SOA Architectures
27
The daily jobs of many IT professionals relate much more to these common kinds of distributed systems than to the
sexier kind we talked about before. But these fields too would benefit from learning the lessons and using the
technologies we are talking about.
28. Fallacies of Distributed Computing
1. The network is reliable;
2. Latency is zero;
3. Bandwidth is infinite;
4. The network is secure;
5. Topology doesn't change;
6. There is one administrator;
7. Transport cost is zero;
8. The network is homogeneous.
28
Just to remember this classic on the HA challenges. A few more details at:
http://en.wikipedia.org/wiki/Fallacies_of_Distributed_Computing
29. CAP Theorem History
1999: 1st mention in the “Harvest, Yield and Scalable Tolerant Systems”
paper by Eric A. Brewer (Berkeley/Inktomi) and Armando Fox (Stanford/Berkeley)
2000-07-19: Brewer’s CAP Conjecture part of Brewer’s keynote to the PODC
Conference
2002-06: Brewer’s CAP Theorem proof published by Seth Gilbert (MIT) and
Nancy Lynch (MIT)
2007-10-02: “Amazon's Dynamo” post by Werner Vogels
(Amazon’s CTO) quoting the paper:
Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash
Lakshman, Alex Pilchin, Swami Sivasubramanian, Peter Vosshall and Werner Vogels,
“Dynamo: Amazon's Highly Available Key-Value Store”, in the Proceedings of the 21st
ACM Symposium on Operating Systems Principles, Stevenson, WA, October 2007.
2007-12-19: “Eventually Consistent” post by Werner Vogels (Amazon’s CTO)
29
The online book “CouchDB: The Definitive Guide” has an interesting introduction to these concepts - the “Eventual
Consistency” chapter:
http://books.couchdb.org/relax/intro/eventual-consistency
Really essential and truly amazing is the Dynamo paper by Werner Vogels et al, proof that BASE really works in
truly industrial sites, even with stats describing real life behavior:
http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
...and the now famous Eventually Consistent post by Werner Vogels:
http://www.allthingsdistributed.com/2007/12/eventually_consistent.html
If you dislike the introductory (justifiable) drama, just jump to the next part because this article, by Julian Browne,
is the best I found about the Brewer’s CAP Theorem and its history:
http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
You should still take a look at:
* The 1997 “Cluster-Based Scalable Network Services” paper (Brewer et al.) where the BASE vs ACID dilemma is
already mentioned:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.1.2034&rep=rep1&type=pdf
* The 1999 “Harvest, Yield and Scalable Tolerant Systems” paper (Brewer et al.) where the CAP conjecture is already
mentioned:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.24.3690&rep=rep1&type=pdf
* The PODC 2000 keynote, by Brewer, that made the CAP conjecture and the BASE concept “popular”:
http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf
* You might also see with your own eyes how CAP became a proved Theorem:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.20.1495&rep=rep1&type=pdf
Definition of ACID:
http://en.wikipedia.org/wiki/ACID
30. The CAP Theorem
strong Consistency, high Availability, Partition-resilience:
pick at most 2
30
I simply had to put The Diagram, of course.
31. Eventual Consistency for
Availability
BASE (Basically Available, Soft-state, Eventual consistency):
Weak consistency (stale data OK); Availability first; Best effort;
Approximate answers OK; Aggressive (optimistic); Faster.
ACID (Atomicity, Consistency, Isolation, Durability):
Strong consistency (NO stale data); Isolation; Focus on “commit”;
Availability?; Conservative (pessimistic); Safer.
31
You can find a variation of this slide at Brewer’s 2000’s PODC keynote at:
http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf
I skipped these rather controversial bits:
ACID: * Nested transactions; * Difficult evolution (e.g. schema)
BASE: * Simpler! * Easier evolution
I have already tried both ways (data stores with and without a schema) and I'd rather have some schema mechanism for the most
complex stuff.
ACID:
A)tomicity
Either all of the tasks of a transaction are performed or none of them are.
C)onsistency
A database remains in a consistent state before the start of the transaction and after the transaction is over (whether successful
or not).
I)solation
Other operations cannot access or see the data in an intermediate state during a transaction.
D)urability
Once the user has been notified of success, the transaction will persist. This means it will survive system failure, and that the
database system has checked the integrity constraints and won't need to abort the transaction.
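Atomicity is easy to see first-hand with SQLite from the Python standard library (a minimal illustration, not part of the talk): a transaction that fails midway rolls back all of its statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 0)])
conn.commit()

try:
    with conn:  # one transaction: commits on success, rolls back on exception
        conn.execute(
            "UPDATE accounts SET balance = balance - 50 WHERE name = 'alice'")
        # Simulate a crash between the debit and the matching credit:
        raise RuntimeError("crash mid-transfer")
except RuntimeError:
    pass

# Atomicity: the lone debit was rolled back with the failed transaction,
# so no money ever leaves alice without reaching bob.
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
# balances == {"alice": 100, "bob": 0}
```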
32. CAP Trade-offs
CA without P: Databases providing distributed transactions can
only do it while their network is ok;
CP without A: While there is a partition, transactions to an ACID
database may be blocked until the partition heals
(to avoid merge conflicts -> inconsistency);
AP without C: Caching provides client-server partition resilience
by replicating data, even if the partition prevents verifying if a
replica is fresh. In general, any distributed DB problem can be
solved with either:
expiration-based caching to get AP;
or replicas and majority voting to get PC
(minority is unavailable).
32
Concept introduced in the 1999 “Harvest, Yield and Scalable Tolerant Systems” paper (Brewer et al.):
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.24.3690&rep=rep1&type=pdf
I should probably skip this slide during a live presentation. This is stuff you have to read about.
33. Living with CAP
All systems are probabilistic, whether they realize it or not
And so are Distributed Transactions (2 Generals Problem)
Weak CAP Principle: The stronger the guarantees made
about any two of C, A and P, the weaker the guarantees
that can be made about the third
Systems should degrade gracefully, instead of all or
nothing (e.g.: displaying data from available partitions)
Life is Eventually Consistent
Aim for Eventual Consistency
33
Steve Yen clearly illustrates the “Life is Eventually Consistent” idea on the slideware (slides 40 to 45) he used for
his “No SQL is a Horseless Carriage” talk at NoSQL Oakland 2009:
http://dl.dropbox.com/u/2075876/nosql-steve-yen.pdf
The Weak CAP Principle was introduced in the 1999 “Harvest, Yield and Scalable Tolerant Systems” paper (Brewer et al.):
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.24.3690&rep=rep1&type=pdf
To understand how hard (ACID) Distributed Transactions are, you have an excellent history of the concepts related
to this problem here:
http://betathoughts.blogspot.com/2007/06/brief-history-of-consensus-2pc-and.html
The difficulties of (ACID) Distributed Transactions are well illustrated by the classic Two Generals’ Problem:
http://en.wikipedia.org/wiki/Two_Generals'_Problem
Leslie Lamport et al further explore the problem (and its solutions) on the classic “The Byzantine Generals Problem”
paper:
http://research.microsoft.com/en-us/um/people/lamport/pubs/byz.pdf
And if you think that Two Phase Commit is a 100% reliable mechanism... think again:
http://www.cs.cornell.edu/courses/cs614/2004sp/papers/Ske81.pdf
This is just to illustrate the difficulty of the problem. There are more reliable mechanisms, like Three Phase
Commit:
http://en.wikipedia.org/wiki/Three-phase_commit_protocol
http://ei.cs.vt.edu/~cs5204/fall99/distributedDBMS/sreenu/3pc.html
...or the so called Paxos Commit:
http://research.microsoft.com/pubs/64636/tr-2003-96.pdf
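To make the Two Phase Commit discussion concrete, here is a toy coordinator sketch (illustrative only; real 2PC needs durable logs and recovery, and — as the Skeen paper above shows — can still block if the coordinator dies after the prepare phase):

```python
def two_phase_commit(participants):
    """participants: objects with prepare() -> bool, commit(), rollback().
    Phase 1: ask every participant to prepare (vote).
    Phase 2: commit only if ALL voted yes, otherwise roll back."""
    prepared = []
    for p in participants:
        if p.prepare():
            prepared.append(p)
        else:
            # Any 'no' vote aborts the whole transaction:
            # undo the participants that had already prepared.
            for q in prepared:
                q.rollback()
            return False
    for p in participants:
        p.commit()
    return True

class Node:
    """Hypothetical participant that votes according to a flag."""
    def __init__(self, vote):
        self.vote, self.state = vote, "idle"
    def prepare(self):
        self.state = "prepared" if self.vote else "aborted"
        return self.vote
    def commit(self):
        self.state = "committed"
    def rollback(self):
        self.state = "rolled-back"

nodes = [Node(True), Node(True), Node(False)]
outcome = two_phase_commit(nodes)   # False: the third node voted no
```

Note what the sketch cannot show: between prepare() and commit() every yes-voter holds its locks and waits, which is exactly where the blocking problem lives.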
34. CAP Theorem History
1999: 1st mention on the “Harvest, Yield, and Scalable Tolerant Systems”
paper by Eric A. Brewer (Berkeley/Inktomi) and Armando Fox (Stanford/Berkeley)
2000-07-19: Brewer’s CAP Conjecture part of Brewer’s keynote to the PODC
Conference
2002-06: Brewer’s CAP Theorem proof published by Seth Gilbert (MIT) and
Nancy Lynch (MIT)
2007-10-02: “Amazon's Dynamo” post by Werner Vogels
(Amazon’s CTO) quoting the paper:
Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash
Lakshman, Alex Pilchin, Swami Sivasubramanian, Peter Vosshall and Werner Vogels,
“Dynamo: Amazon's Highly Available Key-Value Store”, in the Proceedings of the 21st
ACM Symposium on Operating Systems Principles, Stevenson, WA, October 2007.
2007-12-19: “Eventually Consistent” post by Werner Vogels (Amazon’s CTO)
34
Repeated slide (to pass focus from CAP to Dynamo and Eventual Consistency). The notes are the same as on the first
“CAP Theorem History” slide above.
35. Amazon’s Dynamo DB
Also a “Wide Column Store”
Problem -> Technique
Partitioning -> Consistent hashing
High availability for writes -> Vector clocks with reconciliation during reads
Handling temporary failures -> Sloppy quorum and hinted handoff (NRW)
Recovering from permanent failures -> Anti-entropy using Merkle trees
Membership and failure detection -> Gossip-based membership protocol and failure detection
35
The source here is the already mentioned Dynamo paper:
http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
Strict distributed DBs, rather than dealing with uncertainty about the correctness of an answer, make the data
unavailable until it is absolutely certain to be correct.
At Amazon, SLAs are expressed and measured at the 99.9th percentile of the distribution - avg or median not good
enough to provide a good experience for all. The choice for 99.9% over an even higher percentile has been made based
on a cost-benefit analysis which demonstrated a significant increase in cost to improve performance that much.
Experiences with Amazon’s production systems have shown that this approach provides a better overall experience
compared to those systems that meet SLAs defined based on the mean or median.
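The first technique in the table, consistent hashing, can be sketched as a simple hash ring (illustrative Python; Dynamo additionally uses virtual nodes for load balance, which this sketch omits):

```python
import hashlib
from bisect import bisect_right

def ring_hash(key):
    # Stable hash of a string onto ring positions [0, 2^32).
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2 ** 32)

class HashRing:
    def __init__(self, nodes):
        # Place each node at its hash position, sorted around the ring.
        self.ring = sorted((ring_hash(n), n) for n in nodes)

    def node_for(self, key):
        # Walk clockwise to the first node at or after the key's position,
        # wrapping around past the top of the ring.
        points = [p for p, _ in self.ring]
        i = bisect_right(points, ring_hash(key)) % len(self.ring)
        return self.ring[i][1]

# Adding a node only remaps the keys that fall on the new node's arc;
# every other key keeps its owner -- the point of consistent hashing.
small = HashRing(["node-a", "node-b", "node-c"])
big = HashRing(["node-a", "node-b", "node-c", "node-d"])
```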
36. N: number of nodes each item is replicated to;
W: number of nodes required for write success;
R: number of nodes required for read success.
W < N: the remaining nodes receive the write later.
R < N: the remaining nodes are ignored.
36
Also based in the already mentioned Dynamo paper:
http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
...but you can find a similar diagram and similar mechanisms described about several (NoSQL) databases that
partially clone Dynamo.
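The N/W/R interplay can be made concrete with a small simulation (an illustrative sketch, not Dynamo's code): with W + R > N every read quorum overlaps every write quorum, so at least one polled replica holds the latest version.

```python
def write(replicas, w, value, version):
    # Success is reported once W replicas acknowledge; with W < N the
    # remaining replicas get the write later (here: not yet at all).
    for rep in replicas[:w]:
        rep.update(value=value, version=version)

def read(replicas, r):
    # Poll R replicas (here the LAST r, the worst case for overlap)
    # and keep the newest version seen; with R < N the rest are ignored.
    return max(replicas[-r:], key=lambda rep: rep["version"])

N = 3
replicas = [{"value": "old", "version": 1} for _ in range(N)]
write(replicas, w=2, value="new", version=2)

fresh = read(replicas, r=2)   # W + R = 4 > N: quorums overlap, sees "new"
stale = read(replicas, r=1)   # W + R = 3 <= N: may miss the write, sees "old"
```

Lowering R or W buys latency and availability at the price of exactly the stale reads the slide describes.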
37. Wikipedia image
Merkle Tree / Hash Tree
Used to verify / compare a set of data blocks
and efficiently find where the mismatches are.
37
Also based in the already mentioned Dynamo paper:
http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
...and on the Wikipedia article about this algorithm:
http://en.wikipedia.org/wiki/Hash_tree
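A minimal hash-tree sketch (illustrative Python): block hashes are combined pairwise up to a root, and two replicas can locate a differing block by descending only the branches whose hashes disagree.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(blocks):
    """Build the hash tree bottom-up and return the root hash."""
    level = [h(b) for b in blocks]
    while len(level) > 1:
        if len(level) % 2:                 # duplicate last hash on odd levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def mismatched_blocks(xs, ys, lo=0, hi=None):
    # Descend only where subtree roots differ. (This sketch recomputes
    # hashes for brevity; a real implementation caches the tree.)
    hi = len(xs) if hi is None else hi
    if merkle_root(xs[lo:hi]) == merkle_root(ys[lo:hi]):
        return []                          # identical subtree: prune it
    if hi - lo == 1:
        return [lo]                        # reached a differing leaf block
    mid = (lo + hi) // 2
    return mismatched_blocks(xs, ys, lo, mid) + mismatched_blocks(xs, ys, mid, hi)

a = [b"block0", b"block1", b"block2", b"block3"]
b = [b"block0", b"block1", b"blockX", b"block3"]
# Comparing one root hash answers "do we agree?"; descending the tree
# answers "where do we disagree?" without shipping all the data.
```

This is the anti-entropy mechanic from the Dynamo table: replicas exchange roots, then only the branches that differ.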
38. Wikipedia image
Vector Clocks
On each internal event a process increments its logical clock;
Before sending a message, it increments its own clock in the
vector and sends it with the message;
On receiving a message, it increments its clock and updates
each element on its own vector to max.(own, msg).
38
Also based in the already mentioned Dynamo paper:
http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
...and on the Wikipedia article about this algorithm:
http://en.wikipedia.org/wiki/Vector_clock
Vector Clocks (and other similar algorithms) have a predecessor in Lamport timestamps, introduced in the classic
paper “Time, Clocks, and the Ordering of Events in a Distributed System” by Leslie Lamport:
http://en.wikipedia.org/wiki/Lamport_timestamps
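The three rules on the slide translate almost line for line into code (a minimal sketch; in Dynamo such clocks are attached to object versions rather than processes):

```python
class Process:
    def __init__(self, pid, n):
        self.pid, self.clock = pid, [0] * n

    def internal_event(self):
        # Rule 1: a local event increments the process's own component.
        self.clock[self.pid] += 1

    def send(self):
        # Rule 2: increment own component, then ship the clock with the message.
        self.clock[self.pid] += 1
        return list(self.clock)

    def receive(self, msg_clock):
        # Rule 3: increment own component, then take the element-wise max.
        self.clock[self.pid] += 1
        self.clock = [max(a, b) for a, b in zip(self.clock, msg_clock)]

def happened_before(a, b):
    # a -> b iff a <= b element-wise and a != b; neither way means concurrent,
    # which is exactly the case Dynamo must reconcile on read.
    return all(x <= y for x, y in zip(a, b)) and a != b

p0, p1 = Process(0, 2), Process(1, 2)
p0.internal_event()
msg = p0.send()        # p0.clock == [2, 0]
p1.receive(msg)        # p1.clock == [2, 1]
```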
39. Amazon Dynamo Lessons
(according to the paper)
Data returned to Shopping Cart 24h profiling:
0.00057% of requests saw 2 versions; 0.00047% of
requests saw 3 versions and 0.00009% of requests
saw 4 versions.
In two years, applications have received successful
responses (without timing out) for 99.9995% of their
requests, and no data loss event has occurred to date;
With coordination via Gossip protocol it is harder to
scale further than a few hundred nodes.
(Could be better w/ Chubby / ZK like coordinators?)
39
Also based in the already mentioned Dynamo paper:
http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
Wikipedia has an article on Gossip Protocols (although, at the data I write this, not as precise as other Wikipedia
articles I just quoted):
http://en.wikipedia.org/wiki/Gossip_protocol
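Since the notes only reference gossip protocols without describing them, here is a hedged sketch of the basic push variant and of why it gets chatty at scale (the fanout of 2, the node count and the versions-as-integers model are illustrative assumptions of mine, not from the Dynamo paper):

```python
import random

def gossip_round(states, fanout=2):
    """One synchronous push round: every node sends its current value to
    `fanout` random peers; receivers keep the newest (largest) version."""
    pushes = []
    for sender, value in enumerate(states):
        peers = [n for n in range(len(states)) if n != sender]
        for peer in random.sample(peers, fanout):
            pushes.append((peer, value))
    for peer, value in pushes:
        states[peer] = max(states[peer], value)

random.seed(42)
states = [0] * 100      # 100 nodes, all holding the stale version 0
states[0] = 1           # a single node learns an update
rounds = 0
while min(states) == 0:  # the rumor spreads in roughly O(log N) rounds...
    gossip_round(states)
    rounds += 1
# ...but every round costs N * fanout messages whether or not anything is
# new, which is the steady per-node overhead that makes pure gossip
# coordination costly to scale much past a few hundred nodes.
```

Convergence is fast and very robust to node failures, but the constant background traffic is the trade-off the next paragraph’s coordinator-based alternatives try to avoid.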
The solution I mention as a possibly more scalable alternative to Gossip Protocols for consensus is the use of
Paxos (or its derivatives) coordinators, like Google’s proprietary Chubby or the open source Apache ZooKeeper
(from the Hadoop project).
When I first wrote and used these slides (at my SAPO Codebits 2009 talk), the only support I had for my (then
intuitive) belief that these more directed approaches should be more efficient than Gossip Protocols was section
6.6 of the Dynamo paper - the paper even mentions the possibility of “introducing hierarchical extensions to
Dynamo”.
Thanks to my SAPO Codebits talk I met Henrique Moniz, then a Ph.D. student at the University of Lisbon. After I
discussed this issue (consensus scalability) with him he pointed me to a couple of interesting papers, one of which
immediately captured my attention:
* Gossip-based broadcast protocols by João Leitão
http://www.gsd.inesc-id.pt/~jleitao/pdf/masterthesis-leitao.pdf
This paper offers a more complete description of the overhead of gossip protocols and, to my surprise, also points
out a few reliability weak spots in known Gossip Protocols. The paper goes on to present a more robust and
efficient Gossip Protocol, called “HyParView”, that uses a more “directed” approach.
HyParView sure looks like an interesting solution in terms of robustness for environments with a high incidence
of system/network failures, but I still believe that using coordinators will be more efficient in a well controlled data
center.
Not that using coordinators and making them scale out BIG is exactly trivial, as you can read here:
- On the “Vertical Paxos and Primary-Backup Replication” paper, by Leslie Lamport et al., that Henrique Moniz
pointed me to:
http://research.microsoft.com/pubs/80907/podc09v6.pdf
- Or on this interesting article from Cloudera’s blog about the (now upcoming) Observers feature of Apache
40. Eventually Consistent Systems
Banks
EAI Integrations
Many messaging based (SOA) systems
Google
Amazon
Etc.
40
Unlike what many examples say, banks often use Eventual Consistency for many (limited value/risk) transactions -
or use “large” periodic fixed transaction/compensation windows to process large numbers of higher value
movements. So much for those classic ACID transaction examples...
42. Immediately Consistent Systems
Trading
Online Gambling
Data-grids (Coherence, Gigaspaces):
All Data in RAM
Can do ACID
Very High Speed
Max. Scale-out
42
Trading and Online Gambling really need to do large volumes of fast ACID transactions and are the big customers
of Data Grids.
Why Online Gambling needs ACID transactions has everything to do with the type of game and the type of
rules/assets (some virtual) it involves.
Why Trading really needs ACID is a bit more obvious: you might be able to compensate an overdraft at a bank
(more so for limited values) but you really cannot sell shares you do not have for sale.
The performance needs are obvious for both too. For Trading there are even some new reasons, like (again):
http://www.nytimes.com/2009/07/24/business/24trading.html?_r=2&hp
44. NoSQL Taxonomy
by Steve Yen [PG]
key‐value‐cache: memcached, repcached, coherence [?], infinispan, eXtreme scale, jboss
cache, velocity, terracota [???]
key‐value‐store: keyspace [w/Paxos], flare, schema‐free, RAMCloud [, Mnesia (Erlang),
Chordless]
eventually‐consistent key‐value‐store: dynamo, Voldemort, Dynomite, SubRecord,
MotionDb, Dovetaildb
ordered‐key‐value‐store: tokyo tyrant[, BerkleyDB], lightcloud, NMDB, luxio, memcachedb,
actord
data‐structures server: redis
tuple‐store: gigaspaces [?], coord, apache river
object database: ZopeDB, db4o, Shoal
document store: CouchDB [evC, MVCC], MongoDB [evC], Jackrabbit, XML Databases,
ThruDB, CloudKit, Perservere, Riak Basho [evC], Scalaris [Erlang, w/Paxos]
wide columnar store: BigTable, Hadoop HBase [w/ Zookeeper], [Amazon Dynamo-evC, ]
Cassandra [evC], Hypertable, KAI, OpenNeptune, Qbase, KDI
[graph database: Neo4J, Sones, etc.]
44
From Steve Yen’s slideware (slide 54) he used for his “No SQL is a Horseless Carriage” talk at NoSQL Oakland 2009:
http://dl.dropbox.com/u/2075876/nosql-steve-yen.pdf
I do not completely understand or agree with Steve’s criteria, but it sure is a possible starting point for building a
database/storage taxonomy.
The stuff in square brackets is mine. “evC” means Eventually Consistent and “?” just means I have doubts / don’t
understand some specific classification.
46. Cases to talk about
Analytics
Live soccer game site (like BBC News did)
Log like / timeline systems
(forums, healthcare, Twitter, etc.)
EAI Integrations
(Should use Vector Clocks?)
Zookeeper at the “Farm” (Config./Coord.)
Logistics Planning across EU
Trading
46
This is the placeholder slide to exercise the ideas and discuss possible applications of some of the mechanisms
presented in this talk (I had no time at Codebits... still tuning this not-so-easy presentation).
Except for the last two scenarios (and the Twitter alternative on the “Log like” one) all others represent quite
common types of problems which you can meet without having to work for a Fortune Top 50 company or for a
mega web portal / service. Even an “Analytics” workload with enough data to justify using MapReduce is common
enough. Many large (but not necessarily huge) companies simply give up on doing more with the data they have
because of the trouble of finding a way to do that “more”.
* “Analytics” (lots of data + already easy on consistency) currently seems to be the playground of MapReduce, with
Hadoop stuff being used “everywhere”. Look at how many times you can find the words “analytics” or
“analysis” (and “MapReduce”) on these “Powered by” Hadoop web pages:
http://wiki.apache.org/hadoop/PoweredBy
http://wiki.apache.org/hadoop/Hbase/PoweredBy
* “Live soccer game...” is a nice problem for discussing short-lived caching and its consistency issues;
* “Log like / timeline systems...” are systems where information is mostly “insert only” and most of the effort to
keep consistency is related to keeping proper ordering information (timestamps are usually enough), properly
merging the data from different sources and respecting the explicit or implicit SLAs on data synchronization.
Obviously, there are different difficulties across the several cases mentioned here, depending on data flow, the
necessary performance, etc.;
* “EAI Integrations” often need better knowledge about ordering and are not as simple as the previous scenario.
Due to factors like the use of asynchronous and event driven mechanisms and the possibility of having updates
for a given document across multiple steps of one (or multiple) process(es), a timestamp is often too limited as
ordering information... but it is often the most you get. IMO this is a good scenario for using Vector Clocks and
company;
* “Zookeeper” is a great system even if “just” to configure the simplest web (or webservice) farm, to coordinate the
simplest cross farm operations (e.g.: cache related) or just for each server to know which are its peers;
* “Logistics Planning” is a complex scenario which demands a mix of solutions. It revolves around a logistics
company which transports goods across Europe, with planning offices in different countries. I will probably have
to remove it from this slide for any future talk I might give on this topic, even if it is the most interesting of them
all. So, it does not make much sense to develop it here (maybe a blog post since, to me, this is a >10 year old