Distributed Data Systems

A brief overview of distributed data systems in the context of analytics ingestion data pipelines.

Distributed Data Systems
How Do They Even?

About Me - Jared L Kerim
- Software Developer (Python)
- Mozilla Geolocation Cloud Services Team
- CTO at PressureNET

PressureNET (Shameless Plug)
- Gathers sensor data from smartphones
- Constant stream of data to servers
- API to retrieve data
- Visualization
- Analysis

The First Architecture
Sensors -> Web Servers -> MySQL -> API

The Problem: MySQL
- Slow lookups
- Takes a lot of disk space
- Cost (Large Relational DBs are expensive)
- Schema changes (become slow or impossible)

How Big is “Big”
- PressureNET: 100 req/s, 1.5 billion records
- Analytics systems: 5,000 req/s, hundreds of billions of records
- Ad buying service: 500k req/s, trillions of records
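
To get a feel for these numbers, the rough arithmetic below converts a request rate into records per day. It is only a back-of-the-envelope sketch: it assumes one stored record per request and a constant rate, neither of which the slides state, and the function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def records_per_day(requests_per_second):
    """Records stored per day, assuming one record per request."""
    return requests_per_second * SECONDS_PER_DAY


def days_to_accumulate(total_records, requests_per_second):
    """Days needed to reach total_records at a constant request rate."""
    return total_records / records_per_day(requests_per_second)


pressurenet_daily = records_per_day(100)           # 8,640,000 records per day
pressurenet_days = days_to_accumulate(1.5e9, 100)  # a little under 174 days
```

Under the same assumptions, 500k req/s works out to about 43 billion records per day, which is why the ad buying case reaches trillions of records.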

The Question
What is ????
Sensors -> Web Servers -> ???? -> API

What do we want to accomplish?
- Receive and store large amounts of data
- Access it quickly
- Small fast lookups (visualization)
- Large batch computations (mapreduce)

Considerations
- Durability (we don’t want to lose data)
- Redundancy (expect failures!)
- Scalability (simple growth, no upper limit)

Durability
- Data in a durable store should be ‘safe’
- Don’t remove data from one durable data store until it is confirmed to be in another durable data store
- Durable data stores should have redundant backups (hot standbys)
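
The "confirm before you remove" rule can be sketched as a copy, verify, delete sequence. This is a minimal illustration only: the two stores are plain dicts standing in for durable stores, and `move_record` is a hypothetical helper, not something from the talk.

```python
def move_record(key, source, destination):
    """Move one record between stores.

    The record is deleted from the source only after a read-back
    confirms the destination holds an identical copy.
    """
    value = source[key]
    destination[key] = value            # 1. write to the new store
    if destination.get(key) != value:   # 2. confirm by reading it back
        raise RuntimeError(f"write of {key!r} not confirmed; source left intact")
    del source[key]                     # 3. only now remove the original


# Dicts stand in for real durable stores (an assumption for this sketch).
queue_store = {"reading-1": {"pressure_hpa": 1013.2}}
archive_store = {}
move_record("reading-1", queue_store, archive_store)
```

If the confirmation step fails, the record is still in the source, so a retry loses nothing.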

Redundancy
- Each stage of your system should have multiple copies
- If one copy goes down, another should take over
- Redundancy ensures availability
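
One simple way to picture "another should take over" is client-side failover across copies. The sketch below assumes each copy is a callable that raises `ConnectionError` when it is down; the node functions are hypothetical stand-ins, and real systems often do this in a load balancer instead.

```python
def call_with_failover(replicas, request):
    """Try each replica in turn and return the first successful response."""
    errors = []
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError as exc:
            errors.append(exc)  # this copy is down; try the next one
    raise ConnectionError(f"all {len(replicas)} replicas failed: {errors}")


def down_node(request):
    """Stand-in for a copy that is currently unreachable."""
    raise ConnectionError("node unreachable")


def healthy_node(request):
    """Stand-in for a copy that is up."""
    return f"handled {request}"


result = call_with_failover([down_node, healthy_node], "GET /readings")
```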

Scalability
- The rate of data intake can grow or spike
- Your system should be able to add more resources to handle that growth
- Requires that your workload be partitionable
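
A partitionable workload is one where each record can be routed to a worker on its own. A common approach, shown here as a sketch and not as the method used in the talk, is to hash a key and take the result modulo the partition count. The device IDs and the `partition_for` name are illustrative assumptions.

```python
import hashlib


def partition_for(key, num_partitions):
    """Map a string key to a stable partition index in [0, num_partitions)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


# Spread 1,000 hypothetical device IDs over 4 partitions.
device_ids = [f"device-{n}" for n in range(1000)]
counts = [0] * 4
for device_id in device_ids:
    counts[partition_for(device_id, 4)] += 1
```

The same key always lands on the same partition, so any worker can compute the routing without shared state. Note that plain modulo hashing remaps most keys when the partition count changes; consistent hashing is one alternative when that matters.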

Proposed Architecture
Sensors -> Ingestors -> Queue -> Aggregator -> S3, DynamoDB

We Are Not Alone
- This architecture is widely adopted
- Analytics
- Ad Serving/Views
- Log Analysis
- Sensor Data
- Game Events
- Video Events

Ingestors
- A redundant, scalable set of nodes which receive data over HTTP
- Can apply early validation and authentication
- Stateless, low latency
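
The ingestor's job can be sketched as a pure function: check credentials, validate the payload, hand it to the queue, and return a status code. Everything specific here is an assumption for illustration: the `pressure_hpa` field, the hard-coded key set, and the `enqueue` callback are not from the talk, and a real ingestor would sit behind an HTTP server.

```python
import json

VALID_API_KEYS = {"demo-key"}  # hypothetical; a real system would look keys up elsewhere


def ingest(api_key, body, enqueue):
    """Validate one request and enqueue it; return an HTTP-style status code.

    The function keeps no state between calls, so any number of
    identical ingestor nodes can run side by side.
    """
    if api_key not in VALID_API_KEYS:
        return 401  # early authentication
    try:
        record = json.loads(body)
    except json.JSONDecodeError:
        return 400  # early validation: body is not JSON
    value = record.get("pressure_hpa") if isinstance(record, dict) else None
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return 400  # early validation: missing or non-numeric reading
    enqueue(record)
    return 202  # accepted for later processing


queue = []
status = ingest("demo-key", '{"pressure_hpa": 1013.2}', queue.append)
```

Rejecting bad requests at this stage keeps invalid data out of the queue and everything downstream of it.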

Queue
- A scalable, durable storage mechanism for data ‘in flight’
- Only holds data temporarily
- Typically preserves the order data was received in
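
The queue's contract (temporary storage, arrival order, removal only once the consumer is done) can be sketched with an acknowledge-style interface. This toy class is in-memory and therefore not durable; it only illustrates the interface, and the class and method names are hypothetical.

```python
from collections import deque


class InFlightQueue:
    """Toy FIFO queue: items stay queued until the consumer acknowledges them."""

    def __init__(self):
        self._items = deque()

    def put(self, item):
        """Append an item at the tail, preserving arrival order."""
        self._items.append(item)

    def peek_batch(self, max_items):
        """Return up to max_items of the oldest items without removing them."""
        return [self._items[i] for i in range(min(max_items, len(self._items)))]

    def ack(self, count):
        """Drop the count oldest items once the consumer has stored them."""
        for _ in range(min(count, len(self._items))):
            self._items.popleft()

    def __len__(self):
        return len(self._items)


q = InFlightQueue()
for reading in (1011.0, 1012.5, 1013.2):
    q.put(reading)
batch = q.peek_batch(2)  # the two oldest readings, still held in the queue
q.ack(len(batch))        # remove them only after they are stored elsewhere
```

Separating `peek_batch` from `ack` means a consumer that crashes mid-batch leaves the data in the queue for another worker to pick up.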

Aggregator
- A scalable, stateless set of workers which consume data from the queue
- Can process data in small batches
- Writes raw or transformed data to persistent storage such as S3, databases, etc.
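
A worker's loop can be sketched as: take a small batch, transform it, and write the result to storage. The example below is an assumption-laden illustration: the `pressure_hpa` field and the per-batch mean are invented for the example, and `write_batch` is a hypothetical callback standing in for an S3 or database write.

```python
from itertools import islice


def batches(records, batch_size):
    """Yield successive lists of at most batch_size records."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def aggregate(records, batch_size, write_batch):
    """Summarise each batch, pass it to write_batch, and return the batch count."""
    written = 0
    for batch in batches(records, batch_size):
        values = [record["pressure_hpa"] for record in batch]
        write_batch({
            "count": len(values),
            "mean_pressure_hpa": sum(values) / len(values),
        })
        written += 1
    return written


stored = []  # a list stands in for persistent storage
readings = [{"pressure_hpa": p} for p in (1000.0, 1010.0, 1020.0)]
batches_written = aggregate(readings, 2, stored.append)
```

Because each batch is processed independently, more workers can be added without coordinating state between them.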


Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...

Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...

Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...

Angeliki Cooney

In this keynote, Asanka Abeysinghe, CTO,WSO2 will explore the shift towards platformless technology ecosystems and their importance in driving digital adaptability and innovation. We will discuss strategies for leveraging decentralized architectures and integrating diverse technologies, with a focus on building resilient, flexible, and future-ready IT infrastructures. We will also highlight WSO2's roadmap, emphasizing our commitment to supporting this transformative journey with our evolving product suite.

Platformless Horizons for Digital Adaptability

Platformless Horizons for Digital Adaptability

Platformless Horizons for Digital Adaptability

Strategies for Landing an Oracle DBA Job as a Fresher

Strategies for Landing an Oracle DBA Job as a Fresher

Strategies for Landing an Oracle DBA Job as a Fresher

Remote DBA Services

Accelerating FinTech Innovation: Unleashing API Economy and GenAI Vasa Krishnan, Chief Technology Officer - FinResults Apidays New York 2024: The API Economy in the AI Era (April 30 & May 1, 2024) ------ Check out our conferences at https://www.apidays.global/ Do you want to sponsor or talk at one of our conferences? https://apidays.typeform.com/to/ILJeAaV8 Learn more on APIscene, the global media made by the community for the community: https://www.apiscene.io Explore the API ecosystem with the API Landscape: https://apilandscape.apiscene.io/

Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...

Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...

Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...

💉💊+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHABI}}+971581248768 +971581248768 Mtp-Kit (500MG) Prices » Dubai [(+971581248768**)] Abortion Pills For Sale In Dubai, UAE, Mifepristone and Misoprostol Tablets Available In Dubai, UAE CONTACT DR.Maya Whatsapp +971581248768 We Have Abortion Pills / Cytotec Tablets /Mifegest Kit Available in Dubai, Sharjah, Abudhabi, Ajman, Alain, Fujairah, Ras Al Khaimah, Umm Al Quwain, UAE, Buy cytotec in Dubai +971581248768''''Abortion Pills near me DUBAI | ABU DHABI|UAE. Price of Misoprostol, Cytotec” +971581248768' Dr.DEEM ''BUY ABORTION PILLS MIFEGEST KIT, MISOPROTONE, CYTOTEC PILLS IN DUBAI, ABU DHABI,UAE'' Contact me now via What's App…… abortion Pills Cytotec also available Oman Qatar Doha Saudi Arabia Bahrain Above all, Cytotec Abortion Pills are Available In Dubai / UAE, you will be very happy to do abortion in Dubai we are providing cytotec 200mg abortion pill in Dubai, UAE. Medication abortion offers an alternative to Surgical Abortion for women in the early weeks of pregnancy. We only offer abortion pills from 1 week-6 Months. We then advise you to use surgery if its beyond 6 months. Our Abu Dhabi, Ajman, Al Ain, Dubai, Fujairah, Ras Al Khaimah (RAK), Sharjah, Umm Al Quwain (UAQ) United Arab Emirates Abortion Clinic provides the safest and most advanced techniques for providing non-surgical, medical and surgical abortion methods for early through late second trimester, including the Abortion By Pill Procedure (RU 486, Mifeprex, Mifepristone, early options French Abortion Pill), Tamoxifen, Methotrexate and Cytotec (Misoprostol). The Abu Dhabi, United Arab Emirates Abortion Clinic performs Same Day Abortion Procedure using medications that are taken on the first day of the office visit and will cause the abortion to occur generally within 4 to 6 hours (as early as 30 minutes) for patients who are 3 to 12 weeks pregnant. 
When Mifepristone and Misoprostol are used, 50% of patients complete in 4 to 6 hours; 75% to 80% in 12 hours; and 90% in 24 hours. We use a regimen that allows for completion without the need for surgery 99% of the time. All advanced second trimester and late term pregnancies at our Tampa clinic (17 to 24 weeks or greater) can be completed within 24 hours or less 99% of the time without the need surgery. The procedure is completed with minimal to no complications. Our Women's Health Center located in Abu Dhabi, United Arab Emirates, uses the latest medications for medical abortions (RU-486, Mifeprex, Mifegyne, Mifepristone, early options French abortion pill), Methotrexate and Cytotec (Misoprostol). The safety standards of our Abu Dhabi, United Arab Emirates Abortion Doctors remain unparalleled. They consistently maintain the lowest complication rates throughout the nation. Our Physicians and staff are always available to answer questions and care for women in one of the most difficult times in their lives. The decision to have an abortion at the Abortion Cl

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...

?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@

Understanding the FAA Part 107 License ..

Understanding the FAA Part 107 License ..

Understanding the FAA Part 107 License ..

Christopher Logan Kennedy

The Good, the Bad and the Governed - Why is governance a dirty word? David O'Neill, Chief Operating Officer - APIContext Apidays New York 2024: The API Economy in the AI Era (April 30 & May 1, 2024) ------ Check out our conferences at https://www.apidays.global/ Do you want to sponsor or talk at one of our conferences? https://apidays.typeform.com/to/ILJeAaV8 Learn more on APIscene, the global media made by the community for the community: https://www.apiscene.io Explore the API ecosystem with the API Landscape: https://apilandscape.apiscene.io/

Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...

Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...

Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...

Exploring Multimodal Embeddings with Milvus

Exploring Multimodal Embeddings with Milvus

Exploring Multimodal Embeddings with Milvus

Dubai, known for its towering skyscrapers, luxurious lifestyle, and relentless pursuit of innovation, often finds itself in the global spotlight. However, amidst the glitz and glamour, the emirate faces its own set of challenges, including the occasional threat of flooding. In recent years, Dubai has experienced sporadic but significant floods, disrupting normalcy and posing unique challenges to its infrastructure. Among the critical nodes in this bustling metropolis is the Dubai International Airport, a vital hub connecting the world. This article delves into the intersection of Dubai flood events and the resilience demonstrated by the Dubai International Airport in the face of such challenges.

Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf

Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf

Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf

DBX First Quarter 2024 Investor Presentation

DBX First Quarter 2024 Investor Presentation

DBX First Quarter 2024 Investor Presentation

Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows. We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases. This video focuses on the deployment of external web forms using Jotform for Bonterra Impact Management. This solution can be customized to your organization’s needs and deployed to support the common use cases below: - Intake and consent - Assessments - Surveys - Applications - Program registration Interested in deploying web form automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.

Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...

Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...

Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...

Jeffrey Haguewood

AWS Community Day CPH - Three problems of Terraform

AWS Community Day CPH - Three problems of Terraform

AWS Community Day CPH - Three problems of Terraform

Andrey Devyatkin

Keynote 2: APIs in 2030: The Risk of Technological Sleepwalk Paolo Malinverno, Growth Advisor - The Business of Technology Apidays New York 2024: The API Economy in the AI Era (April 30 & May 1, 2024) ------ Check out our conferences at https://www.apidays.global/ Do you want to sponsor or talk at one of our conferences? https://apidays.typeform.com/to/ILJeAaV8 Learn more on APIscene, the global media made by the community for the community: https://www.apiscene.io Explore the API ecosystem with the API Landscape: https://apilandscape.apiscene.io/

Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...

Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...

Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...

Dubai, often portrayed as a shimmering oasis in the desert, faces its own set of challenges, including the occasional threat of flooding. Despite its reputation for opulence and modernity, the emirate is not immune to the forces of nature. In recent years, Dubai has experienced sporadic but significant floods, testing the resilience of its infrastructure and communities. Among the critical lifelines in this bustling metropolis is the Dubai International Airport, a bustling hub that connects the city to the world. This article explores the intersection of Dubai flood events and the resilience demonstrated by the Dubai International Airport in the face of such challenges.

Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...

Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...

Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...

Corporate and higher education. Two industries that, in the past, have had a clear divide with very little crossover. The difference in goals, learning styles and objectives paved the way for differing learning technologies platforms to evolve. Now, those stark lines are blurring as both sides are discovering they have content that’s relevant to the other. Join Tammy Rutherford as she walks through the pros and cons of corporate and higher ed collaborating. And the challenges of these different technology platforms working together for a brighter future.

Corporate and higher education May webinar.pptx

Corporate and higher education May webinar.pptx

Corporate and higher education May webinar.pptx

Rustici Software

Último (20)

Boost Fertility New Invention Ups Success Rates.pdf

Boost Fertility New Invention Ups Success Rates.pdf

Boost Fertility New Invention Ups Success Rates.pdf

MINDCTI Revenue Release Quarter One 2024

MINDCTI Revenue Release Quarter One 2024

MINDCTI Revenue Release Quarter One 2024

Why Teams call analytics are critical to your entire business

Why Teams call analytics are critical to your entire business

Why Teams call analytics are critical to your entire business

CNIC Information System with Pakdata Cf In Pakistan

CNIC Information System with Pakdata Cf In Pakistan

CNIC Information System with Pakdata Cf In Pakistan

Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...

Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...

Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...

Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...

Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...

Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...

Platformless Horizons for Digital Adaptability

Platformless Horizons for Digital Adaptability

Platformless Horizons for Digital Adaptability

Strategies for Landing an Oracle DBA Job as a Fresher

Strategies for Landing an Oracle DBA Job as a Fresher

Strategies for Landing an Oracle DBA Job as a Fresher

Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...

Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...

Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...

Understanding the FAA Part 107 License ..

Understanding the FAA Part 107 License ..

Understanding the FAA Part 107 License ..

Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...

Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...

Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...

Exploring Multimodal Embeddings with Milvus

Exploring Multimodal Embeddings with Milvus

Exploring Multimodal Embeddings with Milvus

Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf

Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf

Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf

DBX First Quarter 2024 Investor Presentation

DBX First Quarter 2024 Investor Presentation

DBX First Quarter 2024 Investor Presentation

Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...

Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...

Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...

AWS Community Day CPH - Three problems of Terraform

AWS Community Day CPH - Three problems of Terraform

AWS Community Day CPH - Three problems of Terraform

Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...

Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...

Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...

Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...

Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...

Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...

Corporate and higher education May webinar.pptx

Corporate and higher education May webinar.pptx

Corporate and higher education May webinar.pptx

Distributed Data Systems

1. Distributed Data Systems How Do They Even?

2. About Me - Jared L Kerim - Software Developer (Python) - Mozilla Geolocation Cloud Services Team - CTO at PressureNET

3. PressureNET (Shameless Plug) - Gathers sensor data from smartphones - Constant stream of data to servers - API to retrieve data - Visualization - Analysis

4. The First Architecture Sensors Web Servers MySQL API

5. The Problem: MySQL - Slow lookups - Takes a lot of disk space - Cost (Large Relational DBs are expensive) - Schema changes (become slow or impossible)

6. How Big is “Big” - PressureNET 100 req/s, 1.5 billion records - Analytics Systems 5000 req/s, 100s of billions of records - Ad Buying Service 500k req/s, trillions of records

7. The Question - What is ????: Sensors, Web Servers, ????, API

8. What do we want to accomplish? - Receive and store large amounts of data - Access it quickly - Small fast lookups (visualization) - Large batch computations (MapReduce)

9. Considerations - Durability (we don’t want to lose data) - Redundancy (expect failures!) - Scalability (simple growth, no upper limit)

10. Durability - Data in a durable store should be ‘safe’ - Don’t remove data from one durable data store until it is confirmed to be in another durable data store - Durable data stores should have redundant backups (hot standbys)
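
The "confirm before removing" rule on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `DurableStore` is a hypothetical in-memory stand-in (not actually durable), and `move` is a made-up helper name, not part of any library mentioned in the talk.

```python
class DurableStore:
    """Hypothetical stand-in for a durable store (in-memory, for illustration only)."""

    def __init__(self):
        self._items = {}

    def put(self, key, value):
        self._items[key] = value

    def get(self, key):
        return self._items.get(key)

    def delete(self, key):
        self._items.pop(key, None)

    def __contains__(self, key):
        return key in self._items


def move(key, source, destination):
    """Copy first, confirm the copy, and only then delete from the source."""
    value = source.get(key)
    if value is None:
        return False
    destination.put(key, value)
    # Read the value back to confirm the destination really holds it.
    if destination.get(key) != value:
        return False  # not confirmed: the source copy is left intact
    source.delete(key)
    return True
```

If the confirmation step fails, the record simply stays in the source store and the move can be retried later.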

11. Redundancy - Each stage of your system should have multiple copies - If one copy goes down, another should take over - Redundancy ensures availability

12. Scalability - The rate of data intake can grow or spike - Your system should be able to add more resources to handle that growth - Requires that your workload be partitionable
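
One common way to make a workload partitionable is to hash a record key to a partition number. The sketch below assumes a string key (for example a hypothetical device ID) and a fixed partition count; `partition_for` is an illustrative name, and the talk does not say which scheme PressureNET uses.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition index in [0, num_partitions)."""
    # A stable hash keeps the mapping identical across processes and restarts
    # (Python's built-in hash() is randomized per process for strings).
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions
```

With plain modulo hashing, changing the number of partitions remaps most keys, so a real system may prefer a scheme that moves fewer keys when it grows.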

13. Proposed Architecture Sensors Ingestors Queue Aggregator S3 DynamoDB

14. We Are Not Alone - This architecture is widely adopted - Analytics - Ad Serving/Views - Log Analysis - Sensor Data - Game Events - Video Events

15. Ingestors - A redundant, scalable set of nodes which receive data over HTTP - Can apply early validation and authentication - Stateless, low latency
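
A minimal sketch of what a stateless ingestor handler might look like, leaving out the HTTP server itself. Everything named here is an assumption for illustration: the `API_KEYS` set, the required `pressure` and `timestamp` fields, and the `enqueue` callable that stands in for the queue client.

```python
import json

API_KEYS = {"demo-key"}  # hypothetical set of valid API keys
REQUIRED_FIELDS = ("pressure", "timestamp")  # hypothetical record schema


def ingest(body: bytes, api_key: str, enqueue) -> int:
    """Authenticate and validate one request, then hand it to the queue.

    Returns an HTTP-style status code. No state is kept between calls,
    so any number of identical ingestor nodes can run side by side.
    """
    if api_key not in API_KEYS:
        return 401  # authentication failed
    try:
        record = json.loads(body)
    except ValueError:
        return 400  # body is not valid JSON
    if not isinstance(record, dict) or any(f not in record for f in REQUIRED_FIELDS):
        return 400  # early validation failed
    enqueue(record)
    return 202  # accepted for later processing
```

Rejecting bad requests at this stage keeps invalid data out of the queue and everything downstream of it.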

16. Queue - A scalable, durable storage mechanism for data ‘in flight’ - Only holds data temporarily - Typically preserves the order data was received in
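
The queue behaviour described on this slide (ordered, temporary, nothing lost until the consumer confirms) can be modelled with a small in-memory class. `AckQueue` is a hypothetical name, and this toy version is not durable; a production queue would persist and replicate its messages.

```python
import itertools
from collections import OrderedDict


class AckQueue:
    """Toy FIFO queue that keeps each message until it is acknowledged."""

    def __init__(self):
        self._pending = OrderedDict()  # msg_id -> message, in arrival order
        self._ids = itertools.count(1)

    def send(self, message):
        msg_id = next(self._ids)
        self._pending[msg_id] = message
        return msg_id

    def receive(self, max_messages=10):
        """Return up to max_messages (msg_id, message) pairs, oldest first."""
        return list(itertools.islice(self._pending.items(), max_messages))

    def ack(self, msg_id):
        """Drop a message once the consumer has stored it somewhere safe."""
        self._pending.pop(msg_id, None)
```

Because `receive` does not remove anything, a consumer that crashes before calling `ack` will see the same messages again on its next attempt.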

17. Aggregator - A scalable, stateless set of workers which consume data from the queue - Can process data in small batches - Write raw or transformed data to persistent storage such as S3, databases, etc.
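
A single aggregator step could be sketched as below. The three callables are assumptions standing in for real clients: `receive` returns a list of `(msg_id, record)` pairs from the queue, `write_batch` persists a list of records (to S3 or a database, for instance), and `ack` removes one message from the queue.

```python
def aggregate_once(receive, write_batch, ack, batch_size=100):
    """Consume one small batch: write it to storage, then acknowledge it.

    Returns the number of records processed.
    """
    batch = receive(batch_size)
    if not batch:
        return 0
    # Write first; if this raises, nothing is acknowledged and the
    # messages remain in the queue for a later retry.
    write_batch([record for _, record in batch])
    for msg_id, _ in batch:
        ack(msg_id)
    return len(batch)
```

Since a batch may be written and then retried if the worker dies before acknowledging, writes under this scheme should tolerate duplicates.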