Firehose
      Engineering
       designing
       high-volume
       data collection
       systems
                                         Josh Berkus
                                      HiLoad, Moscow
                                         October 2011
                                                    1




I created this talk because, for some reason, I've been
   working on a bunch of the same kind of data systems
   lately.
Firehose Database
              Applications (FDA)

          (1) very high volume of data
              input from many producers
          (2) continuous processing of
              incoming data

                                                 2




I call these “Firehose Database Applications”. They're
   characterized by two factors, and have a lot of things
   in common regardless of which industry they're used
   in.
1. These applications are focused entirely on the
   collection of large amounts of data coming in from
   many high-volume producers, and
2. they process data into some more useful form, such
   as summaries, continuously 24 hours a day.

Some examples ...
Mozilla Socorro




Mozilla's Socorro system, which collects firefox &
 thunderbird crash data from around the world.
Upwind




                                             4




Upwind, which collects a huge amount of metrics from
 large wind turbine farms in order to monitor and
 analyze them.
Fraud Detection System




                                               5




And numerous other systems, such as a fraud
 detection system which receives data from over 100
 different transaction processing systems and
 analyzes it to look for patterns which would indicate
 fraud.
Firehose Challenges




                                            6




regardless of how these firehose systems are used,
  they share the same four challenges, all of which
  need to be overcome in some way to get the system
  running and keep it running.
1. Volume




     ●   100's to 1000's of facts/second
     ●   GB/hour




The first and most obvious is volume. A firehose
 system is collecting many, many facts per second,
 which may add up to gigabytes per hour.
1. Volume




     ●    spikes in volume
     ●    multiple uncoordinated sources




You also have to plan for unpredicted spikes in volume
  which may be 10X normal.
Also your data sources may come in at different rates,
  and with interruptions or out of order data.

But, the biggest challenge of volume always is ...
1. Volume




        volume always grows over time
                                               9




… there is always, always, always more of it than there
 used to be.
2. Constant flow
       since data arrives 24/7 …

       while the user interface can be
       down, data collection can never
       be down

                                               10




the second challenge is dealing with the all-day, all-
  week, all-year nature of the incoming data.
for example, it is often permissible for the user
  interface or reporting system to be out of service, but
  data collection must go on at all times without
  interruption. Otherwise, data would be lost.
this makes for different downtime plans than standard
  web applications, which can have “read-only
  downtimes”. Firehose systems can't; they must
  arrange for alternate storage if the primary
  collection is down.
2. Constant flow
           [diagram: ETL]
           ●   can't stop receiving to process
           ●   data can arrive out of order

                                              11




it also means that conventional Extract Transform Load
   (ETL) approaches to data warehousing don't work.
   You can't wait for the down period to process the
   data. Instead, data must be processed continuously
   while still collecting.
Data is also never at rest; you must be prepared for
   interruptions and out-of-order data.
3. Database size
        ●       terabytes to petabytes
            ●   lots of hardware
            ●   single-node DBMSes aren't enough
            ●   difficult backups, redundancy,
                migration
            ●   analytics are resource-consumptive


                                                     12




another obvious challenge is the sheer size of the data.

once you start accumulating data 24x7, you will rapidly
  find yourself dealing with terabytes or even petabytes
  of data. This requires you to deal with lots of
  hardware and large clustered systems or server
  farms, including clustered or distributed databases.
this makes backups and redundancy either very
  difficult or very expensive.
performing analytics on the data and reporting on it
  becomes very resource-intensive and slow at large
  data sizes.
3. Database size
        ●       database growth
            ●   size grows quickly
            ●   need to expand storage
            ●   estimate target data size
            ●   create data ageing policies



                                               13




the database is also growing rapidly, which means that
  solutions which work today might not work next
  month. You need to estimate the target size of your
  data and engineer for that, not the amount of data
  you have now.
you will also have to create a data ageing or expiration
  policy so that you can eventually purge data and limit
  the database to some finite size. this is the hardest
  thing to do because ...
3. Database size

            “We will decide on a data
        retention policy when we run out
                 of disk space.”
                     – every business user everywhere



                                                    14




… users never ever ever want to “throw away” data.
 No matter how old it is or how infrequently it's
 accessed.
This isn't special to firehose systems except that they
 accumulate data faster than any other kind of
 system.
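Whatever retention window the business finally agrees to, the ageing job itself can be very simple. A minimal sketch in Python, assuming raw facts are spooled into one directory per day; the paths and the 180-day window are illustrative, not from the talk:

```python
import shutil
from datetime import date, timedelta
from pathlib import Path

RAW_ROOT = Path("/data/raw")   # assumed layout: /data/raw/YYYY-MM-DD/
RETENTION_DAYS = 180           # illustrative retention window

def purge_expired(today: date = None) -> None:
    """Delete raw-data directories older than the retention window."""
    today = today or date.today()
    cutoff = today - timedelta(days=RETENTION_DAYS)
    if not RAW_ROOT.exists():
        return
    for day_dir in RAW_ROOT.iterdir():
        try:
            day = date.fromisoformat(day_dir.name)
        except ValueError:
            continue                # skip anything that isn't a date directory
        if day < cutoff:
            shutil.rmtree(day_dir)  # summaries/rollups are kept elsewhere

if __name__ == "__main__":
    purge_expired()
```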
4.

[Socorro architecture diagram]




problem #4 is illustrated by this architecture diagram of
  the Socorro system. Firehose systems have lots and
  lots of components, often using dozens of servers
  and hundreds of application instances. In addition to
  cost and administrative overhead, what having that
  much stuff means is ...
many components
          = many failures




… every day of the week something is going to be
 crashing on you. You simply can't run a system with
 90+ servers and expect them to be up all the time.
4. Component failure
       ●       all components fail
           ●   or need scheduled downtime
           ●   including the network
       ●       collection must continue
       ●       collection & processing must
               recover
                                              17




Even when components don't fail unexpectedly, they
 have to be taken down for upgrades, maintenance,
 replacement, or migration to new locations and
 software. But the data collection and processing has
 to go on while components are unavailable or offline.
 Even when processing has to stop, components
 must recover quickly when the system is ready
 again, and must do so without losing data.
solving firehose problems




So let's talk about how a couple of different teams
 solved some of these firehose problems.
socorro project




                                               19




First we're going to talk about Mozilla's Socorro
  system. I'll bet a lot of you have seen this Firefox
  crash screen. The data it produces gets sent to the
  socorro system at Mozilla, where it's collected and
  processed.
http://crash-stats.mozilla.com




                                                20




At the end of the processing we get pretty reports like
  this, which let the mozilla developers know when
  releases are stable and ready, and where frequent
  crashing and hanging issues are. You can visit it
  right now: It's http://crash-stats.mozilla.com
21




There's quite a rich set of data there collecting
 everything about every aspect of what's making
 Firefox and other Mozilla products crash, individually
 and in the aggregate.
Mozilla Socorro

[simplified architecture diagram: collectors, processors, webservers, reports]




this is a simplified architecture diagram showing the
  very basic data flow of Socorro.
1. Crashes come in through the internet,
2. are harvested by a cluster of collectors
3. raw crashes are stored in HBase
4. a cluster of processors pulls raw crashes from HBase,
  processes them, and writes processed crashes and
  metadata to HBase and Postgres
5. Postgres populates views which are used to support
  the web interface and reports used internally by
  Mozilla developers.
23




a full architecture diagram would look like this one, but
  even this doesn't show all components.
socorro data volume
       ●       3000 crashes/minute
           ●   avg. size 150K
       ●       40TB accumulated raw data
           ●   500GB accumulated metadata /
               reports



                                              24




socorro receives up to 3000 crashes every minute from
  around the world. each of these crashes comes with
  50K to 1000K of binary crash data which needs to be
  processed.
within the target data lifetime of 6-12 months, socorro
  has accumulated nearly 40 terabytes of raw data and
  half a terabyte of metadata, views and reports.
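As a back-of-envelope check on those numbers (treating every crash as exactly the 150K average, which is a simplification):

```python
# Rough capacity math for the quoted peak rate.
crashes_per_minute = 3000
avg_crash_kb = 150

kb_per_hour = crashes_per_minute * 60 * avg_crash_kb
print(f"~{kb_per_hour / 1024**2:.1f} GB/hour at peak")      # ~25.7 GB/hour
print(f"~{kb_per_hour * 24 / 1024**2:.0f} GB/day at peak")  # ~618 GB/day
```

The accumulated 40TB over 6-12 months implies the sustained average rate is well below that peak.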
dealing with volume

[diagram: load balancers → collectors]



how does Mozilla deal with this large data volume? In
  one word:

parallelism

1. crashes come in from the internet
2. they hit a set of round-robin load balancers
3. the load balancers randomly assign them to one of
  six crash collectors
4. the crash collectors receive the crash information,
  and then write the raw crash to a 30-server Hbase
  cluster.
dealing with volume

[diagram: monitor, processors]



5. the raw crashes in HBase are pulled out in parallel
  by twelve processor servers running 10 processor
  instances each.
6. these processors do a lot of work to process the
  binary crashes and then write the processed crash
  results to HBase and the processed crash metadata
  and summary data to Postgres.
7. for scheduling and preventing duplication, the
  processors are coordinated by a monitor server,
  which runs a queue and monitoring system backed
  by Postgres.
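The essential job of the monitor is to make sure each raw crash is claimed by exactly one processor. A minimal sketch of that claim step, with SQLite standing in for the Postgres-backed queue the talk describes; the crash_queue table and its columns are invented for illustration:

```python
import sqlite3

def claim_next_crash(db: sqlite3.Connection, processor_id: str):
    """Atomically claim one unprocessed crash for this processor, or return None."""
    with db:  # one transaction: either the claim commits or nothing does
        row = db.execute(
            "SELECT crash_id FROM crash_queue WHERE claimed_by IS NULL LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        crash_id = row[0]
        cur = db.execute(
            "UPDATE crash_queue SET claimed_by = ? "
            "WHERE crash_id = ? AND claimed_by IS NULL",
            (processor_id, crash_id),
        )
        if cur.rowcount == 1:   # we won the race for this crash
            return crash_id
        return None             # another processor claimed it first; try again

if __name__ == "__main__":
    db = sqlite3.connect("queue.db")
    db.execute("CREATE TABLE IF NOT EXISTS crash_queue "
               "(crash_id TEXT PRIMARY KEY, claimed_by TEXT)")
    print(claim_next_crash(db, "processor-1"))
```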
dealing with size

[diagram]
  data: 40TB, expandable
  metadata / views: 500GB, fixed size




Why have both Hbase and PostgreSQL? Isn't one
 database enough?
Well, the 30-server Hbase cluster holds 40 terabytes of
 data. It's good at doing this in a redundant, scalable
 way and is very fast for pulling individual crashes out
 of the billion or so which are stored there.
However, HBase is very poor at ad-hoc queries or
 simple browsing of data over the web. It also often
 requires downtime for maintenance.
So we use Postgres, which holds metadata and
 summary “materialized views” which are used by the
 web application and the reports to present analytics
 to users.
Postgres can't store the raw data though because it
 can't easily expand to the sizes needed.
dealing with
              component failure
       Lots of hardware ...
          ●   30 HBase nodes
          ●   2 PostgreSQL servers
          ●   6 load balancers
          ●   3 ES servers
          ●   6 collectors
          ●   12 processors
          ●   8 middleware & web servers

                            … lots of failures




Of course, with the more than 60 servers we're dealing
 with, something is always going down. That's simply
 reality. Some of the worst failures have involved the
 entire network going out and all components losing
 communication with each other.
This means that we need to be prepared to have
 failures and to recover from them. How do we do
 that?
load balancing & redundancy

[diagram: load balancers → collectors]



One obvious way is to have lots of redundant
 components for each role in the application. As we
 already saw, for incoming data there is more than
 one load balancer and there are 6 collectors. This means
 that we can lose several machines and still be
 receiving user data, and that in a catastrophic failure
 we only lose a minority of the data.
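The same redundancy looks like this from the sending side: if one collector endpoint is unreachable, try another. A sketch with placeholder endpoints (not Socorro's real submission API):

```python
import random
import urllib.error
import urllib.request

# Hypothetical collector endpoints -- illustration only.
COLLECTORS = [
    "http://collector1.example.com/submit",
    "http://collector2.example.com/submit",
    "http://collector3.example.com/submit",
]

def submit(payload: bytes) -> str:
    """Try each collector in random order; return the one that accepted the data."""
    last_error = None
    for url in random.sample(COLLECTORS, k=len(COLLECTORS)):
        try:
            req = urllib.request.Request(url, data=payload)
            with urllib.request.urlopen(req, timeout=5):
                return url
        except (urllib.error.URLError, OSError) as exc:
            last_error = exc   # this collector is down; try the next one
    raise RuntimeError(f"all collectors unreachable: {last_error}")
```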
elastic connections
       ●       components queue their data
           ●   retain it if other nodes are down
       ●       components resume work
               automatically
           ●   when other nodes come back up


                                                   30




A more important strategy we've learned through hard
  experience is the requirement to have “elastic
  connections”. That is, each component is prepared
  to lose communication with the components on either
   side of it. They queue data through several
  mechanisms, and resume work when the next
  component in line comes back up.
elastic connections

[diagram: collector — receiver → local file queue → crash mover]




For example, let's look at the collectors. They receive
  data from the web and write to Hbase. But it's more
  complex than that.
1. The collectors can lose communication to Hbase for
  a variety of reasons. It can be down for
  maintenance, or there can be network issues.
2. The crashes are received by receiver processes
3. which write them to local file storage in a primitive,
  filename-based queue.
4. A separate “crashmover” process looks for files in
  the queue
5. When Hbase is available, the crashmover saves the
  files to Hbase.
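A stripped-down sketch of that receiver / crashmover pattern: the receiver always writes to a local spool directory, and a separate mover loop drains the spool whenever long-term storage is reachable. The paths and the store_crash call are placeholders, not Socorro's actual code:

```python
import time
import uuid
from pathlib import Path

SPOOL = Path("/var/spool/crashes")  # local file queue; survives storage outages

def receive(raw_crash: bytes) -> Path:
    """Receiver: accept a crash and persist it locally, no matter what."""
    SPOOL.mkdir(parents=True, exist_ok=True)
    path = SPOOL / f"{uuid.uuid4().hex}.crash"
    path.write_bytes(raw_crash)
    return path

def store_crash(path: Path) -> None:
    """Placeholder for the write to long-term storage (HBase in Socorro)."""
    raise NotImplementedError

def crashmover_loop(poll_seconds: float = 5.0) -> None:
    """Mover: drain the spool; leave files in place if storage is down."""
    while True:
        for path in sorted(SPOOL.glob("*.crash")):
            try:
                store_crash(path)
                path.unlink()   # only delete once the store accepted it
            except Exception:
                break           # storage unavailable; retry on the next pass
        time.sleep(poll_seconds)
```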
server management
       ●       puppet
           ●   controls configuration of all servers
           ●   makes sure servers recover
           ●   allows rapid deployment of
               replacement nodes



                                                       32




Another thing which has been essential to managing
 component failure is comprehensive server
 management through Puppet. Having servers under
 central management means that we can make sure
 services are up and configured correctly when they
 are supposed to be. And it also means that if we
 permanently lose nodes we can replace them quickly
 with identically configured nodes.
Upwind




                                               33




let's move on to a different team and different methods
  of facing firehose challenges. Upwind.
Upwind
                          ●    speed
                          ●    wind speed
                          ●    heat
                          ●    vibration
                          ●    noise
                          ●    direction
                                                  34




power-generating wind turbines have onboard
  computers which produce a lot of data about the
   status and operation of the wind turbine. Upwind
  takes that data, collects it, summarizes it and turns it
  into graphical summary reports for customers who
  own wind farms.
Upwind
                         1. maximize
                            power
                            generation
                         2. make sure
                            turbine isn't
                            damaged
                                               35




the customers need this data for two reasons:
1. wind turbines are very expensive and they need to
  make sure they're maximizing power generation from
  them.
2. to intervene and keep the turbines from catching fire
  or otherwise becoming permanently damaged.
dealing with volume
       each turbine:
                   90 to 700 facts/second
       windmills per farm:      up to 100
       number of farms:                40+
       est. total: 300,000 facts/second
                                (will grow)
                                                36




turbines come with a lot of different monitors, between
  90 and 700 of them depending on the model. each
  of these monitors produces a reading every second.
an individual “wind farm” can have up to 100 turbines,
  and Upwind expects more than 40 farms to make
  use of the service once it's fully deployed. Plus they
  want to get more customers.
this means an estimated 300,000 metrics per second
  total by next year.
dealing with volume

[diagram, one chain per wind farm:
  local storage → historian → analytic database → reports]




Upwind chose a very different route than Mozilla for
  dealing with this volume. Let's explain the data flow
  first:
1. Wind farms write metrics to local storage, which can
  hold data for a couple hours.
2. This data is then read by an industry-specific
  application called the historian which interprets the
  raw data and stores each second of data.
3. that data is then read from the historian by Postgres
4. which produces nice summaries and analytics for
  the reports.
As it turns out, data is not shared between wind farms
  most of the time. So Upwind can scale simply by
  adding a whole additional service tool chain for each
   wind farm it brings online.
dealing with volume

[diagram: per-farm chains (local storage → historian → analytic database)
  feeding a master database]




In the limited cases where we need to do inter-farm
  reports, we will have a master Postgres node which
  will use pl/proxy and/or foreign data wrappers to
  summarize data from multiple wind farms.
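One way to picture that inter-farm report: the master node runs the same summary query against each farm's analytic database and merges the results. The sketch below uses psycopg2 with invented connection strings and an invented daily_rollup table; the production design does this inside Postgres with pl/proxy and/or foreign data wrappers rather than in application code:

```python
import psycopg2  # assumes the psycopg2 driver is installed

# Hypothetical per-farm analytic databases.
FARM_DSNS = [
    "dbname=farm1 host=farm1.example.com",
    "dbname=farm2 host=farm2.example.com",
]

SUMMARY_SQL = """
    SELECT date_trunc('day', ts) AS day, sum(power_kwh) AS power_kwh
    FROM daily_rollup
    GROUP BY 1
"""

def cross_farm_power():
    """Merge per-farm daily power totals into one fleet-wide summary."""
    totals = {}
    for dsn in FARM_DSNS:
        with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
            cur.execute(SUMMARY_SQL)
            for day, kwh in cur.fetchall():
                totals[day] = totals.get(day, 0) + (kwh or 0)
    return totals
```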
multi-tenant partitioning
        ●       partition the whole application
            ●   each customer gets their own
                toolchain
        ●       allows scaling with the
                number of customers
            ●   lowers efficiency
            ●   more efficient with virtualization
                                                     39




this is what's known as multi-tenant partitioning, where
  you give each customer, or “tenant”, a full stack of
  their own. It's not very resource-efficient, but it does
  easily scale as you add customers, and eliminates
  the need to use complex clustered storage or
  databases.
Cloud hosting and virtualization are also making this
  strategy easier to manage.
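In code, multi-tenant partitioning mostly means that every lookup is keyed by tenant: each customer maps to its own complete toolchain, and nothing is shared. A small illustrative sketch with invented field names:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FarmStack:
    """One customer's private toolchain: nothing is shared across tenants."""
    historian_url: str
    analytic_dsn: str
    report_url: str

STACKS = {
    "farm-a": FarmStack("http://historian.farm-a.example.com",
                        "dbname=farm_a host=db.farm-a.example.com",
                        "http://reports.farm-a.example.com"),
    "farm-b": FarmStack("http://historian.farm-b.example.com",
                        "dbname=farm_b host=db.farm-b.example.com",
                        "http://reports.farm-b.example.com"),
}

def stack_for(tenant: str) -> FarmStack:
    """Every request is routed by tenant; adding a customer just adds a stack."""
    return STACKS[tenant]
```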
dealing with:
               constant flow and size

[diagram: each historian feeds a rollup chain —
  minute buffer → hours table → days table → months table → years table]


In order to deal with the constant flow of data and the
  large database size, we continuously produce a series
  of time-based summaries of the data:
1. a buffer holding several minutes of data (this also
  accounts for lag and out-of-order data)
2. hour summaries
3. day summaries
4. month summaries
5. annual summaries
time-based rollups
        ●       continuously accumulate
                levels of rollup
            ●   each is based on the level below it
            ●   data is always appended, never
                updated
            ●   small windows == small resources


                                                      41




For efficiency, each level builds on the level beneath it.
 This means that no analytic operation ever has to
 work with a very large set, and so can be very fast.
Even better, the time-based summaries can be run in
 parallel, so that we can make maximum use of the
 processors on the server.
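A toy version of the rollup chain: each level is computed only from the level below it, and once a period is closed its summary row is appended and never rewritten. The data shapes and function here are illustrative only; in the real system this happens inside the analytic database:

```python
from collections import defaultdict
from datetime import datetime

def rollup(facts, period: str):
    """Aggregate (timestamp, value) facts into per-period totals."""
    fmt = {"hour": "%Y-%m-%d %H:00", "day": "%Y-%m-%d", "month": "%Y-%m"}[period]
    totals = defaultdict(float)
    for ts, value in facts:
        totals[ts.strftime(fmt)] += value
    return dict(totals)

# minute buffer -> hourly rows; hourly rows -> daily rows; and so on up the chain.
minute_buffer = [
    (datetime(2011, 10, 3, 14, 5), 2.0),
    (datetime(2011, 10, 3, 14, 40), 3.5),
    (datetime(2011, 10, 3, 15, 10), 1.0),
]
hourly = rollup(minute_buffer, "hour")
daily = rollup([(datetime.strptime(k, "%Y-%m-%d %H:00"), v) for k, v in hourly.items()],
               "day")

print(hourly)  # {'2011-10-03 14:00': 5.5, '2011-10-03 15:00': 1.0}
print(daily)   # {'2011-10-03': 6.5}
```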
time-based rollups
       ●       allows:
           ●   very rapid summary reports for
               different windows
           ●   retaining different summaries for
               different levels of time
           ●   batch/out-of-order processing
           ●   summarization in parallel

                                                   42




this allows us to have a set of “views” which
  summarize the data at the time windows users look
  at. This means almost “instant” reports for users.
firehose tips
                                               43




based on my experiences with several of these
  firehose projects, here are some general rules for
  building this kind of application
data collection must be:

               ●   continuous
               ●   parallel
               ●   fault-tolerant



                                               44




data collection has to be 24x7. It must be done in
  parallel by multiple processes/servers, and above all
  must be fault-tolerant.
data processing must be:

              ●   continuous
              ●   parallel
              ●   fault-tolerant



                                                45




data processing has to be built the same way.
every component
              must be able to fail
        ●   including the network
        ●   without too much data loss
        ●   other components must
            continue


                                               46




the lesson most people don't seem to be prepared for
  is that every component must be able to fail without
  permanently downing the system, or causing
  unacceptable data loss.
5 tools to use
       1.   queueing software
       2.   buffering techniques
       3.   materialized views
       4.   configuration management
       5.   comprehensive monitoring

                                              47




here are some classes of tools you should be using in
  any firehose application.
queuing and buffering for fault-tolerance
materialized views to deal with volume and size
configuration management and comprehensive
  monitoring to keep the system up.
4 don'ts
        1.   use cutting-edge technology
        2.   use untested hardware
        3.   run components to capacity
        4.   do hot patching


                                                 48




some things you will be tempted to do and shouldn't
* don't use cutting edge hardware or software because
  it's not reliable and will lead to repeated
  unacceptable downtimes
* don't put hardware into service without testing it first
  because it will be unreliable or perform poorly,
  dragging the system down
* never run components or layers of your stack at
  capacity, because then you have no headroom for
  when you have a surge in data
* and never do hot patching of production services
  because you can't manage it and it will lead to
  mistakes and taking the system down.
firehose mastered?




                                                49




so now that we know how to do it, it's time for you to
  build your own firehose applications. it's challenging
  and fun.
Contact
       ●       Josh Berkus: josh@pgexperts.com
           ●   blog: blogs.ittoolbox.com/database/soup
       ●       PostgreSQL: www.postgresql.org
           ●   pgexperts: www.pgexperts.com
       ●       Upcoming Events
           ●   PostgreSQL Europe: http://2011.pgconf.eu/
           ●   PostgreSQL Italy: http://2011.pgday.it/

               The text and diagrams in this talk are copyright 2011 Josh Berkus and are licensed under the Creative
               Commons Attribution license. Title slide image is licensed from iStockPhoto and may not be
               reproduced or redistributed. Socorro images are copyright 2011 Mozilla Inc.




contact and copyright information.

Questions?

 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 

Último (20)

DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 

Designing Large-Scale Data Collection Applications (Josh Berkus)

  • 2. Firehose Database Applications (FDA) (1) very high volume of data input from many producers (2) continuous processing of incoming data
  • 7. 1. Volume ● 100s to 1000s of facts/second ● GB/hour
  • 8. 1. Volume ● spikes in volume ● multiple uncoordinated sources
  • 9. 1. Volume volume always grows over time
  • 10. 2. Constant flow since data arrives 24/7 … while the user interface can be down, data collection can never be down
  • 11. 2. Constant flow ● can't stop receiving to process (no ETL window) ● data can arrive out of order
  • 12. 3. Database size ● terabytes to petabytes ● lots of hardware ● single-node DBMSes aren't enough ● difficult backups, redundancy, migration ● analytics are resource-consumptive
  • 13. 3. Database size ● database growth ● size grows quickly ● need to expand storage ● estimate target data size ● create data ageing policies
  • 14. 3. Database size “We will decide on a data retention policy when we run out of disk space.” – every business user everywhere
  • 15. 4.
  • 16. many components = many failures
  • 17. 4. Component failure ● all components fail ● or need scheduled downtime ● including the network ● collection must continue ● collection & processing must recover
  • 21.
  • 22. Mozilla Socorro webservers collectors reports processors
  • 23.
  • 24. socorro data volume ● 3000 crashes/minute ● avg. size 150K ● 40TB accumulated raw data ● 500GB accumulated metadata / reports
  • 25. dealing with volume load balancers collectors
  • 27. dealing with size ● data: 40TB, expandable ● metadata & views: 500GB, fixed size
  • 28. dealing with component failure Lots of hardware ... ● 30 Hbase nodes ● 2 PostgreSQL servers ● 6 load balancers ● 3 ES servers ● 6 collectors ● 12 processors ● 8 middleware & web servers … lots of failures
  • 29. load balancing & redundancy load balancers collectors
  • 30. elastic connections ● components queue their data ● retain it if other nodes are down ● components resume work automatically ● when other nodes come back up
  • 31. elastic connections (inside a collector): receiver → local file queue → crash mover
  • 32. server management ● puppet ● controls configuration of all servers ● makes sure servers recover ● allows rapid deployment of replacement nodes
  • 34. Upwind ● speed ● wind speed ● heat ● vibration ● noise ● direction
  • 35. Upwind 1. maximize power generation 2. make sure turbine isn't damaged
  • 36. dealing with volume each turbine: 90 to 700 facts/second windmills per farm: up to 100 number of farms: 40+ est. total: 300,000 facts/second (will grow)
  • 37. dealing with volume: local storage → historian → analytic database → reports (one chain per wind farm)
  • 38. dealing with volume: local storage → historian → analytic database (per farm) → master database
  • 39. multi-tenant partitioning ● partition the whole application ● each customer gets their own toolchain ● allows scaling with the number of customers ● lowers efficiency ● more efficient with virtualization
  • 40. dealing with: constant flow and size. historian → minute buffer → hours table → days table → months table → years table (the rollup chain grows one level at a time)
  • 41. time-based rollups ● continuously accumulate levels of rollup ● each is based on the level below it ● data is always appended, never updated ● small windows == small resources
  • 42. time-based rollups ● allows: ● very rapid summary reports for different windows ● retaining different summaries for different levels of time ● batch/out-of-order processing ● summarization in parallel
  • 44. data collection must be: ● continuous ● parallel ● fault-tolerant
  • 45. data processing must be: ● continuous ● parallel ● fault-tolerant
  • 46. every component must be able to fail ● including the network ● without too much data loss ● other components must continue
  • 47. 5 tools to use 1. queueing software 2. buffering techniques 3. materialized views 4. configuration management 5. comprehensive monitoring
  • 48. 4 don'ts 1. use cutting-edge technology 2. use untested hardware 3. run components to capacity 4. do hot patching
  • 50. Contact ● Josh Berkus: josh@pgexperts.com ● blog: blogs.ittoolbox.com/database/soup ● PostgreSQL: www.postgresql.org ● pgexperts: www.pgexperts.com ● Upcoming Events ● PostgreSQL Europe: http://2011.pgconf.eu/ ● PostgreSQL Italy: http://2011.pgday.it/ The text and diagrams in this talk are copyright 2011 Josh Berkus and are licensed under the Creative Commons Attribution license. Title slide image is licensed from iStockPhoto and may not be reproduced or redistributed. Socorro images are copyright 2011 Mozilla Inc.
  • 51. Firehose Engineering designing high-volume data collection systems Josh Berkus HiLoad, Moscow October 2011 1 I created this talk because, for some reason, I've been working on a bunch of the same kind of data systems lately.
  • 52. Firehose Database Applications (FDA) (1) very high volume of data input from many producers (2) continuous processing of incoming data 2 I call these “Firehose Database Applications”. They're characterized by two factors, and have a lot of things in common regardless of which industry they're used in. 1. These applications are focused entirely on the collection of large amounts of data coming in from many high-volume producers, and 2. they process data into some more useful form, such as summaries, continuously 24 hours a day. Some examples ...
  • 53. Mozilla Socorro 3 Mozilla's Socorro system, which collects firefox & thunderbird crash data from around the world.
  • 54. Upwind 4 Upwind, which collects a huge amount of metrics from large wind turbine farms in order to monitor and analyze them.
  • 55. Fraud Detection System 5 And numerous other systems, such as a fraud detection system which receives data from over 100 different transaction processing systems and analyzes it to look for patterns which would indicate fraud.
  • 56. Firehose Challenges 6 regardless of how these firehose systems are used, they share the same four challenges, all of which need to be overcome in some way to get the system running and keep it running.
  • 57. 1. Volume ● 100s to 1000s of facts/second ● GB/hour 7 The first and most obvious is volume. A firehose system is collecting many, many facts per second, which may add up to gigabytes per hour.
  • 58. 1. Volume ● spikes in volume ● multiple uncoordinated sources 8 You also have to plan for unpredicted spikes in volume which may be 10X normal. Also your data sources may come in at different rates, and with interruptions or out of order data. But, the biggest challenge of volume always is ...
  • 59. 1. Volume volume always grows over time 9 … there is always, always, always more of it than there used to be.
  • 60. 2. Constant flow since data arrives 24/7 … while the user interface can be down, data collection can never be down 10 the second challenge is dealing with the all-day, all-week, all-year nature of the incoming data. for example, it is often permissible for the user interface or reporting system to be out of service, but data collection must go on at all times without interruption. Otherwise, data would be lost. this makes for different downtime plans than standard web applications, which can have “read only downtimes”. Firehose systems can't; they must arrange for alternate storage if the primary collection is down.
  • 61. 2. Constant flow ● can't stop receiving to process ETL ● data can arrive out of order 11 it also means that conventional Extract Transform Load (ETL) approaches to data warehousing don't work. You can't wait for the down period to process the data. Instead, data must be processed continuously while still collecting. Data is also never at rest; you must be prepared for interruptions and out-of-order data.
  • 62. 3. Database size ● terabytes to petabytes ● lots of hardware ● single-node DBMSes aren't enough ● difficult backups, redundancy, migration ● analytics are resource-consumptive 12 another obvious challenge is the sheer size of the data. once you start accumulating data 24x7, you will rapidly find yourself dealing with terabytes or even petabytes of data. This requires you to deal with lots of hardware and large clustered systems or server farms, including clustered or distributed databases. this makes backups and redundancy either very difficult or very expensive. performing analytics on the data and reporting on it becomes very resource-intensive and slow at large data sizes.
  • 63. 3. Database size ● database growth ● size grows quickly ● need to expand storage ● estimate target data size ● create data ageing policies 13 the database is also growing rapidly, which means that solutions which work today might not work next month. You need to estimate the target size of your data and engineer for that, not the amount of data you have now. you will also have to create a data ageing or expiration policy so that you can eventually purge data and limit the database to some finite size. this is the hardest thing to do because ...
  • 64. 3. Database size “We will decide on a data retention policy when we run out of disk space.” – every business user everywhere 14 … users never ever ever want to “throw away” data. No matter how old it is or how infrequently it's accessed. This isn't special to firehose systems except that they accumulate data faster than any other kind of system.
  • 65. 4. 15 problem #4 is illustrated by this architecture diagram of the Socorro system. Firehose systems have lots and lots of components, often using dozens of servers and hundreds of application instances. In addition to cost and administrative overhead, what having that much stuff means is ...
  • 66. many components = many failures 16 … every day of the week something is going to be crashing on you. You simply can't run a system with 90+ servers and expect them to be up all the time.
  • 67. 4. Component failure ● all components fail ● or need scheduled downtime ● including the network ● collection must continue ● collection & processing must recover 17 Even when components don't fail unexpectedly, they have to be taken down for upgrades, maintenance, replacement, or migration to new locations and software. But the data collection and processing has to go on while components are unavailable or offline. Even when processing has to stop, components must recover quickly when the system is ready again, and must do so without losing data.
  • 68. solving firehose problems 18 So let's talk about how a couple of different teams solved some of these firehose problems.
  • 69. socorro project 19 First we're going to talk about Mozilla's Socorro system. I'll bet a lot of you have seen this Firefox crash screen. The data it produces gets sent to the socorro system at Mozilla, where it's collected and processed.
  • 70. http://crash-stats.mozilla.com 20 At the end of the processing we get pretty reports like this, which let the mozilla developers know when releases are stable and ready, and where frequent crashing and hanging issues are. You can visit it right now: It's http://crash-stats.mozilla.com
  • 71. 21 There's quite a rich set of data there collecting everything about every aspect of what's making Firefox and other Mozilla products crash, individually and in the aggregate.
  • 72. Mozilla Socorro webservers collectors reports processors 22 this is a simplified architecture diagram showing the very basic data flow of Socorro. 1. Crashes come in through the internet, 2. are harvested by a cluster of collectors 3. raw crashes are stored in Hbase 4. a cluster of processors pull raw crashes from Hbase, process them, and write processed crashes and metadata to Hbase and Postgres 5. Postgres populates views which are used to support the web interface and reports used internally by mozilla developers.
  • 73. 23 a full architecture diagram would look like this one, but even this doesn't show all components.
  • 74. socorro data volume ● 3000 crashes/minute ● avg. size 150K ● 40TB accumulated raw data ● 500GB accumulated metadata / reports 24 socorro receives up to 3000 crashes every minute from around the world. each of these crashes comes with 50K to 1000K of binary crash data which needs to be processed. within the target data lifetime of 6-12 months, socorro has accumulated nearly 40 terabytes of raw data and half a terabyte of metadata, views and reports.
  • 75. dealing with volume 25 load balancers collectors how does Mozilla deal with this large data volume? In one word: parallelism 1. crashes come in from the internet 2. they hit a set of round-robin load balancers 3. the load balancers randomly assign them to one of six crash collectors 4. the crash collectors receive the crash information, and then write the raw crash to a 30-server Hbase cluster.
  • 76. dealing with volume 26 monitor processors 5. the raw crashes in HBase are pulled out in parallel by twelve processor servers running 10 processor instances each. 6. these processors do a lot of work to process the binary crashes and then write the processed crash results to Hbase and the processed crash metadata and summary data to Postgres. 7. for scheduling and preventing duplication, the processors are coordinated by a monitor server, which runs a queue and monitoring system backed by Postgres.
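To make the monitor/processor coordination above concrete, here is a minimal Python sketch of how a processor might poll a Postgres-backed job table for crashes the monitor has assigned to it. This is an illustration only: the crash_jobs table, its columns, and the process_crash() callable are invented names, not Socorro's actual schema or code.

    # Hypothetical sketch: a processor polling a Postgres-backed queue for
    # crashes assigned to it by the monitor. Schema and names are invented.
    import socket
    import time

    import psycopg2

    WORKER_ID = socket.gethostname()

    def run_processor(dsn, process_crash):
        """Poll for assigned crashes, process them, and mark them finished."""
        conn = psycopg2.connect(dsn)
        while True:
            with conn.cursor() as cur:
                cur.execute(
                    "SELECT crash_id FROM crash_jobs"
                    " WHERE owner = %s AND finished_at IS NULL"
                    " ORDER BY queued_at LIMIT 10",
                    (WORKER_ID,),
                )
                batch = [row[0] for row in cur.fetchall()]
            if not batch:
                time.sleep(1)              # nothing assigned yet; back off
                continue
            for crash_id in batch:
                process_crash(crash_id)    # pull the raw crash, write results
                with conn.cursor() as cur:
                    cur.execute(
                        "UPDATE crash_jobs SET finished_at = now()"
                        " WHERE crash_id = %s",
                        (crash_id,),
                    )
                conn.commit()

Because each processor only touches rows the monitor has already assigned to it, adding more processor servers is just a matter of starting more copies of this loop.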
  • 77. dealing with size data 40TB expandable metadata views 500GB fixed size 27 Why have both Hbase and PostgreSQL? Isn't one database enough? Well, the 30-server Hbase cluster holds 40 terabytes of data. It's good at doing this in a redundant, scalable way and is very fast for pulling individual crashes out of the billion or so which are stored there. However Hbase is very poor at ad-hoc queries, or simple browsing of data over the web. It also often requires downtime for maintenance. So we use Postgres, which holds metadata and summary “materialized views” which are used by the web application and the reports to present analytics to users. Postgres can't store the raw data though because it can't easily expand to the sizes needed.
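As a sketch of the summary “materialized view” idea (not Socorro's actual schema), the code below rebuilds a small per-day crash-count table that the web application can query instantly instead of scanning the 40TB raw store. All table and column names are invented for illustration.

    # Hypothetical sketch: maintain a small summary table in Postgres so
    # reports never have to touch the raw crash store. Names are invented.
    import psycopg2

    DAILY_SUMMARY_SQL = """
    INSERT INTO daily_crash_counts (report_date, product, version, crash_count)
    SELECT date_processed::date AS report_date, product, version, count(*)
      FROM processed_crash_metadata
     WHERE date_processed::date = %(day)s
     GROUP BY 1, 2, 3
    """

    def rebuild_daily_summary(dsn, day):
        """Delete-and-insert one day's rows so the rebuild is idempotent."""
        with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
            cur.execute(
                "DELETE FROM daily_crash_counts WHERE report_date = %(day)s",
                {"day": day},
            )
            cur.execute(DAILY_SUMMARY_SQL, {"day": day})

The report query behind a page like crash-stats then becomes a cheap scan of the small summary table rather than an aggregate over billions of raw rows.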
  • 78. dealing with component failure Lots of hardware ... ● 30 Hbase nodes ● 2 PostgreSQL servers ● 6 load balancers ● 3 ES servers ● 6 collectors ● 12 processors ● 8 middleware & web servers … lots of failures 28 Of course, with the more than 60 servers we're dealing with, something is always going down. That's simply reality. Some of the worst failures have involved the entire network going out and all components losing communication with each other. This means that we need to be prepared to have failures and to recover from them. How do we do that?
  • 79. load balancing & redundancy 29 load balancers collectors One obvious way is to have lots of redundant components for each role in the application. As we already saw, for incoming data there is more than one load balancer and there are 6 collectors. This means that we can lose several machines and still be receiving user data, and that in a catastrophic failure we only lose a minority of the data.
  • 80. elastic connections ● components queue their data ● retain it if other nodes are down ● components resume work automatically ● when other nodes come back up 30 A more important strategy we've learned through hard experience is the requirement to have “elastic connections”. That is, each component is prepared to lose communication with the components on either side of it. They queue data through several mechanisms, and resume work when the next component in line comes back up.
  • 81. elastic connections (inside a collector): receiver → local file queue → crash mover 31 For example, let's look at the collectors. They receive data from the web and write to Hbase. But it's more complex than that. 1. The collectors can lose communication to Hbase for a variety of reasons. It can be down for maintenance, or there can be network issues. 2. The crashes are received by receiver processes 3. which write them to local file storage in a primitive, filename-based queue. 4. A separate “crashmover” process looks for files in the queue 5. When Hbase is available, the crashmover saves the files to Hbase.
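The receiver / local file queue / crashmover pattern above can be sketched in a few lines of Python. This is a simplified illustration of the elastic-connection idea, not Socorro's actual collector code; the spool directory and the save_to_hbase() callable are assumptions.

    # Hypothetical sketch of an elastic connection: accept data locally even
    # when downstream storage is down, and drain the backlog when it returns.
    import os
    import time
    import uuid

    SPOOL_DIR = "/var/spool/collector"   # assumed location of the file queue

    def receive_crash(raw_bytes):
        """Always succeeds locally: spool the crash to disk, never block on HBase."""
        name = uuid.uuid4().hex
        tmp_path = os.path.join(SPOOL_DIR, name + ".tmp")
        with open(tmp_path, "wb") as f:
            f.write(raw_bytes)
        # rename is atomic, so the mover never sees a half-written file
        os.rename(tmp_path, os.path.join(SPOOL_DIR, name + ".crash"))

    def run_crashmover(save_to_hbase, poll_seconds=5):
        """Drain the spool; if storage is unreachable, keep the files and retry."""
        while True:
            for fname in sorted(os.listdir(SPOOL_DIR)):
                if not fname.endswith(".crash"):
                    continue
                path = os.path.join(SPOOL_DIR, fname)
                try:
                    with open(path, "rb") as f:
                        save_to_hbase(fname, f.read())
                except Exception:
                    break              # downstream down; stop and retry next round
                os.remove(path)        # stored successfully, drop it from the queue
            time.sleep(poll_seconds)

The collector's only job is to get the bytes onto local disk; everything after that point can fail and recover without losing data.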
  • 82. server management ● puppet ● controls configuration of all servers ● makes sure servers recover ● allows rapid deployment of replacement nodes 32 Another thing which has been essential to managing component failure is comprehensive server management through Puppet. Having servers under central management means that we can make sure services are up and configured correctly when they are supposed to be. And it also means that if we permanently lose nodes we can replace them quickly with identically configured nodes.
  • 83. Upwind 33 let's move on to a different team and different methods of facing firehose challenges. Upwind.
  • 84. Upwind ● speed ● wind speed ● heat ● vibration ● noise ● direction 34 power-generating wind turbines have onboard computers which produce a lot of data about the status and operation of the wind turbine Upwind takes that data, collects it, summarizes it and turns it into graphical summary reports for customers who own wind farms.
  • 85. Upwind 1. maximize power generation 2. make sure turbine isn't damaged 35 the customers need this data for two reasons: 1. wind turbines are very expensive and they need to make sure they're maximizing power generation from them. 2. to intervene and keep the turbines from catching fire or otherwise becoming permanently damaged.
  • 86. dealing with volume each turbine: 90 to 700 facts/second windmills per farm: up to 100 number of farms: 40+ est. total: 300,000 facts/second (will grow) 36 turbines come with a lot of different monitors, between 90 and 700 of them depending on the model. each of these monitors produces a reading every second. an individual “wind farm” can have up to 100 turbines, and Upwind expects more than 40 farms to make use of the service once it's fully deployed. Plus they want to get more customers. this means an estimated 300,000 metrics per second total by next year.
  • 87. dealing with volume: local storage → historian → analytic database → reports (one chain per wind farm) 37 Upwind chose a very different route than Mozilla for dealing with this volume. Let's explain the data flow first: 1. Wind farms write metrics to local storage, which can hold data for a couple hours. 2. This data is then read by an industry-specific application called the historian which interprets the raw data and stores each second of data. 3. that data is then read from the historian by Postgres 4. which produces nice summaries and analytics for the reports. As it turns out, data is not shared between wind farms most of the time. So Upwind can scale simply by adding a whole additional service tool chain for each windfarm it brings online.
  • 88. dealing with volume: local storage → historian → analytic database (per farm) → master database 38 In the limited cases where we need to do inter-farm reports, we will have a master Postgres node which will use pl/proxy and/or foreign data wrappers to summarize data from multiple wind farms.
  • 89. multi-tenant partitioning ● partition the whole application ● each customer gets their own toolchain ● allows scaling with the number of customers ● lowers efficiency ● more efficient with virtualization 39 this is what's known as multi-tenant partitioning, where you give each customer, or “tenant”, a full stack of their own. It's not very resource-efficient, but it does easily scale as you add customers, and eliminates the need to use complex clustered storage or databases. cloud hosting and virtualization are also making this strategy easier to manage.
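One minimal way to express multi-tenant partitioning in application code is to route every query through a per-tenant connection, assuming each wind farm has its own Postgres database. The farm names, DSNs, and the hourly_rollup table below are invented for illustration.

    # Hypothetical sketch: each tenant (wind farm) owns a separate database,
    # so a query can never mix tenants. DSNs and schema are invented.
    import psycopg2

    FARM_DSNS = {
        "farm_north": "dbname=farm_north host=db-north",
        "farm_south": "dbname=farm_south host=db-south",
    }

    def connect_for_farm(farm):
        """Open a connection to this tenant's dedicated analytic database."""
        return psycopg2.connect(FARM_DSNS[farm])

    def farm_power_today(farm):
        """Example report query, answered entirely from one tenant's stack."""
        with connect_for_farm(farm) as conn, conn.cursor() as cur:
            cur.execute(
                "SELECT sum(avg_power_kw) FROM hourly_rollup"
                " WHERE reading_hour >= date_trunc('day', now())"
            )
            return cur.fetchone()[0]

Adding a new customer then means provisioning another stack and adding one entry to the routing table, which is why this scheme scales with the number of customers rather than with total data volume.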
  • 90. dealing with: constant flow and size. historian → minute buffer → hours table → days table → months table → years table (the rollup chain grows one level at a time) 40 In order to deal with the constant flow of data and the large database size, we continuously produce a series of time-based summaries of the data: 1. a buffer holding several minutes of data. this also accounts for lag and out-of-order data. 2. hour summaries 3. day summaries 4. month summaries 5. annual summaries
  • 91. time-based rollups ● continuously accumulate levels of rollup ● each is based on the level below it ● data is always appended, never updated ● small windows == small resources 41 For efficiency, each level builds on the level beneath it. This means that no analytic operation ever has to work with a very large set, and so can be very fast. Even better, the time-based summaries can be run in parallel, so that we can make maximum use of the processors on the server.
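Here is a minimal sketch of one rollup step under the scheme described above: the hourly level is built only from the minute-level buffer (the level directly below it), and rows are appended once an hour has closed, never updated. The table and column names are assumptions, not Upwind's actual schema.

    # Hypothetical sketch: append one closed hour of summaries, reading only
    # the minute-level buffer. Schema and names are invented for illustration.
    import psycopg2

    HOURLY_ROLLUP_SQL = """
    INSERT INTO hourly_rollup (turbine_id, reading_hour, avg_value, max_value, n_facts)
    SELECT turbine_id,
           date_trunc('hour', reading_time) AS reading_hour,
           avg(value), max(value), count(*)
      FROM minute_buffer
     WHERE reading_time >= %(hour)s
       AND reading_time <  %(hour)s::timestamp + interval '1 hour'
     GROUP BY turbine_id, date_trunc('hour', reading_time)
    """

    def roll_up_hour(dsn, hour):
        """Run after the buffer's lag window has passed, so late and
        out-of-order facts for this hour are already in the buffer."""
        with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
            cur.execute(HOURLY_ROLLUP_SQL, {"hour": hour})

The daily, monthly, and yearly levels would be produced the same way, each reading only the level directly below it, so no single step ever scans a large window of raw data, and different hours or days can be rolled up in parallel.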
  • 92. time-based rollups ● allows: ● very rapid summary reports for different windows ● retaining different summaries for different levels of time ● batch/out-of-order processing ● summarization in parallel 42 this allows us to have a set of “views” which summarize the data at the time windows users look at. This means almost “instant” reports for users.
  • 93. firehose tips 43 based on my experiences with several of these firehose projects, here's some general rules for building this kind of an application
  • 94. data collection must be: ● continuous ● parallel ● fault-tolerant 44 data collection has to be 24x7. It must be done in parallel by multiple processes/servers, and above all must be fault-tolerant.
  • 95. data processing must be: ● continuous ● parallel ● fault-tolerant 45 data processing has to be built the same way.
  • 96. every component must be able to fail ● including the network ● without too much data loss ● other components must continue 46 the lesson most people don't seem to be prepared for is that every component must be able to fail without permanently downing the system, or causing unacceptable data loss.
  • 97. 5 tools to use 1. queueing software 2. buffering techniques 3. materialized views 4. configuration management 5. comprehensive monitoring 47 here's some classes of tools you should be using in any firehose application: queuing and buffering for fault-tolerance; materialized views to deal with volume and size; configuration management and comprehensive monitoring to keep the system up.
  • 98. 4 don'ts 1. use cutting-edge technology 2. use untested hardware 3. run components to capacity 4. do hot patching 48 some things you will be tempted to do and shouldn't * don't use cutting-edge hardware or software because it's not reliable and will lead to repeated unacceptable downtimes * don't put hardware into service without testing it first because it will be unreliable or perform poorly, dragging the system down * never run components or layers of your stack at capacity, because then you have no headroom for when you have a surge in data * and never do hot patching of production services because you can't manage it and it will lead to mistakes and taking the system down.
  • 99. firehose mastered? 49 so now that we know how to do it, it's time for you to build your own firehose applications. it's challenging and fun.
  • 100. Contact ● Josh Berkus: josh@pgexperts.com ● blog: blogs.ittoolbox.com/database/soup ● PostgreSQL: www.postgresql.org ● pgexperts: www.pgexperts.com ● Upcoming Events ● PostgreSQL Europe: http://2011.pgconf.eu/ ● PostgreSQL Italy: http://2011.pgday.it/ The text and diagrams in this talk are copyright 2011 Josh Berkus and are licensed under the Creative Commons Attribution license. Title slide image is licensed from iStockPhoto and may not be reproduced or redistributed. Socorro images are copyright 2011 Mozilla Inc. 50 contact and copyright information. Questions?