1

                               chenshuo.com




          ZURG        PART   1 OF N
2012/04   Shuo Chen
What is it?
2


       An example of muduo protorpc
        A  toy C++ project that can be useful
         https://github.com/chenshuo/muduo-protorpc

       The several levels of deployment, monitoring, and process management in distributed systems
           http://www.cnblogs.com/Solstice/archive/2011/05/09/2041306.html

       When multithreaded servers are appropriate
           http://blog.csdn.net/Solstice/article/details/5334243

       An engineering approach to developing distributed systems
           http://blog.csdn.net/solstice/article/details/5950190   (slides)
           http://techparty.org/2010/10/19/2010q4summary/          (video)

Overview
3


       Master-slave structure
         Master and slaves communicate over bi-directional RPC
         Command-line tool to change and view status

         A web frontend in the future, if I have time to learn web development

       Central configuration of service placements
         Zurg slave is stateless; it doesn’t store anything locally
         This is different from supervisord

       Also serves as a name server
       The master looks like a SPOF, but that can be overcome
Why not just run services as daemons?
4
       It’s fine to do so on 5 hosts, but how about 50? 500?
       Not easy to upgrade apps
         You usually need to ssh to every host and restart the apps
       Not transparent
         Is every application running well?
       You have to deploy a monitoring system anyway
         And the notification of an app crash is not real-time
       Auto-restarting daemons could hide the real
        problem and confuse the monitoring system
Zurg slave – functionalities
5


       Process management
         Run a command (short-lived child process)
         Start/stop a service (long-lived child process)
               Not standard services, but programs written by yourself

         Detect child death in real time and report to master
               Not by polling pids or process names

       Collecting performance metrics
         Monitor system health
       Both regular heartbeats and event notifications
        to master
Zurg slave – design decisions
6


       All-in-one single-threaded process
         Don’t keep running iostat/vmstat/top/netstat/XXXstat
         Replaces(?) nagios/monit/ganglia/munin/supervisord
               No plugins, just compile what you need into one binary

       C++ for efficiency and lower resource usage
         It runs on every host, so every little bit helps
         Monitoring tools* often use too many resources

       No local configuration, easy to deploy & upgrade
         Just point it to the master
       Start it from init.d, and it takes over everything else
Zurg slave – NOT in scope
7


       Configuration management
       System administration
         Use Puppet instead
       Deployment of in-house software
         Although it can be done with ‘wget’ followed by ‘tar xf’




Run a command
8


       Start a child process
       Wait until it finishes (asynchronously, of course)
       Capture stdout/stderr
         No other open files in the parent should leak
          to the child; set FD_CLOEXEC on every fd
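
The FD_CLOEXEC rule above could look roughly like this. This is a minimal sketch in Python, not code from zurg; the helper names `set_cloexec` and `open_fds` are made up for illustration, and listing /proc/self/fd is Linux-specific.

```python
import fcntl
import os


def set_cloexec(fd):
    """Mark fd close-on-exec so it is not inherited across exec*()."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)


def open_fds():
    """List the fds currently open in this process (Linux: /proc/self/fd)."""
    fds = []
    for name in os.listdir("/proc/self/fd"):
        fd = int(name)
        try:
            os.fstat(fd)
        except OSError:
            continue  # the fd that listdir() used internally is already closed
        fds.append(fd)
    return sorted(fds)
```

A parent would then call `set_cloexec(fd)` for every fd above 2 returned by `open_fds()`, so that only the descriptors it deliberately dup2()s survive into the child.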


       Sounds like re-inventing Python’s subprocess module?
       Not exactly!

The easy part of process mgmt
9


       Start a new process
         fork(2)/exec*(2)

         How to get errno if exec() fails? It is set in the child process
         “The self-pipe trick” http://cr.yp.to/docs/selfpipe.html

       Get notified when a child terminates
         SIGCHLD, via either signalfd(2) or a legacy signal handler
         Signals are not reliable, so also call wait(2) periodically (non-blocking)

       Get the exit status of a terminated child process
         wait4(2) tells everything, incl. memory/CPU usage
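
One common way to get the child's errno back to the parent is a close-on-exec pipe: a successful exec closes the write end silently, while a failed exec writes errno into it. The sketch below assumes that approach; `spawn` is a hypothetical name and this is not taken from the zurg source.

```python
import errno
import os
import struct


def spawn(argv):
    """fork + exec argv; return (pid, 0) on success or (pid, errno) if exec failed."""
    # Python creates pipe fds close-on-exec by default (PEP 446).
    rfd, wfd = os.pipe()
    pid = os.fork()
    if pid == 0:  # child
        os.close(rfd)
        try:
            os.execvp(argv[0], argv)
        except OSError as e:
            # exec failed: report errno to the parent through the pipe
            os.write(wfd, struct.pack("i", e.errno))
        os._exit(127)
    # parent
    os.close(wfd)
    data = os.read(rfd, 4)  # blocks until exec succeeds (EOF) or fails (4 bytes)
    os.close(rfd)
    if not data:
        return pid, 0  # the write end was closed by a successful exec
    os.waitpid(pid, 0)  # reap the child whose exec failed
    return pid, struct.unpack("i", data)[0]
```

For example, `spawn(["/no/such/binary"])` reports `errno.ENOENT` in the parent, while a successful spawn returns 0 and leaves the child to be reaped later.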
A simple challenge
10


        Limit the wall-clock runtime of a command, not its CPU time
          Typical timeout of 60 seconds
          Remember the pid when the command starts running

          Set up a timer and kill(2) the process when it times out

        How do you know that the process you are going
         to kill is the one that you created for the cmd?
          Set a timer to kill pid 9527, 60 seconds later
          What if process 9527 dies just before the timer event,

          and a new process is created with the same pid (?!)

Pid is unique but not always
11


        Pids wrap around (in minutes or seconds)
          Pids are unique within a snapshot of all processes
          But they are not unique over time

        The range of possible pid values is small (1~32767)
          /proc/sys/kernel/pid_max      default     32768
          /proc/loadavg                 lastpid     3387
          /proc/stat                    processes   423666
        There is a tiny time window between the timer wakeup
         and the kill(2) call; anything could happen in between
          And there is no mutex or lock for this race condition
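
The three /proc entries listed above can be read with a few lines of Python. This is a Linux-only sketch; `read_pid_stats` is a hypothetical helper name.

```python
def read_pid_stats():
    """Return pid_max, the most recently allocated pid, and the fork count since boot."""
    with open("/proc/sys/kernel/pid_max") as f:
        pid_max = int(f.read())
    with open("/proc/loadavg") as f:
        last_pid = int(f.read().split()[4])  # the 5th field is the last pid
    processes = None
    with open("/proc/stat") as f:
        for line in f:
            if line.startswith("processes "):
                processes = int(line.split()[1])  # forks since boot
    return {"pid_max": pid_max, "last_pid": last_pid, "processes": processes}
```

Comparing `processes` with `pid_max` gives a rough idea of how many times the pid space has already wrapped since boot.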
How to kill a child properly?
12


        So it is not safe to kill by pid; you may kill
         someone else’s child process by mistake
        How about checking ppid first?
          You may kill your own new child, if another
           RunCommand reuses the pid just before the timer fires.
        The pid + start_time combination is unique in
         space and time
          Start time is in /proc/pid/stat, in jiffies since boot
          Remember the start time after fork()ing a child*

          Check the start time before killing the child
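
The pid + start_time check can be sketched as follows. This is an illustration in Python rather than zurg's own code; `start_time` and `kill_if_same` are hypothetical names, and the start time is field 22 of /proc/<pid>/stat as documented in proc(5).

```python
import os
import signal


def start_time(pid):
    """Return the start time of pid (field 22 of /proc/<pid>/stat), in clock ticks since boot."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The command name (field 2) may contain spaces or ')', so split after the last ')'.
    fields = data[data.rindex(")") + 2:].split()
    return int(fields[19])  # fields[0] is field 3 (state), so field 22 is index 19


def kill_if_same(pid, expected_start, sig=signal.SIGKILL):
    """Send sig only if pid still refers to the process whose start time we recorded."""
    try:
        if start_time(pid) != expected_start:
            return False  # the pid has been reused by another process
    except (FileNotFoundError, ProcessLookupError):
        return False  # the process is gone
    os.kill(pid, sig)
    return True
```

The caller records `start_time(pid)` right after fork() and passes it back when the timer fires. The check and the kill are still two separate steps, so this is only safe when nothing else in the same process can reap the child in between.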
Why is it safe?
13


        If two processes start at almost the same time,
         their pids must be different
        If two processes happen to have the same pid,
         their start times must be different
          It takes seconds for pids to wrap, and start time is monotonic
        Since zurg slave is single-threaded, there is no race
         condition between checking and killing
          Don’t run zurg slave as root (it quits if euid == 0)
          Don’t run two zurg slaves with the same uid on one box

Capture stdout & stderr, simple?
14


        Two pipes are needed: dup2() the write ends onto fds 1 and 2
         in the child, and read the other ends of the two pipes in the parent.
          Keep the data in memory and send it back when the command finishes
        The command ‘cat /dev/zero’ will blow up zurg slave
        We must limit the size of stdout and stderr
          The default size is 1024 KiB
        Two approaches when the size exceeds the limit:
          Stop reading, i.e. block the writer, and wait until timeout
          Close the read side of the pipe, i.e. kill the child with SIGPIPE
                Directly sending a SIGPIPE signal doesn’t work
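
The second approach (close the read side once the limit is reached) might be sketched like this for stdout alone; stderr would use a second pipe in the same way. `run_capture` is a hypothetical name, and the blocking reads stand in for what a single-threaded event loop would do with poll/epoll.

```python
import os
import signal


def run_capture(argv, max_stdout=1048576):
    """Run argv and capture at most max_stdout bytes of stdout; return (output, wait status)."""
    rfd, wfd = os.pipe()
    pid = os.fork()
    if pid == 0:  # child
        try:
            os.close(rfd)
            os.dup2(wfd, 1)  # stdout now writes into the pipe
            os.close(wfd)
            # Python ignores SIGPIPE; restore the default so the child dies on a closed pipe.
            signal.signal(signal.SIGPIPE, signal.SIG_DFL)
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed
    os.close(wfd)
    chunks, size = [], 0
    while size < max_stdout:
        data = os.read(rfd, min(65536, max_stdout - size))
        if not data:
            break  # EOF: the child closed stdout or exited
        chunks.append(data)
        size += len(data)
    os.close(rfd)  # if the child is still writing, its next write raises SIGPIPE
    _, status = os.waitpid(pid, 0)
    return b"".join(chunks), status
```

With this sketch, `run_capture(["cat", "/dev/zero"], max_stdout=4096)` returns 4096 zero bytes and a status showing the child was terminated by SIGPIPE. A child that stops writing after the pipe is closed would still need the timeout to be killed.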
Race condition at process exits
15


        When a child exits, all its open fds are closed
          The parent’s read(2) returns 0; it should then close the fd,
           otherwise POLLHUP will cause a busy loop
          A child could also close them on purpose before dying

        The “process exited” and “std{out,err} fds closed” events
         can arrive in any order
          Is there any in-flight data that has not been received yet?
        Lifetime management of Process/Pipe objects is
         also subtle, as fds are reused so aggressively
        Read the code to find out how to do it correctly
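
One possible way to cope with the unordered events is to treat a command as finished only after the exit status and both EOFs have been seen. The class below is an illustrative sketch under that assumption, not the approach taken in the zurg code; `ChildTracker` and its method names are hypothetical.

```python
class ChildTracker:
    """Reports completion only after the exit status and both pipe EOFs were seen."""

    def __init__(self, on_done):
        self._pending = {"exit", "stdout", "stderr"}
        self._on_done = on_done
        self.status = None

    def on_exit(self, status):
        """Called when wait reports that the child has terminated."""
        self.status = status
        self._finish("exit")

    def on_eof(self, stream):
        """Called with "stdout" or "stderr" after read(2) returned 0 and the fd was closed."""
        self._finish(stream)

    def _finish(self, event):
        self._pending.discard(event)
        if not self._pending and self._on_done is not None:
            callback, self._on_done = self._on_done, None  # fire exactly once
            callback(self.status)
```

Whichever of the three events arrives last triggers the callback, so no captured output is dropped just because the exit notification came first.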
Run Command Request
16


message RunCommandRequest {
  required string command       = 1;
  optional string cwd           = 2 [default = "/tmp"];
  repeated string args          = 3;
  repeated string envs          = 4;
  optional bool   envs_only     = 5 [default = false];
  optional int32  max_stdout    = 6 [default = 1048576];  // bytes (1024 KiB)
  optional int32  max_stderr    = 7 [default = 1048576];  // bytes (1024 KiB)
  optional int32  timeout       = 8 [default = 60];       // seconds
  optional int32  max_memory_mb = 9 [default = 32768];
}

Run Command Response
17


message RunCommandResponse {
  required int32 error_code       = 1;
  optional int32 pid              = 2;
  optional int32 status           = 3;
  optional bytes std_output       = 4;
  optional bytes std_error        = 5;
  optional int64 start_time_us    = 16;
  optional int64 finish_time_us   = 17;
  optional float user_time        = 18;
  optional float system_time      = 19;
  optional int64 memory_maxrss_kb = 20;
  // optional int64 ctxsw = 21;
  optional int32 exit_status      = 30 [default = 0];
  optional int32 signaled         = 31 [default = 0];
  optional bool  coredump         = 32 [default = false];
}
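
Several of these fields can be filled from a single wait4(2) call. The sketch below shows one plausible mapping using a plain dict instead of the generated protobuf class; it assumes that `signaled` carries the terminating signal number, which the slides do not state, and `describe_exit` and `run_and_describe` are hypothetical names.

```python
import os


def describe_exit(status, rusage):
    """Map wait4() results onto RunCommandResponse-style fields (returned as a dict)."""
    return {
        "status": status,
        "exit_status": os.WEXITSTATUS(status) if os.WIFEXITED(status) else 0,
        # Assumption: 'signaled' holds the signal number, 0 if not killed by a signal.
        "signaled": os.WTERMSIG(status) if os.WIFSIGNALED(status) else 0,
        "coredump": os.WIFSIGNALED(status) and os.WCOREDUMP(status),
        "user_time": rusage.ru_utime,          # seconds of user CPU time
        "system_time": rusage.ru_stime,        # seconds of system CPU time
        "memory_maxrss_kb": rusage.ru_maxrss,  # kilobytes on Linux
    }


def run_and_describe(argv):
    """fork + exec argv, wait for it with wait4(), and describe how it ended."""
    pid = os.fork()
    if pid == 0:  # child
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed
    _, status, rusage = os.wait4(pid, 0)
    return describe_exit(status, rusage)
```

For instance, `run_and_describe(["sh", "-c", "exit 3"])` yields `exit_status` 3 with `signaled` 0.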
Run Script
18


        RunCommand with script file content provided
         in the request
        A programmatic way to run slightly different
         scripts on many hosts




Application management
19


        Start/monitor/stop applications
          Applications, a.k.a. services: long-running processes
          Apps can be written in C++/Java/Python/etc.

        Shares most functionality with RunCommand
          stdout/stderr are redirected to files, not captured
          No timeout
        Intrusive vs. non-intrusive
          Can zurg_slave manage any application?
          Should the managed application follow some rules?

How to detect app exiting
20


        Polling (pid and start time)
          Not real-time, always subject to a poll interval
          How do you know a given process is the application?

        SIGCHLD
          Not 100% reliable, so call wait(2) periodically
        Pipe: leave the write side open in the child process and read
         in zurg_slave; when the app exits, read(2) returns 0
          Reliable and prompt
          The application must not close the fd* (intrusive!)
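
The pipe-based detection could be sketched as below. This is illustrative Python with hypothetical names (`spawn_with_death_pipe`, `wait_for_exit`); a real single-threaded server would register the read fd with its event loop instead of calling select() directly.

```python
import os
import select


def spawn_with_death_pipe(argv):
    """Start argv holding the write end of a pipe; return (pid, read_fd)."""
    rfd, wfd = os.pipe()
    pid = os.fork()
    if pid == 0:  # child
        try:
            os.close(rfd)
            os.set_inheritable(wfd, True)  # keep the write end open across exec
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed
    os.close(wfd)  # the child now holds the only write end
    return pid, rfd


def wait_for_exit(rfd, timeout):
    """Return True once read(2) reports EOF, i.e. every holder of the write end is gone."""
    readable, _, _ = select.select([rfd], [], [], timeout)
    return bool(readable) and os.read(rfd, 1) == b""
```

The application never writes to the inherited fd; it only has to leave it open, which is the intrusive part noted above.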

What if zurg_slave crashes?
21


        How to prevent starting duplicate services?
        SIGCHLD and pipe(2) are non-renewable: a restarted zurg_slave cannot re-establish them
        Sockets? The app reconnects to the zurg slave on localhost
          i.e. a heartbeat between the app and zurg slave
          Even more intrusive, with retry logic needed in all languages



        Other thoughts?
          Another layer of indirection?


To be continued
22


        Collecting health & performance data
        Periodic heartbeats to master
          Process status, performance metrics


        Zurg slave is 50% done as of end of April 2012




Zurg Master
23


        A multithreaded program
        All of its status is retrievable from outside
          Easy to build Web/GUI frontends


        Coding has not started yet.





 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 

Zurg part 1

  • 1. Zurg, part 1 of N. Shuo Chen, chenshuo.com, 2012/04
  • 2. What is it?
      • An example of muduo protorpc
          • A toy C++ project that can be useful
          • https://github.com/chenshuo/muduo-protorpc
      • "Several levels of deployment, monitoring, and process management in distributed systems"
          • http://www.cnblogs.com/Solstice/archive/2011/05/09/2041306.html
      • "Where multithreaded servers are appropriate"
          • http://blog.csdn.net/Solstice/article/details/5334243
      • "An engineering approach to developing distributed systems"
          • http://blog.csdn.net/solstice/article/details/5950190 (slides)
          • http://techparty.org/2010/10/19/2010q4summary/ (video)
  • 3. Overview
      • Master-slave structure
          • Communicates over bi-directional RPC
          • Command-line tool to change and view status
          • A web frontend may follow, if I have time to learn web development
      • Central configuration of service placements
          • Zurg slave is memory-less; it does not store anything
          • That is different from supervisord
      • Also serves as a name server
      • The master looks like a single point of failure (SPOF), but that can be overcome
  • 4. Why not just run services as daemons?
      • It's fine to do so on 5 hosts, but how about 50? 500?
      • Not easy to upgrade apps
          • Usually you need to ssh to every host and restart the apps
      • Not transparent
          • Is every application running well?
      • A monitoring system has to be deployed anyway
          • And the notification of an app crash is not real time
      • Automatically restarting daemons can hide the real problem and confuse the monitoring system
  • 5. Zurg slave – functionalities
      • Process management
          • Run a command (short-lived child process)
          • Start/stop a service (long-lived child process)
          • Not standard services, but programs you wrote yourself
          • Detect child death in real time and report it to the master
          • No polling by pid or process name
      • Collect performance metrics
      • Monitor system health
          • Both regular heartbeats and event notifications to the master
  • 6. Zurg slave – design decisions
      • All-in-one single-threaded process
          • Don't keep running iostat/vmstat/top/netstat/XXXstat
          • Replaces(?) nagios/monit/ganglia/munin/supervisord
          • No plugins; just compile what you need into one binary
      • C++ for efficiency and lower resource usage
          • It runs on every host, so every little bit helps
          • Monitoring tools* often use too many resources
      • No local configuration; easy to deploy and upgrade
          • Just point it to the master
          • Start it from init.d, and it takes over everything else
  • 7. Zurg slave – NOT in scope
      • Configuration management
      • System administration
          • Use Puppet instead
      • Deployment of in-house software
          • Although it can be done with 'wget' followed by 'tar xf'
  • 8. Run a command
      • Start a child process
      • Wait until it finishes (asynchronously, of course)
      • Capture stdout/stderr
      • No other open files in the parent should leak to the child; set FD_CLOEXEC on every fd
      • Sounds like reinventing Python's subprocess module?
          • Not exactly!
  • 9. The easy part of process management
      • Start a new process
          • fork(2)/exec*(2)
          • How to get errno if exec() fails? It is set in the child process
          • "The self-pipe trick" http://cr.yp.to/docs/selfpipe.html
      • Get a notification when a child terminates
          • SIGCHLD, via either signalfd(2) or a legacy signal handler
          • Signals are not reliable, so run wait(2) periodically (nb)
      • Get the exit status of a terminated child process
          • wait4(2) reports everything, including memory and CPU usage
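
One way to get exec()'s errno back to the parent, as slide 9 asks, is a pipe whose write end is close-on-exec: a successful exec closes it and the parent reads EOF, while a failed exec lets the child write errno into it. The sketch below is only an illustration of that idea, not zurg's code. `spawnReportingExecErrno` is a hypothetical name, and the parent blocks in read(2) for simplicity, whereas an event loop would watch the fd instead.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper (not from zurg): forks, execs argv[0] via PATH lookup,
// and returns the child's pid, or -1 if pipe2/fork failed.
// *execErrno is set to 0 if exec succeeded, otherwise to the child's errno.
pid_t spawnReportingExecErrno(char* const argv[], int* execErrno) {
  assert(argv != nullptr && argv[0] != nullptr && execErrno != nullptr);
  int fds[2];
  // O_CLOEXEC: a successful exec closes the write end, so the parent sees EOF.
  if (::pipe2(fds, O_CLOEXEC) < 0) return -1;

  pid_t pid = ::fork();
  if (pid < 0) {
    ::close(fds[0]);
    ::close(fds[1]);
    return -1;
  }
  if (pid == 0) {  // child
    ::close(fds[0]);
    ::execvp(argv[0], argv);
    int err = errno;  // only reached when exec failed
    ssize_t ignored = ::write(fds[1], &err, sizeof err);
    (void)ignored;
    ::_exit(127);
  }

  // parent: drop our copy of the write end, then wait for errno or EOF
  ::close(fds[1]);
  int err = 0;
  ssize_t n;
  do {
    n = ::read(fds[0], &err, sizeof err);
  } while (n < 0 && errno == EINTR);
  ::close(fds[0]);
  *execErrno = (n == static_cast<ssize_t>(sizeof err)) ? err : 0;
  return pid;
}
```

The caller still has to reap the child with wait(2) or wait4(2) in both cases; when exec fails, the child in this sketch exits with status 127.
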
  • 10. A simple challenge
      • Limit the runtime of a command (wall-clock time, not CPU time)
          • A typical timeout is 60 seconds
      • Remember the pid when the command starts
      • Set up a timer, and kill(2) the process on timeout
      • How do you know that the process you are about to kill is the one you created for the command?
          • Set a timer to kill pid 9527, 60 seconds later
          • What if process 9527 dies just before the timer event,
          • and a new process is created with the same pid (?!)
  • 11. A pid is unique, but not always
      • Pids wrap (in minutes or seconds)
          • A pid is unique within a snapshot of all processes
          • But it is not unique as time moves on
      • The range of possible pids is small (1 to 32767)
          • /proc/sys/kernel/pid_max default 32768
          • /proc/loadavg lastpid 3387
          • /proc/stat processes 423666
      • There is a tiny time window between the timer wakeup and the kill(2) call; anything could happen in between
          • And there is no mutex or lock for this race condition
  • 12. How to kill a child properly?
      • So it is not safe to kill by pid; you may kill someone else's child process by mistake
      • How about checking the ppid first?
          • You may kill your own new child, if another RunCommand reuses the pid just before the timer fires
      • The pid + start_time combination is unique in space and time
          • Start time is in /proc/pid/stat, in clock ticks since boot (jiffies before Linux 2.6)
          • Remember the start time after fork()ing a child*
          • Check the start time before killing the child
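
The pid + start_time check on slide 12 could look roughly like the following. This is a minimal sketch, not zurg's implementation: the function names are made up for illustration, and it assumes the Linux /proc/<pid>/stat layout in which starttime is field 22. As slide 13 notes, the check is only race-free if nothing else in the process can fork between the check and the kill(2).

```cpp
#include <cassert>
#include <fstream>
#include <signal.h>
#include <sstream>
#include <string>
#include <sys/types.h>
#include <unistd.h>

// Parses field 22 (starttime) from the content of /proc/<pid>/stat.
// Returns 0 if the line cannot be parsed.
unsigned long long parseStartTime(const std::string& stat) {
  // Field 2 (the command name) is parenthesized and may contain spaces or
  // ')' itself, so resume parsing after the last ')'.
  std::string::size_type pos = stat.rfind(')');
  if (pos == std::string::npos) return 0;
  std::istringstream in(stat.substr(pos + 1));
  std::string skipped;
  // Field 3 (state) is the first token after ')'; skip fields 3 through 21.
  for (int i = 0; i < 19; ++i) {
    if (!(in >> skipped)) return 0;
  }
  unsigned long long startTime = 0;
  if (!(in >> startTime)) return 0;
  return startTime;
}

// Reads the start time of a live process, or 0 if it cannot be read.
unsigned long long readStartTime(pid_t pid) {
  std::ifstream file("/proc/" + std::to_string(pid) + "/stat");
  std::string line;
  if (!std::getline(file, line)) return 0;
  return parseStartTime(line);
}

// Sends sig only if pid still has the start time that was recorded when the
// child was created. Returns true if the signal was sent.
bool killIfSameProcess(pid_t pid, unsigned long long expectedStart, int sig) {
  assert(pid > 0);
  unsigned long long current = readStartTime(pid);
  if (current == 0 || current != expectedStart) return false;
  return ::kill(pid, sig) == 0;
}
```

A timer callback would then call something like `killIfSameProcess(pid, rememberedStart, SIGKILL)` instead of calling kill(2) directly.
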
  • 13. Why is it safe?
      • If two processes start at almost the same time, their pids must be different
      • If two processes happen to have the same pid, their start times must be different
          • It takes seconds for pids to wrap, and start time is monotonic
      • Since zurg slave is single-threaded, there is no race condition between checking and killing
          • Don't run zurg slave as root (it quits if euid == 0)
          • Don't run two zurg slaves with the same uid on one box
  • 14. Capture stdout & stderr: simple?
      • Two pipes are needed: dup2() the write fds to 1 and 2 in the child, and read the other ends of the two pipes in the parent
      • Keep the data in memory and send it back when the command finishes
          • The command 'cat /dev/zero' will blow up zurg slave
      • We must limit the size of stdout and stderr
          • The default size is 1024 KiB
      • Two approaches when the size exceeds the limit:
          • Stop reading, i.e. block the writer, and wait until timeout
          • Close the read side of the pipe, i.e. kill the child with SIGPIPE
          • Directly sending a SIGPIPE signal doesn't work
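
The size limit on slide 14 can be illustrated with a bounded read loop. The sketch below is an assumption-laden simplification: `readLimited` and `ReadResult` are hypothetical names, and it reads a blocking fd in a loop, whereas a single-threaded event loop would do one non-blocking read per readiness event.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <string>
#include <unistd.h>

enum class ReadResult { kEof, kLimitReached, kError };

// Hypothetical helper: appends data read from fd to *out until EOF or until
// *out holds `limit` bytes, whichever comes first.
ReadResult readLimited(int fd, std::size_t limit, std::string* out) {
  assert(out != nullptr);
  char buf[4096];
  while (out->size() < limit) {
    std::size_t want = limit - out->size();
    if (want > sizeof buf) want = sizeof buf;
    ssize_t n = ::read(fd, buf, want);
    if (n == 0) return ReadResult::kEof;
    if (n < 0) {
      if (errno == EINTR) continue;
      return ReadResult::kError;
    }
    out->append(buf, static_cast<std::size_t>(n));
  }
  return ReadResult::kLimitReached;
}
```

On `kLimitReached` the caller picks one of the two approaches from the slide: stop reading and let the timeout fire, or close the read end so that the child's next write fails with SIGPIPE.
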
  • 15. Race condition at process exit
      • When a child exits, all its open fds are closed
          • The parent's read(2) returns 0; it should then close the fd, otherwise POLLHUP causes a busy loop
          • A child could also close them on purpose before dying
      • The "process exited" and "std{out,err} fds closed" events can arrive in any order
          • Is there any in-flight data that has not been received yet?
      • The lifetime management of Process/Pipe objects is also subtle, as fds are reused so aggressively
      • Read the code to find out how to do it correctly
  • 16. Run Command Request
        message RunCommandRequest {
          required string command = 1;
          optional string cwd = 2 [default = "/tmp"];
          repeated string args = 3;
          repeated string envs = 4;
          optional bool envs_only = 5 [default = false];
          optional int32 max_stdout = 6 [default = 1048576];
          optional int32 max_stderr = 7 [default = 1048576];
          optional int32 timeout = 8 [default = 60];
          optional int32 max_memory_mb = 9 [default = 32768];
        }
  • 17. Run Command Response
        message RunCommandResponse {
          required int32 error_code = 1;
          optional int32 pid = 2;
          optional int32 status = 3;
          optional bytes std_output = 4;
          optional bytes std_error = 5;
          optional int64 start_time_us = 16;
          optional int64 finish_time_us = 17;
          optional float user_time = 18;
          optional float system_time = 19;
          optional int64 memory_maxrss_kb = 20;
          // optional int64 ctxsw = 21;
          optional int32 exit_status = 30 [default = 0];
          optional int32 signaled = 31 [default = 0];
          optional bool coredump = 32 [default = false];
        }
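
Several RunCommandResponse fields look as if they could be filled straight from what wait4(2) returns (slide 9). The sketch below shows one plausible mapping. It is an assumption, not taken from the zurg source: the `ExitInfo` struct only mirrors the field names, `signaled` is assumed to carry the terminating signal number, and the times are assumed to be in seconds.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/resource.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Illustrative mirror of a few RunCommandResponse fields (assumed semantics).
struct ExitInfo {
  int status = 0;             // raw wait status
  int exit_status = 0;        // valid when the child exited normally
  int signaled = 0;           // terminating signal number, 0 if none
  bool coredump = false;
  double user_time = 0.0;     // seconds
  double system_time = 0.0;   // seconds
  long memory_maxrss_kb = 0;  // ru_maxrss is in kilobytes on Linux
};

static double toSeconds(const timeval& tv) {
  return static_cast<double>(tv.tv_sec) + tv.tv_usec / 1e6;
}

ExitInfo decodeExit(int status, const struct rusage& usage) {
  ExitInfo info;
  info.status = status;
  if (WIFEXITED(status)) {
    info.exit_status = WEXITSTATUS(status);
  } else if (WIFSIGNALED(status)) {
    info.signaled = WTERMSIG(status);
#ifdef WCOREDUMP
    info.coredump = WCOREDUMP(status) != 0;
#endif
  }
  info.user_time = toSeconds(usage.ru_utime);
  info.system_time = toSeconds(usage.ru_stime);
  info.memory_maxrss_kb = usage.ru_maxrss;
  return info;
}

// Blocks until pid terminates, then decodes its status and resource usage.
bool waitAndDecode(pid_t pid, ExitInfo* info) {
  assert(info != nullptr);
  int status = 0;
  struct rusage usage = {};
  pid_t reaped;
  do {
    reaped = ::wait4(pid, &status, 0, &usage);
  } while (reaped < 0 && errno == EINTR);
  if (reaped != pid) return false;
  *info = decodeExit(status, usage);
  return true;
}
```

A single-threaded server would pass WNOHANG to wait4(2) from its SIGCHLD handling path rather than block as this sketch does.
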
  • 18. Run Script
      • RunCommand with the script file content provided in the request
      • A programmatic way to run slightly different scripts on many hosts
  • 19. Application management
      • Start/monitor/stop applications
          • Applications, a.k.a. services, are long-running processes
          • Apps can be written in C++/Java/Python/etc.
      • Shares most of the functionality of RunCommand
          • stdout/stderr are redirected to files, not captured
          • No timeout
      • Intrusive vs. non-intrusive
          • Can zurg_slave manage any application?
          • Should the managed application follow some rules?
  • 20. How to detect that an app has exited
      • Polling (pid and start time)
          • Not real time; there is always a poll interval
          • How do you know that a given process is the application?
      • SIGCHLD
          • Not 100% reliable, so call wait(2) periodically
      • Pipe: leave the write side open in the child process and read it in zurg_slave; when the app exits, read(2) returns 0
          • Reliable and prompt
          • The application must not close the fd* (intrusive!)
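
The pipe approach on slide 20 can be sketched as follows. This is a simplified illustration under stated assumptions: the function names are hypothetical, `childMain` stands in for the exec of the real application, and the parent blocks in read(2) where zurg_slave would register the fd with its event loop.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: forks a child that runs childMain() and returns the
// read end of a pipe whose write end is held only by the child.
// Returns -1 if pipe or fork failed.
int forkWithLifePipe(void (*childMain)(), pid_t* childPid) {
  assert(childMain != nullptr && childPid != nullptr);
  int fds[2];
  if (::pipe(fds) < 0) return -1;
  pid_t pid = ::fork();
  if (pid < 0) {
    ::close(fds[0]);
    ::close(fds[1]);
    return -1;
  }
  if (pid == 0) {  // child
    ::close(fds[0]);
    // fds[1] deliberately stays open (and has no FD_CLOEXEC), so it lives
    // exactly as long as the child does, unless the child closes it itself.
    childMain();
    ::_exit(0);
  }
  // The parent must drop its own write end, or EOF would never arrive.
  ::close(fds[1]);
  *childPid = pid;
  return fds[0];
}

// Blocks until every write end is closed, i.e. read(2) returns 0.
bool waitForExit(int readFd) {
  char byte;
  ssize_t n;
  do {
    n = ::read(readFd, &byte, 1);
  } while (n > 0 || (n < 0 && errno == EINTR));
  return n == 0;
}
```

EOF arrives as soon as the kernel closes the child's fds at exit, which is what makes this approach prompt. It is also what makes it intrusive: an application that closes unknown fds at startup would trigger a false "exited" event.
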
  • 21. What if zurg_slave crashes?
      • How do we prevent starting duplicate services?
          • SIGCHLD and pipe(2) are nonrenewable: a restarted zurg_slave cannot get them back
      • Sockets? The app reconnects to the zurg slave on localhost
          • i.e. a heartbeat between the app and zurg slave
          • Even more intrusive: retry logic is needed in every language
      • Other thoughts?
          • Another layer of indirection?
  • 22. To be continued
      • Collecting health and performance data
      • Periodic heartbeats to the master
          • Process status, performance metrics
      • Zurg slave is 50% done as of the end of April 2012
  • 23. Zurg Master
      • A multithreaded program
      • All of its status is retrievable from outside
          • Easy to build Web/GUI frontends
      • Coding has not started yet

Editor's Notes

  1. * Written in scripting languages
  2. * Must be done in the child process and passed back to the parent