
Practical Fault Tolerance in Elixir - Alexei Sholik

Elixir Club 10
March 17, 2018
Kyiv

Published in: Software

  1. Practical Fault Tolerance in Elixir. Alexei Sholik, Elixir Club Kyiv, 17 Mar 2018
  2. About me: Backend engineer at Contractbook.co. Co-host at BeamEaters podcast. Contributor to Elixir. github.com/alco
  3. What is fault tolerance?
  4. Why it’s important
  5. Why do only Erlang/Elixir communities seem to care about it?
  6. A practical example (demo)
  7. “Let it crash” is not the full story
  8. Fail fast → restart → try again
  9. Building blocks of fault tolerance
  10. Process
  11. Error
  12. Link
  13. Monitor
  14. # A simplified synchronous call built on a monitor
      def call(process, request, timeout) do
        # Monitor the target so we learn about its death while waiting
        monitor = Process.monitor(process)
        send(process, {:"$gen_call", {self(), monitor}, request})

        receive do
          {^monitor, reply} ->
            Process.demonitor(monitor, [:flush])
            {:ok, reply}

          # The target died before replying: exit with the same reason
          {:DOWN, ^monitor, _, _, reason} ->
            exit(reason)
        after
          timeout ->
            Process.demonitor(monitor, [:flush])
            exit(:timeout)
        end
      end
  15. Task
  16. Supervisor
  17. Bohrbug
  18. Heisenbug
  19. Improving our example (demo)
  20. (Diagram) ..., DB, SuperDocs.Web.Endpoint, <request process>, <db_connection process>, Ecto’s connection pool supervisor
  21. (Diagram) DB, SuperDocs.Web.Endpoint, <request process>, <db_connection process>, SuperDocs.TaskSup, User.send_confirmation_email, ...
  22. Durable message queue: hex.pm/packages/amqp
  23. (Diagram) DB, SuperDocs.Web.Endpoint, <request process>, <db_connection process>, RabbitMQ, SuperDocs.TaskSup, ImportService.import_documents_for, AMQPClient, AMQPWorker, AMQPWorker, ...
  24. Other tools and techniques
  25. Remote error tracking: Sentry, Rollbar, log aggregators
  26. Exponential backoff: hex.pm/packages/db_connection, hex.pm/packages/gen_retry
  27. Alternative supervisor: hex.pm/packages/director
  28. And so on...
  29. Recap
      • Trust in OTP…
      • ...but don’t dismiss other tools
      • Anticipate failures
      • Isolate failures
      • Fail fast, restart, try again
  30. Reading material
      • Why do computers fail and what can be done about it?
      • Making reliable distributed systems in the presence of software errors
      • It's About the Guarantees
      • On Erlang, State and Crashes
      • Error Kernels
  31. Thank you! Questions?
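
The "fail fast → restart → try again" cycle from slide 8 can be sketched with a supervised GenServer. This is a minimal illustration, not code from the talk: the `Demo.Counter` worker and the `Demo.Wait` polling helper are hypothetical names introduced here. The worker crashes on purpose, the supervisor starts a fresh copy, and the caller simply tries again against the new process.

```elixir
defmodule Demo.Counter do
  # Hypothetical worker, used only for illustration (not from the slides).
  use GenServer

  def start_link(initial), do: GenServer.start_link(__MODULE__, initial, name: __MODULE__)
  def increment, do: GenServer.call(__MODULE__, :increment)
  def crash, do: GenServer.cast(__MODULE__, :crash)

  @impl true
  def init(initial), do: {:ok, initial}

  @impl true
  def handle_call(:increment, _from, count), do: {:reply, count + 1, count + 1}

  @impl true
  def handle_cast(:crash, _count), do: raise("boom")
end

defmodule Demo.Wait do
  # Hypothetical helper: polls until `name` is registered to a pid other than `old_pid`.
  def for_restart(name, old_pid, attempts \\ 50)

  def for_restart(_name, _old_pid, 0), do: raise("worker was not restarted in time")

  def for_restart(name, old_pid, attempts) do
    case Process.whereis(name) do
      pid when is_pid(pid) and pid != old_pid ->
        pid

      _ ->
        Process.sleep(10)
        for_restart(name, old_pid, attempts - 1)
    end
  end
end

# Start the worker under a supervisor with an initial state of 0.
{:ok, _sup} = Supervisor.start_link([{Demo.Counter, 0}], strategy: :one_for_one)

Demo.Counter.increment()
count_before_crash = Demo.Counter.increment()

# Monitor the worker so we can observe the crash from the outside.
old_pid = Process.whereis(Demo.Counter)
ref = Process.monitor(old_pid)
Demo.Counter.crash()

receive do
  {:DOWN, ^ref, :process, ^old_pid, _reason} -> :ok
after
  1_000 -> raise "worker did not crash"
end

# The supervisor restarts the worker with its initial state; then we try again.
new_pid = Demo.Wait.for_restart(Demo.Counter, old_pid)
count_after_restart = Demo.Counter.increment()
```

The restarted worker begins from the state given in its child spec, so anything that must survive a crash has to live outside the crashing process.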
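
Slide 21 places `User.send_confirmation_email` under `SuperDocs.TaskSup`. Assuming that refers to a `Task.Supervisor` (the slides do not say so explicitly), the isolation idea can be sketched as follows. `Demo.TaskSup` and the failing function body are stand-ins invented for this example.

```elixir
# Start a task supervisor; in an application it would sit in the supervision tree.
{:ok, _sup} = Task.Supervisor.start_link(name: Demo.TaskSup)

# Run the side effect in its own process. The task is linked to the
# supervisor, not to the caller, so its crash cannot take the caller down.
{:ok, task_pid} =
  Task.Supervisor.start_child(Demo.TaskSup, fn ->
    # Stand-in for a call such as User.send_confirmation_email
    raise "SMTP server unreachable"
  end)

# Monitor the task only to observe the outcome for this demonstration.
ref = Process.monitor(task_pid)

task_outcome =
  receive do
    {:DOWN, ^ref, :process, ^task_pid, _reason} -> :task_died
  after
    1_000 -> :still_running
  end

# The calling process (the request process, in the slide's terms) is unaffected.
caller_alive = Process.alive?(self())
```

Tasks started with `Task.Supervisor.start_child/2` are not restarted by default, which is why the later slides bring in a durable message queue for work that must not be lost.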
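
Exponential backoff (slide 26) can be illustrated with a small hand-written retry helper. This sketch does not reproduce the API of `db_connection` or `gen_retry`; the `Demo.Retry` module, its option names, and the flaky function are assumptions made for the example.

```elixir
defmodule Demo.Retry do
  # Hypothetical helper: retries `fun` until it returns {:ok, _},
  # doubling the delay after each failure up to `:max_delay` milliseconds.
  def with_backoff(fun, opts \\ []) do
    retries = Keyword.get(opts, :retries, 5)
    base_delay = Keyword.get(opts, :base_delay, 100)
    max_delay = Keyword.get(opts, :max_delay, 5_000)
    attempt(fun, retries, base_delay, max_delay)
  end

  defp attempt(fun, retries_left, delay, max_delay) do
    case fun.() do
      {:ok, _} = ok ->
        ok

      # Out of retries: give up and return the last error.
      {:error, _} = error when retries_left == 0 ->
        error

      {:error, _} ->
        Process.sleep(delay)
        attempt(fun, retries_left - 1, min(delay * 2, max_delay), max_delay)
    end
  end
end

# A flaky operation that fails twice and then succeeds.
{:ok, calls} = Agent.start_link(fn -> 0 end)

flaky = fn ->
  n = Agent.get_and_update(calls, fn n -> {n + 1, n + 1} end)
  if n < 3, do: {:error, :unavailable}, else: {:ok, n}
end

result = Demo.Retry.with_backoff(flaky, base_delay: 10, max_delay: 50)
always_failing = Demo.Retry.with_backoff(fn -> {:error, :down} end, retries: 2, base_delay: 1)
```

A production version would usually add random jitter to the delay so that many clients do not retry in lockstep.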
