
Optimizing Erlang Code for Speed

Considers optimizations that make it possible to reach microsecond latencies and gigabyte-scale throughput in an intelligent network management solution written in Erlang.

Published in: Technology

  1. Optimizing Erlang code for speed
     Revelations from a real-world project based on Erlang on Xen
     Maxim Kharchenko, CTO, Cloudozer LLP
     mk@cloudozer.com
     ErlangDripro2014
  2. The road map
     ● Erlang on Xen intro
     ● Speed-related notes
       – Arguments are registers
       – ETS tables are (mostly) ok
       – Do not overuse records
       – GC is key to speed
       – gen_server vs. barebone process
       – NIFs: more pain than gain
       – Fast counters
     ● Q&A
  3. Erlang on Xen 101
     ● A new Erlang runtime that runs without an OS
     ● Conceived in 2009
     ● Highly compatible with Erlang/OTP
     ● Built from scratch, not a "port"
     ● Optimised for low startup latency
     ● Not open source (yet)
     ● The public build service is free
     Go to erlangonxen.org
  4. Zerg demo: zerg.erlangonxen.org
  5. The road map
     ● Erlang on Xen intro
     ● Speed-related notes
       – Arguments are registers
       – ETS tables are (mostly) ok
       – Do not overuse records
       – GC is key to speed
       – gen_server vs. barebone process
       – NIFs: more pain than gain
       – Fast counters
     ● Q&A
  6. Arguments are registers
         animal(batman = Cat, Dog, Horse, Pig, Cow, State) ->
             feed(Cat, Dog, Horse, Pig, Cow, State);
         animal(Cat, deli = Dog, Horse, Pig, Cow, State) ->
             pet(Cat, Dog, Horse, Pig, Cow, State);
         ...
     ● Many arguments do not make a function any slower
     ● Do not reshuffle arguments:
         %% SLOW
         animal(Cat, Dog, Horse, Pig, Cow, State) ->
             feed(Goat, Cat, Dog, Horse, Pig, Cow, State);
         ...
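A minimal sketch of the "keep arguments in place" advice, using a hypothetical module `arg_order` and function `sum_steps/4` that are not part of the talk. Every recursive call passes its arguments in the positions they arrived in, so only the values that actually change have to be rewritten.

```erlang
-module(arg_order).
-export([sum_steps/4]).

%% Adds B + C to Acc, N times (hypothetical example, not from the slides).
%% N, B, C and Acc stay in the same argument positions on every call,
%% so no value has to be moved to a different register before the call.
sum_steps(0, _B, _C, Acc) ->
    Acc;
sum_steps(N, B, C, Acc) when is_integer(N), N > 0 ->
    sum_steps(N - 1, B, C, Acc + B + C).
```

For instance, `arg_order:sum_steps(2, 1, 2, 0)` adds 3 twice and returns 6.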
  7. ETS tables are (mostly) ok
     ● A small ETS table lookup costs about as much as 10 function activations
     ● Do not use ets:tab2list() inside tight loops
     ● Treat ETS as a database, not a pool of global variables
     ● 1-2 ETS lookups on the fast path are ok
     ● Beware that ets:lookup() and similar calls create a copy of the data on the heap of the caller, similarly to message passing
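As an illustration of a single keyed lookup on the fast path, here is a small sketch. The module name `ets_fast_path`, the table name `routes` and the stored values are assumptions made up for this example; only the `ets` calls themselves are standard.

```erlang
-module(ets_fast_path).
-export([demo/0]).

%% Creates a small table, reads one key with ets:lookup/2 and returns its value.
%% Table name and contents are hypothetical.
demo() ->
    Tab = ets:new(routes, [set, private]),
    true = ets:insert(Tab, [{eth0, 1500}, {eth1, 9000}]),
    %% One keyed lookup; the matching tuple is copied to the heap of the
    %% calling process, so keep the stored tuples small.
    Mtu = case ets:lookup(Tab, eth1) of
              [{eth1, Value}] -> Value;
              [] -> undefined
          end,
    true = ets:delete(Tab),
    Mtu.
```

A loop that needs several values would do one keyed lookup per value rather than calling `ets:tab2list/1`, which copies the whole table on every call.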
  8. Do not overuse records
     ● setelement() creates a copy of the tuple
     ● State#state{foo=Foo1,bar=Bar1,baz=Baz1} creates 3(?) copies of the tuple
     ● Use tuples explicitly in the performance-critical sections to see the heap footprint of the code
         %% from 9p.erl
         mixer({rauth,_,_}, {tauth,_,AFid,_,_}, _) -> {write_auth,AFid};
         mixer({rauth,_,_}, {tauth,_,AFid,_,_,_}, _) -> {write_auth,AFid};
         mixer({rwrite,_,_}, _, initial) -> start_attaching;
         mixer({rerror,_,_}, _, initial) -> auth_failed;
         mixer({rlerror,_,_}, _, initial) -> auth_failed;
         mixer({rattach,_,Qid}, {tattach,_,Fid,_,_,AName,_}, initial) ->
             {attach_more,Fid,AName,qid_type(Qid)};
         mixer({rclunk,_}, {tclunk,_,Fid}, initial) -> {forget,Fid};
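The difference between step-by-step updates and an explicit tuple can be sketched as follows. The module `record_update` and the `#state{}` record definition are hypothetical, chosen to mirror the field names on the slide.

```erlang
-module(record_update).
-export([demo/0]).

%% Hypothetical record mirroring the slide's field names.
-record(state, {foo = 0, bar = 0, baz = 0}).

demo() ->
    S0 = #state{},
    %% Three separate setelement/3 calls: each one returns a new tuple.
    T1 = setelement(#state.foo, S0, 1),
    T2 = setelement(#state.bar, T1, 2),
    T3 = setelement(#state.baz, T2, 3),
    %% Writing the tuple out makes the single four-element allocation visible.
    T4 = {state, 1, 2, 3},
    %% Both routes produce the same term.
    T3 =:= T4.
```

Both forms yield the same term; the explicit tuple simply shows in the source how much heap the expression needs.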
  9. Garbage collection is key to speed
     ● Heap is a list of chunks
     ● 'new heap' is close to its head, 'old heap' is close to its tail
     ● A GC run takes 10μs on average
     ● GC may run 1000s of times per second
     ● How to tackle GC-related issues:
       – (Priority 1) Call erlang:garbage_collect() at strategic points
       – (Priority 2) For the fastest code avoid GC completely: restart the fast process regularly
       – (Priority 3) Use the fullsweep_after option
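A hedged sketch of priorities 1 and 3, written against standard Erlang/OTP. The module `gc_points`, the message shapes and the value `0` for `fullsweep_after` are assumptions for illustration, not settings recommended in the talk.

```erlang
-module(gc_points).
-export([demo/0]).

%% Spawns a worker with a per-process fullsweep_after setting, hands it one
%% batch of work and waits for the result.
demo() ->
    Parent = self(),
    %% {fullsweep_after, 0} is an illustrative value: it requests a full
    %% sweep on every collection of this process.
    Pid = spawn_opt(fun() -> worker(Parent) end, [{fullsweep_after, 0}]),
    Pid ! {work, 1000},
    receive
        {done, Pid, Sum} -> Sum
    after 5000 ->
        timeout
    end.

worker(Parent) ->
    receive
        {work, N} ->
            Sum = lists:sum(lists:seq(1, N)),
            %% A "strategic point": the batch is finished, its temporary
            %% list is garbage, and no latency-critical work is pending.
            true = erlang:garbage_collect(),
            Parent ! {done, self(), Sum}
    end.
```

Collecting between batches moves the GC pause to a moment the process chooses instead of one that lands in the middle of the fast path.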
  10. gen_server vs barebone process
      ● Message passing using gen_server:call() is 2x slower than Pid ! Msg
      ● For speedy code prefer barebone processes to gen_servers
      ● Design Principles are about high availability, not high performance
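What a "barebone" process can look like is sketched below. The module `bare_server` and its message protocol are hypothetical; the sketch deliberately leaves out the supervision, timeouts on the server side and code-upgrade handling that gen_server would provide.

```erlang
-module(bare_server);
```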
  11. NIFs: more pain than gain
      ● A new principle of Erlang development: do not use NIFs
      ● For a small performance boost, NIFs undermine key properties of Erlang: reliability and soft-realtime guarantees
      ● Most of the time Erlang code can be made as fast as C
      ● Most performance problems of Erlang are traceable to NIFs, or to external C libraries, which are similar
      ● Erlang on Xen does not have NIFs and we do not plan to add them
  12. Fast counters
      ● 32-bit or 64-bit unsigned integer counters with overflow: trivial in C, not easy in Erlang
      ● FIXNUMs are signed 29-bit integers, BIGNUMs consume heap and are 10-100x slower
      ● Use two variables for a counter?
          foo(C1, 16#ffffff, ...) -> foo(C1+1, 0, ...);
          foo(C1, C2, ...) -> foo(C1, C2+1, ...);
          ...
      ● Erlang on Xen has a new experimental feature, fast counters:
          erlang:new_counter(Bits) -> Ref
          erlang:increment_counter(Ref, Incr)
          erlang:read_counter(Ref)
          erlang:release_counter(Ref)
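The two-variable idea from the slide can be made concrete in plain Erlang. The module `two_word_counter` and its functions are hypothetical; the sketch keeps each word at 24 bits so both stay small integers, and it does not handle overflow of the high word.

```erlang
-module(two_word_counter).
-export([count/3, value/2]).

%% Largest value of the low word (24 bits), as on the slide.
-define(LOW_MAX, 16#ffffff).

%% Increments the counter N times, starting from the words Hi and Lo.
%% Both words travel as plain function arguments.
count(0, Hi, Lo) ->
    {Hi, Lo};
count(N, Hi, ?LOW_MAX) when N > 0 ->
    %% Low word is full: carry into the high word and reset the low word.
    count(N - 1, Hi + 1, 0);
count(N, Hi, Lo) when N > 0 ->
    count(N - 1, Hi, Lo + 1).

%% Combines the two words into one integer, e.g. for reporting. The result
%% may be a bignum, so keep this off the fast path.
value(Hi, Lo) ->
    (Hi bsl 24) bor Lo.
```

Starting three increments at `Lo = 16#fffffe` crosses the carry once, so `two_word_counter:count(3, 0, 16#fffffe)` returns `{1, 1}`.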
  13. Questions?
