Flink Forward Berlin 2018: Raj Subramani - "A streaming Quantitative Analytics engine"

The application of Quantitative Analytics to trades for the generation of Risk and P&L metrics has traditionally followed a batch-based approach. Regulatory changes impose an increasing demand for compute on financial institutions, along with a growing demand for real-time analytics due to increased volumes in eTrading across all asset classes.

The talk is based on a use case for pricing Interest Rate Swaps using Apache Beam, with a call to an external C++ analytics process. It describes the performance characteristics when operating in a non-cloud environment using Apache Flink, as opposed to Google Cloud Dataflow.
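The out-of-process call mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the speaker's implementation: the child program is a hypothetical Python stand-in for the C++ analytics binary, JSON stands in for the Protobuf messages used in the talk, and the "valuation" is a toy placeholder rather than a real swap pricer. The function name `price_out_of_process` and the field names are invented for the example.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for the external C++ analytics binary: a small child
# program that reads one JSON trade from stdin and writes one JSON result.
CHILD_PROGRAM = r"""
import json, sys
trade = json.load(sys.stdin)
if trade["notional"] <= 0:
    sys.stderr.write("notional must be positive")
    sys.exit(1)
# Toy placeholder, not a real interest rate swap valuation.
value = trade["notional"] * (trade["fixed_rate"] - trade["floating_rate"])
json.dump({"trade_id": trade["trade_id"], "value": value}, sys.stdout)
"""


def price_out_of_process(trade, timeout_sec=10):
    """Send one trade to a child process; return (risk_output, error_output)."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_PROGRAM],
        input=json.dumps(trade),   # serialized input goes over standard in
        capture_output=True,
        text=True,
        timeout=timeout_sec,
    )
    if completed.returncode != 0:
        # Route failures to a separate error output instead of raising,
        # so that one bad trade does not fail every other trade.
        error = {"trade_id": trade["trade_id"], "error": completed.stderr.strip()}
        return None, error
    return json.loads(completed.stdout), None
```

In a Beam pipeline a function like this would typically be called from inside a DoFn, with the two return values emitted to separate outputs. Keeping the serialization boundary explicit is what allows the analytics process and the pipeline code to be unit tested independently.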

The talk will touch upon the subtle differences that arise when operating across multiple runners. It will suggest approaches to portability when architecting for a multi-runner operational environment.
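One way to keep a pipeline portable is to isolate everything runner-specific in a single place. The sketch below assumes the pipeline is launched with Beam Java SDK style command-line options (`--runner`, `--flinkMaster`, `--parallelism`, `--project`, `--stagingLocation`, `--maxNumWorkers`); the helper `runner_args` and all setting values are hypothetical and only illustrate the idea that Flink is given an explicit parallelism while Dataflow is given a staging location and a worker ceiling.

```python
def runner_args(runner, settings):
    """Build runner-specific launch arguments; pipeline code stays unchanged."""
    args = ["--runner=" + runner]
    if runner == "FlinkRunner":
        # On Flink the caller states the parallelism explicitly.
        args.append("--flinkMaster=" + settings["flink_master"])
        args.append("--parallelism=" + str(settings["parallelism"]))
    elif runner == "DataflowRunner":
        # On Dataflow the service scales workers itself, up to a ceiling,
        # and the job's jars are staged to a Cloud Storage location.
        args.append("--project=" + settings["project"])
        args.append("--stagingLocation=" + settings["staging_location"])
        args.append("--maxNumWorkers=" + str(settings["max_workers"]))
    else:
        raise ValueError("unsupported runner: " + runner)
    return args


# Hypothetical settings, for illustration only.
flink_args = runner_args(
    "FlinkRunner", {"flink_master": "localhost:8081", "parallelism": 8}
)
dataflow_args = runner_args(
    "DataflowRunner",
    {
        "project": "my-project",
        "staging_location": "gs://my-bucket/staging",
        "max_workers": 50,
    },
)
```

With this split, switching runners means changing one settings dictionary rather than touching the transforms.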

Published in: Technology

  1. A Streaming Quantitative Analytics Engine. Raj Subramani <raj@subramani.com>
  2. The Challenge. Move to Centralized Clearing: lower margin, higher volumes. Continued Regulatory Expansion: principle-based compliance, non-financial risks, contagion risk, model risk. Technology and Analytics as Risk Muscle: big data, machine learning. Improved Decision Making: bias recognition, bias elimination. Cost Challenges: Capex, Opex.
  3. The Anatomy of a Risk Engine. Diagram labels: Source, Workflow, Distribution, Results; Trades, Market Data; Results Cache; Action A, Action B, Action C.
  4. Cloud Dataflow - Unified Batch and Stream. Any combination of primitive & custom transformations: filtered; filtered & grouped; filtered, grouped & windowed. Batch, Stream.
  5. Apache Beam
  6. Apache Beam. Diagram labels: Language A SDK, Language B SDK, Language C SDK; The Beam Model; Runner 1, Runner 2, Runner 3; Language A, Language B, Language C.
  7. The Beam Model & Cloud Dataflow. Apache Beam: a unified model for batch and stream processing, supporting multiple runtimes. Google Cloud Dataflow: a great place to run Beam.
  8. DoFn – Functional Programming Style. Diagram labels: Input (PCollection), Function, Output (PCollection), Output (PCollection).
  9. Workflow Pipeline. Diagram labels: DoFn A, DoFn B, DoFn C, DoFn D.
  10. ParDo – Parallel Processing of DoFns (known as a PTransform). Diagram labels: ParDo; DoFn A, DoFn B, DoFn C, DoFn D.
  11. ParDo Distribution to Workers. Diagram labels: ParDo, DoFn A; Worker: DoFn A (three times).
  12. Dataflow Fusion. Diagram labels: ParDo, DoFn A; Worker: DoFn A, DoFn B (three times).
  13. Native Code Execution. Diagram labels: Worker, Container, Process (Java), JNI, C++.
  14. Out of Process Call. Diagram labels: Worker, Container, Process (Java), JNI, C++, Standard In, Process Signal.
  15. Out of Process Call. Diagram labels: Process Signal, Worker, Container, Process (Java), JNI, C++, Interface.
  16. Out of Process Call. Diagram labels: Worker, Container, Process (Java), JNI, C++, Disk, Process Signal, Input, Output.
  17. Out of Process Call. Protobuf: Java, C++, C#, Python.
  18. Out of Process Call. Diagram labels: Worker, Container, Process (Java), C++, Disk, Process Signal, Protobuf Input, Protobuf Output.
  19. Out of Process Call: Testing! Diagram labels: C++, Protobuf, Unit Test (C++); Protobuf, Unit Test (Java), DoFn.
  20. Out of Process Call. Diagram labels: Worker, Container, Process (Java), Go, Protobuf, C++, Python.
  21. Out of Process Call. Diagram labels: C++; Input (Protobuf), Risk Output (Protobuf), Error Output (Protobuf); Worker, Container, Process (Java).
  22. Module Separation. Diagram labels: SOURCE, SINK; Cloud Pub/Sub, Cloud SQL, Cloud Spanner, Cloud Bigtable, Cloud Storage; Pipeline code and functions; I/O, I/O.
  23. Module Separation. Diagram labels: SOURCE, SINK; Cloud SQL; Pipeline code and functions; I/O, I/O; Cloud Storage.
  24. Apache Flink
  25. What is Apache Flink? Apache Flink is an open source stream processing framework. Code written for Apache Beam can run on Apache Flink.
  26. Building a Risk Engine. Diagram labels: Source, Workflow, Distribution, Results; Trades, Market Data; Results Cache; Action A, Action B, Action C.
  27. Building Blocks: Flink. Diagram labels: Source, Workflow, Distribution, Results; Results Cache; Action A, Action B, Action C.
  28. Building Blocks: Flink Running Beam. Diagram labels: Source, Workflow, Distribution, Results; Results Cache; Action A, Action B, Action C.
  29. Building Blocks: Disk. Diagram labels: Source, Workflow, Distribution, Results; Results Cache; Action A, Action B, Action C.
  30. What Do You Need to Do? Diagram labels: Results; Action A, Action B, Action C; Configure/Scale.
  31. I Really Need to Run on Premises: https://data-artisans.com/da-platform-2
  32. During Development. Diagram labels: Source, Workflow, Distribution, Results; Results Cache; Action A, Action B, Action C.
  33. During Development: Local Disks. Diagram labels: Source, Workflow, Distribution, Results; Results Cache; Action A, Action B, Action C.
  34. During Testing: Use the Cloud
  35. Watch Out for Runner Differences. "I need an Uber Jar" vs. "I need a folder of Jars in Google Cloud Storage".
  36. Watch Out for Runner Differences. "What level of parallelism do you need?" vs. "I'll decide parallelism for you".
  37. Test Setup. Diagram labels: Results, Linux Blades; Results, Cloud Bigtable. Analytics: open source Quantlib v1.9.2 (for XML over JNI); open source Quantlib v1.10.0 (for Protobuf/Direct calls). Trade data: 2,000,000 plain vanilla mono currency interest rate swaps; 100,000 Bermudan Swaptions. Market data: interest rate curves built using FRA, Futures and Swaps in 12 currencies.
  38. Batch Results. Trades: 2,000,000 plain vanilla interest rate swaps. Market data: interest rate curves from FRA, Futures & Swaps, OIS & Libor in 12 currencies. Analytics: open source Quantlib v1.10.1. [Charts: wall-clock time (sec) against batch size (200,000; 400,000; 600,000; 800,000; 2,000,000). Panels: Flink: JNI/XML vs Protobuf; Dataflow: JNI/XML vs Protobuf; XML/JNI: Flink vs Dataflow; Protobuf/Direct C++: Flink vs Dataflow.]
  39. Scaling Out. Trades: 2,000,000 plain vanilla interest rate swaps. Market data: interest rate curves from FRA, Futures & Swaps, OIS & Libor in 12 currencies. Analytics: open source Quantlib v1.10.1. [Chart: wall-clock time (sec) against number of vCPUs deployed.] Scale out will depend on data structure and workflow logic. The more the workflow is controlled by Beam, the better the opportunity for dynamic rebalancing.
  40. Thank You
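
The "Module Separation" slides suggest keeping source and sink I/O apart from the pipeline code and functions. A minimal sketch of that idea follows, under stated assumptions: all names (`read_json_lines`, `add_notional_bucket`, `run`) and the record fields are hypothetical, plain Python iterables stand in for Beam I/O connectors, and the transform is a toy example rather than anything from the talk.

```python
import json


def read_json_lines(lines):
    """Source adapter: turn an iterable of JSON text lines into records."""
    for line in lines:
        if line.strip():
            yield json.loads(line)


def add_notional_bucket(trade):
    """Pipeline function: pure, no I/O, so it can be unit tested on its own."""
    bucket = "large" if trade["notional"] >= 1_000_000 else "small"
    return {**trade, "bucket": bucket}


def run(source, transform, sink):
    """Wire a source, a transform, and a sink; only this layer knows about I/O."""
    for record in source:
        sink(transform(record))


# During development the source and sink can be in-memory or local files;
# in production they would be swapped for cloud or cluster connectors.
raw_lines = [
    '{"trade_id": "T1", "notional": 2000000}',
    "",
    '{"trade_id": "T2", "notional": 50000}',
]
results = []
run(read_json_lines(raw_lines), add_notional_bucket, results.append)
```

Because `add_notional_bucket` never touches a file or a network service, the same function can be exercised in a unit test, on local disks during development, and in the cloud during testing.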
