The DevOps PaaS Infusion - May meetup
- 1. Cloud I/O in FS
John Davies
• 17th May 2012
Monday, 21 May 12
- 2. Agenda
• A quick look at data volumes in the front-office
• Front office enterprise architecture
• How can cloud help in this low-latency environment?
• Getting data into GigaSpaces
• A few examples
Confidential Information of C24 Technologies © 2012 C24 Technologies
- 3. Clouds are old!
• Most of us (here) have been using “grid” for a good
decade
• Cloud isn’t much different, it’s just a little more fluffy!
• Having computing resources in the cloud doesn’t solve
integration issues
• In fact it just means they need to handle higher volumes
• Getting Financial Services messages into a cloud for
processing requires some clever integration technology
• Even once it’s in the cloud you need some clever
technology to make the best of what you have
- 4. Front-Office
• One area that may not seem ideal for cloud solutions is
the front office
• Latency is critical, the 100ms latency to the cloud would
be like years for your average arbitrage trader
• But the split-(milli)-second decisions made by the algo
trading engines need to be based on reliable information
• The cloud is the perfect place to perform these
operations...
- 5. We need more CPU power!
• The graph below shows the Dow Jones daily trading
volumes since 1980, the y-axis is logarithmic
• Log(vol) vs time i.e. 8 = 100m, 9 = 1 billion, 10 = 10 billion (per day)
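As a quick check of the scale described above, the y-axis readings can be converted back to daily volumes. This is only a sketch; the function names are illustrative and not from the slides.

```python
import math

def volume_from_log(log_value: float) -> float:
    """Convert a log10 y-axis reading back to a daily trading volume."""
    return 10 ** log_value

def log_from_volume(volume: float) -> float:
    """Convert a daily trading volume to its position on the log10 y-axis."""
    return math.log10(volume)

# Print the three reference points quoted on the slide.
for reading in (8, 9, 10):
    print(reading, f"{volume_from_log(reading):,.0f} per day")
```

Each step of 1 on the axis is a tenfold increase in volume, which is why a gently rising line on this graph means explosive growth.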
- 6. Algo-trading
• It’s not that complicated... (in theory anyway)
• You need to have access to data from the exchange, come up with a trading strategy
(algorithm), write the code (usually in C, Python, R or something similar), deploy it
to a machine as close as you can physically get to the exchange (co-hosting)
• ... and collect your money :-)
• The algorithm is basically a program that says something
like
• If AAPL < opening price and MSFT > 30min moving avg and MSFT > opening price then
buy MSFT, sell AAPL
• But it can get way more complex
• Predictive models based on market data feed harmonics trying to anticipate where the
market will be in 200 µs
• Correlation trading of stocks that appear unrelated, but mathematically correlate
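The simple rule quoted above could be sketched roughly as follows. This is a toy illustration, not a production strategy: the `MovingAverage` class, the `decide` function, and the assumption of one price sample per minute (so a window of 30 samples stands in for the 30-minute average) are all hypothetical.

```python
from collections import deque

class MovingAverage:
    """Simple moving average over the last `window` price samples."""

    def __init__(self, window: int):
        self.prices = deque(maxlen=window)  # old samples drop off automatically

    def update(self, price: float) -> float:
        self.prices.append(price)
        return sum(self.prices) / len(self.prices)

def decide(apple_price, apple_open, msft_price, msft_open, msft_avg):
    """Return the orders for the rule on the slide, or an empty list."""
    if apple_price < apple_open and msft_price > msft_avg and msft_price > msft_open:
        return [("BUY", "MSFT"), ("SELL", "AAPL")]
    return []

# Assumption: one sample per minute, so 30 samples approximate 30 minutes.
msft_average = MovingAverage(window=30)
```

A real engine would evaluate this on every tick and add order sizing, risk limits and position tracking, none of which is shown here.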
- 7. So first get the data...
• Protocols to connect - each exchange is different!
• FIX, ITCH, OUCH, PINCH, SCRATCH, many are optimised for performance
• Proprietary APIs can reduce latency to the order of 20µs
• FIX / FAST is a good standard approach, it comes in
several versions and the latest can use 3 different
encodings...
• Standard FIX (tag/value pairs, all tags are integers)
• FixML (a very verbose XMLized version of the above)
• FAST (FIX Adapted for STreaming, like the standard version but compressed)
• When you’ve worked that out there are the venue-specific
dialects that need tuning for each exchange
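To make the standard tag/value encoding concrete, here is a minimal sketch of splitting a FIX message into its fields. The `parse_fix` helper is hypothetical, and the sample is a shortened fragment for illustration (it omits required fields such as body length and checksum), so it is not a valid message. Fields are delimited by the SOH character (0x01), often displayed as ^A.

```python
SOH = "\x01"  # FIX field delimiter, often displayed as ^A

def parse_fix(message: str) -> list[tuple[int, str]]:
    """Split a tag/value FIX message into ordered (tag, value) pairs."""
    fields = []
    for field in message.split(SOH):
        if not field:
            continue  # skip the empty piece after the trailing delimiter
        tag, _, value = field.partition("=")
        fields.append((int(tag), value))  # all FIX tags are integers
    return fields

# Shortened illustrative fragment: 35 = MsgType, 49 = SenderCompID,
# 270 = MDEntryPx, 271 = MDEntrySize.
sample = SOH.join(["8=FIXT.1.1", "35=X", "49=CME", "270=115060", "271=1"]) + SOH
fields = parse_fix(sample)
```

The result is kept as an ordered list rather than a dictionary because repeating groups can legitimately contain the same tag more than once.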
- 8. Now the OS & Language...
• Since the FIX engine is connecting to your back-end
servers you usually have to make a choice between...
• C / C++
• Still regarded as the fastest FIX engines and usually the choice for the arbitrage traders
• Latency is reliable (i.e. no garbage collection)
• Supporting all different versions of Linux, UNIX, MS, 32bit, 64bit etc. is a real pain
• Java
• Surprisingly only 3rd place, Java FIX engines are very fast but unless they are carefully designed garbage
collection can be a major issue
• Easiest to integrate into other architecture, most flexible
• .NET
• The most popular, simply because most small businesses start off on
Microsoft platforms (Excel etc.) and this fits in best
• Less used in the larger businesses (banks, major firms etc.)
- 9. Enterprise Architecture...
• It’s not exactly complicated
from an enterprise point of view
• Offices are often in “nice”
locations
• Of course it’s nothing to do with the
favourable tax
• Most of the infrastructure
runs in co-located boxes
• Co-hosting costs money and limits what
you can do
[Diagram: New York Stock Exchange (NYSE) and Chicago Mercantile Exchange (CME), each with a company FIX Engine and Trading Engine; Company Office]
- 10. So what about the cloud?
• So far no cloud, so where does it come into the picture?
• And I don’t mean the dark clouds appearing over the tax-havens
• The trader’s view of the world is a little window just a few
milliseconds wide, what if we could expand that?
• It’s like watching a fast movie with no controls, you sneeze as a shot’s fired and
you’ve missed half the plot (welcome to modern movies)
• What if we could skip back an hour or a day and replay
scenarios through our algo trading engines?
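The idea of skipping back and replaying a time window could look something like the sketch below. The `Tick` record and the `replay` function are hypothetical names for illustration; a real system would read stored exchange data rather than an in-memory list.

```python
from typing import Callable, Iterable, NamedTuple

class Tick(NamedTuple):
    timestamp_ms: int  # time of the price update, in milliseconds
    symbol: str
    price: float

def replay(ticks: Iterable[Tick], start_ms: int, end_ms: int,
           on_tick: Callable[[Tick], None]) -> int:
    """Feed stored ticks in [start_ms, end_ms) to a handler in time order."""
    count = 0
    for tick in sorted(ticks, key=lambda t: t.timestamp_ms):
        if start_ms <= tick.timestamp_ms < end_ms:
            on_tick(tick)  # e.g. the same callback the live engine uses
            count += 1
    return count

stored = [Tick(3000, "MSFT", 30.2), Tick(1000, "MSFT", 30.0), Tick(2000, "MSFT", 30.1)]
seen = []
replayed = replay(stored, 1500, 3500, seen.append)
```

Passing the live engine's own tick handler as `on_tick` is what lets the same strategy code run against historical data.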
- 11. First store the data...
• Tick data is the raw price feed from the exchange
• 8=FIXT.1.1^A9=0^A35=X^A49=CME^A34=2127825^A52=20100120150049656^A
1128=8^A268=1^A279=0^A269=0^A48=109291^A22=8^A270=115060^A271=1^A
273=150049000^A336=2^A346=1^A83=27750^A1023=2^A75=20081117^A10=000^A
• This needs to be stored for legal reasons as you have to be
able to demonstrate “best execution” for clients
• We’ve seen the volumes earlier, we don’t need everything on the exchange but
certainly everything we’re trading
• Typically these are tens of gigabytes per day per exchange
• An interesting solution is to ship it up to EC2 and store it
on EBS and S3
- 12. Shipping to Amazon
• There are two mechanisms to get data up onto Amazon’s
EC2, streaming and batch
• Batch is more efficient, we compress the data hourly and
“scp” it up to an EC2 box
-rw-r--r-- archive onix 12710903 7 Dec 2011 OrderBooksRepository_Channel_7_summary_20111125-14_00_54_154.gz
-rw-r--r-- archive onix 19452739 7 Dec 2011 OrderBooksRepository_Channel_7_summary_20111125-15_00_54_785.gz
-rw-r--r-- archive onix 27549005 7 Dec 2011 OrderBooksRepository_Channel_7_summary_20111125-16_00_55_417.gz
• Streaming adds the interesting advantage of having near-
realtime data on EC2
• Not to mention an EC2 box doing little more than writing to disk
• So we started to add monitoring processes to the data
• Triggers, statistics, filters, aggregators etc.
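The batch path described above (compress an hour of data, then copy it up) might be sketched like this. Only the compression step is shown; the file naming scheme and the `archive_hour` and `read_archive` helpers are assumptions for illustration, and the upload itself would be a separate step such as the `scp` mentioned on the slide.

```python
import gzip
import tempfile
from pathlib import Path

def archive_hour(lines, out_dir, channel, stamp):
    """Write one hour of tick lines to a gzip file and return its path."""
    path = Path(out_dir) / f"channel_{channel}_{stamp}.gz"
    with gzip.open(path, "wt", encoding="ascii") as out:
        for line in lines:
            out.write(line + "\n")
    return path

def read_archive(path):
    """Load an archived hour back into a list of tick lines."""
    with gzip.open(path, "rt", encoding="ascii") as src:
        return [line.rstrip("\n") for line in src]

# Round trip in a temporary directory, using made-up tick lines.
ticks = ["35=X|49=CME|270=115060", "35=X|49=CME|270=115065"]
with tempfile.TemporaryDirectory() as tmp:
    archive_path = archive_hour(ticks, tmp, 7, "20111125-14")
    archive_name = archive_path.name
    restored = read_archive(archive_path)
```

Writing one file per channel per hour keeps each upload small enough to retry cheaply if the transfer fails.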
- 13. Terabytes to Petabytes of data
• Tens of gigabytes per channel, per exchange per day
• Several terabytes per channel per exchange per year
• 20 plus exchanges (28 in one example) we’re into petabytes per year
• Fortunately it compresses well (10:1) so we can “archive”
it as tar/gz and load it on-demand into our applications
• Recent data is loaded onto EBS, “old” data onto S3
• Data is mounted from EBS drives on demand
• We now have a few interesting possibilities...
• We can feed the data back for back-testing on-site
• We can sort/filter/analyse it in the cloud
• We can run the back-testing in the cloud
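The volume figures above can be sanity-checked with some back-of-the-envelope arithmetic. Only the 28 exchanges and the 10:1 compression ratio come from the slides; the other inputs are assumed values chosen to match "tens of gigabytes" and a typical trading year, and decimal units (1 TB = 1,000 GB) are used throughout.

```python
GB_PER_CHANNEL_PER_DAY = 20   # assumption: "tens of gigabytes"
TRADING_DAYS_PER_YEAR = 250   # assumption: roughly one trading year
EXCHANGES = 28                # from the slide's example
CHANNELS_PER_EXCHANGE = 8     # assumption: not stated on the slide
COMPRESSION_RATIO = 10        # 10:1, from the slide

# Per channel, per exchange, per year.
gb_per_channel_year = GB_PER_CHANNEL_PER_DAY * TRADING_DAYS_PER_YEAR
tb_per_channel_year = gb_per_channel_year / 1_000

# Across all exchanges and channels, raw and compressed.
total_gb_year = gb_per_channel_year * EXCHANGES * CHANNELS_PER_EXCHANGE
pb_total_year = total_gb_year / 1_000_000
tb_compressed_year = total_gb_year / COMPRESSION_RATIO / 1_000

print(f"{tb_per_channel_year} TB per channel per year")
print(f"{pb_total_year} PB raw per year, {tb_compressed_year} TB compressed")
```

Under these assumptions a single channel produces several terabytes a year, and the full set crosses into petabytes before compression brings it back to roughly a hundred terabytes.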
- 14. Introducing the Technologies
• C24 - Integration Objects
• Basically a Java-Binding tool with built-in messaging standards for financial services
• For example Fix, FpML, ISO-20022, SEPA, SWIFT etc. as Java APIs and self-contained
objects that can self-validate
• GigaSpaces
• The best implementation of Sun’s (now Oracle’s) Jini/JavaSpaces
• Powerful distributed implementation of the Master/Worker pattern
• C24 + GigaSpaces
• The ability to onboard / work with financial services messages (as above) directly in
GigaSpaces with minimal work
• Take huge amounts of data, parse it (with C24-iO) and insert it into GigaSpaces
• The “workers” can “take” data from the “space”, execute the task and “write” back
to the “space”
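The take / execute / write cycle described above can be illustrated with a small in-process sketch. This is not the GigaSpaces or JavaSpaces API: two standard-library queues stand in for the shared "space", and the task format (trade id, price, quantity) is made up for the example.

```python
import queue
import threading

def worker(tasks: queue.Queue, results: queue.Queue) -> None:
    """Take a task from the shared space, execute it, write the result back."""
    while True:
        task = tasks.get()                       # "take" from the space
        if task is None:                         # sentinel: no more work
            break
        trade_id, price, quantity = task
        results.put((trade_id, price * quantity))  # execute, then "write"

def run_master(trades, n_workers=4):
    """Master: write tasks into the space, then collect the workers' results."""
    tasks, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(n_workers)]
    for thread in threads:
        thread.start()
    for trade in trades:
        tasks.put(trade)
    for _ in threads:
        tasks.put(None)                          # one sentinel per worker
    for thread in threads:
        thread.join()
    return dict(results.get() for _ in range(len(trades)))

notionals = run_master([(1, 10.0, 3), (2, 2.5, 4)])
```

The master never addresses a specific worker, so adding capacity is just a matter of starting more workers against the same space.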
- 15. Fix 5.0 PostTrade - Trade Capture Report...
- 16. A few clicks and a little code
• We click on the message library we need and deploy the
code
• The result can now be used to insert Fix messages (for
example) into GigaSpaces
• We can write using a very simple Java API (including an
ESB such as Mule) or Spring
• A very simple Master/Worker pattern can be deployed into
GigaSpaces to process/filter/enrich/sort the messages
- 17. Spring Integration
• Spring makes it all look very easy...
<bean id="FixSourceFactory" class="biz.c24.io.spring.source.FixSourceFactory">
    <property name="encoding" value="ASCII"/>
</bean>

<c24:model id="inputFix" base-element="biz.c24.io.fix50sp2.TradeCaptureReportElement" />

<file:inbound-channel-adapter
    id="filesIn"
    directory="file:/Users/jdavies/dev/Spring_C24/spring-integration-samples/input"
    filename-pattern="*.fix">
    <int:poller id="poller" fixed-delay="0"/>
</file:inbound-channel-adapter>

<int-c24:unmarshalling-transformer
    source-factory-ref="FixSourceFactory"
    model-ref="inputFix"
    input-channel="filesIn"
    output-channel="Fix-Space" />
- 18. GigaSpaces does the rest
• We now have a space full of Java Objects that represent
the Fix messages
• We can now use generic workers or Map/Reduce to sort/
process the messages
• The same architecture works on your laptop, a server, a
4,000 CPU grid in the cloud
• The latter is obviously a lot faster :-)
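As a rough picture of the Map/Reduce style of processing mentioned above, the sketch below totals traded quantity per symbol. It runs in a single process and the message fields (`symbol`, `quantity`) are hypothetical; on a grid the map step would be spread across many nodes, which is not shown here.

```python
from collections import defaultdict

def map_trade(trade: dict) -> tuple[str, int]:
    """Map step: reduce a parsed message to a (key, value) pair."""
    return trade["symbol"], trade["quantity"]

def reduce_by_key(pairs) -> dict[str, int]:
    """Reduce step: sum the values for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

# Made-up messages standing in for objects held in the space.
trades = [
    {"symbol": "MSFT", "quantity": 100},
    {"symbol": "AAPL", "quantity": 50},
    {"symbol": "MSFT", "quantity": 200},
]
totals = reduce_by_key(map(map_trade, trades))
```

Because the map step looks at one message at a time, it can be partitioned freely, which is what lets the same logic move from a laptop to a large grid.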
- 19. Some examples
• Front Office
• Huge usage of GigaSpaces but latency tends to prohibit cloud; however, backtesting
and post-trade processing are increasingly looking towards the cloud
• Middle Office
• Matching and reconciliation, “what if?” calculations, anti-fraud
• Data tends to need to be tokenised to conform to PCI regulations
• Prime Brokerage
• Large CSV files arriving via FTP
• Parsing, validation, enrichment, transformation & reconciliation etc.
• Payments
• Loyalties & offer calculations
- 20. Thank you
[Logo] + [Logo] = Financial Services in the cloud