An application designer usually has to choose where to trade flexibility for specificity (and thus, usually, performance); knowing when and where to do so is an art and requires experience. This talk will share over a decade's worth of experience making these decisions and the lessons learned developing Pivotal's successful Real Time Intelligence (RTI) product using the latest versions of Spring projects: Integration, Data, Boot, MVC/REST and XD. A walk through the RTI architecture will provide the base for an explanation of how Spring performs at hundreds of thousands (and millions) of events/operations per second, and of the techniques that you can use right now in your own Spring applications to minimise resource utilisation and gain performance.
2. Bio
Stuart Williams
• Software Engineer at Pivotal
– RTI project lead
@pidster
3. What is this all about?
• We built a product using Pivotal products
• Learned some lessons
• We found a few limitations & some room for improvement…
• … but we addressed them & now things go faster.
A lot faster.
4. Dogfood
• Built with Spring IO Platform
• Boot, Data, Integration, Reactor, AMQP, SpEL, Shell (and a little Groovy)
• GemFire
• RabbitMQ
(Diagram: Spring Boot atop Spring Framework, Spring Data, Spring Integration, Reactor, Spring AMQP, Spring HATEOAS and Groovy)
5. Questions for you
• Heard of Spring Integration?
• Tried it?
• In production?
• Heard of Reactor?
• Tried it?
• In production?
7. What is RTI?
• RTI == ‘Real Time Intelligence’
• Stream processing application
• Location based services
• Analytics (e.g. network health)
• Telecom network data
• ‘Control plane’ is metadata
• ‘User plane’ is actual data (30x more)
• Rich data model
8. Input Data Rates
RTI*
• 100k/s average
• 120k/s daily peak
• 1M/s annual peak
*Control-plane only, user-plane is 20x
10. Input Data Rates
RTI*
• 100k/s baseline
• >120k/s daily peak
• >1M/s annual peak
*Control-plane only, user-plane is 20x
Twitter**
• 6k/s average
• 9k/s daily peak
• 30k/s large events
**Source @catehstn
twitter.com/catehstn/status/494918021358813184
11. Load Characteristics
• Low numbers of inbound connections
• High rates, micro-bursts
• Occasional peaks of nearly 2x, rare peaks of 10x
• Variable payload size (200B – 300KB)
• Internal fan-outs multiply event rates
12. More statistics…
• 100k/s order of magnitude
• 8,640,000,000 (per day)
• An Integer-based counter will ‘roll over’ in a matter of hours
• 400Mbps of raw data (‘control plane’)
• 10Gbps NICs required to support traffic peaks
• Logging! Any verbose errors can blow a disk away
• Queues backing up == #fail
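The arithmetic behind the first bullets, as a runnable sketch: at 100k/s a day’s worth of events overflows a signed 32-bit int, which is why counters must be `long`.

```java
public class CounterMath {
    public static void main(String[] args) {
        long perSecond = 100_000L;            // the 100k/s order of magnitude above
        long perDay = perSecond * 86_400;     // 86,400 seconds per day
        System.out.println(perDay);           // 8640000000

        // A day's worth of events no longer fits in a signed 32-bit int,
        // so per-event counters must be long (or LongAdder under contention).
        System.out.println(perDay > Integer.MAX_VALUE);                 // true
        System.out.println(Integer.MAX_VALUE + 1 == Integer.MIN_VALUE); // true: int overflow wraps
    }
}
```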
23. Solution #1 – ‘Naïve’ proof of concept
• Build codecs
• More on this in John Davies’ “Big Data In Memory” talk later today…
• Spring Integration (SI) pipeline
• TCP Inbound Adapter
• Transformer
• Filters
• Outbound adapter
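A sketch of what such a naïve pipeline looks like in Spring Integration XML (channel, bean and port names here are illustrative, not from the talk):

```xml
<int-ip:tcp-connection-factory id="serverFactory" type="server" port="9999"/>

<int-ip:tcp-inbound-channel-adapter channel="raw"
    connection-factory="serverFactory"/>

<int:channel id="raw"/>
<int:channel id="decoded"/>
<int:channel id="out"/>

<int:transformer input-channel="raw" output-channel="decoded"
    ref="eventDecoder" method="decode"/>

<int:filter input-channel="decoded" output-channel="out"
    expression="payload.valid"/>
```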
28. Solution #2 – Use interfaces
• Use the IdGenerator interface
• Use specific endpoint interfaces
• … we’ll come back to SpEL …
• Use a Chain
• Use an Aggregator to build a batch
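As a sketch, the same stages collapsed into a `<int:chain>` with an aggregator building batches (names and the batch size are illustrative; `correlation-strategy-expression` and `release-strategy-expression` are the standard aggregator attributes):

```xml
<int:chain input-channel="raw" output-channel="batched">
    <int:transformer ref="eventDecoder" method="decode"/>
    <int:filter expression="payload.valid"/>
    <int:aggregator correlation-strategy-expression="'batch'"
                    release-strategy-expression="size() == 100"
                    expire-groups-upon-completion="true"/>
</int:chain>
```

A chain avoids a channel (and a dispatch) between each pair of endpoints, which is part of why it helped on the next slide.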
29. Solution #2 results
• IdGenerator helped a lot
• Specific interfaces not recognised!
• Using <int:chain> helped
• Aggregator helped, but is too slow
• <int-ip:tcp-inbound-channel-adapter> is too slow
38. Spring Integration
• UUID generator
• MessageBuilderFactory & MutableMessage
• Dispatcher optimisation
• SpEL parser caching
• Counters are ‘long’
• Interfaces used directly – if you’re specific, SI respects that
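On the UUID generator bullet: Spring’s `org.springframework.util.IdGenerator` has a single method, `UUID generateId()`, and the default message-id generation draws on `SecureRandom`, which is comparatively expensive. A minimal sketch of a faster, non-cryptographic generator follows; the interface is redeclared locally so the snippet compiles without Spring on the classpath, and `FastIdGenerator` is an illustrative name, not the talk’s implementation.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

// Mirrors the single-method shape of org.springframework.util.IdGenerator,
// redeclared here so the sketch compiles without Spring on the classpath.
interface IdGenerator {
    UUID generateId();
}

// Time-seeded, counter-based ids: unique within one JVM and far cheaper than
// UUID.randomUUID(), which draws from SecureRandom on every call.
class FastIdGenerator implements IdGenerator {
    private final long mostSigBits = System.currentTimeMillis();
    private final AtomicLong counter = new AtomicLong();

    @Override
    public UUID generateId() {
        return new UUID(mostSigBits, counter.incrementAndGet());
    }
}
```

Declaring a bean of type `IdGenerator` lets Spring Integration pick it up for message-header ids; the trade-off is that the ids are no longer globally unique across JVMs or restarts.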
39. Spring Expression Language
• Compilation of expressions
• Configuration options
• SI just re-evaluates expressions
• Trade-offs & limitations
• Much, much faster
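Since Spring Framework 4.1, SpEL expressions can be compiled to bytecode rather than interpreted on every evaluation. A minimal configuration sketch (requires `spring-expression` on the classpath; the expression string is illustrative):

```java
import org.springframework.expression.Expression;
import org.springframework.expression.spel.SpelCompilerMode;
import org.springframework.expression.spel.SpelParserConfiguration;
import org.springframework.expression.spel.standard.SpelExpressionParser;

// IMMEDIATE compiles an expression to bytecode as soon as possible, typically
// after the first interpreted evaluation; MIXED falls back to interpretation
// when compilation assumptions break.
SpelParserConfiguration config =
        new SpelParserConfiguration(SpelCompilerMode.IMMEDIATE,
                                    ClassLoader.getSystemClassLoader());

SpelExpressionParser parser = new SpelExpressionParser(config);
Expression expr = parser.parseExpression("payload.valid"); // parse once, reuse per message
```

The trade-off hinted at above: not every expression can be compiled, and compiled expressions assume the types seen on first evaluation.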
43. Summary
• Spring Integration is much faster
• Good performance means better resource utilisation
• For cloud applications cost savings can be dramatic
• Batching payloads makes a big difference
• Many applications wait on network IO
• Trade-off risk of data loss for performance
• Reactor FTW
45. Learn More. Stay Connected
Tweet #rti #s2gx if you’d like to go faster
@pidster
“Big Data in Memory”
John Davies – Trinity 1-2, 4.30pm
@springcentral | spring.io/video