3 Ways to Deliver an Elastic, Cost-Effective Cloud Architecture
1. 3 Ways to Deliver an Elastic, Cost-Effective Cloud Architecture
Mark Teehan
Snr. Solution Engineer
Confluent (Singapore)
2. A little bit about us
Mark Teehan, Sr. Solutions Engineer, Confluent
Rahul Natarajan, P/Cloud Platform Engineer, Accenture
Guru Sattanathan, Solutions Engineer, Confluent
3. Main themes for today
• Scale quickly: Faster
• Deploy anywhere: Better
• Run efficiently: Cheaper
4. Overview of the streaming pipeline
Fast Data, Slow Data
Pipeline stages: Sources, Agents, Streams, Processing / Transform, Storage & Presentation
Use Cases: Streaming ETL, Aggregation, CDC, Data Sync, Logs
5. Start at the end and work backwards
6. Finding the value sweet spot
Find an initial use case that is valuable, but not absolutely critical
Avoid low value high risk use cases at first - you want to understand the tech and the way you’ll use it first
Nothing exists in the high value low criticality zone, so don’t bother looking too hard there
Be aware that the opportunity zone borders both the fantasy and danger zones!
Chart axes: Value, Criticality. Regions: Fantasy Zone, Danger Zone, Opportunity Zone
7. Follow a single thread all the way through
Information flows are rarely as simple as you think they are
Take the time to trace one flow end to end
You can replace just part of a flow at first
You can also branch a flow
8. A chain is only as strong as its weakest link
Your data pipeline is a chain
9. Kafka can be your elastic buffer
Plan for resiliency or elasticity in your
pipeline
But don’t over-optimize for it
Everyone thinks they’re gonna get millions of
messages per second
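The buffering idea on this slide can be sketched with a toy simulation. This is only an illustration, not a Kafka client: `simulate_backlog`, the arrival numbers, and the consumer rate are all made up for the example. It shows a burst of writes piling up in the buffer and then draining, without the consumer ever having to be sized for the peak.

```python
def simulate_backlog(arrivals, consumer_rate):
    """Return the number of unconsumed messages left after each tick.

    arrivals: messages produced in each tick (hypothetical numbers).
    consumer_rate: the most messages the consumer can process per tick.
    """
    backlog = 0
    history = []
    for produced in arrivals:
        backlog += produced                   # producers append to the buffer
        backlog -= min(backlog, consumer_rate)  # consumer drains at its own pace
        history.append(backlog)
    return history


# A steady 100 msgs/tick with a short burst of 500 msgs/tick.
arrivals = [100, 100, 500, 500, 100, 100, 100, 100]
history = simulate_backlog(arrivals, consumer_rate=250)
print(history)
```

The consumer here is provisioned for 250 messages per tick, half the peak rate, and the backlog still returns to zero once the burst has passed.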
11. A very brief guide to Kafka
A cluster is… well, a cluster (N brokers)
Brokers are servers that make up a cluster
(they do the actual work)
Topics are logical constructs that consist of
1+ partitions
Partitions live on brokers
As data is written to a topic it is appended to one
partition
Normally partitions have a leader and several
followers who keep copies of data for resiliency
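To make the topic and partition vocabulary concrete, here is a minimal in-memory sketch. It is not Kafka code: the `Topic` class and its methods are hypothetical, replication (leaders and followers) is left out, and `zlib.crc32` is only a stand-in for the hash a real client would use to pick a partition.

```python
import zlib


class Topic:
    """Toy model of a topic: a named set of append-only partition logs."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Assumption: crc32 stands in for the real client's key hash.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def append(self, key, value):
        """Append a record to exactly one partition; return (partition, offset)."""
        partition = self.partition_for(key)
        log = self.partitions[partition]
        log.append((key, value))
        return partition, len(log) - 1


topic = Topic("orders", num_partitions=3)
first = topic.append("customer-42", "created")
second = topic.append("customer-42", "paid")
print(first, second)
```

Records with the same key land in the same partition, and each one gets the next offset in that partition's log.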
12. Our cloud provides aggregated metrics in the UI and via a REST API so you can see what is happening live
14. Avoiding architectural bottlenecks
Understanding the nature of Apache Kafka
Partitions
• The unit of work in Kafka
• The unit of scale and organization for your
data
Ordering
• Order only exists within partitions
• Providing global order is generally not
worth it
Data placement (proximity)
• Making your stream (Kafka) fast is only
part of the story
• You need to ensure that your processing
can / will be fast
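The ordering point can be illustrated with a small sketch. Everything here is hypothetical: the two-partition layout, the toy `partition_for` hash, and the sequence numbers exist only to show that a consumer sees each key's events in order even though the combined stream is not globally ordered.

```python
from collections import defaultdict

NUM_PARTITIONS = 2


def partition_for(key):
    # Toy deterministic hash, for illustration only.
    return sum(key.encode("utf-8")) % NUM_PARTITIONS


# Events are produced in global sequence order 1, 2, 3, 4.
partitions = [[] for _ in range(NUM_PARTITIONS)]
events = [("a", 1), ("b", 2), ("a", 3), ("b", 4)]
for key, seq in events:
    partitions[partition_for(key)].append((key, seq))

# One possible read order: drain partition 0, then partition 1.
consumed = [record for log in partitions for record in log]

global_seqs = [seq for _, seq in consumed]
per_key = defaultdict(list)
for key, seq in consumed:
    per_key[key].append(seq)

print(global_seqs)    # not in the order the events were produced
print(dict(per_key))  # each key's events are still in order
```

If a use case only needs order per entity (per customer, per device), choosing that entity as the key is usually enough, and the topic can still be spread over many partitions.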
15. Dataflow
A graphical tool to help you
quickly locate issues across
your streaming applications
End to end visibility helps
you find where and when
things break. Visibility into:
● Producers
● Topics
● Partitions
● Consumers & Consumer
Groups
17. Elastic Scaling in SaaS: Confluent Cloud
Confluent Cloud
Basic, Standard [0-100Mbps]: Milliseconds
• Do nothing
Dedicated [Mbps - Gbps]: Minutes
• 1 click: select CKU from the drop-down in the cluster management UI and click Apply Changes
Other Kafka Services: Days - Weeks
• Determine how much capacity is needed
• Procure capacity*
• Configure new brokers
a. Disks b. OS c. Network d. Kafka (application)
• Identify partitions on specific brokers to rebalance & topics they are part of
• For each topic: migrate partitions
a. Increase ISR +1 b. Wait for new replica to sync c. Failover master d. Reduce ISR -1 e. Delete old replica
*Even in public clouds, provider quotas for VMs, disks, and security groups can be encountered, causing delays. Confluent has these limits raised already.
18. Start small and grow
Incremental investment provides lower risk
• You will learn as you go
• It’s easier to adjust when the scope is smaller
Iterate quickly
• Try many lightweight approaches
• Pick the one that works best
Do not be distracted by the long term
• You may not be right the first time
• Don’t risk it all on one big bet
20. Recommended limits for Partitions
A single partition has a finite limit
We recommend
• No more than 5 MBps per partition ingress (write)
• No more than 15 MBps per partition egress (read)
We list our limits and recommendations publicly:
https://docs.confluent.io/current/cloud/features/cluster-types.html#dedicated-cluster-ckus-and-limits
Please read them!
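The per-partition recommendations above turn into a quick sizing calculation. This is a rough sketch, not an official sizing tool: `min_partitions` is a hypothetical helper, and the throughput figures in the example are invented. It only applies the 5 MBps write and 15 MBps read figures from this slide.

```python
import math

# Recommended per-partition ceilings from the slide (MBps).
MAX_WRITE_MBPS_PER_PARTITION = 5
MAX_READ_MBPS_PER_PARTITION = 15


def min_partitions(write_mbps, read_mbps):
    """Smallest partition count that keeps each partition within both limits."""
    by_write = math.ceil(write_mbps / MAX_WRITE_MBPS_PER_PARTITION)
    by_read = math.ceil(read_mbps / MAX_READ_MBPS_PER_PARTITION)
    return max(1, by_write, by_read)


# Hypothetical topic: 40 MBps written, 90 MBps read across all consumers.
print(min_partitions(write_mbps=40, read_mbps=90))
```

In this example the write side is the constraint (40 / 5 = 8 partitions, versus 90 / 15 = 6 for reads). The result is a lower bound that assumes traffic is spread evenly across partitions, so a skewed key distribution would need more headroom.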
22. Confluent Cloud: consistent experience on any provider
Same Experience
• UX / CLI
• Tiers / cluster types
Same services
• Kafka
• ksqlDB
• Connect
• Schema Registry
Fully managed
• Pure SaaS, no infra to
manage
23. You can have clusters from many providers in the same account
25. Confluent Cloud and Confluent Platform
Everything we learn in Confluent Cloud goes into Confluent Platform so you can have the
same type of experience you get from our cloud anywhere you need.
We use this tech ourselves: Confluent Operator for K8s is how we multi-cloud
26. Key takeaways
Start small
Keep the entire flow in mind
Understand the end state: start at the end and work backwards
Know when and how to scale, but don’t over-provision early / unnecessarily
Run where you want to and where it makes the most sense
28. Hybrid streaming
Bridging OT and IT
• Safely bridge with one way flows
Linking remote or satellite locations
intelligently
• Upload aggregate data in
real-time, but full data in
anomaly conditions
• Reducing connectivity costs
Streaming data in real-time instead of
batching at night
• Reducing time to value
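The "aggregate in real time, full data on anomaly" pattern for remote sites can be sketched as a small edge-side filter. The function name, record shapes, and thresholds below are assumptions made for illustration; a real deployment would publish the returned records to a stream rather than return a list.

```python
from statistics import mean


def summarize_window(readings, low, high):
    """Decide what to upload for one non-empty window of sensor readings.

    Always uploads a compact aggregate. If any reading falls outside
    [low, high] (a hypothetical anomaly rule), the raw readings are
    uploaded as well.
    """
    summary = {
        "type": "aggregate",
        "count": len(readings),
        "mean": mean(readings),
        "min": min(readings),
        "max": max(readings),
    }
    if any(r < low or r > high for r in readings):
        return [summary] + [{"type": "raw", "value": r} for r in readings]
    return [summary]


normal = summarize_window([20.0, 21.0, 22.0], low=0.0, high=50.0)
anomaly = summarize_window([20.0, 80.0], low=0.0, high=50.0)
print(len(normal), len(anomaly))
```

In normal conditions the link carries one small record per window; the full detail only crosses the link when something looks wrong, which is where the connectivity saving comes from.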
29. Augmenting critical legacy systems
Enterprises have decades of software and processes in existing platforms
Frontending legacy systems is common already
Before, we used queues
Ever try to replay a queue?
Kafka & Confluent won’t replace your
mainframe or SAP, but they can help
you get more from them
Diagram labels: Connect, Upstream Data
30. Data distribution: information services
Lower cost data distribution
Replayable syndication streams
Reversing Push and Pull models
Deferring costs from service provider to service
consumer
Ex. Pricing information is an information service
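A replayable, pull-based stream can be sketched in a few lines. This is a toy model, not a Kafka API: `ReplayableStream`, `poll`, and `seek` are hypothetical names, and the price values are invented. The point is that the provider writes each record once, while every customer keeps its own position and can rewind it.

```python
class ReplayableStream:
    """Toy append-only log with an independent read offset per consumer."""

    def __init__(self):
        self._log = []
        self._offsets = {}

    def publish(self, record):
        # The provider writes once, regardless of how many customers read.
        self._log.append(record)

    def poll(self, consumer):
        """Return records the consumer has not seen yet and advance its offset."""
        start = self._offsets.get(consumer, 0)
        records = self._log[start:]
        self._offsets[consumer] = len(self._log)
        return records

    def seek(self, consumer, offset):
        """Move a consumer's position, for example back to 0 to replay."""
        self._offsets[consumer] = offset


prices = ReplayableStream()
prices.publish({"sku": "X", "price": 10})
prices.publish({"sku": "X", "price": 11})

a_first = prices.poll("customer-a")   # customer A reads both records
prices.publish({"sku": "X", "price": 12})
b_all = prices.poll("customer-b")     # customer B joins late, still gets all three

prices.seek("customer-a", 0)          # customer A replays from the start
a_replay = prices.poll("customer-a")
print(len(a_first), len(b_all), len(a_replay))
```

Because records stay in the log after being read, replay is just a change of offset on the consumer side, which is what a traditional queue makes difficult.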
Diagram labels: Customer A, Customer B, Customer C, Organizational Boundary