Rnotify
1. Rnotify
A Scalable Distributed Filesystem Notifications Solution for Applications
Ashwin Raghav
www.rnotifications.com
github.com/ashwinraghav/rnotify-c/
Tuesday, April 30, 13
2. Agenda
• Motivation
• Problem Statement / State of the art
• General Overview
• Hypothesis
• Approach
• Evaluation
• Conclusion
3. Motivation
• Applications need file system notifications
• Previously, applications polled file systems naively
• Now, all operating systems provide FS notification APIs
5. Problems / State of the art
• State of the art: use inotify to subscribe to local filesystems
• State of the art: use ad-hoc (polling) implementations for distributed filesystems
• Polling creates an unfortunate tension between resource consumption and timeliness
• Any general solution must be location transparent, scalable, and tunable
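The tension between resource consumption and timeliness can be seen in a minimal polling sketch. The `Poller` class below is hypothetical and not part of Rnotify; it assumes changes are detected by comparing `st_mtime_ns` values returned by `os.stat`.

```python
import os


class Poller:
    """Naive change detector: one stat call per watched path per check."""

    def __init__(self, paths):
        # Remember the last modification time seen for every path.
        self.last_seen = {p: os.stat(p).st_mtime_ns for p in paths}
        self.stat_calls = len(self.last_seen)

    def check(self):
        """Stat every path once and return those whose mtime changed."""
        changed = []
        for path in list(self.last_seen):
            mtime = os.stat(path).st_mtime_ns
            self.stat_calls += 1
            if mtime != self.last_seen[path]:
                self.last_seen[path] = mtime
                changed.append(path)
        return changed
```

Calling `check()` more often shortens the delay before a change is noticed, but the number of `stat` calls grows in proportion to the call rate and to the number of watched paths.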
6. Requirements
• Compatibility with existing applications that use inotify
• Horizontal scalability, decomposition of functionality, tunable performance
• Location transparency
• High notification throughput per client
8. Related Work
• FAM (File Alteration Monitor) does not scale
• Internet-scale systems like Thialfi and ZooKeeper are built for much larger numbers of clients
• Bayeux, Scribe, Siena, Hermes, Swag, etc. assume overlay networks to establish multicast trees for message dissemination
• inotify was introduced in Linux kernel 2.6.13 for local FS notifications
10. Hypothesis
As a result of clearly decomposing functionality into
replicable components, Rnotify can be tuned to fit different
notification workloads to consistently deliver notifications
at low latency.
11. Key Properties
• Low Latency Notifications (under 10ms)
• Compatible with applications that use Inotify
• Tuned to fit workloads
• Greedy Applications can use Rnotify by distributing their
workloads across nodes.
21. Representing State - Publisher
Operations: Get all Subscribers, Get all Notifications, Append new Notification

File Id | IP addresses of Subscribers
1       | 192.168.1.2:3000, 192.168.3.4:3001
2       | 192.168.1.2:3000, 192.168.3.4:3001

Subscriber       | Undelivered Notifications
192.168.1.2:3000 | N1, N2, N3
192.168.3.4:3001 | N4, N5, N6

File Id | Notifications
1       | N1, N2, N3
2       | N4, N5
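The three tables and three operations on this slide can be sketched as an in-memory structure. This is an illustration only: the class and method names are hypothetical, and the fan-out rule (each new notification is queued for every subscriber of that file) is an assumption the slide does not state.

```python
from collections import defaultdict


class PublisherState:
    """In-memory stand-in for the publisher's three tables."""

    def __init__(self):
        self.subscribers = defaultdict(list)    # file id -> ["ip:port", ...]
        self.notifications = defaultdict(list)  # file id -> [notification, ...]
        self.undelivered = defaultdict(list)    # "ip:port" -> [notification, ...]

    def subscribe(self, file_id, address):
        # Hypothetical helper so the sketch has a way to populate the table.
        if address not in self.subscribers[file_id]:
            self.subscribers[file_id].append(address)

    def get_all_subscribers(self, file_id):
        return list(self.subscribers.get(file_id, []))

    def append_notification(self, file_id, notification):
        self.notifications[file_id].append(notification)
        # Assumed fan-out: queue the notification for each subscriber.
        for address in self.get_all_subscribers(file_id):
            self.undelivered[address].append(notification)

    def get_all_notifications(self, address):
        """Drain and return the undelivered notifications of one subscriber."""
        return self.undelivered.pop(address, [])
```

Keeping undelivered notifications keyed by subscriber address means a subscriber's backlog can be fetched without scanning the per-file tables.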
26. Dispatcher Replication
• The dispatcher is given the registrar's location at startup
• It acquires the publisher list from the registrar transactionally
• It informs the proxies independently
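The startup steps above can be sketched roughly as follows. Everything here is hypothetical: the class names, the method names, and the use of a lock as a stand-in for the transaction are assumptions for illustration, not Rnotify's actual protocol.

```python
import threading


class Registrar:
    """Holds the publisher list (hypothetical in-process stand-in)."""

    def __init__(self, publishers):
        self._lock = threading.Lock()
        self._publishers = list(publishers)

    def get_publishers(self):
        # The lock stands in for the transaction: the caller gets a
        # consistent snapshot of the publisher list.
        with self._lock:
            return list(self._publishers)


class Proxy:
    """Records the dispatchers it has been told about."""

    def __init__(self):
        self.known_dispatchers = []

    def add_dispatcher(self, dispatcher_id):
        self.known_dispatchers.append(dispatcher_id)


class Dispatcher:
    def __init__(self, dispatcher_id, registrar, proxies):
        self.dispatcher_id = dispatcher_id
        # Step 1: the registrar's location is provided at startup.
        # Step 2: acquire the publisher list from the registrar.
        self.publishers = registrar.get_publishers()
        # Step 3: inform each proxy independently, so that one failing
        # proxy does not prevent the others from being informed.
        self.failed_proxies = []
        for proxy in proxies:
            try:
                proxy.add_dispatcher(dispatcher_id)
            except Exception:
                self.failed_proxies.append(proxy)
```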
27. Evaluation Strategy
• Mid-size GlusterFS deployment on EC2
• PostMark benchmark to represent FS activity
• Chef used to start up serviced clients
• Latency measured end to end
• 8xl machines with 32 cores each helped simulate several clients each
• All machines were acquired within a placement group
28. Evaluation - Scalability
Tune Dispatchers based on FS throughput
Tune Publishers based on number of clients
29. Scalability - Overactive FileSystems
PostMark threads writing to different directories
30. Scalability - Overactive FileSystems
PostMark threads writing to same directory
31. Scalability - Overactive FileSystems
PostMark threads writing to different files: applications like web/mail servers
PostMark threads writing to same files: HPC applications
35. Comparison to naive Polling
• Developed a poller (Node.js REST API)
• For just 100 clients and 5 files: 50,000 stats per second
• Has an extremely heavy footprint on FS performance
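The stat rate above can be related to a polling interval with simple arithmetic. The sketch assumes every client polls every file independently at the same fixed rate, which the slide does not state; the function names are hypothetical.

```python
def stats_per_second(clients, files, polls_per_second):
    """Total stat calls per second if each client polls each file."""
    return clients * files * polls_per_second


def implied_poll_interval_ms(clients, files, total_stats_per_second):
    """Polling interval per (client, file) pair implied by a total stat rate."""
    per_pair_rate = total_stats_per_second / (clients * files)
    return 1000.0 / per_pair_rate


# Under the stated assumption, 100 clients and 5 files at 50,000 stats/s
# correspond to 100 polls per second per pair, i.e. one poll every 10 ms.
print(implied_poll_interval_ms(100, 5, 50_000))
```

Under this model the stat load grows with the product of clients and files, so tightening the interval or adding clients raises the load on the file system proportionally.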
36. Greedy Applications
• Increasing the number of
notifications delivered
per client
• Linear increase in latency
• Messages spend more
time in queues
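A toy queueing model shows why latency can grow linearly with the number of notifications per client. It assumes a burst of notifications arrives at once and is delivered one at a time from a FIFO queue with a fixed per-notification service time; this is an illustration, not the measured Rnotify behaviour, and the function name is hypothetical.

```python
def mean_latency_ms(burst_size, service_time_ms):
    """Mean delivery latency for a burst drained from a single FIFO queue."""
    if burst_size <= 0:
        raise ValueError("burst_size must be positive")
    # The i-th notification (0-based) waits for the i ahead of it, then
    # takes one service time itself.
    completion_times = [(i + 1) * service_time_ms for i in range(burst_size)]
    return sum(completion_times) / burst_size


for n in (1, 3, 5):
    print(n, mean_latency_ms(n, 2.0))
```

In this model the mean latency is `service_time_ms * (burst_size + 1) / 2`, so each additional notification in the burst adds the same amount of queueing delay.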
38. Greedy Applications
If you need to consume more notifications, distribute yourself
Inefficient
Application
39. Summary - Why is this work different?
• FAM does not scale and is obsolete
• Existing PubSub systems do not cater to many notifications per client
• Multicast trees are established for reliability (performance suffers)
• PubSub systems provide a richer set of semantics with lower performance
40. Future Work
• Introduce a security model
• Introduce message ordering
• Provide message delivery reliability
41. Conclusion
• Rnotify is a solution for receiving notifications from POSIX-compliant distributed file systems
• Tunable to fit different notification workloads
• Incrementally scalable, location transparent, and mimics inotify
• We have tested Rnotify to scale to 2.5 million notifications per second
• Latency under 10 ms for 88% of notifications