5. Large Computing Needs
▪ Facebook, Google, ...
▪ more than any OS can provide
6. Happy Hardware Vendor Law
The number of nodes needed to solve a given task doubles every now and again.
7. OS Scalability Limit
▪ 1 node only
▪ multi-socket and stacks approaching NUMA
▪ E25K, z10, etc.: fail for most purposes
8. Operating System — ?
▪ traditional definition no longer relevant
▪ the notion itself on the brink of obsolescence
▪ field heavily eroded by current distributed apps
9. Distributed Applications
▪ forced to be an OS unto themselves
▪ huge overlap
▪ huge opportunity for sharing and consolidation
17. Machine Generated Data
▪ logs, error messages, status monitors
▪ meant for humans... no more
▪ rethinking for better aggregation and analysis
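
One way to picture output that is meant for machines rather than humans: each event is emitted as a single structured record, so aggregating across nodes becomes a plain count instead of text scraping. This is a minimal sketch, not anything shown in the talk; the field names (`node`, `component`, `severity`, `event`) and the helpers `emit` and `aggregate` are hypothetical.

```python
import json
from collections import Counter

def emit(node, component, severity, event, **fields):
    """Serialize one event as a single JSON line (hypothetical schema)."""
    record = {"node": node, "component": component,
              "severity": severity, "event": event, **fields}
    return json.dumps(record, sort_keys=True)

def aggregate(lines):
    """Count events per (severity, event) pair across all nodes."""
    counts = Counter()
    for line in lines:
        rec = json.loads(line)
        counts[(rec["severity"], rec["event"])] += 1
    return counts

# Illustrative records from three made-up sources.
lines = [
    emit("node-01", "disk", "error", "read_failure", device="sda"),
    emit("node-02", "disk", "error", "read_failure", device="sdb"),
    emit("node-02", "net", "warning", "link_flap", iface="eth0"),
]
summary = aggregate(lines)
```

Because every record carries the same keys, the same stream can feed both per-node dashboards and fleet-wide analysis without per-application parsers.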
18. Identity and Authentication
▪ YP, LDAP outdated and poorly supported
▪ no distributed model
▪ passwd in git as a first stab
19. Remote Procedure Call
▪ ssh losing relevance, HPN or not
▪ almighty agent daemon worse than rsh
▪ capabilities, RBAC, WoT
20. Hardware Failures
▪ no culture for low-level fault-tolerance
▪ watchdogd as state-of-the-art self-healing
▪ focus on self-diagnostics: disk error counters, etc.
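
A self-diagnostic based on disk error counters could be as simple as comparing per-device totals against a threshold. The sketch below assumes the counters have already been collected into a dictionary; the counter names (`read_errors`, `write_errors`), the threshold, and the function `failing_disks` are hypothetical and do not correspond to any particular tool.

```python
def failing_disks(counters, max_errors=10):
    """Return the devices whose combined error count exceeds max_errors."""
    return sorted(
        dev for dev, c in counters.items()
        if c["read_errors"] + c["write_errors"] > max_errors
    )

# Illustrative counters; real values would come from the node itself.
sample = {
    "sda": {"read_errors": 0, "write_errors": 1},
    "sdb": {"read_errors": 40, "write_errors": 3},
    "sdc": {"read_errors": 6, "write_errors": 5},
}
suspects = failing_disks(sample)
```

A node that can name its own suspect devices can take itself out of rotation instead of waiting for a watchdog reset.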
21. Distributed Configuration
▪ current anti-patterns worsen the problem
▪ role-aware configuration
▪ / in git as a second stab
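
Role-aware configuration can be sketched as a shared base layered with per-role overrides, so a node's settings follow from the roles it holds rather than from its hostname. This is an assumption about what the bullet means, not the speaker's design; the keys, role names, and `resolve_config` are hypothetical.

```python
def resolve_config(base, role_overrides, roles):
    """Layer role overrides on top of the base; later roles win."""
    config = dict(base)  # copy so the shared base is never mutated
    for role in roles:
        config.update(role_overrides.get(role, {}))
    return config

# Illustrative data only.
BASE = {"ntp_server": "ntp.example.internal", "open_ports": [22]}
ROLES = {
    "web": {"open_ports": [22, 80, 443]},
    "db": {"open_ports": [22, 5432], "swappiness": 1},
}

web_node = resolve_config(BASE, ROLES, ["web"])
db_node = resolve_config(BASE, ROLES, ["db"])
```

Keeping the base and the role table as plain data is also what makes them easy to version in a repository.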
22. Storage
▪ intra-node redundancy irrelevant
▪ no appropriate local multi-disk FS
▪ no fast path for data exchange
▪ nginx + curl + dispatcher
23. Error Handling
▪ cf. Machine Generated Data and Hardware Failures
▪ software is 10x more prone to failures
▪ serious problem at scale