This talk discusses best practices for scaling SaltStack from thousands to hundreds of thousands of minions. The devil is in the details: how do you scale without losing performance, while making sure it all still works? At LinkedIn we've learned some valuable lessons as we've grown our SaltStack footprint. We'll discuss how to run SaltStack, how not to run SaltStack, and how we've contributed to the Salt project to help make it better, stronger and faster.
YouTube: https://www.youtube.com/watch?v=qjFOY-QrW_k
0.8.9
Runners just added
Outputters just added
Cross calling salt modules using __salt__
0.9.9
Highstate test=True
External pillar
Minion swarm
Bad kwargs in “bar”
We can be sure of it, since (as part of the fix) I added regression tests
Fix for issue where master comm errors cause minions to delete all modules
Remove default 2h timeout of pillar fetches (stalls daemon)
Doing work on import means it will happen a *lot*
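A minimal sketch of the point about import-time work, assuming a hypothetical expensive lookup (`_expensive_lookup`) and a counter that exists only to make the effect visible. The idea is to defer the work to first use and cache it, so that repeated loading of the module costs nothing by itself.

```python
import functools

# Counter used only to show how often the expensive work actually runs.
CALLS = {"expensive": 0}


def _expensive_lookup():
    """Stand-in for slow work (network call, subprocess, large file parse)."""
    CALLS["expensive"] += 1
    return {"datacenter": "example-dc"}


# Anti-pattern: running the lookup at module level, e.g.
#     FACTS = _expensive_lookup()
# repeats the work every time the module is loaded.


@functools.lru_cache(maxsize=None)
def _facts():
    """Run the expensive lookup on first use and cache the result."""
    return _expensive_lookup()


def datacenter():
    """Public function: only pays the cost when it is actually called."""
    return _facts()["datacenter"]
```

With this shape, loading the module does no work at all, and calling `datacenter()` repeatedly triggers the lookup only once per process.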
A normal module, doing nothing except imports, and we still got this error
With all of these we need some way to sandbox modules from each other, and *more* importantly from breaking the core daemons
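One small piece of that isolation can be sketched as follows: load each module inside its own try/except so that a module that fails on import is recorded and skipped instead of taking the caller down. This is an illustrative sketch, not Salt's loader; `load_isolated` and its `sources` argument are hypothetical, and it only contains load-time exceptions (real sandboxing of CPU, memory or crashes would need separate processes).

```python
import types


def load_isolated(sources):
    """Load modules from a dict of {name: source text}.

    Returns (loaded, failed): loaded maps names to module objects,
    failed maps names to a description of the error raised on load.
    """
    loaded, failed = {}, {}
    for name, source in sources.items():
        module = types.ModuleType(name)
        try:
            # Each module executes in its own namespace.
            exec(compile(source, name, "exec"), module.__dict__)
        except Exception as exc:
            # A broken module is recorded and skipped; the caller keeps going.
            failed[name] = repr(exc)
            continue
        loaded[name] = module
    return loaded, failed
```

For example, passing one good module and one whose import fails yields the good module in `loaded` and the error for the other in `failed`, which also gives you data on what breaks most.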
More than just “is salt-master running”
To be proactive, we have to know what's broken (and what breaks most)
Sometimes performance is about managing resource usage more than going faster
Well, that’s not good… But I guess we can deal with that…
Woohoo! Now we have *one* sign-in per minion on start!
Callbacks become a nightmare, as anyone with JavaScript experience can tell you