1. Inoreader OpenNebula + StorPool migration
2. Introduction
• Introduction – Presenter and company intro. Who are we and what do we do?
• Inoreader – What is Inoreader and what challenges did we face while building and maintaining it?
• Infrastructure issues – We were facing numerous scalability issues while at the same time we had an array of servers doing nothing, mostly because of filled storage. At a certain point we hit a brick wall.
• Migration to OpenNebula and StorPool – To fix our scalability problems we pinpointed the need for a virtualization layer and distributed storage. After thorough research we ended up with OpenNebula and StorPool.
• Tips – Some useful takeaways for you.
• Q&A – If you have any questions I will gladly answer them.
3. Yordan Yordanov, CEO of Innologica
I have 10+ years of experience in the Telco IT sector, working with large enterprise solutions as well as building specialized solutions from scratch.
I founded Innologica in 2013 with the mission of developing next-gen OSS and BSS solutions. A side project called Inoreader was born back then, which quickly turned into a leading platform for content consumption and is now a core product of the company.
4.
5. Who Are We?
• Product company – We are not a sweatshop. We make successful products.
• International market – Our customers are all over the globe.
• Relaxed environment – We do not push the devs, but we cherish top performers.
• Smart team – The team is small, but each member brings great value.
6. Inoreader
RSS news aggregator and information hub
• 150,000 DAU – We have 150k daily active users (DAU) and more than 30k simultaneous sessions at peak times. Closing in on 1M registered users soon. 10k and counting premium subscribers.
• 15,000,000,000 articles in MySQL and ES – We keep the full archive in enormous MySQL databases and a separate Elasticsearch cluster just for searching. Around 20TB of data without the replicas. 10M+ new articles per day.
• 1,000,000 feed updates per hour – We need to update our 10+ million feeds in a timely manner. A lot of machines are dedicated to this task only.
• 40 VMs and 10 physical hosts – The platform is currently running on 40 virtual machines, mainly in our main DC. There are some physical hosts that were not good candidates for virtualization, mainly for Elasticsearch.
7. Extreme Makeover
The old and the new setup
• 100% virtualized – No more services running directly on bare metal.
• Lighter power footprint – 300% more capacity with 60% of the previous servers, with room for expansion.
• Performance gains – Huge compute and storage performance gains. Maintainability is a breeze too.
9. Hardware capacity
Our problem
We needed to constantly buy new servers just to keep up with the growing databases, because local storage was quickly being exhausted.
We were using expensive RAID cards and RAID-10 setups for all databases. Those servers never used more than 10% of their CPUs, so it was a complete waste of resources.
Utilization before the migration: CPU 10%, Memory 50%, Storage 90%, Rack space 100%
10. Hardware failures
Not so common, but always hair-pulling
Problem description
All components are bound to fail. Whenever we lost a server, there was always at least some service disruption, if not a whole outage.
All databases needed replication, which skyrocketed server costs and didn't provide automatic HA. If a hard drive fails in a RAID-10 setup you need to replace it ASAP. Bigger drives are more prone to cause errors while rebuilding.
Large databases on RAID-10 are slow to recover from crashes, so replicas should be carefully set up and should run on identical (expensive) hardware in case a replica needs to be promoted to master.
Nobody likes to go to a DC on Saturday to replace a failed drive, reinstall the OS and rotate replicas. We much prefer to ride bikes!
12. Project Timeline
• 2017 – PROJECT START: We knew for quite a while that we needed a solution to the growth problem.
• Nov 2017 – CHOOSING A SOLUTION: We held some meetings with vendors and researched different solutions.
• Nov 2017 – Jan 2018 – PLANNING AND FIRST TESTS: While the hardware was in transit we took our time to learn OpenNebula and test it as much as possible.
• Feb 2018 – EXECUTION: We migrated all servers through several iterations, which are described in more detail here.
• Mar 2018 – SUCCESS: We finally migrated our last server and all VMs were happily running on OpenNebula and StorPool.
13. Hardware
• StorPool nodes – We chose three standard SuperMicro SC836 3U servers.
• Switches – As recommended by StorPool, we chose Quanta LB8 for the 10G network and Quanta LB4M for the Gigabit network.
• Hypervisors – We reused our old servers, but upgraded their CPUs and memory.
• Others – 10G LAN cards and cables.
14. StorPool Nodes
StorPool recommends using commodity hardware. Supermicro offers a good platform without vendor-specific requirements for RAID cards, etc., and is very budget friendly.
Our setup:
• Supermicro CSE-836B chassis
• Supermicro X10SRL-F motherboard
• 1x Intel Xeon E5-1620 v4 CPU (8 threads @ 3.5GHz)
• 64GB DDR4-2666 RAM
• Avago 3108L RAID controller with 2G cache
• Intel X520-DA2 10G Ethernet card
• 8x 4TB HDD LFF SATA3 7200 RPM
• 8x 2TB HDD LFF SATA3 7200 RPM (reused from older servers)
Around 3300 EUR per server.
15. Gigabit Network – Quanta LB4M
We were struggling with some old TP-Link SG2424 switches that we wanted to upgrade, so we used the opportunity to upgrade the regular 1G network too. We chose the Quanta LB4M.
Key aspects
• 48x Gigabit RJ45 ports
• 2x 10G SFP+ ports
• Redundant power supplies
• Very cheap!
• EOL – you might want to stock up on spare switches!
• Stable (4 months without a single flop so far)
Around 250 EUR per switch from eBay.
16. 10G Network – Quanta LB8
Again on StorPool's recommendation, we procured three Quanta LB8 switches. They seem to be performing great so far.
Key aspects
• 48x 10G SFP+ ports
• Redundant power supplies
• Very cheap for what they offer!
• EOL – you might want to stock up on spare switches!
• Stable (4 months without a single flop so far)
700-1000 EUR per switch from eBay, including customs taxes.
17. Hypervisors
We have reused our old servers, but with some significant upgrades. We currently have 12 hypervisors with the following configuration:
• Supermicro 1U chassis with X9DRW motherboards
• 2x Intel Xeon E5-2650 v2 CPU (32 total threads)
• Dual power supply
• 128GB DDR3-12800R memory
• Intel X520-DA2 10G card
• 2x HDD in mdraid for the OS only
19. New Rack
We rented a new rack in our colocation center, since we didn't have any more space available in the old rack.
The idea was simple – deploy StorPool in the new rack only and gradually migrate hypervisors.
20. StorPool Nodes
The servers landed in our office in late January.
It was Friday afternoon, but we quickly installed them in the lab and
let the StorPool guys do their magic over the weekend.
21. Installation Day
The next Monday StorPool finished all tests and the equipment was
ready to be installed in our DC.
22. Installation Day
Fast forward several hours and we had our first StorPool cluster up and running. Still no hypervisors, though. StorPool needed to perform a full cluster check in the real environment to see if everything worked well.
23. First hypervisors
The very next day we installed our first hypervisors – the temporary
ones that were holding VMs installed during our test period. Those
VMs were still running on local storage and NFS.
The next step was to migrate them to StorPool.
24. VM Migration to StorPool
StorPool helps their customers with this step, but here's a summary of what we did.
01. Shut down the VM – Use Sunstone or the CLI to shut down the VM.
02. Create StorPool volumes – On the host, use the StorPool CLI to create volume(s) for the VM with the exact size of the original images.
03. Copy the volumes – Use dd for raw images or qemu-img convert for qcow2 images to copy them to the StorPool volumes.
04. Reattach images – Detach the local images and attach the StorPool ones. Mind the order. There's a catch with large images*.
05. Power up the VM – Check if the VM boots properly. We're not done yet…
06. Finalize the migration – To fully migrate persistent VMs, use the Recover -> delete-recreate function to redeploy all files to StorPool.
*Large images (100G+) take forever to detach on slow local storage, so we had to kill the cp process and use the onevm recover --success option to lie to OpenNebula that the detach actually completed. This is risky but saves a LOT of downtime.
After all VMs are migrated, you can delete the old system and image datastores and leave only the StorPool datastores.
At this point we are completely on StorPool!
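A rough shell sketch of steps 01-06 for a single VM follows. The VM ID, volume name, size and paths are made up, and the exact StorPool volume-creation syntax may differ between versions (StorPool can confirm the right commands), so treat it as an outline rather than a recipe:

    VMID=42                  # hypothetical VM ID
    VOL=one-vm42-disk0       # hypothetical StorPool volume name

    # 01. Shut down the VM
    onevm poweroff $VMID

    # 02. Create a StorPool volume sized exactly like the source image
    #     (indicative syntax only - check the StorPool CLI documentation)
    storpool volume $VOL create size 100G

    # 03. Copy the image onto the volume; StorPool volumes show up
    #     under /dev/storpool/. For raw images, dd works instead.
    qemu-img convert -O raw /var/lib/one/datastores/0/$VMID/disk.0 /dev/storpool/$VOL

    # 04. Detach the local image and attach the StorPool volume
    #     (Sunstone or onevm disk-detach / disk-attach), minding the order.

    # 05. Power the VM back up and check that it boots
    onevm resume $VMID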
25. Next hypervisors
From here on we had several iterations that consisted of roughly the following (continued on the next slide):
• Create a list of servers for migration. The more hypervisors we have, the more servers we can move in a single iteration.
• Create VMs and migrate the services there.
• Use the opportunity to untangle microservices running on the same machine.
• Make sure the servers are completely drained of any services.
• Shut down the servers and plan a visit to the DC the next day.
31. RINSE AND REPEAT
At each iteration we could move more servers at once, because we had more capacity for VMs.
32. Current capacity
We did it!
In the end we achieved a 3x capacity boost in terms of processing power and memory with just a fraction of our previous servers, because with virtualization we can distribute the resources however we'd like. In terms of storage we are on a completely different level, since we are no longer restricted to a single machine's capacity; we have 3x redundancy and all the performance we need.
Utilization after the migration: Allocated CPU 37%, Allocated Memory 32%, Storage 67%, Rack space 70%
33. Our Dashboard
A glimpse at our OpenNebula dashboard.
336 CPU cores and 1.2TB of RAM in just 12 hypervisors.
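The same headline numbers are also visible from the command line; onehost list is the stock OpenNebula CLI and prints the allocated CPU and memory per hypervisor:

    # lists each host with its running VMs, allocated CPU/MEM and status
    onehost list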
34. Hypervisor view
All hypervisors are nicely balanced by the default scheduler.
There's always enough room to move VMs around in case a hypervisor crashes or if we need to reboot a host.
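Draining a host for a reboot then boils down to live-migrating its VMs; with the stock OpenNebula CLI that is a one-liner per VM (the IDs below are made up):

    # live-migrate VM 42 to host 7 with no downtime
    onevm migrate --live 42 7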
36. Optimize CPU for homogeneous clusters
Available as a template setting since OpenNebula 5.4.6. Set it to host-passthrough.
This option presents the real CPU model to the VMs instead of the default QEMU CPU. It can substantially increase performance, especially if instructions like AES are needed.
Do not use it if you have different CPU models across the cluster, since it will cause the VMs to crash after live migration.
For older OpenNebula setups, set this as RAW DATA in the template:
<cpu mode="host-passthrough"/>
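For completeness, here is how that snippet would sit inside a VM template using OpenNebula's RAW attribute (standard template syntax, but verify against your version's documentation):

    RAW = [
      TYPE = "kvm",
      DATA = "<cpu mode='host-passthrough'/>"
    ]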
37. Beware of mkfs.xfs on large StorPool volumes inside VMs
We noticed that when doing mkfs.xfs on large StorPool volumes (e.g. 4TB) there was a big delay before the command completed. What's worse, during this time all VMs on the host starve for IO, because the storpool_block.bin process uses 100% CPU time.
The image shown on the left is for a 1TB volume.
The reason is that mkfs uses TRIM by default and the StorPool driver supports that.
To remedy it, use the -K option for mkfs.xfs or -E nodiscard for mkfs.ext4, e.g.:
• mkfs.xfs -K /dev/sdb1
• mkfs.ext4 -E nodiscard /dev/sdb1
38. Use the 10G network for OpenNebula too
This is probably an obvious one, but it deserves to be mentioned. By default your hosts will probably resolve each other via the regular Gigabit network. Forcing them to talk over the 10G storage network will drastically improve live VM migration. The migration is not IO bound, so it will completely saturate the network.
Usually this is a simple /etc/hosts modification. Consult with StorPool for your specific use case before doing it.
Live migrating a VM with 8G of RAM takes 7 seconds on 10G. The same VM takes about 1.5 minutes on a Gigabit network and will probably disturb VM communications if the network is saturated.
Live migration of highly loaded VMs can take significantly longer and should be monitored. In some cases it's enough to stop busy services for just a second for the migration to complete.
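A minimal sketch of the /etc/hosts idea, with made-up hostnames and a made-up 10G subnet; the point is simply that every hypervisor resolves its peers to their 10G addresses:

    # /etc/hosts on each hypervisor (hypothetical addresses)
    10.10.10.11  hv01
    10.10.10.12  hv02
    10.10.10.13  hv03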
39. Other tips
These are the more obvious ones that probably everyone uses in production, but they are still worth mentioning.
• Use cache=none, io=native when attaching volumes.
• Use virtio networking instead of the default rtl8139 NIC. The latter has performance issues and drops packets when host IO is high.
• Measure IO latency instead of IO load to judge saturation. We have several machines with constant 99% IO load which are doing perfectly fine.
/etc/one/vmm_exec/vmm_exec_kvm.conf:
…
DISK = [ driver = "raw", cache = "none", io = "native", discard = "unmap", bus = "scsi" ]
NIC = [ filter = "clean-traffic", model = "virtio" ]
…
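On the latency-vs-load point, iostat from the standard sysstat package is one simple way to look at it (our own monitoring differs; this is just an illustration):

    # await = average IO latency in ms; %util alone can be misleading,
    # since a device at 99% util that still answers in ~1ms is doing fine
    iostat -x 1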
41. Grafana Dashboards
We adapted the OpenNebula Dashboards with Graphite and Grafana scripts by Sebastian Mangelkramer and used them to build our own Grafana dashboards, so we can see at a glance which hypervisors are most loaded and how much overall capacity we have.
42. Grafana TV Dashboard
Why not have a master dashboard on the TV at the office? This gives our team a very quick and easy way to tell if everything is working smoothly.
If all you see is green, we're good.
This dashboard shows our main DC on the first row, our backup DC on the second, and then some other critical aspects of our system. It's still a WIP, hence the empty space.
At the top is our Geckoboard that we use for more business KPIs.
43. Server Power Usage in Grafana
Part of our virtualization project was to optimize the electricity bill by using fewer servers. We were able to easily measure our power usage with Graphite and Grafana.
If you are interested, the script for getting the data into Graphite is here:
https://gist.github.com/Jacketbg/6973efdb41a2ecfcf2a83ea84c086887
The Grafana dashboard can be found here:
https://gist.github.com/Jacketbg/7255b4f81ebb2de0e8a5708b4335c9d7
Obviously you will need to tweak it, especially the formula for the power bill.
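The gists hold the full scripts; as a flavor of how simple the Graphite side is, a reading can be pushed with the plaintext protocol (hypothetical metric name and host; 2003 is Graphite's default plaintext port):

    # format: "<metric.path> <value> <unix-timestamp>"
    echo "dc1.power.watts 420 $(date +%s)" | nc graphite.example.com 2003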
44. StorPool's Grafana
StorPool were nice enough to give us access to their own Grafana instance, where they collect a lot of internal data about the system and KPIs. It gives us great insights that we couldn't get otherwise, so we can plan and estimate the system load very well.
45. What's Left?
• SSD pool – We are currently only using an HDD pool, but we could benefit from a smaller SSD pool for picky MySQL databases.
• Add more hypervisors – As the service grows, our needs will too. We will probably have rack space for the years to come.
• Add more StorPool nodes – We have maxed out the HDD bays on our current nodes, so we'll probably need to add more nodes in the future.
• Upgrade StorPool nodes to 40G – Currently the nodes use 2x 10G ports like the hypervisors. After adding an SSD pool we are considering upgrading to 40G.
46. THANK YOU!
READ MORE ON BLOG.INOREADER.COM
GET THIS PRESENTATION FROM ino.to/one-sofia