Virtual Gerrit User Summit 2020 – On-line, GerritForge.com
About GerritForge
• Founded in the UK
• HQ in London, with a presence in Europe and the USA (GerritForge Inc.)
• Committed to open source and to Gerrit Code Review since 2009
Gerrit caches: the problem
Gerrit <= v2.16: ReviewDB + indexes
• Low latency for reading review data
• Ability to join across tables (changes, patch_sets, messages, …)
• DB auto-vacuuming
Gerrit caches: the problem
Gerrit >= v2.16: NoteDb + Lucene indexes
• Low latency for reading index data, but what about change details?
• Change data is all de-normalized on NoteDb: huge JSON payload
• No auto-vacuuming: frequent JGit GC needed
Gerrit caches: the problem
Gerrit Guava caches to the rescue
• Increase in number of Gerrit caches
• Increase in use of caches for NoteDb, where possible
• Migration to a Java Caffeine cache backend: much faster
Gerrit caches: the problem
What about Gerrit restarts?
• In-memory caches are empty
• High pressure on JGit and repository access at startup
• Higher costs in terms of NFS data utilization (e.g. AWS’s EFS)
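The cold-start effect can be illustrated with a minimal sketch. It is not Gerrit code: `CountingCache`, `loadsFor` and the "notes-for-" loader are hypothetical names, and the loader stands in for an expensive read through JGit. Each new cache instance models a restart: the map is empty again, so every key has to be reloaded from the repository.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class ColdCacheDemo {
  /** Hypothetical in-memory cache that counts how often the expensive loader runs. */
  static final class CountingCache {
    private final Map<String, String> entries = new ConcurrentHashMap<>();
    private final AtomicInteger loads = new AtomicInteger();
    private final Function<String, String> loader;

    CountingCache(Function<String, String> loader) {
      this.loader = loader;
    }

    String get(String key) {
      // The loader only runs on a cache miss.
      return entries.computeIfAbsent(key, k -> {
        loads.incrementAndGet();
        return loader.apply(k);
      });
    }

    int loads() {
      return loads.get();
    }
  }

  /** Reads every key twice and returns how many loader calls that caused. */
  static int loadsFor(CountingCache cache, String... keys) {
    for (int pass = 0; pass < 2; pass++) {
      for (String key : keys) {
        cache.get(key);
      }
    }
    return cache.loads();
  }

  public static void main(String[] args) {
    // Assumption: this lambda stands in for a slow repository read.
    Function<String, String> readFromRepo = id -> "notes-for-" + id;
    String[] changes = {"change-1", "change-2", "change-3"};

    CountingCache beforeRestart = new CountingCache(readFromRepo);
    System.out.println("loads before restart: " + loadsFor(beforeRestart, changes));

    // "Restart": a fresh instance starts empty, so all keys are loaded again.
    CountingCache afterRestart = new CountingCache(readFromRepo);
    System.out.println("loads after restart: " + loadsFor(afterRestart, changes));
  }
}
```

The second read of each key is served from memory, but nothing survives the switch to a new instance. That repeated loading is the startup pressure a persistent cache is meant to avoid.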
Gerrit caches: the problem
GerritHub.io experienced outages at some sites
commit 87c332644c0442c33227750e0d29381798577d55
Author: Luca Milanesio <luca.milanesio@gmail.com>
Date:   Sat Oct 26 20:03:06 2019 +0000

    Revert "Enable change_notes persistent cache"

    This reverts commit 83008efeaf9c987b3bfe2a8da23f21c73ccc620a.

    Reason for revert: Blocks the JVM during the huge H2 defragmentation

    Change-Id: Ib42a425dbc13bafc219e0625d7ce999f2e8afc2b
Gerrit caches: the problem
GerritHub.io: H2-based persistent GitHub caches
(https://bugs.chromium.org/p/gerrit/issues/detail?id=10276)
Not so easy to reproduce: the problem is not when GitHub denies access straight away but rather when it is
stuck for minutes, and that creates a series of deadlocks in the H2 cache.
GitHub groups are already cached in the github-plugin, and the cache is also persisted. However, H2 locking
isn't great: when there are multiple concurrent requests for different groups to be loaded from GitHub and the
APIs are stuck (because of whatever happens on their side), Gerrit gets stuck for a *very long time* because
the H2 table is huge.
H2 background: fragmentation + compaction
H2 is an RDBMS: vacuum & compaction
(http://www.h2database.com/html/features.html#compacting)
Empty space in the database file re-used automatically. When
closing the database, the database is automatically compacted for
up to 200 milliseconds by default.
The problem:
JVM threads accessing the cache are blocked for > 200 ms
H2 background: locking
H2 is an RDBMS: needs locking on reads/writes
(http://www.h2database.com/html/advanced.html)
If a connection wants to write to a table (update or delete a row), an
exclusive lock is required. To get the exclusive lock, other
connection must not have any locks on the object.
The problem:
Concurrent writes to the same cache block each other
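A rough model of this contention, assuming a single exclusive lock per table: `LockedTable`, `tryPut` and `slowPut` are hypothetical names, and a `ReentrantLock` stands in for H2's table lock. This is not H2's actual locking code. One writer holds the lock during a slow load, and a second writer for a different key cannot get through.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class TableLockDemo {
  /** Hypothetical stand-in for a cache table guarded by one exclusive lock. */
  static final class LockedTable {
    private final ReentrantLock tableLock = new ReentrantLock();
    private final Map<String, String> rows = new HashMap<>();

    /** Tries to write a row; gives up if the table lock is not free in time. */
    boolean tryPut(String key, String value, long timeoutMs) {
      try {
        if (!tableLock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
          return false;
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
      try {
        rows.put(key, value);
        return true;
      } finally {
        tableLock.unlock();
      }
    }

    /** Writes a row but keeps the lock until 'release' fires, like a slow load. */
    void slowPut(String key, String value, CountDownLatch started, CountDownLatch release) {
      tableLock.lock();
      try {
        started.countDown();
        release.await();
        rows.put(key, value);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        tableLock.unlock();
      }
    }
  }

  /** Returns whether a second writer succeeds while the first one is still busy. */
  static boolean secondWriterGetsThrough() {
    LockedTable table = new LockedTable();
    CountDownLatch started = new CountDownLatch(1);
    CountDownLatch release = new CountDownLatch(1);
    Thread slowWriter =
        new Thread(() -> table.slowPut("group-a", "members-a", started, release));
    slowWriter.start();
    try {
      started.await(); // the slow writer now owns the table lock
      return table.tryPut("group-b", "members-b", 50);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      release.countDown();
      try {
        slowWriter.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
  }

  public static void main(String[] args) {
    System.out.println("second writer got through: " + secondWriterGetsThrough());
  }
}
```

In this model the second write is rejected even though it targets a different key, because the lock covers the whole table rather than a single row.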
Gerrit solution: pluggable persistent cache
Gerrit v2.14 allows non-H2 implementations
(https://gerrit-review.googlesource.com/c/gerrit/+/176973)
Introduce CacheImpl annotation

There is existing mechanism to provide different implementation for
instance to secure-store but one can replace only certain binding
with it (through provider class). In this case when H2CacheImpl is
installed it adds more bindings and replacing only this class would
keep leftovers being still initiated/running in Gerrit core.

CacheImpl annotation contains 2 types:
  MEMORY
  PERSISTENT

It is applied to modules that provide corresponding default caches
implementations.

When CacheImpl annotation is added to lib module then it will
override particular default implementation.

Change-Id: I7562b210fad4c5f6dc67887f627cf76815a378cb
Signed-off-by: Jacek Centkowski <jcentkowski@collab.net>
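The override idea in the commit message can be sketched as follows. This is an illustration only, not Gerrit's source: the annotation shape, the three module classes and the `select` helper are assumptions made for the example. Only the names `CacheImpl`, `MEMORY` and `PERSISTENT` come from the commit message.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.List;

public class CacheImplSketch {
  /** Assumed shape of the annotation: it tags a module with the cache type it provides. */
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  @interface CacheImpl {
    Type type();

    enum Type {
      MEMORY,
      PERSISTENT
    }
  }

  // Hypothetical modules: two defaults and one replacement shipped as a lib module.
  @CacheImpl(type = CacheImpl.Type.MEMORY)
  static class DefaultMemoryModule {}

  @CacheImpl(type = CacheImpl.Type.PERSISTENT)
  static class DefaultPersistentModule {}

  @CacheImpl(type = CacheImpl.Type.PERSISTENT)
  static class CustomPersistentModule {}

  /** Picks a lib module annotated with the requested type, else the default module. */
  static Class<?> select(
      CacheImpl.Type type, Class<?> defaultModule, List<Class<?>> libModules) {
    for (Class<?> candidate : libModules) {
      CacheImpl impl = candidate.getAnnotation(CacheImpl.class);
      if (impl != null && impl.type() == type) {
        return candidate;
      }
    }
    return defaultModule;
  }

  public static void main(String[] args) {
    List<Class<?>> libModules = List.of(CustomPersistentModule.class);
    // The persistent default is overridden; the memory default is left alone.
    System.out.println(
        select(CacheImpl.Type.PERSISTENT, DefaultPersistentModule.class, libModules)
            .getSimpleName());
    System.out.println(
        select(CacheImpl.Type.MEMORY, DefaultMemoryModule.class, libModules)
            .getSimpleName());
  }
}
```

Selecting by annotation type replaces a whole module, which avoids the leftover bindings the commit message describes when only a single class is swapped out.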
Gerrit solution: pluggable persistent cache
Gerrit v3.3 allows using it in production
(https://gerrit-review.googlesource.com/c/gerrit/+/284195)
Owner: Marcin Czech
Use persistent cache provided by libModule for offline reindex

Gerrit offline reindexing ignores persistent cache implementation
provided as libModule and always generates caches using H2
implementation. This fix allows to use a different persistent cache
backend.

Using different cache implementation during the offline reindexing
and at runtime causes situation when Gerrit starts without any
precomputed caches which significantly impacts overall performance.
On the other hand offline reindexing cannot reuse existing caches
which impacts reindexing performance.

Bug: Issue 13464
Change-Id: I36305282e8ea583dfb37f629e41d219762c3b4a3
First NoSQL implementation: ChronicleMap
First production-ready libModule for Gerrit cache
(https://gerrit.googlesource.com/modules/cache-chroniclemap/)
• Based on Open-Source ChronicleMap high-speed persistent cache
(https://github.com/OpenHFT/Chronicle-Map)
• Designed for low latency
• Never blocks read or write operations
• Static file allocation: no need for vacuuming or compaction
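The "static file allocation" point can be illustrated with a toy example built only on the JDK. It does not use ChronicleMap and is far simpler than ChronicleMap's real design. `FixedSlotFileCache`, the 64-byte slot size and the slot-index API are all assumptions for the sketch. The file is sized once, values are overwritten in place through a memory mapping, and the data can be read back after reopening the file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Objects;

public class FixedSlotFileCache {
  // Assumption: each slot holds a 4-byte length prefix plus up to 60 bytes of value.
  private static final int SLOT_SIZE = 64;

  private final MappedByteBuffer buffer;
  private final int slots;

  FixedSlotFileCache(Path file, int slots) {
    this.slots = slots;
    MappedByteBuffer mapped;
    try {
      if (Files.notExists(file)) {
        // Allocate the whole file once, zero-filled; it never grows afterwards.
        Files.write(file, new byte[slots * SLOT_SIZE]);
      }
      try (FileChannel channel =
          FileChannel.open(file, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
        // The mapping stays valid after the channel is closed.
        mapped = channel.map(FileChannel.MapMode.READ_WRITE, 0, (long) slots * SLOT_SIZE);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    this.buffer = mapped;
  }

  /** Overwrites a slot in place; old data leaves no dead space behind. */
  void put(int slot, String value) {
    byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
    if (bytes.length > SLOT_SIZE - Integer.BYTES) {
      throw new IllegalArgumentException("value too large for slot");
    }
    int offset = offsetOf(slot);
    buffer.putInt(offset, bytes.length);
    for (int i = 0; i < bytes.length; i++) {
      buffer.put(offset + Integer.BYTES + i, bytes[i]);
    }
  }

  /** Returns the stored value, or null for an empty slot (length prefix 0). */
  String get(int slot) {
    int offset = offsetOf(slot);
    int length = buffer.getInt(offset);
    if (length == 0) {
      return null;
    }
    byte[] bytes = new byte[length];
    for (int i = 0; i < length; i++) {
      bytes[i] = buffer.get(offset + Integer.BYTES + i);
    }
    return new String(bytes, StandardCharsets.UTF_8);
  }

  /** Forces mapped changes out to the file. */
  void flush() {
    buffer.force();
  }

  private int offsetOf(int slot) {
    return Objects.checkIndex(slot, slots) * SLOT_SIZE;
  }

  /** Writes a value, reopens the file as a "restart" would, and reads it back. */
  static String roundTrip(String value) {
    try {
      Path file = Files.createTempDirectory("cache-sketch").resolve("cache.dat");
      FixedSlotFileCache first = new FixedSlotFileCache(file, 8);
      first.put(3, value);
      first.flush();
      FixedSlotFileCache reopened = new FixedSlotFileCache(file, 8);
      return reopened.get(3);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(roundTrip("change-notes-42"));
  }
}
```

Because every slot has a fixed position and size, an update never fragments the file, so there is nothing to vacuum or compact. The trade-off in this toy version is that the capacity and the maximum value size must be chosen up front.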
First NoSQL implementation: ChronicleMap
DEMO
Q&A: excited about the future of Gerrit?
Image from: http://cypp.rutgers.edu/ru-voting/political-information/public-opinion-polls/
Want to know more?
GerritForge.com/contact