We may know that our content is safely stored in the ZODB, but there's a lot more that the Zope Object Database can do for us. In this talk Carlos de la Guardia covers tips and tricks for tasks such as rescuing crashed databases, producing ad hoc reports on database objects, viewing the contents of the ZODB outside of Plone, using RelStorage, and more.
Link to the audio presentation: http://2011ploneconference.sched.org/event/885282df9807bdfec7fa2a16c1fb1ef9
2. Most important performance tip
Find the best size for the ZODB object cache.
How to calculate best size: take amount of
available memory and divide by one ;)
Corollary: Increase RAM as a first step when
you want better performance.
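A rough sketch of the arithmetic behind the joke, assuming the object cache is sized as a number of objects per connection. The function name and all three inputs (memory budget, average object size, number of connections) are hypothetical and have to be measured for your own site.

```python
def estimate_cache_size(available_bytes, avg_object_bytes, connections):
    """Return a per-connection object count that fits in available_bytes.

    All three inputs are assumptions you must measure for your own site.
    """
    if avg_object_bytes <= 0 or connections <= 0:
        raise ValueError("avg_object_bytes and connections must be positive")
    # Every connection keeps its own cache, so the budget is split between them.
    return available_bytes // (avg_object_bytes * connections)


# Hypothetical numbers: 2 GiB for caching, about 4 KiB per loaded object,
# 2 connections (threads) per instance.
objects_per_connection = estimate_cache_size(2 * 1024**3, 4 * 1024, 2)
print(objects_per_connection)
```

Treat the result as a starting point only, then watch real memory use and adjust.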
3. Looking inside the ZODB
collective.zodbbrowser is a package that has to
be installed inside Zope and provides access to
all objects and their attributes, including
callables and their source code.
Eye is an external tool that can be used to
browse the ZODB without having to install all
the products it uses.
You can always use the low-tech approach and
use the debug mode of an instance to look at
the values directly using Python.
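A minimal sketch of the low-tech approach. The helper describe() and the FakeDocument class are hypothetical; in a real debug session you would pass an object reached from the root (usually bound to app at the debug prompt) instead of the stand-in.

```python
def describe(obj, limit=20):
    """Return (name, type name, short repr) for each instance attribute."""
    rows = []
    for name, value in sorted(vars(obj).items())[:limit]:
        rows.append((name, type(value).__name__, repr(value)[:60]))
    return rows


class FakeDocument(object):
    """Stand-in for a content object; in debug mode pass a real one."""

    def __init__(self):
        self.title = "Front page"
        self.hits = 3


for row in describe(FakeDocument()):
    print(row)
```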
4. Oh My God, a POSKey error!
I feel your pain.
Unfortunately, getting into the details of how to
fix this would take a full talk.
All is not lost, but you'll need to fire up debug
mode and poke into the internals of your ZODB.
Before anything else: MAKE A BACKUP!
Some detailed information here:
http://plonechix.blogspot.com/2009/12/definitive-guide-to-poskeyerror.html
5. Getting rid of persistent utilities
Older products that you uninstall sometimes
can leave persistent utilities installed.
This will crash your site, because Zope will still try
to import code that is no longer installed.
There is a package that can help (but
remember, backup first!):
http://pypi.python.org/pypi/wildcard.fixpersistentutilities/
6. Recovering objects
Brute force way: truncate the database
The civilized way: use zc.beforestorage
%import zc.beforestorage
<before>
    before 2008-12-08T10:29:03
    <filestorage>
        path /zope/var/filestorage/Data.fs
    </filestorage>
</before>
7. Searching for transactions
import base64
from ZODB.TimeStamp import TimeStamp
from ZODB.FileStorage import FileStorage

# Open read-only so the scan cannot modify the database.
storage = FileStorage('/path/to/data.fs', read_only=True)
it = storage.iterator()
# Only report transactions committed after this moment (GMT).
earliest = TimeStamp(2010, 2, 26, 6, 0, 0)
for txn in it:
    tid = TimeStamp(txn.tid)
    if tid > earliest:
        print(txn.user, txn.description, tid.timeTime(), base64.b64encode(txn.tid))
        # Each record is one object written by this transaction.
        for rec in txn:
            print(rec.pos)
storage.close()
8. RelStorage
A storage implementation for ZODB that stores pickles in a
relational database.
It is a drop-in replacement for FileStorage and ZEO.
Designed for high volume sites: multiple ZODB instances
can share the same database. This is similar to ZEO, but
RelStorage does not require ZEO.
According to some tests, RelStorage handles high
concurrency better than the standard combination of ZEO
and FileStorage.
RelStorage starts quickly regardless of database size.
Supports undo, packing, and filesystem-based ZODB
blobs.
Capable of failover to replicated SQL databases.
9. Interesting packages
zodbshootout – benchmark ZEO vs RelStorage
with different backends
zodbupdate – update moved or renamed
classes
dm.historical – get history of objects in the
ZODB
dm.zodb.repair – restore lost objects from a
backup to a target database
zc.zodbactivitylog – provides an activity log that
lets you track database activity
10. Beginner tips for ZODB development
Do not use the root to store objects. It doesn't scale.
Learn about BTrees.
Avoid storing plain mutable objects (lists, dicts) as
attributes; use persistent sub-objects instead.
If your objects are bigger than 64k, you need to divide
them or use blobs.
Avoid conflicts: organize application threads and data
structures so that objects are unlikely to be modified
by multiple threads at the same time.
Use data structures that support conflict resolution.
To resolve conflicts, retry. The developer is in charge
of managing concurrency, not the database.
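The "retry" advice can be sketched as a small loop. This is only an illustration: FakeConflictError and run_with_retries are hypothetical names, and work() stands for whatever function begins, modifies and commits one transaction in your application.

```python
import time


class FakeConflictError(Exception):
    """Hypothetical stand-in for the database's conflict exception."""


def run_with_retries(work, conflict_exc, attempts=3, delay=0.0):
    """Call work() until it succeeds or the attempts run out.

    work() is expected to begin, modify and commit one transaction.
    """
    for attempt in range(1, attempts + 1):
        try:
            return work()
        except conflict_exc:
            if attempt == attempts:
                raise  # give up and let the caller see the conflict
            time.sleep(delay)


calls = []


def flaky_commit():
    """Simulated unit of work that conflicts on its first two tries."""
    calls.append(1)
    if len(calls) < 3:
        raise FakeConflictError("simulated write conflict")
    return "committed"


result = run_with_retries(flaky_commit, FakeConflictError)
print(result, len(calls))
```

With a real ZODB you would pass ZODB.POSException.ConflictError as conflict_exc and abort the failed transaction before trying again, so that the next attempt sees fresh object state.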
11. Tips From the Experts
I asked some of the old time Zope
developers for some simple tips for using
the ZODB. Here are their responses.
12. David Glick
“If you want instances of a class to have a new
attribute, add it as a class attribute so that existing
instances get a reasonable default.”
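A small sketch of why this works, using plain Python objects. The Article class and its subtitle attribute are made up for illustration; the "old" instance imitates an object whose stored state predates the new attribute, since loading from the database restores state without calling __init__.

```python
class Article(object):
    # New attribute added in a later release; the class-level value acts
    # as the default for instances stored before it existed.
    subtitle = ""

    def __init__(self, title, subtitle=""):
        self.title = title
        self.subtitle = subtitle


# Simulate an instance loaded from an old database record: its state is
# restored without calling __init__ and has no "subtitle" key.
old = Article.__new__(Article)
old.__dict__.update({"title": "Hello"})

print(repr(old.subtitle))           # falls back to the class attribute
print("subtitle" in old.__dict__)   # the instance itself was not changed
```

Keep such defaults immutable (strings, numbers, None, tuples); a mutable class-level default would be shared by every instance.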
13. Tips From the Experts
Lennart Regebro
“Products.ZMIntrospection is a quick way to look at
all the fields of any ZODB object from the ZMI.”
14. Tips From the Experts
Alec Mitchell
“If you need to store arbitrary key/value pairs: use
PersistentDict when the amount of data is "small"
and/or you tend to require all the data in a given
transaction; use OOBTree (and friends) when you
have a large number of keys and tend to only
need a small subset of them in a transaction.”
15. Tips From the Experts
Alec Mitchell
“If you store data in one of the BTree structures
and you need to count the number of entries,
don't use len(), ever. Use a BTrees.Length object
to keep track of the count separately.”
16. Tips From the Experts
Alan Runyan
“Use zc.zlibstorage for text-heavy databases; it's a
60-70% storage win for those records.”
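To get a feel for the tip above, here is a sketch that compresses one text record with the standard zlib module. It does not use zc.zlibstorage itself, and the sample record is synthetic and very repetitive, so it compresses far better than real content would; measure your own data before relying on any figure.

```python
import zlib

# Hypothetical text-heavy record: repetitive markup, as in many CMS pages.
record = ("<p>The quick brown fox jumps over the lazy dog.</p>\n" * 200).encode("utf-8")
compressed = zlib.compress(record)

# Fraction of the original size that compression saved.
saving = 1.0 - float(len(compressed)) / len(record)
print(len(record), len(compressed), round(saving, 2))

# Compression is lossless: the original bytes come back unchanged.
restored = zlib.decompress(compressed)
```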
18. Tips From the Experts
Alan Runyan
“Use zc.zodbdgc, an awesome library which provides
an inverse graph of the ZODB tree so you can see where
leaves are referenced from.”
19. zc.zodbdgc
To use zc.zodbdgc, just add a part to the buildout
that pulls in the egg:
[zodbdgc]
recipe = zc.recipe.egg
eggs = ${instance:eggs}
You can then call the multi-zodb-gc and
multi-zodb-checkrefs scripts.
20. Tips From the Experts
Chris McDonough
“Use the "BTrees.Length" object to implement
counters in the ZODB. It has conflict resolution
built in to it that has the potential to eliminate
conflict errors (as opposed to a normal integer
counter attached to a persistent object).”
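The idea behind a conflict-resolving counter can be sketched in plain Python. This mirrors the arithmetic only, as I understand it, and is not the BTrees.Length class itself; the function name is hypothetical.

```python
def resolve_counter_conflict(old, committed, new):
    """Merge two concurrent updates to a counter that both started at old.

    Each transaction's change is a delta from old, so applying both
    deltas to old gives the merged value.
    """
    return committed + new - old


# Two transactions both read 10; one adds 1, the other adds 2.
merged = resolve_counter_conflict(10, 11, 12)
print(merged)
```

A plain integer attribute has no such merge step, so the second transaction to commit would get a conflict error instead.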
21. Tips From the Experts
Tres Seaver
“If you find yourself under intense fire, and
everything around you is crumbling, don't despair,
just increase the ZEO client cache size.”
22. Which cache is which
Don't confuse the ZEO client cache with the ZODB
object cache.
The ZODB object cache stores objects in memory for
faster responses. You set it with zodb-cache-size in a
buildout.
The ZEO client cache is used first when an object is
not in the ZODB object cache and avoids round trips
to the ZEO server. You set it with
zeo-client-cache-size in a buildout.
You can enable cache tracing for analysis by setting
the ZEO_CACHE_TRACE environment variable. More
information at:
http://wiki.zope.org/ZODB/trace.html
23. Tips From the Experts
Jim Fulton
“Avoid non-persistent mutable objects.”