2. Contents
• About Caches
• Why caching works
• Will an Application Benefit from Caching?
• How much will an application speed up?
• About Ehcache
• Features of Ehcache
• Key Concepts of Ehcache
• Using Ehcache
• Distributed Ehcache Architecture
• References
3. About Caches
• In Wiktionary
– A store of things that will be required in future and can be
retrieved rapidly
• In computer science
– A collection of temporary data which either duplicates data
located elsewhere or is the result of a computation
– The data can be repeatedly accessed inexpensively
4. Why caching works
• Locality of Reference
– Data that is near other data or has just been used is more likely to be used
again
• The Long Tail
– “A small number of items may make up the bulk of sales.” – Chris Anderson
– One form of a Power Law distribution is the Pareto distribution (80:20 rule)
– If 20% of objects are used 80% of the time and a way can be found to
reduce the cost of obtaining that 20%, then system performance will improve
5. Will an Application Benefit from Caching?
CPU bound Application
• The time taken principally depends on the speed of the CPU
and main memory
• Speeding up
– Improving algorithm performance
– Parallelizing the computations across multiple CPUs or multiple
machines
– Upgrading the CPU speed
• The role of caching
– Temporarily store computations that may be reused again
• Ex) DB Cache, Large web pages that have a high rendering cost.
6. Will an Application Benefit from Caching?
I/O bound Application
• The time taken to complete a computation depends principally
on the rate at which data can be obtained
• Speeding up
– Hard disks are speeding up by using their own caching of blocks into
memory
• There is no Moore’s law for hard disks.
– Increase the network bandwidth
• The role of cache
– Web page caching, for pages generated from databases
– Data Access object caching
7. Will an Application Benefit from Caching?
Increased Application Scalability
• Databases can do on the order of 100 expensive queries per second
– Caching may be able to reduce the workload required
8. How much will an application speed up?
(Amdahl’s Law)
• Depends on a multitude of factors
– How many times a cached piece of data can be, and is, reused
by the application
– The proportion of the response time that is alleviated by
caching
• Amdahl’s Law
speedup = 1 / ((1 – P) + P / S)
– P: the proportion of the system sped up
– S: the speedup of that proportion
9. Amdahl’s Law Example
(Speed up from a Database Level Cache)
Un-cached page time: 2 seconds
Database time: 1.5 seconds
Cache retrieval time: 2ms
Proportion: 75% (1.5/2)
The expected system speedup is thus:
1 / (( 1 – 0.75) + 0.75 / (1500/2))
= 1 / (0.25 + 0.75/750)
= 3.98 times system speedup
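The arithmetic above can be checked with a few lines of Java (a minimal sketch; the numbers are taken from this example):

```java
public class AmdahlExample {
    // speedup = 1 / ((1 - P) + P / S)
    static double amdahl(double p, double s) {
        return 1.0 / ((1.0 - p) + p / s);
    }

    public static void main(String[] args) {
        double p = 1.5 / 2.0;     // 1.5 s of a 2 s page is spent in the database
        double s = 1500.0 / 2.0;  // 1.5 s database time vs. 2 ms cache retrieval
        System.out.printf("%.2f%n", amdahl(p, s)); // prints 3.98
    }
}
```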
10. About Ehcache
• Open source, standards-based cache used to boost performance
• Basically an in-process cache
• Scales from in-process caching on one or more nodes through to a mixed in-
process/out-of-process configuration with terabyte-sized caches
• For applications needing a coherent distributed cache, Ehcache uses
the open source Terracotta Server Array
• Java-based Cache, Available under an Apache 2 license
• The Wikimedia Foundation uses Ehcache to improve the performance
of its wiki projects
11. Features of Ehcache(1/2)
• Fast and Light Weight
– Fast, Simple API
– Small footprint: Ehcache 2.2.3 is 668 KB, making it convenient to package
– Minimal dependencies: the only dependency is on SLF4J
• Scalable
– Provides Memory and Disk store for scalability into gigabytes
– Scalable to hundreds of nodes with the Terracotta Server Array
• Flexible
– Supports Object or Serializable caching
– Provides LRU, LFU and FIFO cache eviction policies
– Provides Memory and Disk stores
12. Features of Ehcache(2/2)
• Standards Based
– Full implementation of JSR107 JCACHE API
• Application Persistence
– Persistent disk store which stores data between VM restarts
• JMX Enabled
• Distributed Caching
– Clustered caching via Terracotta
– Replicated caching via RMI, JGroups, or JMS
• Cache Server
– RESTful and SOAP cache servers
• Search
– Standalone and distributed search using a fluent query language
13. Key Concepts of Ehcache
Key Classes
• CacheManager
– Manages caches
• Ehcache
– All caches implement the Ehcache interface
– A cache has a name and attributes
– Cache elements are stored in the memory store, and optionally also overflow
to a disk store
• Element
– An atomic entry in a cache
– Has key and value
– Put into and removed from caches
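A minimal sketch of these three classes in use, assuming the classic Ehcache 2.x API (net.sf.ehcache) and an ehcache.xml with a defaultCache on the classpath; the cache name and values are illustrative:

```java
import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.Element;

public class KeyClassesExample {
    public static void main(String[] args) {
        CacheManager manager = CacheManager.create(); // loads ehcache.xml from the classpath
        manager.addCache("pages");                    // creates a cache with defaultCache settings
        Cache cache = manager.getCache("pages");

        cache.put(new Element("home", "<html>...</html>")); // Element = key + value
        Element hit = cache.get("home");
        if (hit != null) {
            System.out.println(hit.getObjectValue());
        }

        cache.remove("home");
        manager.shutdown();
    }
}
```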
14. Key Concepts of Ehcache
Usage patterns: Cache-aside
• Application code uses the cache directly
• Order
– Application code consults the cache first
– If the cache contains the data, it is returned directly
– Otherwise, the application code must fetch the data from the system-of-record,
store the data in the cache, then return it
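The steps above can be sketched generically (a ConcurrentHashMap stands in for the cache and another map for the system-of-record; the names are illustrative, not Ehcache API):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CacheAside {
    static final Map<String, String> cache = new ConcurrentHashMap<>();
    static final Map<String, String> systemOfRecord = new ConcurrentHashMap<>();

    // Application code consults the cache first, and only on a miss
    // fetches from the system-of-record and populates the cache.
    static String fetch(String key) {
        String value = cache.get(key);
        if (value != null) {
            return value;                    // cache hit: return directly
        }
        value = systemOfRecord.get(key);     // cache miss: go to the SOR
        if (value != null) {
            cache.put(key, value);           // populate the cache for next time
        }
        return value;
    }

    public static void main(String[] args) {
        systemOfRecord.put("user:1", "Alice");
        System.out.println(fetch("user:1")); // miss: loaded from the SOR
        System.out.println(fetch("user:1")); // hit: served from the cache
    }
}
```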
15. Key Concepts of Ehcache
Usage patterns: Read-through
• Mimics the structure of the cache-aside pattern when reading data
• The difference
– Must implement the CacheEntryFactory interface to instruct the cache how to
read objects on a cache miss
– Must wrap the Ehcache instance with an instance of SelfPopulatingCache
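The difference can be sketched generically: the application no longer checks for a miss itself, but hands the cache a factory (standing in for CacheEntryFactory) that the cache invokes on a miss. Names here are illustrative, not the Ehcache API:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

public class ReadThroughCache {
    private final ConcurrentMap<String, String> store = new ConcurrentHashMap<>();
    private final Function<String, String> factory; // invoked by the cache on a miss

    public ReadThroughCache(Function<String, String> factory) {
        this.factory = factory;
    }

    // The caller just asks for the key; loading on a miss is the cache's job.
    public String get(String key) {
        return store.computeIfAbsent(key, factory);
    }

    public static void main(String[] args) {
        ReadThroughCache cache = new ReadThroughCache(key -> "loaded:" + key);
        System.out.println(cache.get("a")); // prints loaded:a
    }
}
```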
16. Key Concepts of Ehcache
Usage patterns: Write-through and write-behind
• Mimics the structure of the cache-aside pattern when writing data
• The difference
– Must implement the CacheWriter interface and configure the cache for write-through or
write-behind
– A write-through cache writes data to the system-of-record in the same thread of execution
– A write-behind cache queues the data for writing at a later time
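The two modes above can be sketched generically (maps stand in for the cache and the system-of-record, and flush() stands in for the background writer thread; names are illustrative, not Ehcache API):

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

public class WriteModes {
    static final Map<String, String> cache = new ConcurrentHashMap<>();
    static final Map<String, String> systemOfRecord = new ConcurrentHashMap<>();
    static final Queue<String> writeBehindQueue = new ConcurrentLinkedQueue<>();

    // Write-through: the SOR is updated in the same thread as the cache put.
    static void putWriteThrough(String key, String value) {
        cache.put(key, value);
        systemOfRecord.put(key, value);
    }

    // Write-behind: the write is queued and applied later (here, by flush()).
    static void putWriteBehind(String key, String value) {
        cache.put(key, value);
        writeBehindQueue.add(key);
    }

    static void flush() { // in a real cache this runs on a background thread
        String key;
        while ((key = writeBehindQueue.poll()) != null) {
            systemOfRecord.put(key, cache.get(key));
        }
    }

    public static void main(String[] args) {
        putWriteThrough("a", "1");                           // SOR sees "a" immediately
        putWriteBehind("b", "2");                            // SOR does not see "b" yet
        System.out.println(systemOfRecord.containsKey("b")); // false
        flush();
        System.out.println(systemOfRecord.containsKey("b")); // true
    }
}
```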
17. Key Concepts of Ehcache
Usage patterns: Cache-as-SOR
• Delegates SOR reading and writing activities to the cache
• To implement, use a combination of the following patterns
– Read-through
– Write-through or write-behind
• Advantages
– Less cluttered application code
– Easily choose between write-through and write-behind strategies
– Allows the cache to solve the “thundering-herd” problem
• Disadvantages
– Less directly visible code-path
18. Key Concepts of Ehcache
Storage Options: Memory Store
• Suitable Element Types
– All Elements are suitable for placement in the Memory Store
• Characteristics
– Thread safe for use by multiple concurrent threads
– Backed by LinkedHashMap (JDK 1.4 and later)
• LinkedHashMap: hash table and linked-list implementation of the Map interface
– Fast
• Memory Use, Spooling and Expiry Strategy
– Least Recently Used (LRU): default
– Least Frequently Used (LFU)
– First In First Out (FIFO)
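Since the Memory Store is backed by LinkedHashMap, the default LRU policy can be sketched with LinkedHashMap's access-order mode (a generic illustration, not Ehcache's actual implementation):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruStore<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public LruStore(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true: least recently used entry first
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries; // evict the least recently used entry
    }

    public static void main(String[] args) {
        LruStore<String, Integer> store = new LruStore<>(2);
        store.put("a", 1);
        store.put("b", 2);
        store.get("a");    // touch "a", so "b" is now least recently used
        store.put("c", 3); // evicts "b"
        System.out.println(store.keySet()); // prints [a, c]
    }
}
```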
19. Key Concepts of Ehcache
Storage Options: BigMemory Store
• Pure Java product from Terracotta that permits caches to use an additional type of
memory store outside the object heap (packaged for use in Enterprise Ehcache)
– Not subject to Java GC
– 100 times faster than Disk-Store
– Allows very large caches to be created(tested up to 350GB)
• Two implementations
– Only Serializable cache keys and values can be placed in it, similar to the Disk Store
– Serialization and deserialization take place on putting into and getting from the store
• Around 10 times slower than the Memory Store
• The memory store holds the hottest subset of data from the off-heap store, already in deserialized form
• Suitable Element Types
– Only Elements which are Serializable can be placed in the off-heap store
– Any non-serializable Elements will be removed and a WARNING-level log message emitted
20. Key Concepts of Ehcache
Storage Options: Disk Store
• Disk stores are optional
• Suitable Element Types
– Only Elements which are Serializable can be placed in the disk store
– Any non-serializable Elements will be removed and a WARNING-level
log message emitted
• Eviction
– The LFU algorithm is used and it is not configurable or changeable
• Persistence
– Controlled by the diskPersistent configuration attribute
– If false or omitted, the disk store will not persist between CacheManager restarts
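A sketch of the relevant ehcache.xml settings, assuming the Ehcache 2.x configuration schema (the cache name and sizes are illustrative):

```xml
<ehcache>
  <!-- where the disk store writes its files -->
  <diskStore path="java.io.tmpdir"/>

  <cache name="persistentCache"
         maxElementsInMemory="1000"
         eternal="false"
         overflowToDisk="true"
         diskPersistent="true"/> <!-- persist between CacheManager restarts -->
</ehcache>
```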
21. Key Concepts of Ehcache
Replicated Caching
• Ehcache has a pluggable cache replication scheme
– RMI, JGroups, JMS
• Using a Cache Server
– To achieve shared data, all JVMs write to and read from a Cache Server
• Notification Strategies
– If the Element is not available anywhere else, then the Element itself should form the payload
of the notification
22. Key Concepts of Ehcache
Search APIs
• Allows you to execute arbitrarily complex queries against either a standalone
cache or a Terracotta clustered cache with pre-built indexes
• Searchable attributes may be extracted from both keys and values
• Attribute Extractors
– Attributes are extracted from keys or values
– This is done during search or, if using Distributed Ehcache, on put() into the
cache, using AttributeExtractors
– Supported types
• Boolean, Byte, Character, Double, Float, Integer, Long, Short, String, Enum, java.util.Date,
java.sql.Date
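A sketch of declaring searchable attributes in ehcache.xml, assuming the Ehcache Search configuration schema (the cache name, attribute names, and bean-property expressions are illustrative):

```xml
<cache name="people" maxElementsInMemory="1000" eternal="false">
  <searchable>
    <!-- extract attributes from the cached value via bean properties -->
    <searchAttribute name="age" expression="value.getAge()"/>
    <searchAttribute name="name" expression="value.getName()"/>
  </searchable>
</cache>
```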
23. Using Ehcache
General-Purpose Caching
• Local Cache
• Configuration
– Place the Ehcache jar on your classpath
– Configure ehcache.xml and place it on your classpath
– Optionally, configure an appropriate logging level
– (Diagram: web/application servers, each with a local in-process Ehcache, backed by a DB)
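A minimal ehcache.xml to start from, assuming the Ehcache 2.x configuration schema (the values are illustrative):

```xml
<ehcache>
  <!-- used for caches created with CacheManager.addCache(name) -->
  <defaultCache maxElementsInMemory="10000"
                eternal="false"
                timeToIdleSeconds="120"
                timeToLiveSeconds="120"
                overflowToDisk="false"
                memoryStoreEvictionPolicy="LRU"/>
</ehcache>
```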
24. Using Ehcache
Cache Server
• Support for RESTful and SOAP APIs
• Redundant and scalable, with client hash-based routing
– The client can be implemented in any language
– The client must work out a partitioning scheme
25. Using Ehcache
Integrate with other solutions
• Hibernate
• Java EE Servlet Caching
• JCache style caching
• Spring, Cocoon, Acegi and other frameworks
26. Distributed Ehcache Architecture
(Logical View)
• Distributed Ehcache combines an in-process Ehcache with the Terracotta Server Array
• The data is split between an Ehcache node(L1) and the Terracotta Server Array(L2)
– The L1 can hold as much data as is comfortable
– The L2 always holds a complete copy of all cache data
– The L1 acts as a hot-set of recently used data
27. Distributed Ehcache Architecture
(Ehcache topologies)
• Standalone
– The cache data set is held in the application node
– Any other application nodes are independent with no communication
between them
• Distributed Ehcache
– The data is held in a Terracotta Server Array, with a subset of recently used
data held in each application cache node
• Replicated
– The cached data set is held in each application node and data is copied or
invalidated across the cluster without locking
– Replication can be either asynchronous or synchronous
– The only consistency mode available is weak consistency
28. Distributed Ehcache Architecture
(Network View)
• From a network topology point of view, Distributed Ehcache consists of
– Ehcache node(L1)
• The Ehcache library is present in each app
• An Ehcache instance, running in-process sits in each JVM
– Terracotta Server Array(L2)
• Each Ehcache instance maintains a connection with one or more Terracotta Servers
• Consistent hashing is used by the Ehcache nodes to store and retrieve cache data
29. Distributed Ehcache Architecture
(Memory Hierarchy View)
• Each in-process Ehcache instance
– Heap memory
– Off-heap memory(Big Memory)
• The Terracotta Server Arrays
– Heap memory
– Off-heap memory
– Disk storage
• This is optional (persistence)