The document discusses Microsoft SharePoint 2013's distributed cache service. It describes what a cache is and why it is used, provides examples of SharePoint's caches, and discusses cache architecture including high availability and configuration. It also covers cache sizing guidelines, implementation details, and monitoring the cache service through logs, configuration checks, and performance counters.
4. What is a cache?
[Diagram labels: Databases, Web Services, Application Servers, Identity Services, Hard Disks, Memory, Cache Service, Cache Cluster]
5. Why cache?
Why cache?
Save processor load
Save bandwidth
Why share the cache?
Multiply the benefits of the cache
Why a distributed cache?
It scales out ... up to a point
It is more resilient to failures
6. Disadvantages of caching
Cached data can become stale
The source changes
Mitigations: timer-based and event-based expiration, manual eviction
The cache is not as robust as the data source
Databases are built to be highly available, backed up, and redundant
The cache is not
Mitigations: a highly available cache, or falling back to the source
The cache needs memory
More memory and disk space required; less network and processing time required
Mitigation: do not use a cache
7. SharePoint's caches
BLOB cache
Databases: disk reads are faster than a round trip to the database
ASP.NET page output cache
Local copy of HTML pages
Object cache
Cross-site queries (no need to re-run the query)
File system cache
Serializes SharePoint's persistent objects (config DB)
Distributed cache
Stores various data for SharePoint services
14. AppFabric Infrastructure
Physical Architecture
Logical Architecture
High Availability
Configuration
Caches
Cache Hosts
Memory requirements planning
SharePoint Cache: Infrastructure and Management
Logs, Configuration Checks, and PerfMon Counters
15. AppFabric Physical Architecture
Cache Cluster: a collection of servers that provides a single point of access to the Distributed Cache Service.
Cache Cluster Configuration: stores the configuration data for the farm
Cache Host: an individual member node of the farm
16. AppFabric Logical Architecture
Named Caches: containers for cached items
Cached Objects: an individual key/value pair stored in a named cache
Regions: collections of cached objects.
Objects can be placed directly in a cache or in a specific region.
21. Cache Host Configuration
Communication Ports
Cache Port (22233)
Cluster Port (22234)
Arbitration Port (22235)
Replication Port (22236)
Size
HighWatermark and LowWatermark
IsLeadHost
22. Memory Requirements
Initial allocation is 5% of total physical memory at time of provisioning of Cache Service
CacheSize + 100MB must be available at time of service start
Change cache size:
Update-SPDistributedCacheSize
Set-AFCacheHostConfiguration
Recommendations:
Dedicate a machine to AppFabric
Allocate 50% of available physical memory
Machine should have no more than 16GB of RAM
Throttling
Less than 15% physical memory available
Less than 4% of allocated cache host size available
Dynamic Memory not supported
24. SharePoint Cache Infrastructure
Each Object or Region in a cache is stored once
Will be lost if the host is shut down
SharePoint has trouble with hosts still in cluster but not actually available
Stopping a host:
Stop-SPDistributedCacheServiceInstance –Graceful
Remove-SPDistributedCacheServiceInstance
Starting a host:
Start-SPServiceInstance
Add-SPDistributedCacheServiceInstance
New-SPConfigurationDatabase -SkipRegisterAsDistributedCacheHost
28. PerfMon
AppFabric Caching: Cache and Host
Total Eviction Runs (Cache and Host)
Total Data Size Bytes (Cache and Host)
Total Object Count (Cache and Host)
AppFabric Caching: Host
Available Memory Percentage
Gateway Process Time
Total Available Memory Bytes
Throttled Connections Count
.NET CLR Memory
# Bytes in all Heaps
% Time in GC
Memory
Process/Working Set and Virtual Bytes
29. Cache Server Performance
There are performance counters; there are also counts exposed via
developer’s dashboard
# of reads
# of writes
# of hits
# of misses
time for read
time for write
Total I/O (how much data has been transferred in a given period of time)
30. Cache Service Health
The following health rules have been created to help you track the Cache
Service (look in the Availability section for most):
One of the cache hosts in the cluster is down (Availability)
Firewall client settings on the cache host are incorrect (Configuration)
Cache host is in throttled state (Availability)
The high availability node for SharePoint distributed cache is not available
(Availability) – happens when fewer than 2 servers are running the cache service
There exists at least one cache host in the cluster, which SP doesn't know about
(Configuration) – happens when the cache service is disabled in SharePoint but
AppFabric Caching Service is running on the machine
Cached objects have been evicted (Configuration) – indicates eviction happened across
the cache cluster. Not bad in and of itself but may be a clue if it happens frequently and/or
there are perf issues
31. Summary
Configuration details stored in Config DB
10 SharePoint Caches created
Using SharePoint caches for other purposes is not supported
Custom caches not supported; deploy to alternate servers
High Availability/Redundancy not available for SharePoint caches
AppFabric PowerShell cmdlets can be used to monitor and manipulate the
caches
Editor's Notes
A cache is a collection of local copies of items for which authoritative versions are created and/or stored somewhere else. Sources may contain literal copies of the data, such as databases or static web pages, or may generate the data on demand, such as Web Services and Identity Services. The classic example is a browser cache: instead of requesting the data multiple times, we store it locally to make things quicker.
Why Cache?
Save processing power: Multiple requests for the same resource can be processed just once or a finite number of times. Subsequent requests require less processing and can be served directly by the cached copy.
Save network trips to retrieve data: Data can be retrieved from an authoritative source just once and saved for future use.
Why Shared Cache?
Instead of each server maintaining its own cache, all servers share one cache. This multiplies the benefits of the cache, as one trip and one processing event retrieves the item for all servers sharing the cache.
Why Distributed Cache?
By creating a caching farm, the cache can withstand the loss of one or more nodes. Also, the cache can handle more load through horizontal scaling.
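The shared-cache benefit described in these notes can be sketched in a few lines of Python. This is a toy model, not SharePoint or AppFabric code: the `SharedCache` class, the `render_page` loader, and the server names are all hypothetical, and the "authoritative source" is just a function call.

```python
from typing import Callable, Dict


class SharedCache:
    """Toy in-memory cache standing in for one cache shared by several servers."""

    def __init__(self, load: Callable[[str], str]) -> None:
        self._load = load                  # call into the authoritative source
        self._items: Dict[str, str] = {}   # local copies, keyed by item name
        self.source_trips = 0              # how often the source was contacted

    def get(self, key: str) -> str:
        # Cache miss: one trip to the source, then keep the copy for later requests.
        if key not in self._items:
            self.source_trips += 1
            self._items[key] = self._load(key)
        return self._items[key]


def render_page(key: str) -> str:
    """Hypothetical expensive operation performed by the authoritative source."""
    return f"<html>{key}</html>"


cache = SharedCache(render_page)
# Three hypothetical web servers request the same page through the shared cache.
results = [cache.get("home") for _server in ("WFE1", "WFE2", "WFE3")]
```

With one shared cache, the three requests cost a single trip to the source; with one private cache per server, each server would pay for its own trip.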
First, cache servers use about 50% of the allocated cache memory for overhead. Second, the maximum amount of RAM we recommend per cache server is 16 GB. You can go higher than that, but if you do, then when we need to flush the cache it may take so long that the cache will appear to hang. It's worth noting that by default when you install SharePoint we allocate 5% of the memory on the server to the cache. Given that 50% of that is used for overhead, that leaves you 7 GB of storage for data. We'll be working on formulas to help you determine how much storage your farm will need overall for caching. Once you have that, you can use the formula above to determine how much storage you will have per server, and then divide the total storage required in the farm by that per-server figure. That will tell you how many cache servers you need in the farm.
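As a rough worked example of the sizing arithmetic in this note, here is a minimal Python sketch. It assumes the 50% overhead figure quoted above; the function names and the sample inputs (14 GB allocated per server, 20 GB of farm-wide cache data) are hypothetical and chosen only to illustrate the calculation.

```python
import math

OVERHEAD_FRACTION = 0.5  # from the note: about 50% of allocated cache memory is overhead


def usable_gb_per_server(allocated_gb: float) -> float:
    """Memory left for cached data on one host after overhead."""
    return allocated_gb * (1 - OVERHEAD_FRACTION)


def cache_servers_needed(farm_data_gb: float, allocated_gb: float) -> int:
    """Total data the farm must cache divided by usable storage per server, rounded up."""
    return math.ceil(farm_data_gb / usable_gb_per_server(allocated_gb))


# Hypothetical inputs: 14 GB allocated to the cache on each server, 20 GB needed farm-wide.
per_server = usable_gb_per_server(14)      # 7.0 GB usable for data
servers = cache_servers_needed(20, 14)     # 3 servers
```

The rounding up matters: 20 GB over 7 GB per server is just under three, and a fractional server has to become a whole one.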
As I've alluded to a few times now, we recommend that you use servers dedicated to caching if possible, not shared with web front ends or other application servers. The distributed cache service can stress the CPU and memory on an individual server, especially if it is competing with other services on the box for server resources. The data in the cache is divided and distributed between all cache servers; data is not replicated between them. If a server were to go down, that doesn't mean that your farm is unavailable; it just means that user performance will suffer until other servers in the cache cluster become populated with the data that was lost. If that does happen, you can accelerate the process of populating the cache again using one of the PowerShell cmdlets I have listed on this slide. If you KNOW that you are going to bring a cache server down, then you should do so gracefully with the Stop-SPDistributedCacheServiceInstance cmdlet. That will copy the data in that cache server to the remaining servers in the cache cluster before shutting down, which minimizes the performance impact in the farm.
ActivityFeed – Caches microfeed entries from followable lists for Newsfeed. When you go to a page with a newsfeed, the system retrieves these items and then serves this data from the cache (auth source == followable lists).
ActivityFeedLMT – Caches the LMT (last modified time) of every followable list in the farm. The LMT cache is checked first to find out if changes may have been made since the last retrieval of data from the cache.
LogonToken – Caches SAML tokens after logon. Helps prevent us from needing sticky sessions for claims logon.
ServerToAppServerAccessTokenCache – Caches OAuth tokens retrieved for apps. Prevents the need to return to ACS for app tokens.
ViewState – For Minimal Download Strategy (MDS), saves ViewState in this cache and sends the key as ViewState in the page.
Search – Query results cache. Not sure exactly how deep this goes.
SecurityTrimming – Caches results of Social Security Trimming. For user profile activity feed security trimming.
Potentially added during Beta, but never implemented:
Default – Does nothing?
Access – ?
Bouncer – OneNote (?)
Generic AppFabric discussion, then specifics around SharePoint’s implementation, configuration, and requirements
Details: http://msdn.microsoft.com/en-us/library/hh334305(v=azure.10).aspx
Like SharePoint, AppFabric servers are part of a farm. Configuration for the entire farm is stored in a central location; for SharePoint, this is two tables in the configuration database: dbo.CacheClusterConfig and dbo.CacheClusterConfigSchemaVersion.
Cache Hosts are individual servers in the AppFabric farm.
AppFabric consists of 'Named Caches', which are the logical containers in which cached objects are stored. Each of the caches we named for SharePoint previously is a separate named cache in AppFabric.
Cached Objects (key/value pairs) can be stored directly into a Cache, in which case they are spread across all hosts in the cluster. Alternatively, they can be stored into a specific Region.
A Region is a collection of related objects stored in a specific cache. Additional search and tagging functionality is available for objects in the same region. However, the entire region must be stored on only a single host in the cluster. SharePoint does use regions, and this is one of the problems with redundancy.
Objects are stored either within the cache or within a region in the cache. By default, they are stored only once across the entire cache cluster. AppFabric provides a High Availability mode wherein objects and regions are stored on 2 hosts in the cluster. However, this is not currently available in SharePoint.
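The placement rule in this note (objects stored directly in a cache are spread across hosts, while a region lives entirely on one host) can be illustrated with a small Python model. This is only a sketch: the host names are hypothetical, and the CRC-based hashing is an illustrative stand-in, not a description of how AppFabric actually assigns objects to hosts.

```python
import zlib
from typing import Optional

HOSTS = ["CacheHost1", "CacheHost2", "CacheHost3"]  # hypothetical host names


def owner(token: str) -> str:
    """Pick one host deterministically from a string (illustrative hashing only)."""
    return HOSTS[zlib.crc32(token.encode("utf-8")) % len(HOSTS)]


def host_for(cache: str, key: str, region: Optional[str] = None) -> str:
    # A region is kept whole on a single host, so only the region name decides placement.
    if region is not None:
        return owner(f"{cache}/{region}")
    # Objects stored directly in the named cache are placed per key, so they can spread out.
    return owner(f"{cache}/{key}")


# Every object in the same region lands on the same host in this model.
region_hosts = {host_for("ActivityFeed", f"item{i}", region="feed-1") for i in range(50)}
```

In this model, losing the host that owns a region loses every object in that region at once, which is the redundancy problem the note describes.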
http://msdn.microsoft.com/en-us/library/hh351441(v=azure.10).aspx
High Availability requires the Enterprise edition of Windows Server and at least 3 cache hosts; with only two hosts, the cache stops when one goes down.
High Availability is enabled at the cache level, not at the region or host level. Consistency refers to whether the copy of the cached item is written to the secondary synchronously (Strong Consistency) or asynchronously. In a synchronous configuration, the response back to the client awaits acknowledgement from the secondary.
Because High Availability requires at least one primary and one secondary for each cache, true high availability requires 3 cache hosts. This way, when one host fails or is stopped, the other two can become primary and secondary for the cache.
Without this HA in the SharePoint Distributed Cache, regions are stored on only one host, which is why you lose cache data when a host fails. This is not critical, as the cache is not the authoritative source for the data.
1. Open AppFabricCacheExamples.ps1 in PowerShell ISE and explain adding the SharePoint snap-in and importing the DistributedCache modules.
2. Explain about running Connect-AFCacheClusterConfiguration first and then Get-AFCache.
3. Show the cache names and explain the prefix "Distributed" and the suffix (a GUID that is the farm ID).
4. Show the Host column, where it lists each host for that named cache and the regions that are running on that host.
5. Execute Get-AFCache | % { Get-AFCacheConfiguration –CacheName $_.CacheName} and explain the properties listed for each of the named caches.
Expirations: a way to control how to deal with stale items. Default of 10 minutes. Items are not actually removed, but are considered stale.
Secondaries: if this were set to 1 then you would have HA, but in SharePoint it will always be 0.
IsExpirable: if this were set to false, then TTL would never matter.
EvictionType: LRU (least recently used). Eviction happens when we exhaust memory and remove objects. The LMT cache is set to None, which means we do not remove items; this can have other consequences that we'll discuss later.
**not used in SharePoint**
NotificationsEnabled is for notifying applications about cache functions.
WriteBehind and ReadThrough were new features added in AppFabric 1.1.
Grab the Provider and ConnectionString values from the registry to be used later (Get-AFCacheCluster).
Caches with expiration set will remove items from memory when they pass their TTL *if* the host passes the low watermark.
At the high watermark, caches forcibly evict cached items from the cache based on the LRU algorithm if specified. If "None" is specified, nothing is forcibly evicted. A value of None may cause a cache host to run out of memory. (http://msdn.microsoft.com/en-us/library/hh351453(v=azure.10).aspx)
Watermarks are specified for cache hosts (not for caches).
Secondaries is set to 1 for high availability.
Notifications are set to True if the cache will send queued notifications to clients.
Write-Behind and Read-Through specify how the cache will write data back to a data source, or read data from a data source when needed.
SharePoint does not use secondaries, notifications, or Write-Behind/Read-Through. Expiration is enabled and TTL is set to 10 minutes. Eviction is enabled on all caches *except* the ActivityFeedLMT cache.
1. Reviewing the Cache Host Configuration section
2. 22233 is the main port
3. 22235 is a system-level communication port
4. 22236 is the port for replicating data across each host
5. 22234 is the Cluster Port. Need to research, but it's mainly a management port. All ports should be open in the firewall.
6. Size: discuss allocation of RAM
7. Watermark is about eviction
8. IsLeadHost: basically has no bearing on SharePoint features.
Arbitration and Replication are processes for managing a Windows Fabric cluster.
Connection parameters are stored in the registry at HKLM:\SOFTWARE\Microsoft\AppFabric\V1.0\Configuration
HighWatermark and LowWatermark govern eviction policies. See the slide on Cache Configuration for details.
IsLeadHost specifies whether this host participates in managing the cluster. It's only relevant for clusters where leadHostManagement for the cluster is set to true. For SharePoint, lead host management is not enabled (the provider manages it), so this property is not important.
Memory allocated to AppFabric Cache Host: Total Physical Memory at time of provisioning * 5%. If the cache host is removed entirely and re-added, this will be recalculated.
In order for the Distributed Cache service to start, allocated memory + 100 MB must be available. E.g., if allocated memory is 500 MB, 600 MB must be available when the service starts.
Update-SPDistributedCacheSize changes the cache size for every server in the farm currently running the Distributed Cache service. It stops the distributed cache service across the farm. It does this inefficiently by setting it to Graceful shutdown, which isn't relevant when shutting down the entire cluster. It does not affect hosts added later, even if the same host is removed and then re-added. Alternatively, use Set-AFCacheHostConfiguration on individual hosts.
From http://msdn.microsoft.com/en-us/library/hh334304(v=azure.10).aspx
By default, we recommend that the amount of memory reserved for the cache on a given server is 50% of the total RAM. In this example, half of the RAM is 2 GB. The remaining memory is then available for the operating system and services. Even on machines with much more memory, it is recommended to keep this default setting. As previously mentioned, the caching service assumes that it is running on a dedicated machine, and it may use much more memory than is allocated for the cache. Although part of this memory usage is due to the internal design of the caching service, part of it is also related to .NET memory management and garbage collection. Even when memory is released in a .NET application, it must wait for garbage collection to be freed from the process memory. The process requires a buffer of physical memory to account for the non-deterministic nature of garbage collection.
Full garbage collection cycles can cause a short delay, which can often be seen in retry errors. For this reason, we recommend that each cache host have 16 GB or less of memory. Machines with greater than 16 GB of RAM can experience longer pauses for full garbage collection cycles. With that said, there is no restriction against using more memory per cache host. A workload that is more read-only might not experience full garbage collection cycles as frequently. You can best determine this through load testing.
From http://msdn.microsoft.com/en-us/library/hh334407(v=azure.10).aspx
Use the Performance Monitor to track the Memory | Available MBytes on each cache host. When this gets under 15% of total physical memory, the cache host enters the throttled state. Throttling also occurs when the cache host memory comes within 4% of the CacheSize setting.
Invoke-AFCacheGarbageCollector
Dynamic memory is not supported, mainly because AppFabric does not re-allocate while running…
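The memory rules in these notes can be collected into a few small checks. The Python sketch below takes its numbers from the notes (5% default allocation, 100 MB of headroom at service start, throttling below 15% available physical memory or within 4% of the cache size); the function names and the sample inputs are hypothetical, and this is a reading of the stated rules rather than how AppFabric evaluates them internally.

```python
def default_cache_size_mb(total_physical_mb: float) -> float:
    """Default allocation from the notes: 5% of physical memory at provisioning time."""
    return total_physical_mb * 0.05


def can_start(cache_size_mb: float, available_mb: float) -> bool:
    """The service needs the allocated cache size plus 100 MB available at start."""
    return available_mb >= cache_size_mb + 100


def is_throttled(total_physical_mb: float, available_mb: float,
                 cache_size_mb: float, cache_used_mb: float) -> bool:
    """Throttled if either condition from the notes holds."""
    # Condition 1: less than 15% of physical memory is still available.
    low_physical = available_mb < 0.15 * total_physical_mb
    # Condition 2: less than 4% of the allocated cache size is still free.
    near_cache_limit = (cache_size_mb - cache_used_mb) < 0.04 * cache_size_mb
    return low_physical or near_cache_limit


# The example from the notes: a 500 MB allocation needs 600 MB available to start.
starts_with_600 = can_start(500, 600)
starts_with_599 = can_start(500, 599)
```

Keeping the two throttling conditions separate makes it easier to tell whether a host is short on physical memory overall or has simply filled its cache allocation.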
When the Low Watermark is crossed, expired items are evicted from the cache.
When the High Watermark is crossed, items are evicted from the cache whether or not they have expired, until memory usage falls back below the Low Watermark. This may cause an application perf hit, as even non-expired items must now be retrieved from the authoritative source.
Eviction still must be enabled in any case. If eviction is not enabled, the cache will fill until 4% of available memory is left, then will begin throttling further Write operations.
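The two-watermark behaviour can be sketched as a small Python model. It assumes LRU eviction and that forced eviction stops once usage is back at or below the low watermark; the `Item` type, the `run_eviction` function, and the 70%/90% watermark values in the example are hypothetical illustrations, not AppFabric's implementation or confirmed defaults.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Item:
    size: int          # size in arbitrary units
    expires_at: float  # time after which the item is stale
    last_used: float   # used to find the least recently used item


def used_fraction(items: Dict[str, Item], capacity: int) -> float:
    return sum(item.size for item in items.values()) / capacity


def run_eviction(items: Dict[str, Item], capacity: int,
                 low: float, high: float, now: float) -> None:
    """Mutates `items` according to the two-watermark rule described above."""
    crossed_high = used_fraction(items, capacity) > high
    # Above the low watermark: remove items that have already expired.
    if used_fraction(items, capacity) > low:
        for key in [k for k, item in items.items() if item.expires_at <= now]:
            del items[key]
    # Above the high watermark: evict least recently used items, expired or not,
    # until usage is back at or below the low watermark.
    if crossed_high:
        for key in sorted(items, key=lambda k: items[k].last_used):
            if used_fraction(items, capacity) <= low:
                break
            del items[key]


# Hypothetical host: capacity 100 units, low watermark 70%, high watermark 90%.
host = {
    "a": Item(size=30, expires_at=100.0, last_used=1.0),
    "b": Item(size=30, expires_at=100.0, last_used=2.0),
    "c": Item(size=35, expires_at=100.0, last_used=3.0),
}
run_eviction(host, capacity=100, low=0.7, high=0.9, now=10.0)
```

In the example nothing has expired, but usage starts at 95%, so the least recently used item ("a") is evicted to bring usage down to 65%. Below the low watermark, an expired item is left in place, which matches the note that stale items are not necessarily removed.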
The LMT is a specific region where host loss is critical. This serves the Newsfeed, so your activity feed would be empty while it rebuilds. Stop the cache host before restarting. Use Stop-SPDistributedCacheServiceInstance -Graceful to move cache items to another server in the cluster; however, you must make sure you have enough headroom on the other servers to hold the extra data. This is also recommended for patching scenarios: graceful shutdown, patch, startup. http://msdn.microsoft.com/en-us/library/hh351393(v=azure.10).aspx
Remove-SPDistributedCacheServiceInstance stops the service instance (not gracefully), then removes the host from the cluster. It removes all cluster-related elements from the local server as well.
Need to verify this: If the server is being restarted and will remain in the cache cluster, use Stop and Start. If the server is being permanently removed from the cache cluster, use Add and Remove.
We do not allow removal of the last Cache Host in the farm.
Both New-SPConfigurationDatabase and Connect-SPConfigurationDatabase have the SkipRegisterAsDistributedCacheHost parameter to prevent provisioning the Distributed Cache Service on a specific host.
Different CacheClient settings, like timeout values.
Clear-SPDistributedCacheItem is for manual eviction of cache items.
Event Logs: System Services is specific to AppFabric Caching. Applications covers all WCF services.
Config file location is "C:\Program Files\AppFabric 1.1 for Windows Server"
PS commands are strictly for those PS sessions that are helpful with troubleshooting and diagnosing problems with AppFabric.
For cluster cmdlets, Provider and ConnectionString must be supplied. They're available at HKLM:\SOFTWARE\Microsoft\AppFabric\V1.0\Configuration
Provider is SPDistributedCacheProvider and ConnectionString is a SQL connection string for the configuration database.
Export-AFCacheClusterConfiguration is useful for exporting an XML configuration file.
For a list of recommended counters: http://msdn.microsoft.com/en-us/library/gg186017.aspx
There are SharePoint-specific counters as well: SharePoint Server Feed Cache.
A lot of eviction runs could mean you need to scale out or up, depending on your current configuration.
You can track performance metrics about the distributed cache service by using performance monitor. There are literally hundreds of counters so that you can dig into the performance for the cache as a whole, down to individual cache servers or even named caches. In addition to that you can also get cache usage information from the new Developer’s Dashboard in SharePoint 2013.
Finally, there is a set of health rules specifically for the distributed cache service. I've listed them here along with the section in which the rule is located. You can review each rule in Central Admin and set alerts for them as well.
That concludes our look at the new distributed cache service in SharePoint 2013.