10. Metadata
    Tracks state and changes
    Versions: creation, update
    Each version: tick count, replica ID
    Knowledge: enumerates changes, detects conflicts
    Tombstones
11. Synchronization Algorithm
    1. Session gets destination replica
    2. Session sends destination replica to source provider
    3. Source provider determines change set
    4. Source provider sends change set to session
    5. Session applies change set to destination
    Diagram labels: Session, Source Provider, Destination Provider
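The five steps on slide 11 can be sketched in a few lines of Python. This is an illustrative model only, not the Sync Framework API: the `Replica` class, the function names, and the representation of knowledge as a "replica ID to highest tick seen" dictionary are all assumptions made for the example. It also assumes that what the session passes to the source provider is the destination's knowledge, and it ignores conflicts and deletions.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical in-memory replica: items plus per-item versions."""
    replica_id: str
    tick: int = 0
    items: dict = field(default_factory=dict)      # item_id -> value
    versions: dict = field(default_factory=dict)   # item_id -> (replica_id, tick)
    knowledge: dict = field(default_factory=dict)  # replica_id -> highest tick seen

    def write(self, item_id, value):
        """Local change: stamp the item with this replica's next tick."""
        self.tick += 1
        self.items[item_id] = value
        self.versions[item_id] = (self.replica_id, self.tick)
        self.knowledge[self.replica_id] = self.tick


def determine_change_set(source, destination_knowledge):
    """Source provider: select items whose version the destination has not seen."""
    return [
        (item_id, source.items[item_id], version)
        for item_id, version in source.versions.items()
        if destination_knowledge.get(version[0], 0) < version[1]
    ]


def apply_change_set(destination, change_set, source_knowledge):
    """Session: apply each change, then merge the source's knowledge."""
    for item_id, value, version in change_set:
        destination.items[item_id] = value
        destination.versions[item_id] = version
    for origin, tick in source_knowledge.items():
        destination.knowledge[origin] = max(destination.knowledge.get(origin, 0), tick)


def synchronize(source, destination):
    """One-way session from source to destination; returns the number of changes."""
    destination_knowledge = dict(destination.knowledge)              # steps 1-2
    change_set = determine_change_set(source, destination_knowledge)  # steps 3-4
    apply_change_set(destination, change_set, source.knowledge)      # step 5
    return len(change_set)
```

Running `synchronize(a, b)` twice in a row transfers the changes once and then nothing, because after the first pass the destination's knowledge already covers every version the source holds.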
17. Resources
    Getting started: http://msdn.microsoft.com/ru-ru/library/bb902854.aspx
    MS Sync Framework 2.0 Redistributable Package: http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=18241
    Code Gallery: http://code.msdn.microsoft.com/site/search?f%5B0%5D.Type=SearchText&f%5B0%5D.Value=Microsoft%20Sync%20Framework&f%5B1%5D.Type=Affiliation&f%5B1%5D.Value=Official&f%5B1%5D.Text=Microsoft
    Excellent intro: http://msdn.microsoft.com/en-us/sync/bb821992
    MS Sync with SQL Azure: http://www.microsoft.com/windowsazure/features/database/#data-sync
    Blog of Liam Cavanagh: http://blogs.msdn.com/b/sync/archive/2010/06/07/introducing-data-sync-service-for-sql-azure.aspx
    Email: akoval@codemastersintl.com
    Blog: http://www.codemastersintl.com/Blogs/Alexander-Koval
Editor's Notes
Based on the capabilities of the device, the way that a provider integrates synchronization will vary. At the very least, we will assume that the device is capable of programmatically returning information when requested. Ultimately, what needs to be determined is whether the device can:
* Enable information to be stored and manipulated either on the existing device or within the current data store, and
* Allow applications (in our case a synchronization provider) to be executed directly from the device.

It is important to distinguish the types of participants that will be part of the synchronization ecosystem, because this tells us whether they will be able to store the state information required by the provider and whether we are able to execute the provider directly from the device. Ultimately, the participant model is meant to be generic. As such, a full participant could be configured to be either a partial or simple participant.
A fundamental component of a provider is the ability to store information about the data store and the objects within that data store with respect to state and change information. Metadata can be stored in a file, within a database, or within the data source being synchronized. As an optional convenience, Sync Framework offers a complete implementation of a metadata store built on a lightweight database that runs in your process. The metadata for a data store can be broken down into five key components:
* Versions
* Knowledge
* Tick count
* Replica ID
* Tombstones

For each item that is being synchronized, a small amount of information is stored that describes where and when the item was changed. This metadata is composed of two versions: a creation version and an update version. A version is composed of two components: a tick count assigned by the data store and the replica ID for the data store. As items are updated, the current tick count is applied to that item and the tick count is incremented by the data store. The replica ID is a unique value that identifies a particular data store. The creation version is the same as the update version when the item is created. Subsequent updates to the item modify the update version.

The two primary ways that versioning can be implemented are:
* Inline tracking: Change tracking information for an item is updated as the change is made. In the case of a database, for example, a trigger may be used to update a change tracking table immediately after a row is updated.
* Asynchronous tracking: An external process runs and scans for changes. Any updates found are added to the version information. This process may run on a schedule or be executed prior to synchronization. It is typically used when there are no internal mechanisms to automatically update version information when items are updated (such as when there is no way to inject logic into the update pipeline). A common way to check for changes is to store the state of an item and compare it to its current state. For example, the process might check whether the last-write-time or file size has changed since the last update.

All change tracking must occur at least at the level of items. In other words, every item must have an independent version. In the case of a database, an item might be the entire row within a table. Alternatively, an item might be a column within a row of a table. In the case of file synchronization, an item will likely be the file. More granular tracking is highly desirable in some scenarios, as it reduces the potential for data conflicts (two users updating the same item on different replicas). The downside is that it increases the amount of change-tracking information stored.

Another key concept is the notion of knowledge. Knowledge is a compact representation of the changes that the replica is aware of. As version information is updated, so is the knowledge for the data store. Providers use replica knowledge to:
* Enumerate changes (determine which changes another replica is not aware of).
* Detect conflicts (determine which operations were made without knowledge of each other).

Each replica must also maintain tombstone information for each of the items that are deleted. This is important because, once an item is gone, a provider without a tombstone has no way of telling during synchronization that the item was deleted, and so cannot propagate the change to other providers. A tombstone must contain the following information:
* Global ID.
* Deletion version.
* Creation version.

Because the number of tombstones will grow over time, it may be prudent to create a process to clean up this store after a period of time in order to save space. Support for managing tombstone information is provided with Sync Framework.
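To make the relationship between versions, knowledge, and tombstones concrete, here is a minimal Python sketch. It is not the Sync Framework metadata store: `Version`, `ItemMetadata`, `MetadataStore`, and `is_conflict` are hypothetical names, knowledge is simplified to a "replica ID to highest tick seen" mapping, and a tombstone is modeled as a flag on the item's metadata whose update version then serves as the deletion version.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Version:
    """A version is a tick count plus the ID of the replica that assigned it."""
    replica_id: str
    tick: int


@dataclass
class ItemMetadata:
    global_id: str
    creation_version: Version
    update_version: Version      # acts as the deletion version for a tombstone
    is_tombstone: bool = False


class MetadataStore:
    """Hypothetical per-replica metadata store using inline tracking."""

    def __init__(self, replica_id: str):
        self.replica_id = replica_id
        self.tick = 0
        self.items: Dict[str, ItemMetadata] = {}
        self.knowledge: Dict[str, int] = {}  # replica_id -> highest tick seen

    def _next_version(self) -> Version:
        # Increment the tick count and record it in this replica's knowledge.
        self.tick += 1
        self.knowledge[self.replica_id] = self.tick
        return Version(self.replica_id, self.tick)

    def record_create(self, global_id: str) -> None:
        version = self._next_version()
        # On creation, the creation and update versions are identical.
        self.items[global_id] = ItemMetadata(global_id, version, version)

    def record_update(self, global_id: str) -> None:
        self.items[global_id].update_version = self._next_version()

    def record_delete(self, global_id: str) -> None:
        # Keep the metadata as a tombstone so the deletion can be propagated.
        meta = self.items[global_id]
        meta.update_version = self._next_version()
        meta.is_tombstone = True

    def contains(self, version: Version) -> bool:
        """True if this replica's knowledge already covers the version."""
        return self.knowledge.get(version.replica_id, 0) >= version.tick


def is_conflict(local: MetadataStore, remote: MetadataStore, global_id: str) -> bool:
    """Conflict: each side changed the item without knowing the other's change."""
    local_version = local.items[global_id].update_version
    remote_version = remote.items[global_id].update_version
    return not local.contains(remote_version) and not remote.contains(local_version)
```

In this model, an update on one replica that the other has not yet seen is merely a pending change; it becomes a conflict only when both replicas hold update versions that the other's knowledge does not cover.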
Sync Framework uses metadata that includes all the information that is required to perform synchronization. The metadata is small and efficient, and Sync Framework provides components that handle many of the tasks that involve metadata. The use of metadata keeps synchronization data type agnostic and helps balance freedom, interoperability, and simplicity. The following table lists and describes some metadata benefits that Sync Framework provides.

Benefit: Description
Concise: Metadata is concise because it has no per-item version vectors, and still is enough for single- and multi-master synchronization.
Efficient: Metadata is efficient because it uses minimal change enumeration, even in loops.
Precise: Sync Framework uses precise conflict detection, without under- or over-detection and no over-sending of changes. This applies to both unstructured data such as files, and structured data such as detailed change tracking.
Flexible: Users can use any store and any technique for storing metadata, can add verbs to their own protocols, and can use their own techniques to optimize synchronization operations.
Interoperable: Metadata is agreed upon. Therefore, arbitrary topologies can be supported.
Easy: Sync Framework provides a standard toolkit. This handles many of the complexities of multimaster synchronization. The toolkit can also be customized to enable users to make changes to obtain even better performance.
Useful: Sync Framework manages as much metadata as is required. For example, an application can decide to handle only timestamps and to let Sync Framework handle versions, knowledge, and metadata storage. The application does not have to track deletions because Sync Framework computes them from a list. And the application does not have to track changes because Sync Framework computes them from hashes. Also, if it is necessary, Sync Framework can provide full multimaster support for legacy stores.
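The "Useful" row mentions computing changes from hashes and deletions from a list. The general idea can be illustrated with a short Python sketch. How Sync Framework does this internally is not described here, so the `snapshot` and `detect_changes` helpers below are hypothetical and show only the technique: keep a digest per item from the previous scan, then compare it with the current scan.

```python
import hashlib


def snapshot(items):
    """Map each item ID to a SHA-256 digest of its content (bytes)."""
    return {
        item_id: hashlib.sha256(content).hexdigest()
        for item_id, content in items.items()
    }


def detect_changes(previous, current):
    """Compare two snapshots and return (created, updated, deleted) item IDs."""
    created = sorted(current.keys() - previous.keys())
    deleted = sorted(previous.keys() - current.keys())
    updated = sorted(
        item_id
        for item_id in current.keys() & previous.keys()
        if current[item_id] != previous[item_id]
    )
    return created, updated, deleted
```

For example, comparing a snapshot of items `a`, `b`, `c` with a later snapshot in which `b` has new content, `c` is gone, and `d` is new reports `d` as created, `b` as updated, and `c` as deleted, without the application having tracked any of those changes itself.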