msf: sync your data on-premises and to the cloud - dotnetwork gathering, oct 2010
Sync Framework: Synchronize Your Data On-Premises and to the Cloud
Sameh Samir
Senior Software Engineer
Architecture and Infrastructure Team
MedStreaming LLC
What We Will Talk About
• Brief on Microsoft Sync Framework
• Why I’d Need Synchronization
• Synchronization Ecosystem: The Concert
• Framework Components
• Responsibilities
• Participants
• Application Scenarios: Offline
• Application Scenarios: Collaboration
• How It Works
• Change Tracking
• Conflict Resolution
• Concepts
• Sync Scenarios: On-Premises Two Tier Architecture
• Demo: Synchronizing Data - 2-Tier Architecture
• Sync Scenarios: In the cloud – N Tier Architecture
• Demo: Synchronizing Data : N-Tier Architecture
• Choosing Primary Keys
• Tracing
• Demo: Sync with SQL Azure
Brief on Microsoft Sync Framework
• Microsoft data synchronization platform
• Enables collaboration and occasionally connected (offline) application scenarios
• Announced at MIX 2008
• August 2008 – V1.0
• April 2009 – V2.0
• August 2010 – V2.1
• Q1 2011 – V3.0 (Expected)
Why I’d Need Synchronization?
• Offline Availability
  • Lack of offline availability may be frustrating for some users, but it can be a disaster for others (retail store POS, medical systems)
• Access to Full Client Capabilities
  • Hardware-intensive applications (imaging, medical, 3D, media processing, POS stations, etc.)
• User Experience
  • Asynchronous processing improves usability, but you still have to wait
  • Cache management becomes a headache if you cache everything
• Mobility
  • Demand for mobile accessibility is increasing
  • Mobile accessibility is a must for some businesses
  • Mobile internet is still not cheap
Qualities of MSF
• Ease of use
• High Level of Customization
• Data and Transport Agnostic Sync Functionality
• Built-in Providers
• Extensibility
• Custom Providers Framework
Synchronization Ecosystem: The Concert
[Diagram: a Sync Application and the Sync Orchestrator sit between two Sync Providers, each attached to a Data Store; changes flow between the data stores through the providers. The Sync Runtime offers Metadata Interpretation, Tools, Provider Services, and an MD Store.]
Framework Components
[Diagram labels]
• Basic Building Blocks: Sync Runtime (Orchestration), Knowledge, Version, Change Enumeration, Conflict Detection, Metadata Storage Service
• Built-In Providers: Anchor based Providers, Simple Providers, Full Enumeration Providers, SQL Sync Provider, SQL CE Sync Provider, File Sync Provider, Feed Sync Provider, Db Sync Provider
• End to End Solutions: IDE Integration, ADO Sync Services, Sync for OData
• Other MS & 3rd Party Providers / Solutions
Responsibilities
Developer:
• The application
• The data store
• The data transfer protocol
Sync Framework:
• Synchronization session, or manager
• The synchronization runtime
Sync Framework, or the Developer:
• The sync provider
• The metadata store
Participants
• Full Participants: Devices that allow developers to create applications and new data stores directly on the device. E.g. Windows Phone, laptop
• Partial Participants: Devices that have the ability to store data either in the existing data store or another data store on the device but do not have the ability to launch executables. E.g. thumb drives or SD Cards.
• Simple Participants: Devices that are only capable of providing information when requested. These devices cannot store or manipulate new data. E.g. RSS Feeds and web services.
Application Scenarios: Offline
• All clients sync through a single hub (Server)
• Suitable for Occasionally Connected Applications (OCA)
• Single point of failure
• The most common, and easier to implement
Application Scenarios: Collaboration
• Suitable for applications where users need to share data (e.g. notes, documents, calendars, project info)
• Each client can sync with other clients or with a central server
• Avoid single point of failure
• Offload the sync processing from server to clients, and thus provide more scalability
• Less common and more complex to implement.
How It Works: Enumeration
[Diagram: Sync Orchestrator, Provider Framework with Runtime, Sync Provider, Data Store, Metadata Store]
1. The Sync Orchestrator calls GetChangeBatch.
2. Is the metadata up to date? If not, bring the metadata up to date:
  • Enumerate all objects. "Here's one: Id='foo', LMT=5pm" (LMT = last modified time)
  • What was it last time? New, Updated, or Same
  • Update metadata
  • What's missing? Record deletes
3. Metadata is up to date! Enumerate changes.
4. All done!
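The enumeration pass on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not Sync Framework code: the `Change` type, the `enumerate_changes` function, and the use of plain dictionaries for the data store and metadata store are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    item_id: str
    kind: str  # "new", "updated", or "deleted"


def enumerate_changes(store: dict, metadata: dict) -> list:
    """Full-enumeration change detection (hypothetical helper).

    store:    item id -> last modified time (LMT) currently in the data store
    metadata: item id -> LMT seen at the previous pass (updated in place)
    """
    changes = []
    for item_id, lmt in store.items():
        if item_id not in metadata:
            changes.append(Change(item_id, "new"))
        elif metadata[item_id] != lmt:
            changes.append(Change(item_id, "updated"))
        # Same LMT as last time: nothing to report.
        metadata[item_id] = lmt  # "Update metadata"

    # "What's missing?": ids the metadata knows about but the store no longer has.
    for item_id in [i for i in metadata if i not in store]:
        changes.append(Change(item_id, "deleted"))
        del metadata[item_id]
    return changes


store = {"foo": "5pm", "baz": "2pm"}
metadata = {"foo": "1pm", "old": "9am"}
changes = enumerate_changes(store, metadata)
for change in changes:
    print(change)
```

The sketch simply drops the metadata entry for a deleted item; a real provider would more likely keep a tombstone so the delete can still be reported to replicas that sync later.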
How It Works (Cont.): Applying Changes
[Diagram: Sync Orchestrator, Provider Framework with Runtime, Sync Provider, Data Store, Metadata Store]
1. The Sync Orchestrator calls ProcessChangeBatch.
2. The metadata is brought up to date with the same pass used for enumeration: enumerate all objects, compare each with what it was last time (New, Updated, Same), update metadata, record deletes.
3. Get versions.
4. Update item id='foo': LMT was 1pm, new data is 'bar', new LMT=8pm.
5. Check LMT and write.
6. Update metadata.
7. All done!
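The "check LMT and write" step can be sketched as an optimistic check before the write. This is an illustrative Python sketch under simplifying assumptions, not the framework's implementation: `apply_change` is a hypothetical helper, and the data store and metadata store are plain dictionaries.

```python
def apply_change(store, metadata, item_id, new_data, new_lmt):
    """Apply one incoming change unless the item changed locally (hypothetical helper).

    store:    item id -> (data, LMT)
    metadata: item id -> LMT recorded when the metadata was last brought up to date
    Returns True if the change was written, False if a local edit was detected.
    """
    current = store.get(item_id)
    known_lmt = metadata.get(item_id)
    # "Check LMT and write": the stored LMT must still match the recorded one.
    if current is not None and current[1] != known_lmt:
        return False
    store[item_id] = (new_data, new_lmt)
    metadata[item_id] = new_lmt  # "Update metadata"
    return True


# The slide's example: LMT was 1pm, new data is 'bar', new LMT is 8pm.
store = {"foo": ("old", "1pm")}
metadata = {"foo": "1pm"}
applied = apply_change(store, metadata, "foo", "bar", "8pm")
after_apply = store["foo"]

# A local edit that the metadata has not seen yet blocks the next write.
store["foo"] = ("local edit", "9pm")
rejected = apply_change(store, metadata, "foo", "baz", "10pm")
print(applied, after_apply, rejected)
```

Returning False here only signals that the item changed underneath the provider; what happens next (re-enumerate, raise a conflict) is left out of the sketch.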
Change Tracking
• Change tracking provides a list of changes made from one point in time to another.
• Commonly implemented using rowversions and triggers, plus a “deleted” table
• The major disadvantages are:
  • Schema changes are required to add columns and tables
  • Triggers fire for each change made, which has performance implications
• SQL Server 2008 has built-in change tracking, implemented without rowversions and triggers
• The Sync Framework database sync providers take advantage of SQL Server 2008 change tracking, which provides the following advantages:
  • No schema changes are required
  • Triggers are not required for tracking changes
  • All of the logic for tracking changes is internal to the SQL Server engine
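To make the trigger-based approach concrete, here is a minimal sketch using Python's built-in `sqlite3` module in place of SQL Server. The table names (`customer`, `customer_deleted`, `sync_clock`), the `row_version` column, and the `changes_since` helper are all invented for the illustration; the sketch shows the extra schema and per-row triggers that the slide lists as disadvantages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, row_version INTEGER DEFAULT 0);
CREATE TABLE customer_deleted (id INTEGER PRIMARY KEY, row_version INTEGER);  -- the "deleted" table
CREATE TABLE sync_clock (version INTEGER);  -- stand-in for a database-wide rowversion
INSERT INTO sync_clock VALUES (0);

CREATE TRIGGER customer_ins AFTER INSERT ON customer BEGIN
    UPDATE sync_clock SET version = version + 1;
    UPDATE customer SET row_version = (SELECT version FROM sync_clock) WHERE id = NEW.id;
END;
CREATE TRIGGER customer_upd AFTER UPDATE OF name ON customer BEGIN
    UPDATE sync_clock SET version = version + 1;
    UPDATE customer SET row_version = (SELECT version FROM sync_clock) WHERE id = NEW.id;
END;
CREATE TRIGGER customer_del AFTER DELETE ON customer BEGIN
    UPDATE sync_clock SET version = version + 1;
    INSERT OR REPLACE INTO customer_deleted VALUES (OLD.id, (SELECT version FROM sync_clock));
END;
""")


def changes_since(anchor):
    """Return (upserted rows, deleted ids, new anchor) for changes after `anchor`."""
    upserts = conn.execute(
        "SELECT id, name FROM customer WHERE row_version > ? ORDER BY id", (anchor,)
    ).fetchall()
    deletes = [row[0] for row in conn.execute(
        "SELECT id FROM customer_deleted WHERE row_version > ? ORDER BY id", (anchor,)
    )]
    new_anchor = conn.execute("SELECT version FROM sync_clock").fetchone()[0]
    return upserts, deletes, new_anchor


conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Ann')")
conn.execute("INSERT INTO customer (id, name) VALUES (2, 'Bob')")
_, _, anchor = changes_since(0)  # first sync: remember where we stopped

conn.execute("UPDATE customer SET name = 'Bobby' WHERE id = 2")
conn.execute("DELETE FROM customer WHERE id = 1")
upserts, deletes, next_anchor = changes_since(anchor)
print(upserts, deletes, next_anchor)
```

With built-in change tracking, the extra column, the tombstone table, and the three triggers would not be needed, which is the point of the comparison on this slide.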
Conflict Resolution
• Conflicts occur when two or more databases make a change to the same piece of data
• There are a variety of ways to resolve these conflicts:
  • Last change to come in wins
  • Highest priority user wins
  • Manual selection
• Sync Framework provides conflict detection and resolution capabilities out of the box
• SQL Server 2008 makes it easier to identify conflicts.
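The resolution policies listed above can be sketched as interchangeable functions. This is a generic illustration in Python, not the Sync Framework conflict API: the `Version` type, the logical timestamps, and the `resolve` helper are assumptions for the example. Manual selection would be one more policy function that asks the user to pick.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    value: str
    modified_at: int    # logical timestamp of the change
    user_priority: int  # higher number = higher priority user


def last_in_wins(local, incoming):
    # "Last change to come in wins": the incoming change overwrites the local one.
    return incoming


def highest_priority_wins(local, incoming):
    return incoming if incoming.user_priority > local.user_priority else local


def resolve(last_sync, local, incoming, policy):
    """Return (winning version, conflict detected?).

    A conflict exists when both sides changed the item after the last sync.
    """
    local_changed = local.modified_at > last_sync
    incoming_changed = incoming.modified_at > last_sync
    if local_changed and incoming_changed:
        return policy(local, incoming), True
    return (incoming if incoming_changed else local), False


local = Version("blue", modified_at=5, user_priority=2)
incoming = Version("green", modified_at=4, user_priority=1)

by_arrival = resolve(3, local, incoming, last_in_wins)
by_priority = resolve(3, local, incoming, highest_priority_wins)
print(by_arrival[0].value, by_priority[0].value)
```

Keeping detection (`resolve`) separate from the policy makes it straightforward to switch policies per table or per application without touching the detection logic.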
Concepts
• Sync Scope:
  • Set of tables that will be available for synchronization
• Sync Group:
  • Group of tables that must be synchronized as a single unit (transaction)
  • Ensures data consistency
• Provisioning a Server
  • Gets the server ready for change tracking
  • Add change tracking columns and triggers for SQL Server 2005
  • Enable the change tracking feature for a set of tables in a SQL Server 2008 database
  • Can be done programmatically or through the “Configure Data Synchronization” wizard
Sync Scenarios: On-Premises (Two-Tier Architecture)
[Diagram: Client and Data Server. A Sync Application and Sync Orchestrator work with two Sync Providers, each attached to its own Data Store.]
DEMO: Synchronizing Data - 2-Tier Architecture
Sync Scenarios: In the Cloud (N-Tier Architecture)
[Diagram: Client and Data Server with a Proxy. A Sync Application and Sync Orchestrator work with two Sync Providers, each attached to its own Data Store.]
DEMO: Synchronizing Data - N-Tier Architecture
Table Key Selection
Take it seriously, or else
Table Key Selection: The Problem
[Diagram: Client 1 and Client 2 both hold Customer 1 … Customer 100. Each client then adds its own row with key 101. When the two "Customer 101" rows meet, the result is a Duplicate Key Conflict.]
Table Key Selection : Solutions
1. Use GUIDs instead of auto-incremented IDs
  • Solves the primary key collisions possible with auto-increment columns
  • Increased index size leads to increased query time
  • Causes a fragmented clustered index, which also affects query processing time
  • Fragmentation can be reduced in SQL Server by using the NEWSEQUENTIALID function to generate ordered GUIDs
2. ID Ranges
  • Split the available IDs into segments
  • Assign each client a unique segment
  • Clients can ask for more ID ranges
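The first two options can be sketched in Python as follows. The `IdRangeAllocator` and `Client` classes are hypothetical names for this illustration, and the tiny block size is chosen only to show a client running out of IDs and asking for another range; `uuid.uuid4()` stands in for a client-generated GUID.

```python
import uuid


class IdRangeAllocator:
    """Server-side allocator that hands out non-overlapping ID blocks (hypothetical)."""

    def __init__(self, start=1, block_size=1000):
        self._next = start
        self._block_size = block_size

    def allocate(self):
        block = range(self._next, self._next + self._block_size)
        self._next += self._block_size
        return block


class Client:
    """Offline client that draws new keys from the block it was assigned."""

    def __init__(self, allocator):
        self._allocator = allocator
        self._ids = iter(())  # no block assigned yet

    def next_id(self):
        try:
            return next(self._ids)
        except StopIteration:
            # Block exhausted: ask the server for another range.
            self._ids = iter(self._allocator.allocate())
            return next(self._ids)


# Option 1: a GUID needs no coordination between clients.
guid_key = uuid.uuid4()

# Option 2: both clients already hold customers 1..100, so new IDs start at 101.
allocator = IdRangeAllocator(start=101, block_size=2)
client1, client2 = Client(allocator), Client(allocator)
ids = [client1.next_id(), client2.next_id(), client1.next_id(),
       client2.next_id(), client1.next_id()]
print(guid_key, ids)
```

Note the trade-off in the sketch: asking for a new range requires a connection to the server, so in practice the block size would need to be large enough to cover the longest expected offline period.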
Table Key Selection : Solutions (Cont.)
3. Compound Keys
• Use compound key that includes a client identifier
4. Use Business Key as ID
• Use unique business keys (e.g. national ID number / SSN / barcode)
• May affect the query performance if key type is not numerical.
5. Online Insert
• Insert directly to the server
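For option 3, a compound key can be sketched as a (client identifier, local counter) pair. The `ClientKeyGenerator` class and the client names below are made up for the illustration; the point is only that two clients can both use the local number 101 without colliding on the server.

```python
from itertools import count


class ClientKeyGenerator:
    """Builds compound keys of the form (client id, local sequence number)."""

    def __init__(self, client_id, start=101):
        self.client_id = client_id
        self._counter = count(start)

    def next_key(self):
        return (self.client_id, next(self._counter))


client1 = ClientKeyGenerator("client-1")
client2 = ClientKeyGenerator("client-2")

# Both clients insert "their" customer 101 while offline, then sync to the server.
server_rows = {}
server_rows[client1.next_key()] = "Customer added on client 1"
server_rows[client2.next_key()] = "Customer added on client 2"
print(sorted(server_rows))
```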
DEMO: Enable Tracing
DEMO: Sync to SQL Azure
Q&A
Call To Action
Azure Table Sync Library (azuretablesynclib.codeplex.com)
An open source project that aims to create custom data sync providers for the following sync scenarios:
1. Azure Table Storage <-> SQL Server / Express
2. Azure Table Storage <-> SQL CE
3. Azure Table Storage <-> SQL Azure
4. Azure Table Storage <-> Azure Table Storage
Keep in Touch
Email: [email protected]
Blog: www.Cloudy-Ideas.net / www.sameh-samir.net
Twitter: twitter.com/sameh_samir
LinkedIn: linkedin/in/samehsamir