The most important thing for any organization is DATA. There can be hundreds of front-end applications that use the same data for different purposes. Data plays an important role in any CMS application. This presentation covers different viewpoints on migrating data from an external database to Sitecore CMS.
Using these guidelines, we were able to migrate over 500,000 records into Sitecore.
2. Agenda
Introduction
Questions for Client
Data Mapping
Data Representation and Mapping
Sources of Data
Data Migration Path
Why SQL Server
Guide for content migration resources and team
Images and media
Sitecore Items
Sitecore Fields
Migration Code
Log Files
Code Techniques
Type of resources
Testing
Takeaway
Friday, July 15, 2016
4. Questions for client
Is it a one-time activity?
In how many batches will the data be provided?
The data must be in the same format in all batches
The client can correct data that deviates from the format
Frequent data format changes require code changes
Is there any manual migration involved?
State these instructions to the client clearly by mail, on calls, in the SOW, etc.
5. Data Mapping
Three things are required: source, destination, and mapping
6. Data Representation and Mapping
Same data, different representation.
WWW + Customization = Where to Where and What + Data Representation
Source | Destination | Level | Example
Single field | Single field | Easy | Database Lastname field to Sitecore Lastname field (Kalam => Kalam)
Multiple fields | Single field | Medium | Database Prefix, Firstname, and Lastname fields to Sitecore Fullname field (Dr., Abdul, Kalam => Dr. Abdul Kalam)
Single field | Multiple fields | Difficult | Database Fullname field to Sitecore Prefix, Firstname, and Lastname fields (Dr. Abdul Kalam => Dr., Abdul, Kalam; Avul Pakir Jainulabdeen Abdul Kalam => ?)
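The medium and difficult mappings can be sketched in C# roughly as follows. This is only an illustration: the NameMapper class, its method names, and the prefix list are hypothetical, and real name data will need rules agreed with the client. The split deliberately reports failure for names it cannot divide cleanly, so such records can go to manual review.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NameMapper
{
    // Hypothetical prefix list; extend it to match the source data.
    private static readonly HashSet<string> Prefixes =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "Dr.", "Mr.", "Mrs.", "Ms.", "Prof." };

    // Medium case: several source fields joined into one destination field.
    public static string JoinName(string prefix, string firstName, string lastName)
    {
        var parts = new[] { prefix, firstName, lastName }
            .Where(p => !string.IsNullOrWhiteSpace(p))
            .Select(p => p.Trim());
        return string.Join(" ", parts);
    }

    // Difficult case: one source field split into several destination fields.
    // Returns false when the split is ambiguous; the out values should then be ignored.
    public static bool TrySplitName(string fullName, out string prefix, out string firstName, out string lastName)
    {
        prefix = firstName = lastName = string.Empty;
        if (string.IsNullOrWhiteSpace(fullName)) return false;

        var tokens = fullName.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).ToList();
        if (Prefixes.Contains(tokens[0]))
        {
            prefix = tokens[0];
            tokens.RemoveAt(0);
        }

        // Only exactly two remaining tokens map cleanly to first name + last name.
        if (tokens.Count != 2) return false;
        firstName = tokens[0];
        lastName = tokens[1];
        return true;
    }
}
```

With this sketch, "Dr. Abdul Kalam" splits into three fields, while "Avul Pakir Jainulabdeen Abdul Kalam" returns false and is left for a human decision.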
[Diagram: field mappings between Full Name and Prefix, First Name, Last Name (in both directions), and First Name to First Name]
7. Source of Data
Content may be migrated from multiple sources, with data in multiple languages.
Write code that handles multiple versions at the destination.
Write code that handles both multiple versions and multi-language data.
Different sources of data
Text files
Excel file
CSV file
XML file
JSON
Database (SQL Server, MySQL, etc.)
Others
Don’t write migration code that goes directly from text, Excel, or XML files to Sitecore.
8. Data Migration Map
Load: load the data from the source
Parse: parse the loaded data
Dump: insert the parsed data into the SQL Server database
Get data: get the data from SQL Server
Create/Update: create or update the item in Sitecore
This is the process for migrating data and loading it into Sitecore. The data can come from various sources, such as files or SQL Server. Regardless of where the data comes from, the general process is the same: load the data from the source, parse it, dump it into SQL Server, get the data back from SQL Server, and finally create or update items in Sitecore.
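The five steps can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the deck's implementation: the StagedRecord and MigrationPipeline names are hypothetical, the input is assumed to be simple "key,lastname" lines, an in-memory dictionary stands in for the SQL Server staging table, and the Sitecore write is passed in as a delegate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class StagedRecord
{
    public string Key { get; set; }
    public string LastName { get; set; }
}

public static class MigrationPipeline
{
    // Load + Parse: turn raw "key,lastname" lines into records.
    // Naive split: quoted fields containing commas are not handled.
    public static IEnumerable<StagedRecord> Parse(IEnumerable<string> lines)
    {
        foreach (var line in lines)
        {
            if (string.IsNullOrWhiteSpace(line)) continue;
            var cells = line.Split(',');
            if (cells.Length < 2) continue; // skip malformed rows
            yield return new StagedRecord { Key = cells[0].Trim(), LastName = cells[1].Trim() };
        }
    }

    // Dump: the dictionary stands in for a SQL Server staging table with a unique key.
    public static void Dump(IDictionary<string, StagedRecord> staging, IEnumerable<StagedRecord> records)
    {
        foreach (var record in records) staging[record.Key] = record;
    }

    // Get data + Create/Update: read from staging and hand each record to the caller's writer.
    // Returns the number of records processed.
    public static int CreateOrUpdate(IDictionary<string, StagedRecord> staging, Action<StagedRecord> writeToCms)
    {
        int count = 0;
        foreach (var record in staging.Values.OrderBy(r => r.Key, StringComparer.Ordinal))
        {
            writeToCms(record);
            count++;
        }
        return count;
    }
}
```

Keeping the source-specific parsing separate from the create/update step means a new source format only needs a new parser; the staging and Sitecore steps stay unchanged.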
9. Why SQL Server
A SQL database can be reused across environments such as QA, staging, and production.
Data is easy to filter with a SQL SELECT statement.
Records can be processed in SQL Server based on unique keys.
Drivers for formats such as Excel may not be available on staging or production servers.
Migrate once, use multiple times
If your database is publicly accessible over the Internet, you can access the data from any server or environment
These databases are easy to restore in any environment.
10. Guide for content migration resources and team
Able to work with minimum supervision
Passionate
Self-starter
Agile team
11. Images and Media
Use regex
Identify each image and download it
It is better if the client provides the images over FTP
Upload the image and create an image item in Sitecore
Replace links in the content to match the new website's links.
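A rough sketch of the regex step: find the src of each img tag, then rewrite it once the image has a new location. The ImageLinks class and its method names are hypothetical, and the pattern is a heuristic rather than a full HTML parser, so unusual markup (for example unquoted attributes) will be missed.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ImageLinks
{
    // Matches the quoted src attribute of <img> tags (single or double quotes).
    private static readonly Regex ImgSrc = new Regex(
        "<img\\b[^>]*?\\bsrc\\s*=\\s*[\"'](?<src>[^\"']+)[\"']",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Returns the distinct image URLs found in the content, in order of appearance.
    public static List<string> FindImageUrls(string html)
    {
        var urls = new List<string>();
        if (string.IsNullOrEmpty(html)) return urls;
        foreach (Match m in ImgSrc.Matches(html))
        {
            var src = m.Groups["src"].Value;
            if (!urls.Contains(src)) urls.Add(src);
        }
        return urls;
    }

    // Rewrites each matched src using the caller's map (old URL -> new URL).
    // URLs without a mapping are left unchanged.
    public static string ReplaceImageUrls(string html, IDictionary<string, string> map)
    {
        if (string.IsNullOrEmpty(html)) return html;
        return ImgSrc.Replace(html, m =>
        {
            var src = m.Groups["src"];
            string replacement;
            if (!map.TryGetValue(src.Value, out replacement)) return m.Value;
            int start = src.Index - m.Index;
            return m.Value.Substring(0, start) + replacement + m.Value.Substring(start + src.Length);
        });
    }
}
```

The map passed to ReplaceImageUrls would be built while uploading: each downloaded image's old URL is paired with the URL of the media item created for it.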
12. SITECORE ITEMS
Confirm whether each item should be created from a template or from a branch template. A developer may write item-creation code against a specific template when a branch template was intended.
Item naming regex - Use the same naming rules that Sitecore uses for items
Custom naming - Special characters can’t be part of an item name. Remove all special characters from the item name and keep only A-Z, 0-9, underscore (_), hyphen (-), etc.
An item name should not be truncated in the middle of a word or phrase
For example, create the item with the name “Welcome To New York” rather than “Welcome To New”.
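Limit the item name to 100 characters, cutting at a word boundary (itemName is assumed to be non-null):
string newItem = itemName.Trim();
bool truncated = newItem.Length > 100;  // Substring(0, 100) throws on shorter names
if (truncated) newItem = newItem.Substring(0, 100).TrimEnd();
if (truncated && newItem.LastIndexOf(' ') > 0) newItem = newItem.Substring(0, newItem.LastIndexOf(' '));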
The item name plays an important role in the URL and in SEO
For example, how should items for lawyers be named?
Lastname + firstname or firstname + lastname
Legacy URL: http://www.Oldsite.com/lara-craft ; new URL: http://www.newsite.com/craft-lara
With multiple languages, migrate all related details in the same language in which the main item is created.
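The naming rules above (strip special characters, cap the length, never cut inside a word) could be combined in one shared helper. This is a sketch only: the ItemNames class is hypothetical, and the allowed character set follows the slide rather than Sitecore's actual item-name validation setting, which should be checked for the target instance.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ItemNames
{
    public static string ToItemName(string raw, int maxLength = 100)
    {
        if (string.IsNullOrWhiteSpace(raw)) return string.Empty;

        // Keep letters, digits, underscore, hyphen and space; everything else becomes a space.
        string name = Regex.Replace(raw, @"[^A-Za-z0-9_\- ]", " ");
        // Collapse runs of spaces left behind by the replacement.
        name = Regex.Replace(name, @" +", " ").Trim();

        if (name.Length <= maxLength) return name;

        // If the cut would land inside a word, back up to the previous space.
        bool cutsWord = name[maxLength] != ' ';
        string cut = name.Substring(0, maxLength);
        if (cutsWord)
        {
            int lastSpace = cut.LastIndexOf(' ');
            // A single word longer than maxLength has no space, so it is cut hard.
            if (lastSpace > 0) cut = cut.Substring(0, lastSpace);
        }
        return cut.Trim();
    }
}
```

Calling ToItemName from every script keeps names consistent between batches, which matters because the item name ends up in the URL.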
13. Sitecore Fields
All droplink, dropdown, and image fields in Sitecore should be shared.
Keep the source GUID / ID and last-updated date in Sitecore so that the next batch can track and migrate only data that has changed.
Skip the Sitecore item when the GUID and last-updated date are the same.
Update the Sitecore item when the GUID is the same but the date has changed.
Create a Sitecore item for a new GUID.
This strategy saves processing time and avoids touching data that has not changed.
Keep an “Is Auto Migrated” checkbox in Sitecore.
Always tick this checkbox during the auto-migration process.
Keep an “Is Valid” checkbox.
This checkbox is ticked in every auto and manual migration.
Items left unticked are likely to be deleted.
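The skip / update / create rule reduces to a small decision function. In this sketch the SyncAction and SyncPlanner names are hypothetical, and the dictionary is assumed to have been filled beforehand with the GUID and last-updated date stored on the existing Sitecore items.

```csharp
using System;
using System.Collections.Generic;

public enum SyncAction { Skip, Update, Create }

public static class SyncPlanner
{
    // existing: source GUID -> last-updated date already stored on the Sitecore item.
    public static SyncAction Decide(IDictionary<Guid, DateTime> existing, Guid sourceId, DateTime sourceUpdated)
    {
        DateTime storedUpdated;
        // New GUID: the record has never been migrated.
        if (!existing.TryGetValue(sourceId, out storedUpdated)) return SyncAction.Create;
        // Same GUID: skip when the date matches, update when it differs.
        return storedUpdated == sourceUpdated ? SyncAction.Skip : SyncAction.Update;
    }
}
```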
14. Migration Code
Create a separate class library project for content migration.
A separate class library makes the code easier to maintain and easy to access.
Create one web page from which you can run the scripts.
This page should be password protected.
On this page, use radio buttons so that only one option can be selected, plus one button to run the migration script.
Keep all migration-related settings, including connection strings, in a separate config file located at Website\App_Config.
15. Log Files
Maintain a separate log file for each migration script.
Logs are handy for checking migration progress, especially on QA and staging environments.
Developers and testers refer to these log files for data verification.
Always keep a counter in the logs to count records.
Maintain logs in XML format for better results.
Use Notepad++ to analyze the log files.
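An XML log with a running record counter might look like the sketch below. The MigrationLog class and the element and attribute names are hypothetical; any structure that testers can search and count will serve.

```csharp
using System;
using System.Xml.Linq;

public sealed class MigrationLog
{
    private readonly XElement _root;
    private int _count;

    public MigrationLog(string scriptName)
    {
        _root = new XElement("migrationLog", new XAttribute("script", scriptName));
    }

    // Number of records logged so far.
    public int Count { get { return _count; } }

    // Adds one entry and increments the running record counter.
    public void Record(string sourceId, string action, string message)
    {
        _count++;
        _root.Add(new XElement("record",
            new XAttribute("n", _count),
            new XAttribute("sourceId", sourceId),
            new XAttribute("action", action),
            message));
    }

    // Returns a copy of the log with the total written on the root element.
    public XDocument ToDocument()
    {
        var copy = new XElement(_root);
        copy.SetAttributeValue("total", _count);
        return new XDocument(copy);
    }
}
```

One instance per migration script, saved with ToDocument().Save(path) at the end of the run, gives the separate per-script log file described above.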
16. Code Technique
To get child items, use the Item.Axes.GetDescendants() method instead of a fast query, as fast queries give unpredictable results.
Use LINQ as much as possible to reduce the number of lines of code.
Instead of getting an item through Sitecore.Context.Database.GetItem(), use Sitecore.Data.Database.GetDatabase("master").GetItem(). The context database gives you results from the web database, while the migration has to work with master database items.
Always check for null, blank, or empty values before reading or converting any value in code.
Use break in loops once the item you are looking for is found, since its processing is complete.
If you are going to create more than 100 items under one parent, create them in folders. Folders can follow a date hierarchy or alphabetical order.
Put the Sitecore item-fetch code in a common function and call it outside the loop. Calling it inside the loop slows the process down and delays it unnecessarily.
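For the folder rule, the folder name can be derived from the item itself before the item is created. The FolderBuckets class below is a hypothetical sketch that covers only the naming; creating the folder items in Sitecore is left to the migration code.

```csharp
using System;
using System.Globalization;

public static class FolderBuckets
{
    // Alphabetical order: one folder per first letter, plus "0-9" and "Other".
    public static string AlphabeticFolder(string itemName)
    {
        if (string.IsNullOrWhiteSpace(itemName)) return "Other";
        char first = char.ToUpperInvariant(itemName.Trim()[0]);
        if (first >= 'A' && first <= 'Z') return first.ToString();
        if (first >= '0' && first <= '9') return "0-9";
        return "Other";
    }

    // Date hierarchy: year folder, then a two-digit month folder, e.g. "2016/07".
    public static string DateFolder(DateTime date)
    {
        return string.Format(CultureInfo.InvariantCulture, "{0:D4}/{1:D2}", date.Year, date.Month);
    }
}
```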
17. Code Technique (continued)
Don’t write Sitecore fetch code inside the loop. The exception: if you are creating items in the same collection that the loop iterates over, fetch all the items again inside the loop.
For example: fetch all bios.
Write a common function for validating item names and image names, use the same function everywhere in the project, and upload each image under the name this function returns.
Write your script in an aspx page inside server tags so that you don’t need to deploy a DLL anywhere and the script does not affect the .NET / Sitecore process.
Sitecore itself uses this technique.
Take a backup of the Sitecore database before running any migration script.
18. Code Technique (continued)
Migration data may contain \n (newline) characters. Replace them with a <br> tag or wrap the text in <p></p> tags.
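Both options can be sketched as small helpers. The RichText class is hypothetical, and the sketch assumes the text does not need HTML encoding; if the source can contain characters such as < or &, encode it first.

```csharp
using System;
using System.Linq;

public static class RichText
{
    // Normalizes Windows and old Mac line endings to \n.
    private static string Normalize(string text)
    {
        return text.Replace("\r\n", "\n").Replace("\r", "\n");
    }

    // Option 1: replace every newline with a <br> tag.
    public static string NewlinesToBr(string text)
    {
        if (string.IsNullOrEmpty(text)) return string.Empty;
        return Normalize(text).Replace("\n", "<br>");
    }

    // Option 2: wrap every non-empty line in <p></p>.
    public static string NewlinesToParagraphs(string text)
    {
        if (string.IsNullOrEmpty(text)) return string.Empty;
        var paragraphs = Normalize(text).Split('\n')
            .Select(line => line.Trim())
            .Where(line => line.Length > 0)
            .Select(line => "<p>" + line + "</p>");
        return string.Concat(paragraphs);
    }
}
```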
19. Testing
Always write an automated script to test and validate data between the live site and the new site, as the testing team can’t test all the records manually. Auto-migrated data must be tested by automated tests.
Run your automated test scripts on an isolated server such as a CA server or QA server. Don’t disturb staging or any server the client is using. On servers the client accesses, run your scripts in off hours.
The auto-migration script must be code reviewed.
Ask probing questions and validate all client / vendor documents before starting data migration. Try to learn the classification and categories of the data.
Take a backup of the database before running the scripts.
Run your migration script for a few records / items first. Only once the testing team has reviewed these items should you run the script for all records.
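An automated check can be as simple as comparing values by record key and reporting every difference. In this sketch the MigrationValidator class is hypothetical, and both dictionaries are assumed to have been loaded already: one from the live source and one from the new Sitecore site.

```csharp
using System;
using System.Collections.Generic;

public static class MigrationValidator
{
    // Compares field values keyed by record ID and returns one message per mismatch.
    public static List<string> Compare(IDictionary<string, string> source, IDictionary<string, string> migrated)
    {
        var problems = new List<string>();
        foreach (var pair in source)
        {
            string actual;
            if (!migrated.TryGetValue(pair.Key, out actual))
                problems.Add("Missing in new site: " + pair.Key);
            else if (!string.Equals(pair.Value, actual, StringComparison.Ordinal))
                problems.Add("Value differs for: " + pair.Key);
        }
        // Records that exist only on the new site are reported too.
        foreach (var key in migrated.Keys)
        {
            if (!source.ContainsKey(key)) problems.Add("Unexpected in new site: " + key);
        }
        return problems;
    }
}
```

An empty result means the two sides agree for the compared field; running the same check per field covers the rest of the record.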
20. Takeaway
WWW + Customization = Where to Where and What + Data Representation
Data is everything for any organization.
Avoid update operations on data
Content migration is for developers who like to play with data,
DO YOU?
Training can only provide knowledge, while real-life scenarios give you experience. The client gives you experience, as his requirements are weird and unpredictable. The client doesn’t know about Sitecore; the PM knows about Sitecore and instructs you to do the work.
Testers can never be good friends with developers: developers never want to break their code, and testers do exactly the opposite. Testers find the issues that developers have to fix, and developers don’t like to change the code. It is rare for testers and developers on the same project to be good friends.
For testers: no bugs means no glory
This presentation and session are dedicated to clients, PMs, and testers.
Most important is content: data and information.
Ask the client questions
Is it a one-time activity? If the answer is yes, believe me, you are in the safe zone.
In how many batches will it be provided?
Strict instruction: we want data in the same format in all batches. Educate your client.
If it arrives in a different format, either the client has to correct it or the developers have to change the code, which they don’t like, and this creates conflict between the team and the manager.
Is there any manual migration?
Manual migration means hard work
Specify these instructions clearly to the client in mail and in the SOW
Developers’ questions for the manager on single source to multiple destinations
Able to work with minimum supervision -You will be the only one we blame when something goes wrong
Passionate - Perseveres through regular death marches in front of management
Self-starter - We have no process
Agile team - We have daily stand-ups
Give example: difference between 1. XML data and RT field.
2. XML data and RT HTML field.
The Imitation Game: it stars Benedict Cumberbatch as real-life British cryptanalyst Alan Turing, who decrypted German intelligence codes for the British government during World War II. The team is trying to break the ciphers created by the Enigma machine, which the Nazis used to secure their radio messages.