1. ComparX-R Senthil Sundaresan
Version 1
By
SenthilMurugan. S
senthil.boecp@gmail.com
This document describes how to use the ComparX tool. I hope it helps you understand fully what
the tool does.
The source code is hidden in the tool.
I started by writing a simple VBA macro to automate our comparison process [data from
legacy systems against data generated by distributed systems]. At one stage I decided to turn
it into a proper tool that would be helpful to other TCSers too.
The tool helps our teammates during all phases of our project, for COMPARISON as well as
for RECONCILIATION processes.
The tool has some limitations, which are listed later in this document and will be addressed
in future versions. Planned future enhancements are also listed.
Have a wonderful experience with ComparX.
For feedback:
senthil.boecp@gmail.com
HOME
ComparX-R
ComparX-R is a comparison tool.
It is built in Excel using VBA. You can compare two sets of data sequentially.
The comparison is done row by row, across all columns.
It contains 25 source code modules.
Application Developed In : EXCEL 2003
Language Used : VBA
Operability : Windows Operating System
Description
Contents of ComparX-R
Use
WYCWYG
Options
Layouts
Specific Features
Limitations
Difference between ComparX-R and ComparX-C
Future Enhancements
ComparX-R is a comparison tool that helps when you have to compare large volumes of data.
Excel is part of our day-to-day activities, in both official and personal work.
Here you put your data in the source sheets and then apply any of the options available in the menu
to it.
The tool generates about 10 reports that you can use for your analysis.
Ex:
Comparing production data with test data.
Comparing Mainframe data with ETL data [in migration projects], both for analysis and for
presentation.
Process:
The tool takes the first row of the ETL source data and compares it with the first row of the
Mainframe source data, then continues with the following rows in the same way.
Both the Mainframe and the ETL data must be sorted in the same order, on the same key columns.
If the Mainframe sheet has a record in row 2, the ETL sheet must have the corresponding record
in row 2 as well. [Those rows may or may not contain mismatched columns.] The key [the columns
used for sorting] must be in sync between Mainframe and ETL.
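The process described above can be sketched as follows. This is only a minimal illustration in Python, not the tool's VBA source (which is hidden). The function name `compare_rows` and the sample rows are assumptions made for the example.

```python
from itertools import zip_longest


def compare_rows(mainframe_rows, etl_rows):
    """Compare two pre-sorted row lists position by position.

    Returns one (row_number, status, mismatched_columns) tuple per row.
    Row and column numbers are 1-based, as in a worksheet.
    """
    results = []
    pairs = zip_longest(mainframe_rows, etl_rows, fillvalue=())
    for row_number, (mf_row, etl_row) in enumerate(pairs, start=1):
        # A missing cell is treated as blank, so blank vs. data is a mismatch.
        cells = zip_longest(mf_row, etl_row, fillvalue="")
        mismatched = [
            col for col, (mf, etl) in enumerate(cells, start=1) if mf != etl
        ]
        status = "MISMATCHED" if mismatched else "MATCHED"
        results.append((row_number, status, mismatched))
    return results


# Hypothetical sample data: both sources sorted on the first column (the key).
mainframe = [("1001", "ALICE", "250.00"), ("1002", "BOB", "75.50")]
etl = [("1001", "ALICE", "250.00"), ("1002", "BOB", "75.05")]

for row_number, status, columns in compare_rows(mainframe, etl):
    print(row_number, status, columns)
```

Because rows are paired purely by position, a single missing or extra record in one source shifts every later row, which is why both sources need the same sort order and the same row count.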
Contents of ComparX-R
ComparX-R Contains Several Sheets:
1) Navigation
2) Steps
3) Sample Data
4) Colors List
5) Mainframe
6) ETL
7) Compare Data
Navigation:
Home page of ComparX-R.
Steps:
Steps to follow for copying data into the source sheets [Mainframe and ETL].
Sample Data:
Sample data for testing and getting a first look at the tool.
Colors List:
List of colors and their numbers, for reference when formatting reports.
Mainframe:
Input source data to be compared.
ETL:
Output source data to be compared with the Mainframe data.
Compare Data:
Main menu. It offers a number of options, grouped into four submenus:
Create Reports
File Activities
Formats
Navigator
Create Reports
Report Creation
•Compared Report
•Matched Report
•MisMatched Report
•MisMatched_Cols Report
•Stats Report
•Compared > 2000 Rows Report
•Matched > 2000 Rows Report
•MisMatched > 2000 Rows Report
•MisMatched_Cols > 2000 Rows Report
•Stats > 2000 Rows Report
File Activities
Data Loading
•Sample data loading
•External data loading
Clear Source Contents
Delete reports
Save Reports Alone
Save the file along with reports
Check Record Count of both Sources
Check Generated Report Count and Names
Formats
Formatting Reports
Export the generated reports to
•CSV
•HTML
•TXT file formats
Navigator
•Navigation to Generated reports
When you open the application, the sheets listed above are shown by default. To hide some of them,
use the Hide buttons on the right side of the Navigation sheet.
What You Click is What You Get
Compared Report - Generates the comparison of both sources [matched and mismatched
records], with formatting.
Compared > 2000 Rows Report - Generates the comparison of both sources [matched and mismatched
records], without formatting.
Matched Report - Compares both sources and gives you only the matched records, with
formatting.
Matched > 2000 Rows Report - Compares both sources and gives you only the matched records,
without formatting.
Mismatched Report - Compares both sources and gives you only the mismatched records, with
formatting.
Mismatched > 2000 Rows Report - Compares both sources and gives you only the mismatched records,
without formatting.
Mismatched_Cols Report - Compares both sources and gives you only the mismatched records and
mismatched columns, with formatting. It needs the Mismatched Report to be generated first.
Mismatched_Cols > 2000 Rows Report - Compares both sources and gives you only the mismatched
records and mismatched columns, without formatting. It needs the Mismatched > 2000 Rows Report
to be generated first.
Stats Report - Generates the Stats report for fewer than 2000 rows, with formatting.
It needs the Compared Report and the Mismatched Report to be generated first.
Stats > 2000 Rows Report - Generates the Stats report for more than 2000 rows, with
formatting. It needs the Compared > 2000 Rows Report and the Mismatched > 2000 Rows Report
to be generated first.
PS: When you click the Compared Report, Matched Report, or Mismatched Report button, the variant for fewer than 2000 rows
or for more than 2000 rows is generated automatically. This does not apply to Stats Report, Stats > 2000 Rows Report, and
Mismatched_Cols > 2000 Rows Report.
Reports:
•Compared Report and Compared > 2000 Rows Report
•Matched Report and Matched >2000 Rows Report
•MisMatched Report and MisMatched > 2000 Rows Report
•MisMatched_Cols Report and MisMatched_Cols > 2000 Rows Report
•Stats Report and Stats > 2000 Rows Report
•Compared Report and Compared > 2000 Rows Report:
Each result cell holds the Mainframe value and the ETL value together, joined by the = or <>
symbol. A status column after the last data column of each row shows Matched or MisMatched.
The report contains both matched and mismatched data.
Statistics such as the number of matched and mismatched records for each column, and links to
other sheets, are also generated and printed after the last record.
Compared Report - for fewer than 2000 rows
Compared > 2000 Rows Report - for more than 2000 rows
•Matched Report and Matched > 2000 Rows Report:
Each result cell holds the Mainframe value and the ETL value together, joined by the "=" symbol.
A status column after the last data column of each row shows Matched.
The report contains matched data only.
Statistics such as the number of matched records out of the total records, and links to other
sheets, are also generated and printed after the last record.
Matched Report - for fewer than 2000 rows
Matched > 2000 Rows Report - for more than 2000 rows
•MisMatched Report and MisMatched > 2000 Rows Report:
Each result cell holds the Mainframe value and the ETL value together, joined by the "<>" symbol
where they differ. A status column after the last data column of each row shows MisMatched.
The report contains mismatched records only.
Statistics such as the number of matched and mismatched records for each column, and links to
other sheets, are also generated and printed after the last record.
MisMatched Report - for fewer than 2000 rows
MisMatched > 2000 Rows Report - for more than 2000 rows
•MisMatched_Cols Report and MisMatched_Cols > 2000 Rows Report:
Each result cell holds the Mainframe value and the ETL value together, joined by the "<>" symbol
where they differ. A status column after the last data column of each row shows MisMatched.
The report contains mismatched records and mismatched columns only.
Statistics such as the number of mismatched columns, and links to other sheets, are also
generated and printed after the last record.
The MisMatched Report is required for the MisMatched_Cols Report.
The MisMatched > 2000 Rows Report is required for the MisMatched_Cols > 2000 Rows Report.
•Stats Report and Stats > 2000 Rows Report:
Lists the mismatched columns, their column numbers, and the number of mismatched values for
each column.
A summary is also displayed: total rows and columns, total mismatched rows and columns,
matched rows and columns, and the percentage for each.
Links to other sheets are displayed at the bottom of the report.
The Compared Report and the Mismatched Report are required for the Stats Report.
The Compared > 2000 Rows Report and the Mismatched > 2000 Rows Report are required for the
Stats > 2000 Rows Report.
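The report prerequisites listed above can be written down as a small lookup table. The sketch below is an illustration in Python, not the tool's own VBA logic. The names `PREREQUISITES` and `missing_prerequisites` are hypothetical, and only the dependencies stated in this section are encoded.

```python
# Report name -> reports that must already have been generated.
PREREQUISITES = {
    "MisMatched_Cols Report": ["MisMatched Report"],
    "MisMatched_Cols > 2000 Rows Report": ["MisMatched > 2000 Rows Report"],
    "Stats Report": ["Compared Report", "MisMatched Report"],
    "Stats > 2000 Rows Report": [
        "Compared > 2000 Rows Report",
        "MisMatched > 2000 Rows Report",
    ],
}


def missing_prerequisites(report, generated):
    """Return the prerequisite reports of `report` not yet in `generated`."""
    return [
        name for name in PREREQUISITES.get(report, []) if name not in generated
    ]


generated_so_far = {"Compared Report"}
print(missing_prerequisites("Stats Report", generated_so_far))
```

A report with no entry in the table, such as the Matched Report, has no prerequisites and yields an empty list.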
•File Activities
Load Sample Data - Loads the data from the Sample Data sheet into both the Mainframe and ETL sheets.
Load External Data - Loads external data from any delimited file or plain text file into the sheet you choose.
Clear Source Contents - Clears the contents of the source sheets [Mainframe and ETL].
Delete Reports - Deletes all generated reports, however many there are.
Save Sources & Reports - Saves only the sources [Mainframe and ETL] and the generated reports.
Save ComparX - Saves the entire application in the location where ComparX is stored.
Get Record Count - Shows a comparison of the row and column counts in a popup dialog box.
Get Report Count - Shows the number of available reports and their names in a popup dialog box.
•Formats
Export to HTML - Exports the chosen report to HTML format. If the report has not been generated, nothing is
exported and a popup is shown.
Export to CSV - Exports the chosen report to CSV format. If the report has not been generated, nothing is
exported and a popup is shown.
Export to TXT - Exports the chosen report to TXT format with a delimiter. If the report has not been
generated, nothing is exported and a popup is shown.
You can export the Matched, Mismatched, and Compared reports to the formats mentioned above.
All exported files [CSV, TXT, HTML] are saved in the folder where ComparX-R is stored. CSV and TXT
files are placed in the CSV and TXT folders there, respectively.
Format Reports - Lets you format the Matched, Mismatched, and Compared reports. The
Mismatched_Cols Report and the Stats Report cannot be formatted this way, as they already come
with formatting.
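The three export formats can be illustrated with a short Python sketch. It is not the tool's VBA export code. The function names are hypothetical, the `|` delimiter for TXT is an assumption (the document does not say which delimiter the tool uses), and the functions return strings instead of writing files so that the example stays self-contained.

```python
import csv
import html
import io


def to_csv(rows):
    """Render rows as CSV text; fields containing commas are quoted."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def to_delimited_text(rows, delimiter="|"):
    """Render rows as delimited text (the delimiter is an assumed default)."""
    return "\n".join(delimiter.join(row) for row in rows)


def to_html(rows):
    """Render rows as a bare HTML table, escaping characters such as < and >."""
    lines = ["<table>"]
    for row in rows:
        cells = "".join(f"<td>{html.escape(cell)}</td>" for cell in row)
        lines.append(f"<tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


# Hypothetical report rows: one compared cell plus the status column.
report = [["[MF] 75.50 <> 75.05 [ETL]", "MISMATCHED"]]
print(to_csv(report))
print(to_delimited_text(report))
print(to_html(report))
```

Escaping matters for the HTML export in particular, because mismatched cells contain the `<>` symbol, which a browser would otherwise read as markup.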
•Navigator
Takes you to any report that has been generated.
Reports:
Compared Report and Compared > 2000 Rows Report:
[MF] <Mainframe data> = <ETL data> [ETL] if both values are equal. This is displayed in a
single cell.
[MF] <Mainframe data> <> <ETL data> [ETL] {Row #: | Column :} if the values are NOT equal.
This is displayed in a single cell.
The Status column holds MATCHED if the Mainframe and ETL data match, and MISMATCHED if
they do not.
Status Rows:
1st Row: No. of rows in Mainframe = No. of rows in ETL
Row count matched or not matched
Completely matched records
2nd Row: No. of mismatched records
Report generation time
3rd Row: Links to the top row of the report, to the Compare Data menu, and to the Steps sheet.
The Compared Report is generated with formatting.
The Compared > 2000 Rows Report is generated without formatting.
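The cell layout described above can be illustrated with a small Python sketch. It is not the tool's VBA code, and `render_cell` is a hypothetical name. Placing the row and column numbers after the colons is an assumption based on the `{Row #: | Column :}` pattern shown above.

```python
def render_cell(mf_value, etl_value, row, col):
    """Build the text of one compared cell from a Mainframe and an ETL value."""
    if mf_value == etl_value:
        return f"[MF] {mf_value} = {etl_value} [ETL]"
    # Doubled braces produce literal { and } in an f-string.
    return (
        f"[MF] {mf_value} <> {etl_value} [ETL] "
        f"{{Row #: {row} | Column : {col}}}"
    )


print(render_cell("250.00", "250.00", 2, 3))
print(render_cell("75.50", "75.05", 2, 3))
```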
Matched Report and Matched > 2000 Rows Report:
[MF] <Mainframe data> = <ETL data> [ETL] if both values are equal. This is displayed in a
single cell.
The Status column only ever holds MATCHED, as only matched data is included.
Status Rows:
1st Row: No. of rows in Mainframe = No. of rows in ETL
Completely matched records
2nd Row: No. of matched records
Report generation time
3rd Row: Links to the top row of the report, to the Compare Data menu, and to the Steps sheet.
The Matched Report is generated with formatting.
The Matched > 2000 Rows Report is generated without formatting.
MisMatched Report and MisMatched > 2000 Rows Report:
[MF] <Mainframe data> <> <ETL data> [ETL] if the values are NOT equal. This is displayed in a
single cell.
[MF] <Mainframe data> = <ETL data> [ETL] if both values are equal. This is displayed in a
single cell.
The Status column only ever holds MISMATCHED, as only mismatched records are included.
Status Rows:
1st Row: No. of rows in Mainframe = No. of rows in ETL
MISMATCHED records count
2nd Row: No. of mismatched records
Report generation time
3rd Row: Links to the top row of the report, to the Compare Data menu, and to the Steps sheet.
The MisMatched Report is generated with formatting.
The MisMatched > 2000 Rows Report is generated without formatting.
MisMatched_Cols Report and MisMatched_Cols > 2000 Rows Report:
[MF] <Mainframe data> <> <ETL data> [ETL] if the values are NOT equal. This is displayed in a
single cell.
[MF] <Mainframe data> = <ETL data> [ETL] if both values are equal. This is displayed in a
single cell.
If every value in a column matches, that column is deleted from the report.
The Status column only ever holds MISMATCHED, as only mismatched records are included.
Status Rows:
1st Row: No. of rows in Mainframe = No. of rows in ETL
MISMATCHED records count
2nd Row: No. of mismatched records
Report generation time
3rd Row: Links to the top row of the report, to the Compare Data menu, and to the Steps sheet.
The MisMatched_Cols Report is generated with formatting.
The MisMatched_Cols > 2000 Rows Report is generated without formatting.
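The column-deletion rule of the MisMatched_Cols reports can be sketched as below. This is an illustration in Python, not the tool's VBA implementation. The function name `keep_mismatched_columns` and the sample data are assumptions, and the input is taken to be the mismatched records only, with equal row widths in both sources.

```python
def keep_mismatched_columns(mainframe_rows, etl_rows, column_names):
    """Drop every column whose values match in all rows.

    Returns (kept_names, kept_mainframe_rows, kept_etl_rows).
    """
    pairs = list(zip(mainframe_rows, etl_rows))
    keep = [
        index
        for index in range(len(column_names))
        if any(mf[index] != etl[index] for mf, etl in pairs)
    ]
    names = [column_names[i] for i in keep]
    mf_kept = [[row[i] for i in keep] for row in mainframe_rows]
    etl_kept = [[row[i] for i in keep] for row in etl_rows]
    return names, mf_kept, etl_kept


# Hypothetical mismatched records: the ID column matches in every row.
names = ["ID", "NAME", "AMOUNT"]
mainframe = [["2", "B", "20"], ["4", "D", "40"]]
etl = [["2", "B", "21"], ["4", "X", "41"]]
print(keep_mismatched_columns(mainframe, etl, names))
```

A column stays in the result as soon as one row differs in it, so matched cells can still appear in the report next to mismatched ones.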
Stats Report and Stats > 2000 Rows Report:
Table 1
Columns: Mismatched Column's Number, Mismatched Column's Name, Mismatched Records for Each Column
Last row: Total Mismatched Values <Sum>
Table 2
Columns: Category, <Count of Records for each category>, <Percentage of Records for each category>
Categories:
Total Rows
Total Columns
Mismatched Rows
Mismatched columns of Mismatched Rows
Matched columns of Mismatched Rows
Matched Rows
Matched columns
Link to Compare Data Menu, Link to Mismatched Report, Link to Compared Report
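The kind of figures shown in the two Stats tables can be computed as in the sketch below. This is a simplified Python illustration, not the tool's VBA code: `build_stats` is a hypothetical name, only the row-level categories of Table 2 are covered, and taking each percentage relative to the total row count is an assumption.

```python
def build_stats(mainframe_rows, etl_rows, column_names):
    """Return (table1, table2) for two equally sized, pre-sorted row lists."""
    per_column = [0] * len(column_names)
    mismatched_rows = 0
    for mf_row, etl_row in zip(mainframe_rows, etl_rows):
        row_mismatched = False
        for index, (mf, etl) in enumerate(zip(mf_row, etl_row)):
            if mf != etl:
                per_column[index] += 1
                row_mismatched = True
        if row_mismatched:
            mismatched_rows += 1
    total_rows = min(len(mainframe_rows), len(etl_rows))
    matched_rows = total_rows - mismatched_rows

    def percent(count):
        return round(100 * count / total_rows, 2) if total_rows else 0.0

    # Table 1: (column number, column name, mismatched values) per bad column.
    table1 = [
        (index + 1, name, count)
        for index, (name, count) in enumerate(zip(column_names, per_column))
        if count
    ]
    # Table 2: category -> (count, percentage of total rows).
    table2 = {
        "Total Rows": (total_rows, percent(total_rows)),
        "Mismatched Rows": (mismatched_rows, percent(mismatched_rows)),
        "Matched Rows": (matched_rows, percent(matched_rows)),
    }
    return table1, table2


names = ["ID", "NAME", "AMOUNT"]
mainframe = [["1", "A", "10"], ["2", "B", "20"], ["3", "C", "30"], ["4", "D", "40"]]
etl = [["1", "A", "10"], ["2", "B", "21"], ["3", "C", "30"], ["4", "X", "41"]]
table1, table2 = build_stats(mainframe, etl, names)
print(table1)
print(table2)
```

The Total Mismatched Values figure of Table 1 is then the sum of the third element of each `table1` entry.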
Features
1) Gives you statistics, which are helpful for analysis and presentation.
2) Gives you the report generation time.
3) Gives you the flexibility to format the reports for presentation.
4) Converts the Excel data into HTML, CSV, and delimited TXT file formats.
5) Gives you the row and column counts, so you can decide whether to go ahead and generate the results.
6) Gives you the count of generated reports.
7) Navigation to all reports, the main menu, etc.
8) Hide and Show options.
9) Generates a colors list showing which colors you can use for report formatting.
10) Loads external delimited data files, and loads sample data for testing.
11) Saves the sources and reports alone in a new file.
12) All reports and exported files are saved in the same folder where ComparX is stored.
13) Deletion of reports.
Limitations
ComparX-R
•Can process 255 columns, because an Excel 2003 sheet holds 256 columns and one is used for the status column.
•Can process 65533 rows, because an Excel 2003 sheet holds 65536 rows and the status takes 3 rows.
•The row count must be equal in both sources to get the desired result.
•Source files must be sorted properly.
•Best suited to sequential comparison, for which it gives accurate results.
•Not as fast as UNIX scripts or other scripts, because it has to compare each cell of one sheet
against the corresponding cell of the other sheet sequentially, whereas UNIX scripts run on servers
and are therefore much faster.
•In terms of comparison, however, it gives accurate results with a good look and feel.
•Rejected records are not saved in a separate file.
•A blank value in one source compared against a non-blank value in the other is always reported
as Mismatched.
Difference Between ComparX-R and ComparX-C
ComparX-R
Can process 255 columns from each source, because an Excel 2003 sheet holds 256 columns and one
is used for the status column.
Can process 65533 rows, because an Excel sheet holds 65536 rows and the status takes 3 rows.
ComparX-C
Can process only 85 columns from each source, because its reports display 3 columns for each set
of columns.
Can process 65532 rows, because an Excel sheet holds 65536 rows and the status takes 4 rows.
PS: ComparX-C is a new tool whose report logic and layouts are completely different
from those of ComparX-R. It will be uploaded to MIGHTY soon.
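The capacity figures above follow from the Excel 2003 sheet size of 65536 rows by 256 columns. The arithmetic can be checked with the Python sketch below. The function name `capacity` is hypothetical, and the single status column assumed for ComparX-C is not stated in this document.

```python
EXCEL_2003_MAX_ROWS = 65536
EXCEL_2003_MAX_COLUMNS = 256


def capacity(report_columns_per_source_column, status_columns, status_rows):
    """Return (source columns, source rows) that fit on one Excel 2003 sheet."""
    usable_columns = EXCEL_2003_MAX_COLUMNS - status_columns
    usable_rows = EXCEL_2003_MAX_ROWS - status_rows
    return usable_columns // report_columns_per_source_column, usable_rows


comparx_r = capacity(1, status_columns=1, status_rows=3)
comparx_c = capacity(3, status_columns=1, status_rows=4)  # status column assumed
print(comparx_r)
print(comparx_c)
```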
Future Enhancements
•FTP options
•Export to XML
•Processing of more than 65533 rows and 255 columns
•Array-based processing [to speed up the process]
•Lookup regardless of whether the row counts match
•Rejected data written to a separate file
•Key-based search and comparison
•Option to process both delimited and fixed-width files
•E-Mailing the Generated reports
•Rejected records of both files will be placed in separate worksheets.