How can military leaders determine the status of a complex organization in a concise, accurate and meaningful way? Next generation analytical tools offer new approaches and opportunities to uncover data relationships that were never before possible. However, unless these tools fundamentally address key challenges in data normalization, they will fail to create actionable business intelligence for the enterprise and could harm DOD leaders' ability to make the best strategic and operational decisions. This session will explore best practices in data normalization and how to utilize next generation analytic tools to create meaningful intelligence for the enterprise.
Jiro Akiyama, Director, Strategy and Organizational Development, Paragon Technology Group
Next Generation Analytics: Overcoming the 8 Key Challenges to Data Normalization
1. Next Generation Analytics: Overcoming the 8 Key Challenges to Data Normalization
Jiro Akiyama
Director, Paragon Technology Group
TechNet Mid-America 2012
2. Agenda
• Defining the Problem
• 8 Key Challenges to Data Normalization
• Data Normalization Methodology
• Recap & Resources
3. Defining the Problem
Key Performance Measures are essential to support leadership decision making.
Data Risks
• Inaccurate data
• Imprecise data
• Misunderstood data
• Misleading data
These risks result in poor decisions or detrimental actions that lead to adverse consequences.
4. Why is Data Normalization Difficult?
• Growing data complexity
• Increased need for near real-time decisions
• Disparate data sources
• Complexity multiplies when data is aggregated
6. Data Normalization: Issue #1
Metrics with Differing Units of Measure
Measures | Score
Cost Savings | $12.3 mil
Employee Satisfaction | 98%
Number of Errors | 20
Time to Respond | 59 Seconds
7. Data Normalization: Issue #1
Metrics with Differing Units of Measure
Measures | Score | Normalized Score
Cost Savings | $12.3 mil | 7.5
Employee Satisfaction | 98% | 9.4
Number of Errors | 20 | 6.0
Time to Respond | 59 Seconds | 2.9
8. Data Normalization: Issue #2
Non-linear Metrics Converted to a Linear scale
Measures | Score | Normalized Score
Customer Satisfaction | 95% | 9.5
Transactional Accuracy | 95% | 9.5
9. Data Normalization: Issue #3
Differing Control Boundaries
Control Boundary | Example Measure | Score
Upper | Turnaround Time | 2 Days
Lower | Systems Accessibility | 99.9% Uptime
Channel | Variance to Budget | ± 5%
10. Data Normalization: Issue #4
Differing Logical Minimum and Maximum Amounts
Measures | Logical Minimum | Logical Maximum
Customer Satisfaction | 0% | 100%
Number of Errors | 0 | ?
11. Data Normalization: Issue #5
Unbounded Maximums or Minimums
Example: Number of Complaints
0-5 Complaints = Range for Green
6-10 Complaints = Range for Yellow
11-?? Complaints = Range for Red
12. Data Normalization: Issue #6
Composite Measures
Cost Variance: Green ≤ ±5%, ±5% ≤ Yellow ≤ ±10%, Red ≥ ±10%

Straight Average of Actual Scores:
Score for Project #1 = 1%
Score for Project #2 = -10.1%
Composite Avg. Score = -4.55%

Average of Green, Yellow, Red (Green = 3 points, Yellow = 2 points, Red = 1 point):
Score for Project #1 = 3
Score for Project #2 = 1
Composite Avg. Score = 2

Average of Normalized Scores:
Score for Project #1 = 9.33
Score for Project #2 = 3.31
Composite Avg. Score = 6.32
15. Summary
Issue #1: Metrics with Differing Units of Measure
Issue #2: Non-linear Metrics Converted to a Linear Scale
Issue #3: Differing Control Boundaries
Issue #4: Differing Logical Minimum and Maximum Amounts
Issue #5: Unbounded Maximums or Minimums
Issue #6: Composite Measures
Issue #7: Comparability Over Time
Issue #8: Fidelity into Thin Ranges
16. Additional Resources
Paragon Insights Blog (www.paragontech.net)
Contact Info:
Jiro Akiyama
P 301-792-0483
jakiyama@paragontech.net
Editor's Notes
Senior leaders are tasked with monitoring and controlling the activities and performance of their agency or business unit. To accomplish this task, performance measures are selected and data is gathered to show progress and status. This data is often analyzed and presented to senior leaders in the form of a balanced scorecard or operational dashboard. When the data is derived from multiple sources and measures and aggregated at the corporate level, meaning and accuracy can suffer greatly. Paragon Technology Group has developed a data normalization methodology to address the most common sources of ambiguity and inaccuracy. Data normalization is fraught with complex issues. Each issue presents a problem of its own, but when combined with other issues, the ambiguity is multiplied to the point where data can become misleading, concealing, or incorrect. Some of these issues include:
How many times have you been faced with the situation where you have a collection of measures and some are measured in dollar amounts, some are measured as a percent, some are measured as a number, and some are measured in speed (seconds/hours/days)? Comparing these measures side by side is often impossible unless you are deeply familiar with the operations and data sources behind each measure. Paragon’s methodology converts all measures and places them on a scale of 0 to 10.
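A minimal sketch of this idea, not Paragon’s actual algorithm: linearly rescale a raw value between leadership-defined “worst” and “best” endpoints onto a 0 to 10 scale. The function name and the endpoint values below are invented for illustration.

```python
# Illustrative sketch only: place a raw measure on a 0-10 scale by linear
# rescaling between assumed "worst" and "best" endpoints.

def normalize_0_to_10(actual, worst, best):
    """Map a raw value onto a 0-10 scale given its worst and best endpoints."""
    span = best - worst
    if span == 0:
        raise ValueError("worst and best endpoints must differ")
    score = 10.0 * (actual - worst) / span
    return max(0.0, min(10.0, score))   # clamp to the 0-10 scale

# Example: time to respond, assuming 120 seconds is worst and 30 seconds is best.
print(normalize_0_to_10(59, worst=120, best=30))   # roughly 6.8
```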
Straight conversion of non-linear metrics to a linear scale is a bad idea. A common example of this would be the conversion of an actual score of 95% to a normalized score of 9.5 out of 10. On the surface, this sounds good. But, take these two performance examples into consideration. If I have a performance measure for customer satisfaction and scored 95%, then I may feel like I’ve done an acceptable job. On the other hand, if I have a performance measure for transactional accuracy and scored 95%, I could potentially lose my job. After all, fouling up 1 in 20 transactions is no way to run a company. Both of these measures would show a score of 9.5 but have completely different meanings. This not only introduces confusion but makes leaders wonder if there was an intention to mislead. Regardless of whether an intention existed, the issue may cause the legitimacy and veracity of the entire process and presentation to be brought into question. Paragon’s methodology does not utilize straight conversion.
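A small sketch of the contrast described above, assuming hypothetical red and green thresholds for each measure: straight conversion gives both measures a 9.5, while a threshold-anchored mapping separates a healthy customer satisfaction score from an unacceptable transactional accuracy score.

```python
# Contrast between straight conversion and a threshold-anchored mapping.
# The red/green thresholds here are assumptions made up for this example.

def straight_conversion(pct):
    return pct / 10.0   # 95% -> 9.5, no matter what the measure means

def threshold_anchored(pct, red_below, green_at):
    """Piecewise-linear map: at or below red_below -> 0, at or above green_at -> 10."""
    if pct >= green_at:
        return 10.0
    if pct <= red_below:
        return 0.0
    return 10.0 * (pct - red_below) / (green_at - red_below)

print(straight_conversion(95))                              # 9.5 for both measures
print(threshold_anchored(95, red_below=70, green_at=90))    # 10.0 -- customer satisfaction
print(threshold_anchored(95, red_below=99, green_at=99.9))  # 0.0  -- transactional accuracy
```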
Control boundaries refer to the performance thresholds that senior leaders set to determine if a measure is “green” (acceptable performance), “yellow” (needs to be monitored closely), or “red” (problem area). Measures can fall into three categories for the purpose of normalization. They are upper control bound measures, lower control bound measures, and channel bound measures.
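A hypothetical illustration of how the three boundary types might drive a green/yellow/red status; the status rules and example thresholds are assumptions, not the presenter’s methodology.

```python
# Three control-boundary types: upper bound (lower is better), lower bound
# (higher is better), and channel bound (closer to target is better).

def status_upper_bound(value, green_max, yellow_max):
    """Lower is better, e.g. turnaround time with green <= 2 days."""
    if value <= green_max:
        return "green"
    return "yellow" if value <= yellow_max else "red"

def status_lower_bound(value, green_min, yellow_min):
    """Higher is better, e.g. systems accessibility with green >= 99.9% uptime."""
    if value >= green_min:
        return "green"
    return "yellow" if value >= yellow_min else "red"

def status_channel(value, target, green_band, yellow_band):
    """Closer to target is better, e.g. variance to budget within +/- 5%."""
    deviation = abs(value - target)
    if deviation <= green_band:
        return "green"
    return "yellow" if deviation <= yellow_band else "red"

print(status_upper_bound(3, green_max=2, yellow_max=5))             # yellow
print(status_lower_bound(99.95, green_min=99.9, yellow_min=99.0))   # green
print(status_channel(7, target=0, green_band=5, yellow_band=10))    # yellow
```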
Logical minimum and maximum amounts refer to a hard stop on the measurement scale. For example, if you are measuring “number of errors”, the logical minimum is 0. In other words, you cannot make fewer than 0 errors. And, if you are measuring customer satisfaction, the logical maximum would be 100%. Any data normalization methodology must include the ability to input logical minimum and maximum amounts. Conversely, many measures do not have a logical stopping point at the other end. In the example of “number of errors”, while you cannot get less than 0, theoretically you could have an infinite number of errors, which means there would be no way to define the red status. Paragon’s approach allows senior leadership to uniformly specify how unbounded measures are treated across the agency.
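One way to picture this, as a sketch only: carry optional logical endpoints on each measure definition and snap incoming values back inside them before normalizing, leaving the endpoint empty when none exists. The class and field names are invented for illustration.

```python
# Sketch of a measure definition that carries optional logical endpoints.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Measure:
    name: str
    logical_min: Optional[float] = None   # e.g. 0 for "number of errors"
    logical_max: Optional[float] = None   # None means no logical ceiling

    def clamp(self, value: float) -> float:
        """Snap a raw value back inside its logical range before normalizing."""
        if self.logical_min is not None:
            value = max(value, self.logical_min)
        if self.logical_max is not None:
            value = min(value, self.logical_max)
        return value

errors = Measure("Number of Errors", logical_min=0)            # no logical maximum
satisfaction = Measure("Customer Satisfaction", 0.0, 100.0)

print(errors.clamp(-3))          # 0 (cannot make fewer than 0 errors)
print(satisfaction.clamp(104))   # 100.0
```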
Any upper control bound or lower control bound measure that has an unbounded maximum or unbounded minimum presents a unique problem for normalization. An example of this could be “number of complaints”. While this is an upper control bound measure, it does not have a logical maximum to define the range of red scores. In other words, the number of complaints received could potentially be infinite (hopefully not!). In addition, a measure could have both an unbounded maximum and an unbounded minimum; this is usually associated with profit measures. The handling of these situations needs to be consistent and driven by senior leadership. Any normalization methodology that seeks to score these measures needs to introduce a mechanism to handle unbounded measures.
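One possible uniform policy, sketched here as an assumption rather than the presenter’s method: leadership fixes an effective worst case (below, an arbitrary multiple of the red threshold) so the red range can still be scored on the 0 to 10 scale.

```python
# One possible uniform policy for an unbounded upper-control-bound measure:
# score anything at or beyond a leadership-chosen cap as 0. The cap rule
# (twice the red threshold) is an assumption made up for this sketch.

def normalize_unbounded_upper(value, red_min, cap_factor=2.0):
    """0-10 score for a 'lower is better' measure with no logical maximum."""
    cap = red_min * cap_factor           # effective worst case set by leadership
    value = min(value, cap)
    return 10.0 * (cap - value) / cap    # cap (or beyond) -> 0, zero -> 10

# Number of complaints: red starts at 11 complaints, so the cap here is 22.
for complaints in (0, 5, 10, 15, 50):
    print(complaints, round(normalize_unbounded_upper(complaints, red_min=11), 1))
```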
Measures on an agency’s strategic balanced scorecard are often a composite of sub-measures. Data normalization must occur early in the process to facilitate the roll-up of data to the agency level; see the worked check below. If the agency employs weighting criteria, an incorrect normalization methodology will exacerbate the distortion in measures with higher weighting factors.
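A small check of the three roll-up approaches from the Issue #6 slide, using its two cost-variance projects (+1% and -10.1%). The normalized scores 9.33 and 3.31 are taken from the deck as given rather than re-derived; the point is how each roll-up behaves.

```python
# Re-running the slide's three roll-up approaches for two projects.

project_variances = [1.0, -10.1]     # percent variance to budget
normalized_scores = [9.33, 3.31]     # per the deck's methodology (taken as given)

# 1) Straight average of actual scores: the signs cancel and hide the red project.
straight_avg = sum(project_variances) / len(project_variances)        # -4.55%

# 2) Average of stoplight points (green=3, yellow=2, red=1): very coarse.
def stoplight_points(variance):
    v = abs(variance)
    return 3 if v <= 5 else (2 if v <= 10 else 1)
stoplight_avg = sum(stoplight_points(v) for v in project_variances) / 2   # 2.0

# 3) Average of normalized scores: keeps the struggling project visible.
normalized_avg = sum(normalized_scores) / len(normalized_scores)      # 6.32

print(straight_avg, stoplight_avg, normalized_avg)
```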
For example, let’s say I’m saving for college and my goal is to save $10,000 per year for 5 years. In year one, I get a gift of $30,000 for college tuition and place it into my savings. I am scoring very high at this point and am well over my threshold for green. In subsequent years, I don’t save any more money and the amount in my savings stays at $30,000. While that number has remained constant, my actual performance over time has gone from green to yellow and then to red. So, even though the actual measured score remained the same from year 1 to year 5, the score in relation to the threshold decreased; thus the normalized score should decrease to reflect the widening performance gap. An accurate normalization methodology utilizes the control boundaries to define green, yellow and red. Any normalization algorithm must automatically adjust the normalized scoring to reflect the new threshold, without changing prior period scores, to maintain consistency over time.
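A worked version of the college-savings example, under assumed scoring rules (score proportional to balance over cumulative goal, yellow starting at 70% of goal); the rules are illustrative, but they show the normalized score decaying while the raw balance stays flat.

```python
# The balance stays flat at $30,000, but the cumulative goal rises $10,000
# per year, so the score relative to the threshold decays over time.
# The 0-10 scoring rule and the 70% yellow cutoff are assumptions.

balance = 30_000
annual_goal = 10_000

for year in range(1, 6):
    cumulative_goal = annual_goal * year
    score = min(10.0, 10.0 * balance / cumulative_goal)
    if balance >= cumulative_goal:
        status = "green"
    elif balance >= 0.7 * cumulative_goal:
        status = "yellow"
    else:
        status = "red"
    print(year, cumulative_goal, round(score, 1), status)
# Years 1-3: green; year 4: yellow; year 5: red.
```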
This issue is especially important with quality and Lean Six Sigma measures. When the range of acceptability is measured at 3.4 defects per million, it is difficult to see how well you are performing within a range. Graphical representations of actual scores often miss the mark when it comes to displays of quality and are difficult to read and compare when the variance between numbers is minuscule. As a percentage, that level of quality equates to 99.99966%.
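A sketch of one way to regain fidelity in such a thin range: stretch a leadership-defined acceptable band (the endpoints below are invented for the example) across the whole 0 to 10 scale so tiny differences in quality become visible.

```python
# Instead of plotting raw percentages, where 99.99966% and 99.9990% look
# identical, stretch an assumed acceptable band across the 0-10 scale.

def normalize_thin_range(pct, floor=99.999, ceiling=100.0):
    """Map a quality percentage inside [floor, ceiling] onto 0-10."""
    pct = max(floor, min(ceiling, pct))
    return 10.0 * (pct - floor) / (ceiling - floor)

# Six Sigma quality (3.4 defects per million) vs. a slightly worse process.
print(round(normalize_thin_range(99.99966), 1))   # about 6.6
print(round(normalize_thin_range(99.99900), 1))   # 0.0
```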