Next Generation Analytics: Overcoming the
8 Key Challenges to Data Normalization

Jiro Akiyama
Director, Paragon Technology Group
TechNet Mid-America 2012
Agenda

 • Defining the Problem

 • 8 Key Challenges to Data Normalization

 • Data Normalization Methodology

 • Recap & Resources
Defining the Problem

Key Performance Measures
are essential to support
leadership decision making

Data Risks
• Inaccurate data
• Imprecise data
• Misunderstood data
• Misleading data

Result in poor decisions or
detrimental actions that lead
to adverse consequences.
Why is Data Normalization Difficult?


• Growing data complexity

• Increased need for near
real-time decisions

• Disparate data sources

• Complexity multiplies when data
is aggregated
Data Normalization Challenges
Data Normalization: Issue #1

  Metrics with Differing Units of Measure


          Measures           Score

   Cost Savings             $12.3 mil

   Employee Satisfaction      98%

   Number of Errors            20

   Time to Respond         59 Seconds
Data Normalization: Issue #1

  Metrics with Differing Units of Measure

          Measures           Score      Normalized Score

   Cost Savings             $12.3 mil      7.5

   Employee Satisfaction      98%          9.4

   Number of Errors            20          6.0

   Time to Respond         59 Seconds      2.9
Data Normalization: Issue #2


 Non-linear Metrics Converted to a Linear Scale

              Measures      Score   Normalized
  Customer Satisfaction     95%        9.5
  Transactional Accuracy    95%        9.5
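
A minimal sketch of why the straight conversion on this slide misleads, contrasted with a threshold-aware mapping. This is not Paragon's methodology, which the slides do not disclose. The per-measure thresholds, the split of the 0 to 10 scale into equal thirds, and the function names are all assumptions made for illustration.

```python
def straight_conversion(percent: float) -> float:
    """Naive linear conversion: 95% becomes 9.5 whatever the measure is."""
    return percent / 10.0


def threshold_normalize(value, logical_min, red_yellow, yellow_green, logical_max):
    """Piecewise-linear map of a higher-is-better measure onto 0-10.

    Assumed (hypothetical) bands: red maps to 0..10/3, yellow to 10/3..20/3,
    green to 20/3..10. The two thresholds are set per measure.
    """
    band = 10.0 / 3.0
    value = max(logical_min, min(value, logical_max))  # clamp to logical range
    if value < red_yellow:
        return band * (value - logical_min) / (red_yellow - logical_min)
    if value < yellow_green:
        return band + band * (value - red_yellow) / (yellow_green - red_yellow)
    return 2 * band + band * (value - yellow_green) / (logical_max - yellow_green)


# Hypothetical thresholds: far less slack is tolerated on accuracy.
satisfaction = threshold_normalize(95.0, 0.0, 80.0, 90.0, 100.0)
accuracy = threshold_normalize(95.0, 0.0, 99.0, 99.9, 100.0)

print(straight_conversion(95.0))  # same value for both measures
print(round(satisfaction, 2), round(accuracy, 2))
```

Under these assumed thresholds the same raw 95% lands in the green band for customer satisfaction and in the red band for transactional accuracy, which is the distinction the straight conversion hides.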
Data Normalization: Issue #3


        Differing Control Boundaries

    Control Boundary   Example Measure          Score
    Upper              Turnaround Time          2 Days
    Lower              Systems Accessibility    99.9% Uptime
    Channel            Variance to Budget       ± 5%
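
The three boundary types on this slide can be sketched as a single status function. The green limits below come from the slide (2 days, 99.9% uptime, ±5%); the yellow limits, the `Boundary` enum, and the `status` function are hypothetical names and values chosen only to make the example runnable.

```python
from enum import Enum


class Boundary(Enum):
    UPPER = "upper"      # lower is better: stay at or under the limit
    LOWER = "lower"      # higher is better: stay at or above the limit
    CHANNEL = "channel"  # stay within a band on either side of a target


def status(value, kind, green_limit, yellow_limit, target=0.0):
    """Return 'green', 'yellow', or 'red' for one measurement."""
    if kind is Boundary.UPPER:
        if value <= green_limit:
            return "green"
        return "yellow" if value <= yellow_limit else "red"
    if kind is Boundary.LOWER:
        if value >= green_limit:
            return "green"
        return "yellow" if value >= yellow_limit else "red"
    # Channel: compare the absolute deviation from the target.
    deviation = abs(value - target)
    if deviation <= green_limit:
        return "green"
    return "yellow" if deviation <= yellow_limit else "red"


# Yellow limits (3 days, 99.5%, 10%) are assumptions, not from the slides.
print(status(1.5, Boundary.UPPER, 2.0, 3.0))      # turnaround time in days
print(status(99.7, Boundary.LOWER, 99.9, 99.5))   # uptime in percent
print(status(-12.0, Boundary.CHANNEL, 5.0, 10.0)) # variance to budget in percent
```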
Data Normalization: Issue #4


Differing Logical Minimum and Maximum Amounts

   Measures                 Logical Minimum    Logical Maximum
   Customer Satisfaction    0%                 100%
   Number of Errors         0                  ?
Data Normalization: Issue #5


      Unbounded Maximums or Minimums

Example: Number of Complaints

   0-5 Complaints     = Range for Green
   6-10 Complaints    = Range for Yellow
   11-?? Complaints   = Range for Red
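
One way to score a measure with no logical maximum is to have leadership fix a cap beyond which every value receives the lowest score. The sketch below applies that idea to the complaints example. The cap of 20 complaints, the equal thirds of the 0 to 10 scale, and the function name are assumptions for illustration; the slides do not say how Paragon's methodology treats the open-ended red range.

```python
def normalize_complaints(count, green_max=5, yellow_max=10, red_cap=20):
    """Map a complaint count onto 0-10, where fewer complaints score higher.

    green_max and yellow_max follow the slide's ranges (0-5, 6-10).
    red_cap is a hypothetical, leadership-chosen ceiling for the red range.
    """
    band = 10.0 / 3.0
    if count <= green_max:
        return 10.0 - band * count / green_max
    if count <= yellow_max:
        return 2 * band - band * (count - green_max) / (yellow_max - green_max)
    capped = min(count, red_cap)  # everything past the cap scores 0
    return band - band * (capped - yellow_max) / (red_cap - yellow_max)


for n in (0, 5, 10, 15, 20, 500):
    print(n, round(normalize_complaints(n), 2))
```

Applying one cap rule across the organization keeps red scores comparable between measures, at the cost of no longer distinguishing values beyond the cap.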
Data Normalization: Issue #6
  Composite Measures

  Cost Variance: Green ≤ ±5%, ±5% ≤ Yellow ≤ ±10%, Red ≥ ±10%

Straight Average of Actual Score
  Score for Project #1 = 1%
  Score for Project #2 = -10.1%
  Composite Avg. Score = -4.55%

Average of Green, Yellow, Red
(Green = 3 points, Yellow = 2 points, Red = 1 point)
  Score for Project #1 = 3
  Score for Project #2 = 1
  Composite Avg. Score = 2

Average Normalized Scores
  Score for Project #1 = 9.33
  Score for Project #2 = 3.31
  Composite Avg. Score = 6.32
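
The three composite calculations on this slide can be reproduced with a few lines of arithmetic. The per-project figures are taken from the slide as given; the normalized scores (9.33 and 3.31) are treated as inputs because the slides do not show how they were derived. The dictionary layout and variable names are illustrative only.

```python
from statistics import mean

# Per-project figures copied from the slide.
projects = {
    "Project #1": {"actual_pct": 1.0, "rag_points": 3, "normalized": 9.33},
    "Project #2": {"actual_pct": -10.1, "rag_points": 1, "normalized": 3.31},
}

straight_avg = mean(p["actual_pct"] for p in projects.values())
rag_avg = mean(p["rag_points"] for p in projects.values())
normalized_avg = mean(p["normalized"] for p in projects.values())

print(f"Straight average of actual scores: {straight_avg:.2f}%")
print(f"Average of Green/Yellow/Red points: {rag_avg}")
print(f"Average of normalized scores: {normalized_avg:.2f}")
```

Note that the straight average of -4.55% falls inside the ±5% green band even though Project #2 is red on its own, which is how a composite can mask a problem area.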
Data Normalization: Issue #7


   Comparability Over Time
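
The speaker's notes illustrate this issue with a savings goal of $10,000 per year for 5 years and a balance that stays at $30,000. A minimal sketch of that example follows: the measured value never changes, but its status does because the cumulative target moves. The 75% cutoff between yellow and red and the function name are assumptions; the notes say only that the status moves from green to yellow to red.

```python
def savings_status(balance, annual_goal, year, yellow_ratio=0.75):
    """Status of a constant balance against a target that grows each year.

    yellow_ratio is a hypothetical cutoff: at or above it (but below 100%
    of the cumulative target) the measure is yellow, below it red.
    """
    cumulative_target = annual_goal * year
    ratio = balance / cumulative_target
    if ratio >= 1.0:
        return "green"
    if ratio >= yellow_ratio:
        return "yellow"
    return "red"


# Each year's status is computed against that year's target and stored,
# so earlier periods are never rescored when the target moves.
history = [savings_status(30_000, 10_000, year) for year in range(1, 6)]
print(history)
```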
Data Normalization: Issue #8

Fidelity into the Thin Ranges
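
The speaker's notes cite 3.4 defects per million, which equates to 99.99966%. The sketch below shows that conversion and one possible way to spread out such thin ranges: a logarithmic "number of nines" scale. The log transform is an assumption added here for illustration and is not described in the deck; the function names are hypothetical.

```python
import math


def yield_percent(dpmo: float) -> float:
    """Convert defects per million opportunities to a yield percentage."""
    return (1.0 - dpmo / 1_000_000) * 100.0


def nines(dpmo: float) -> float:
    """Negative log10 of the defect rate; each whole unit is one more 'nine'."""
    return -math.log10(dpmo / 1_000_000)


for dpmo in (1000.0, 34.0, 3.4):
    print(f"{dpmo:>7} DPMO  yield={yield_percent(dpmo):.5f}%  nines={nines(dpmo):.2f}")
```

On a percentage axis these three values differ only from the third significant digit onward, while on the log scale they sit roughly one unit apart.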
Summary

  Issue #1: Metrics with Differing Units of Measure

  Issue #2: Non-linear Metrics Converted to a Linear Scale

  Issue #3: Differing Control Boundaries

  Issue #4: Differing Logical Minimum and Maximum Amounts

  Issue #5: Unbounded Maximums or Minimums

  Issue #6: Composite Measures

  Issue #7: Comparability Over Time

  Issue #8: Fidelity into Thin Ranges
Additional Resources


   Paragon Insights Blog (www.paragontech.net)

   Contact Info:
   Jiro Akiyama
   P 301-792-0483
   jakiyama@paragontech.net

Editor's Notes

  1. Senior leaders are tasked with monitoring and controlling the activities and performance of their agency or business unit. To accomplish this task, performance measures are selected and data is gathered to show progress/status. This data is often analyzed and presented to senior leaders in the form of a balanced scorecard or operational dashboard. When the data is derived from multiple sources/measures and aggregated at the corporate level, meaning and accuracy can suffer greatly. Paragon Technology Group has developed a data normalization methodology to address the most common sources of ambiguity and inaccuracy. Data normalization is fraught with complex issues. Each issue presents a problem of its own, but when combined with other issues, the ambiguity is multiplied to the point where data can become misleading, concealing, or incorrect. Some of these issues include:
  2. How many times have you faced a situation where you have a bunch of measures, and some are measured in dollar amounts, some as a percent, some as a number, and some as a duration (seconds/hours/days)? Comparing these measures side by side is often impossible unless you are deeply familiar with the operations and data sources behind each measure. Paragon’s methodology converts all measures and places them on a scale of 0 to 10.
  3. How many times have you faced a situation where you have a bunch of measures, and some are measured in dollar amounts, some as a percent, some as a number, and some as a duration (seconds/hours/days)? Comparing these measures side by side is often impossible unless you are deeply familiar with the operations and data sources behind each measure. Paragon’s methodology converts all measures and places them on a scale of 0 to 10.
  4. Straight conversion of non-linear metrics to a linear scale is a bad idea. A common example of this would be the conversion of an actual score of 95% to a normalized score of 9.5 out of 10. On the surface, this sounds good. But, take these two performance examples into consideration. If I have a performance measure for customer satisfaction and scored 95%, then I may feel like I’ve done an acceptable job. On the other hand, if I have a performance measure for transactional accuracy and scored 95%, I could potentially lose my job. After all, fouling up 1 in 20 transactions is no way to run a company. Both of these measures would show a score of 9.5 but have completely different meanings. This not only introduces confusion but makes leaders wonder if there was an intention to mislead. Regardless of whether an intention existed, the issue may cause the legitimacy and veracity of the entire process and presentation to be brought into question. Paragon’s methodology does not utilize straight conversion.
  5. Control boundaries refer to the performance thresholds that senior leaders set to determine if a measure is “green” (acceptable performance), “yellow” (needs to be monitored closely), or “red” (problem area). Measures can fall into three categories for the purpose of normalization. They are upper control bound measures, lower control bound measures, and channel bound measures.
  6. Logical minimum and maximum amounts refer to a hard stop on the measurement scale. For example, if you are measuring “number of errors”, the logical minimum is 0. In other words, you cannot make fewer than 0 errors. And, if you are measuring customer satisfaction, the logical maximum would be 100%. Any data normalization methodology must include the ability to input logical minimum and maximum amounts. Conversely, many measures do not have a logical stopping point at the other end. In the example of “number of errors”, while you can’t get fewer than 0, you could theoretically have an infinite number of errors, which means that there would be no way to define the full range of the red status. Paragon’s approach allows senior leadership to uniformly specify how unbounded measures are treated across the agency.
  7. Any upper control bound or lower control bound measure that has an unbounded maximum or unbounded minimum presents a unique problem for normalization. An example of this could be “number of complaints”. While this is an upper control bound measure, it does not have a logical maximum to define the range of red scores. In other words, the number of complaints received could potentially be infinite (hopefully not!). In addition, a measure can have both an unbounded maximum and an unbounded minimum. This situation is usually associated with profit measures, and its handling needs to be consistent and driven by senior leadership. Any normalization methodology that seeks to score these measures needs to introduce a mechanism to handle unbounded measures.
  8. Measures on an agency’s Strategic Balanced Scorecard are often a composite of sub-measures. Data normalization must occur early in the process to facilitate the roll-up of data to the agency level. If the agency employs weighting criteria, an incorrect normalization methodology will amplify errors in the measures with higher weighting factors.
  9. For example, let’s say I’m saving for college and my goal is to save $10,000 per year for 5 years. In year one, I get a gift of $30,000 for college tuition and place it into my savings. I am scoring very high at this point and am well over my threshold for green. In subsequent years, I don’t save any more money and the amount in my savings stays at $30,000. While that number has remained constant, my actual performance over time has gone from green to yellow and then to red. So, even though the actual measured score remained the same from year 1 to year 5, the score in relation to the threshold decreased, and the normalized score should decrease to reflect the widening performance gap. An accurate normalization methodology utilizes the control boundaries to define green, yellow and red. To maintain consistency over time, any normalization algorithm must automatically adjust the normalized scoring to reflect the new threshold without changing prior period scores.
  10. This issue is especially important with quality and Lean Six Sigma measures. When the range of acceptability is measured at 3.4 defects per million, it is difficult to see how well you are performing within that range. Graphical representations of actual scores often miss the mark when it comes to displays of quality and are difficult to read and compare when the variance between numbers is minuscule. Expressed as a percentage, 3.4 defects per million equates to 99.99966%.