The science and art of data storytelling

The presentation talks about data storytelling being a vital differentiator to rapidly find actionable insights and make business decisions.

Anand Madhav also teaches data storytelling to analysts and data scientists. Check out more about Gramener's data storytelling workshop at https://gramener.com/data-storytelling-workshop

Published in: Data & Analytics

  1. 1. The science and the art of Data Storytelling Anand Madhav
  4. 4. Soho, London, today. Soho, London, 1853/54.
  5. 5. Cholera deaths in London, 1854. Water pump at Broad Street, London.
  6. 6. We’ve been telling stories with data for a long time. How a nurse changed the course of a war using data storytelling: created by Florence Nightingale for Queen Victoria during the Crimean War. It visualizes deaths due to: Red: war wounds. Black: other war-related causes. Blue: avoidable hospital diseases.
  7. 7. With the growth of self-service BI, 85% of companies have lost track of how many dashboards they have generated. But 3 things are unclear on most dashboards: What QUESTION does the dashboard answer? Is the ANSWER evident from the dashboard? What ACTION should the user take now?
  8. 8. Stories have a huge impact on humans. Storytelling has a 30X return on investment: Rob Walker and Joshua Glenn auctioned common items like mugs, golf balls, toys, etc. The item descriptions were stories purpose-written by 200+ contributing writers. Items that were bought for $250 sold for over $8,000, a return of over 3,000% for storytelling! (http://significantobjects.com/) Stories are memorable and viral: People remember stories. They’ll act on them. People share stories. That enables collective action. We analyze data to improve people’s decision making. For this to be effective, data stories are needed more than ever before.
  9. 9. Data alone can be useless and misleading without context
  10. 10. Data storytelling is a critical skill for data scientists, analysts & managers. But analysts present their work, not their message: Data scientists present their analysis, meaning what they did and what they found. That’s not what the audience needs. Audiences need a message that tells them what to do, and why. Told in an engaging way. As a story. Share your data & analysis as data stories: Whenever you share inferences from data, whether as a presentation, an email or document with your analysis, or a dashboard, craft it as a story. Today we will look at a few engaging data stories and find out how to convert an analysis into a memorable story, even if you’ve never told a story before.
  11. 11. New digital tools allow us to do this at scale
  12. 12. New digital tools allow us to do this at scale. [Chart of Indian batsmen, one row per player:] Sachin R Tendulkar, Sourav C Ganguly, Rahul Dravid, Mohammad Azharuddin, Yuvraj Singh, Virender Sehwag, Mahendra S Dhoni, Ajaysinhji D Jadeja, Navjot S Sidhu, Gautam Gambhir, Krishnamachari Srikkanth, Kapil Dev, Dilip B Vengsarkar, Suresh K Raina, Ravishankar J Shastri, Sunil M Gavaskar, Mohammad Kaif, Virat Kohli, Vinod G Kambli, Vangipurappu V S Laxman, Rabindra R Singh, Sanjay V Manjrekar, Mohinder Amarnath, Manoj M Prabhakar, Rohit G Sharma, Irfan K Pathan, Nayan R Mongia, Ajit B Agarkar, Dinesh Mongia, Harbhajan Singh, Krishna K D Karthik, Sandeep M Patil, Anil Kumble, Yashpal Sharma, Javagal Srinath, Hemang K Badani, Yusuf K Pathan, Robin V Uthappa, Raman Lamba, Zaheer Khan, Ravindra A Jadeja, Parthiv A Patel, Sadagopan Ramesh, Roger M H Binny, Woorkeri V Raman, Sunil B Joshi, Kiran S More, Praveen K Amre, Ashok Malhotra, Chetan Sharma. Annotated innings: Kapil Dev’s 175 against Zimbabwe in 1983; Gavaskar’s 107 against New Zealand in 1987; Srikkanth’s 95 against Sri Lanka in 1982; Sidhu’s 134 against England in Gwalior, 1993; Sachin’s 200 against South Africa in 2010; Dhoni’s 198 against Sri Lanka in 2005; Sachin’s 134 against Australia in 1998; Ganguly’s 183 against Sri Lanka in 1999; Sehwag’s 146 against Sri Lanka in 2009; Yusuf Pathan’s 123 against New Zealand in 2010; Kohli’s 107 against England in 2011.
  13. 13. Segmenting India geo-demographically. Previously, the client was treating contiguous regions as a homogeneous entity from a channel content perspective. To deliver targeted content, we divided India into 6 clusters based on their demographic behavior. Specifically, three composite indices were created based on the economic development lifecycle: • Education (literacy, higher education) that leads to... • Skilled jobs (in mfg. or services) that leads to... • Purchasing power (higher income, asset ownership). Districts were divided (at the average cut-off) by: purchasing power (Poorer / Richer), skilled jobs (Unskilled / Skilled) and education (Uneducated / Educated). The 6 clusters are: • Poor: Rural, uneducated agri workers. Young population with low income and asset ownership. Mostly in Bihar, Jharkhand, UP, MP. • Breakout: Rural, educated agri workers poised for skilled labor. Higher asset ownership. Parts of UP, Bihar, MP. • Aspirant: Regions with skilled labor pools but low purchasing power. Cusp of economic development. Mostly WB, Odisha, parts of UP. • Owner: Regions with unskilled labor but high economic prosperity (landlords, etc.). Mostly AP, TN, parts of Karnataka, Gujarat. • Business: Lower education but working in skilled jobs, and prosperous. Typical of business communities. Parts of Gujarat, TN, Urban UP, Punjab, etc. • Rich: Urban educated population working in skilled jobs. All metros, large cities, parts of Kerala, TN. Offering targeted content to these clusters will reach a more homogeneous demographic population.
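The split "at the average cut-off" on three indices can be sketched in a few lines. This is only an illustration: the district names and scores are hypothetical, and the mapping from the three above/below-average flags to the six cluster names is inferred from the cluster descriptions on the slide, not taken from the original analysis.

```python
from statistics import mean

INDICES = ("education", "skilled_jobs", "purchasing_power")

# Hypothetical district scores on the three composite indices (0-1 scale).
districts = {
    "District A": {"education": 0.2, "skilled_jobs": 0.1, "purchasing_power": 0.2},
    "District B": {"education": 0.8, "skilled_jobs": 0.2, "purchasing_power": 0.3},
    "District C": {"education": 0.5, "skilled_jobs": 0.9, "purchasing_power": 0.1},
    "District D": {"education": 0.4, "skilled_jobs": 0.3, "purchasing_power": 0.9},
    "District E": {"education": 0.3, "skilled_jobs": 0.8, "purchasing_power": 0.8},
    "District F": {"education": 0.9, "skilled_jobs": 0.9, "purchasing_power": 0.9},
}

def cluster(educated, skilled, richer):
    """Map three above-average flags to a cluster name (assumed mapping)."""
    if not richer:
        if skilled:
            return "Aspirant"
        return "Breakout" if educated else "Poor"
    if not skilled:
        return "Owner"
    return "Rich" if educated else "Business"

def segment(districts):
    """Split each index at its average across districts, then assign clusters."""
    cutoffs = {k: mean(d[k] for d in districts.values()) for k in INDICES}
    result = {}
    for name, d in districts.items():
        educated, skilled, richer = (d[k] > cutoffs[k] for k in INDICES)
        result[name] = cluster(educated, skilled, richer)
    return result

clusters = segment(districts)
print(clusters)
```

Three binary splits give eight combinations; in this sketch the two "skilled but poorer" and the two "unskilled but richer" combinations are merged, which is one way to arrive at six clusters.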
  14. 14. Let’s look at 15 years of US birth data. This is a dataset (1975-1990) that has been around for several years and has been studied extensively. Yet, a visualization can reveal patterns that are neither obvious nor well known. For example: • Are birthdays uniformly distributed? • Do doctors or parents exercise the C-section option to move dates? • Is there any day of the month that has unusually high or low births? • Are there any months with relatively high or low births? Findings: Very high births in September, but this is fairly well known: most conceptions happen during the winter holiday season. Relatively few births during the Christmas and Thanksgiving holidays, as well as New Year and Independence Day. Most people prefer not to have children on the 13th of any month, given that it’s considered an unlucky day. Some special days like April Fool’s Day are avoided, but Valentine’s Day is quite popular. Chart legend: more births vs. fewer births, on average, for each day of the year (from 1975 to 1990).
  15. 15. The pattern in India is quite different. This is a birth date dataset that’s obtained from school admission data for over 10 million children. When we compare this with births in the US, we see none of the same patterns. For example: • Is there an aversion to the 13th or is there a local cultural nuance? • Are holidays avoided for births? • Which months have a higher propensity for births, and why? • Are there any patterns not found in the US data? Findings: Very few children are born in the month of August, and thereafter. Most births are concentrated in the first half of the year. We see a large number of children born on the 5th, 10th, 15th, 20th and 25th of each month, that is, round-numbered dates. Such round-numbered patterns are a typical indication of fraud. Here, birthdates are brought forward to aid early school admission. Chart legend: more births vs. fewer births, on average, for each day of the year (from 2007 to 2013).
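A minimal sketch of how the round-numbered-date pattern could be checked, assuming a plain list of birth dates. The function name, the restriction to days 1-28 (to avoid month-length effects) and the sample data are all illustrative assumptions, not part of the original analysis.

```python
from collections import Counter
from datetime import date

ROUND_DAYS = (5, 10, 15, 20, 25)

def round_day_excess(birthdates):
    """Average births on round-numbered days divided by the average on other days.

    Only days 1-28 are compared so that every month contributes equally.
    A ratio well above 1 suggests dates are being nudged to round numbers.
    """
    counts = Counter(d.day for d in birthdates)
    other_days = [day for day in range(1, 29) if day not in ROUND_DAYS]
    round_avg = sum(counts[day] for day in ROUND_DAYS) / len(ROUND_DAYS)
    other_avg = sum(counts[day] for day in other_days) / len(other_days)
    return round_avg / other_avg

# Hypothetical sample: one birth on every day, plus two extra on each round day.
sample = [date(2010, 1, day) for day in range(1, 29)]
sample += [date(2010, 1, day) for day in ROUND_DAYS for _ in range(2)]

print(round_day_excess(sample))
```

On real data the comparison would more likely be made per day of the year, as in the chart, but the day-of-month ratio is enough to surface the pattern described here.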
  16. 16. This adversely impacts children’s marks. It’s a well-established fact that older children tend to do better at school in most activities. Since many children have had their birth dates brought forward, these younger children suffer. Children “born” on the 1st, 5th, 10th, 15th, etc. of the month tend to score lower marks on average. • Are holidays avoided for births? • Which months have a higher propensity for births, and why? • Are there any patterns not found in the US data? Children “born” on round-numbered days score lower marks on average, due to a higher proportion of younger children. Chart legend: higher marks vs. lower marks, on average, for children born on a given day of the year (from 2007 to 2013).
  17. 17. An energy utility detected billing fraud. An energy utility (with over 50 million subscribers) had 10 years’ worth of customer billing data available. Most fraud detection software failed to load the data, and sampled data revealed little or no insight. Below is a simple histogram (or frequency distribution) of usage levels. Each bar represents the number of customers with a specific bill amount (in units, or kWh). This plot shows the frequency of all meter readings from Apr-2010 to Mar-2011. An unusually large number of readings are aligned with the slab boundaries. Tariffs are based on the usage slab. Someone with 101 units is billed in full at a higher tariff than someone with 100 units. So people have a strong incentive to stay at or within a slab boundary. This can happen in one of two ways. First, people may be monitoring their usage very carefully, and turn off their lights and fans the instant their usage hits the slab boundary. Or, more realistically, there’s probably some level of corruption involved, where customers pay a small sum to the meter reading staff to ensure that the reading stays exactly at the slab boundary, giving them the advantage of a lower price.
  18. 18. This plot shows the frequency of all meter readings from Apr-2010 to Mar-2011. An unusually large number of readings are aligned with the tariff slab boundaries. This clearly shows collusion of some form with the customers. This happens with specific customers, not randomly. Here are such customers’ meter readings (one customer per row):
      Apr-10 May-10 Jun-10 Jul-10 Aug-10 Sep-10 Oct-10 Nov-10 Dec-10 Jan-11 Feb-11 Mar-11
         217    219    200    200    200    200    200    200    200    350    200    200
         250    200    200    200    201    200    200    200    250    200    200    150
         250    150    150    200    200    200    200    200    200    200    200    150
         150    200    200    200    200    200    200    200    200    200    200     50
         200    200    200    150    180    150     50    100     50     70    100    100
         100    100    100    100    100    100    100    100    100    100    110    100
         100    150    123    123     50    100     50    100    100    100    100    100
           0    111    100    100    100    100    100    100    100    100     50     50
           0    100     27    100     50    100    100    100    100    100     70    100
           1      1      1    100     99     50    100    100    100    100    100    100
      If we define the “extent of fraud” as the percentage excess of the 100 unit meter reading, the value varies considerably across sections and time:
      Section   Apr-10 May-10 Jun-10 Jul-10 Aug-10 Sep-10 Oct-10 Nov-10 Dec-10 Jan-11 Feb-11 Mar-11
      Section 1    70%    97%   136%    65%   110%   116%   121%   107%   114%    88%    74%   109%
      Section 2    66%    92%    66%    87%    70%    64%    63%    50%    58%    38%    41%    54%
      Section 3    90%    46%    47%    43%    28%    31%    50%    32%    19%    38%     8%    34%
      Section 4    44%    24%    36%    39%    21%    18%    24%    49%    56%    44%    31%    14%
      Section 5     4%    63%   -27%    20%    41%    82%    26%    34%    43%     2%    37%    15%
      Section 6    18%    23%    30%    21%    28%    33%    39%    41%    39%    18%     0%    33%
      Section 7    36%    51%    33%    33%    27%    35%    10%    39%    12%     5%    15%    14%
      Section 8    22%    21%    28%    12%    24%    27%    10%    31%    13%    11%    22%    17%
      Section 9    19%    35%    14%     9%    16%    32%    37%    12%     9%     5%    -3%    11%
      Chart callouts: New section manager arrives … and is transferred out … with some explainable anomalies. Why would these happen?
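The slide defines the “extent of fraud” as the percentage excess of the 100 unit meter reading but does not say what the excess is measured against. The sketch below assumes the baseline is the average count of the neighbouring readings (95-99 and 101-105); that baseline, the function name and the sample readings are all assumptions made for illustration.

```python
from collections import Counter

def extent_of_fraud(readings, boundary=100, window=5):
    """Percentage excess of readings exactly at `boundary`.

    Assumption: the expected count at the boundary is the average count of
    the `window` readings on either side of it.
    """
    counts = Counter(readings)
    neighbours = [boundary + off for off in range(-window, window + 1) if off != 0]
    expected = sum(counts[r] for r in neighbours) / len(neighbours)
    if expected == 0:
        raise ValueError("no neighbouring readings to compare against")
    return (counts[boundary] - expected) * 100 / expected

# Hypothetical section: 10 customers at each reading from 95 to 105,
# except 17 customers exactly at the 100-unit slab boundary.
readings = [r for r in range(95, 106) if r != 100 for _ in range(10)] + [100] * 17

print(extent_of_fraud(readings))
```

Under this definition a negative value, like some cells in the table, simply means fewer readings at the boundary than its neighbours would suggest.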
  19. 19. 100 years of weather data in India (Hyderabad, Bhopal, New Delhi). What patterns does India’s temperature profile over a century reveal?
  20. 20. 100 years of India’s weather. [Chart: years 1901 to 2001 against months Jan to Dec]
  21. 21. World Bank: Innovation, Technology & Entrepreneurship. Does access to new technology facilitate innovation? Does it facilitate entrepreneurship? The Global Information Technology Report findings tell us that "innovation is increasingly based on digital technologies and business models, which can drive economic and social gains from ICTs...". We were curious about whether the data on TCData360 could tell a story about influential factors on innovation and entrepreneurship. With over 1,800 indicators available, we focused on the Networked Readiness Index, as it has indicators on entrepreneurship, technology, and innovation.
  22. 22. Visualizing the Mahabharata. The closeness between the various characters is represented in a network diagram to highlight the central characters and their ‘network’ of closely related characters. The peripheral characters and those with weak relations are visually represented as well. Labels shown in the diagram: Yudhishthira, Bhima, Arjuna, Nakula, Sahadeva, Krishna, Bhishma, Karna, Vaisampayana, Duryodhana, Dussasana, Pandu, Drona, Dhritarashtra, Sanjaya, Kunti, Vidura, Draupadi, Narada, Parva, Satyaki, Kripa, Vyasa, Dhrishtadyumna, Shalya, Janamejaya, Drupada, Bhrigu, Virata, Gandhari.
  23. 23. Live election results. Our CNN-IBN Microsoft Election Analytics Center, which was hosted at www.bing.com/elections, served over 10 million requests on 16th May 2014, the day of India’s election results. This is one of the largest real-time visualizations that we (and perhaps many others) have attempted.
  24. 24. There are four ways of telling data stories: Just EXPOSE the data to me. Allow me to EXPLORE and figure it out. SHOW me what is happening with the data. EXPLAIN to me why it’s happening. Effort scales: creator, from low effort to high effort; consumer, from high effort to low effort.
  25. 25. You have data. You have analysis. Now what? Four steps: 1. Understanding the audience & intent. 2. Finding insights. 3. Storylining. 4. Designing data stories.
  26. 26. Step 1: Understanding the audience & intent
  27. 27. Define your audience, they determine the story. The same data analysis can be relevant to many people, and each group is called a persona. • The trends in sales data for an organization are relevant for a CEO, head of sales, region leads, individual sales team members & every employee. • The analysis of polio cases in UP is relevant to the Minister of Health, polio campaign manager, field workers, NGOs, journalists & the general public. This section will dive deeper into defining a persona. DO IT: Who might be an audience for your analysis? • Look back at your recent analytics project. • Who do you know that can use this analysis? (Come up with real or hypothetical personas.) CHECK IT: Verify these yourself. ✓ Is there a name for the individual? ✓ Was the role specific enough? (Head of sales instead of just executive.)
  28. 28. Know your audience’s needs, that helps align the message. List scenario(s) for each persona. For each persona, answer the following questions: 1. What situation are they currently in? 2. What problems do they face? 3. What is the consequence? 4. What action do they need to take using your analysis? 5. What is the impact of this action? Combine these as a user scenario: “As a [persona], I’m in [situation] where I face [problem], leading to [consequence]. Solving it by [action] leads to [impact]”. • John: As a Marketing manager, I have to create a region-wise budget for the next quarter. I don’t know which regions give the highest ROI, so my spend isn’t optimized. Solving it by prioritizing the regions will lead to maximum ROI. Clear needs & future scenarios lead to effective communication. DO IT: Start with your own hypothesis. • Pick one of the personas you had listed earlier. • What problems do you think your persona is facing? • How do you feel the persona will use the analysis? • Frame it as a user scenario. CHECK IT: Verify the user scenario with a partner. ✓ Is it framed as “As a [persona], I’m in [situation] where I face [problem], leading to [consequence]. Solving it by [action] leads to [impact]”? ✓ Would the persona relate to this user scenario if they heard it? Reference: SPIN Selling by Neil Rackham.
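The user scenario template can be treated as a small fill-in-the-blanks structure, which makes it easy to check that none of the five answers is missing. The class and field names below are hypothetical; the example values paraphrase John’s scenario from the slide.

```python
from dataclasses import dataclass

@dataclass
class UserScenario:
    persona: str
    situation: str
    problem: str
    consequence: str
    action: str
    impact: str

    def render(self) -> str:
        """Fill the slide's template with this scenario's answers."""
        return (
            f"As a {self.persona}, I'm in {self.situation} where I face "
            f"{self.problem}, leading to {self.consequence}. "
            f"Solving it by {self.action} leads to {self.impact}."
        )

# Paraphrase of the slide's example for John, the Marketing manager.
john = UserScenario(
    persona="Marketing manager",
    situation="quarterly region-wise budgeting",
    problem="not knowing which regions give the highest ROI",
    consequence="spend that isn't optimized",
    action="prioritizing regions by ROI",
    impact="maximum ROI",
)

print(john.render())
```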
  29. 29. Step 2: Finding insights
  30. 30. Insights must be Big, Useful, and Surprising. Filter the analyses using these as a checklist. IS THE INSIGHT BIG? The analysis must, of course, be statistically significant. But it should also be numerically significant. We want a result that substantially changes the outcome. IS THE INSIGHT USEFUL? What should the audience do after hearing the insight? Can they take an action that improves their objective? Even if it’s informational, what should they do next? IS THE INSIGHT SURPRISING? Is this something they didn’t know? Is it non-obvious? Does it overturn a domain-driven belief or a gut feel? Or does it bring consensus to a group with divided opinion?
  31. 31. Marking each analysis as Big, Useful or Surprising (High, Medium, Low). Only those that are high or medium on all aspects are insights. • Twice as many Detractors talk about our Product’s ease of use (Big: Low, Useful: Medium, Surprising: High). • Typing with capitalization in a credit application indicates creditworthiness (Big: Low, Useful: Low, Surprising: High). • Almost 20% of all voice search queries are triggered by just 25 words (Big: Low, Useful: High, Surprising: Medium). • More engaged employees have fewer accidents (Big: Low, Useful: High, Surprising: Low). • About 50% of American small businesses do not have a website (Big: High, Useful: Medium, Surprising: Low). • The recommendation system influences about 80% of content streamed on Netflix (Big: Big, Useful: Low, Surprising: Low).
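The rule "only those that are high or medium on all aspects are insights" is a simple filter. The sketch below applies it to hypothetical analyses and ratings (not the ones listed on the slide); the numeric ranks are just an illustrative encoding of Low, Medium and High.

```python
# Ratings from lowest to highest; the numeric ranks are an illustrative encoding.
RANK = {"Low": 0, "Medium": 1, "High": 2}
ASPECTS = ("big", "useful", "surprising")

def is_insight(rating):
    """True when Big, Useful and Surprising are all rated Medium or High."""
    return all(RANK[rating[aspect]] >= RANK["Medium"] for aspect in ASPECTS)

# Hypothetical analyses and ratings, for illustration only.
analyses = {
    "Analysis A": {"big": "High", "useful": "Medium", "surprising": "High"},
    "Analysis B": {"big": "Low", "useful": "High", "surprising": "Medium"},
    "Analysis C": {"big": "Medium", "useful": "Medium", "surprising": "Medium"},
}

insights = [name for name, rating in analyses.items() if is_insight(rating)]
print(insights)
```

A single Low on any aspect is enough to drop an analysis, which is why Analysis B is filtered out despite being useful.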
  32. 32. Step 3: Storylining
  33. 33. Storylines are plot outlines. They summarize the entire story. Outlines are the backbone on which you flesh out your story. This section explains how to create storylines. A business storyline: • Our NPS improved by 6 points. • It was 34% in 4Q18. Now it’s at 40% in 2Q19. • Despite lower satisfaction with our Support, our NPS grew. • This increase in NPS was mainly due to better Product Quality & Research. Gladiator’s storyline: • The Emperor asks General Maximus to take control of Rome and give it back to the people. • The ambitious Prince murders the Emperor. • Maximus is sold as a gladiator slave. His family is murdered. • Maximus grows famous, fights the Prince in the arena, and wins. • He joins his family in death. Rome is in the hands of the people. Notice “characters” in red. All stories have characters, human or otherwise.
  34. 34. 1. Start with the takeaway (the elevator pitch, the moral of the story). Close your eyes. Think of a childhood tale. Summarize the moral of the story in one line. We easily remember these stories, and their summary as a moral, several years later. Close your eyes. Think of a business presentation from last week. Can you easily summarize the message in one line? Stories are designed around a moral. A single takeaway. An “elevator pitch”. It’s a one-sentence summary of the most important message for the audience. DO IT: Write your takeaway as one sentence. What’s the one thing you want the audience to remember from your story? What’s the one message that the audience should take away? CHECK IT: Verify these yourself. ✓ Is it a single, complete sentence? ✓ Does it deliver what you want the audience to remember? ✓ Will your audience care a lot about this?
  35. 35. 2. Find analysis that supports your takeaway. Ignore irrelevant content. Only include analysis that proves the takeaway. Ensure that it fully proves the takeaway. What supports your takeaway from “The Lion and the Mouse”? (http://www.read.gov/aesop/007.html) ✓ The lion was an Asiatic lion. ✓ The lion had a huge paw. ✓ The lion spared the mouse it caught. ✓ The lion was caught by a hunter’s net. ✓ It was stalking its prey when it got caught. ✓ The mouse was nibbling grass nearby. ✓ The mouse took a few minutes to cut the net. There’s no right or wrong answer. Think about how it supports your takeaway. DO IT: Write your supporting analysis. 1. List all possible analyses. 2. Re-word them as sentences. 3. Strike off what’s not relevant. CHECK IT: Verify these yourself. ✓ Is each necessary? Does each analysis support the takeaway? ✓ Are they sufficient? Do the analyses prove the takeaway?
  36. 36. 3. Convert analysis into messages by adding context. Frame each analysis as a message that the audience will understand and find relevant. Analysis doesn’t mean anything to people. When it does, it’s a message. We do this by adding context. Three ways to add context are: 1. Compare with similar numbers. Our $15 mn sales is $3 mn more than last year, $1 mn below budget, and twice our nearest competitor’s. 2. Explain with analogies. If we stopped producing, it’ll take 3 months to dispose of our excess inventory of $2 mn. 3. Add business interpretation. Usage is correlated with discounts. For every $1 discount, customer LTV increases by $24. DO IT: Add context to your analysis. 1. Take each relevant analysis. 2. Convert it to a message for the audience by adding context. CHECK IT: Verify these yourself. ✓ Will your audience understand the messages without explanation? ✓ Will your audience understand why this message is relevant?
  37. 37. 4. Structure the messages into a pyramid or a tree. Arrange messages hierarchically to prove & support the parent message. Construct a pyramid or tree-like outline: • Start with the takeaway at the root of the tree. • Add a message that supports the takeaway. • Add further details or supporting messages. • Messages must prove the first message, and only the first message. • Strike off any message that isn’t required to prove or support the takeaway. • Add the next message that supports the takeaway. • Add details to prove the second message. • Remaining messages for the takeaway. • Add details as required. Example of a business tree: Launch sales were 30% less than target due to high competition. • Launch sales were projected at $20 mn in the first month, but achieved only $14 mn. o Sales in every region were 20-50% lower. o Only Philippines & Korea were on target. • Competitors discounted price by 35%, which is unsustainable for them. o 80 store discounts increased from 15% to 35%. o The maximum sustainable discount is 20%. • Stores that offered higher discounts saw less than 20% of our target sales.
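One way to keep such a pyramid honest is to hold it as an actual tree, with the takeaway at the root and each supporting message as a child. The sketch below encodes the business tree from this slide; the `Message` class and `outline` helper are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Message:
    text: str
    children: List["Message"] = field(default_factory=list)

def outline(node, depth=0):
    """Return the tree as indented outline lines, parent before children."""
    lines = ["  " * depth + "- " + node.text]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines

story = Message(
    "Launch sales were 30% less than target due to high competition",
    [
        Message(
            "Launch sales were projected at $20 mn in the first month, but achieved only $14 mn",
            [
                Message("Sales in every region were 20-50% lower"),
                Message("Only Philippines & Korea were on target"),
            ],
        ),
        Message(
            "Competitors discounted price by 35%, which is unsustainable for them",
            [
                Message("80 store discounts increased from 15% to 35%"),
                Message("The maximum sustainable discount is 20%"),
            ],
        ),
        Message("Stores that offered higher discounts saw less than 20% of our target sales"),
    ],
)

lines = outline(story)
print("\n".join(lines))
```

Striking off a message that does not support its parent then becomes a matter of removing one node, and the outline regenerates from what remains.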
  38. 38. 4. Structure the messages into a pyramid or a tree. The conventional approach is to explain how we did the analysis & found the insight: Title > Analysis section 1 (Methodology, Insight) > Analysis section 2 (Methodology, Insight). The insight is lost in the set of slides, and it takes too long to reach the first insight. Instead, start with the insight first, and then take the audience through arguments to support it: Insight that answers a business question > Supporting argument 1 (Methodology, Supporting reference) > Supporting argument 2 (Methodology, Supporting reference). This starts with the main message, and then answers why & how the insight makes sense.
  39. 39. 5. Re-order the messages to increase memorability and motivation. Structure your supporting messages into a memorable flow. Here are 7 flows that help: 1. Time: e.g. Past, Present, Future. Sales were $15 mn. Now they’re $18 mn. We expect them to grow to $20 mn. 2. Place: e.g. NA, EU, APAC. 3. Aspects: e.g. company, competition, context. 4. Benefits: e.g. better, faster, cheaper. 5. Scale: e.g. local, regional, global. 6. Balance: e.g. pros, cons. 7. Priority or climactic: least to most important. Order messages into an emotionally contrasting, motivating sequence. Take this aspects-based flow: • Our profits doubled. But our sales only grew 20%. Our gross margins stayed flat. The “emotional arc” is falling, and not motivating. Here’s the same message re-ordered: • Our sales grew mildly at 20%. Our margins didn’t improve at all. But our profits doubled! This emotional arc falls before rising. This is more motivating. Remember: Emotional contrast requires bad news; it makes good news look better.
  40. 40. Step 4: Designing data stories
  41. 41. How the data should be interpreted decides the type of chart to be used (https://gramener.github.io/visual-vocabulary-vega/): Deviation, Change-over-Time, Spatial, Ranking, Correlation, Part-to-Whole, Flow, Magnitude, Distribution.
  42. 42. We use visual design cues to support our annotations & message. • Pre-attentive processing drives our attention towards certain elements more than others. • We can leverage these to highlight aspects of the chart that are relevant to our story. • For example, when listing a set of countries, if the relevant insight is for one country, we can make it stand out. 1. Position is the most powerful encoding. The eye and brain are naturally wired to detect misalignment of the smallest order. 2. Colour, when used in context, is powerful. We can detect minuscule changes or variations in colour when comparing an element with neighbouring elements. This is what makes true colour (32-bit colour, i.e. about 4 billion colours) a necessity in computer graphics. 3. Size is a useful differentiator. The eye can detect moderate size variations at moderate distances. Size also has a natural interpretation: that of priority. 4. Several other encodings are possible. Aesthetics such as angle, shadows, shapes, patterns, density, labelling, enclosures, etc. can each be used to map data.
  43. 43. Class X English marks distribution. [Histogram: marks from 0 to 100 on the X-axis, number of students from 0 to 20,000 on the Y-axis]
  44. 44. 4 types of annotations help the audience understand your intent. [Histogram: marks from 0 to 100 on the X-axis, # students from 0 to 20,000 on the Y-axis] 1. Summarize the chart in its title. Don’t describe the chart. Don’t write the question to answer. Write the answer itself. Like a headline. Example: “Teachers add marks to stop some students from failing.” 2. Explain the chart. How should the user read it? What do you say when you talk through it? Explain what the visual is. Then the axes. Then its contents. Then the inference. Example: “This chart shows Class 10 students’ English marks in Tamil Nadu, India, in 2011. The X-axis has the mark a student has scored. The Y-axis has the # of students who scored that mark. This is a bell curve. But the spike at 35 (the mark at which students pass) is unusual. Teachers must be adding marks to some of the students who are likely to fail by a small margin.” 3. Highlight essential elements. What should the user focus their eyes on? Point it out. Interpret what they’re seeing, in words. Example: “What’s unusual: A large number of students score exactly 35 marks. Few (but not 0) students score between 30-35.” 4. Recommend an action. How should I act on this? You need to change the audience. (Otherwise, you made no difference.) Example: “Only some students get this benefit. Identify a fair policy that will be applied consistently.”
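The "spike at 35" that the annotations point out can also be found mechanically, by comparing each mark’s count with its immediate neighbours. This is a rough sketch under stated assumptions: the counts are hypothetical, and the rule (a count more than twice the mean of its two neighbours) is an illustrative threshold, not the method used for the original chart.

```python
def find_spikes(counts, threshold=2.0):
    """Marks whose count exceeds `threshold` times the mean of both neighbours."""
    spikes = []
    for mark in sorted(counts):
        if mark - 1 in counts and mark + 1 in counts:
            neighbour_mean = (counts[mark - 1] + counts[mark + 1]) / 2
            if counts[mark] > threshold * neighbour_mean:
                spikes.append(mark)
    return spikes

# Hypothetical number of students per mark around the pass mark of 35.
students_per_mark = {
    30: 900, 31: 700, 32: 500, 33: 400, 34: 300,
    35: 12000, 36: 5000, 37: 5200, 38: 5400,
}

print(find_spikes(students_per_mark))
```

The dip just below 35 and the spike at 35 are two sides of the same pattern, so a fuller check might also flag marks that fall well below their neighbours.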
  45. 45. Communicating data analysis is not enough. This is a chart in a dashboard for a state Health Department. Some users of the dashboard know how to interpret this well. What do you think this dashboard says? Please take a minute and share your responses.
  46. 46. Annotations convert visuals into data stories that are clearer. Heading as a summary: 4 out of the 8 aspirational districts in UP moved up into the moderate performance group since last year. Explanation of the visual: The table below shows all 8 aspirational districts in Uttar Pradesh. “Rank” shows the district’s rank on overall performance. “Monthly” shows the composite index for May ’19. “FY value” shows the YTD value. Highlights of the key points: These four aspirational districts moved from the poor to the moderate performance group. Sonbhadra, Fatehpur, Balrampur, and Chitrakoot have steadily improved performance by ~30% between Dec 2018 and May 2019. They are now up into the moderate performance group. Takeaway with actions for the user: Action: Explore why these 4 districts improved more than the others. Apply the learnings to other aspirational districts.
  47. 47. Insights and storytelling approach. Stage 1, Identify Business Problem: Define the problem statement by understanding: • What is the basic need and desired outcome? • Who will benefit? • What is the impact? • What are the success criteria? Stage 2, Translate to Data Problem: • Break down the problem statement into multiple use cases. • Connect each use case with a data set. • Understand any limitations on data sources, internal and external. Stage 3, Data Answer: Target each use case with data through: • EDA and transformation • Modelling • Generating insights. Stage 4, Translate to Business Answer: • Stitch insights from individual use cases to create a story. • Connect the data story to help in better decision making. • Measure success. Roles listed on the slide: • Sales Rep • Data Consultant • Account Manager • Solution Lead • Analyst Lead • Data Consultant • Account Manager • Solution Architect • Solution Lead • Analyst Lead • Data Consultant • Data Scientist • Solution Architect • Solution Lead • Data Consultant • Account Manager • Solution Lead
  48. 48. 8 super spreaders are responsible for two-thirds of all suspected cases in the district. This visual represents the Covid-19 suspects in a district. Each dot is a suspect: red tested positive, green negative, grey awaiting results and blue not tested yet. Contact tracing for 40 positive cases resulted in ~1,400 suspected cases, an average of 35 contacts per person. The top 8 super spreaders are responsible for two-thirds of all suspected cases. Of these 1,400 suspected cases, test results are awaited for roughly 5% of the cases. Among the declared results, 11% are positive. One positive patient also came in contact with 11 people from another district.
  49. 49. Thank You! Anand Madhav @classicwild /anandmadhav Share feedback at https://bit.ly/3h4BdYc