Big Data – what is it?
• Set of new concepts, practices & technologies to manage & exploit digital data
• OVUM defines it as: “A data computational problem that is large and varied enough to demand new approaches to traditional SQL & related practices”
• Key premise is that all data has potential value if it can be collected, analysed and used to generate actionable insight
Big Data – its characteristics: the 3Vs
Volume
• Reflects exponential growth of data – predicted 40-60% per annum
• Today 2.5 quintillion bytes of data are created every day
• 90% of all digital data was created in the last two years
Variety
• Data generated is more varied and complex than before:
– Text, audio, images, machine generated etc.
• Much of this data is semi-structured or unstructured
• Traditional IT techniques are ill equipped to process & analyse it
Velocity
• Data is often generated in real time
• Analysis and response need to be rapid, often also in real time
• Traditional BI / DW environments are becoming obsolescent – new approaches are needed
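To put the predicted 40-60% per annum growth in perspective, here is a minimal sketch that converts a constant compound growth rate into a doubling time. The function name `doubling_time_years` is illustrative, and the assumption of a constant rate is a simplification, not a claim from the slides.

```python
import math

def doubling_time_years(annual_growth):
    """Years for a quantity to double at a constant compound annual growth rate.

    annual_growth is a fraction, e.g. 0.40 for 40% per annum (assumed constant).
    """
    return math.log(2) / math.log(1 + annual_growth)

if __name__ == "__main__":
    for rate in (0.40, 0.60):
        years = doubling_time_years(rate)
        print(f"{rate:.0%} per annum -> doubles in about {years:.1f} years")
```

Under that assumption, data volumes double roughly every 2.1 years at 40% growth and every 1.5 years at 60%.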
What’s different about Big Data?
• New technologies which enable distributed & highly scalable MPP (Massively Parallel Processing), e.g.
– Apache Hadoop
– MapReduce
– NoSQL databases
• Strong emphasis on analytical approaches
– Emergence of “data science”
– Predictive analytics
– Data mining
• The “democratisation” of data
– Data made available to all (cf. Cloud Computing)
– Business and not IT led BI
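The MapReduce model named above can be illustrated with a single-process word count. This is only a sketch of the programming model, not the Hadoop API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a real framework would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict

def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Group emitted values by key, as a framework would between the two phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce step: combine all values seen for one key."""
    return key, sum(values)

def word_count(documents):
    """Run map, shuffle and reduce sequentially over a list of documents."""
    pairs = []
    for document in documents:
        pairs.extend(map_phase(document))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())

if __name__ == "__main__":
    print(word_count(["big data", "Big problems"]))
```

Because each map call looks at one document and each reduce call at one key, the work can be split across nodes, which is what makes the approach scale.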
Where does Big Data come from?
• Widely known sources
• Social Media & Social Networks
• Machine generated data
Big Data – some vertical applications
• Retail: using point of sale & social media data to supplement & enrich traditional CRM / Marketing data
• Insurance & Banking: fraud detection
• Health: holistic patient analysis
• Utilities: consumption peaks & troughs & capacity planning
• Telcos: call routing optimisation & customer churn
• Manufacturing: predictive fault identification & supply chain optimisation
• Research: particle analysis, genomics etc.
Big Data in practice - Volvo
• Every Volvo vehicle has hundreds of microprocessors / sensors
• Data generated is used within the car itself but also captured for analysis by Volvo and its dealers
• All data is loaded into a centralised data analysis hub & integrated with CRM, dealership & product data
• Used to optimise design & manufacturing, enhance customer interaction & improve safety
Big Data – why invest?
• Better understanding of customer & market behaviour
• Improved knowledge of product & service performance
• Aids innovation in products & services
• Fact based and more rapid decision making
• Enhances revenue
• Reduces costs
• Stimulates economic growth
Big Data – the impact on individuals
• Employees
– Empower & devolve decision making
– Create new job & upskilling opportunities
• Consumers
– Better targeted offers
– Improved products & services that meet needs
Big Data – Foundations of Success
• Identifying the right data to solve the business problem or opportunity
• The ability to integrate & match varied data from multiple data sources
– Structured, semi-structured, unstructured
• Building the right IT infrastructure to support Big Data applications
• Having the right capabilities & skills to exploit the data
Big Data – the data integration challenge
[Diagram: external data sources (social media, sensors, CS data, email, mobiles) and internal data sources (CRM, billing, ops, sales, prods) feed analytics platforms 1 to n, which produce actionable insight & knowledge.]
Big Data – Barriers & Pitfalls
• The sheer volume of data – what’s worth using?
• Data extraction challenges
• The ability to match data from disparate sources / formats / media
• The time taken to integrate new data sources
• The risks of mismatching and incorrect identification of individuals
• Legal & regulatory pitfalls
• Security concerns – corporate & individual
• Lack of skills & expertise
• Making the case for investment
Big Data – the Data Quality Imperative (1)
• Need to profile external and internal data sources
• Need to classify data to define what data really matters
• Need to assure the quality of internal (and some external) data sources for accuracy, completeness, consistency
• Need to define & apply business rules & metadata management to how the data will be defined and used
• Need for a data governance framework to ensure consistency & control
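As one small illustration of profiling a source for completeness, the sketch below reports the share of records with a usable value in each field. The function name `profile_completeness`, the record layout and the sample data are all hypothetical; real profiling tools also check accuracy and consistency, which this does not attempt.

```python
def profile_completeness(records, fields):
    """Return, per field, the fraction of records holding a non-empty value.

    A value counts as missing if the key is absent, the value is None,
    or it is blank once converted to text and stripped.
    """
    total = len(records)
    report = {}
    for field in fields:
        filled = 0
        for record in records:
            value = record.get(field)
            if value is not None and str(value).strip() != "":
                filled += 1
        report[field] = filled / total if total else 0.0
    return report

if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    customers = [
        {"name": "Ann", "email": "ann@example.com"},
        {"name": "Bob", "email": ""},
        {"name": None, "email": "cat@example.com"},
        {"name": "Dee"},
    ]
    print(profile_completeness(customers, ["name", "email"]))
```

A report like this helps decide which sources are fit for use before they are integrated.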
Big Data – the Data Quality Imperative (2)
Need processes & tools to enable:
• Source data profiling
• Data integration
• Data parsing
• Data standardisation
• Business rule creation & management
• Metadata management & a shared business / IT glossary
• Data de-duplication
• Data normalisation
• Data matching
• Data enrichment
• Data audit
Many of these functions must be capable of being carried out in real time with zero lag
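Three of the functions listed above (standardisation, de-duplication and matching) can be sketched in a few lines using only the Python standard library. The function names, the cleaning rules and the 0.9 similarity threshold are assumptions made for illustration; commercial data quality tools use far richer, domain-specific rules.

```python
import re
from difflib import SequenceMatcher

def standardise(name):
    """Lower-case the text, drop punctuation and collapse runs of whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", name.lower())
    return " ".join(cleaned.split())

def deduplicate(names):
    """Keep the first record seen for each standardised form."""
    seen = set()
    unique = []
    for name in names:
        key = standardise(name)
        if key not in seen:
            seen.add(key)
            unique.append(name)
    return unique

def is_match(a, b, threshold=0.9):
    """Fuzzy match on standardised text; the threshold is an arbitrary example value."""
    ratio = SequenceMatcher(None, standardise(a), standardise(b)).ratio()
    return ratio >= threshold

if __name__ == "__main__":
    print(deduplicate(["Acme Ltd.", "ACME  Ltd", "Widget Co"]))
    print(is_match("Jon Smith", "John Smith"))
```

Standardising before comparing is the design point here: exact de-duplication and fuzzy matching both become more reliable once formatting differences are removed.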
Big Data – the key enabler
[Diagram: external and internal data sources pass through a data quality platform (profile, parse, standardise, match, enrich) before reaching analytics platforms 1 to n, which produce actionable insight & knowledge.]
Big Data – some algorithms
1. BIG DATA + POOR DATA QUALITY = BIG PROBLEMS
2. DATA DEMOCRATISATION – DATA GOVERNANCE = ANARCHY
3. DATA MASH UPS – DATA QUALITY = DATA MESS
4. BIG DATA ANALYTICS + POOR DQ = WRONG RESULTS
5. BIG DATA – DATA ASSURANCE = JAIL
6. 3V + DATA QUALITY = 4V (VALIDITY)
Big Data – the future
• To date Big Data has been overhyped but now a tipping point has come
• It is here and will grow in volume, velocity & variety
• Immature concept & market so hard to plan – but consolidation is happening
• Big data in a business context reflects the emerging generation’s expectations & needs
• Data will increasingly be seen as an asset
• Data skills will become increasingly valued
Big Data – how Trillium Software can help
Current Trillium Software products & services can help you succeed in your Big Data journey:
• Real time & batch data capabilities in:
o Data profiling
o Parsing
o Standardisation
o De-duplication
o Matching
o Enrichment
o Audit
• Strategic consulting services to prepare for and realise Big Data opportunities