Big data

  1. 1. Susan Etlinger Susan Etlinger is an industry analyst with Altimeter Group, where she focuses on data and analytics. She conducts independent research and has authored two intriguing reports: “The Social Media ROI Cookbook” and “A Framework for Social Analytics.” She also advises global clients on how to work measurement into their organizational structure and how to extract insights from the social web which can lead to tangible actions. In addition, she works with technology innovators to help them refine their roadmaps and strategies.
  2. 2. What is BIG DATA? ● ‘Big Data’ is similar to ‘Small Data’, but bigger in size ● Bigger data, however, requires different approaches: techniques, tools and architecture ● The aim is to solve new problems, or old problems in a better way ● Big Data generates value from the storage and processing of very large quantities of digital information that cannot be analyzed with traditional computing techniques. ● Walmart handles more than 1 million customer transactions every hour. ● Facebook handles 40 billion photos from its user base. ● Decoding the human genome originally took 10 years to process; now it can be achieved in one week.
  3. 3. So what does this mean for Analytics? Yes, the amount of data that is available to us is exploding, and Big Data platforms and commodity hardware are bringing in additional capabilities. Media is rife with Big Data and Analytics. Big Data and analytics is touted as the panacea for all problems … it makes it to the top of the CIO agenda AND the Data Scientist makes it from nerd to the coolest person!
  4. 4. Impact of Big Data and Analytics – A Data Science perspective
  5. 5. A brief history of Data Science Pre 1800s 1800-1900 1900-1940 1940-1960 1960 1970 1980 1990 2000 2010 ▪ Text/ string search ▪ 1974 Peter Naur “Concise Survey of Computer Methods”, Data Science, Datalogy ▪ Knuth – Art of Computer Programming. ▪ 1976 – SAS Institute ▪ 1977 The International Association for Statistical Computing (IASC). Computer Science Data Technology Visualization Mathematics/ OR Statistics ▪ Probability ▪ Correlation ▪ Bayes Theorem. ▪ Regression, Least Squares ▪ Time Series. ▪ Theoretical Foundations of Modern Stats ▪ Hypothesis, DOE ▪ Mathematical Statistics. ▪ Bayesian Methods ▪ Time Series Methods (Box Cox, Survival, etc.) ▪ Stochastic Methods. ▪ Simulation, Markov ▪ Computational Statistics. ▪ Decision Science ▪ Pattern recognition ▪ Machine learning. ▪ Leibniz – Binary Logic. ▪ Babbage, Lovelace ▪ Boolean Algebra ▪ Punch cards. ▪ Turing machines ▪ Information Theory ▪ Wiener & Cybernetics ▪ Von Neumann Architecture. ▪ Calculus ▪ Logarithms ▪ Newton-Raphson. ▪ 1989 First KDD Workshop ▪ Gregory Piatetsky-Shapiro. ▪ Sort & Search Algorithms – Dijkstra, Kruskal, Shell Sort, … ▪ Heuristics – Simulated Annealing, … ▪ Graph Algorithms ▪ Multigrid methods ▪ Tree based methods. ▪ Database Marketing ▪ Data Mining, Knowledge Discovery ▪ “Data science, classification, and related methods.” ▪ William Cleveland: Data Science ▪ Leo Breiman: Statistical Modeling: 2 Cultures. ▪ Optimization Methods ▪ Fourier and other transforms ▪ Matrix & Generalizations ▪ Non-Euclidean geometries. ▪ Applications to Military, Manufacturing, Communications. ▪ 1962 John W. Tukey, Future of Data Analysis ▪ Networks ▪ Assignment Problems ▪ Automation ▪ Scheduling. ▪ First IBM Computers ▪ DBMS. ▪ Removable Disk drives ▪ Relational DBMS. ▪ Desktop, floppy ▪ SQL, OOP ▪ High level languages. ▪ William Playfair ▪ Charles Minard ▪ Florence Nightingale. ▪ Cartography ▪ Astronomical Charts. ▪ John Tukey ▪ Jacques Bertin. ▪ Edward Tufte. ▪ Grammar of Graphics ▪ Word Cloud, Tag Cloud.
  6. 6. Drivers of change Data Availability Technology Ability to Handle Structured and unstructured data Platform Cost Agility Business Expectation Digital Experience Strategic Initiatives New Business Models
  7. 7. Why Big Data? ● Facebook generates 10 TB of data daily ● Twitter generates 7 TB of data daily ● IBM claims 90% of today’s stored data was generated in just the last two years
  8. 8. How is Big Data Different? 1) Automatically generated by a machine (e.g. a sensor embedded in an engine) 2) Typically an entirely new source of data (e.g. use of the internet) 3) Not designed to be friendly (e.g. text streams) 4) May not have much value ● Need to focus on the important part
  9. 9. Three Characteristics of Big Data (the 3 Vs) 1. Volume (Data Quantity): A Boeing 737 generates 240 terabytes of flight data during a single flight across the US 2. Velocity (Data Speed): Machine-to-machine processes exchange data between billions of devices 3. Variety (Data Types): Big Data isn’t just numbers, dates and strings. It is also geospatial data, 3D data, audio and video, and unstructured text, including log files and social media
  10. 10. The Structure of Big Data ● Structured Most traditional data sources ● Semi-structured Many sources of big data ● Unstructured Video data, audio data
  11. 11. 3 new insights from the video 1 Big Data = Poor Data ● The more data you have, the less probable your chance of discovering meaning -- the "why" of things. 2 To accelerate our demise ● "We are in an age where guided missiles are operated by misguided men" ● Unless we slow down and analyze our needs we will surely accelerate our demise 3 Coding is not in our control ● Big data can be a transformative force, and should be treated as such
  12. 12. Why and how are these insights relevant to managers in India?
  13. 13. 1. Cost Reduction Big data technologies such as Hadoop and cloud-based analytics bring significant cost advantages when it comes to storing large amounts of data – plus they can identify more efficient ways of doing business.
  14. 14. 2. Faster, Better Decision Making With the speed of Hadoop and in-memory analytics, combined with the ability to analyze new sources of data, businesses are able to analyze information immediately – and make decisions based on what they’ve learned.
  15. 15. 3. New Products and Services With the ability to gauge customer needs and satisfaction through analytics comes the power to give customers what they want. Davenport points out that with big data analytics, more companies are creating new products to meet customers’ needs.
  16. 16. Decreased response time Customer experience Information is becoming the new battleground Business expectation
  17. 17. Analytics is playing an ever more important role Increased focus on identifying the customer across all channels Segmentation to micro segmentation to the individual Personalized messaging and offers – Increased individual customer centricity Gradual evolution of Customer Analytics Past ▪ Customer segments who are most likely to respond to targeted campaigns for new product offers ▪ Can tailor offers specific to each customer segment ▪ Mostly delivered through mass mail campaigns and in-store promotions. Now ▪ Micro segmentation ▪ Analyze customer behavior and buying patterns across channels ▪ Delivery through email, web, mass mail campaigns. Moving toward ▪ Historical individual customer behavior and buying patterns across channels ▪ Individual customer consumption pattern ▪ In-store basket analytics ▪ Additional dimensions: location & time ▪ Targeted strategies to pre-empt customers from visiting the competition ▪ Instantaneous delivery in store or a proactive delivery via mobile to bring the customer to the store. Segment to Individual to Individual @ time, place and behavior “You have purchased Cheese, here are the offers on Bagels” “You are within 2 km of a store offering 50% off garden furniture” “Do you need coffee?”
  18. 18. much of which is outside the organization Increased availability of data Analytics as a Service and Data Monetization New service models Decreasing Time value of data!
  19. 19. Scalability and industrialization to address skill shortage Key to a Great Data Scientist Technical skills (Coding, Statistics, Math) + Perseverance +Creativity + Intuition +Presentation Skills +Business Savvy = Great Data Scientist! ▪ Identified four Data Scientist clusters based on how data scientists think about themselves and their work, not • Years of experience, • Academic degrees, favorite tools • Titles, pay scales, org charts. ▪ Most successful data scientists are those with substantial, deep expertise in at least one aspect of data science, be it statistics, big data, or business communication ▪ T-Shaped Skills.
  20. 20. Presented By Prince Barai Data Analytics Intern at IIM, Lucknow
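
The individual-level offers on slide 17 (a basket-triggered offer and a "within 2 km of a store" offer) can be illustrated with a small sketch. This is only one possible reading of the slide, not the presenter's method: the cross-sell rule, the store records, and the function names `haversine_km` and `offers_for` are hypothetical, and distance is approximated with the standard haversine formula.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

# Hypothetical cross-sell rule, taken from the slide's cheese/bagels example.
CROSS_SELL = {"cheese": "bagels"}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def offers_for(customer_lat, customer_lon, basket, stores, radius_km=2.0):
    """Build offer messages from basket contents and nearby stores.

    `stores` is a list of dicts with hypothetical keys:
    name, lat, lon, promo.
    """
    messages = []
    # Basket-triggered offers: item bought -> paired item promoted.
    for item in basket:
        paired = CROSS_SELL.get(item.lower())
        if paired:
            messages.append(
                f"You have purchased {item}, here are the offers on {paired}"
            )
    # Location-triggered offers: only stores inside the radius qualify.
    for store in stores:
        distance = haversine_km(customer_lat, customer_lon,
                                store["lat"], store["lon"])
        if distance <= radius_km:
            messages.append(
                f"You are within {radius_km:g} km of {store['name']}: "
                f"{store['promo']}"
            )
    return messages


if __name__ == "__main__":
    # Made-up store locations, for illustration only.
    demo_stores = [
        {"name": "Store A", "lat": 26.85, "lon": 80.95,
         "promo": "50% off garden furniture"},
        {"name": "Store B", "lat": 27.00, "lon": 80.95,
         "promo": "free coffee"},
    ]
    for message in offers_for(26.86, 80.95, ["Cheese"], demo_stores):
        print(message)
```

A production system would add the time dimension and individual purchase history the slide mentions; the sketch covers only the two triggers quoted there.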