Big Data in small words

  • 1. ADELTECH: BIG DATA RELOADED
       Big Data - A perspective
  • 2. What is it?
       - Data that exceeds the storing and processing capacity of conventional database systems

         Traditional paradigm                Big Data paradigm
         ----------------------------------  ------------------------------------------------
         Structured data (usually tables)    Unstructured or semi-structured data
         Relational DB                       Hadoop, Cassandra or another appropriate system
         Analysis & reporting                Models & insights
         Answers - What happened? Why?       What might happen? How do I react/evolve?

       - Big data has always been around. What changed recently to reshape the ecosystem:
           - Inexpensive commodity hardware
           - MapReduce paradigm and open-source software, to divide complex problems into small chunks which can be run on this hardware
           - Cloud architecture, accessible to everyone
       © Adeltech 2012
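The MapReduce bullet on this slide describes dividing a complex problem into small chunks that can each run on commodity hardware. A minimal single-process sketch of that idea follows, using word counting as the example problem. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample text are illustrative assumptions, not part of the original deck or of Hadoop's API; a real framework would distribute the chunks across machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: combine the values of each key into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    # On a cluster each chunk would be mapped on a different machine;
    # in this sketch the chunks are processed one after another.
    mapped = chain.from_iterable(map_phase(chunk) for chunk in chunks)
    return reduce_phase(shuffle(mapped))


if __name__ == "__main__":
    # Hypothetical input, already split into two chunks.
    print(word_count(["big data is big", "data about data"]))
```

Because each call to `map_phase` only looks at its own chunk, the map work can be spread over as many inexpensive machines as there are chunks; only the shuffle step needs to bring results together.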
  • 3. How do we define it?
       - Three terms describe how "big data" differs from "data" (it usually has one or more of these attributes):
           - Volume: log data from systems (ERP, CRM), sensors or social networks
           - Velocity: transactions on a worldwide financial network
           - Variety: a pharma company analyzing medical records, claims data, FDA data, etc. together
  • 4. What is it NOT?
       - IT IS NOT a magic bullet for all your data issues
       - IT IS NOT a million-dollar system-wide framework: "Think big, start small". Though most software vendors like IBM, SAP and Google have their own expensive big data offerings, a POC can be started using simple open-source solutions
       - IT IS NOT a new-fangled technology needed only by huge corporations which deal with TBs and PBs of data, like NASA, Google or Facebook. It can be used by most companies, irrespective of domain or market size, to gain more insights from existing data (unknown unknowns) or to combine more sources of data to be analyzed together (multiple data sets might throw up relationships and correlations when combined)
       - IT IS NOT a single technology that needs to be installed on expensive customized servers. It's a paradigm that uses innovative algorithms on off-the-shelf commodity hardware
       - IT IS NOT a one-size-fits-all solution for large amounts of data. "Data scientists" need to look at use cases and apply domain expertise to figure out the algorithms and technology that can be used for a particular problem
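The third point on this slide notes that multiple data sets might reveal relationships and correlations once they are combined. The sketch below illustrates that with two small, entirely made-up data sets keyed by region (`rainfall_mm` and `claims`, both hypothetical): they are joined on the shared key and a Pearson correlation is computed over the matched rows. It is a toy example under those assumptions, not a method taken from the deck.

```python
from math import sqrt


def join_on_key(left, right):
    """Inner join: keep only keys present in both data sets."""
    return {key: (left[key], right[key]) for key in left if key in right}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data sets from two different sources.
rainfall_mm = {"north": 10, "south": 20, "east": 30, "west": 40}
claims = {"north": 3, "south": 5, "east": 7, "central": 9}

joined = join_on_key(rainfall_mm, claims)
xs = [pair[0] for pair in joined.values()]
ys = [pair[1] for pair in joined.values()]

if __name__ == "__main__":
    print(sorted(joined), pearson(xs, ys))
```

Neither data set says anything about the other on its own; the relationship only becomes measurable after the join, and only for the keys both sources share.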
  • 5. Typical Applications
       - Fraud and money laundering detection: Since the worldwide web of money transfers spans geographies and involves numerous people, banks and financial intermediaries, it is nearly impossible to track the transfers and derive insights using conventional tools
       - Marketing: A natural use case for big data technology, analyzing target populations in terms of gender, geography, socioeconomic factors and a host of other factors, some of which might not be directly apparent
       - Science and technology: Research has become extremely data-intensive in the last few years. The LHC at CERN produces 13 TB of data every day, most of which is discarded because it can't be analyzed at that rate by existing technology. Similarly, NASA's Hubble telescope and ground-based radio telescopes churn out data faster than it can be stored or processed. Big data can help make sense of all this.
       - Service industries, like airlines and mobile telephony: To keep track of consumer behavior and derive business intelligence so that marketing dollars can be focused in the right direction
       - Hiring: The hiring boss for rank-and-file jobs is now an algorithm at many companies, such as Xerox (using Evolv), IBM (using Kenexa) and Oracle
  • 6. The Process
       - Drive-train approach for Big Data projects
           - Define objective: in concrete terms
           - Identify data sources (levers): be creative here
           - Collect and clean data: technology play
           - Create models (iterate): maths, science and business knowledge
           - Iterate till the desired result is achieved
       ** Image from Big Data Now - 2012 Strata Conf.
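The steps on this slide can be sketched as a small loop: a concrete objective (a target error), a cleaning step, a model, and iteration until the objective is met. Everything specific here is an assumption chosen for illustration: the one-parameter model `y ~ slope * x`, the gradient-descent fit, the function names and the sample records are hypothetical and do not come from the deck.

```python
def clean(records):
    """Collect-and-clean step: drop rows that have a missing value."""
    return [(x, y) for x, y in records if x is not None and y is not None]


def fit_slope(data, steps, lr=0.01):
    """Model step: fit y ~ slope * x with a fixed number of gradient steps."""
    slope = 0.0
    for _ in range(steps):
        grad = sum(2 * (slope * x - y) * x for x, y in data) / len(data)
        slope -= lr * grad
    return slope


def mean_abs_error(data, slope):
    """Objective, in concrete terms: average absolute prediction error."""
    return sum(abs(slope * x - y) for x, y in data) / len(data)


def run_project(records, target_error, max_rounds=20):
    """Iterate, spending more fitting effort each round, until the objective is met."""
    data = clean(records)
    for round_no in range(1, max_rounds + 1):
        slope = fit_slope(data, steps=round_no * 10)
        error = mean_abs_error(data, slope)
        if error <= target_error:
            return slope, error, round_no
    raise RuntimeError("objective not reached; revisit data sources or model")


if __name__ == "__main__":
    # Hypothetical raw records, two of them incomplete.
    raw = [(1, 2.0), (2, 4.1), (None, 3.0), (3, 5.9), (4, None)]
    print(run_project(raw, target_error=0.1))
```

The `RuntimeError` branch stands in for the case where iterating on the model alone is not enough and the earlier steps (objective, data sources) need another look.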
  • 7. The Engagement Model
       - "Think Big - Start Small"
       - Start with a POC
           - using open-source software and small amounts of data (representative sampling should be done thoughtfully)
       - Apply algorithms to gain insights
       - Scale the models: test and implement
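The slide asks for representative sampling when a POC works on a small amount of data. One common way to do that, shown here only as a sketch, is stratified sampling: take the same fraction from every segment so small segments are not lost. The `segment` field, the segment names and the 10% fraction are hypothetical; the deck does not prescribe a particular sampling method.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, fraction, seed=0):
    """Sample the same fraction from every stratum, keeping at least one row each."""
    rng = random.Random(seed)  # fixed seed so the POC sample is reproducible
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)
    sample = []
    for group in strata.values():
        size = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, size))
    return sample


# Hypothetical customer records: 90 retail and 10 corporate.
customers = [{"id": i, "segment": "retail"} for i in range(90)]
customers += [{"id": 90 + i, "segment": "corporate"} for i in range(10)]

poc_sample = stratified_sample(customers, key=lambda r: r["segment"], fraction=0.1)

if __name__ == "__main__":
    print(len(poc_sample), "records in the POC sample")
```

A plain random 10% draw could easily contain no corporate customers at all; sampling per segment keeps the proportions of the full data set in the small one.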
  • 8. THANKS FOR YOUR TIME
       Visit www.adeltech.com for more details