Big data (4Vs, history, concept, algorithm) analysis and applications #bigdata #analysis #data #dataanalysis #Mapreduction

Aug 18, 2016

  1. Big Data. By: Yash Bheda (1524008), Janhavi Jaltare (1524011), Krisha Udani (), Binal Savla (1524003)
  2. Table of Contents. Topics: History of Big Data; Big Data Architecture for Network; Network Analysis Algorithm; Big Data Network Analysing; Network Application; Summary
  3. 1.0: History of Big Data
      - Big data is a relative term: it applies once an organization's data must be stored and managed in a way that supports timely decision making.
      - Initially: employee-generated data; single processor.
      - Modern times: user-generated data; parallel processing (multiple processors using servers).
      - Recently: system-generated data.
  4. Contents
      - Big data generated by users and systems is mostly unstructured.
      - Traditional data: Documents, Finances, Stock Recording, Personnel Files.
      - Big data: Photographs, Audio and Videos, 3D Models, Simulation, Location Data.
  5. Big Data
      - Big data represents the way this information is analysed to help open up opportunities.
      - A structure is needed to parse the data, separate out the unwanted, and find the useful threads that uncover opportunities.
      - Input information -> New processing techniques -> Better results
  6. Management approach
      - Traditionally: Data input -> Storing -> Analysing
      - Modern: Data input -> Analysing -> Storing
  7. 4 V's of Big Data
      - Volume: the vast amounts of data generated every second.
      - Velocity: the speed at which new data is generated and moves around.
      - Variability: inconsistent data flow with periodic peaks, along with the messiness or trustworthiness of the data (often listed separately as veracity).
      - Variety: the different types of data we can now use.
  8. Variety of data
  9. Big Data Classification. Why classify?
      - Complex situations
      - 4 Vs
      - Results
  10. From classifying big data to choosing a big data solution
      - Defining a logical architecture
      - Understanding atomic patterns for big data solutions
      - Understanding composite patterns to use for big data solutions
      - Choosing a solution pattern for a big data solution
      - Determining the viability of a business problem for a big data solution
      - Selecting the right products to implement a big data solution
  11. Parallel processing
  12. Mappers and Reducers
      - A MapReduce job = Map function (input -> key-value pairs) + Reduce function (key and list of values -> output).
      - Map(): a procedure (method) that performs filtering and sorting.
      - Reduce(): a method that performs a summary operation.
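
To make the "Map function + Reduce function" decomposition concrete, here is a minimal in-memory sketch in Python. It is not a real MapReduce framework: `run_map_reduce`, `count_map`, and `count_reduce` are hypothetical names chosen for illustration, and the word-counting job is an assumed example that does not appear in the slides.

```python
from collections import defaultdict


def run_map_reduce(records, map_fn, reduce_fn):
    """Run map, group-by-key, and reduce in memory (illustrative only)."""
    groups = defaultdict(list)
    # Map phase: every input record may emit any number of (key, value) pairs.
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    # Reduce phase: each key is handled once, together with its list of values.
    output = []
    for key in sorted(groups):
        output.extend(reduce_fn(key, groups[key]))
    return output


def count_map(word):
    # Assumed example job: emit (word, 1) for every input word.
    yield (word, 1)


def count_reduce(word, ones):
    # Summary operation: add up the 1s collected for this word.
    yield (word, sum(ones))


word_counts = run_map_reduce(["data", "big", "data"], count_map, count_reduce)
print(word_counts)  # [('big', 1), ('data', 2)]
```

In a real cluster the map calls and the per-key reduce calls would run on different machines; the dictionary here only stands in for that distribution step.
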
  13. Natural Join: Mapping
      - The join of R(A,B) with S(B,C) is the set of tuples (a,b,c) such that (a,b) is in R and (b,c) is in S.
      - Mappers need to send R(a,b) and S(b,c) to the same reducer, so they can be joined there.
      - Mapper output: key = B-value, value = relation and other component (A or C).
      - Example: R(1,2) -> (2,(R,1)); S(2,3) -> (2,(S,3))
  14. Mapping Tuples
      - Mapper for R(1,2): R(1,2) -> (2,(R,1))
      - Mapper for R(4,2): R(4,2) -> (2,(R,4))
      - Mapper for S(2,3): S(2,3) -> (2,(S,3))
      - Mapper for S(5,6): S(5,6) -> (5,(S,6))
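
The mapper output format described for the join (key = B-value, value = relation name plus the other component) could be sketched in Python as follows. The function name `map_join_tuple` and the tuple encoding of the inputs are assumptions made for this illustration; the four tuples are the ones from the slides.

```python
def map_join_tuple(relation, tup):
    """Emit (B-value, (relation name, other component)) for one input tuple."""
    if relation == "R":
        a, b = tup           # R(A, B): B is the second component
        return (b, ("R", a))
    if relation == "S":
        b, c = tup           # S(B, C): B is the first component
        return (b, ("S", c))
    raise ValueError(f"unknown relation: {relation!r}")


inputs = [("R", (1, 2)), ("R", (4, 2)), ("S", (2, 3)), ("S", (5, 6))]
mapped = [map_join_tuple(relation, tup) for relation, tup in inputs]
print(mapped)
# [(2, ('R', 1)), (2, ('R', 4)), (2, ('S', 3)), (5, ('S', 6))]
```

Keying on the B-value is what guarantees that matching R and S tuples meet at the same reducer.
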
  15. Grouping Phase
      - There is a reducer for each key.
      - Every key-value pair generated by any mapper is sent to the reducer for its key.
  16. Mapping Tuples
      - Mapper for R(1,2) -> (2,(R,1)) -> Reducer for B=2
      - Mapper for R(4,2) -> (2,(R,4)) -> Reducer for B=2
      - Mapper for S(2,3) -> (2,(S,3)) -> Reducer for B=2
      - Mapper for S(5,6) -> (5,(S,6)) -> Reducer for B=5
  17. Constructing Value-list
      - The input to each reducer is organized by the system into a pair:
      - The key.
      - The list of values associated with that key.
  18. The Value-List Format
      - (2,[(R,1), (R,4), (S,3)]) -> Reducer for B=2
      - (5,[(S,6)]) -> Reducer for B=5
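
The grouping phase that builds these (key, value-list) pairs can be sketched in a few lines of Python. This is an in-memory stand-in for what the system does between mappers and reducers; `group_by_key` is a hypothetical helper name, and the input pairs are the mapper outputs from the slides.

```python
from collections import defaultdict


def group_by_key(pairs):
    """Collect every value emitted under the same key into one list."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


mapper_output = [(2, ("R", 1)), (2, ("R", 4)), (2, ("S", 3)), (5, ("S", 6))]
reducer_input = group_by_key(mapper_output)
print(reducer_input)
# {2: [('R', 1), ('R', 4), ('S', 3)], 5: [('S', 6)]}
```

Each dictionary entry corresponds to the input of one reducer: the key and the list of values associated with it.
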
  19. The Reduce Function for Join
      - Given key b and a list of values that are either (R, a_i) or (S, c_j), output each triple (a_i, b, c_j).
      - Thus, the number of outputs made by a reducer is the product of the number of R's on the list and the number of S's on the list.
  20. Output of the Reducers
      - (2,[(R,1), (R,4), (S,3)]) -> Reducer for B=2 -> (1,2,3), (4,2,3)
      - (5,[(S,6)]) -> Reducer for B=5 -> no output (the list contains no R entries)
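
A reduce function for the join, as described above, might look like the following Python sketch. The name `reduce_join` and the dictionary used to hold the reducer inputs are assumptions for illustration; the data is taken from the slides.

```python
def reduce_join(b, values):
    """Pair every R entry with every S entry that shares the key b."""
    r_side = [a for tag, a in values if tag == "R"]
    s_side = [c for tag, c in values if tag == "S"]
    # One output triple (a, b, c) per combination of an R entry and an S entry.
    return [(a, b, c) for a in r_side for c in s_side]


reducer_input = {2: [("R", 1), ("R", 4), ("S", 3)], 5: [("S", 6)]}
joined = {b: reduce_join(b, values) for b, values in reducer_input.items()}
print(joined)
# {2: [(1, 2, 3), (4, 2, 3)], 5: []}
```

For B=2 the list holds two R entries and one S entry, so the reducer emits 2 x 1 = 2 triples. For B=5 there are no R entries, so the product is 0 and nothing is emitted.
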
  21. Network Resources Related to Big Data
      The network's capability to absorb and transfer big data traffic is made up of six elements:
      1. Bandwidth
      2. Network delay
      3. Security
      4. Data delivery accuracy
      5. Availability
      6. Resiliency
  22. Network Monitoring of Big Data
      - Most monitoring systems deal with major changes, failures, configuration data, and traffic reporting.
      - The monitoring function itself is a producer of big data. Therefore, the network data needs to be analyzed with big data applications.
      - Traffic trends, where applications are located, what caused the traffic, and what network resources are available to effectively carry the traffic are all part of the network big data information.
  23. Network Monitoring Strategies
      - Ensure that your monitoring tools collect the network information with enough granularity to produce detailed statistical representations.
      - You will need a dashboard that continuously provides alerts and alarms when traffic changes occur that are outside acceptable limits.
      - Create short-term reports rapidly so that traffic changes that could impair the network operation can be discovered as soon as possible.
      - If a cloud service is employed, do you have the traffic data from the cloud delivered in real time so you can make decisions before a problem worsens?
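
One simple way to decide when a traffic change is "outside acceptable limits" is to compare new samples against a band around a baseline. The Python sketch below assumes a band of the baseline mean plus or minus a multiple of its standard deviation; the function name `find_traffic_alerts`, the `tolerance` parameter, and the sample figures are all hypothetical and not taken from the slides.

```python
from statistics import mean, stdev


def find_traffic_alerts(baseline, samples, tolerance=3.0):
    """Return (index, value) for samples outside mean +/- tolerance * stdev."""
    center = mean(baseline)
    spread = stdev(baseline)
    low = center - tolerance * spread
    high = center + tolerance * spread
    return [(i, v) for i, v in enumerate(samples) if not low <= v <= high]


baseline_mbps = [100, 102, 98, 101, 99]   # assumed "normal" traffic readings
alerts = find_traffic_alerts(baseline_mbps, [100, 180, 97, 20])
print(alerts)  # [(1, 180), (3, 20)]
```

A production dashboard would likely use rolling windows and per-link baselines; the fixed baseline here only keeps the idea visible.
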
  24. Benefits of Big Data Network Monitoring
      1. Load balancing
      2. Data filtering
      3. Real-time data analysis
      4. Managing virtual resources
  25. Big Data Impact
  26. Network Applications
      - Big data for network design
      - Big data for network management
      - Big data for network resource optimization
      - Big data for network security and privacy
      - Big data for network economics and pricing
      - Big data for network performance evaluation
      - Parallel and distributed algorithms for big data
  27. Online services
      - Netflix compares variants of its show banners and shows each customer the one that appeals to them.
  28. Targeted marketing and advertising
      - Using tracking cookies, Facebook can collect information about each website you visit.
      - It is possible to accurately predict a range of highly sensitive personal attributes simply by analysing a user's 'Likes'.
  29. Network Security & Big Data
      - Software-Defined Networking (SDN)-based controllers and big data analytics within and about the data network.
      - Network security attacks and potential risks are analysed immediately, which helps prevent security breaches.
      - E.g., behavior analysis software to prevent the misuse of crucial data.
  30. Implementation
      - Network partitioning is crucial in setting up big data environments.
      - Partitioning keeps heavy demands from applications from impacting other mission-critical workloads.
      - Prepare now for big data scalability later.
      - Yahoo is running more than 42,000 nodes in its big data environment; in 2013, the average number of nodes in a big data cluster was just over 100.
  31. Summary
      - Big data enables better analysis and market prediction.
      - It helps develop better logistics and accuracy in systems and reduces redundancy.
      - The characteristic 4 V's support the management and utilization of massive data.

Editor's Notes

  1. It's the information owned by a company, obtained and processed through new techniques to produce value in the best way possible.
  2. A problem is broken down into parts that can be solved concurrently. Each part is further broken down into instructions. Instructions execute simultaneously over multiple processors.
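
The idea in note 2 (split a problem into parts, solve the parts concurrently, then combine) can be sketched with Python's standard `concurrent.futures` module. The helper names `split_into_parts` and `parallel_sum` and the summation task are assumptions chosen for illustration. Threads are used only to keep the sketch self-contained; for CPU-bound work in CPython, `ProcessPoolExecutor` would be the more realistic choice.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_parts(items, n_parts):
    """Break the input into at most n_parts contiguous chunks."""
    size = max(1, -(-len(items) // n_parts))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def parallel_sum(items, n_workers=4):
    """Sum each chunk in a worker, then combine the partial results."""
    parts = split_into_parts(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_sums = list(pool.map(sum, parts))
    return sum(partial_sums)


total = parallel_sum(list(range(1, 101)))
print(total)  # 5050
```

The final `sum(partial_sums)` plays the same role as a reduce step: it combines the independently computed parts into one result.
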