Introduction to Map/Reduce Data Transformations
Tasso Argyros, CTO and Co-Founder, Aster Data Systems
[email_address]
A Brief History of MapReduce
Confidential and proprietary. Copyright © 2008 Aster Data Systems
What is MapReduce?
Why is MapReduce Useful?
Server A, Server B, Server C, Server D (connected through a switch) hold these documents:
The quick brown fox jumps over the lazy dog.
To be or not to be: that is the question.
The world only needs five computers.
Hello world.
In-Database MapReduce is the future.
MapReduce is a very powerful programming paradigm.
Goal: We want to count the number of times each word occurs.
1st Approach: No MapReduce
Server A, Server B, Server C, Server D (connected through a switch)
Original documents:
The quick brown fox jumps over the lazy dog
To be or not to be: that is the question.
The world only needs five computers.
Hello world.
In-Database MapReduce is the future.
MapReduce is a very powerful concept.
Normalized text (lowercase, punctuation removed):
the quick brown fox jumps over the lazy dog
in database mapreduce is the future
the world only needs five computers
hello world
mapreduce is a very powerful concept
to be or not to be that is the question
Server 4 Final Result File:
the 5
is 3
mapreduce 2
…
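A minimal Python sketch of this first approach, assuming it means bringing every document to a single server and counting there (the slide's own explanation did not survive extraction). The names `DOCUMENTS`, `normalize`, and `count_on_one_server` are illustrative and do not come from the slides.

```python
import re
from collections import Counter

# The six example documents from the slides, gathered in one place.
DOCUMENTS = [
    "The quick brown fox jumps over the lazy dog.",
    "To be or not to be: that is the question.",
    "The world only needs five computers.",
    "Hello world.",
    "In-Database MapReduce is the future.",
    "MapReduce is a very powerful concept.",
]

def normalize(text):
    """Lowercase the text and split it into runs of letters (drops punctuation)."""
    return re.findall(r"[a-z]+", text.lower())

def count_on_one_server(documents):
    """Count every word on a single machine, one document at a time."""
    counts = Counter()
    for doc in documents:
        counts.update(normalize(doc))
    return counts

counts = count_on_one_server(DOCUMENTS)
print(counts["the"], counts["is"], counts["mapreduce"])  # prints: 5 3 2
```

This works for six sentences, but all the data and all the counting land on one machine, which is the limitation the next approach tries to remove.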
What Did We Do?
2nd Approach: No MapReduce, Fully Distributed
Server A, Server B, Server C, Server D (connected through a switch)
Original documents:
The quick brown fox jumps over the lazy dog
To be or not to be: that is the question.
The world only needs five computers.
Hello world.
In-Database MapReduce is the future.
MapReduce is a very powerful concept.
Normalized text (lowercase, punctuation removed):
the quick brown fox jumps over the lazy dog
in database mapreduce is the future
the world only needs five computers
hello world
mapreduce is a very powerful concept
to be or not to be that is the question
Words after redistribution (every occurrence of a word lands on the same server):
the the the the the database database future
world world powerful lazy brown
mapreduce mapreduce be be to jumps computers hello
is is is question over a that
Server 1 Final Result File: the 5, …
Server 2 Final Result File: world 2, …
Server 3 Final Result File: mapreduce 2, …
Server 4 Final Result File: is 3, …
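The hand-built distributed approach can be sketched in Python as follows. This is a single-process simulation under stated assumptions: the placement of documents in `LOCAL_DOCS` and the hash-based `owner` rule are hypothetical (the slides do not say how words are assigned to servers), and `zlib.crc32` is used only because it gives the same answer on every run.

```python
import re
import zlib
from collections import Counter

NUM_SERVERS = 4

# Hypothetical placement of the example documents on four servers.
LOCAL_DOCS = {
    0: ["The quick brown fox jumps over the lazy dog."],
    1: ["To be or not to be: that is the question."],
    2: ["The world only needs five computers.", "Hello world."],
    3: ["In-Database MapReduce is the future.",
        "MapReduce is a very powerful concept."],
}

def owner(word):
    """Pick the server responsible for a word (assumed rule: stable hash mod N)."""
    return zlib.crc32(word.encode("utf-8")) % NUM_SERVERS

def redistribute(local_docs):
    """Simulate each server sending every one of its words to that word's owner."""
    inbox = {server: [] for server in range(NUM_SERVERS)}
    for docs in local_docs.values():
        for doc in docs:
            for word in re.findall(r"[a-z]+", doc.lower()):
                inbox[owner(word)].append(word)
    return inbox

def count_locally(inbox):
    """Each server counts only the words it received."""
    return {server: Counter(words) for server, words in inbox.items()}

results = count_locally(redistribute(LOCAL_DOCS))
for server, counts in sorted(results.items()):
    print(server, dict(counts))
```

Because all copies of a word go to the same server, each per-server count is already final. In a real deployment the splitting, network transfer, failure handling, and result collection would all have to be written by hand.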
2nd Approach: No MapReduce, Distributed
Does it work? Yes
Is it a pain? Yes!!
Does it take lots of time? Yes!
Would you do it? No!!!
Moreover…
Map()
Input: Any file (e.g. documents)
Output: Stream of <key, value> pairs (e.g. <word, count> pairs)
Data Redistribution and Grouping
Reduce()
Input: All <key, value> pairs with the same key, grouped (e.g. all <word, count> pairs where word = “the”)
Output: Anything (e.g. sum of counts for a specific word)
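For word count, the two functions described above might look like this in Python. This is a sketch: the names `map_fn` and `reduce_fn` and the regular-expression tokenizer are illustrative choices, not code from the slides.

```python
import re

def map_fn(document):
    """Map(): take one document and emit a stream of <word, 1> pairs."""
    for word in re.findall(r"[a-z]+", document.lower()):
        yield (word, 1)

def reduce_fn(key, values):
    """Reduce(): take one key with all of its grouped values and emit the total."""
    return (key, sum(values))

pairs = list(map_fn("The quick brown fox jumps over the lazy dog"))
print(pairs[:3])                          # [('the', 1), ('quick', 1), ('brown', 1)]
print(reduce_fn("the", [1, 1, 1, 1, 1]))  # ('the', 5)
```

Neither function knows anything about servers, networks, or failures; moving the pairs between Map() and Reduce() is left to the framework.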
Map() and Redistribution Phase
Server A, Server B, Server C, Server D (connected through a switch)
Map() input: The quick brown fox jumps over the lazy dog
Map() output: <the,1> <quick,1> <brown,1> <fox,1> <jumps,1> <over,1> <the,1> <lazy,1> <dog,1>
Map() input: In-Database MapReduce is the future.
Map() output: <in,1> <database,1> <mapreduce,1> <is,1> <the,1> <future,1>
Pairs after redistribution through the switch (all pairs with the same key on the same server):
<the,1> <the,1> <the,1> <the,1> <the,1> <database,1> <database,1> <future,1>
<world,1> <world,1> <powerful,1> <lazy,1> <brown,1>
<mapreduce,1> <mapreduce,1> <be,1> <be,1> <to,1> <jumps,1> <computers,1> <hello,1>
<is,1> <is,1> <is,1> <question,1> <over,1> <a,1> <that,1>
Grouping and Reduce() Phase (on Server 1)
Pairs received: <the,1> <the,1> <the,1> <the,1> <the,1> <database,1> <database,1> <future,1>
Grouped by key, one Reduce() call per key: the, database, future
Server 1 Final Result File:
the 5
database 2
future 1
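The grouping and Reduce() step on Server 1 can be sketched in Python using exactly the pairs shown on the slide. The helper names `group_by_key` and `reduce_fn` are illustrative; a real framework performs the grouping itself.

```python
from collections import defaultdict

# The <key, value> pairs that arrive at Server 1 on the slide.
received = [("the", 1)] * 5 + [("database", 1)] * 2 + [("future", 1)]

def group_by_key(pairs):
    """Collect all values that share a key into one list per key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_fn(key, values):
    """Reduce(): sum the counts for one word."""
    return key, sum(values)

# One Reduce() call per distinct key.
result_file = dict(reduce_fn(k, v) for k, v in group_by_key(received).items())
print(result_file)  # {'the': 5, 'database': 2, 'future': 1}
```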
What Just Happened?
Word Count was Only an Example!
“The indexing code is simpler, smaller, and easier to understand, because the code that deals with fault tolerance, distribution and parallelization is hidden within the MapReduce library. For example, the size of one phase of the computation dropped from approximately 3,800 lines of C++ code to approximately 700 lines when expressed using MapReduce.”
(Google 2004 MapReduce paper)
Word Count was Only an Example!
“We adapt Google’s MapReduce paradigm to demonstrate this parallel speed up technique on a variety of learning algorithms including locally weighted linear regression (LWLR), k-means, logistic regression (LR), naive Bayes (NB), SVM, ICA, PCA, gaussian discriminant analysis (GDA), EM, and backpropagation (NN).”
(Stanford 2006 AI Lab paper)
Result?
But…
Beyond SQL and MapReduce
SQL vs MapReduce: Two different worlds?
Implementing MR in the Database
The SQL/MR Process
SQL/MR Function: Syntax
(1) Source table or sub-select
(2) <key> for data redistribution
(3) Sort before the MR function
(4) Java/Python/… MR function
(5) Select output (e.g. count)
Optional MR_Function arguments
Optional conditions & filters
Example 1: Tokenization
Example 2: Sessionization
Example 2: Sessionization
Session Timeout = 60 seconds

INPUT (Clickstream)
timestamp  userid
10:00:00   Shawn1
00:58:24   PrezBush
10:00:24   Shawn1
02:30:33   PrezBush
10:01:23   Shawn1
10:02:40   Shawn1

OUTPUT
timestamp  userid    sessionid
10:00:00   Shawn1    0
10:00:24   Shawn1    0
10:01:23   Shawn1    0
10:02:40   Shawn1    1

timestamp  userid    sessionid
00:58:24   PrezBush  0
02:30:33   PrezBush  1
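A minimal Python sketch of the sessionization logic that reproduces the output table above. It assumes that a new session starts when the gap since the user's previous click is strictly greater than the timeout (the slide data does not show a gap of exactly 60 seconds, so the boundary case is a guess), and that all timestamps fall within one day. The names `CLICKSTREAM`, `to_seconds`, and `sessionize` are illustrative.

```python
from collections import defaultdict
from datetime import datetime

SESSION_TIMEOUT_SECONDS = 60

# Input rows from the slide: (timestamp, userid).
CLICKSTREAM = [
    ("10:00:00", "Shawn1"),
    ("00:58:24", "PrezBush"),
    ("10:00:24", "Shawn1"),
    ("02:30:33", "PrezBush"),
    ("10:01:23", "Shawn1"),
    ("10:02:40", "Shawn1"),
]

def to_seconds(timestamp):
    """Convert HH:MM:SS to seconds since midnight."""
    t = datetime.strptime(timestamp, "%H:%M:%S")
    return t.hour * 3600 + t.minute * 60 + t.second

def sessionize(clicks, timeout=SESSION_TIMEOUT_SECONDS):
    """Return (timestamp, userid, sessionid) rows, numbering sessions per user from 0."""
    # Redistribute: group clicks by userid, the key.
    by_user = defaultdict(list)
    for timestamp, userid in clicks:
        by_user[userid].append(timestamp)

    rows = []
    for userid, timestamps in by_user.items():
        session_id = 0
        previous = None
        # Sort each user's clicks by time before scanning them.
        for timestamp in sorted(timestamps, key=to_seconds):
            current = to_seconds(timestamp)
            # Assumption: a gap strictly longer than the timeout opens a new session.
            if previous is not None and current - previous > timeout:
                session_id += 1
            previous = current
            rows.append((timestamp, userid, session_id))
    return rows

for row in sessionize(CLICKSTREAM):
    print(row)
```

Grouping by `userid` plays the role of the redistribution key, and sorting by timestamp is the per-key ordering applied before the function runs. The per-user scan is then a simple loop, which is awkward to express in plain SQL.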
MR Applications in the Database
Summary
[email_address] (Questions, Comments)
asterdata.com/blog (Lots of technical details)
1.888.Aster.Data (Any other information)

Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 

Último (20)

Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 

Introduction to MapReduce Data Transformations

  • 1. Introduction to Map/Reduce Data Transformations. Tasso Argyros, CTO and Co-Founder, Aster Data Systems [email_address]
  • 2. A Brief History of MapReduce Confidential and proprietary. Copyright © 2008 Aster Data Systems
  • 3. What is MapReduce?
  • 4. Why is MapReduce Useful?
  • 5. Documents held on Server A, Server B, Server C and Server D, which are connected through a Switch: "The quick brown fox jumps over the lazy dog." "To be or not to be: that is the question." "The world only needs five computers." "Hello world." "In-Database MapReduce is the future." "MapReduce is a very powerful programming paradigm."
  • 6. Goal: We Want to Count the # of Times Each Word Occurs
  • 7. 1st Approach: No MapReduce
  • 8. Documents on Server A, Server B, Server C and Server D (connected through a Switch): "The quick brown fox jumps over the lazy dog" "To be or not to be: that is the question." "The world only needs five computers." "Hello world." "In-Database MapReduce is the future." "MapReduce is a very powerful concept." Normalized text (lowercased, punctuation removed): "the quick brown fox jumps over the lazy dog"; "in database mapreduce is the future"; "the world only needs five computers"; "hello world"; "mapreduce is a very powerful concept"; "to be or not to be that is the question"
  • 9. Server 4, Final Result File: the 5; is 3; mapreduce 2; …
  • 10.
  • 11. 2nd Approach: No MapReduce, Fully Distributed
  • 12. Documents on Server A, Server B, Server C and Server D (connected through a Switch): "The quick brown fox jumps over the lazy dog" "To be or not to be: that is the question." "The world only needs five computers." "Hello world." "In-Database MapReduce is the future." "MapReduce is a very powerful concept." Normalized text: "the quick brown fox jumps over the lazy dog"; "in database mapreduce is the future"; "the world only needs five computers"; "hello world"; "mapreduce is a very powerful concept"; "to be or not to be that is the question". Words after being redistributed across the servers: the the the the the database database future world world powerful lazy brown mapreduce mapreduce be be to jumps computers hello is is is question over a that
  • 13. Final Result Files: Server 1: the 5, …; Server 2: world 2, …; Server 3: mapreduce 2, …; Server 4: is 3, …
  • 14. 2nd Approach: No MapReduce, Distributed
  • 15. Does it work? Yes. Is it a pain? Yes!! Does it take lots of time? Yes! Would you do it? No!!!
  • 16.
  • 17. Map(): Input: any file (e.g. documents). Output: stream of <key, value> pairs (e.g. <word, count> pairs). Data Redistribution and Grouping sits between the two functions. Reduce(): Input: all <key, value> pairs with the same key, grouped (e.g. all <word, count> pairs where word = “the”). Output: anything (e.g. sum of counts for a specific word).
  • 18. Map() and Redistribution Phase (Server A, Server B, Server C and Server D, connected through a Switch). Map() on "The quick brown fox jumps over the lazy dog" emits: <the, 1> <quick, 1> <brown, 1> <fox, 1> <jumps, 1> <over, 1> <the, 1> <lazy, 1> <dog, 1>. Map() on "In-Database MapReduce is the future." emits: <in, 1> <database, 1> <mapreduce, 1> <is, 1> <the, 1> <future, 1>. Pairs after redistribution through the Switch: <the, 1> <the, 1> <the, 1> <the, 1> <the, 1> <database, 1> <database, 1> <future, 1>; <world, 1> <world, 1> <powerful, 1> <lazy, 1> <brown, 1> <mapreduce, 1> <mapreduce, 1> <be, 1> <be, 1> <to, 1> <jumps, 1> <computers, 1> <hello, 1> <is, 1> <is, 1> <is, 1> <question, 1> <over, 1> <a, 1> <that, 1>
  • 19. Grouping and Reduce() Phase (on Server 1). Pairs received: <the, 1> <the, 1> <the, 1> <the, 1> <the, 1> <database, 1> <database, 1> <future, 1>. One Reduce() call per key writes the Server 1 Final Result File: the 5; database 2; future 1
  • 20.
  • 21.
  • 22.
  • 23.
  • 24.
  • 25. Beyond SQL and MapReduce
  • 26.
  • 27.
  • 28. The SQL/MR Process
  • 29.
  • 30.
  • 31.
  • 32. Example 2: Sessionization. Session Timeout = 60 seconds.
      INPUT (Clickstream):
        timestamp  userid
        10:00:00   Shawn1
        00:58:24   PrezBush
        10:00:24   Shawn1
        02:30:33   PrezBush
        10:01:23   Shawn1
        10:02:40   Shawn1
      OUTPUT:
        timestamp  userid    sessionid
        10:00:00   Shawn1    0
        10:00:24   Shawn1    0
        10:01:23   Shawn1    0
        10:02:40   Shawn1    1
        timestamp  userid    sessionid
        00:58:24   PrezBush  0
        02:30:33   PrezBush  1
  • 33.
  • 34.
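
Slides 7 to 9 describe the first approach: normalize every document, bring the text together, and count on one server. A minimal Python sketch of that baseline follows. It assumes the six documents are already available on a single machine; the names `tokenize` and `count_words_centrally` are illustrative and do not come from the deck.

```python
import re
from collections import Counter

# The six example documents as they appear on slide 8.
DOCUMENTS = [
    "The quick brown fox jumps over the lazy dog",
    "To be or not to be: that is the question.",
    "The world only needs five computers.",
    "Hello world.",
    "In-Database MapReduce is the future.",
    "MapReduce is a very powerful concept.",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def count_words_centrally(documents):
    """Count every word on one machine (the 'No MapReduce' baseline)."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return counts


counts = count_words_centrally(DOCUMENTS)
for word in ("the", "is", "mapreduce"):
    print(word, counts[word])
```

The single `Counter` stands in for the one server that has to receive and process all of the text, which is the bottleneck the later slides set out to remove.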
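
The Map() and Reduce() contract on slide 17, and the redistribution and grouping steps on slides 18 and 19, can be simulated in a single Python process. This is only a sketch under stated assumptions: the deck does not say how a key is routed to a server, so the CRC32-modulo rule in `partition` is an assumption, and the function names are illustrative. The server a word lands on will therefore not necessarily match the slides.

```python
import re
from collections import defaultdict
from zlib import crc32

NUM_SERVERS = 4  # the slides show four servers


def map_fn(document):
    """Map(): emit a <word, 1> pair for every word in one document."""
    for word in re.findall(r"[a-z]+", document.lower()):
        yield word, 1


def partition(key, num_servers=NUM_SERVERS):
    """Choose the server for a key. ASSUMPTION: stable hash modulo servers."""
    return crc32(key.encode("utf-8")) % num_servers


def reduce_fn(key, values):
    """Reduce(): sum all counts that share the same word."""
    return key, sum(values)


def run_word_count(documents):
    """Run map, redistribution, grouping and reduce; return one dict per server."""
    # Redistribution and grouping: every pair goes to the server that owns
    # its key, where pairs with the same key are collected together.
    inboxes = [defaultdict(list) for _ in range(NUM_SERVERS)]
    for doc in documents:
        for word, count in map_fn(doc):
            inboxes[partition(word)][word].append(count)

    # Reduce(): each server reduces only the keys it received.
    return [
        dict(reduce_fn(word, values) for word, values in groups.items())
        for groups in inboxes
    ]


# The two documents mapped on slide 18.
results = run_word_count([
    "The quick brown fox jumps over the lazy dog",
    "In-Database MapReduce is the future.",
])
for server_id, result in enumerate(results, start=1):
    print("Server", server_id, result)
```

Because all pairs for one word are routed to the same server, each Reduce() call sees the complete list of counts for its key, and no server needs the full data set.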
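
The sessionization example on slide 32 can be reproduced with a short Python sketch. It assumes that clicks are grouped by userid, sorted by timestamp, and that a new session starts when the gap to the previous click is strictly greater than the 60-second timeout. The slide's data does not distinguish "greater than" from "greater than or equal", so that choice is an assumption. The function names are illustrative, and this is not the SQL/MR implementation from the deck.

```python
from collections import defaultdict

SESSION_TIMEOUT_SECONDS = 60  # "Session Timeout = 60 seconds" on slide 32

# INPUT clickstream from slide 32 as (timestamp, userid) rows.
CLICKS = [
    ("10:00:00", "Shawn1"),
    ("00:58:24", "PrezBush"),
    ("10:00:24", "Shawn1"),
    ("02:30:33", "PrezBush"),
    ("10:01:23", "Shawn1"),
    ("10:02:40", "Shawn1"),
]


def to_seconds(timestamp):
    """Convert an HH:MM:SS string to seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in timestamp.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def sessionize(clicks, timeout=SESSION_TIMEOUT_SECONDS):
    """Return (timestamp, userid, sessionid) rows, one per click."""
    # Group clicks by user, which plays the role of the partitioning key.
    by_user = defaultdict(list)
    for timestamp, userid in clicks:
        by_user[userid].append(timestamp)

    rows = []
    for userid, timestamps in by_user.items():
        session_id = 0
        previous = None
        for timestamp in sorted(timestamps, key=to_seconds):
            now = to_seconds(timestamp)
            # ASSUMPTION: a gap strictly longer than the timeout opens a new session.
            if previous is not None and now - previous > timeout:
                session_id += 1
            rows.append((timestamp, userid, session_id))
            previous = now
    return rows


sessions = sessionize(CLICKS)
for row in sessions:
    print(*row)
```

Each user's clicks are processed independently of every other user's, so the per-user loop could run on whichever server receives that userid.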