SlideShare uma empresa Scribd logo
1 de 27
Data Analytics in Computer
Networking
The Case for Exploratory Data Analysis
Stenio Fernandes
Carleton University / CIn-UFPE
March 2016
Outline
Data Analysis - background
EDA basics
Applied EDA (Examples: WiFi simulated data)
Q&A
References
Data Analytics - Background
Data Science Pipeline
•Analytic Data
•Analytic Code
•Documentation
•Distribution
•ElementsofReproducibleResearch
Report Writing for Data Science in R, Roger D. Peng, 2016
1. Stating and refining the question
2. Exploring the data
3. Building formal statistical models
4. Interpreting the results
5. Communicating the results
Epicycle of Analysis
The Art of Data Science, A Guide for Anyone Who Works with Data, Roger D. Peng and Elizabeth Matsui, 2016
• summarize the measurements in a
single data set without further
interpretation
•Descriptive
• Searching for discoveries, trends,
correlations, or relationships
between multiple variables to
generate ideas or hypotheses
Exploratory
• quantifying whether an observed
pattern will likely hold beyond the
data set in hand
Inferential
• uses a subset of measurements (the
features) to predict another
measurement (the outcome)
Predictive
• what happens to one measurement
if you make another measurement
change
Causal
• changing one measurement always
and exclusively leads to a specific,
deterministic behavior in another
Deterministic
The Elements of Data
Analytic Style, A guide for
people who want to analyze
data, Jeff Leek, 2015
EDA basics
Why use EDA - Summary
• Maximize insight into a data set
• Uncover underlying structure
• Extract important variables
• Detect outliers and anomalies
• Test underlying assumptions
• Develop parsimonious models
• Determine optimal factor
settings
•NIST
• Show comparisons
• Show causality, mechanism,
explanation
• Show multivariate data
• Integrate multiple modes of
evidence
• Describe and document the
evidence
• Content is king
•JHUniversity
Answer to initial questions
What is a typical value for a certain feature?
What is the uncertainty for a typical value of a
feature?
What is a good distributional fit for a feature?
What is the percentile distribution?
Does modification on one variable have an
effect another variable?
Does a factor have an effect on performance
metrics?
What are the most important factors?
What is the best function for relating a
response variable to other variables?
What are the best settings for factors
(i.e. levels)?
Can we separate signal from noise?
Can we extract any structure from multivariate
data?
Does the data have outliers?
EDA Graphs
Understand data properties Find patterns in data
Suggest modeling strategies Debug analyses
Applied EDA
Using R/ggplot2
(mpg dataset) -> fake wifi dataset
Practical Steps
•Before performing any measurements or simulation
• Identify
• Performance Metrics
• Performance Factors and Levels
• Caution: sometimes you have to guess the ranges for the levels
• Use an educated guess
Don’t run tons of simulations / experiments (As previously discussed)
Plot quick and dirty graphs
• No need for titles, labels
Some examples of EDA Graphs - WiFi Data (simulated)
• “Vendor” - factor / levels: LinkSys, …
• “Model“ – factor / Levels: GST200, …
• "Users_Max_Rate“ - factor (background traffic) /
levels: 1.6, 1.8,…,7.0 Mbps
• "Year“ – factor / Levels: 1999, 2008
• "BER“ – factor / Levels: 4, 5, 6, and 8
• "Type“ – factor (type of user) / Levels: 4, f, r
• Rate – performance metric (Mbps)
• Distance - factor (distance from the AP) / “Levels:
50,100m
Features
(Observation
Variables)
Q&A
References
• NIST’s Handbook of Statistics Engineering (online)
• Report Writing for Data Science in R, Roger D. Peng, 2016
• The Art of Data Science, A Guide for Anyone Who Works with Data, Roger D.
Peng and Elizabeth Matsui, 2016
• The Elements of Data Analytic Style, A guide for people who want to analyze
data, Jeff Leek, 2015

Mais conteúdo relacionado

Mais procurados

A Systematic Review of Model-Driven Security
A Systematic Review of Model-Driven SecurityA Systematic Review of Model-Driven Security
A Systematic Review of Model-Driven SecurityPhu H. Nguyen
 
Tim Menzies, directions in Data Science
Tim Menzies, directions in Data ScienceTim Menzies, directions in Data Science
Tim Menzies, directions in Data ScienceCS, NcState
 
Query aware determinization of uncertain objects
Query aware determinization of uncertain objectsQuery aware determinization of uncertain objects
Query aware determinization of uncertain objectsSoftroniics india
 
Ph.D Annual Report III
Ph.D Annual Report IIIPh.D Annual Report III
Ph.D Annual Report IIIMatteo Avalle
 
Machine Learning for Domain Experts
Machine Learning for Domain ExpertsMachine Learning for Domain Experts
Machine Learning for Domain ExpertsMehmet Alican Noyan
 
Towards Automatic Composition of Multicomponent Predictive Systems
Towards Automatic Composition of Multicomponent Predictive SystemsTowards Automatic Composition of Multicomponent Predictive Systems
Towards Automatic Composition of Multicomponent Predictive SystemsManuel Martín
 
Ontology based top-k query answering over massive, heterogeneous, and dynamic...
Ontology based top-k query answering over massive, heterogeneous, and dynamic...Ontology based top-k query answering over massive, heterogeneous, and dynamic...
Ontology based top-k query answering over massive, heterogeneous, and dynamic...Daniele Dell'Aglio
 
Work Package Presentation
Work Package PresentationWork Package Presentation
Work Package Presentationlaurensrietveld
 
Linkedin Executive summary
Linkedin Executive summaryLinkedin Executive summary
Linkedin Executive summaryMichael Simonsen
 
Model-Driven Run-Time Enforcement of Complex Role-Based Access Control Policies
Model-Driven Run-Time Enforcement of Complex Role-Based Access Control PoliciesModel-Driven Run-Time Enforcement of Complex Role-Based Access Control Policies
Model-Driven Run-Time Enforcement of Complex Role-Based Access Control PoliciesLionel Briand
 
Data Mining vs Statistics
Data Mining vs StatisticsData Mining vs Statistics
Data Mining vs StatisticsAndry Alamsyah
 
Data Processing and Data Visualization
Data Processing and Data VisualizationData Processing and Data Visualization
Data Processing and Data VisualizationRenata Brandão
 
Lucas_Taylor_Resume_Gen_Su16
Lucas_Taylor_Resume_Gen_Su16Lucas_Taylor_Resume_Gen_Su16
Lucas_Taylor_Resume_Gen_Su16Luke Taylor
 
Tommi kramer 2013-06-21-caise-re2-kramer
Tommi kramer   2013-06-21-caise-re2-kramerTommi kramer   2013-06-21-caise-re2-kramer
Tommi kramer 2013-06-21-caise-re2-kramercaise2013vlc
 
Test Tool for Industrial Ethernet Network Performance (June 2009)
Test Tool for Industrial Ethernet Network Performance (June 2009)Test Tool for Industrial Ethernet Network Performance (June 2009)
Test Tool for Industrial Ethernet Network Performance (June 2009)Jim Gilsinn
 
Machine learning workshop using Orange datamining framework
Machine learning workshop using Orange datamining frameworkMachine learning workshop using Orange datamining framework
Machine learning workshop using Orange datamining frameworkAmr Rashed
 
Data Science, Machine Learning and Neural Networks
Data Science, Machine Learning and Neural NetworksData Science, Machine Learning and Neural Networks
Data Science, Machine Learning and Neural NetworksBICA Labs
 

Mais procurados (20)

A Systematic Review of Model-Driven Security
A Systematic Review of Model-Driven SecurityA Systematic Review of Model-Driven Security
A Systematic Review of Model-Driven Security
 
Tim Menzies, directions in Data Science
Tim Menzies, directions in Data ScienceTim Menzies, directions in Data Science
Tim Menzies, directions in Data Science
 
KREAM@ICCS2013
KREAM@ICCS2013KREAM@ICCS2013
KREAM@ICCS2013
 
Query aware determinization of uncertain objects
Query aware determinization of uncertain objectsQuery aware determinization of uncertain objects
Query aware determinization of uncertain objects
 
Ph.D Annual Report III
Ph.D Annual Report IIIPh.D Annual Report III
Ph.D Annual Report III
 
Machine Learning for Domain Experts
Machine Learning for Domain ExpertsMachine Learning for Domain Experts
Machine Learning for Domain Experts
 
2017_Resume
2017_Resume2017_Resume
2017_Resume
 
Towards Automatic Composition of Multicomponent Predictive Systems
Towards Automatic Composition of Multicomponent Predictive SystemsTowards Automatic Composition of Multicomponent Predictive Systems
Towards Automatic Composition of Multicomponent Predictive Systems
 
Ontology based top-k query answering over massive, heterogeneous, and dynamic...
Ontology based top-k query answering over massive, heterogeneous, and dynamic...Ontology based top-k query answering over massive, heterogeneous, and dynamic...
Ontology based top-k query answering over massive, heterogeneous, and dynamic...
 
Work Package Presentation
Work Package PresentationWork Package Presentation
Work Package Presentation
 
Linkedin Executive summary
Linkedin Executive summaryLinkedin Executive summary
Linkedin Executive summary
 
Model-Driven Run-Time Enforcement of Complex Role-Based Access Control Policies
Model-Driven Run-Time Enforcement of Complex Role-Based Access Control PoliciesModel-Driven Run-Time Enforcement of Complex Role-Based Access Control Policies
Model-Driven Run-Time Enforcement of Complex Role-Based Access Control Policies
 
Data Mining vs Statistics
Data Mining vs StatisticsData Mining vs Statistics
Data Mining vs Statistics
 
Data Processing and Data Visualization
Data Processing and Data VisualizationData Processing and Data Visualization
Data Processing and Data Visualization
 
Lucas_Taylor_Resume_Gen_Su16
Lucas_Taylor_Resume_Gen_Su16Lucas_Taylor_Resume_Gen_Su16
Lucas_Taylor_Resume_Gen_Su16
 
Tommi kramer 2013-06-21-caise-re2-kramer
Tommi kramer   2013-06-21-caise-re2-kramerTommi kramer   2013-06-21-caise-re2-kramer
Tommi kramer 2013-06-21-caise-re2-kramer
 
Test Tool for Industrial Ethernet Network Performance (June 2009)
Test Tool for Industrial Ethernet Network Performance (June 2009)Test Tool for Industrial Ethernet Network Performance (June 2009)
Test Tool for Industrial Ethernet Network Performance (June 2009)
 
Machine learning workshop using Orange datamining framework
Machine learning workshop using Orange datamining frameworkMachine learning workshop using Orange datamining framework
Machine learning workshop using Orange datamining framework
 
Data Science, Machine Learning and Neural Networks
Data Science, Machine Learning and Neural NetworksData Science, Machine Learning and Neural Networks
Data Science, Machine Learning and Neural Networks
 
Approach AI assurance
Approach AI assuranceApproach AI assurance
Approach AI assurance
 

Destaque

IEEE ICC 2012 - Dependability Assessment of Virtualized Networks
 IEEE ICC 2012 - Dependability Assessment of Virtualized Networks IEEE ICC 2012 - Dependability Assessment of Virtualized Networks
IEEE ICC 2012 - Dependability Assessment of Virtualized NetworksStenio Fernandes
 
統計在半導體產業的應用 -- Basic Statistic Methods
統計在半導體產業的應用 -- Basic Statistic Methods統計在半導體產業的應用 -- Basic Statistic Methods
統計在半導體產業的應用 -- Basic Statistic Methodschunhung chou
 
AlphaPy: A Data Science Pipeline in Python
AlphaPy: A Data Science Pipeline in PythonAlphaPy: A Data Science Pipeline in Python
AlphaPy: A Data Science Pipeline in PythonMark Conway
 
Computer networks--introduction computer-networking
Computer networks--introduction computer-networkingComputer networks--introduction computer-networking
Computer networks--introduction computer-networkingOlorunyomi Segun
 
Networking with Purpose - the Lincoln Hub: Dr Andrew West Vice Chancellor, Li...
Networking with Purpose - the Lincoln Hub: Dr Andrew West Vice Chancellor, Li...Networking with Purpose - the Lincoln Hub: Dr Andrew West Vice Chancellor, Li...
Networking with Purpose - the Lincoln Hub: Dr Andrew West Vice Chancellor, Li...SmartNet
 
The tale of heavy tails in computer networking
The tale of heavy tails in computer networkingThe tale of heavy tails in computer networking
The tale of heavy tails in computer networkingStenio Fernandes
 
Computer hardware and networking components
Computer hardware and networking componentsComputer hardware and networking components
Computer hardware and networking componentsManpreet Singh Bedi
 

Destaque (9)

IEEE ICC 2012 - Dependability Assessment of Virtualized Networks
 IEEE ICC 2012 - Dependability Assessment of Virtualized Networks IEEE ICC 2012 - Dependability Assessment of Virtualized Networks
IEEE ICC 2012 - Dependability Assessment of Virtualized Networks
 
統計在半導體產業的應用 -- Basic Statistic Methods
統計在半導體產業的應用 -- Basic Statistic Methods統計在半導體產業的應用 -- Basic Statistic Methods
統計在半導體產業的應用 -- Basic Statistic Methods
 
AlphaPy: A Data Science Pipeline in Python
AlphaPy: A Data Science Pipeline in PythonAlphaPy: A Data Science Pipeline in Python
AlphaPy: A Data Science Pipeline in Python
 
Computer networks--introduction computer-networking
Computer networks--introduction computer-networkingComputer networks--introduction computer-networking
Computer networks--introduction computer-networking
 
Networking with Purpose - the Lincoln Hub: Dr Andrew West Vice Chancellor, Li...
Networking with Purpose - the Lincoln Hub: Dr Andrew West Vice Chancellor, Li...Networking with Purpose - the Lincoln Hub: Dr Andrew West Vice Chancellor, Li...
Networking with Purpose - the Lincoln Hub: Dr Andrew West Vice Chancellor, Li...
 
The tale of heavy tails in computer networking
The tale of heavy tails in computer networkingThe tale of heavy tails in computer networking
The tale of heavy tails in computer networking
 
Ch12
Ch12Ch12
Ch12
 
Best Ways of Using Moodle
Best Ways of Using MoodleBest Ways of Using Moodle
Best Ways of Using Moodle
 
Computer hardware and networking components
Computer hardware and networking componentsComputer hardware and networking components
Computer hardware and networking components
 

Semelhante a Data analytics in computer networking

Data analytcis-first-steps
Data analytcis-first-stepsData analytcis-first-steps
Data analytcis-first-stepsShesha R
 
Data+Science+in+Python+-+Data+Prep+&+EDA.pdf
Data+Science+in+Python+-+Data+Prep+&+EDA.pdfData+Science+in+Python+-+Data+Prep+&+EDA.pdf
Data+Science+in+Python+-+Data+Prep+&+EDA.pdfneelakandan2001kpm
 
Data Processing DOH Workshop.pptx
Data Processing DOH Workshop.pptxData Processing DOH Workshop.pptx
Data Processing DOH Workshop.pptxcharlslabarda
 
Introduction of data science
Introduction of data scienceIntroduction of data science
Introduction of data scienceTanujaSomvanshi1
 
Python for Data Analysis: A Comprehensive Guide
Python for Data Analysis: A Comprehensive GuidePython for Data Analysis: A Comprehensive Guide
Python for Data Analysis: A Comprehensive GuideAivada
 
Data Science Introduction: Concepts, lifecycle, applications.pptx
Data Science Introduction: Concepts, lifecycle, applications.pptxData Science Introduction: Concepts, lifecycle, applications.pptx
Data Science Introduction: Concepts, lifecycle, applications.pptxsumitkumar600840
 
Metabolomic Data Analysis Workshop and Tutorials (2014)
Metabolomic Data Analysis Workshop and Tutorials (2014)Metabolomic Data Analysis Workshop and Tutorials (2014)
Metabolomic Data Analysis Workshop and Tutorials (2014)Dmitry Grapov
 
Tips and Tricks to be an Effective Data Scientist
Tips and Tricks to be an Effective Data ScientistTips and Tricks to be an Effective Data Scientist
Tips and Tricks to be an Effective Data ScientistLisa Cohen
 
BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...
BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...
BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...Big Data Value Association
 
SAS Training session - By Pratima
SAS Training session  -  By Pratima SAS Training session  -  By Pratima
SAS Training session - By Pratima Pratima Pandey
 
Data Science.pptx NEW COURICUUMN IN DATA
Data Science.pptx NEW COURICUUMN IN DATAData Science.pptx NEW COURICUUMN IN DATA
Data Science.pptx NEW COURICUUMN IN DATAjaved75
 
التنقيب في البيانات - Data Mining
التنقيب في البيانات -  Data Miningالتنقيب في البيانات -  Data Mining
التنقيب في البيانات - Data Miningnabil_alsharafi
 
Data Science Training in Chandigarh h
Data Science Training in Chandigarh    hData Science Training in Chandigarh    h
Data Science Training in Chandigarh hasmeerana605
 
Towards a Comprehensive Machine Learning Benchmark
Towards a Comprehensive Machine Learning BenchmarkTowards a Comprehensive Machine Learning Benchmark
Towards a Comprehensive Machine Learning BenchmarkTuri, Inc.
 
The Simple 5-Step Process for Creating a Winning Data Pipeline.pdf
The Simple 5-Step Process for Creating a Winning Data Pipeline.pdfThe Simple 5-Step Process for Creating a Winning Data Pipeline.pdf
The Simple 5-Step Process for Creating a Winning Data Pipeline.pdfData Science Council of America
 
Unit_8_Data_processing,_analysis_and_presentation_and_Application (1).pptx
Unit_8_Data_processing,_analysis_and_presentation_and_Application (1).pptxUnit_8_Data_processing,_analysis_and_presentation_and_Application (1).pptx
Unit_8_Data_processing,_analysis_and_presentation_and_Application (1).pptxtesfkeb
 
IRJET- Deep Learning Model to Predict Hardware Performance
IRJET- Deep Learning Model to Predict Hardware PerformanceIRJET- Deep Learning Model to Predict Hardware Performance
IRJET- Deep Learning Model to Predict Hardware PerformanceIRJET Journal
 

Semelhante a Data analytics in computer networking (20)

NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
 
Data analytcis-first-steps
Data analytcis-first-stepsData analytcis-first-steps
Data analytcis-first-steps
 
Data+Science+in+Python+-+Data+Prep+&+EDA.pdf
Data+Science+in+Python+-+Data+Prep+&+EDA.pdfData+Science+in+Python+-+Data+Prep+&+EDA.pdf
Data+Science+in+Python+-+Data+Prep+&+EDA.pdf
 
Data Processing DOH Workshop.pptx
Data Processing DOH Workshop.pptxData Processing DOH Workshop.pptx
Data Processing DOH Workshop.pptx
 
Introduction of data science
Introduction of data scienceIntroduction of data science
Introduction of data science
 
Python for Data Analysis: A Comprehensive Guide
Python for Data Analysis: A Comprehensive GuidePython for Data Analysis: A Comprehensive Guide
Python for Data Analysis: A Comprehensive Guide
 
Data Science Introduction: Concepts, lifecycle, applications.pptx
Data Science Introduction: Concepts, lifecycle, applications.pptxData Science Introduction: Concepts, lifecycle, applications.pptx
Data Science Introduction: Concepts, lifecycle, applications.pptx
 
Metabolomic Data Analysis Workshop and Tutorials (2014)
Metabolomic Data Analysis Workshop and Tutorials (2014)Metabolomic Data Analysis Workshop and Tutorials (2014)
Metabolomic Data Analysis Workshop and Tutorials (2014)
 
Tips and Tricks to be an Effective Data Scientist
Tips and Tricks to be an Effective Data ScientistTips and Tricks to be an Effective Data Scientist
Tips and Tricks to be an Effective Data Scientist
 
BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...
BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...
BDVe Webinar Series - Designing Big Data pipelines with Toreador (Ernesto Dam...
 
Data Science and Analysis.pptx
Data Science and Analysis.pptxData Science and Analysis.pptx
Data Science and Analysis.pptx
 
SAS Training session - By Pratima
SAS Training session  -  By Pratima SAS Training session  -  By Pratima
SAS Training session - By Pratima
 
Text Analytics for Legal work
Text Analytics for Legal workText Analytics for Legal work
Text Analytics for Legal work
 
Data Science.pptx NEW COURICUUMN IN DATA
Data Science.pptx NEW COURICUUMN IN DATAData Science.pptx NEW COURICUUMN IN DATA
Data Science.pptx NEW COURICUUMN IN DATA
 
التنقيب في البيانات - Data Mining
التنقيب في البيانات -  Data Miningالتنقيب في البيانات -  Data Mining
التنقيب في البيانات - Data Mining
 
Data Science Training in Chandigarh h
Data Science Training in Chandigarh    hData Science Training in Chandigarh    h
Data Science Training in Chandigarh h
 
Towards a Comprehensive Machine Learning Benchmark
Towards a Comprehensive Machine Learning BenchmarkTowards a Comprehensive Machine Learning Benchmark
Towards a Comprehensive Machine Learning Benchmark
 
The Simple 5-Step Process for Creating a Winning Data Pipeline.pdf
The Simple 5-Step Process for Creating a Winning Data Pipeline.pdfThe Simple 5-Step Process for Creating a Winning Data Pipeline.pdf
The Simple 5-Step Process for Creating a Winning Data Pipeline.pdf
 
Unit_8_Data_processing,_analysis_and_presentation_and_Application (1).pptx
Unit_8_Data_processing,_analysis_and_presentation_and_Application (1).pptxUnit_8_Data_processing,_analysis_and_presentation_and_Application (1).pptx
Unit_8_Data_processing,_analysis_and_presentation_and_Application (1).pptx
 
IRJET- Deep Learning Model to Predict Hardware Performance
IRJET- Deep Learning Model to Predict Hardware PerformanceIRJET- Deep Learning Model to Predict Hardware Performance
IRJET- Deep Learning Model to Predict Hardware Performance
 

Último

April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfadriantubila
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...amitlee9823
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteedamy56318795
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...amitlee9823
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsJoseMangaJr1
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 

Último (20)

April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 

Data analytics in computer networking

  • 1. Data Analytics in Computer Networking The Case for Exploratory Data Analysis Stenio Fernandes Carleton University / CIn-UFPE March 2016
  • 2. Outline Data Analysis - background EDA basics Applied EDA (Examples: WiFi simulated data) Q&A References
  • 3. Data Analytics - Background
  • 4. Data Science Pipeline •Analytic Data •Analytic Code •Documentation •Distribution •ElementsofReproducibleResearch Report Writing for Data Science in R, Roger D. Peng, 2016
  • 5. 1. Stating and refining the question 2. Exploring the data 3. Building formal statistical models 4. Interpreting the results 5. Communicating the results Epicycle of Analysis The Art of Data Science, A Guide for Anyone Who Works with Data, Roger D. Peng and Elizabeth Matsui, 2016
  • 6. • summarize the measurements in a single data set without further interpretation •Descriptive • Searching for discoveries, trends, correlations, or relationships between multiple variables to generate ideas or hypotheses Exploratory • quantifying whether an observed pattern will likely hold beyond the data set in hand Inferential • uses a subset of measurements (the features) to predict another measurement (the outcome) Predictive • what happens to one measurement if you make another measurement change Causal • changing one measurement always and exclusively leads to a specific, deterministic behavior in another Deterministic The Elements of Data Analytic Style, A guide for people who want to analyze data, Jeff Leek, 2015
  • 8. Why use EDA - Summary • Maximize insight into a data set • Uncover underlying structure • Extract important variables • Detect outliers and anomalies • Test underlying assumptions • Develop parsimonious models • Determine optimal factor settings •NIST • Show comparisons • Show causality, mechanism, explanation • Show multivariate data • Integrate multiple modes of evidence • Describe and document the evidence • Content is king •JHUniversity
  • 9. Answer to initial questions What is a typical value for a certain feature? What is the uncertainty for a typical value of a feature? What is a good distributional fit for a feature? What is the percentile distribution? Does modification on one variable have an effect another variable? Does a factor have an effect on performance metrics? What are the most important factors? What is the best function for relating a response variable to other variables? What are the best settings for factors (i.e. levels)? Can we separate signal from noise? Can we extract any structure from multivariate data? Does the data have outliers?
  • 10. EDA Graphs Understand data properties Find patterns in data Suggest modeling strategies Debug analyses
  • 11. Applied EDA Using R/ggplot2 (mpg dataset) -> fake wifi dataset
  • 12. Practical Steps •Before performing any measurements or simulation • Identify • Performance Metrics • Performance Factors and Levels • Caution: sometimes you have to guess the ranges for the levels • Use an educated guess Don’t run tons of simulations / experiments (As previously discussed) Plot quick and dirty graphs • No need for titles, labels
  • 13. Some examples of EDA Graphs - WiFi Data (simulated) • “Vendor” - factor / levels: LinkSys, … • “Model“ – factor / Levels: GST200, … • "Users_Max_Rate“ - factor (background traffic) / levels: 1.6, 1.8,…,7.0 Mbps • "Year“ – factor / Levels: 1999, 2008 • "BER“ – factor / Levels: 4, 5, 6, and 8 • "Type“ – factor (type of user) / Levels: 4, f, r • Rate – performance metric (Mbps) • Distance - factor (distance from the AP) / “Levels: 50,100m Features (Observation Variables)
  • 14.
  • 15.
  • 16.
  • 17.
  • 18.
  • 19.
  • 20.
  • 21.
  • 22.
  • 23.
  • 24.
  • 25.
  • 26. Q&A
  • 27. References • NIST’s Handbook of Statistics Engineering (online) • Report Writing for Data Science in R, Roger D. Peng, 2016 • The Art of Data Science, A Guide for Anyone Who Works with Data, Roger D. Peng and Elizabeth Matsui, 2016 • The Elements of Data Analytic Style, A guide for people who want to analyze data, Jeff Leek, 2015

Notas do Editor

  1. Left figure: Report Writing for Data Science in R, Roger D. Peng, 2016
  2. Left figure: The Art of Data Science, A Guide for Anyone Who Works with Data, Roger D. Peng and Elizabeth Matsui, 2016
  3. Figure and Text: The Elements of Data Analytic Style, A guide for people who want to analyze data, Jeff Leek, 2015