INTRODUCTION TO R AND RATTLE
IAUSHIRAZ 1/14/2017
What is R
Statistical programming language
Used by statisticians and data miners for developing statistical software and for data analysis.
Free and Open Source
Written in C, Fortran and R
Statistical features
Linear and nonlinear modeling
Statistical tests
Classification, Clustering
Can manipulate R Objects with C, C++, Java, .NET or Python code.
Source Example
> x <- c(1,2,3,4,5,6) # Create ordered collection (vector)
> y <- x^2 # Square the elements of x
> print(y) # print (vector) y
[1] 1 4 9 16 25 36
> mean(y) # Calculate average (arithmetic mean) of (vector) y; result is scalar
[1] 15.16667
> var(y) # Calculate sample variance
[1] 178.9667
> lm_1 <- lm(y ~ x) # Fit a linear regression model "y = f(x)" or "y = B0 + (B1 * x)"
# store the results as lm_1
> print(lm_1) # Print the model from the (linear model object) lm_1
Call:
lm(formula = y ~ x)
Coefficients:
(Intercept) x
-9.333 7.000
> summary(lm_1) # Compute and print statistics for the fit
# of the (linear model object) lm_1
Call:
lm(formula = y ~ x)
Residuals:
1 2 3 4 5 6
3.3333 -0.6667 -2.6667 -2.6667 -0.6667 3.3333
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -9.3333 2.8441 -3.282 0.030453 *
x 7.0000 0.7303 9.585 0.000662 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.055 on 4 degrees of freedom
Multiple R-squared: 0.9583, Adjusted R-squared: 0.9478
F-statistic: 91.88 on 1 and 4 DF, p-value: 0.000662
> par(mfrow=c(2, 2)) # Request 2x2 plot layout
> plot(lm_1) # Diagnostic plot of regression model
Graphical front-ends
Architect – cross-platform open source IDE based on Eclipse and StatET
DataJoy – Online R Editor focused on beginners to data science and collaboration.
Deducer – GUI for menu-driven data analysis (similar to SPSS/JMP/Minitab).
Java GUI for R – cross-platform stand-alone R terminal and editor based on Java (also known as JGR).
Number Analytics - GUI for R based business analytics (similar to SPSS) working on the cloud.
Rattle GUI – cross-platform GUI based on RGtk2 and specifically designed for data mining.
R Commander – cross-platform menu-driven GUI based on tcltk (several plug-ins to Rcmdr are also available).
Revolution R Productivity Environment (RPE) – Visual Studio-based IDE provided by Revolution Analytics, with plans for a web-based point-and-click interface.
RGUI – comes with the pre-compiled version of R for Microsoft Windows.
RKWard – extensible GUI and IDE for R.
RStudio – cross-platform open source IDE (which can also be run on a remote Linux server).
What is Rattle
Graphical user interface package for R
Developed by Graham Williams of Togaware Pty Ltd.
Free and Open Source
Presents statistical and visual summaries of data
Tabs :
Load Data
Data Exploration
Model
Evaluation
Test
…
Rattle Installation Process
Download and install R
https://r-project.org
About 60 MB
Download the Rattle package
About 300 MB
Follow the instructions (R is case-sensitive, so the function names are lowercase):
  install.packages("rattle", dependencies=c("Depends", "Suggests"))
  library(rattle)
  rattle()
Load Data
Dataset Types :
CSV File (CSV, TXT, Excel)
ARFF (a CSV-like file that adds type information)
ODBC (MySQL, SQLite, SQL Server, …)
• Set connections in: /etc/odbcinst.ini and /etc/odbc.ini
R Dataset (existing datasets in the current R session)
R Data File
Library (pre-existing datasets from installed packages)
Corpus (collection of documents)
Script (scripts for generating datasets)
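As a rough illustration of the CSV option above, the sketch below loads a small comma-separated file into a data frame with base R's `read.csv`. The file contents and column names are made up for the example, and this is only an assumption about what a CSV load amounts to, not Rattle's own code.

```r
# Sketch: load a CSV file into a data frame with base R.
# The file and its columns are invented for illustration.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("ID,Age,Income,TARGET_Default",
             "1,34,52000,0",
             "2,51,NA,1",
             "3,29,41000,0"), csv_path)

# Treat both "NA" and empty fields as missing values
ds <- read.csv(csv_path, na.strings = c("NA", ""), stringsAsFactors = TRUE)

str(ds)    # column types inferred from the file
nrow(ds)   # number of observations
```

A data frame loaded this way would then appear under the "R Dataset" option, since it exists in the current R session.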
Load Data
Variable Types :
Input (most variables are inputs)
• Used to predict the target variable
Target (influenced by the input variables)
• Also known as the output
• Prefix : TARGET_
Risk (measure of the size of the targets)
• Prefix : RISK_
Identifier (any numeric variable that has a unique value for each observation; not normally used in modeling)
• Such as : ID, Date
• Prefix : ID_
Ignore (excluded from modeling)
• Prefix : IGNORE_
Weight (observation weights, given as an R formula)
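The prefix convention above can be sketched in a few lines of base R. The helper `guess_role` and the variable names are hypothetical and only illustrate the idea of mapping name prefixes to roles; they are not part of Rattle.

```r
# Hypothetical helper: guess a variable's role from its name prefix.
guess_role <- function(var_names) {
  roles <- rep("input", length(var_names))            # default role
  roles[startsWith(var_names, "TARGET_")] <- "target"
  roles[startsWith(var_names, "RISK_")]   <- "risk"
  roles[startsWith(var_names, "ID_")]     <- "ident"
  roles[startsWith(var_names, "IGNORE_")] <- "ignore"
  setNames(roles, var_names)
}

# Made-up variable names for illustration
roles <- guess_role(c("ID_Customer", "Age", "Income",
                      "IGNORE_Notes", "RISK_Amount", "TARGET_Default"))
print(roles)
```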
Transform
Rescale
Normalize
• Recenter
• Scale [0-1]
• Median/MAD
• Natural Log / Log 10
• Matrix
Order
• Rank
• Interval
• Number of groups
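The rescaling options listed above correspond to simple arithmetic on a numeric vector. The following is a minimal base R sketch with made-up data; the formulas are the usual textbook definitions and are assumed, not taken from Rattle's source.

```r
x <- c(2, 4, 4, 5, 7, 9, 40)   # made-up values with one large outlier

recentered <- (x - mean(x)) / sd(x)             # Recenter: mean 0, standard deviation 1
scaled01   <- (x - min(x)) / (max(x) - min(x))  # Scale [0-1]: smallest value 0, largest 1
robust     <- (x - median(x)) / mad(x)          # Median/MAD: less sensitive to the outlier
logged     <- log(x)                            # Natural log (log10(x) for base 10)
ranked     <- rank(x)                           # Rank: tied values share the average rank

round(cbind(x, recentered, scaled01, robust, logged, ranked), 3)
```

Comparing the `recentered` and `robust` columns shows how much a single outlier pulls the mean-based version.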
Transform
Impute (missing values)
Zero
Mean
Median
Mode
Constant
Recode
Quantiles
K-Means
Equal Width
Indicator variable / Join Categories
As Categorical / As Numeric
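The imputation and recoding options above can be approximated in base R as follows. The data are invented, and the code is a sketch of the general techniques (mean, median and mode imputation, equal-width binning, indicator variables) rather than Rattle's implementation.

```r
age <- c(23, 35, NA, 41, 35, NA, 58)   # made-up numeric variable with gaps

# Impute: replace missing values with the mean or the median
age_mean   <- ifelse(is.na(age), mean(age, na.rm = TRUE), age)
age_median <- ifelse(is.na(age), median(age, na.rm = TRUE), age)

# Mode: the most frequent observed value (table() ignores NA by default)
mode_value <- as.numeric(names(which.max(table(age))))
age_mode   <- ifelse(is.na(age), mode_value, age)

# Recode: bin the imputed variable into 3 equal-width intervals
age_bins <- cut(age_median, breaks = 3)

# Indicator variables: one 0/1 column per category
indicators <- model.matrix(~ age_bins - 1)
print(indicators)
```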
Transform
Cleanup
Delete Ignored
Delete Selected
Delete Missing
Delete Observations with Missing
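The cleanup operations above amount to dropping columns or rows from the data frame. A minimal base R sketch follows, assuming that "Delete Missing" removes variables containing missing values and "Delete Observations with Missing" removes rows; the data frame and the ignored column are made up.

```r
# Made-up data frame; "notes" plays the part of an ignored variable
ds <- data.frame(id     = 1:4,
                 age    = c(30, NA, 45, 52),
                 income = c(NA, NA, 61000, 48000),
                 notes  = c("a", "b", "c", "d"))

# Delete Ignored / Delete Selected: drop named columns
ds_no_ignored <- ds[, setdiff(names(ds), "notes"), drop = FALSE]

# Delete Missing: drop every variable that has at least one missing value
ds_no_missing_vars <- ds[, colSums(is.na(ds)) == 0, drop = FALSE]

# Delete Observations with Missing: drop every row that has a missing value
ds_complete <- na.omit(ds)
```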
Exploration
Summary
Summary
• Min, max, mean and quartile values.
Describe
• Missing, unique, sum, mean, lowest and highest values.
Basics (for numeric values)
• Measures of numeric data (missing, min, max, quartiles, mean, sum, skewness, kurtosis)
Kurtosis (for numeric values)
• A larger value indicates a sharper peak and heavier tails.
• A lower value indicates a flatter peak and lighter tails.
Skewness (for numeric values)
• A positive skew indicates that the tail to the right is longer.
• A negative skew indicates that the tail to the left is longer.
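To make the skewness and kurtosis readings concrete, here is a small base R sketch using the plain moment-based definitions. The function names are illustrative; packages that report these statistics may apply bias corrections, so their numbers can differ slightly from these.

```r
# Moment-based skewness: third central moment over variance^(3/2)
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Excess kurtosis: fourth central moment over variance^2, minus 3
# (so that a normal distribution scores about 0)
kurtosis <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^4) / mean((x - m)^2)^2 - 3
}

x <- c(1, 2, 2, 3, 3, 3, 4, 20)   # made-up data with a long right tail
skewness(x)    # positive: the right tail is longer
skewness(-x)   # negative: the left tail is longer
kurtosis(x)
```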
Exploration
Summary
Show Missing
• Each row corresponds to a pattern of missing values.
• The patterns can help in understanding why the data is missing.
• Rows and columns are sorted in ascending order of missing data.
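A missing-value pattern table like the one described above can be approximated in base R. In this sketch each pattern is a string with one character per variable (1 = observed, 0 = missing); the data frame is invented and the encoding is an assumption made for illustration.

```r
# Made-up data frame with gaps in two variables
ds <- data.frame(age    = c(30, NA, 45, NA, 52),
                 income = c(NA, NA, 61000, 48000, 50000),
                 city   = c("A", "B", "C", "D", "E"))

miss <- is.na(ds)   # logical matrix: TRUE where a value is missing

# One pattern string per row, e.g. "101" = age observed, income missing, city observed
pattern <- apply(miss, 1, function(r) paste(ifelse(r, "0", "1"), collapse = ""))

pattern_counts       <- sort(table(pattern), decreasing = TRUE)  # how often each pattern occurs
missing_per_variable <- sort(colSums(miss))                      # ascending count of missing values

print(pattern_counts)
print(missing_per_variable)
```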
Exploration
Distributions (review the distributions of each variable in dataset)
Annotate (include numeric values in plots)
Group by
Numeric Outputs :
• Box Plot
• Histogram
• Cumulative
• Benford
  • For any number of continuous variables
• Pairs
Categorical Outputs :
• Bar Plot
• Dot Plot
• Mosaic
• Pairs
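The Benford option compares the leading digits of a variable with the frequencies predicted by Benford's law, P(d) = log10(1 + 1/d). The sketch below computes both in base R; the helper `first_digit` and the sample data (powers of 2) are assumptions chosen for illustration.

```r
# Hypothetical helper: leading (first significant) digit of each non-zero number
first_digit <- function(x) {
  x <- abs(x[!is.na(x) & x != 0])
  floor(x / 10^floor(log10(x)))
}

# Expected frequency of each leading digit 1..9 under Benford's law
benford_expected <- log10(1 + 1 / (1:9))

# Illustrative data: powers of 2, whose leading digits are known to follow the law closely
x <- 2^(1:60)
observed <- as.numeric(table(factor(first_digit(x), levels = 1:9))) / length(x)

round(rbind(expected = benford_expected, observed = observed), 3)
```

Large gaps between the two rows for a real variable suggest that its values are not spread over several orders of magnitude, or that something unusual is going on in the data.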
Exploration
Correlations (Rattle only computes correlations between numeric variables at this time)
Ordered
• Order by strength of correlations
Explore Missing
• Correlation between missing values
Hierarchical
• Pearson
• Kendall
• Spearman
Principal Components
SVD
• For numeric variables only
Eigen
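The correlation and principal-component options above map onto functions in base R's stats package. The sketch below uses the built-in `mtcars` dataset; the choice of variables, and the use of 1 - |correlation| as a distance for clustering variables, are assumptions made for illustration.

```r
num <- mtcars[, c("mpg", "disp", "hp", "wt")]   # numeric variables only

# The three correlation methods
cor_pearson  <- cor(num, method = "pearson")
cor_kendall  <- cor(num, method = "kendall")
cor_spearman <- cor(num, method = "spearman")

# Ordered: rank the other variables by strength of correlation with mpg
ordered <- sort(abs(cor_pearson["mpg", -1]), decreasing = TRUE)

# Hierarchical: cluster the variables, treating weakly correlated ones as far apart
var_clusters <- hclust(as.dist(1 - abs(cor_pearson)))

# Principal components: prcomp() uses SVD, princomp() uses an eigen decomposition
pca_svd   <- prcomp(num, scale. = TRUE)
pca_eigen <- princomp(num, cor = TRUE)

summary(pca_svd)
```

On standardized data the two principal-component routes give the same component standard deviations, so the choice mostly affects numerical behavior rather than results.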
Model
Tree
Traditional
• Trade-off between performance and simplicity of explanation
Conditional
Forest (many decision trees using random subsets of data and variables)
Number of Trees
Number of Variables
Impute (set median numeric value for missing values)
Sample Size (for balancing classes)
Importance (variable importance)
Rules (collection of random forest rules)
ROC (ROC Curve)
Errors
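As a sketch of the "Traditional" tree option, the example below fits a classification tree with the `rpart` package, which ships as a recommended package with standard R distributions. The `iris` dataset and the in-sample accuracy check are chosen only for illustration, and the call is an assumption about typical usage rather than the exact call Rattle generates.

```r
library(rpart)   # recommended package included with standard R installs

# Fit a classification tree predicting Species from all other variables
fit <- rpart(Species ~ ., data = iris, method = "class")
print(fit)       # the printed splits are what make a single tree easy to explain

# In-sample predictions; a held-out test set would give a fairer estimate
pred     <- predict(fit, iris, type = "class")
accuracy <- mean(pred == iris$Species)

fit$variable.importance   # importance scores for the variables used
```

A forest trades this readability for accuracy by averaging many such trees built on random subsets of the observations and variables.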
Model
SVM
Starts with two parallel vectors that bound the margin between the classes
Linear (linear regression)
For continuous values
All
Cluster
K-Means
The number of clusters (K) must be set first
EwKm
K-Means with entropy weighting
Hierarchical
No need to set the number of clusters first
BiCluster
Suitable subsets of both the variables and the observations
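The difference between K-Means and hierarchical clustering noted above can be seen in base R: `kmeans` needs the number of clusters up front, while `hclust` builds the full tree first and is cut into groups afterwards. The dataset, seed, and linkage method below are assumptions chosen for illustration.

```r
set.seed(42)               # k-means starts from random centres
x <- scale(iris[, 1:4])    # standardize the numeric variables

# K-Means: K (here 3) has to be chosen before clustering
km <- kmeans(x, centers = 3, nstart = 10)

# Hierarchical: no K needed to build the tree ...
hc <- hclust(dist(x), method = "ward.D2")
# ... the number of groups is chosen afterwards, when the tree is cut
hc_groups <- cutree(hc, k = 3)

# Cross-tabulate the two cluster assignments
table(kmeans = km$cluster, hierarchical = hc_groups)
```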

Rattle Graphical Interface for R Language


Editor's Notes

  1. In the correlation plot, the intensity of the color is maximal for a perfect correlation and minimal (white) if there is no correlation. Shades of red are used for negative correlations and blue for positive correlations.