Innovation and Reinvention Driving Transformation
OCTOBER 9, 2018
2018 HPCC Systems® Community Day
Luke Pezet, Archway Health
HPCC Systems vs SAS: The Final Countdown
“Change is the only constant in life”
— Heraclitus
Me, Me and Me...at Archway
• Solution Architect with over 15 years of experience
• At Archway Health Advisors for about 5 years
• Archway helps care providers manage bundled payment programs.
• Needed to process medical claims 5 years ago and chose HPCC Systems over SAS, Hadoop*, etc.
• New employees brought other technologies, including SAS
Introduction
HPCC Systems
• Open-source, data-intensive computing platform developed by LexisNexis Risk Solutions.
• Development started before 2000.
• Includes a scalable data refinery called Thor and a scalable rapid data delivery engine called ROXIE.
SAS (“Statistical Analysis System”)
• Proprietary software suite developed by SAS Institute that provides advanced analytics.
• Development started in 1966.
Use Case
• Based on the web book Regression with SAS, Chapter 1: Simple and Multiple Regression, from the Institute for Digital Research and Education at UCLA.
• The focus is data analysis and how to use the software for regression analysis, not the statistical basis of multiple regression or which criteria are best for choosing models.
• The data was created by randomly sampling 400 elementary schools from the California Department of Education's API 2000 dataset.
• It contains a measure of school academic performance as well as other attributes such as class size, enrollment, and poverty.
Helper
SASsy ECL bundle
ecl-bundle install https://github.com/lpezet/SASsy.git
Usage:
IMPORT SASsy;
// OR
IMPORT SASsy.PROC;
Loading data
SAS
DATA scores;
INFILE datalines dsd;
INPUT Name : $9. Score1-Score3 Team ~ $25. Div $;
DATALINES;
Smith,12,22,46,"Green Hornets, Atlanta",AAA
Mitchel,23,19,25,"High Volts, Portland",AAA
Jones,09,17,54,"Vulcans, Las Vegas",AA
;
ECL
// DIV is the ECL integer-division operator, so the last field is named Division here.
layout := { STRING Name; UNSIGNED Score1; UNSIGNED Score2; UNSIGNED Score3; STRING Team; STRING Division; };
scores := DATASET( [
  { 'Smith', 12, 22, 46, 'Green Hornets, Atlanta', 'AAA' },
  { 'Mitchel', 23, 19, 25, 'High Volts, Portland', 'AAA' },
  { 'Jones', 9, 17, 54, 'Vulcans, Las Vegas', 'AA' }
], layout );
Looking at the data (SAS)
PROC PRINT data="elemapi"(obs=5);
run;
Looking at the data (ECL)
IMPORT SASsy.PROC;
PROC.PRINT( ElemAPIDS, 5 );
// Or, without the bundle: CHOOSEN( ElemAPIDS, 5 );
Looking at the data (SAS)
PROC CONTENTS data="elemapi";
run;
Looking at the data (ECL)
IMPORT SASsy.PROC;
PROC.CONTENTS( ElemAPIDS );
Looking at the data (SAS)
PROC MEANS data="elemapi";
var api00 acs_k3 meals full;
run;
Looking at the data (ECL)
IMPORT SASsy.PROC;
PROC.MEANS( oMeans, ElemAPIDS, 'api00,acs_k3,meals,full' );
OUTPUT( oMeans, NAMED('MEANS'));
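For comparison, a minimal sketch of how similar summary statistics could be computed with ECL built-in aggregates alone, without the SASsy bundle. The record layout, the inline rows, and the output name are made up for illustration and stand in for ElemAPIDS; this is not how the bundle itself is implemented.

```ecl
// Hypothetical layout and invented rows standing in for ElemAPIDS.
SchoolRec := RECORD
  UNSIGNED2 api00;
  INTEGER2  acs_k3;
  UNSIGNED1 meals;
END;
Schools := DATASET([{700, 19, 60}, {580, 20, 90}, {820, 18, 20}, {640, 21, 75}], SchoolRec);

// A TABLE with aggregates and no grouping expression yields a single summary row.
Summary := TABLE(Schools, {
  UNSIGNED4 n         := COUNT(GROUP);
  REAL8     mean_api  := AVE(GROUP, api00);
  REAL8     std_api   := SQRT(VARIANCE(GROUP, api00));
  REAL8     min_api   := MIN(GROUP, api00);
  REAL8     max_api   := MAX(GROUP, api00);
});
OUTPUT(Summary, NAMED('PlainMeans'));
```

One caveat, as far as I understand the built-ins: VARIANCE returns the population variance, whereas PROC MEANS reports the sample standard deviation by default, so the two figures can differ slightly on small datasets.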
Looking at the data (ECL)
IMPORT DataPatterns;
DataPatterns.Profile( ElemAPIDS,
  features := 'fill_rate,best_ecl_types,cardinality,lengths,min_max,mean,std_dev,quartiles,correlations' );
Looking at the data (SAS)
PROC UNIVARIATE data="elemapi";
var acs_k3;
run;
Looking at the data (ECL)
IMPORT SASsy.PROC;
PROC.UNIVARIATE( ElemAPIDS, 'acs_k3' );
[Screenshot of results; panels shown: Basics, Missing Values, Extreme - Lowest, Extreme - Highest]
Looking at the data (SAS)
PROC FREQ data="elemapi";
tables acs_k3;
run;
Looking at the data (ECL)
IMPORT SASsy.PROC;
PROC.FREQ( ACSK3Freq, ElemAPIDS, 'acs_k3' );
OUTPUT( ACSK3Freq, NAMED('Frequency') );
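A frequency table like this can also be sketched in plain ECL with a grouped TABLE. The layout, the inline rows, and the names nRows, Freq and WithPct below are invented for illustration; the bundle's actual output columns may differ.

```ecl
// Hypothetical layout and invented rows standing in for ElemAPIDS.
SchoolRec := RECORD
  UNSIGNED2 snum;
  INTEGER2  acs_k3;
END;
Schools := DATASET([{1, 19}, {2, 20}, {3, 19}, {4, 21}, {5, 19}], SchoolRec);

nRows := COUNT(Schools);

// One row per distinct acs_k3 value, with the number of schools in each group.
Freq := TABLE(Schools, { acs_k3; UNSIGNED4 frequency := COUNT(GROUP); }, acs_k3);

// A second TABLE without aggregates acts as a projection and adds the percentage.
WithPct := TABLE(Freq, { acs_k3; frequency; REAL8 percent := 100.0 * frequency / nRows; });

OUTPUT(SORT(WithPct, acs_k3), NAMED('PlainFreq'));
```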
Looking at the data (SAS)
PROC UNIVARIATE data="elemapi";
var acs_k3;
histogram / cfill=gray;
run;
Looking at the data (ECL)
IMPORT Visualizer;
PlotData := TABLE( SORT( ElemAPIDS, acs_k3 ), { STRING label := acs_k3; COUNT(GROUP); }, acs_k3 );
// The named output must match the name passed to the Visualizer call.
OUTPUT( PlotData, NAMED('PlotData') );
Visualizer.MultiD.Column( 'myChart',, 'PlotData' );
MACROs
SAS
%MACRO MISSINGCHECK(VAR, TYPE);
PROC SQL;
CREATE TABLE &VAR._&TYPE. AS
SELECT DISTINCT CLM_TYPE_1, COUNT(SYSKEY) AS &VAR._MISSING
FROM OUTPUT.&TYPE.
WHERE &VAR. IS MISSING
GROUP BY CLM_TYPE_1
ORDER BY CLM_TYPE_1;
QUIT;
%MEND MISSINGCHECK;
%MISSINGCHECK(MEMBER_ID, &EPI.GENERAL);
%MISSINGCHECK(CLAIM_ID, &EPI.GENERAL);
%MISSINGCHECK(MS_DRG, &EPI.GENERAL);
%MISSINGCHECK(ADM_DGNS, &EPI.GENERAL);
ECL
MissingCheck( pDS, pField, pMissingValue, pByField ) := FUNCTIONMACRO
  #UNIQUENAME(tabled)
  %tabled% := TABLE( pDS( pField = pMissingValue ), { pByField; COUNT(GROUP); }, pByField );
  #UNIQUENAME(sorted)
  %sorted% := SORT( %tabled%, pByField );
  RETURN %sorted%;
ENDMACRO;
MissingCheck( ElemAPIDS, meals, '', dnum );
MissingCheck( ElemAPIDS, acs_k3, '', dnum );
MissingCheck( ElemAPIDS, api00, '', dnum );
Multiple Regression (SAS)
PROC REG data="c:\sasreg\elemapi";
model api00 = acs_k3 meals full;
run;
Multiple Regression (ECL)
IMPORT ML_Core;
IMPORT LinearRegression;
IMPORT SASsy;
IndVars := 'acs_k3,meals,full';
DepVars := 'api00';
/* … */
ML_Core.ToField( inddata, inddataNF, __id__ );
ML_Core.ToField( depdata, depdataNF, __id__ );
MyOLS := LinearRegression.OLS( inddataNF, depdataNF );
MyModel := MyOLS.GetModel;
SASsy.Utils.reg_report_on_all( MyOLS, MyModel, inddataNF );
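As a quick sanity check alongside the bundle-based model, the single-predictor case can be sketched with ECL built-in aggregates only. This covers simple regression of api00 on meals, not the three-predictor model above, and it is not the method the LinearRegression bundle uses. The layout and rows are invented for illustration.

```ecl
// Hypothetical layout and invented rows standing in for ElemAPIDS.
SchoolRec := RECORD
  UNSIGNED2 api00;
  UNSIGNED1 meals;
END;
Schools := DATASET([{700, 60}, {580, 90}, {820, 20}, {640, 75}], SchoolRec);

// For one predictor, the least-squares slope is cov(x, y) / var(x),
// and the intercept is mean(y) - slope * mean(x).
slope     := COVARIANCE(Schools, meals, api00) / VARIANCE(Schools, meals);
intercept := AVE(Schools, api00) - slope * AVE(Schools, meals);

// With a single predictor, R-squared equals the squared correlation.
r := CORRELATION(Schools, meals, api00);

OUTPUT(slope, NAMED('Slope'));
OUTPUT(intercept, NAMED('Intercept'));
OUTPUT(r * r, NAMED('RSquared'));
```

Running the same one-variable model through PROC REG or the OLS module should give matching coefficients, which makes this a cheap cross-check when validating one platform against the other.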
More
ECL Machine Learning Library
• Statistics (e.g. Means, Std Deviation, Modes, Medians, NTiles, etc.)
• Regression
• Clustering (e.g. K-Means)
• Classification (e.g. Logistic Regression, Decision Trees, Perceptron, etc.)
• Unstructured Data (Tokenize, Transform, CoLocation)
• Association (e.g. AprioriN)
• Matrix Manipulation
Today
HPCC Systems is used to process data at scale and on a more frequent basis
• Process medical claims using Thor and deliver results using ROXIE
• Run ETL/ELT processes to load, clean, prepare data
• Run more advanced processing to generate outputs (Bundle Engine)
• Clusters of 8+ nodes
SAS is used for research, exploratory data analysis, and modeling.
• Uses HPCC outputs as input
• Single instance
• Restricted on CPU/RAM
Tomorrow
HPCC Systems
• Still run ETL/ELT processes to load, clean, prepare data
• Run processes that need to happen more frequently
• Port more advanced data analysis and modeling features to ECL
• Make clusters easier to create so that experimentation is effortless
SAS
• 1 server
• R&D for now
• Validate/compare results with HPCC Systems
Thank you!
OUTPUT(' ');

 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 

HPCC Systems vs SAS: The Final Countdown

  • 1. Innovation and Reinvention Driving Transformation OCTOBER 9, 2018 2018 HPCC Systems® Community Day Luke Pezet, Archway Health HPCC Systems vs SAS: The Final Countdown
  • 2. “Change is the only constant in life” - Heraclitus
  • 3. Me, Me and Me...at Archway
      • Solution Architect with over 15 years of experience
      • Has worked at Archway Health Advisors for about 5 years
      • Archway helps care providers manage bundled payment programs.
      • Needed to process medical claims 5 years ago and chose HPCC Systems over SAS, Hadoop*, etc.
      • New employees brought other technologies, including SAS
  • 4. Introduction
      HPCC Systems
      • Open-source, data-intensive computing platform developed by LexisNexis Risk Solutions.
      • Development started before 2000.
      • Scalable data refinery called Thor and scalable rapid data delivery engine called ROXIE.
      SAS (“Statistical Analysis System”)
      • Proprietary software suite developed by SAS Institute that provides advanced analytics.
      • Development started in 1966.
  • 5. Use Case
      • Based on the web book Regression With SAS, Chapter 1 - Simple And Multiple Regression, from the Institute for Digital Research and Education at UCLA.
      • It is about data analysis and demonstrates how to use software for regression analysis. It is not about the statistical basis of multiple regression, or which criterion is best for choosing models, etc.
      • The data was created by randomly sampling 400 elementary schools from the California Department of Education's API 2000 dataset.
      • It contains a measure of school academic performance as well as other attributes such as class size, enrollment, poverty, etc.
  • 6. Helper: SASsy ECL bundle
      Install:
        ecl-bundle install https://github.com/lpezet/SASsy.git
      Usage:
        IMPORT SASsy;
        // OR
        IMPORT SASsy.PROC;
  • 7. Loading data
      SAS:
        DATA scores;
          INFILE datalines dsd;
          INPUT Name : $9. Score1-Score3 Team ~ $25. Div $;
          DATALINES;
        Smith,12,22,46,"Green Hornets, Atlanta",AAA
        Mitchel,23,19,25,"High Volts, Portland",AAA
        Jones,09,17,54,"Vulcans, Las Vegas",AA
        ;
      ECL:
        layout := {
          STRING Name;
          UNSIGNED Score1;
          UNSIGNED Score2;
          UNSIGNED Score3;
          STRING Team;
          STRING Div;
        };
        scores := DATASET( [
          { 'Smith', 12, 22, 46, 'Green Hornets, Atlanta', 'AAA' },
          { 'Mitchel', 23, 19, 25, 'High Volts, Portland', 'AAA' },
          { 'Jones', 9, 17, 54, 'Vulcans, Las Vegas', 'AA' }
        ], layout );
  • 8. Looking at the data (SAS)
        PROC PRINT data="elemapi" (obs=5);
        run;
  • 9. Looking at the data (ECL)
        IMPORT SASsy.PROC;
        PROC.PRINT( ElemAPIDS, 5 );
        // CHOOSEN( ElemAPIDS, 5 );
  • 10. Looking at the data (SAS)
        PROC CONTENTS data="elemapi";
        run;
  • 11. Looking at the data (ECL)
        IMPORT SASsy.PROC;
        PROC.CONTENTS( ElemAPIDS );
  • 12. Looking at the data (SAS)
        PROC MEANS data="elemapi";
          var api00 acs_k3 meals full;
        run;
  • 13. Looking at the data (ECL)
        IMPORT SASsy.PROC;
        PROC.MEANS( oMeans, ElemAPIDS, 'api00,acs_k3,meals,full' );
        OUTPUT( oMeans, NAMED('MEANS') );
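For readers without the SASsy bundle, the kind of summary PROC MEANS reports can be approximated with ECL's built-in aggregates. The following is a minimal sketch, not the bundle's implementation: the `Schools` dataset, its rows and the output field names are made up for illustration, and it summarizes a single variable. Note that ECL's `VARIANCE` is documented as the population variance, so the standard deviation here may differ slightly from the sample standard deviation SAS prints by default.

```ecl
// Hypothetical stand-in for ElemAPIDS: made-up rows, only three of its fields.
SchoolLayout := RECORD
  INTEGER api00;
  INTEGER acs_k3;
  INTEGER meals;
END;

Schools := DATASET([
  {690, 16, 67},
  {570, 15, 92},
  {546, 17, 97},
  {612, 19, 80}
], SchoolLayout);

// With no grouping expression, TABLE aggregates the whole dataset into one row.
Api00Means := TABLE(Schools, {
  UNSIGNED cnt      := COUNT(GROUP);
  REAL8    mean_val := AVE(GROUP, api00);
  REAL8    std_dev  := SQRT(VARIANCE(GROUP, api00)); // population form
  INTEGER  min_val  := MIN(GROUP, api00);
  INTEGER  max_val  := MAX(GROUP, api00);
});

OUTPUT(Api00Means, NAMED('MEANS_api00'));
```

Repeating the TABLE per variable, or wrapping it in a FUNCTIONMACRO, would cover several variables at once, which is presumably the convenience a helper like PROC.MEANS provides.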
  • 14. Looking at the data (ECL)
        IMPORT DataPatterns;
        DataPatterns.Profile( ElemAPIDS, features := 'fill_rate,best_ecl_types,cardinality,lengths,min_max,mean,std_dev,quartiles,correlations' );
  • 15. Looking at the data (SAS)
        PROC UNIVARIATE data="elemapi";
          var acs_k3;
        run;
  • 16. Looking at the data (ECL)
        IMPORT SASsy.PROC;
        PROC.UNIVARIATE( ElemAPIDS, 'acs_k3' );
      Output sections shown on the slide: Basics, Missing Values, Extreme - Lowest, Extreme - Highest
  • 17. Looking at the data (SAS)
        PROC FREQ data="elemapi";
          tables acs_k3;
        run;
  • 18. Looking at the data (ECL)
        IMPORT SASsy.PROC;
        PROC.FREQ( ACSK3Freq, ElemAPIDS, 'acs_k3' );
        OUTPUT( ACSK3Freq, NAMED('Frequency') );
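A one-way frequency table like the one PROC FREQ produces can also be sketched in plain ECL with a grouped TABLE. This is an illustrative assumption about the shape of the result, not the SASsy implementation; the `Schools` rows and the `freq_count` and `pct` field names are hypothetical.

```ecl
// Hypothetical stand-in for ElemAPIDS with made-up class-size values.
ClassSizeLayout := RECORD
  INTEGER acs_k3;
END;

Schools := DATASET([{16}, {17}, {17}, {19}, {19}, {19}], ClassSizeLayout);

TotalRows := COUNT(Schools);

// Grouping on acs_k3 yields one row per distinct value with its count
// and its share of all rows ("/" is real division in ECL).
ClassSizeFreq := TABLE(Schools, {
  acs_k3;
  UNSIGNED freq_count := COUNT(GROUP);
  REAL8    pct        := COUNT(GROUP) * 100 / TotalRows;
}, acs_k3);

OUTPUT(SORT(ClassSizeFreq, acs_k3), NAMED('Frequency'));
```

The same grouped TABLE pattern appears in the histogram and MissingCheck slides, where the per-group count feeds a chart or a missing-value report instead.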
  • 19. Looking at the data (SAS)
        PROC UNIVARIATE data="elemapi";
          var acs_k3;
          histogram / cfill=gray;
        run;
  • 20. Looking at the data (ECL)
        IMPORT Visualizer;
        PlotData := TABLE( SORT( ElemAPIDS, acs_k3 ), { STRING label := acs_k3; COUNT(GROUP); }, acs_k3 );
        OUTPUT( PlotData, NAMED('PlotData') );
        Visualizer.MultiD.Column( 'myChart',, 'PlotData' );
  • 21. MACROs
      SAS:
        %MACRO MISSINGCHECK(VAR, TYPE);
          PROC SQL;
            CREATE TABLE &VAR._&TYPE. AS
            SELECT DISTINCT CLM_TYPE_1, COUNT(SYSKEY) AS &VAR._MISSING
            FROM OUTPUT.&TYPE.
            WHERE &VAR. IS MISSING
            GROUP BY CLM_TYPE_1
            ORDER BY CLM_TYPE_1;
          QUIT;
        %MEND MISSINGCHECK;
        %MISSINGCHECK(MEMBER_ID, &EPI.GENERAL);
        %MISSINGCHECK(CLAIM_ID, &EPI.GENERAL);
        %MISSINGCHECK(MS_DRG, &EPI.GENERAL);
        %MISSINGCHECK(ADM_DGNS, &EPI.GENERAL);
      ECL:
        MissingCheck( pDS, pField, pMissingValue, pByField ) := FUNCTIONMACRO
          #UNIQUENAME(tabled)
          %tabled% := TABLE( pDS( pField = pMissingValue ), { pByField; COUNT(GROUP); }, pByField );
          #UNIQUENAME(sorted)
          %sorted% := SORT( %tabled%, pByField );
          RETURN %sorted%;
        ENDMACRO;
        MissingCheck( ElemAPIDS, meals, '', dnum );
        MissingCheck( ElemAPIDS, acs_k3, '', dnum );
        MissingCheck( ElemAPIDS, api00, '', dnum );
  • 22. Multiple Regression (SAS)
        PROC REG data="c:\sasreg\elemapi";
          model api00 = acs_k3 meals full;
        run;
  • 23. Multiple Regression (ECL)
        IMPORT ML_Core;
        IMPORT LinearRegression;
        IMPORT SASsy;
        IndVars := 'acs_k3,meals,full';
        DepVars := 'api00';
        /* … */
        ML_Core.ToField( inddata, inddataNF, __id__ );
        ML_Core.ToField( depdata, depdataNF, __id__ );
        MyOLS := LinearRegression.OLS( inddataNF, depdataNF );
        MyModel := MyOLS.GetModel;
        SASsy.Utils.reg_report_on_all( MyOLS, MyModel, inddataNF );
  • 24. More ECL: Machine Learning Library
      • Statistics (e.g. Means, Std Deviation, Modes, Medians, NTiles, etc.)
      • Regression
      • Clustering (e.g. K-Means)
      • Classification (e.g. Logistic Regression, Decision Trees, Perceptron, etc.)
      • Unstructured Data (Tokenize, Transform, CoLocation)
      • Association (e.g. AprioriN)
      • Matrix Manipulation
  • 25. Today
      HPCC Systems: used to process data at scale and on a more frequent basis
      • Process medical claims using Thor and deliver results using ROXIE
      • Run ETL/ELT processes to load, clean and prepare data
      • Run more advanced processing to generate outputs (Bundle Engine)
      • Clusters of 8+ nodes
      SAS: used to run research, exploratory data analysis and modeling
      • Uses HPCC Systems outputs as input
      • Single instance
      • Restricted on CPU/RAM
  • 26. Tomorrow
      HPCC Systems
      • Still run ETL/ELT processes to load, clean and prepare data
      • Run processes that need to happen more frequently
      • Port more advanced data analysis and modeling features to ECL
      • Make it easier to create clusters, so that experimentation is effortless
      SAS
      • 1 server
      • R&D for now
      • Validate/compare results with HPCC Systems