SPIDAL Java: High Performance Data Analytics with Java on Large Multicore HPC Clusters
sekanaya@indiana.edu
https://github.com/DSC-SPIDAL | http://saliya.org
24th High Performance Computing Symposium (HPC 2016)
April 3-6, 2016, Pasadena, CA, USA
as part of the SCS Spring Simulation Multi-Conference (SpringSim'16)
Saliya Ekanayake | Supun Kamburugamuve | Geoffrey Fox
High Performance?
4/4/2016 HPC 2016 2
[Figure: speedup when scaling from 48 nodes to 128 nodes on an Intel Haswell HPC cluster with 40 Gbps InfiniBand. Series: SPIDAL Java (40x speedup); typical Java with all MPI; typical Java with threads and MPI; ideal, 64x (if life was so fair!). Annotation: "We’ll discuss today".]
Introduction
• Big Data and HPC
  ◦ Big data + cloud is the norm, but not always
  ◦ Some applications demand significant computation and communication
  ◦ HPC clusters are ideal
  ◦ However, it’s not easy
• Java
  ◦ Unprecedented big data ecosystem
    ▪ Apache has over 300 big data systems, mostly written in Java
  ◦ Performance and APIs have improved greatly with 1.7x
  ◦ Not much used in HPC, but can give comparable performance (e.g. SPIDAL Java)
    ▪ Comparable performance to C (more on this later)
    ▪ Google query https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=java%20faster%20than%20c will point to interesting discussions on why.
  ◦ Interoperable and productive
SPIDAL Java
• Scalable Parallel Interoperable Data Analytics Library (SPIDAL)
  ◦ Available at https://github.com/DSC-SPIDAL
• Includes Multidimensional Scaling (MDS) and Clustering Applications
  ◦ DA-MDS
    ▪ Y. Ruan and G. Fox, "A Robust and Scalable Solution for Interpolative Multidimensional Scaling with Weighting," eScience (eScience), 2013 IEEE 9th International Conference on, Beijing, 2013, pp. 61-69. doi: 10.1109/eScience.2013.30
  ◦ DA-PWC
    ▪ Fox, G. C. Deterministic annealing and robust scalable data mining for the data deluge. In Proceedings of the 2nd International Workshop on Petascale Data Analytics: Challenges and Opportunities, PDAC ’11, ACM (New York, NY, USA, 2011), 39–40.
  ◦ DA-VS
    ▪ Fox, G., Mani, D., and Pyne, S. Parallel deterministic annealing clustering and its application to LC-MS data analysis. In Big Data, 2013 IEEE International Conference on (Oct 2013), 665–673.
  ◦ MDSasChisq
    ▪ General MDS implementation using the Levenberg-Marquardt algorithm
    ▪ Levenberg, K. A method for the solution of certain non-linear problems in least squares. Quarterly Journal of Applied Mathematics II, 2 (1944), 164–168.
SPIDAL Java Applications
• Gene Sequence Clustering and Visualization
  ◦ Results at WebPlotViz
    ▪ https://spidal-gw.dsc.soic.indiana.edu/resultsets/991946447
    ▪ https://spidal-gw.dsc.soic.indiana.edu/resultsets/795366853
  ◦ A few snapshots
[Figure: processing pipeline with the stages Sequence File, Pairwise Alignment, DA-MDS, DA-PWC, and 3D Plot]
[Snapshots: 100,000 fungi sequences; 3D phylogenetic tree; 3D plot of vector data]
SPIDAL Java Applications
• Stocks Data Analysis
  ◦ Time series view of stocks
  ◦ E.g. with 1 year moving window https://spidal-gw.dsc.soic.indiana.edu/public/timeseriesview/825496517
Performance Challenges
• Intra-node Communication
• Exploiting Fat Nodes
• Overhead of Garbage Collection
• Cost of Heap Allocated Objects
• Cache and Memory Access
Performance Challenges
• Intra-node Communication [1/3]
  ◦ Large core counts per node: 24 to 36
  ◦ Data analytics use global collective communication: Allreduce, Allgather, Broadcast, etc.
  ◦ HPC simulations, in contrast, typically use local communication for tasks such as halo exchanges.
Benchmark: 3 million double values distributed uniformly over 48 nodes
• Identical message size per node, yet 24 MPI ranks per node is ~10 times slower than 1 MPI rank per node
• Suggests the number of ranks per node should be 1 for the best performance
• But how to exploit all available cores?
Performance Challenges
• Intra-node Communication [2/3]
  ◦ Solution: Shared memory
    ▪ Use threads? → didn’t work well (explained shortly)
    ▪ Processes with shared memory communication
    ▪ Custom implementation in SPIDAL Java outside of the MPI framework
• Only 1 rank per node participates in the MPI collective call
• Others exchange data using shared memory maps
[Figures: communication in the 100K DA-MDS run and in the 200K DA-MDS run]
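A minimal sketch of the leader-based scheme described above, under stated assumptions: it uses a memory-mapped file through standard java.nio rather than the OpenHFT Bytes API that SPIDAL Java relies on, all class and method names are hypothetical, and the "ranks" are simulated inside one process. In a real run, each process on the node would map the same file, and a synchronization step would have to separate the writes from the leader's read before the leader joins the MPI collective.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.DoubleBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

final class SharedMemoryReduceSketch {

    /** Maps a file region holding one slot of `count` doubles per local rank. */
    static DoubleBuffer mapSlots(Path file, int ranksPerNode, int count) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            long bytes = (long) ranksPerNode * count * Double.BYTES;
            // The mapping stays valid after the channel is closed.
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, bytes);
            return map.asDoubleBuffer();
        }
    }

    /** Each local rank writes its partial result into its own slot (off-heap). */
    static void writeSlot(DoubleBuffer slots, int localRank, double[] values) {
        int base = localRank * values.length;
        for (int i = 0; i < values.length; i++) {
            slots.put(base + i, values[i]);
        }
    }

    /** The node leader sums all slots; only this result would go into the MPI collective. */
    static double[] reduceOnLeader(DoubleBuffer slots, int ranksPerNode, int count) {
        double[] sum = new double[count];
        for (int r = 0; r < ranksPerNode; r++) {
            for (int i = 0; i < count; i++) {
                sum[i] += slots.get(r * count + i);
            }
        }
        return sum;
    }

    /** Simulates `ranksPerNode` local ranks; rank r contributes the value r + 1 everywhere. */
    static double[] demo(int ranksPerNode, int count) {
        try {
            Path file = Files.createTempFile("shm-sketch", ".bin");
            file.toFile().deleteOnExit();
            DoubleBuffer slots = mapSlots(file, ranksPerNode, count);
            for (int r = 0; r < ranksPerNode; r++) {
                double[] partial = new double[count];
                Arrays.fill(partial, r + 1);
                writeSlot(slots, r, partial);
            }
            return reduceOnLeader(slots, ranksPerNode, count);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo(4, 3)));
    }
}
```

With this layout the inter-node collective sees one message per node regardless of how many processes run on it, which is the property the benchmark on the earlier slide motivates.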
Performance Challenges
• Intra-node Communication [3/3]
  ◦ Heterogeneity support
    ▪ Nodes with 24 and 36 cores
    ▪ Automatically detects configuration and allocates memory maps
  ◦ Implementation
    ▪ Custom shared memory implementation using OpenHFT’s Bytes API
    ▪ Supports collective calls necessary within SPIDAL Java
Performance Challenges
• Exploiting Fat Nodes [1/2]
  ◦ Large #Cores per Node
  ◦ E.g. 1 Node in Juliet HPC cluster
    ▪ 2 Sockets
    ▪ 12 Cores each
    ▪ 2 Hardware threads per core
    ▪ L1 and L2 per core
    ▪ L3 shared per socket
  ◦ Two approaches
    ▪ All processes → 1 process per core
    ▪ 1 process with multiple threads
  ◦ Which is better?
[Figure: node layout with Socket 0 and Socket 1; each core has 2 hardware threads (HTs)]
Performance Challenges
• Exploiting Fat Nodes [2/2]
  ◦ Suggested thread model in literature → fork-join regions within a process
1. Thread creation and scheduling overhead accumulates over iterations and is significant (~5%)
   • True for Java threads as well as OpenMP in C/C++ (see https://github.com/esaliya/JavaThreads and https://github.com/esaliya/CppStack/tree/master/omp2/letomppar)
2. Long-running threads do better than this model, but still have non-negligible overhead
3. The solution is to use processes with shared memory communication, as in SPIDAL Java
[Figure: fork-join regions within a process, repeated over iterations; labels: "process", "Prev. Optimization"]
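To make the two thread models concrete, here is a small sketch, assuming plain java.lang.Thread and java.util.concurrent.CyclicBarrier (the deck does not say which primitives were benchmarked, and the class and method names are hypothetical). The first method creates and joins threads in every iteration, as in a fork-join region; the second creates the threads once and synchronizes them at a barrier. Both compute the same result; only the threading structure differs.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicLong;

final class ThreadModelsSketch {

    private static void joinAll(Thread[] workers) {
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    /** Fork-join model: a fresh set of threads is created and joined in every iteration. */
    static long forkJoinPerIteration(int threads, int iterations) {
        AtomicLong total = new AtomicLong();
        for (int it = 0; it < iterations; it++) {
            Thread[] workers = new Thread[threads];
            for (int t = 0; t < threads; t++) {
                final int id = t;
                workers[t] = new Thread(() -> total.addAndGet(id + 1)); // stand-in for real work
                workers[t].start();
            }
            joinAll(workers); // implicit barrier at the end of the parallel region
        }
        return total.get();
    }

    /** Long-running model: threads are created once and meet at a barrier after each iteration. */
    static long longRunning(int threads, int iterations) {
        AtomicLong total = new AtomicLong();
        CyclicBarrier barrier = new CyclicBarrier(threads);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                try {
                    for (int it = 0; it < iterations; it++) {
                        total.addAndGet(id + 1); // same stand-in work as above
                        barrier.await();         // replaces thread teardown and re-creation
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[t].start();
        }
        joinAll(workers);
        return total.get();
    }

    public static void main(String[] args) {
        System.out.println(forkJoinPerIteration(4, 100));
        System.out.println(longRunning(4, 100));
    }
}
```

The sketch shows structure only and makes no timing claim; the ~5% figure above comes from the authors' own benchmarks.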
Performance Challenges
• Garbage Collection
  ◦ “Stop the world” events are expensive
    ▪ Especially for parallel processes with collective communications
  ◦ Typical OOP → allocate, use, forget
    ▪ Original SPIDAL code produced frequent garbage of small arrays
  ◦ Solution: Zero-GC using
    ▪ Static allocation and reuse
    ▪ Off-heap static buffers (more on next slide)
  ◦ Advantage
    ▪ No GC (obvious)
    ▪ Scale to larger problem sizes
    ▪ E.g. the original SPIDAL code required a 5 GB heap per process (x 24 processes = 120 GB per node) to handle 200K MDS. The optimized code uses < 1 GB of heap and finishes within the same time.
    ▪ Note: physical memory is 128 GB per node, so the optimized SPIDAL can now do 1 million point MDS within hardware limits.
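The contrast between "allocate, use, forget" and static allocation with reuse can be sketched as follows. This is an illustration under assumptions, not SPIDAL code: the class, its methods, and the distance computation are hypothetical, and the off-heap buffer uses the standard `ByteBuffer.allocateDirect` call rather than any particular library.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.DoubleBuffer;

final class ZeroGcSketch {
    private final double[] scratch;     // on-heap scratch space, allocated once
    private final DoubleBuffer offHeap; // off-heap buffer, allocated once

    ZeroGcSketch(int dim) {
        scratch = new double[dim];
        offHeap = ByteBuffer.allocateDirect(dim * Double.BYTES)
                .order(ByteOrder.nativeOrder())
                .asDoubleBuffer();
    }

    /** Allocate-use-forget: the temporary array becomes garbage on every call. */
    static double distanceAllocating(double[] a, double[] b) {
        double[] diff = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            diff[i] = a[i] - b[i];
        }
        double sum = 0.0;
        for (double d : diff) {
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Reuse: the same preallocated scratch array serves every call, so nothing is allocated. */
    double distanceReusing(double[] a, double[] b) {
        for (int i = 0; i < a.length; i++) {
            scratch[i] = a[i] - b[i];
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += scratch[i] * scratch[i];
        }
        return Math.sqrt(sum);
    }

    /** Copies values into the preallocated off-heap buffer using absolute puts. */
    void publish(double[] values) {
        for (int i = 0; i < values.length; i++) {
            offHeap.put(i, values[i]);
        }
    }

    double published(int index) {
        return offHeap.get(index);
    }

    public static void main(String[] args) {
        ZeroGcSketch z = new ZeroGcSketch(2);
        double[] a = {3.0, 4.0};
        double[] b = {0.0, 0.0};
        System.out.println(distanceAllocating(a, b));
        System.out.println(z.distanceReusing(a, b));
    }
}
```

The reuse variant is not thread-safe because of the shared scratch array, which fits a design where each process owns its buffers.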
[Figures: heap usage per process before and after optimization]
• Before: heap size per process reaches -Xmx (2.5 GB) early in the computation; frequent GC
• After: heap size per process (~1.1 GB) stays well below -Xmx (2.5 GB); virtually no GC activity
Performance Challenges
• I/O with Heap Allocated Objects
  ◦ Java-to-native I/O creates copies of objects in heap
    ▪ Otherwise can’t guarantee object’s memory location due to GC
    ▪ Too expensive
  ◦ Solution: Off-heap buffers (memory maps)
    ▪ Initial data loading → significantly faster than Java stream API calls
    ▪ Intra-node messaging → gives the best performance
    ▪ MPI inter-node communications
• Cache and Memory Access
  ◦ Nested data structures are neat, but expensive
  ◦ Solution: Contiguous memory with 1D arrays over 2D structures
    ▪ Indirect memory references are costly
  ◦ Also, adopted from HPC
    ▪ Blocked loops and loop ordering
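As an illustration of the cache and memory access points above, the sketch below stores square matrices in row-major 1D arrays and multiplies them with blocked loops in i-k-j order. The matrix multiply is only a stand-in kernel chosen for this example (the deck does not show SPIDAL's actual loops), and the class name, method names, and block size parameter are hypothetical.

```java
import java.util.Arrays;

final class ContiguousBlockedSketch {

    /** Flattens a square 2D array into one contiguous row-major 1D array. */
    static double[] flatten(double[][] m) {
        int n = m.length;
        double[] flat = new double[n * n];
        for (int i = 0; i < n; i++) {
            System.arraycopy(m[i], 0, flat, i * n, n);
        }
        return flat;
    }

    /** Computes C = A x B for n x n row-major matrices using blocks of size `block`. */
    static double[] multiplyBlocked(double[] a, double[] b, int n, int block) {
        double[] c = new double[n * n];
        for (int ii = 0; ii < n; ii += block) {
            int iMax = Math.min(ii + block, n);
            for (int kk = 0; kk < n; kk += block) {
                int kMax = Math.min(kk + block, n);
                for (int jj = 0; jj < n; jj += block) {
                    int jMax = Math.min(jj + block, n);
                    for (int i = ii; i < iMax; i++) {
                        for (int k = kk; k < kMax; k++) {
                            double aik = a[i * n + k]; // one direct index, no row pointer chase
                            for (int j = jj; j < jMax; j++) {
                                // innermost loop walks both b and c with unit stride
                                c[i * n + j] += aik * b[k * n + j];
                            }
                        }
                    }
                }
            }
        }
        return c;
    }

    public static void main(String[] args) {
        double[] a = flatten(new double[][]{{1, 2}, {3, 4}});
        double[] b = flatten(new double[][]{{5, 6}, {7, 8}});
        System.out.println(Arrays.toString(multiplyBlocked(a, b, 2, 1)));
    }
}
```

A `double[][]` in Java is an array of references to separately allocated rows, so each element access goes through an extra indirection; the flat layout avoids that, and blocking keeps the working set of each inner loop nest small.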
Evaluation
• HPC Cluster
  ◦ 128 Intel Haswell nodes with 2.3 GHz nominal frequency
    ▪ 96 nodes with 24 cores on 2 sockets (12 cores each)
    ▪ 32 nodes with 36 cores on 2 sockets (18 cores each)
  ◦ 128 GB memory per node
  ◦ 40 Gbps InfiniBand
• Software
  ◦ Java 1.8
  ◦ OpenHFT JavaLang 6.7.2
  ◦ Habanero Java 0.1.4
  ◦ OpenMPI 1.10.1
Application: DA-MDS
• Computation grows as $O(N^2)$
• Communication is global and is $O(N)$
[Figures: DA-MDS performance for 100K, 200K, and 400K points]
• Total parallelism of 1152 across 48 nodes
• All combinations of 24-way parallelism per node
• LHS is all processes
• RHS is all internal threads with MPI across nodes
1. With shared memory (SM) communication in SPIDAL, processes outperform threads (blue line)
2. Other optimizations further improve performance (green line)
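The combinations on the x-axis can be enumerated with a few lines of arithmetic. The sketch below assumes each combination is a pair (threads per process, processes per node) whose product is 24, which is how the "all processes" and "all threads" extremes above are read here; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class ParallelismCombos {

    /** Returns {threadsPerProcess, processesPerNode} pairs whose product equals coresPerNode. */
    static List<int[]> combos(int coresPerNode) {
        List<int[]> out = new ArrayList<>();
        for (int t = 1; t <= coresPerNode; t++) {
            if (coresPerNode % t == 0) {
                out.add(new int[]{t, coresPerNode / t});
            }
        }
        return out;
    }

    /** Total parallelism is the same for every pair: threads x processes x nodes. */
    static int totalParallelism(int[] combo, int nodes) {
        return combo[0] * combo[1] * nodes;
    }

    public static void main(String[] args) {
        for (int[] c : combos(24)) {
            System.out.println(c[0] + " threads x " + c[1] + " processes per node, total "
                    + totalParallelism(c, 48));
        }
    }
}
```

Every pair gives 24 x 48 = 1152, matching the total parallelism stated above.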
• Speedup for varying data sizes
• All processes
• LHS is 1 proc per node across 48 nodes
• RHS is 24 procs per node across 128 nodes
(ideal 64x speedup)
Larger data sizes show better speedup
(400K – 45x, 200K – 40x, 100K – 38x)
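The 64x ideal and the measured speedups can be related with simple arithmetic. The sketch below derives the ideal speedup from the process counts given above and computes the ratio of measured to ideal speedup; calling that ratio "efficiency" is a convention added here, not a term from the slides, and the class and method names are hypothetical.

```java
final class SpeedupEfficiency {

    /** Ideal speedup when going from baseProcs to scaledProcs processes. */
    static double idealSpeedup(int baseProcs, int scaledProcs) {
        return (double) scaledProcs / baseProcs;
    }

    /** Fraction of the ideal speedup that was actually measured. */
    static double efficiency(double measuredSpeedup, double idealSpeedup) {
        return measuredSpeedup / idealSpeedup;
    }

    public static void main(String[] args) {
        // LHS: 1 process per node on 48 nodes; RHS: 24 processes per node on 128 nodes.
        double ideal = idealSpeedup(1 * 48, 24 * 128);
        double[] measured = {45.0, 40.0, 38.0}; // 400K, 200K, 100K points
        String[] sizes = {"400K", "200K", "100K"};
        System.out.println("ideal = " + ideal);
        for (int i = 0; i < measured.length; i++) {
            System.out.println(sizes[i] + ": " + efficiency(measured[i], ideal));
        }
    }
}
```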
• Speedup on 36 core nodes
• All processes
• LHS is 1 proc per node across 32 nodes
• RHS is 36 procs per node across 32 nodes
(ideal 36x speedup)
Speedup plateaus around 23x beyond 24-way parallelism per node
The effect of different optimizations on speedup
Java, is it worth it? – YES!
Also, with JIT compilation, some cases in MDS perform better than C

Mais conteúdo relacionado

Mais procurados

Big Data Analytics with Storm, Spark and GraphLab
Big Data Analytics with Storm, Spark and GraphLabBig Data Analytics with Storm, Spark and GraphLab
Big Data Analytics with Storm, Spark and GraphLabImpetus Technologies
 
Matching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software ArchitecturesMatching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software ArchitecturesGeoffrey Fox
 
Simple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform ConceptSimple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform ConceptSatish Mohan
 
Lessons Learned on Benchmarking Big Data Platforms
Lessons Learned on Benchmarking  Big Data PlatformsLessons Learned on Benchmarking  Big Data Platforms
Lessons Learned on Benchmarking Big Data Platformst_ivanov
 
PEARC 17: Spark On the ARC
PEARC 17: Spark On the ARCPEARC 17: Spark On the ARC
PEARC 17: Spark On the ARCHimanshu Bedi
 
Next generation analytics with yarn, spark and graph lab
Next generation analytics with yarn, spark and graph labNext generation analytics with yarn, spark and graph lab
Next generation analytics with yarn, spark and graph labImpetus Technologies
 
HPC-ABDS High Performance Computing Enhanced Apache Big Data Stack (with a ...
HPC-ABDS High Performance Computing Enhanced Apache Big Data Stack (with a ...HPC-ABDS High Performance Computing Enhanced Apache Big Data Stack (with a ...
HPC-ABDS High Performance Computing Enhanced Apache Big Data Stack (with a ...Geoffrey Fox
 
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...Databricks
 
High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...
High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...
High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...Spark Summit
 
Parikshit Ram – Senior Machine Learning Scientist, Skytree at MLconf ATL
Parikshit Ram – Senior Machine Learning Scientist, Skytree at MLconf ATLParikshit Ram – Senior Machine Learning Scientist, Skytree at MLconf ATL
Parikshit Ram – Senior Machine Learning Scientist, Skytree at MLconf ATLMLconf
 
HPC-ABDS: The Case for an Integrating Apache Big Data Stack with HPC
HPC-ABDS: The Case for an Integrating Apache Big Data Stack with HPC HPC-ABDS: The Case for an Integrating Apache Big Data Stack with HPC
HPC-ABDS: The Case for an Integrating Apache Big Data Stack with HPC Geoffrey Fox
 
RISELab:Enabling Intelligent Real-Time Decisions
RISELab:Enabling Intelligent Real-Time DecisionsRISELab:Enabling Intelligent Real-Time Decisions
RISELab:Enabling Intelligent Real-Time DecisionsJen Aman
 
WBDB 2015 Performance Evaluation of Spark SQL using BigBench
WBDB 2015 Performance Evaluation of Spark SQL using BigBenchWBDB 2015 Performance Evaluation of Spark SQL using BigBench
WBDB 2015 Performance Evaluation of Spark SQL using BigBencht_ivanov
 
ADMM-Based Scalable Machine Learning on Apache Spark with Sauptik Dhar and Mo...
ADMM-Based Scalable Machine Learning on Apache Spark with Sauptik Dhar and Mo...ADMM-Based Scalable Machine Learning on Apache Spark with Sauptik Dhar and Mo...
ADMM-Based Scalable Machine Learning on Apache Spark with Sauptik Dhar and Mo...Databricks
 
Hadoop Summit 2010 Benchmarking And Optimizing Hadoop
Hadoop Summit 2010 Benchmarking And Optimizing HadoopHadoop Summit 2010 Benchmarking And Optimizing Hadoop
Hadoop Summit 2010 Benchmarking And Optimizing HadoopYahoo Developer Network
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big DataAlbert Bifet
 
Scalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data StreamsScalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data StreamsAntonio Severien
 

Mais procurados (20)

Big Data Analytics with Storm, Spark and GraphLab
Big Data Analytics with Storm, Spark and GraphLabBig Data Analytics with Storm, Spark and GraphLab
Big Data Analytics with Storm, Spark and GraphLab
 
Matching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software ArchitecturesMatching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software Architectures
 
Simple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform ConceptSimple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform Concept
 
Lessons Learned on Benchmarking Big Data Platforms
Lessons Learned on Benchmarking  Big Data PlatformsLessons Learned on Benchmarking  Big Data Platforms
Lessons Learned on Benchmarking Big Data Platforms
 
PEARC 17: Spark On the ARC
PEARC 17: Spark On the ARCPEARC 17: Spark On the ARC
PEARC 17: Spark On the ARC
 
Next generation analytics with yarn, spark and graph lab
Next generation analytics with yarn, spark and graph labNext generation analytics with yarn, spark and graph lab
Next generation analytics with yarn, spark and graph lab
 
HPC-ABDS High Performance Computing Enhanced Apache Big Data Stack (with a ...
HPC-ABDS High Performance Computing Enhanced Apache Big Data Stack (with a ...HPC-ABDS High Performance Computing Enhanced Apache Big Data Stack (with a ...
HPC-ABDS High Performance Computing Enhanced Apache Big Data Stack (with a ...
 
Distributed Deep Learning + others for Spark Meetup
Distributed Deep Learning + others for Spark MeetupDistributed Deep Learning + others for Spark Meetup
Distributed Deep Learning + others for Spark Meetup
 
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
 
High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...
High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...
High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...
 
04 open source_tools
04 open source_tools04 open source_tools
04 open source_tools
 
Parikshit Ram – Senior Machine Learning Scientist, Skytree at MLconf ATL
Parikshit Ram – Senior Machine Learning Scientist, Skytree at MLconf ATLParikshit Ram – Senior Machine Learning Scientist, Skytree at MLconf ATL
Parikshit Ram – Senior Machine Learning Scientist, Skytree at MLconf ATL
 
HPC-ABDS: The Case for an Integrating Apache Big Data Stack with HPC
HPC-ABDS: The Case for an Integrating Apache Big Data Stack with HPC HPC-ABDS: The Case for an Integrating Apache Big Data Stack with HPC
HPC-ABDS: The Case for an Integrating Apache Big Data Stack with HPC
 
RISELab:Enabling Intelligent Real-Time Decisions
RISELab:Enabling Intelligent Real-Time DecisionsRISELab:Enabling Intelligent Real-Time Decisions
RISELab:Enabling Intelligent Real-Time Decisions
 
Scaling hadoopapplications
Scaling hadoopapplicationsScaling hadoopapplications
Scaling hadoopapplications
 
WBDB 2015 Performance Evaluation of Spark SQL using BigBench
WBDB 2015 Performance Evaluation of Spark SQL using BigBenchWBDB 2015 Performance Evaluation of Spark SQL using BigBench
WBDB 2015 Performance Evaluation of Spark SQL using BigBench
 
ADMM-Based Scalable Machine Learning on Apache Spark with Sauptik Dhar and Mo...
ADMM-Based Scalable Machine Learning on Apache Spark with Sauptik Dhar and Mo...ADMM-Based Scalable Machine Learning on Apache Spark with Sauptik Dhar and Mo...
ADMM-Based Scalable Machine Learning on Apache Spark with Sauptik Dhar and Mo...
 
Hadoop Summit 2010 Benchmarking And Optimizing Hadoop
Hadoop Summit 2010 Benchmarking And Optimizing HadoopHadoop Summit 2010 Benchmarking And Optimizing Hadoop
Hadoop Summit 2010 Benchmarking And Optimizing Hadoop
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
 
Scalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data StreamsScalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data Streams
 

Semelhante a Spidal Java: High Performance Data Analytics with Java on Large Multicore HPC Clusters

Java Thread and Process Performance for Parallel Machine Learning on Multicor...
Java Thread and Process Performance for Parallel Machine Learning on Multicor...Java Thread and Process Performance for Parallel Machine Learning on Multicor...
Java Thread and Process Performance for Parallel Machine Learning on Multicor...Saliya Ekanayake
 
Towards a Systematic Study of Big Data Performance and Benchmarking
Towards a Systematic Study of Big Data Performance and BenchmarkingTowards a Systematic Study of Big Data Performance and Benchmarking
Towards a Systematic Study of Big Data Performance and BenchmarkingSaliya Ekanayake
 
2018 03 25 system ml ai and openpower meetup
2018 03 25 system ml ai and openpower meetup2018 03 25 system ml ai and openpower meetup
2018 03 25 system ml ai and openpower meetupGanesan Narayanasamy
 
The state of Hive and Spark in the Cloud (July 2017)
The state of Hive and Spark in the Cloud (July 2017)The state of Hive and Spark in the Cloud (July 2017)
The state of Hive and Spark in the Cloud (July 2017)Nicolas Poggi
 
Spark Summit EU talk by Berni Schiefer
Spark Summit EU talk by Berni SchieferSpark Summit EU talk by Berni Schiefer
Spark Summit EU talk by Berni SchieferSpark Summit
 
Performance Characterization and Optimization of In-Memory Data Analytics on ...
Performance Characterization and Optimization of In-Memory Data Analytics on ...Performance Characterization and Optimization of In-Memory Data Analytics on ...
Performance Characterization and Optimization of In-Memory Data Analytics on ...Ahsan Javed Awan
 
(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...
(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...
(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...Reynold Xin
 
What is Distributed Computing, Why we use Apache Spark
What is Distributed Computing, Why we use Apache SparkWhat is Distributed Computing, Why we use Apache Spark
What is Distributed Computing, Why we use Apache SparkAndy Petrella
 
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...Chester Chen
 
Mauricio breteernitiz hpc-exascale-iscte
Mauricio breteernitiz hpc-exascale-iscteMauricio breteernitiz hpc-exascale-iscte
Mauricio breteernitiz hpc-exascale-isctembreternitz
 
Panda scalable hpc_bestpractices_tue100418
Panda scalable hpc_bestpractices_tue100418Panda scalable hpc_bestpractices_tue100418
Panda scalable hpc_bestpractices_tue100418inside-BigData.com
 
Designing Scalable HPC, Deep Learning and Cloud Middleware for Exascale Systems
Designing Scalable HPC, Deep Learning and Cloud Middleware for Exascale SystemsDesigning Scalable HPC, Deep Learning and Cloud Middleware for Exascale Systems
Designing Scalable HPC, Deep Learning and Cloud Middleware for Exascale Systemsinside-BigData.com
 
Large-Scale Stream Processing in the Hadoop Ecosystem - Hadoop Summit 2016
Large-Scale Stream Processing in the Hadoop Ecosystem - Hadoop Summit 2016Large-Scale Stream Processing in the Hadoop Ecosystem - Hadoop Summit 2016
Large-Scale Stream Processing in the Hadoop Ecosystem - Hadoop Summit 2016Gyula Fóra
 
Large-Scale Stream Processing in the Hadoop Ecosystem
Large-Scale Stream Processing in the Hadoop Ecosystem Large-Scale Stream Processing in the Hadoop Ecosystem
Large-Scale Stream Processing in the Hadoop Ecosystem DataWorks Summit/Hadoop Summit
 
Sempala - Interactive SPARQL Query Processing on Hadoop
Sempala - Interactive SPARQL Query Processing on HadoopSempala - Interactive SPARQL Query Processing on Hadoop
Sempala - Interactive SPARQL Query Processing on HadoopAlexander Schätzle
 
OpenPOWER Acceleration of HPCC Systems
OpenPOWER Acceleration of HPCC SystemsOpenPOWER Acceleration of HPCC Systems
OpenPOWER Acceleration of HPCC SystemsHPCC Systems
 
Boosting spark performance: An Overview of Techniques
Boosting spark performance: An Overview of TechniquesBoosting spark performance: An Overview of Techniques
Boosting spark performance: An Overview of TechniquesAhsan Javed Awan
 
The state of SQL-on-Hadoop in the Cloud
The state of SQL-on-Hadoop in the CloudThe state of SQL-on-Hadoop in the Cloud
The state of SQL-on-Hadoop in the CloudNicolas Poggi
 

Semelhante a Spidal Java: High Performance Data Analytics with Java on Large Multicore HPC Clusters (20)

Java Thread and Process Performance for Parallel Machine Learning on Multicor...
Java Thread and Process Performance for Parallel Machine Learning on Multicor...Java Thread and Process Performance for Parallel Machine Learning on Multicor...
Java Thread and Process Performance for Parallel Machine Learning on Multicor...
 
Towards a Systematic Study of Big Data Performance and Benchmarking
Towards a Systematic Study of Big Data Performance and BenchmarkingTowards a Systematic Study of Big Data Performance and Benchmarking
Towards a Systematic Study of Big Data Performance and Benchmarking
 
2018 03 25 system ml ai and openpower meetup
2018 03 25 system ml ai and openpower meetup2018 03 25 system ml ai and openpower meetup
2018 03 25 system ml ai and openpower meetup
 
The state of Hive and Spark in the Cloud (July 2017)
The state of Hive and Spark in the Cloud (July 2017)The state of Hive and Spark in the Cloud (July 2017)
The state of Hive and Spark in the Cloud (July 2017)
 
Spark Summit EU talk by Berni Schiefer
Spark Summit EU talk by Berni SchieferSpark Summit EU talk by Berni Schiefer
Spark Summit EU talk by Berni Schiefer
 
Performance Characterization and Optimization of In-Memory Data Analytics on ...
Performance Characterization and Optimization of In-Memory Data Analytics on ...Performance Characterization and Optimization of In-Memory Data Analytics on ...
Performance Characterization and Optimization of In-Memory Data Analytics on ...
 
Manycores for the Masses
Manycores for the MassesManycores for the Masses
Manycores for the Masses
 
(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...
(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...
(Berkeley CS186 guest lecture) Big Data Analytics Systems: What Goes Around C...
 
Spark
SparkSpark
Spark
 
What is Distributed Computing, Why we use Apache Spark
What is Distributed Computing, Why we use Apache SparkWhat is Distributed Computing, Why we use Apache Spark
What is Distributed Computing, Why we use Apache Spark
 
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
SF Big Analytics & SF Machine Learning Meetup: Machine Learning at the Limit ...
 
Mauricio breteernitiz hpc-exascale-iscte
Mauricio breteernitiz hpc-exascale-iscteMauricio breteernitiz hpc-exascale-iscte
Mauricio breteernitiz hpc-exascale-iscte
 
Panda scalable hpc_bestpractices_tue100418
Panda scalable hpc_bestpractices_tue100418Panda scalable hpc_bestpractices_tue100418
Panda scalable hpc_bestpractices_tue100418
 
Designing Scalable HPC, Deep Learning and Cloud Middleware for Exascale Systems
Designing Scalable HPC, Deep Learning and Cloud Middleware for Exascale SystemsDesigning Scalable HPC, Deep Learning and Cloud Middleware for Exascale Systems
Designing Scalable HPC, Deep Learning and Cloud Middleware for Exascale Systems
 
Large-Scale Stream Processing in the Hadoop Ecosystem - Hadoop Summit 2016
Large-Scale Stream Processing in the Hadoop Ecosystem - Hadoop Summit 2016Large-Scale Stream Processing in the Hadoop Ecosystem - Hadoop Summit 2016
Large-Scale Stream Processing in the Hadoop Ecosystem - Hadoop Summit 2016
 
Large-Scale Stream Processing in the Hadoop Ecosystem
Large-Scale Stream Processing in the Hadoop Ecosystem Large-Scale Stream Processing in the Hadoop Ecosystem
Large-Scale Stream Processing in the Hadoop Ecosystem
 
Sempala - Interactive SPARQL Query Processing on Hadoop
Sempala - Interactive SPARQL Query Processing on HadoopSempala - Interactive SPARQL Query Processing on Hadoop
Sempala - Interactive SPARQL Query Processing on Hadoop
 
OpenPOWER Acceleration of HPCC Systems
OpenPOWER Acceleration of HPCC SystemsOpenPOWER Acceleration of HPCC Systems
OpenPOWER Acceleration of HPCC Systems
 
Boosting spark performance: An Overview of Techniques
Boosting spark performance: An Overview of TechniquesBoosting spark performance: An Overview of Techniques
Boosting spark performance: An Overview of Techniques
 
The state of SQL-on-Hadoop in the Cloud
The state of SQL-on-Hadoop in the CloudThe state of SQL-on-Hadoop in the Cloud
The state of SQL-on-Hadoop in the Cloud
 

Mais de Geoffrey Fox

AI-Driven Science and Engineering with the Global AI and Modeling Supercomput...
AI-Driven Science and Engineering with the Global AI and Modeling Supercomput...AI-Driven Science and Engineering with the Global AI and Modeling Supercomput...
AI-Driven Science and Engineering with the Global AI and Modeling Supercomput...Geoffrey Fox
 
High Performance Computing and Big Data
High Performance Computing and Big Data High Performance Computing and Big Data
High Performance Computing and Big Data Geoffrey Fox
 
Data Science and Online Education
Data Science and Online EducationData Science and Online Education
Data Science and Online EducationGeoffrey Fox
 
Big Data HPC Convergence and a bunch of other things
Big Data HPC Convergence and a bunch of other thingsBig Data HPC Convergence and a bunch of other things
Big Data HPC Convergence and a bunch of other thingsGeoffrey Fox
 
Lessons from Data Science Program at Indiana University: Curriculum, Students...
Lessons from Data Science Program at Indiana University: Curriculum, Students...Lessons from Data Science Program at Indiana University: Curriculum, Students...
Lessons from Data Science Program at Indiana University: Curriculum, Students...Geoffrey Fox
 
Data Science Curriculum at Indiana University
Data Science Curriculum at Indiana UniversityData Science Curriculum at Indiana University
Data Science Curriculum at Indiana UniversityGeoffrey Fox
 
What is the "Big Data" version of the Linpack Benchmark? ; What is “Big Data...
What is the "Big Data" version of the Linpack Benchmark?; What is “Big Data...What is the "Big Data" version of the Linpack Benchmark?; What is “Big Data...
What is the "Big Data" version of the Linpack Benchmark? ; What is “Big Data...Geoffrey Fox
 
Experience with Online Teaching with Open Source MOOC Technology
Experience with Online Teaching with Open Source MOOC TechnologyExperience with Online Teaching with Open Source MOOC Technology
Experience with Online Teaching with Open Source MOOC TechnologyGeoffrey Fox
 
Cloud Services for Big Data Analytics
Cloud Services for Big Data AnalyticsCloud Services for Big Data Analytics
Cloud Services for Big Data AnalyticsGeoffrey Fox
 
Big Data and Clouds: Research and Education
Big Data and Clouds: Research and EducationBig Data and Clouds: Research and Education
Big Data and Clouds: Research and EducationGeoffrey Fox
 
High Performance Data Analytics and a Java Grande Run Time
High Performance Data Analytics and a Java Grande Run TimeHigh Performance Data Analytics and a Java Grande Run Time
High Performance Data Analytics and a Java Grande Run TimeGeoffrey Fox
 
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...
Multi-faceted Classification of Big Data Use Cases and Proposed Architecture ...Geoffrey Fox
 
Classification of Big Data Use Cases by different Facets
Classification of Big Data Use Cases by different FacetsClassification of Big Data Use Cases by different Facets
Classification of Big Data Use Cases by different FacetsGeoffrey Fox
 
FutureGrid Computing Testbed as a Service
 FutureGrid Computing Testbed as a Service FutureGrid Computing Testbed as a Service
FutureGrid Computing Testbed as a ServiceGeoffrey Fox
 
Big Data Applications & Analytics Motivation: Big Data and the Cloud; Centerp...
Big Data Applications & Analytics Motivation: Big Data and the Cloud; Centerp...Big Data Applications & Analytics Motivation: Big Data and the Cloud; Centerp...
Big Data Applications & Analytics Motivation: Big Data and the Cloud; Centerp...Geoffrey Fox
 
NIST Big Data Public Working Group NBD-PWG
NIST Big Data Public Working Group NBD-PWGNIST Big Data Public Working Group NBD-PWG
NIST Big Data Public Working Group NBD-PWGGeoffrey Fox
 
51 Use Cases and implications for HPC & Apache Big Data Stack
51 Use Cases and implications for HPC & Apache Big Data Stack51 Use Cases and implications for HPC & Apache Big Data Stack
51 Use Cases and implications for HPC & Apache Big Data StackGeoffrey Fox
 
Linking Programming models between Grids, Web 2.0 and Multicore
Linking Programming models between Grids, Web 2.0 and Multicore Linking Programming models between Grids, Web 2.0 and Multicore
Linking Programming models between Grids, Web 2.0 and Multicore Geoffrey Fox
 
CTS Conference Web 2.0 Tutorial Part 2
CTS Conference Web 2.0 Tutorial Part 2CTS Conference Web 2.0 Tutorial Part 2
CTS Conference Web 2.0 Tutorial Part 2Geoffrey Fox
 

Spidal Java: High Performance Data Analytics with Java on Large Multicore HPC Clusters

  • 1. SPIDAL Java: High Performance Data Analytics with Java on Large Multicore HPC Clusters
      Saliya Ekanayake | Supun Kamburugamuve | Geoffrey Fox
      sekanaya@indiana.edu
      https://github.com/DSC-SPIDAL | http://saliya.org
      24th High Performance Computing Symposium (HPC 2016), April 3-6, 2016, Pasadena, CA, USA, as part of the SCS Spring Simulation Multi-Conference (SpringSim'16)
  • 2. High Performance? (4/4/2016, HPC 2016)
      [Figure: speedup from 48 nodes to 128 nodes on an Intel Haswell HPC cluster with 40Gbps Infiniband. Labels: "40x Speedup with SPIDAL Java", "Typical Java with All MPI", "Typical Java with Threads and MPI", "64x Ideal (if life was so fair!)", "We’ll discuss today".]
  • 3. Introduction
      • Big Data and HPC
          - Big data + cloud is the norm, but not always
          - Some applications demand significant computation and communication
          - HPC clusters are ideal
          - However, it’s not easy
      • Java
          - Unprecedented big data ecosystem
          - Apache has over 300 big data systems, mostly written in Java
          - Performance and APIs have improved greatly with 1.7x
          - Not much used in HPC, but can give comparable performance (e.g. SPIDAL Java)
          - Comparable performance to C (more on this later)
          - Google query https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=java%20faster%20than%20c will point to interesting discussions on why.
          - Interoperable and productive
  • 4. SPIDAL Java
      • Scalable Parallel Interoperable Data Analytics Library (SPIDAL)
          - Available at https://github.com/DSC-SPIDAL
      • Includes Multidimensional Scaling (MDS) and Clustering Applications
          - DA-MDS
              - Y. Ruan and G. Fox, "A Robust and Scalable Solution for Interpolative Multidimensional Scaling with Weighting," eScience (eScience), 2013 IEEE 9th International Conference on, Beijing, 2013, pp. 61-69. doi: 10.1109/eScience.2013.30
          - DA-PWC
              - Fox, G. C. Deterministic annealing and robust scalable data mining for the data deluge. In Proceedings of the 2nd International Workshop on Petascale Data Analytics: Challenges and Opportunities, PDAC ’11, ACM (New York, NY, USA, 2011), 39–40.
          - DA-VS
              - Fox, G., Mani, D., and Pyne, S. Parallel deterministic annealing clustering and its application to lc-ms data analysis. In Big Data, 2013 IEEE International Conference on (Oct 2013), 665–673.
          - MDSasChisq
              - General MDS implementation using the Levenberg-Marquardt algorithm
              - Levenberg, K. A method for the solution of certain non-linear problems in least squares. Quarterly Journal of Applied Mathematics II, 2 (1944), 164–168.
  • 5. SPIDAL Java Applications
      • Gene Sequence Clustering and Visualization
          - Results at WebPlotViz
              - https://spidal-gw.dsc.soic.indiana.edu/resultsets/991946447
              - https://spidal-gw.dsc.soic.indiana.edu/resultsets/795366853
          - A few snapshots
      [Figure: pipeline Sequence File, Pairwise Alignment, DA-MDS, DA-PWC, 3D Plot. Snapshots: 100,000 fungi sequences; 3D phylogenetic tree; 3D plot of vector data.]
  • 6. SPIDAL Java Applications
      • Stocks Data Analysis
          - Time series view of stocks
          - E.g. with 1 year moving window https://spidal-gw.dsc.soic.indiana.edu/public/timeseriesview/825496517
  • 7. Performance Challenges
      • Intra-node Communication
      • Exploiting Fat Nodes
      • Overhead of Garbage Collection
      • Cost of Heap Allocated Objects
      • Cache and Memory Access
  • 8. Performance Challenges
      • Intra-node Communication [1/3]
          - Large core counts per node: 24 to 36
          - Data analytics use global collective communication: Allreduce, Allgather, Broadcast, etc.
          - HPC simulations, in contrast, typically use local communication for tasks like halo exchanges.
      [Figure: 3 million double values distributed uniformly over 48 nodes]
      • Identical message size per node, yet 24 MPI ranks per node are ~10 times slower than 1 MPI rank per node
      • Suggests #ranks per node should be 1 for the best performance
      • But how to exploit all available cores?
  • 9. Performance Challenges
      • Intra-node Communication [2/3]
          - Solution: Shared memory
          - Use threads? Didn’t work well (explained shortly)
          - Processes with shared memory communication
          - Custom implementation in SPIDAL Java outside of MPI framework
      • Only 1 rank per node participates in the MPI collective call
      • Others exchange data using shared memory maps
      [Figures: 100K DA-MDS Run Communication; 200K DA-MDS Run Communication]
  • 10. Performance Challenges
      • Intra-node Communication [3/3]
          - Heterogeneity support
              - Nodes with 24 and 36 cores
              - Automatically detects configuration and allocates memory maps
          - Implementation
              - Custom shared memory implementation using OpenHFT’s Bytes API
              - Supports collective calls necessary within SPIDAL Java
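
The slides do not show the shared-memory code itself. As a rough sketch only, the idea on slides 9 and 10 (ranks on a node exchange data through a memory map, and a single rank per node takes part in the MPI collective) can be imitated with the JDK's own `FileChannel.map`. SPIDAL Java uses OpenHFT’s Bytes API instead; the names below (`SharedMapSketch`, `writePartial`, `reduceOnLeader`) are hypothetical, the ranks are simulated by a loop inside one JVM, and the inter-process synchronization a real implementation needs is left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration; not the SPIDAL Java implementation.
class SharedMapSketch {

    /** Maps one double-sized slot per rank from a file that all ranks on the node can open. */
    static MappedByteBuffer map(Path file, int ranks) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // The mapping remains valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_WRITE, 0, (long) ranks * Double.BYTES);
        }
    }

    /** Each rank writes its partial result into its own slot of the shared map. */
    static void writePartial(MappedByteBuffer buf, int rank, double value) {
        buf.putDouble(rank * Double.BYTES, value);
    }

    /** The node leader sums all slots; only this value would enter the MPI collective. */
    static double reduceOnLeader(MappedByteBuffer buf, int ranks) {
        double sum = 0.0;
        for (int r = 0; r < ranks; r++) {
            sum += buf.getDouble(r * Double.BYTES);
        }
        return sum;
    }

    /** Simulates `ranks` processes on one node: rank r contributes the value r + 1. */
    static double demo(int ranks) {
        try {
            Path file = Files.createTempFile("node-shared", ".bin");
            file.toFile().deleteOnExit();
            MappedByteBuffer buf = map(file, ranks);
            for (int r = 0; r < ranks; r++) {
                writePartial(buf, r, r + 1.0);
            }
            return reduceOnLeader(buf, ranks);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(4)); // 1 + 2 + 3 + 4
    }
}
```

In a real run each rank would be a separate process mapping the same file, and the leader would pass the node-level sum to the inter-node Allreduce.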
  • 11. Performance Challenges
      • Exploiting Fat Nodes [1/2]
          - Large #Cores per Node
          - E.g. 1 Node in Juliet HPC cluster
              - 2 Sockets
              - 12 Cores each
              - 2 Hardware threads per core
              - L1 and L2 per core
              - L3 shared per socket
          - Two approaches
              - All processes: 1 proc per core
              - 1 Process multiple threads
          - Which is better?
      [Figure: node layout with Socket 0 and Socket 1; 1 Core: 2 HTs]
  • 12. Performance Challenges
      • Exploiting Fat Nodes [2/2]
          - Suggested thread model in literature: fork-join regions within a process
      [Figure: fork-join regions repeated over iterations within a process (labels: Iterations, process, Prev. Optimization)]
      1. Thread creation and scheduling overhead accumulates over iterations and is significant (~5%)
          • True for Java threads as well as OpenMP in C/C++ (see https://github.com/esaliya/JavaThreads and https://github.com/esaliya/CppStack/tree/master/omp2/letomppar)
      2. Long-running threads do better than this model, but still have non-negligible overhead
      3. Solution is to use processes with shared memory communications as in SPIDAL Java
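
To make point 2 on slide 12 concrete, here is a minimal sketch of the long-running-thread model: workers are created once and meet at a barrier at the end of every iteration, instead of being forked and joined per iteration. The class name `LongRunningThreadsSketch` and the counter standing in for real work are assumptions for illustration; the slides report that even this model keeps a non-negligible overhead compared with processes.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration of long-running threads synchronized per iteration.
class LongRunningThreadsSketch {

    /** Runs `iterations` barrier-separated steps on `threads` workers; returns units of work done. */
    static long run(int threads, int iterations) {
        AtomicLong workDone = new AtomicLong();
        CyclicBarrier endOfIteration = new CyclicBarrier(threads);
        Thread[] workers = new Thread[threads];

        for (int t = 0; t < threads; t++) {
            // Threads are created once, outside the iteration loop.
            workers[t] = new Thread(() -> {
                try {
                    for (int it = 0; it < iterations; it++) {
                        workDone.incrementAndGet(); // stand-in for this thread's share of the work
                        endOfIteration.await();     // all threads finish the iteration together
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return workDone.get();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 100)); // 4 threads x 100 iterations
    }
}
```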
  • 13. Performance Challenges
      • Garbage Collection
          - “Stop the world” events are expensive
              - Especially for parallel processes with collective communications
          - Typical OOP: allocate, use, forget
              - Original SPIDAL code produced frequent garbage of small arrays
          - Solution: Zero-GC using
              - Static allocation and reuse
              - Off-heap static buffers (more on next slide)
          - Advantage
              - No GC (obvious)
              - Scale to larger problem sizes
              - E.g. original SPIDAL code required 5GB heap per process (x 24 = 120GB per node) to handle 200K MDS. Optimized code uses < 1GB heap and finishes within the same time.
              - Note: physical memory is 128GB, so the optimized SPIDAL can now do 1 million point MDS within hardware limits.
      [Figure, before optimizing: heap size per process reaches -Xmx (2.5GB) early in the computation; frequent GC]
      [Figure, after optimizing: heap size per process (~1.1GB) stays well below -Xmx (2.5GB); virtually no GC activity]
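
A minimal sketch of the "static allocation and reuse" and "off-heap static buffers" ideas from slide 13, assuming a fixed problem size known up front. The class name `ZeroGcSketch`, the size `N`, and the toy computation are hypothetical; the point is only that the per-iteration path allocates nothing, so it produces no garbage.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.DoubleBuffer;

// Hypothetical illustration; not taken from the SPIDAL Java sources.
class ZeroGcSketch {
    private static final int N = 1024;

    // Allocated once and reused by every iteration instead of "allocate, use, forget".
    private static final double[] SCRATCH = new double[N];

    // Direct buffer: its contents live outside the Java heap.
    private static final DoubleBuffer OFF_HEAP = ByteBuffer
            .allocateDirect(N * Double.BYTES)
            .order(ByteOrder.nativeOrder())
            .asDoubleBuffer();

    /** One iteration of a toy computation that performs no allocation. Not thread-safe. */
    static double iterate(int iteration) {
        for (int i = 0; i < N; i++) {
            SCRATCH[i] = iteration * 0.5 + i;
        }
        OFF_HEAP.clear();      // reset position and limit; does not erase or reallocate
        OFF_HEAP.put(SCRATCH); // bulk copy into the off-heap buffer
        double sum = 0.0;
        for (int i = 0; i < N; i++) {
            sum += OFF_HEAP.get(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        for (int it = 0; it < 3; it++) {
            System.out.println(iterate(it));
        }
    }
}
```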
  • 14. Performance Challenges
      • I/O with Heap Allocated Objects
          - Java-to-native I/O creates copies of objects in heap
              - Otherwise can’t guarantee object’s memory location due to GC
              - Too expensive
          - Solution: Off-heap buffers (memory maps)
              - Initial data loading: significantly faster than Java stream API calls
              - Intra-node messaging: gives the best performance
              - MPI inter-node communications
      • Cache and Memory Access
          - Nested data structures are neat, but expensive
          - Solution: Contiguous memory with 1D arrays over 2D structures
              - Indirect memory references are costly
          - Also adopted from HPC: blocked loops and loop ordering
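
The cache points on slide 14 can be illustrated with a small matrix multiply, chosen here only as a familiar example (the slides do not name the kernels that were restructured). The sketch stores each n x n matrix in one contiguous row-major 1D array rather than a `double[][]`, and tiles the loops so that a block of each matrix is reused while it is still in cache. `BlockedMatMul` and the `block` parameter are hypothetical names.

```java
// Hypothetical illustration of 1D contiguous storage plus blocked loops.
class BlockedMatMul {

    /** Returns C = A * B for n x n row-major matrices held in flat arrays. */
    static double[] multiply(double[] a, double[] b, int n, int block) {
        double[] c = new double[n * n];
        for (int ii = 0; ii < n; ii += block) {
            int iMax = Math.min(ii + block, n);
            for (int kk = 0; kk < n; kk += block) {
                int kMax = Math.min(kk + block, n);
                for (int jj = 0; jj < n; jj += block) {
                    int jMax = Math.min(jj + block, n);
                    // i-k-j ordering walks rows of B and C with unit stride.
                    for (int i = ii; i < iMax; i++) {
                        for (int k = kk; k < kMax; k++) {
                            double aik = a[i * n + k];
                            for (int j = jj; j < jMax; j++) {
                                c[i * n + j] += aik * b[k * n + j];
                            }
                        }
                    }
                }
            }
        }
        return c;
    }

    public static void main(String[] args) {
        double[] a = {1, 2, 3, 4};
        double[] b = {5, 6, 7, 8};
        System.out.println(java.util.Arrays.toString(multiply(a, b, 2, 1)));
    }
}
```

A suitable `block` depends on the cache sizes of the target machine and is best found by measurement.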
  • 15. Evaluation
      • HPC Cluster
          - 128 Intel Haswell nodes with 2.3GHz nominal frequency
              - 96 nodes with 24 cores on 2 sockets (12 cores each)
              - 32 nodes with 36 cores on 2 sockets (18 cores each)
          - 128GB memory per node
          - 40Gbps Infiniband
      • Software
          - Java 1.8
          - OpenHFT JavaLang 6.7.2
          - Habanero Java 0.1.4
          - OpenMPI 1.10.1
      • Application: DA-MDS
          - Computations grow as O(N^2)
          - Communication is global and is O(N)
  • 16. [Figures: 100K DA-MDS; 200K DA-MDS; 400K DA-MDS]
      • 1152 total parallelism across 48 nodes
      • All combinations of 24-way parallelism per node
      • LHS is all processes
      • RHS is all internal threads and MPI across nodes
      1. With SM communications in SPIDAL, processes outperform threads (blue line)
      2. Other optimizations further improve performance (green line)
  • 17. [Figure: speedup for varying data sizes]
      • All processes
      • LHS is 1 proc per node across 48 nodes
      • RHS is 24 procs per node across 128 nodes (ideal 64x speedup)
      • Larger data sizes show better speedup (400K: 45x, 200K: 40x, 100K: 38x)
      [Figure: speedup on 36 core nodes]
      • All processes
      • LHS is 1 proc per node across 32 nodes
      • RHS is 36 procs per node across 32 nodes (ideal 36x speedup)
      • Speedup plateaus around 23x after 24-way parallelism per node
  • 18. [Figure: the effect of different optimizations on speedup]
      • Java, is it worth it? YES!
      • Also, with JIT some cases in MDS are better than C
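
The ideal speedups quoted on slide 17 follow from the ratio of total process counts, and dividing a measured speedup by the ideal gives a parallel efficiency. The helper below is a small sketch of that arithmetic; `SpeedupSketch` and its method names are made up for illustration, and the efficiency figures are derived here rather than stated in the slides.

```java
// Hypothetical helper for the speedup arithmetic on slide 17.
class SpeedupSketch {

    /** Ideal speedup when going from the baseline process count to the scaled one. */
    static double idealSpeedup(int baseNodes, int baseProcsPerNode, int nodes, int procsPerNode) {
        return (double) (nodes * procsPerNode) / (baseNodes * baseProcsPerNode);
    }

    /** Fraction of the ideal speedup that was actually measured. */
    static double efficiency(double measuredSpeedup, double idealSpeedup) {
        return measuredSpeedup / idealSpeedup;
    }

    public static void main(String[] args) {
        // 48 nodes x 1 proc -> 128 nodes x 24 procs: 3072 / 48
        double ideal = idealSpeedup(48, 1, 128, 24);
        System.out.println(ideal);                    // 64.0
        System.out.println(efficiency(40.0, ideal));  // 0.625 for the 200K run
        // 32 nodes x 1 proc -> 32 nodes x 36 procs
        System.out.println(idealSpeedup(32, 1, 32, 36)); // 36.0
    }
}
```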