Massive-Scale Analytics Applied to Real-World Problems

David A. Bader
• Full Professor and Chair, Computational Science and Engineering
• IEEE Fellow, AAAS Fellow
• High-performance computing and real-world applications: massive-scale data
analytics.
• Over $183M of research awards
• Steering Committees of major HPC conferences: IPDPS and HiPC
• EIC of IEEE Transactions on Parallel and Distributed Systems
• Elected chair of IEEE and SIAM committees on HPC
• 230+ publications, ≥ 7,500 citations, h-index ≥ 52
• National Science Foundation CAREER Award recipient
• Directed: NVIDIA GPU Center of Excellence
• Directed: Sony-Toshiba-IBM Center for the Cell/B.E. Processor
• Founder: Graph500 List benchmarking “Big Data” platforms
• Recognized as a “RockStar” of High Performance Computing by InsideHPC in 2012
and as HPCwire’s People to Watch in 2012 and 2014.
19 July 2018 David A. Bader 2

Innovate.
Collaborate.
Problem Solved.
• Computational Science and
Engineering (CSE) is a diverse,
interdisciplinary innovation
ecosystem composed of award-
winning faculty, researchers and
students that
• Solves real-world problems and
creates future leaders
• Enables breakthroughs in
scientific discovery and
engineering practice
• Uses the most advanced
resources, techniques and ideas
• Is highly collaborative with an
impressive roster of GT and
industry partners

Profile of CSE: History
• Founded in 2005, officially recognized as a
school in 2010.
• Focus on high performance computing, big
data, analytics & visualization, machine
learning, cybersecurity.
• $6.6 million in research expenditures;
approximately $39 million in active awards
(FY 2017)
• NSF South Big Data Hub partnership: $1.25
million over 3 years to support new analysis
projects in line with CSE mission.

Core Research Areas
• High Performance Computing: Devise computing solutions at the absolute limits of scale and speed using efficient, reliable and fast algorithms, software, tools and applications
• Machine Learning: Construct and study algorithms that build models, and make efficient data-driven predictions or decisions
• Analytics & Visualization: Present data in ways that best yield insight and support decisions as problems scale and complexity increase
• Cybersecurity: Design fast theoretic algorithms on large-scale graphs, and detect malicious activity
• Big Data: Develop new methods to analyze large and complex data sets, transforming data into value and solve grand challenges

Profile of CSE: People
• By the numbers (Fall 2017):
• 47 faculty and staff
• (13 tenure-track faculty)
• 115 PhD students
• 148 masters students
• Award-winning research teams: ACM Gordon
Bell Prize awarded to team composed mainly of
CSE faculty and students.
• Other honors include: 1 Regents’ professor, 6 NSF
CAREER awards , 4 IEEE fellows, 2 AAAS fellows,
and 1 SIAM fellow

Strategic Partnership Program

• Georgia Tech is building Coda, a multi-story, 750,000-square-foot HPC building in the
heart of Atlanta (Midtown) with a targeted opening in January 2019
• Devoted to data science and high-performance computing for centralized collaboration
among industry, academia and government
• Location of CSE, IDEAS and the HPC Center, and the South BD Hub
• Georgia Tech is the anchor tenant, taking approximately one-half of the new
development. Remaining space will be for corporate entities and partners.
• The Institute plans to locate academic and leading-edge research programs in
computing and advanced big data analytics there.
A New Home for the Future of CSE

Exascale Streaming Data Analytics: Real-world challenges
All involve analyzing massive
streaming complex networks:
• Health care → disease spread, detection
and prevention of epidemics/pandemics
(e.g. SARS, Avian flu, H1N1 “swine” flu)
• Massive social networks → understanding
communities, intentions, population
dynamics, pandemic spread, transportation
and evacuation
• Intelligence → business analytics, anomaly
detection, security, knowledge discovery
from massive data sets
• Systems Biology → understanding complex
life systems, drug design, microbial research,
unraveling the mysteries of HIV,
understanding life and disease
• Electric Power Grid → communication,
transportation, energy, water, food supply
• Modeling and Simulation → performing
full-scale economic-social-political simulations
[Figure: Facebook Active Users, in millions of users (y-axis 0 to 450), quarterly from Dec-04 through Dec-09]
Exponential growth:
Billions of active users
Sample queries:
Allegiance switching:
identify entities that switch
communities.
Community structure:
identify the genesis and
dissipation of communities
Phase change: identify
significant change in the
network structure
REQUIRES PREDICTING / INFLUENCING CHANGE IN REAL TIME AT SCALE
Ex: discovered minimal
changes in O(billions)-size
complex network that could
hide or reveal top influencers
in the community

Graphs are pervasive in large-scale data analysis
• Sources of massive data: peta- and exa-scale simulations, experimental
devices, the Internet, scientific applications.
• New challenges for analysis: data sizes, heterogeneity, uncertainty, data
quality.
Astrophysics
Problem: Outlier detection.
Challenges: massive datasets,
temporal variations.
Graph problems: clustering,
matching.
Bioinformatics
Problem: Identifying drug target
proteins.
Challenges: Data heterogeneity,
quality.
Graph problems: centrality,
clustering.
Social Informatics
Problem: Discover emergent
communities, model spread of
information.
Challenges: new analytics routines,
uncertainty in data.
Graph problems: clustering,
shortest paths, flows.
Image sources: (1) http://physics.nmt.edu/images/astro/hst_starfield.jpg
(2,3) www.visualComplexity.com

Network Analysis for Intelligence and Surveillance
• [Krebs ’04] Post 9/11 Terrorist Network
Analysis from public domain
information
• Plot masterminds correctly identified
from interaction patterns: centrality
• A global view of entities is often more
insightful
• Detect anomalous activities by
exact/approximate graph matching
Image Source: http://www.orgnet.com/hijackers.html
Image Source: T. Coffman, S. Greenblatt, S. Marcus, Graph-based technologies
for intelligence analysis, CACM, 47 (3, March 2004): pp 45-47

Characterizing Graph-theoretic computations
Input: Graph abstraction
Factors that influence choice of algorithm
• graph sparsity (m/n ratio)
• static/dynamic nature
• weighted/unweighted, weight distribution
• vertex degree distribution
• directed/undirected
• simple/multi/hyper graph
• problem size
• granularity of computation at nodes/edges
• domain-specific characteristics
Problem: Find ***
• paths
• clusters
• partitions
• matchings
• patterns
• orderings
Graph algorithms
• traversal
• shortest path algorithms
• flow algorithms
• spanning tree algorithms
• topological sort
…..
Graph problems are often recast as sparse
linear algebra (e.g., partitioning) or linear
programming (e.g., matching) computations

Massive Streaming Graph Analytics
(A, B, t1, poke)
(A, C, t2, msg)
(A, D, t3, view wall)
(A, D, t4, post)
(B, A, t2, poke)
(B, A, t3, view wall)
(B, A, t4, msg)
Analysts

Mining Twitter for Social Good
ICPP 2010
Image credit: bioethicsinstitute.org

• CDC / Nation-scale surveillance of
public health
• Cancer genomics and drug design
• computed Betweenness Centrality of
Human Proteome
[Figure: Human Genome core protein interactions, Degree vs. Betweenness Centrality. x-axis: Degree (1 to 100, log scale); y-axis: Betweenness Centrality (1e-7 to 1e+0, log scale)]
Massive Data Analytics: Protecting our Nation
US High Voltage Transmission
Grid (>150,000 miles of line)
Public Health
ENSG00000145332.2: Kelch-like protein implicated in breast cancer

STING Initiative:
Focusing on Globally Significant Grand Challenges
• Many globally-significant grand challenges can be modeled by Spatio-
Temporal Interaction Networks and Graphs (or “STING”).
• Emerging real-world graph problems include
• detecting community structure in large social networks,
• defending the nation against cyber-based attacks,
• discovering insider threats (e.g. Ft. Hood shooter, WikiLeaks),
• improving the resilience of the electric power grid, and
• detecting and preventing disease in human populations.
• Unlike traditional applications in computational science and
engineering, solving these problems at scale often raises new research
challenges because of sparsity and the lack of locality in the massive
data, design of parallel algorithms for massive, streaming data analytics,
and the need for new exascale supercomputers that are energy-
efficient, resilient, and easy-to-program.

Hierarchical Identify Verify Exploit (HIVE)
• SHARP: Software Toolkit for Accelerating
GrapH AlgoRithms on HIVE Processors
• Georgia Tech with University of Southern
California
• Launched Spring 2017
• Performers include Intel, Qualcomm,
PNNL, Northrop Grumman
• Challenges
• Programmers are required to exploit low-level
hardware and operating systems primitives.
• Limited portability of frameworks for new
architectures and accelerators.
[Figure: TA2 SHARP block diagram. Labeled blocks: Preprocessing; Dataflow Modelling; Optimization; Primitive Unrolling (PU) Database; Dataflow Model; ILP Optimization; Task to Hardware Mapping and Optimal Data Layout for Input Code; Runtime Scheduler and Resource Manager; STINGER Graph Algorithms, Dynamic Primitives; Hierarchical Primitive Decomposition; Abstract Model of HIVE Hardware; Suitable Primitives; Initial Mapping; Primitive Set Evaluation; Data Layout; Graph Primitives Data Store; Communication Cost Model; Access Cost Model; Decomposition Analysis. Labeled inputs: Partially Annotated Code via SHARP APIs; Architecture Info & Hardware Primitives from TA1; Graph Primitives from TA3. Labeled outputs: API Definitions to TA3; Primitive Feedback and Dataset Analysis to TA1.]
• Utilizing commodity hardware designed for different application domains
• Goals
• Design a unique library for a first-of-its-kind graph processor. Fully utilize new hardware features.
• Portable and scalable framework for massive graphs.
• Configurable data layout that will be decided by: input, algorithm, and targeted hardware.

Power Efficiency Revolution for Embedded
Computing Technologies (PERFECT)
• Challenges
• Performance Per Watt is now a metric of concern.
• Data movement (caches, networks, storage devices) is becoming a
dominant factor in both execution time and in power consumption.
• Power consumption is limiting application and architecture scalability.
• Goals
• One approach to reducing power consumption is to reduce execution time.
• Find additional ways to utilize shared memory systems. Better shared
memory implementations can help reduce network size and limit network
data movement.
• Help users select: graph data layout, programming model (vertex centric vs.
edge centric), ideal accelerators for an application, load-balancing
techniques and much more.

Evaluating Memory-Centric
Architectures for HPDA
• Jason Riedy (PI), David A. Bader, Tom Conte
• High-performance data analysis does not fit current CPU centric
architectures well.
• Need new approaches to achieve high performance.
• Emu Chick: Move threads to data!
• Application areas: Streaming graphs and sparse tensors.
• Needs new programming paradigms, new algorithm optimizations, new ideas!
• Goals:
• Evaluate the Emu migratory thread system, and
• develop new methods optimized for memory-centric architectures.
IARPA

STINGER – Time Frame
• Streaming graph need arises (over a decade ago)
• May 2009: STINGER is officially proposed.
• Apr 2010: First prototype, clustering coefficients.
• Apr 2011: Structure tracking of streaming social networks.
• Sep 2012: High Performance Data Structure for Streaming Graphs. HPEC BEST PAPER AWARD
• Sep 2012: Dynamic betweenness centrality algorithm
• Dec 2013: Streaming connected components
• Feb 2014: Performance evaluation of open-source graph databases
• Sep 2015: Community detection in dynamic networks
• May 2016: PageRank for Streaming Graphs.

Streaming graph example
• Dynamic/Streaming:
• At time 𝑡:
• 𝑣 and 𝑤 become friends
• Insert(𝑣, 𝑤)
• At time 𝑡:
• 𝑢 upsets 𝑣. 𝑢 and 𝑣 are no
longer friends
• Delete(𝑢, 𝑣)
• small subgraph… (figure: vertices 𝑢, 𝑣, 𝑤)
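The insert and delete events above can be sketched on a toy adjacency-set graph. This is only an illustration: the class name DynamicGraph and its methods are hypothetical and are not the STINGER API.

```python
from collections import defaultdict


class DynamicGraph:
    """Undirected graph stored as adjacency sets (illustrative, not STINGER)."""

    def __init__(self):
        self.adj = defaultdict(set)

    def insert(self, v, w):
        # An undirected edge is recorded at both endpoints.
        self.adj[v].add(w)
        self.adj[w].add(v)

    def delete(self, u, v):
        # discard() is a no-op when the edge is already absent.
        self.adj[u].discard(v)
        self.adj[v].discard(u)

    def neighbors(self, v):
        return sorted(self.adj[v])


g = DynamicGraph()
g.insert("u", "v")  # u and v start out as friends
g.insert("v", "w")  # v and w become friends
g.delete("u", "v")  # u upsets v; they are no longer friends
print(g.neighbors("v"))
```

After the three events only the edge between v and w remains, so an analytic that tracks v sees a change in its neighborhood without the graph being rebuilt.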

Streaming Analytics move us
from reporting the news to predictive analytics
Traditional HPC
• Great for “static” data sets.
• Massive scalability at the
cost of programmability.
• Great for dense problems.
• Sparse problems typically
underutilize the system.
Streaming Analytics
• Requires specialized analytics
and data structures.
• Rapidly changing data.
• Low data re-usage.
• Focused on memory operations
and not FLOPS.

STING Extensible Representation (STINGER)
• Design goals:
• Enable algorithm designers to implement dynamic graph
algorithms with ease.
• Portable semantics for various platforms
• Good performance for all types of graph problems and
algorithms - static and dynamic.
• Assumes globally addressable memory access
• Support multiple, parallel readers and a single writer
• One server manages the graph data structures
• Multiple analytics run in background with read-only
permissions.

STING Extensible Representation
• Semi-dense edge list
blocks with free space
• Compactly stores
timestamps, types,
weights
• Maps from
application IDs to
storage IDs
• Deletion by negating
IDs, separate
compaction
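A minimal sketch of the ideas listed above (edge blocks with free space, per-edge weight and timestamp, deletion by negating IDs, and a separate compaction pass). The class EdgeBlockList, the constant BLOCK_SIZE, and the use of bitwise complement as the "negation" (so that vertex 0 can also be marked deleted) are assumptions made for illustration, not STINGER's actual layout.

```python
BLOCK_SIZE = 4  # hypothetical block capacity, chosen small for the example


class EdgeBlockList:
    """Adjacency of one vertex as a list of fixed-capacity edge blocks."""

    def __init__(self):
        # Each slot is [neighbor_id, weight, timestamp].
        self.blocks = []

    def insert(self, nbr, weight, ts):
        free = None
        for block in self.blocks:
            for slot in block:
                if slot[0] == nbr:  # edge exists: update in place
                    slot[1], slot[2] = weight, ts
                    return
                if slot[0] < 0 and free is None:
                    free = slot  # remember the first deleted slot
        if free is not None:
            free[:] = [nbr, weight, ts]  # reuse the deleted slot
        elif self.blocks and len(self.blocks[-1]) < BLOCK_SIZE:
            self.blocks[-1].append([nbr, weight, ts])
        else:
            self.blocks.append([[nbr, weight, ts]])

    def delete(self, nbr):
        for block in self.blocks:
            for slot in block:
                if slot[0] == nbr:
                    slot[0] = ~nbr  # -(nbr + 1): always negative
                    return True
        return False

    def neighbors(self):
        return [s[0] for b in self.blocks for s in b if s[0] >= 0]

    def compact(self):
        # Separate pass that drops deleted slots and repacks the blocks.
        live = [s for b in self.blocks for s in b if s[0] >= 0]
        self.blocks = [live[i:i + BLOCK_SIZE]
                       for i in range(0, len(live), BLOCK_SIZE)]
```

Marking a slot instead of shifting entries keeps deletion cheap, and the freed slot is reused by a later insertion; compaction can then run separately from the update path.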

STINGER Graph & Analytic Update Process
Accumulate recent graph updates in main memory and
create a batch.
Pre-process, Sort,
Reconcile
“Age off” old vertices
Modify STINGER graph
Update metrics (execute streaming analytics)
STINGER
graph
Insertions /
Deletions
Affected vertices
Change detection
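The batch steps above (accumulate, sort, reconcile, modify the graph, report affected vertices) can be sketched as follows. The record layout (timestamp, op, u, v) and the function names reconcile and apply_batch are assumptions for illustration; "age off" and the analytics themselves are left out.

```python
def reconcile(batch):
    """Keep only the latest action per undirected edge.

    batch: iterable of (timestamp, op, u, v) with op in {"ins", "del"}.
    """
    latest = {}
    for ts, op, u, v in sorted(batch, key=lambda r: r[0]):
        latest[(min(u, v), max(u, v))] = op  # later records overwrite earlier
    return latest


def apply_batch(adj, batch):
    """Apply a reconciled batch to adj (dict of sets); return affected vertices."""
    affected = set()
    for (u, v), op in reconcile(batch).items():
        if op == "ins":
            changed = v not in adj.setdefault(u, set())
            adj[u].add(v)
            adj.setdefault(v, set()).add(u)
        else:
            changed = v in adj.get(u, set())
            adj.get(u, set()).discard(v)
            adj.get(v, set()).discard(u)
        if changed:
            affected.update((u, v))  # only real changes reach the analytics
    return affected


graph = {}
updates = [(2, "del", 1, 2), (1, "ins", 1, 2), (3, "ins", 2, 3), (4, "ins", 3, 2)]
print(apply_batch(graph, updates))
```

In this example the insertion and later deletion of edge (1, 2) cancel within the batch, so only vertices 2 and 3 are handed to the streaming analytics.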

STING: High-level architecture
◮ Server: Graph storage, kernel orchestration
◮ OpenMP + sufficiently POSIX-ish
◮ Multiple processes for resilience

STINGER: as an analysis package
• Streaming edge insertions and deletions:
Performs new edge insertions, updates, and deletions in batches or individually.
Optimized to update at rates of over 3 million edges per second on graphs of one billion edges.
• Streaming clustering coefficients:
Tracks the local and global clustering coefficients of a graph.
• Streaming connected components:
Real time tracking of the connected components.
• Streaming Betweenness Centrality:
Find the key points within information flows and structural vulnerabilities.
• Streaming community detection:
Track and update the community structures within the graph as they change.
• Anything that a static graph package can do (and a whole lot more):
• Parallel agglomerative clustering:
Find clusters that are optimized for a user-defined edge scoring function.
• K-core Extraction:
Extract additional communities and filter noisy high-degree vertices.
• Classic breadth-first search:
Performs a parallel breadth-first search of the graph starting at a given source vertex to find shortest paths.
• Parallel connected components:
Finds the connected components in a static network.
http://www.stingergraph.com/

Streaming Updates
Update process
• Group updates into batches
• Updates can include insertions
and deletions
• Big batches ⇒ Better
performance
[HPEC; 2012]
Throughput rate
Experiment setup
• 4x10 Intel E7-8870 processors
• RMAT Graph
• Vertices: 16M
• Edges: 128M
• Various batch sizes
• ~93% of updates are insertions
• ~7% of updates are deletions
Takeaway
• STINGER supports extremely fast updates.
• Updates are not the bottleneck for
analytics.
• Analytic computations are the bottleneck!
• Highly scalable

Streaming Clustering
Coefficients & Triangle Counting
Background
• Scores how tightly bound players are in
their local community.
• Looks for common relationships for two
adjacent vertices.
• Hence the term triangle counting
• Complexity for static graph algorithm
(intersection based): $O(v \cdot d_{max}^{2})$
[MTAAP; 2010]
Multiple streaming
implementations
• Brute-force – straightforward and exact
• Bloom-filter – approximate yet extremely fast
• Sorted-list – uses intersections. Fast and exact.
Larger batches give faster speedups.
Experiment setup
• Executed on two systems
• Cray XMT – 64 nodes
• 2x4 Intel E5530 system
• 8 cores, 16 threads
• Used RMAT synthetic graphs
• 2M vertices, 16 edges
• Hundreds of thousands updates per second
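A small sketch of the intersection idea behind triangle counting, assuming a simple undirected graph held as a dict of neighbor sets. The function names are hypothetical; this is not one of the three implementations listed above, only an illustration of why an edge insertion needs a single neighborhood intersection rather than a full recount.

```python
def count_triangles(adj):
    """Static count: intersect neighbor sets across every edge."""
    total = 0
    for u in adj:
        for v in adj[u]:
            if u < v:  # visit each undirected edge once
                total += len(adj[u] & adj[v])
    return total // 3  # each triangle is seen once per each of its 3 edges


def insert_edge(adj, u, v, triangles):
    """Insert (u, v) and return the updated global triangle count."""
    if v in adj.setdefault(u, set()):
        return triangles  # edge already present, nothing changes
    # Every common neighbor of u and v closes exactly one new triangle.
    new = len(adj[u] & adj.setdefault(v, set()))
    adj[u].add(v)
    adj[v].add(u)
    return triangles + new


adj = {}
t = 0
for a, b in [(1, 2), (2, 3), (1, 3), (3, 4), (2, 4)]:
    t = insert_edge(adj, a, b, t)
print(t, count_triangles(adj))
```

The streaming count and the static recount agree, but the streaming path touches only the two endpoints' neighborhoods per update.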

Streaming Connected Components
Background
• Tracks connected components in
high velocity networks.
• Connected components imply that
players are connected to each
other through some sequence of
relationships
[HiPC; 2013]
Our algorithm
• Takes into account small-world
property
• Diameter is small.
• Most players have numerous
relationships within the connected
component.
• Edge insertions are always easy.
• Very few edge deletions are complex.
Takeaway
• Up to 1.26 million updates per second on
4 × 16 AMD (Opteron 6282)
• Hundreds of times faster than static
computation.
• Great for social networks.
• STINGER requires only 10% of execution
time. Rest of time - analytic update.
• Scalability similar to BFS.
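To illustrate why edge insertions are the easy case, here is a sketch that maintains components under insertions with a union-find structure. Union-find is a swapped-in textbook technique, not the algorithm from the cited work, and it does not handle deletions, which are the hard case noted above.

```python
class UnionFind:
    """Tracks connected components under edge insertions only."""

    def __init__(self):
        self.parent = {}
        self.count = 0  # current number of components

    def find(self, x):
        if x not in self.parent:
            self.parent[x] = x  # a new vertex starts as its own component
            self.count += 1
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def insert_edge(self, u, v):
        ru, rv = self.find(u), self.find(v)
        if ru != rv:
            self.parent[ru] = rv  # the edge merges two components
            self.count -= 1


uf = UnionFind()
for a, b in [(1, 2), (3, 4), (2, 3)]:
    uf.insert_edge(a, b)
print(uf.count)
```

An insertion either joins two components or changes nothing, so it never requires a traversal of the graph; a deletion may split a component, which is why it needs the extra machinery described on this slide.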

Dynamic Betweenness Centrality
Background
• Used for finding key players in network
based on the number of relationships
that go through them.
• Fastest known algorithm by Brandes
(2002) is still computationally
expensive for large networks:
$O(V \cdot (V + E))$.
[Social Computing; 2012]
Our dynamic graph algorithm
• Supports optimizations:
• Approximation: reduces accuracy,
significantly faster.
• Parallelization: utilizes many-core
systems
• Supports: vertex insertions & deletions
and edge insertions & deletions.
Experimental setup and takeaways
• 4x10 Intel E7-8870 processors
• Thousands of times faster than static
recomputations.
• Significantly reduces the amount of
necessary computations.
• Only a small percentage of the graph is
affected by an update
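For reference, a compact sketch of the static computation that the dynamic algorithm avoids repeating: Brandes' method for an unweighted, undirected graph given as a dict of neighbor sets. This is the textbook static algorithm under those assumptions, not the dynamic or approximate variants described above.

```python
from collections import deque


def betweenness(adj):
    """Brandes' algorithm: one BFS plus one back-propagation per source."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        sigma = {v: 0 for v in adj}  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v precedes w on a shortest path
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):  # farthest vertices first
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # In an undirected graph every pair is counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}


path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
print(betweenness(path))
```

Each source costs one traversal of the whole graph, which is the expense that motivates updating only the affected portion after an insertion or deletion.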

Streaming Community
Detection and Monitoring
Background
• Communities are typically defined by
groups of vertices with more intra-
relationships than inter-relationships.
• More formally: $Q(C) = \frac{Intra(C)}{|E|} - \frac{Inter(C)^{2}}{|E|^{2}}$
• In addition to the graph, an additional
community network is maintained.
• Significantly smaller than full network!
• Updates applied to community network.
[MTAAP; 2013]
An agglomerative approach
• Certain types of updates do not
change the community structure.
• We only need to process updates that
“might” cause change.
• Few updates require a lot of processing
time.
Experimental setup and takeaways
• 4x8 Intel E7-4820 system
• 32 cores, 64 threads
• Easily supports millions of updates per second.
• Bigger batches offer improved performance.
• Dynamic algorithm is thousands of times faster than static
graph algorithm.
• Real-time tracking of communities within a network.

Streaming Seed-Set Expansion
Background
• Seeds are vertices of interest.
• Seed Set Expansion is the process of
detecting a community around a seed.
• Streaming SSE – allows tracking vertices of
interest over time
• Important events such as community split
and merging can be reported
[ASONAM;2015]
Algorithm details
• Greedy algorithm.
• Vertices are inserted one at a time into the
community.
• An edge update checks for possible changes in the
community.
• Pruning can be applied when an update causes a
big change in the community.
• Pruning makes things slower
• Pruning offers more accurate results in comparison
with static graph algorithm.
Takeaways
• Highly accurate in comparison to
static graph algorithm.
• Precision and recall typically
above 90%.
• Larger batches require more
work ⇒ smaller speedups
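A static sketch of greedy seed set expansion, where vertices join the community one at a time while a fitness score improves. The fitness used here (share of edge endpoints that stay inside the community) and the function names are assumptions chosen for illustration; the cited work may use a different score, and the streaming update and pruning steps are not shown.

```python
def fitness(adj, community):
    """Fraction of the community's edge endpoints that stay inside it."""
    internal = boundary = 0
    for u in community:
        for v in adj[u]:
            if v in community:
                internal += 1  # internal edges are seen from both endpoints
            else:
                boundary += 1
    total = internal + boundary
    return internal / total if total else 0.0


def expand_seed(adj, seed):
    """Greedily add the neighbor that most improves fitness, one at a time."""
    community = {seed}
    score = fitness(adj, community)
    while True:
        frontier = {v for u in community for v in adj[u]} - community
        best, best_score = None, score
        for v in sorted(frontier):  # sorted for a deterministic tie-break
            s = fitness(adj, community | {v})
            if s > best_score:
                best, best_score = v, s
        if best is None:
            return community  # no neighbor improves the score
        community.add(best)
        score = best_score


# Two triangles joined by the single edge (2, 3).
g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(expand_seed(g, 0))
```

Starting from vertex 0, the expansion stops at the first triangle because pulling in vertex 3 would lower the score.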

Incremental Page-Rank
Background
• PageRank measures the importance of
a vertex by the number and weight of
links pointing to it.
• Works like a propagation algorithm.
• Algorithm continues until no changes
are detected.
[GABB;2016]
Algorithm
• Uses STINGER to perform linear algebra
operations.
• Supports both insertions and deletions.
• Incremental implies that only a small subset
of the graph is traversed.
• Does only the necessary amount of work
Takeaway
• Large batches: reduce latency by > 2×
over restarting on average.
• Small batches: potentially hundreds of times
faster than restart.
• Improved power performance (modeled).
• Can deal with several thousand updates per
second.
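A sketch of PageRank as a propagation that runs until no meaningful change is detected, with an optional warm start from a previous result. Warm-starting is a simple stand-in used here to show why restarting from scratch is wasteful after a small update; it is not the incremental algorithm from the cited work, and the function name, damping value, and tolerance are assumptions.

```python
def pagerank(out_links, damping=0.85, tol=1e-10, start=None, max_iter=1000):
    """Power iteration; returns (ranks, iterations used).

    out_links: dict mapping every vertex to a list of its out-neighbors.
    start: optional previous rank dict covering the same vertices.
    """
    nodes = sorted(out_links)
    n = len(nodes)
    rank = dict(start) if start else {v: 1.0 / n for v in nodes}
    for it in range(1, max_iter + 1):
        # Rank held by vertices without out-links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for w in out_links[v]:
                new[w] += damping * rank[v] / len(out_links[v])
        change = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if change < tol:  # stop when no significant change is detected
            return rank, it
    return rank, max_iter


g = {0: [1], 1: [2], 2: [0, 1]}
cold, cold_iters = pagerank(g)
warm, warm_iters = pagerank(g, start=cold)
print(cold_iters, warm_iters)
```

Starting from an already converged vector, the iteration stops almost immediately, which is the intuition behind doing only the necessary amount of work after an update.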

Community Centric Analysis (in
process)
Background
• Focuses on finding key players in communities
• Might be overlooked by network wide analytics.
• Computationally less expensive.
• Highly scalable
• Key players detected due to change to their
community upon extraction.
• We modify several widely used analytics for this
new type of computation.
• Starts off with an initial exploration of static
graphs
Modified metrics of interest
• Change in community modularity
• Change to the community diameter
• Change in the number of connected
components in the community
Community Centric Approach
• Given a community $C$ and a metric $M$, for each vertex $u$ in
each community $C$:
• Calculate initial metric $M_{initial}$ on community (left
figure) using static graph algorithm (done once)
• Remove vertex $u$ and links using STINGER
• Calculate changed metric $M_{after}$ using dynamic graph
algorithm using STINGER
• Look at change to community: $\Delta M_u = \frac{M_{after}}{M_{initial}}$
• Insert vertex $u$ and links using STINGER
Initial Findings
• A different way to use streaming analytics:
“vertex delta computations”
• Multiple metrics pinpoint same key vertices.
• Computationally efficient
• Over 20× faster than the network-wide approach.
• Highly scalable
• Applicable to other metrics as well
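The loop described above can be sketched for one of the listed metrics, the number of connected components in a community, using the ratio of the metric after removal to the initial metric. For simplicity this sketch emulates the removal with a skip argument and recomputes the metric statically each time; the approach on the slide instead removes and re-inserts the vertex in STINGER and uses a dynamic algorithm. The function names are hypothetical.

```python
def num_components(adj, skip=None):
    """Count connected components, optionally ignoring one vertex."""
    seen, count = set(), 0
    for s in adj:
        if s == skip or s in seen:
            continue
        count += 1
        seen.add(s)
        stack = [s]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w != skip and w not in seen:
                    seen.add(w)
                    stack.append(w)
    return count


def vertex_deltas(adj):
    """Ratio M_after / M_initial for every vertex of one community."""
    initial = num_components(adj)  # computed once
    return {u: num_components(adj, skip=u) / initial for u in adj}


community = {0: {1}, 1: {0, 2}, 2: {1}}
print(vertex_deltas(community))
```

In the three-vertex path, removing the middle vertex doubles the number of components while removing an endpoint changes nothing, so the middle vertex stands out as the key player.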

Current Research

Analysis of Centrality on Graphs
• Numerical Centrality
• Theoretically guaranteeing highly ranked vertices
from calculations of Katz Centrality from an
iterative solver
• Nathan, Sanders, Fairbanks, Henson, Bader. “Graph
Ranking Guarantees for Numerical Approximations
to Katz Centrality," International Conference on
Computational Science (ICCS). June 2017.
• Dynamic Centrality
• Develop algorithms to efficiently update Katz
Centrality in dynamic graphs: faster than static
recomputation and maintains high quality of
results. Algorithms are from both a linear algebraic
environment and personalized agglomerative
approach.
• Nathan and Bader. “A Dynamic Algorithm for
Updating Katz Centrality in Graphs," IEEE/ACM
International Conference on Social Networks
Analysis and Mining (ASONAM). July 2017.
• Nathan and Bader. “Approximating Personalized
Katz Centrality in Dynamic Graphs," 12th
International Conference on Parallel Processing
and Applied Mathematics (PPAM). September
2017.
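As background for the items above, a sketch of Katz centrality computed with a plain fixed-point iteration on an undirected graph. It uses the common form x = 1 + alpha * A x, which converges when alpha is below the reciprocal of the largest eigenvalue of the adjacency matrix. The function name, tolerance, and stopping rule are assumptions; the ranking guarantees and dynamic updates of the cited papers are not reproduced here.

```python
def katz(adj, alpha, tol=1e-12, max_iter=10_000):
    """Iterate x <- 1 + alpha * A x until the largest change is below tol."""
    x = {v: 1.0 for v in adj}
    for _ in range(max_iter):
        new = {v: 1.0 + alpha * sum(x[w] for w in adj[v]) for v in adj}
        if max(abs(new[v] - x[v]) for v in adj) < tol:
            return new
        x = new
    raise RuntimeError("no convergence; alpha may be too large for this graph")


# Star graph: center 0 with three leaves.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
scores = katz(star, alpha=0.1)
print(scores)
```

For the star, solving the two fixed-point equations by hand gives 1.3 / 0.97 for the center, which the iteration approaches; the center ranks above the leaves long before the values have fully converged, which is the kind of early ranking certainty the first item above is concerned with.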

Dynamic Communities
• Goals
• Detect local communities in networks given seed vertices of interest
• Allows a relevant subgraph to be extracted for targeted analysis
• Incrementally update and track communities over time in dynamic graphs
• Publications
• Zakrzewska and Bader. “A dynamic algorithm for local community detection in graphs,” ASONAM 2015.
• Zakrzewska and Bader. “Tracking local communities in streaming graphs with a dynamic algorithm,”
SNAM Journal 6(1) 2016.
• Zakrzewska, Nathan, Fairbanks, Bader. “A local measure of community change in dynamic graphs,”
ASONAM 2016.
• Nathan, Zakrzewska, Riedy, Bader. “Local community detection in dynamic graphs using personalized
centrality,” Journal of Algorithms 2017.

Sampling Streaming Graphs
• Challenges
• Many relational datasets are large, with new data constantly generated
• The volume of data may be too large to store or run graph analytics
• Goals
• Sample a stream of relational data (edges) to create a graph representation
• The sampling method should restrict both the number of vertices and edges to
limit the memory needed to store the sampled graph
• In some applications, newer data is more relevant
• Allow for sampling bias towards newer edges when needed or for
temporally uniform sampling
• Zakrzewska and Bader. “Streaming graph sampling with size restrictions,” ASONAM
2017.
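One simple way to get a temporally uniform, fixed-size sample of an edge stream is classic reservoir sampling, sketched below. This is a swapped-in textbook technique rather than the method of the cited paper: it bounds the number of edges directly (and therefore the number of vertices at twice that), and it does not implement the bias towards newer edges. The function name and parameters are assumptions.

```python
import random


def reservoir_sample_edges(stream, k, seed=0):
    """Keep a uniform random sample of at most k edges from a stream."""
    rng = random.Random(seed)  # seeded only to make the example repeatable
    sample = []
    for i, edge in enumerate(stream):
        if i < k:
            sample.append(edge)  # fill the reservoir first
        else:
            j = rng.randint(0, i)  # both bounds inclusive
            if j < k:
                sample[j] = edge  # replace a stored edge with prob. k/(i+1)
    return sample


edges = [(i, i + 1) for i in range(1000)]
kept = reservoir_sample_edges(edges, k=10)
vertices = {v for e in kept for v in e}
print(len(kept), len(vertices))
```

Memory stays proportional to k regardless of stream length, which is the property the goals above ask for.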

STINGER: Where do you get it?
http://www.stingergraph.com/
• Gateway to
• code,
• development,
• documentation,
• presentations...
• Users / contributors /
questioners: Georgia Tech,
PNNL, CMU, Berkeley, Intel, Cray,
NVIDIA, IBM, Federal
Government, Ionic Security, Citi,
Accenture

STINGER Development
Enterprise
• Tech transfer for GTRI
• Enterprise software integrity
• Nightly builds
• Unit testing required
Academic
• Maintained by Georgia Tech
• Ideal for prototyping.
• Sandbox for developing new
concepts
• When software matures…
http://git.cc.gatech.edu/git/u/eriedy3/stinger.git/ https://github.com/stingergraph

STINGER Summary
• Massive-Scale Streaming Analytics require
• Simple programming model
• Simple API.
• CSR-like in concept.
• STINGER has a lot more under the hood.
• Extremely fast updates
• Millions of updates per second.
• These must not be bottlenecks for updating an analytic.
• STINGER offers these
• STINGER has major performance benefits
• Thousands of times faster than static graph computation.
• Hundreds of thousands of updates per second for numerous
analytics.
• Real-time monitoring of underlying network.

Conclusions
• Massive-Scale Streaming Analytics will require new
• High-performance computing platforms
• Streaming algorithms
• Energy-efficient implementations
and are promising to solve real-world challenges!
• Mapping applications to high performance
architectures may yield 6 or more orders of
magnitude performance improvement

Acknowledgments
• Jason Riedy, Research Scientist, (Georgia Tech)
• Oded Green, Research Scientist, (Georgia Tech)
• Current Graduate Students (Georgia Tech):
• Xiaojing An
• James Fox
• Kasimir Gabert
• Euna Kim
• Recent Bader Alumni:
• Dr. Eisha Nathan (Lawrence Livermore National Lab)
• Dr. Vipin Sachdeva (IBM)
• Dr. Anita Zakrzewska (Lawrence Livermore National Lab)
• Dr. Lluis Miquel Munguia (Google)
• Prof. Kamesh Madduri (Penn State)
• Dr. David Ediger (GTRI)
• Dr. James Fairbanks (GTRI)
• Dr. Seunghwa Kang (Pacific Northwest National Lab)

PhD students
Emily Rogers
James Fox
Xiaojing An
Euna Kim
Anita Zakrzewska Eisha Nathan
Chunxing Yin Kasimir Gabert Lluis Miquel Munguia
Buzz

Bader, Related Recent Publications (2005-2009)
• D.A. Bader, G. Cong, and J. Feo, “On the Architectural Requirements for Efficient Execution of Graph Algorithms,” The 34th International Conference on Parallel Processing (ICPP
2005), pp. 547-556, Georg Sverdrups House, University of Oslo, Norway, June 14-17, 2005.
• D.A. Bader and K. Madduri, “Design and Implementation of the HPCS Graph Analysis Benchmark on Symmetric Multiprocessors,” The 12th International Conference on High
Performance Computing (HiPC 2005), D.A. Bader et al., (eds.), Springer-Verlag LNCS 3769, 465-476, Goa, India, December 2005.
• D.A. Bader and K. Madduri, “Designing Multithreaded Algorithms for Breadth-First Search and st-connectivity on the Cray MTA-2,” The 35th International Conference on Parallel
Processing (ICPP 2006), Columbus, OH, August 14-18, 2006.
• D.A. Bader and K. Madduri, “Parallel Algorithms for Evaluating Centrality Indices in Real-world Networks,” The 35th International Conference on Parallel Processing (ICPP 2006),
Columbus, OH, August 14-18, 2006.
• K. Madduri, D.A. Bader, J.W. Berry, and J.R. Crobak, “Parallel Shortest Path Algorithms for Solving Large-Scale Instances,” 9th DIMACS Implementation Challenge -- The Shortest
Path Problem, DIMACS Center, Rutgers University, Piscataway, NJ, November 13-14, 2006.
• K. Madduri, D.A. Bader, J.W. Berry, and J.R. Crobak, “An Experimental Study of A Parallel Shortest Path Algorithm for Solving Large-Scale Graph Instances,” Workshop on Algorithm
Engineering and Experiments (ALENEX), New Orleans, LA, January 6, 2007.
• J.R. Crobak, J.W. Berry, K. Madduri, and D.A. Bader, “Advanced Shortest Path Algorithms on a Massively-Multithreaded Architecture,” First Workshop on Multithreaded
Architectures and Applications (MTAAP), Long Beach, CA, March 30, 2007.
• D.A. Bader and K. Madduri, “High-Performance Combinatorial Techniques for Analyzing Massive Dynamic Interaction Networks,” DIMACS Workshop on Computational Methods
for Dynamic Interaction Networks, DIMACS Center, Rutgers University, Piscataway, NJ, September 24-25, 2007.
• D.A. Bader, S. Kintali, K. Madduri, and M. Mihail, “Approximating Betweenness Centrality,” The 5th Workshop on Algorithms and Models for the Web-Graph (WAW2007), San Diego,
CA, December 11-12, 2007.
• David A. Bader, Kamesh Madduri, Guojing Cong, and John Feo, “Design of Multithreaded Algorithms for Combinatorial Problems,” in S. Rajasekaran and J. Reif, editors, Handbook
of Parallel Computing: Models, Algorithms, and Applications, CRC Press, Chapter 31, 2007.
• Kamesh Madduri, David A. Bader, Jonathan W. Berry, Joseph R. Crobak, and Bruce A. Hendrickson, “Multithreaded Algorithms for Processing Massive Graphs,” in D.A. Bader,
editor, Petascale Computing: Algorithms and Applications, Chapman & Hall / CRC Press, Chapter 12, 2007.
• D.A. Bader and K. Madduri, “SNAP, Small-world Network Analysis and Partitioning: an open-source parallel graph framework for the exploration of large-scale networks,” 22nd
IEEE International Parallel and Distributed Processing Symposium (IPDPS), Miami, FL, April 14-18, 2008.
• S. Kang, D.A. Bader, “An Efficient Transactional Memory Algorithm for Computing Minimum Spanning Forest of Sparse Graphs,” 14th ACM SIGPLAN Symposium on Principles and
Practice of Parallel Programming (PPoPP), Raleigh, NC, February 2009.
• Karl Jiang, David Ediger, and David A. Bader. “Generalizing k-Betweenness Centrality Using Short Paths and a Parallel Multithreaded Implementation.” The 38th International
Conference on Parallel Processing (ICPP), Vienna, Austria, September 2009.
• Kamesh Madduri, David Ediger, Karl Jiang, David A. Bader, Daniel Chavarría-Miranda. “A Faster Parallel Algorithm and Efficient Multithreaded Implementations for Evaluating
Betweenness Centrality on Massive Datasets.” 3rd Workshop on Multithreaded Architectures and Applications (MTAAP), Rome, Italy, May 2009.
• David A. Bader, et al. “STINGER: Spatio-Temporal Interaction Networks and Graphs (STING) Extensible Representation.” 2009.

• David Ediger, Karl Jiang, E. Jason Riedy, and David A. Bader. “Massive Streaming Data Analytics: A Case Study with Clustering
Coefficients,” Fourth Workshop in Multithreaded Architectures and Applications (MTAAP), Atlanta, GA, April 2010.
• Seunghwa Kang, David A. Bader. “Large Scale Complex Network Analysis using the Hybrid Combination of a MapReduce cluster and a
Highly Multithreaded System,” Fourth Workshop in Multithreaded Architectures and Applications (MTAAP), Atlanta, GA, April 2010.
• David Ediger, Karl Jiang, Jason Riedy, David A. Bader, Courtney Corley, Rob Farber and William N. Reynolds. “Massive Social Network
Analysis: Mining Twitter for Social Good,” The 39th International Conference on Parallel Processing (ICPP 2010), San Diego, CA,
September 2010.
• Virat Agarwal, Fabrizio Petrini, Davide Pasetto and David A. Bader. “Scalable Graph Exploration on Multicore Processors,” The 22nd IEEE
and ACM Supercomputing Conference (SC10), New Orleans, LA, November 2010.
• Z. Du, Z. Yin, W. Liu, and D.A. Bader, “On Accelerating Iterative Algorithms with CUDA: A Case Study on Conditional Random Fields
Training Algorithm for Biological Sequence Alignment,” IEEE International Conference on Bioinformatics & Biomedicine, Workshop on
Data-Mining of Next Generation Sequencing Data (NGS2010), Hong Kong, December 20, 2010.
• D. Ediger, J. Riedy, H. Meyerhenke, and D.A. Bader, “Tracking Structure of Streaming Social Networks,” 5th Workshop on
Multithreaded Architectures and Applications (MTAAP), Anchorage, AK, May 20, 2011.
• D. Mizell, D.A. Bader, E.L. Goodman, and D.J. Haglin, “Semantic Databases and Supercomputers,” 2011 Semantic Technology
Conference (SemTech), San Francisco, CA, June 5-9, 2011.
• P. Pande and D.A. Bader, “Computing Betweenness Centrality for Small World Networks on a GPU,” The 15th Annual High
Performance Embedded Computing Workshop (HPEC), Lexington, MA, September 21-22, 2011.
• David A. Bader, Christine Heitsch, and Kamesh Madduri, “Large-Scale Network Analysis,” in J. Kepner and J. Gilbert, editors, Graph
Algorithms in the Language of Linear Algebra, SIAM Press, Chapter 12, pages 253-285, 2011.
• Jeremy Kepner, David A. Bader, Robert Bond, Nadya Bliss, Christos Faloutsos, Bruce Hendrickson, John Gilbert, and Eric Robinson,
“Fundamental Questions in the Analysis of Large Graphs,” in J. Kepner and J. Gilbert, editors, Graph Algorithms in the Language of
Linear Algebra, SIAM Press, Chapter 16, pages 353-357, 2011.

Bader, Related Recent Publications (2012)
• E.J. Riedy, H. Meyerhenke, D. Ediger, and D.A. Bader, “Parallel Community Detection for Massive Graphs,” The 9th International Conference on Parallel Processing and Applied Mathematics (PPAM
2011), Torun, Poland, September 11-14, 2011. Lecture Notes in Computer Science, 7203:286-296, 2012.
• E.J. Riedy, D. Ediger, D.A. Bader, and H. Meyerhenke, “Parallel Community Detection for Massive Graphs,” 10th DIMACS Implementation Challenge -- Graph Partitioning and Graph Clustering, Atlanta,
GA, February 13-14, 2012.
• E.J. Riedy, H. Meyerhenke, D.A. Bader, D. Ediger, and T. Mattson, “Analysis of Streaming Social Networks and Graphs on Multicore Architectures,” The 37th IEEE International Conference on
Acoustics, Speech, and Signal Processing (ICASSP), Kyoto, Japan, March 25-30, 2012.
• J. Riedy, H. Meyerhenke, and D.A. Bader, “Scalable Multi-threaded Community Detection in Social Networks,” 6th Workshop on Multithreaded Architectures and Applications (MTAAP), Shanghai,
China, May 25, 2012.
• H. Meyerhenke, E.J. Riedy, and D.A. Bader, “Parallel Community Detection in Streaming Graphs,” Minisymposium on Parallel Analysis of Massive Social Networks, 15th SIAM Conference on Parallel
Processing for Scientific Computing (PP12), Savannah, GA, February 15-17, 2012.
• D. Ediger, E.J. Riedy, H. Meyerhenke, and D.A. Bader, “Analyzing Massive Networks with GraphCT,” Poster Session, 15th SIAM Conference on Parallel Processing for Scientific Computing (PP12),
Savannah, GA, February 15-17, 2012.
• R.C. McColl, D. Ediger, and D.A. Bader, “Many-Core Memory Hierarchies and Parallel Graph Analysis,” Poster Session, 15th SIAM Conference on Parallel Processing for Scientific Computing (PP12),
Savannah, GA, February 15-17, 2012.
• E.J. Riedy, D. Ediger, H. Meyerhenke, and D.A. Bader, “STING: Software for Analysis of Spatio-Temporal Interaction Networks and Graphs,” Poster Session, 15th SIAM Conference on Parallel
Processing for Scientific Computing (PP12), Savannah, GA, February 15-17, 2012.
• Y. Chai, Z. Du, D.A. Bader, and X. Qin, "Efficient Data Migration to Conserve Energy in Streaming Media Storage Systems," IEEE Transactions on Parallel & Distributed Systems, 2012.
• M. S. Swenson, J. Anderson, A. Ash, P. Gaurav, Z. Sükösd, D.A. Bader, S.C. Harvey and C.E Heitsch, "GTfold: Enabling parallel RNA secondary structure prediction on multi-core desktops," BMC
Research Notes, 5:341, 2012.
• D. Ediger, K. Jiang, E.J. Riedy, and D.A. Bader, "GraphCT: Multithreaded Algorithms for Massive Graph Analysis," IEEE Transactions on Parallel & Distributed Systems, 2012.
• D.A. Bader and K. Madduri, "Computational Challenges in Emerging Combinatorial Scientific Computing Applications," in O. Schenk, editor, Combinatorial Scientific Computing, Chapman & Hall
/ CRC Press, Chapter 17, pages 471-494, 2012.
• O. Green, R. McColl, and D.A. Bader, "GPU Merge Path -- A GPU Merging Algorithm," 26th ACM International Conference on Supercomputing (ICS), San Servolo Island, Venice, Italy, June 25-29, 2012.
• O. Green, R. McColl, and D.A. Bader, "A Fast Algorithm for Streaming Betweenness Centrality," 4th ASE/IEEE International Conference on Social Computing (SocialCom), Amsterdam, The
Netherlands, September 3-5, 2012.
• D. Ediger, R. McColl, J. Riedy, and D.A. Bader, "STINGER: High Performance Data Structure for Streaming Graphs," The IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA,
September 20-22, 2012. Best Paper Award.
• J. Marandola, S. Louise, L. Cudennec, J.-T. Acquaviva and D.A. Bader, "Enhancing Cache Coherent Architecture with Access Patterns for Embedded Manycore Systems," 14th IEEE International
Symposium on System-on-Chip (SoC), Tampere, Finland, October 11-12, 2012.
• L.M. Munguía, E. Ayguade, and D.A. Bader, "Task-based Parallel Breadth-First Search in Heterogeneous Environments," The 19th Annual IEEE International Conference on High Performance
Computing (HiPC), Pune, India, December 18-21, 2012.

Bader, Related Recent Publications (2013)
• X. Liu, P. Pande, H. Meyerhenke, and D.A. Bader, "PASQUAL: Parallel Techniques for Next Generation Genome Sequence Assembly," IEEE
Transactions on Parallel & Distributed Systems, 24(5):977-986, 2013.
• David A. Bader, Henning Meyerhenke, Peter Sanders, and Dorothea Wagner (eds.), Graph Partitioning and Graph Clustering, American
Mathematical Society, 2013.
• E. Jason Riedy, Henning Meyerhenke, David Ediger and David A. Bader, "Parallel Community Detection for Massive Graphs," in David A. Bader,
Henning Meyerhenke, Peter Sanders, and Dorothea Wagner (eds.), Graph Partitioning and Graph Clustering, American Mathematical Society,
Chapter 14, pages 207-222, 2013.
• S. Kang, D.A. Bader, and R. Vuduc, "Energy-Efficient Scheduling for Best-Effort Interactive Services to Achieve High Response Quality," 27th
IEEE International Parallel and Distributed Processing Symposium (IPDPS), Boston, MA, May 20-24, 2013.
• J. Riedy and D.A. Bader, "Multithreaded Community Monitoring for Massive Streaming Graph Data," 7th Workshop on Multithreaded
Architectures and Applications (MTAAP), Boston, MA, May 24, 2013.
• D. Ediger and D.A. Bader, "Investigating Graph Algorithms in the BSP Model on the Cray XMT," 7th Workshop on Multithreaded Architectures
and Applications (MTAAP), Boston, MA, May 24, 2013.
• O. Green and D.A. Bader, "Faster Betweenness Centrality Based on Data Structure Experimentation," International Conference on
Computational Science (ICCS), Barcelona, Spain, June 5-7, 2013.
• Z. Yin, J. Tang, S. Schaeffer, and D.A. Bader, "Streaming Breakpoint Graph Analytics for Accelerating and Parallelizing the Computation of DCJ
Median of Three Genomes," International Conference on Computational Science (ICCS), Barcelona, Spain, June 5-7, 2013.
• T. Senator, D.A. Bader, et al., "Detecting Insider Threats in a Real Corporate Database of Computer Usage Activities," 19th ACM SIGKDD
Conference on Knowledge Discovery and Data Mining (KDD), Chicago, IL, August 11-14, 2013.
• J. Fairbanks, D. Ediger, R. McColl, D.A. Bader and E. Gilbert, "A Statistical Framework for Streaming Graph Analysis," IEEE/ACM International
Conference on Advances in Social Networks Analysis and Modeling (ASONAM), Niagara Falls, Canada, August 25-28, 2013.
• A. Zakrzewska and D.A. Bader, "Measuring the Sensitivity of Graph Metrics to Missing Data," 10th International Conference on Parallel
Processing and Applied Mathematics (PPAM), Warsaw, Poland, September 8-11, 2013.
• O. Green and D.A. Bader, "A Fast Algorithm for Streaming Betweenness Centrality," 5th ASE/IEEE International Conference on Social
Computing (SocialCom), Washington, DC, September 8-14, 2013.
• R. McColl, O. Green, and D.A. Bader, "A New Parallel Algorithm for Connected Components in Dynamic Graphs," The 20th Annual IEEE
International Conference on High Performance Computing (HiPC), Bangalore, India, December 18-21, 2013.

• R. McColl, D. Ediger, J. Poovey, D. Campbell, and D.A. Bader, "A Performance Evaluation of Open Source Graph Databases," The 1st Workshop
on Parallel Programming for Analytics Applications (PPAA 2014) held in conjunction with the 19th ACM SIGPLAN Symposium on Principles and
Practice of Parallel Programming (PPoPP 2014), Orlando, Florida, February 16, 2014.
• O. Green, L.M. Munguia, and D.A. Bader, "Load Balanced Clustering Coefficients," The 1st Workshop on Parallel Programming for Analytics
Applications (PPAA 2014) held in conjunction with the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP
2014), Orlando, Florida, February 16, 2014.
• A. McLaughlin and D.A. Bader, "Revisiting Edge and Node Parallelism for Dynamic GPU Graph Analytics," 8th Workshop on Multithreaded
Architectures and Applications (MTAAP), held in conjunction with The IEEE International Parallel and Distributed Processing Symposium (IPDPS
2014), Phoenix, AZ, May 23, 2014.
• Z. Yin, J. Tang, S. Schaeffer, D.A. Bader, "A Lin-Kernighan Heuristic for the DCJ Median Problem of Genomes with Unequal Contents," 20th
International Computing and Combinatorics Conference (COCOON), Atlanta, GA, August 4-6, 2014.
• Y. You, D.A. Bader and M.M. Dehnavi, "Designing an Adaptive Cross-Architecture Combination for Graph Traversal," The 43rd International
Conference on Parallel Processing (ICPP 2014), Minneapolis, MN, September 9-12, 2014.
• A. McLaughlin, J. Riedy, and D.A. Bader, "Optimizing Energy Consumption and Parallel Performance for Betweenness Centrality using
GPUs," The 18th Annual IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA, September 9-11, 2014.
• A. McLaughlin and D.A. Bader, "Scalable and High Performance Betweenness Centrality on the GPU," The 26th IEEE and ACM Supercomputing
Conference (SC14), New Orleans, LA, November 16-21, 2014. Best Student Paper Finalist.
• D. Dauwe, E. Jonardi, R. Friese, S. Pasricha, A.A. Maciejewski, D.A. Bader, and H.J. Siegel, “A Methodology for Co-Location Aware Application
Performance Modeling in Multicore Computing,” 17th Workshop on Advances on Parallel and Distributed Processing Symposium (APDCM),
Hyderabad, India, May 25, 2015.
• A. Zakrzewska and D.A. Bader, “Fast Incremental Community Detection on Dynamic Graphs,” 11th International Conference on Parallel
Processing and Applied Mathematics (PPAM), Krakow, Poland, September 6-9, 2015.
• A. McLaughlin, J. Riedy, and D.A. Bader, “An Energy-Efficient Abstraction for Simultaneous Breadth-First Searches,” The 19th Annual IEEE High
Performance Extreme Computing Conference (HPEC), Waltham, MA, September 15-17, 2015.
• A. McLaughlin, D. Merrill, M. Garland and D.A. Bader, “Parallel Methods for Verifying the Consistency of Weakly-Ordered Architectures,” The
24th International Conference on Parallel Architectures and Compilation Techniques (PACT), San Francisco, CA, October 18-21, 2015.
• A. McLaughlin and D.A. Bader, “Fast Execution of Simultaneous Breadth-First Searches on Sparse Graphs,” The 21st IEEE International
Conference on Parallel and Distributed Systems (ICPADS), Melbourne, Australia, December 14-17, 2015.

• David Bader, Aleksandra Michalewicz, Oded Green, Jessie Birkett-Rees, Jason Riedy, James Fairbanks, and Anita Zakrzewska, “Semantic database applications at the
Samtavro Cemetery, Georgia,” The 44th Computer Applications and Quantitative Methods in Archaeology Conference (CAA), Oslo, Norway, March 29 – April 2, 2016.
• Vipin Sachdeva, Srinivas Aluru, David A. Bader, “A Memory and Time Scalable Parallelization of the Reptile Error-Correction Code,” 15th IEEE International Workshop on
High Performance Computational Biology (HiCOMB), Chicago, IL, May 23, 2016.
• James Fairbanks, Anita Zakrzewska, and David A. Bader, “New Stopping Criteria For Spectral Partitioning,” IEEE/ACM International Conference on Advances in Social
Networks Analysis and Modeling (ASONAM), San Francisco, CA, August 18-21, 2016.
• Anita Zakrzewska, Eisha Nathan, James Fairbanks, and David A. Bader, “A Local Measure of Community Change in Dynamic Graphs,” IEEE/ACM International Conference on
Advances in Social Networks Analysis and Modeling (ASONAM), San Francisco, CA, August 18-21, 2016.
• Anita Zakrzewska and David A. Bader, “Aging Data in Dynamic Graphs: A Comparative Study,” 2nd International Workshop on Dynamics in Networks (DyNo), held in
conjunction with IEEE/ACM International Conference on Advances in Social Networks Analysis and Modeling (ASONAM), San Francisco, CA, August 18, 2016.
• O. Green and D.A. Bader, “cuSTINGER: Supporting Dynamic Graph Algorithms for GPUs,” The 20th Annual IEEE High Performance Extreme Computing Conference (HPEC),
Waltham, MA, September 13-15, 2016.
• Jeremy Kepner, Peter Aaltonen, David A. Bader, Aydin Buluc, Franz Franchetti, John Gilbert, Dylan Hutchison, Manoj Kumar, Andrew Lumsdaine, Henning Meyerhenke, Scott
McMillan, Jose Moreira, John D. Owens, Carl Yang, Marcin Zalewski, and Timothy Mattson, “Mathematical Foundations of the GraphBLAS,” The 20th Annual IEEE High
Performance Extreme Computing Conference (HPEC), Waltham, MA, September 13-15, 2016.
• X. Hui, Z. Du, J. Liu, H. Sun, Y. He and D.A. Bader, “When Good Enough Is Better: Energy-Aware Scheduling for Multicore Servers,” 13th Workshop on High-Performance,
Power-Aware Computing (HPPAC), Orlando, FL, May 29, 2017.
• E. Nathan, G. Sanders, J. Fairbanks, V. Henson and D.A. Bader, “Graph Ranking Guarantees for Numerical Approximations to Katz Centrality,” International Conference on
Computational Science (ICCS), Zurich, Switzerland, June 12-14, 2017.
• Anita Zakrzewska and David A. Bader, “Streaming Graph Sampling with Size Restrictions,” IEEE/ACM International Conference on Advances in Social Networks Analysis and
Modeling (ASONAM), Sydney, Australia, July 31 - August 3, 2017.
• Eisha Nathan and David A. Bader, “A Dynamic Algorithm for Updating Katz Centrality in Graphs,” IEEE/ACM International Conference on Advances in Social Networks
Analysis and Modeling (ASONAM), Sydney, Australia, July 31 - August 3, 2017.
• E. Nathan and D.A. Bader, “Approximating Personalized Katz Centrality in Dynamic Graphs,” 12th International Conference on Parallel Processing and Applied
Mathematics (PPAM), Lublin, Poland, September 10-13, 2017.
• O. Green, J. Fox, E. Kim, F. Busato, N. Bombieri, K. Lakhotia, S. Zhou, S. Singapura, H. Zeng, R. Kannan, V. Prasanna, D. Bader, “Quickly Finding a Truss in a Haystack”, IEEE High
Performance Extreme Computing Conference (HPEC), Waltham, Massachusetts, 2017 (HPEC Graph Challenge Innovation Award)
• S. Zhou, K. Lakhotia, S. Singapura, H. Zeng, R. Kannan, V. Prasanna, J. Fox, E. Kim, O. Green, D. Bader, “Design and Implementation of Parallel PageRank on Multicore
Platforms”, IEEE High Performance Extreme Computing Conference (HPEC), Waltham, Massachusetts, 2017 (HPEC Graph Challenge Student Innovation Award)
• D. Makkar, D. Bader, O. Green, “Deterministic and Parallel Triangle Counting in Streaming Graphs”, IEEE International Conference on High Performance Computing, Data,
and Analytics, Jaipur, India, 2017

Opportunities
• Application-oriented Opportunities:
• High performance computing for massive graphs
• Streaming analytics
• Information Visualization techniques for massive
graphs
• Heterogeneous systems: Methodologies for combining
the use of the Cloud and Manycore for high-
performance computing
• Energy-efficient high-performance computing

Opportunity 1: High performance computing for massive graphs
• Traditional HPC has focused primarily on solving large problems from
chemistry, physics, and mechanics, using dense linear algebra.
• HPC faces new challenges to deal with:
• time-varying interactions among entities, and
• massive-scale graph abstractions where the vertices represent the nouns or
entities and the edges represent their observed interactions.
• Few parallel computers run well on these problems because
• they often lack locality required to get high performance from distributed-
memory cache-based supercomputers.
• Case study: Massively threaded architectures are shown to run several orders
of magnitude faster than the fastest supercomputers on these types of
problems!
→ A focused research agenda is needed to design algorithms that
scale on these new platforms.

Opportunity 2: Streaming analytics
• While our high performance computers have delivered a sustained petaflop, they have
done so using the same antiquated batch processing style where a program and a
static data set are scheduled to compute in the next available slot.
• Today, data is overwhelming in volume and rate, and we struggle to keep up with these
streams.
→ Fundamental computer science research is needed in:
• the design of streaming architectures, and
• data structures and algorithms that can compute important analytics while sitting in the
middle of these torrential flows.
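As a small illustration of an analytic that stays current while edges keep arriving, the sketch below maintains a global triangle count under edge insertions and deletions instead of recomputing it from a static snapshot. It is only a minimal single-threaded example: the class name `StreamingTriangleCounter` and its methods are hypothetical and are not taken from STINGER or any other system named in this deck.

```python
from collections import defaultdict


class StreamingTriangleCounter:
    """Keeps a running triangle count for an undirected simple graph.

    Hypothetical illustration only; not the API of any library in this deck.
    """

    def __init__(self):
        self.adj = defaultdict(set)  # vertex -> set of neighbours
        self.triangles = 0

    def insert(self, u, v):
        # Ignore self-loops and edges that are already present.
        if u == v or v in self.adj[u]:
            return
        # Every common neighbour of u and v closes one new triangle.
        self.triangles += len(self.adj[u] & self.adj[v])
        self.adj[u].add(v)
        self.adj[v].add(u)

    def delete(self, u, v):
        if v not in self.adj[u]:
            return
        self.adj[u].discard(v)
        self.adj[v].discard(u)
        # Every common neighbour of u and v loses one triangle.
        self.triangles -= len(self.adj[u] & self.adj[v])


counter = StreamingTriangleCounter()
for edge in [(0, 1), (1, 2), (0, 2), (2, 3), (0, 3)]:
    counter.insert(*edge)
print(counter.triangles)  # triangles after the five insertions
```

Each update touches only the neighbourhoods of the two endpoints, which is the property that lets this kind of analytic keep pace with a stream rather than wait for the next batch slot.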

Opportunity 3: Information Visualization techniques for
massive graphs
• Information Visualization today
• addresses traditional scientific computing (fluid flow, molecular dynamics), or
• when handling discrete data, scales to only hundreds of vertices at best.
→ However, there is a strong need for visualization in the data sciences so that
analysts can gain understanding from data sets with millions to
billions of interacting non-planar discrete entities.
• Applications include: data mining, intelligence, situational awareness
[Figure: NNDB Mapper of George Washington]
[Figure: Twitter social network using Large Graph Layout. Source: Akshay Java, from ebiquity group]

Opportunity 4: Heterogeneous Systems, Cloud, Internet of Things:
Methodologies for combining the use of the Cloud, IoT, and
accelerators
• Today, there is a dichotomy between using clouds (e.g. Hadoop, map-reduce) for massive data storage,
filtering, and summarization, and massively parallel/multithreaded systems for data-intensive
computation.
→ We must develop methodologies for employing these complementary systems for solving grand challenges
in data analysis.
[Figure: Steve Mills, SVP of IBM Software (left), and Dr. John Kelly, SVP of IBM Research, view Stream Computing technology]

Opportunity 5: Energy-efficient high-performance computing
• The main constraint for our ability to compute has changed
• from availability of compute resources
• to the ability to power and cool our systems within budget.
→ Holistic research is needed that can permeate from the architecture
and systems up to the applications AND DATA CENTERS, whereby
energy use is a first-class object that can be optimized at all levels.
[Figure: Microsoft’s Chicago Million Server DataCenter]

Acknowledgment of Support

Backup Slides

HPEC Graph Challenge “Innovation Award”
• Static Graph Challenge: Subgraph Isomorphism
• Finding the maximal K-Truss subgraph
• A Truss is a relaxation of a clique that still has a good amount of connectivity
• In collaboration with USC and the University of Verona
• O. Green, J. Fox, E. Kim, F. Busato, N. Bombieri, K. Lakhotia, S. Zhou, S.
Singapura, H. Zeng, R. Kannan, V. Prasanna, D. Bader, “Quickly Finding a
Truss in a Haystack”, IEEE High Performance Extreme Computing
Conference (HPEC), Waltham, Massachusetts, 2017 (HPEC Graph
Challenge Innovation Award)
• Trusses are found by deleting edges from the graph.
• Uses dynamic graph techniques for solving a static graph problem.
• Greatly reduces the amount of work needed for finding the Truss.
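The edge-deletion idea above can be sketched with the textbook k-truss definition: every edge of a k-truss lies in at least k - 2 triangles within the truss. The code below is a minimal sequential sketch under that definition, not the award-winning GPU implementation; the function name `k_truss` and the worklist scheme are assumptions made for illustration.

```python
from collections import defaultdict


def k_truss(edges, k):
    """Return the edges (as (u, v) with u < v) of the k-truss.

    Hypothetical sequential sketch: repeatedly delete edges that lie in
    fewer than k - 2 triangles until no such edge remains.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    # Worklist of edges whose triangle support still has to be checked.
    pending = {(min(u, v), max(u, v)) for u in adj for v in adj[u]}
    while pending:
        u, v = pending.pop()
        if v not in adj[u]:
            continue  # edge was already deleted
        common = adj[u] & adj[v]  # one triangle per common neighbour
        if len(common) < k - 2:
            # Delete the edge, as a dynamic graph update would.
            adj[u].discard(v)
            adj[v].discard(u)
            # Only edges that shared a triangle with (u, v) lose support.
            for w in common:
                pending.add((min(u, w), max(u, w)))
                pending.add((min(v, w), max(v, w)))
    return {(u, v) for u in adj for v in adj[u] if u < v}


# A 4-clique with one pendant edge attached to vertex 3.
clique = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(sorted(k_truss(clique + [(3, 4)], 4)))
```

After a deletion only the edges that shared a triangle with the deleted edge are re-examined, which is one way to see why treating the static problem with dynamic updates can cut the work.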

HPEC Graph Challenge
“Student Innovation Award”
• In collaboration with USC
• S. Zhou, K. Lakhotia, S. Singapura, H. Zeng, R. Kannan, V. Prasanna, J. Fox, E. Kim,
O. Green, D. Bader, “Design and Implementation of Parallel PageRank
on Multicore Platforms”, IEEE High Performance Extreme Computing
Conference (HPEC), Waltham, Massachusetts, 2017 (HPEC Graph Challenge
Student Innovation Award)
• Uses an edge centric approach: edge list is partitioned into shards.
• Shards are sorted by destination.
• Greatly reduces the number of cache misses and data movement.
• 2.5X faster than the multi-threaded version of the PageRank Pipeline benchmark.
• Applicable to accelerators such as FPGAs
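A rough sketch of the edge-centric layout described above: the edge list is cut into shards, each shard is sorted by destination, and every PageRank iteration streams over the shards. How the paper actually partitions the edge list is not stated on the slide, so the equal-size slicing in `make_shards`, the uniform handling of dangling vertices, and all function names here are assumptions for illustration only.

```python
def make_shards(edges, num_shards):
    """Cut the edge list into roughly equal shards, each sorted by destination.

    Equal-size slicing is an assumption; the slide does not say how
    the partitioning is done.
    """
    size = max(1, -(-len(edges) // num_shards))  # ceiling division
    shards = [edges[i:i + size] for i in range(0, len(edges), size)]
    return [sorted(shard, key=lambda e: e[1]) for shard in shards]


def pagerank_edge_centric(edges, num_vertices, num_shards=4,
                          damping=0.85, iterations=20):
    """PageRank that iterates over edges shard by shard (hypothetical sketch)."""
    out_degree = [0] * num_vertices
    for src, _ in edges:
        out_degree[src] += 1
    shards = make_shards(edges, num_shards)

    rank = [1.0 / num_vertices] * num_vertices
    for _ in range(iterations):
        # Rank held by vertices without out-edges is spread uniformly.
        dangling = sum(rank[v] for v in range(num_vertices)
                       if out_degree[v] == 0)
        base = (1.0 - damping + damping * dangling) / num_vertices
        new_rank = [base] * num_vertices
        for shard in shards:
            # Within a shard, writes to new_rank move through
            # destinations in increasing order.
            for src, dst in shard:
                new_rank[dst] += damping * rank[src] / out_degree[src]
        rank = new_rank
    return rank


ranks = pagerank_edge_centric([(0, 1), (1, 2), (2, 0), (0, 2)], 3, num_shards=2)
print(ranks)
```

Sorting each shard by destination means consecutive updates tend to land on the same or nearby entries of `new_rank`, which is the locality effect the slide credits for fewer cache misses.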

Graph500 Benchmark, www.graph500.org
Defining a new set of benchmarks to guide the design of hardware architectures and
software systems intended to support such applications and to help procurements.
Graph algorithms are a core part of many analytics workloads.
Executive Committee: D.A. Bader, R. Murphy, M. Snir, A. Lumsdaine
• Five Business Area Data Sets:
• Cybersecurity
• 15 Billion Log Entries/Day (for large enterprises)
• Full Data Scan with End-to-End Join Required
• Medical Informatics
• 50M patient records, 20-200 records/patient, billions of individuals
• Entity Resolution Important
• Social Networks
• Example, Facebook, Twitter
• Nearly Unbounded Dataset Size
• Data Enrichment
• Easily PB of data
• Example: Maritime Domain Awareness
• Hundreds of Millions of Transponders
• Tens of Thousands of Cargo Ships
• Tens of Millions of Pieces of Bulk Cargo
• May involve additional data (images, etc.)
• Symbolic Networks
• Example, the Human Brain
• 25B Neurons
• 7,000+ Connections/Neuron

Heterogeneity in “Big Data” systems:
High Performance Data Analytics
• Analytic platforms will combine:
• Cloud (Hadoop/map-reduce)
• Stream processing
• Large shared-memory systems
• Massive multithreaded architectures
• Multicore and accelerators
The challenge: developing methodologies for employing these complementary systems in
an enterprise-class analytics framework for solving grand challenges in massive data
analysis for discovery, real-time analytics, and forensics.
[Figure: Steve Mills, SVP of IBM Software (left), and Dr. John Kelly, SVP of IBM Research, view Stream Computing technology]

Future Architectures
• Highly multithreaded
• High bandwidth (network and memory)
• Complex but flexible memory hierarchy
• Heterogeneous design in core capability and ISA

National Strategic Computing Initiative
• (29 July 2015, The White House) The National Strategic Computing Initiative (NSCI) is an effort
to create a cohesive, multi-agency strategic vision and Federal investment strategy in high-
performance computing (HPC).
• This strategy will be executed in collaboration with industry and academia, maximizing the
benefits of HPC for the United States.
• HPC systems, through a combination of processing capability and storage capacity, can solve
computational problems that are beyond the capability of small- to medium-scale systems.
They are vital to the Nation’s interests in science, medicine, engineering, technology, and
industry.
• The NSCI will spur the creation and deployment of computing technology at the leading edge,
helping to advance Administration priorities for economic competitiveness, scientific discovery,
and national security.
• The National Strategic Computing Initiative has five strategic themes.
1. Create systems that can apply exaflops of computing power to exabytes of data.
2. Keep the United States at the forefront of HPC capabilities.
3. Improve HPC application developer productivity.
4. Make HPC readily available.
5. Establish hardware technology for future HPC systems.

NSCI Anniversary Meeting, 29 July 2016

Dynamic Graphs and Streaming Support
[Figure: graph frameworks positioned along two axes, Update/Streaming rate and Dynamic Support (Static to Dynamic). Frameworks shown: Galois, TitanDB, GraphX, GraphLab, Giraph, Gunrock, Boost, GraphCHI, STINGER, Ligra, Hornet.]

Performance and Data Scalability
[Figure: graph frameworks positioned along two axes, Performance Scalability and Data size (Small to Large). Frameworks shown: Galois, TitanDB, GraphX, GraphLab, Giraph, Gunrock, Boost, GraphCHI, Ligra, STINGER, GPU-Hornet, CPU-Hornet.]

Overview of STINGER and Hornet Capabilities
Algorithm Static Dynamic Implementation Formulation
Breadth First Search ✓ ✓ CPU+GPU VC
Triangle Counting ✓ ✓ CPU+GPU VC
Connected Components ✓ ✓ CPU+GPU VC
Betweenness Centrality ✓ ✓ CPU+GPU VC
PageRank ✓ ✓ CPU+GPU VC, BLAS
Katz Centrality ✓ ✓ CPU+GPU VC, BLAS
Community Detection ✓ ✓ CPU VC
Seed Set Expansion ✓ ✓ CPU VC
K-Truss Decomposition ✓ GPU VC
Maximal Independent Set ✓ GPU VC
Legend: VC - Vertex Centric, BLAS

Graph Analytics (Dynamic vs. Streaming)
• Many libraries support updates to the graph.
• They do not have dynamic graph analytics.
[Figure: graph frameworks positioned along two axes, Streaming Rate (Slow/very low to Fast) and Graph and algorithm properties (Static to Dynamic). Frameworks shown: STINGER, GraphLab, Giraph, GraphX, Galois, TitanDB.]

Scalability (Volume vs. Performance)
• Many libraries support updates to the graph.
• They do not have dynamic graph analytics.
[Figure: graph frameworks positioned along two axes, Performance Scalability (Low to Fast) and Data Scalability (Small to Large). Frameworks shown: Boost, Gunrock, STINGER, GraphLab, Giraph, GraphX, Galois, TitanDB, GraphCHI.]

PageRank - Performance Analysis
• Lower is better.
• 2 orders of magnitude difference in performance.
• Still outperforms other static-only graph packages.
• Outperforms the distributed systems even for large networks
with plenty of computational demand!
• Some platforms did not complete in a reasonable amount of time.
R-MAT Graph
• Vertices: 16M
• Edges: 128M

Other algorithms
R-MAT Graph
• Vertices: 1M
• Edges: 8M
• STINGER is orders of magnitude faster.
• Still outperforms other static-only graph packages.
[Figure panels: Static Single-Source Shortest Path; Static Connected Components]

High Performance Data Analytics (HPDA)
• With Pacific Northwest National Lab
• Project objectives
• Develop novel tools and algorithms for dealing with massive dynamic networks.
• Enable analysts to search through vast data at near real-time speeds.
• Improve accuracy of past approaches through the use of community-centric analysis.
• Successes
• Developed the first scalable dynamic graph data structure for the GPU (that also works for CPUs).
The data structure supports over 90 million updates per second.
• Designed novel dynamic algorithms for Katz Centrality and triangle counting.
• Developed personalized centrality metrics.
• Sketched out an asynchronous model, the first of its kind, for analyzing the correctness of
dynamic graph algorithms when the underlying graph is changing.

Leveraging High Performance Computing
for Mixed-Integer Programming
In a joint collaboration with the ExxonMobil Upstream Research Company, we
focus on developing effective Mixed Integer Programming (MIP) methods for
difficult planning and scheduling problems arising in the petrochemical industry.
• Challenges
• MIPs are NP-Hard problems.
• Available parallel algorithms show poor
scalability.
• Goals:
• To study and develop new scalable parallel
algorithms for MIP solving.
• Our focus is on large scale industrial
optimization problems.
• To offer MIP practitioners a portfolio of
parallel algorithms, which emphasize
finding high quality solutions quickly as
well as proving optimality.

NSF XSCALA
• A Community Repository for Model-driven Design and Tuning of Data-
Intensive Applications for Extreme-scale Accelerator-based Systems
• Collaboration with Georgia Tech and University of Southern California
• Launched 2012
• Challenges
• Sparse, data-intensive computations with irregular memory access patterns.
• Extremely hard for compilers to parallelize → requires hand tuning for new architectures and platforms.
• Goals
• Design tools for automatic fine tuning and optimizations.
• Develop runtime scheduling techniques for load-balancing and for hardware selection.
• Offer programmers an intuitive modeling environment for design-time and run-time
optimizations.

Parallel Alternating Criteria Search
• A parallel Large Neighborhood Search (LNS) heuristic aimed at obtaining high quality
feasible solutions for general Mixed Integer Programs (MIPs).
• Solution improvements are found by solving in parallel a large number of restricted subproblems, which
are derived from the original problem.
• It is the first parallel general purpose heuristic developed for MIPs.
• Significantly more scalable and effective at finding high quality solutions than current
commercial MIP solvers, especially for large-scale MIP instances.
• The framework provides an excellent platform for the rapid prototyping of highly
effective problem-specific heuristics, which exploit problem structure.
