Tuesday 16th July 2019
Towards Parallel and Lazy Model Queries
Sina Madani, Dimitris Kolovos, Richard Paige
{sm1748, dimitris.kolovos, richard.paige}@york.ac.uk
Enterprise Systems, Department of Computer Science
ECMFA 2019, Eindhoven
Outline
• Current shortcomings of (most) model querying tools
• Parallel execution of functional collection operations
• Lazy evaluation
• Short-circuiting
• Pipelining
• Performance evaluation
• Future work
Background
• Scalability is an active research area in model-driven engineering
• Collaboration and versioning
• Persistence and distribution
• Continuous event processing
• Queries and transformations
• Very large models / datasets common in complex industrial projects
• First-order operations frequently used in model management tasks
• Diminishing single-thread performance, increasing number of cores
• Vast majority of operations on collections are pure functions
• i.e. inherently thread-safe and parallelisable
OCL limitations
• Willink (2017) Deterministic Lazy Mutable OCL Collections:
• “Immutability implies inefficient collection churning.”
• “Specification implies eager evaluation.”
• “Invalidity implies full evaluation.”
• “The OCL specification is far from perfect.”
• Consequence: Inefficiency!
Example query: IMDb
Movie.allInstances()
    ->select(title.startsWith('The '))
    ->collect(persons)
    ->selectByKind(Actress)
    ->forAll(movies->size() > 5)
Unoptimised execution algorithm
1. Load all Movie instances into memory
2. Find all Movies starting with “The ” from (1)
3. Get all Actors and Actresses for all of the Movies in (2)
4. Filter the collection in (3) to only include Actresses
5. For every Actress in (4), check whether they have played in more than 5 movies, returning true iff all elements satisfy this condition
How can we do better?
• Independent evaluation for each model element
• e.g. ->select(title.startsWith('The '))
• Every Movie’s title can be checked independently, in parallel
• No need to evaluate forAll on every element
• Can stop if we find a counter-example which doesn’t satisfy the predicate
• Pipeline fusion (“vertical” instead of “horizontal” evaluation)
• Don’t need to perform intermediate steps for every single element
• Feed each element one by one through the processing pipeline
• Avoid unnecessary evaluations once we have the result
Epsilon Object Language (EOL)
• Powerful model-oriented language with imperative constructs
• Looks like OCL, works like Java (with reflection)
• No invalid / 4-valued logic etc. – just exceptions, objects and null
• Declarative collection operations
• Ability to invoke arbitrary Java code
• Global variables, cached operations, not purely functional
• ...and more
Concurrency: challenges & assumptions
• Need to capture state prior to parallel execution
• Any declared variables need to be accessible
• Side-effects need not be persisted
• Caches need to be thread-safe
• Through synchronization or atomicity
• Mutable engine internals (e.g. frame stack) are thread-local
• Intermediate variables’ scope is limited to each parallel “job”
• Thread-local execution trace for reporting exceptions
• No nested parallelism
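The thread-safety requirements above can be illustrated with a minimal Java sketch. It is not Epsilon's implementation: `EngineState`, `cachedLength`, `enterFrame`, `exitFrame` and `frameDepth` are hypothetical names chosen for illustration. It shows a cache made safe through atomic operations and a frame stack kept thread-local.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration; none of these names are taken from Epsilon.
final class EngineState {
    // Shared cache: safe for concurrent use because ConcurrentHashMap
    // performs computeIfAbsent atomically per key.
    private final Map<String, Integer> cache = new ConcurrentHashMap<>();

    // Mutable engine internals: every thread sees its own frame stack.
    private final ThreadLocal<Deque<String>> frameStack =
        ThreadLocal.withInitial(ArrayDeque::new);

    int cachedLength(String key) {
        return cache.computeIfAbsent(key, String::length);
    }

    void enterFrame(String name) { frameStack.get().push(name); }

    void exitFrame() { frameStack.get().pop(); }

    int frameDepth() { return frameStack.get().size(); }
}
```

A frame pushed by one thread is invisible to every other thread, so parallel jobs cannot corrupt each other's intermediate variables, while cached results remain shared.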
Example: parallelSelect operation
// One job per element; an empty Optional marks a rejected element
var jobs = new ArrayList<Callable<Optional<T>>>(source.size());
for (T element : source) {
    jobs.add(() -> {
        if (predicate.execute(element))
            return Optional.of(element);
        else return Optional.empty();
    });
}
// Keep only the elements whose job returned a value
context.executeParallel(jobs)
    .forEach(opt -> opt.ifPresent(results::add));
return results;
context.executeParallel (ordered)
EolThreadPoolExecutor executorService = getExecutorService();
List<Future<T>> futureResults = jobs.stream()
    .map(executorService::submit).collect(Collectors.toList());
List<T> actualResults = new ArrayList<>(futureResults.size());
for (Future<T> future : futureResults) {
    actualResults.add(future.get());
}
return actualResults;
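The two excerpts above depend on Epsilon internals (`context`, `EolThreadPoolExecutor`, `predicate.execute`). As a rough self-contained approximation, the same idea can be sketched with only the standard library. The class name `ParallelSelectSketch`, the thread pool sizing and the use of `java.util.function.Predicate` are assumptions made for this sketch, not the authors' code.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

// Hypothetical stand-alone approximation of a parallel select.
final class ParallelSelectSketch {
    static <T> List<T> parallelSelect(Collection<T> source, Predicate<T> predicate) {
        ExecutorService executor = Executors.newFixedThreadPool(
            Runtime.getRuntime().availableProcessors());
        try {
            // One job per element; an empty Optional marks a rejected element.
            List<Callable<Optional<T>>> jobs = new ArrayList<>(source.size());
            for (T element : source) {
                jobs.add(() -> predicate.test(element)
                    ? Optional.of(element) : Optional.<T>empty());
            }
            // Submit the jobs sequentially...
            List<Future<Optional<T>>> futures = new ArrayList<>(jobs.size());
            for (Callable<Optional<T>> job : jobs) {
                futures.add(executor.submit(job));
            }
            // ...and query the futures in the same order, so the result
            // order matches the source order regardless of scheduling.
            List<T> results = new ArrayList<>();
            for (Future<Optional<T>> future : futures) {
                try {
                    future.get().ifPresent(results::add);
                } catch (InterruptedException | ExecutionException e) {
                    throw new IllegalStateException(e);
                }
            }
            return results;
        } finally {
            executor.shutdown();
        }
    }
}
```

Collecting through an ordered list of `Future`s trades a little latency (the caller may wait on an early slow job) for a deterministic result order.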
Efficient operation chaining
• We can evaluate the pipeline one element at a time, rather than each operation for all elements
Movie.allInstances()
    ->select(year >= 1990)
    ->select(rating > 0.87)
    ->collect(persons)
    ->exists(name = 'Joe Pesci')
Efficient imperative implementation
for (Movie m : Movie.allInstances())
    if (m.year >= 1990 && m.rating > 0.87)
        for (Person p : m.persons)
            if ("Joe Pesci".equals(p.name))
                return true;
return false;
• Short-circuiting, no unnecessary intermediate collections!
• But what about those parallelisable for loops?
java.util.stream.* approach
• Data sources are “Streams”
• Basically fancy Iterables / generators
• Can be lazy / infinite
• Computations are fused into a pipeline
• Compute pipeline triggered on “terminal operations”
• Declarative, functional (thus inherently parallelisable)
• see java.util.Spliterator for how parallelisation is possible
• Let’s Get Lazy: Exploring the Real Power of Streams by Venkat Subramaniam
https://youtu.be/ekFPGD2g-ps
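To make the fusion and short-circuiting behaviour concrete, here is a small Java sketch of the "Joe Pesci" query written against `java.util.stream`. The `Movie` class is a hypothetical stand-in for the IMDb model type, and the `inspected` counter exists only to show how many elements the pipeline actually touches.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class LazyStreamSketch {
    // Hypothetical stand-in for the IMDb Movie type used on the slides.
    static final class Movie {
        final int year;
        final double rating;
        final List<String> persons;

        Movie(int year, double rating, String... persons) {
            this.year = year;
            this.rating = rating;
            this.persons = Arrays.asList(persons);
        }
    }

    // Counts how many movies the first filter stage inspects.
    static final AtomicInteger inspected = new AtomicInteger();

    static boolean featuresJoePesci(List<Movie> movies) {
        inspected.set(0);
        return movies.stream()                 // nothing runs yet: stages are lazy
            .filter(m -> {
                inspected.incrementAndGet();
                return m.year >= 1990;
            })
            .filter(m -> m.rating > 0.87)
            .flatMap(m -> m.persons.stream())  // flatten each movie's cast
            .anyMatch("Joe Pesci"::equals);    // terminal and short-circuiting
    }
}
```

Each movie flows through the whole pipeline before the next one is read, and `anyMatch` stops pulling elements once a match is found, so movies after the first hit are never filtered. Adding `.parallel()` after `.stream()` would request parallel execution of the same pipeline.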
Integrating Streams into Epsilon
• Need to convert EOL first-order syntax to native lambdas
• Requires automatic inference of appropriate functional type
• Changes to AST / parser
• Built-in types to handle common types, e.g. Predicates
• Runtime implementation of unknown / to be discovered interface
• Use java.lang.reflect.Proxy
• Exception handling / reporting
• Preventing nested parallelism
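The `java.lang.reflect.Proxy` point can be sketched as follows. This is an illustrative guess at the general technique, not Epsilon's code: `LambdaProxySketch` and `asFunctionalInterface` are hypothetical names, and the interpreter-level expression is modelled as a plain `Function<Object[], Object>`.

```java
import java.lang.reflect.Proxy;
import java.util.function.Function;

// Hypothetical sketch: implement an interface that is only known at runtime.
final class LambdaProxySketch {
    static <I> I asFunctionalInterface(Class<I> type, Function<Object[], Object> body) {
        Object proxy = Proxy.newProxyInstance(
            LambdaProxySketch.class.getClassLoader(),
            new Class<?>[] { type },
            (self, method, args) -> {
                // This sketch forwards only the single abstract method;
                // Object methods and default methods are not supported.
                if (method.getDeclaringClass() == Object.class || method.isDefault()) {
                    throw new UnsupportedOperationException(method.getName());
                }
                return body.apply(args);
            });
        return type.cast(proxy);
    }
}
```

For example, passing `java.util.function.IntPredicate.class` together with a body that inspects `args[0]` yields an object usable wherever an `IntPredicate` is expected, without the interface being known at compile time.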
Stream code in Epsilon
• Efficient execution semantics with EOL / OCL-compatible syntax
Movie.allInstances().stream() //.parallel()
    .filter(m | m.year >= 1990)
    .filter(m | m.rating > 0.87)
    .map(m | m.persons)
    .anyMatch(p | p.name = "Joe Pesci");
Counting elements
Movie.allInstances()
    ->select(m | m.year > 1980 and m.rating > 0.85)
    ->size() > 63
• Do we need to allocate a new collection?
• Is short-circuiting possible?
select(...)->size()
public <T> Integer count(Collection<T> source, Predicate<T> predicate) {
    var result = new AtomicInteger(0);
    for (T element : source) {
        executor.execute(() -> {
            if (predicate.test(element)) {
                result.incrementAndGet();
            }
        });
    }
    executor.awaitCompletion();
    return result.get();
}
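The excerpt above relies on an `executor` with an `awaitCompletion()` method, which is specific to Epsilon. A rough stand-alone equivalent using a standard `ExecutorService` might look like the sketch below; the class name `ParallelCountSketch`, the pool size and the one-minute timeout are arbitrary assumptions.

```java
import java.util.Collection;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

// Hypothetical stand-alone approximation of the parallel count operation.
final class ParallelCountSketch {
    static <T> int count(Collection<T> source, Predicate<T> predicate) {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        AtomicInteger result = new AtomicInteger();
        for (T element : source) {
            // No intermediate collection: each job only bumps a shared counter.
            executor.execute(() -> {
                if (predicate.test(element)) {
                    result.incrementAndGet();
                }
            });
        }
        executor.shutdown(); // accept no new jobs; submitted ones still run
        try {
            if (!executor.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("count timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result.get();
    }
}
```

Because the order of increments does not affect the final total, no ordering of jobs needs to be preserved here, unlike in a parallel select.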
count operation
Movie.allInstances()
    ->select(m | m.year > 1980 and m.rating > 0.85)
    ->size() > 63

Movie.allInstances()
    ->count(m | m.year > 1980 and m.rating > 0.85) > 63
select(...)->size() compared with N
• Want to know whether a collection has...
• at least (>=)
• exactly (==)
• at most (<=)
• ... N elements satisfying a predicate
• Short-circuiting opportunity!
• If not enough possible matches remaining
• If the required number of matches is exceeded
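The two short-circuiting conditions listed above can be written down as a small sequential Java sketch. The names `NMatchSketch`, `Mode`, `shouldShortCircuit` and `determineResult` are hypothetical here, and the extra "target reached" check for the at-least case is an assumption of this sketch rather than something stated on the slide.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sequential sketch of the nMatch short-circuiting logic.
final class NMatchSketch {
    enum Mode { AT_LEAST, EXACTLY, AT_MOST }

    static boolean shouldShortCircuit(Mode mode, int size, int target,
                                      int matches, int evaluated) {
        int remaining = size - evaluated;
        // Not enough possible matches remaining to ever reach the target.
        boolean unreachable = matches + remaining < target;
        // The required number of matches has been exceeded.
        boolean exceeded = matches > target;
        // Assumption: for "at least", reaching the target already decides it.
        boolean reached = mode == Mode.AT_LEAST && matches >= target;
        return unreachable || exceeded || reached;
    }

    static boolean determineResult(Mode mode, int matches, int target) {
        switch (mode) {
            case AT_LEAST: return matches >= target;
            case EXACTLY:  return matches == target;
            default:       return matches <= target;
        }
    }

    static <T> boolean nMatch(List<T> source, Predicate<T> predicate,
                              int target, Mode mode) {
        int matches = 0;
        int evaluated = 0;
        for (T element : source) {
            if (predicate.test(element)) {
                matches++;
            }
            evaluated++;
            if (shouldShortCircuit(mode, source.size(), target, matches, evaluated)) {
                break; // the outcome can no longer change
            }
        }
        return determineResult(mode, matches, target);
    }
}
```

Stopping early is safe in every mode because, once either condition holds, evaluating the remaining elements cannot change the answer `determineResult` gives for the counts seen so far.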
nMatch logic
int ssize = source.size();
var currentMatches = new AtomicInteger(0);
var evaluated = new AtomicInteger(0);
var jobResults = new ArrayList<Future<?>>(ssize);
for (T element : source) {
    jobResults.add(executor.submit(() -> {
        int cInt = predicate.testThrows(element) ?
                currentMatches.incrementAndGet() : currentMatches.get(),
            eInt = evaluated.incrementAndGet();
        if (shouldShortCircuit(ssize, targetMatches, cInt, eInt)) {
            executor.getExecutionStatus().signalCompletion();
        }
    }));
}
executor.shortCircuitCompletion(jobResults);
return determineResult(currentMatches.get(), targetMatches);
nMatch operation
Movie.allInstances()
    ->select(m | m.year > 1980 and m.rating > 0.85)
    ->size() > 63

Movie.allInstances()->atLeastNMatch(m |
    m.year > 1980 and m.rating > 0.85,
    64
)
Performance evaluation
• Comparing Eclipse OCL, EOL, Parallel EOL, Java Streams in EOL, Java Streams in Java
def: coActorsQuery() : Integer = Person.allInstances()
    ->select(a | a.movies->collect(persons)->flatten()->asSet()
        ->exists(co |
            co.name < a.name and a.movies->size() >= 3 and
            co.movies->excludingAll(a.movies)
                ->size() <= (co.movies->size() - 3)
        )
    )->size()
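For readers more familiar with Java, the query above could be approximated with Streams roughly as follows. This is not the benchmark code from the repository: `Movie`, `Person`, `cast` and `hasFrequentCoActor` are hypothetical, and the sketch rewrites the `excludingAll` size comparison as "shares at least 3 movies", which is equivalent only if a person's movie list contains no duplicates.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical Java Streams rendering of coActorsQuery.
final class CoActorsSketch {
    static final class Movie {
        final List<Person> persons = new ArrayList<>();
    }

    static final class Person {
        final String name;
        final List<Movie> movies = new ArrayList<>();
        Person(String name) { this.name = name; }
    }

    // Helper for building a tiny model: links a movie with its cast.
    static void cast(Movie movie, Person... people) {
        for (Person p : people) {
            movie.persons.add(p);
            p.movies.add(movie);
        }
    }

    static boolean hasFrequentCoActor(Person a) {
        if (a.movies.size() < 3) {
            return false; // hoisted: this conjunct does not depend on co
        }
        Set<Movie> own = new HashSet<>(a.movies);
        return a.movies.stream()
            .flatMap(m -> m.persons.stream())
            .distinct()                                 // ->asSet()
            .filter(co -> co.name.compareTo(a.name) < 0)
            .anyMatch(co ->                             // short-circuiting exists
                co.movies.stream().filter(own::contains).count() >= 3);
    }

    static long coActorsQuery(List<Person> persons, boolean parallel) {
        return (parallel ? persons.parallelStream() : persons.stream())
            .filter(CoActorsSketch::hasFrequentCoActor)
            .count();
    }
}
```

The outer `filter` is independent per person, so the only difference between the sequential and parallel variants is the choice of `stream()` or `parallelStream()`.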
github.com/epsilonlabs/parallel-erl
Test system
• AMD Threadripper 1950X @ 3.6 GHz 16-core CPU
• 32 (4x8) GB DDR4-3003 MHz RAM
• Fedora 29 OS (Linux kernel 5.1)
• OpenJDK 11.0.3 Server VM
• Samsung 960 EVO 250 GB M.2 NVMe SSD
3.53 million model elements
[Chart: execution time ([h]:mm:ss); annotated speedups: 36x, 30x, ~2.25x, 325x]
Speedup over model size (EOL)
[Chart: speedup against model size]
Thread scalability – 3.53m elements
[Chart: execution time ([h]:mm:ss) against number of threads]
Equivalence testing
• EUnit – JUnit-style tests for Epsilon
• Testing of all operations, with corner cases
• Testing semantics (e.g. short-circuiting) as well as results
• Equivalence test of sequential and parallel operations
• Equivalence testing of Streams and builtin operations
• Testing of scope capture, operation calls, exception handling etc.
• Repeated many times with no failures
Related Works
• Lazy Evaluation for OCL (Tisi, 2015)
• On-demand, iterator-based collection operations for OCL
• Runtime Model Validation with Parallel OCL (Vajk, 2011)
• CSP formalisation of OCL with code generation for C#
• Towards Scalable Querying of Large-Scale Models (Kolovos, 2013)
• Introduces laziness and native queries for relational database models
• Automated Analysis and Suboptimal Code Detection (Wei, 2014)
• Static analysis of EOL, recognition of inefficient code
• Provides recommendations for more optimal expressions
Future Work
• Parallelism and laziness at the modelling level
• e.g. Stream-based APIs rather than Collections
• Automated replacement of suboptimal code
• Re-write AST at parse-time if safe to do so
• Use static analysis for detection – see Wei and Kolovos, 2014
• Automated refactoring of imperative code with Streams
• Native parallel + lazy code generation from OCL / EOL
• Best possible performance
• User-friendly and compatible with existing programs
Summary
• OCL performance is very far from its potential
• Both due to implementations and restrictive specification
• Short-circuiting and lazy evaluation make a huge difference
• Relatively easy to implement
• May require some static analysis for more advanced optimisations
• Parallelism provides further benefits when combined with laziness
• Short-circuiting more challenging in parallel but still viable
• Interpreted code is significantly slower than compiled code, even with parallelism
• Most performant solution is native parallel code

Editor's Notes

  1. Where we are now with (unoptimized) OCL both in terms of spec and implementation vs. where we could be if we cared about performance
  2. Spend no more than 30 seconds here
  3. Spend no more than 20 seconds here
  4. See previous slide, but don’t spend too long as more examples will come later
  5. Only an overview here – more detail in upcoming slides
  6. Emphasize the lack of strict / performance-limiting specification. More Java-like nature allows for optimisations as well as more expressive programs without hampering usability
  7. Be careful not to spend too much time here – say that it’s already covered in previous works. Only an overview
  8. State that this is covered in a previous paper. Main thing is that order is deterministic even in parallel because we’re using Futures.
  9. Note the List: ordering is guaranteed because jobs are submitted sequentially (and queried sequentially)
  10. Note that parallelisation is non-trivial here due to nesting
  11. Divide & Conquer (Fork-Join) approach to parallelisation
  12. Again note that the parallelism of Streams is different to how we implement our parallel operations
  13. Emphasize that this is offered as an efficient alternative – if eager evaluation is required, use the standard (i.e. OCL-like) built-in operations
  14. Point out the select and short-circuiting exists – no need to go into detail on what the query actually means. Also acknowledge Atenea (UMA) for the inspiration
  15. Emphasize the huge disparity between OCL and native Java: a sequential Stream in Java is 36x faster than interpreted OCL! A parallel Stream in Java is 9x faster than a sequential Stream. Parallel EOL (whether using Streams or built-in operations) is 30x faster than interpreted OCL, though still slower than single-threaded Java code
  16. Hand-coded operations beat Streams every time, and substantially so for larger models! Speedup still substantial even for smaller models. Parallel EOL sees small but linear improvements with model size
  17. Drop-off in efficiency after 4 threads is due to a memory bandwidth bottleneck. With 32 threads, a script taking >1 hour now takes <5 mins