The growing size of software models poses significant scalability challenges. Amongst these scalability issues is the execution time of queries and transformations. Although the processing pipeline for models may involve numerous stages such as validation, transformation and code generation, many of these complex processes are (or can be) expressed by a combination of simpler and more fundamental operations. In many cases, these underlying operations are pure functions, making them amenable to parallelisation. We present parallel execution algorithms for a range of iteration-based operations in the context of the OCL-inspired Epsilon Object Language. Our experiments show a significant improvement in the performance of queries on large models.
Parallel First-Order Operations
1. Parallel first-order operations
Sina Madani, Dimitris Kolovos, Richard Paige
{sm1748, dimitris.kolovos, richard.paige}@york.ac.uk
Enterprise Systems, Department of Computer Science
OCL 2018, Copenhagen
2. Outline
• Background and related work
• Epsilon Object Language (EOL)
• Parallelisation challenges and solutions
• Performance evaluation
• Future work
• Questions
3. Motivation
• Scalability is an active research area in model-driven engineering
• Collaboration and versioning
• Persistence and distribution
• Continuous event processing
• Queries and transformations
• Very large models / datasets common in complex industrial projects
• First-order operations frequently used in model management tasks
• Diminishing single-thread performance, increasing number of cores
• Vast majority of operations on collections are pure functions
• i.e. inherently thread-safe and parallelisable
4. Related Work
• Parallel ATL (Tisi et al., 2013)
• Task-parallel approach to model transformation
• Parallel OCL (Vajk et al., 2011)
• Automated parallel code generation based on CSP and C#
• Lazy OCL (Tisi et al., 2015)
• Iterator-based lazy evaluation of expressions on collections
• Parallel Streams (Java 8+, 2013)
• Rich and powerful API for general queries and transformations
• Combines lazy semantics with divide-and-conquer parallelism
5. Epsilon Object Language (EOL)
• Powerful imperative programming constructs
• Independent of underlying modelling technology
• Interpreted, model-oriented Java + OCL-like language
• Base query language of Epsilon
• Global variables
• Cached operations
• ...and more
6. General challenges / assumptions
• Need to capture state prior to parallel execution
• e.g. Any declared variables need to be accessible
• Side-effects need not be persisted
• e.g. through operation invocations
• Operations should not depend on mutable global state
• Caches need to be thread-safe
• Through synchronization or atomicity
• Mutable engine internals (e.g. frame stack) are thread-local
• Intermediate variables’ scope is limited to each parallel “job”
• No nested parallelism
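The requirement that caches be thread-safe "through synchronization or atomicity" can be illustrated with a minimal sketch. It assumes a cached operation can be modelled as a function from one argument to a result; the class name `CachedOperationSketch` and its `invoke` method are hypothetical and not part of Epsilon.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical model of a cached operation (not Epsilon's API).
final class CachedOperationSketch<K, V> {
    // A concurrent map, so parallel jobs can share the cache safely
    private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> operation;

    CachedOperationSketch(Function<K, V> operation) {
        this.operation = operation;
    }

    V invoke(K argument) {
        // computeIfAbsent is atomic per key: the operation runs at most once
        // for a given argument, even when many threads ask at the same time
        return cache.computeIfAbsent(argument, operation);
    }
}
```

One caveat of this design: the function passed to `computeIfAbsent` must not itself update the same map, so a recursive cached operation would need a different scheme.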
7. Collection<T> select (Expression<Boolean> predicate, Collection<T> source)
• Filters the collection based on a predicate applied to each element
var jobs = new ArrayList<Callable<Optional<T>>>(source.size());
for (T element : source) {
    jobs.add(() -> {
        if (predicate.execute(element))
            return Optional.of(element);
        else return Optional.empty();
    });
}
context.executeParallel(jobs).forEach(opt -> opt.ifPresent(results::add));
return results;
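The slide code relies on engine internals (`context.executeParallel`, `predicate.execute`). As a self-contained approximation, the sketch below assumes `ExecutorService.invokeAll` as a stand-in for `executeParallel` and a plain `java.util.function.Predicate` for the expression. The class name is hypothetical, and elements are assumed to be non-null.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

final class ParallelSelectSketch {

    static <T> List<T> select(Collection<T> source, Predicate<T> predicate) {
        ExecutorService pool = Executors.newFixedThreadPool(
            Runtime.getRuntime().availableProcessors());
        try {
            // One job per element; Optional marks whether the element matched
            List<Callable<Optional<T>>> jobs = new ArrayList<>(source.size());
            for (T element : source) {
                jobs.add(() -> predicate.test(element)
                    ? Optional.of(element) : Optional.<T>empty());
            }
            List<T> results = new ArrayList<>();
            // invokeAll returns futures in the same order as the jobs list,
            // so the source ordering is preserved in the result
            for (Future<Optional<T>> future : pool.invokeAll(jobs)) {
                future.get().ifPresent(results::add);
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(select(List.of(1, 2, 3, 4, 5, 6), n -> n % 2 == 0));
    }
}
```

Creating a pool per call keeps the sketch short; a real engine would presumably reuse one executor across operations.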
9. T selectOne (Expression<Boolean> predicate, Collection<T> source)
• Finds any* element matching the predicate
• Same as select, except with short-circuiting
for (T element : source) {
    jobs.add(() -> {
        if (predicate.execute(element))
            context.completeShortCircuit(Optional.of(element));
    });
}
Optional<T> result = context.awaitShortCircuit(jobs);
hasResult = result != null;
if (hasResult) return result.get();
10. context.shortCircuit
• ExecutionStatus object used for signalling completion
• “AwaitCompletion” thread waits for completion of jobs
• Also checks whether the completion status has been signalled
• Main thread waits for the ExecutionStatus to be signalled
• Call to context.completeShortCircuit() signals the ExecutionStatus
• “AwaitCompletion” terminates upon interruption
• After control returns to main thread, remaining jobs are cancelled
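The signalling scheme above can be approximated without engine internals. The sketch below is not the ExecutionStatus / "AwaitCompletion" implementation described on the slide: it swaps in a `CompletableFuture` as the status object and an atomic countdown in place of the separate waiting thread. The class and method names are hypothetical, and elements are assumed to be non-null.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

final class ShortCircuitSketch {

    static <T> Optional<T> selectOne(Collection<T> source, Predicate<T> predicate) {
        if (source.isEmpty()) return Optional.empty();
        ExecutorService pool = Executors.newFixedThreadPool(
            Runtime.getRuntime().availableProcessors());
        // Plays the role of the status object: completed by the first match,
        // or with an empty result once every job has finished without one
        CompletableFuture<Optional<T>> status = new CompletableFuture<>();
        AtomicInteger remaining = new AtomicInteger(source.size());
        List<Future<?>> futures = new ArrayList<>(source.size());
        for (T element : source) {
            futures.add(pool.submit(() -> {
                try {
                    if (predicate.test(element)) status.complete(Optional.of(element));
                    if (remaining.decrementAndGet() == 0) status.complete(Optional.empty());
                } catch (RuntimeException e) {
                    status.completeExceptionally(e); // stop waiting on the first failure
                }
            }));
        }
        try {
            return status.join(); // the calling thread blocks until signalled
        } finally {
            futures.forEach(f -> f.cancel(true)); // cancel whatever is still pending
            pool.shutdownNow();
        }
    }
}
```

Because each job signals a match before decrementing the countdown, the empty result can only win when no job matched. Later calls to `complete` on an already completed future are ignored.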
11. Boolean nMatch (Expression<Boolean> predicate, int n, Collection<T> source)
• Returns true iff the collection contains exactly n elements satisfying the predicate
AtomicInteger matches = new AtomicInteger(), evaluated = new AtomicInteger();
for (T element : source) {
    jobs.add(() -> {
        int evaluatedInt = evaluated.incrementAndGet();
        if (predicate.execute(element) && (matches.incrementAndGet() > n ||
                sourceSize - evaluatedInt < n - matches.get())) {
            context.completeShortCircuit();
        }
    });
}
return matches.get() == n;
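The two short-circuit conditions (too many matches, or too few elements left to reach n) can be tried out in isolation. The sketch below is one possible reading, not the slide's implementation: it assumes a shared `decided` flag that later jobs check before evaluating, in place of `context.completeShortCircuit()`, and it tests the "too few left" condition after every evaluation. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

final class NMatchSketch {

    static <T> boolean nMatch(Collection<T> source, Predicate<T> predicate, int n) {
        int sourceSize = source.size();
        AtomicInteger matches = new AtomicInteger(), evaluated = new AtomicInteger();
        AtomicBoolean decided = new AtomicBoolean(); // true once the answer must be false
        List<Callable<Void>> jobs = new ArrayList<>(sourceSize);
        for (T element : source) {
            jobs.add(() -> {
                if (decided.get()) return null; // outcome already known: skip the work
                if (predicate.test(element)) matches.incrementAndGet();
                // Count this job as evaluated only after recording its match, then
                // read the match count, so the bound below never underestimates
                int remaining = sourceSize - evaluated.incrementAndGet();
                int m = matches.get();
                if (m > n || m + remaining < n) decided.set(true);
                return null;
            });
        }
        ExecutorService pool = Executors.newFixedThreadPool(
            Runtime.getRuntime().availableProcessors());
        try {
            for (Future<Void> future : pool.invokeAll(jobs)) future.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return !decided.get() && matches.get() == n;
    }
}
```

Skipping jobs once the flag is set saves predicate evaluations but does not return control early, which is what the signalling mechanism in the slides provides.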
12. Boolean exists (Expression<Boolean> predicate, Collection<T> source)
• Returns true if any element matches the predicate
• Same as selectOne, but returns a Boolean
var selectOne = new ParallelSelectOneOperation();
selectOne.execute(source, predicateExpression);
return selectOne.hasResult();
13. Boolean forAll (Expression<Boolean> predicate, Collection<T> source)
• Returns true iff all elements match the predicate
• Delegate to nMatch to benefit from short-circuiting
var nMatch = new ParallelNMatchOperation(source.size());
return nMatch.execute(source, predicateExpression);
• Alternatively, delegate to exists with inverted predicate
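The alternative delegation rests on the identity forAll(p) = not exists(not p). A minimal sketch, assuming Java parallel streams as a stand-in for the parallel exists operation (`anyMatch` is a short-circuiting terminal operation); the class name is hypothetical.

```java
import java.util.Collection;
import java.util.function.Predicate;

final class ForAllSketch {

    // Stand-in for the parallel exists operation
    static <T> boolean exists(Collection<T> source, Predicate<T> predicate) {
        return source.parallelStream().anyMatch(predicate);
    }

    // forAll holds iff no element satisfies the inverted predicate
    static <T> boolean forAll(Collection<T> source, Predicate<T> predicate) {
        return !exists(source, predicate.negate());
    }
}
```

With this formulation a single counterexample is enough to stop the search, and an empty collection yields true.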
14. Collection<R> collect (Expression<R> mapFunction, Collection<T> source)
• Transforms each element T into R, returning the result collection
• Computationally similar to select, but simpler
• No wrapper required, since we’re performing a one-to-one mapping
var jobs = new ArrayList<Callable<R>>(source.size());
for (T element : source) {
    jobs.add(() -> mapFunction.execute(element));
}
context.executeParallel(jobs).forEach(results::add);
return results;
15. List<T> sortBy (Expression<Comparable<?>> property, Collection<T> source)
• Sorts the collection according to the derived Comparable
• Maps each element to a Comparable using collect
• Sorts the derived collection based on the Comparator property of each derived element
• Sorting can be parallelised using java.util.Arrays.parallelSort
• Divide-and-conquer approach, sequential threshold = 8192 elements
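The two phases (derive a Comparable per element, then sort on it) might look as follows. This is a sketch under assumptions: the parallel collect step is approximated with a parallel stream, the property expression with a `java.util.function.Function`, and the class name is hypothetical. Derived keys are assumed to be non-null.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class SortBySketch {

    static <T, K extends Comparable<? super K>> List<T> sortBy(
            Collection<T> source, Function<T, K> property) {
        // Phase 1 (collect): pair each element with its derived key, in parallel
        @SuppressWarnings("unchecked")
        Map.Entry<K, T>[] keyed = source.parallelStream()
            .map(element -> new SimpleEntry<>(property.apply(element), element))
            .toArray(Map.Entry[]::new);
        // Phase 2: divide-and-conquer sort on the derived keys
        Arrays.parallelSort(keyed, Map.Entry.comparingByKey());
        // Unwrap the original elements in sorted order
        List<T> sorted = new ArrayList<>(keyed.length);
        for (Map.Entry<K, T> entry : keyed) {
            sorted.add(entry.getValue());
        }
        return sorted;
    }
}
```

Deriving the keys once up front means the property expression is evaluated once per element instead of on every comparison.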
16. Map<K, Collection<T>> mapBy (Expression<K> keyExpr, Collection<T> source)
• Groups elements based on the derived key expression
var jobs = new ArrayList<Callable<Map.Entry<K, T>>>(source.size());
for (T element : source) {
    jobs.add(() -> {
        K result = keyExpr.execute(element);
        return new SimpleEntry<>(result, element);
    });
}
Collection<Map.Entry<K, T>> intermediates = context.executeParallel(jobs);
Map<K, Sequence<T>> result = mergeByKey(intermediates);
return result;
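The merge step is not shown on the slide. One plausible shape for it, assuming the Collectors API and a plain `List` in place of Epsilon's `Sequence`, is sketched below; the class name is hypothetical, the body is a guess at the idea and not the authors' code, and keys are assumed to be non-null.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class MergeByKeySketch {

    // Groups the (key, element) pairs produced by the parallel jobs by key.
    // LinkedHashMap keeps keys in first-seen order; each group keeps the
    // order in which its elements appear in the intermediate collection.
    static <K, T> Map<K, List<T>> mergeByKey(Collection<Map.Entry<K, T>> intermediates) {
        return intermediates.stream().collect(Collectors.groupingBy(
            Map.Entry::getKey,
            LinkedHashMap::new,
            Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
    }
}
```

Only the key derivation runs in parallel in this arrangement. The merge is a sequential pass over the intermediate pairs, which avoids concurrent writes to the result map.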
17. Testing for correctness
• EUnit – JUnit-style tests for Epsilon
• Testing of all operations, with corner cases
• Equivalence test of sequential and parallel operations
• Testing of scope capture, operation calls, exception handling etc.
• Repeated many times with no failures
19. Performance evaluation
• Execution time on X axis
• Speedup indicated on data points (higher is better)
• Number of threads indicated in parentheses on Y axis
• All tests performed on following system:
• AMD Threadripper 1950X (16 core / 32 threads)
• 32 GB (4 x 8GB) DDR4-3000MHz RAM
• Oracle JDK 11 HotSpot VM
• Fedora 28 OS
23. Future Work
• closure
• aggregate and iterate
• Identify bottlenecks to improve performance
• Combine with lazy solution
• More comprehensive performance evaluation
• Test all operations
• Compare with Eclipse OCL
• More varied and complex models / queries
24. Questions?
sm1748@york.ac.uk
eclipse.org/epsilon
• Data-parallelisation of first-order operations on collections
• Short-circuiting operations more complex to deal with
• Stateful operations, such as mapBy, require different approach
• Significant performance improvement with more cores
• Open-source
github.com/epsilonlabs/parallel-erl
25. Thread-local base delegation example
• Can be used to solve variable scoping
• Each thread has its own frame stack (used for storing variables)
• Each thread-local frame stack has a reference to the main thread’s frame stack
• If a variable in the thread-local frame stack can’t be found, look in the main thread frame stack
• Main thread frame stack should be thread-safe, but thread-local frame stacks needn’t be
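The delegation scheme can be sketched with a much-simplified frame stack. The sketch assumes a flat name-to-value map with no nested frames; the class `DelegatingFrameStack` and the helper `readOnWorker` are hypothetical and do not reflect Epsilon's actual FrameStack API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, flattened model of a frame stack (no nested frames).
final class DelegatingFrameStack {
    private final Map<String, Object> variables;
    private final DelegatingFrameStack base; // main thread's stack, or null

    // Main-thread stack: read by all workers, so backed by a concurrent map
    DelegatingFrameStack() {
        this.variables = new ConcurrentHashMap<>();
        this.base = null;
    }

    // Thread-local stack: touched only by its owner, so a plain HashMap suffices
    DelegatingFrameStack(DelegatingFrameStack base) {
        this.variables = new HashMap<>();
        this.base = base;
    }

    void put(String name, Object value) {
        variables.put(name, value);
    }

    Object get(String name) {
        Object value = variables.get(name);
        // Not found locally: fall back to the main thread's stack
        if (value == null && base != null) return base.get(name);
        return value;
    }

    // Looks up a variable from a worker thread through a thread-local stack
    static Object readOnWorker(DelegatingFrameStack main, String name) {
        ThreadLocal<DelegatingFrameStack> local =
            ThreadLocal.withInitial(() -> new DelegatingFrameStack(main));
        return CompletableFuture.supplyAsync(() -> local.get().get(name)).join();
    }
}
```

Variables declared inside a parallel job land in the thread-local stack and shadow the main one without modifying it, which matches the earlier assumption that each job's intermediate variables stay scoped to that job.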
26. Control Flow Traceability
• Different parts of the program could be executing simultaneously
• Need execution trace for all threads
• Solution:
• Each thread has its own execution controller
• Record the trace when exception occurs
• Parallel execution terminates when any thread encounters an exception
Editor's Notes
Spend no more than 30 seconds here
Performance issues arise with very large models.
Lazy evaluation can be performed on collections using iterators, which improves performance when chaining operations.
EVL is a hybrid language. It provides declarative structure like OCL but has general-purpose programming constructs.
Lazy initialisation of data structures like caches can also be a problem.
Similarities to “Effectively final” concept in Java lambdas and streams
Note the List: ordering is guaranteed because jobs are submitted sequentially and their results are queried sequentially
*Any = not necessarily first
Short-circuit if not enough or too many matches
No new/explicitly parallel implementation needed
mergeByKey code uses Collectors API (omitted for brevity, but should be obvious what the idea is)
SMT (simultaneous multithreading) improves the speedup by only about +1x
Approximately 2+ hours down to about 10 minutes!