DOPPL
Data Oriented Parallel Programming Language
Development Diary
Introduction
Diego PERINI
Department of Computer Engineering
Istanbul Technical University, Turkey
2013-04-08
Abstract
This paper is the very first development diary entry for Doppl, a new programming language that aims to provide a natural syntax for implementing parallel algorithms, designing data structures for shared memory applications and automating message passing among multiple tasks. The development lifecycle of the language is planned as consecutive iterations that will be documented separately. Any proposition about language terminology or lexical assets declared in one iteration may be subject to change in subsequent diary entries.
1. Introduction
Serial code compilation, whether from a high level language or from virtual machine bytecode, is simply a translation of syntactically valid declarations and instructions into the operations of a metaphorical Turing Machine. Such a computer represents the main philosophy on which current programming languages are designed, and it has proven very efficient and complete as a description of what computation is.
Current computers perform the same operations, albeit in a slightly improved fashion, with multiple needles working on a single magnetic tape. Although what happens underneath still remains the same, the design procedure performed by the programmer has become outdated: current multitasking approaches are workarounds that wrap the serial coding paradigm in easy to use function, class or subroutine libraries. While their efficiency and their contribution to current advancements in information technology are undeniable, multitasked programming requires a more abstract model to represent simultaneous machine instruction executions.
The remainder of this paper is set out as follows. Section 2 provides a literature review of programming languages that benefit from a parallel programming style. Section 3 outlines the hypothesis behind Doppl, and Section 4 describes the method used to address the research question. Finally, Section 5 discusses limitations, identifies subjects for future research and concludes the paper.
2. Literature Review
2.1 Declarative Programming Languages
By definition, declarative programming focuses on what computation should be performed instead of how it should be computed. Such an approach usually forces the programmer to abandon an imperative coding style, since declarations themselves are expressive enough to mimic looped structures as well as consecutive instruction calls.
Functional programming is a subset of declarative programming that is built on lambda calculus. It intrinsically rejects function side effects and guarantees immutable data structures, as one can only define a binding in terms of a function. Overwriting a binding in functional programming languages is automatically denied and often generates a compiler or interpreter error. Immutable data structures do not respond to in-place changes; therefore values can only be altered through copies and new bindings, which makes these languages free of side effects. Subroutines without side effects can easily be optimized and deforested thanks to their ability to produce the same result for the same inputs. Parallel optimizations are likewise easy to integrate into this kind of language. Haskell, Clojure and Scheme are some of the leading functional programming languages.
Domain specific languages such as Make, SQL and regular expressions are also counted as declarative languages. Their declarations often consist of state transition rules and constraint definitions, which are used to define synchronized program executions or element filtering on block data. Provided their underlying datasets never change, these languages can be counted as side effect free as well. Furthermore, as long as execution dependencies permit processes to operate without barriers, these languages compile or are interpreted into highly parallelized processor instructions.
2.2 State Machines
Communication, barriers and wait locks in parallel computation often correspond to state changes, or can be encapsulated into metaphorical states, thus allowing the programmer to express the computed algorithms in terms of finite state machines. Data decomposition, message exchange and reduced collections become states of the operated data and the operating processes, where state variables and outputs form the conditions for state changes.
A Moore machine is a type of finite state machine whose output is calculated using solely the state variables. State transition tables of Moore machines associate the output of a node with directed edges ending at another node (possibly the node itself). Such a property allows the programmer to model imperative behavior using deterministic Moore tables, as these tables become roadmaps for imperative function calls.
A Mealy machine is a type of finite state machine whose output is calculated using both state and input variables. State transition tables of these machines associate a tuple of output and state variables with directed edges ending at another node. Such a property allows the programmer to model functional behavior using Mealy tables, as these tables become roadmaps for nested and recursive function calls.
2.3 Data Oriented Programming Paradigm
The high demands of cache optimization and the multilevel cache mechanisms of multicore processors conflict with the object oriented approach for computations that operate on large arrays of objects with several properties.
Assume a block of memory full of allocated objects of a type T that has attributes x, y and z. Any computation on solely the xs of all objects of T requires all x, y and z values to be brought into the memory cache, since in object oriented languages objects encapsulate these attributes in a sticky manner. This kind of fetching culminates in a high number of cache misses due to cache overflow caused by unnecessary fetch operations for the non required y and z attributes.
Data oriented programming suggests that object arrays should be expressed as structures of attribute arrays instead of arrays of attribute structures, with the side effect that these objects become natively singleton. Object instantiation in such an approach results in multiple value append operations on the x, y and z arrays instead of the allocation of a new (x, y, z) tuple. Trading object references for pointers to an array index grants the processor the ability to fetch only the required attribute arrays into the cache, culminating in a relatively high number of cache hits, which is a dramatic performance improvement for real time, clustered, parallel algorithms. 3D computer graphics shaders are often applied to pixels or vertices using these types of data structures to satisfy real time rendering constraints.
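The two layouts can be sketched in C (an illustration only; the type and function names are made up). Summing only the x attribute in the array-of-structures layout drags the unused y and z values through the cache, while the structure-of-arrays layout streams one dense array:

```c
#include <assert.h>
#include <stddef.h>

#define N 1024

/* Array of structures: x, y and z are interleaved in memory, so a loop
   over x also pulls y and z into every fetched cache line. */
struct aos { float x, y, z; };

/* Structure of arrays: each attribute lives in its own dense array, so
   a loop over x touches only x's cache lines. */
struct soa { float x[N], y[N], z[N]; };

float sum_x_aos(const struct aos *objs, size_t n) {
    float total = 0.0f;
    for (size_t i = 0; i < n; i++)
        total += objs[i].x;   /* stride of sizeof(struct aos) bytes */
    return total;
}

float sum_x_soa(const struct soa *objs, size_t n) {
    float total = 0.0f;
    for (size_t i = 0; i < n; i++)
        total += objs->x[i];  /* stride of sizeof(float) bytes */
    return total;
}
```

Both functions compute the same result; only the memory traffic differs, which is the whole point of the data oriented layout.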
3. Hypothesis
Multitasked programming in currently preferred general purpose languages relies heavily on callback design, automated clustering via hardware accelerated computing (e.g. Microsoft HLSL and the computer graphics pipeline), asynchronous event polling, predefined signals among tasks and stateful objects or protocols. The targeted methodology for Doppl use case scenarios abstracts these topics away and encourages the programmer to use an uncomplicated Doppl syntax to achieve the same effects as these patterns, on the assumption that any parallel behavior or computation remains implementable when freed from these burdens.
Doppl asserts that the customary definitions of processes and threads no longer match the current purpose of these tools. Despite their differences in terms of operating system implementations, both tools provide the same functionality through different software interfaces and system calls, giving Doppl a chance to merge the two into a single unit under a widely accepted figure of speech: a task.
A task is a specialized computation agent that can be cloned, forked and distributed over a number of processors or cores. Instances of the same task may work on shared, private or composite data, with the restriction that all instances apply the same logic. Such a limitation forces the programmer to design different tasks for different kinds of computation, which in fact allows the designer to model program logic as pipelined MIMD (Multiple Instruction, Multiple Data) flow charts free of language constraints and additional utility concerns. Since different operating systems handle task concurrency differently at the hardware level, the first iteration of Doppl development does not adopt threads or processes as task baselines, leaving the discussion to further diary entries.
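One way to picture this task model in today's terms (purely illustrative, since Doppl prescribes no thread or process baseline) is several clones of one worker, each applying the same logic to its own slice of shared data. A C sketch with POSIX threads, using hypothetical names:

```c
#include <pthread.h>

#define INSTANCES 4
#define N 16

/* Shared data, partitioned among clones of the same "task". */
static int data[N];
static long partial[INSTANCES];

/* Every instance runs identical logic on a distinct slice, mirroring
   the restriction that clones of a task must apply the same logic. */
static void *task_instance(void *arg) {
    long id = (long)arg;
    long sum = 0;
    int chunk = N / INSTANCES;
    for (int i = (int)id * chunk; i < ((int)id + 1) * chunk; i++)
        sum += data[i];
    partial[id] = sum;
    return NULL;
}

/* Clone the task, wait for all instances, reduce their results. */
long run_tasks(void) {
    pthread_t tid[INSTANCES];
    for (int i = 0; i < N; i++) data[i] = i;
    for (long i = 0; i < INSTANCES; i++)
        pthread_create(&tid[i], NULL, task_instance, (void *)i);
    long total = 0;
    for (long i = 0; i < INSTANCES; i++) {
        pthread_join(tid[i], NULL);
        total += partial[i];
    }
    return total;  /* sum of 0..15 */
}
```

The boilerplate of creation, partitioning and joining is exactly what a task unit would hide behind the language syntax.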
4. Research Method
Doppl is planned to be a compiled language that executes real machine instructions instead of virtual ones. The main reason behind this decision is the data oriented design and the targeted, relatively high cache hit ratio. Since a Doppl program is assumed to be highly parallel, a multicore environment can only be exploited once the software is able to interact with the environment directly. Even though JIT compilation of bytecode languages is no longer considered slow, abstracting each processor architecture behind a virtual machine hinders the design of a generic, data oriented memory organization template.
The first iteration of Doppl does not cite any compilation tool for the language; however, a cross language compiler is likely to be implemented in future iterations. The target compiled language is planned to be C, due to its ability to be executed on almost any type of architecture. The GNU C Compiler (gcc) is the preferred C compiler that will be used to create the final executables.
5. Limitations, Future Research and Conclusion
State machines are able to simulate loops via circular, recurring transition paths. Since the Doppl ecosystem is formed by states, synchronization points, stateless operations and transition rules (conditionals), a loop snippet or block structure for creating loops will not be covered by the language standard.
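To illustrate the equivalence (in C rather than Doppl syntax, since none is fixed yet), an ordinary counted loop can be restated as a machine with one looping state: a transition rule routes control back to the same state until its condition fails, so no dedicated loop construct is needed.

```c
#include <assert.h>

/* Sum of 0..n-1, written as a state machine: the label `loop` is the
   state, the `if` is the transition rule, and the `goto` is the
   circular, recurring transition path back to the same state. */
int sum_to(int n) {
    int i = 0, acc = 0;
loop:                        /* state: LOOP */
    if (i < n) {             /* transition rule (conditional) */
        acc += i;            /* stateless operation on the data */
        i += 1;
        goto loop;           /* circular transition to self */
    }
    return acc;              /* implicit HALT state */
}
```

In Doppl the same shape would be expressed declaratively as a transition table entry pointing back at its own state.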
Task members will be typed statically and allocated in a way that guarantees a data oriented cache formation. Their evaluation, however, can be lazy if the required operations are compute heavy. Lazy operations will never cause non-deterministic results.
The regular functions of most common programming languages will be available via member traits of tasks, which are closures of functions encapsulated with the relevant member data and accessed via a language operator. The language standard is planned to provide per type traits for default, common operations. User defined traits will also be supported and can be implemented as distinct Doppl tasks by programmers themselves.
Data hiding will only be applied among tasks. Access modifiers will therefore only indicate whether tasks share their members or not. Shared members will always be available in shared memory for high parallelization.
Predefined types, custom types, immutable members, code imports, source code encoding and dynamic data allocation/binding are designated as future research subjects.