Page 26 - Krész, Miklós, and Andrej Brodnik (eds.). MATCOS-13. Proceedings of the 2013 Mini-Conference on Applied Theoretical Computer Science. Koper: University of Primorska Press, 2016.
P. 26
the most common use case, the system accepts a script Each type module can also define a parser hook for con-
written in our domain-specific language. This script de- stants. While looking for constants, the lexer gives full con-
scribes the data-flow graph (nodes and connections) using trol to such hooks. This means that the syntax of constant
standard function calls (this could be achieved in any other values can be completely arbitrary. One could even include
language if the system was converted into a library). an interpreter for another programming language and use
that to generate a constant by running a program.
First, the system loads and initialises I/O and operation
modules. This is followed by initialisation of the compiler, 3.2 Grammar
which also loads the type modules. Then the script is com-
piled into machine code in memory and executed. The script The grammar for our language in BNF [1]:
forms the data-flow graph to be executed in parallel. The
graph is executed after script execution finishes. identifierexpr ::= identifier | identifier ‘(’ expression * ‘)’
The main advantage of having external modules for I/O, constexpr ::= const
operations, and types is greater flexibility, as the main pro-
gram doesn’t need to be rewritten or recompiled to support parenexpr ::= ‘(’ expression ‘)’
new file formats, operations, and data types.
ifexpr ::= ‘if’ expression ‘then’ expression
The subsequent sections of this paper contain more detailed ‘else’ expression
explanations of the various subsystems.
forexpr ::= ‘for’ type identifier ‘=’ expression ‘,’
3. OUR DOMAIN-SPECIFIC LANGUAGE expression [‘,’ expression ] ‘do’ expression
The main purpose of the language we created is to make withexpr ::= ‘with’ type identifier [‘=’ expression ]
developing new graph operations easier, especially for people (‘,’ type identifier [‘=’ expression ])* ‘do’
who aren’t professional programmers. Its secondary purpose expression
is to provide a language in which users can describe the
data-flow graph, however, this functionality could easily be primary ::= identifierexpr
accomplished using an existing language. | constexpr
| parenexpr
Our language is functional, strongly typed and compiled into | ifexpr
machine code for greater performance. | forexpr
| withexpr
Optimisations and machine code generation are provided by
the LLVM framework (http://www.llvm.org/). The com- unary ::= primary | unop unary
piler we developed compiles source code into LLVM inter-
mediate representation, on which we run various LLVM op- binoprhs ::= ( op unary )*
timisation passes and finally generate native machine code
for the architecture of the machine the program is running expression ::= unary binoprhs
on. The code is compiled directly to memory, no temporary
files are created during compilation. prototype ::= type identifier ‘(’ [ type identifier [‘,’
type identifier ]*] ‘)’
3.1 Type system
func definition ::= ‘func’ prototype expression
All types in the language are implemented as external shared
objects (plugins). Each plugin module implements functions extern definition ::= ‘extern’ prototype
for generating code for operators (e.g. adding two real num-
bers, multiplying a matrix with a vector, etc.) and parsing program ::= ( func definition | extern definition )*
constants (optional). The code generation functions actually
generate LLVM intermediate representation instructions. The symbol identifier represents the name of a variable
or function. It can contain alphanumeric characters and
Apart from operators, each module can also generate code underscores, but cannot start with a digit.
for various library functions for that type (e.g. functions
for dot and cross product in the case of a 3D vector type). const represents a constant value, which can be parsed by
One of the library functions for composite types (e.g. vectors any of the type modules. When encountering this symbol,
and matrices) is also the constructor for that type. The the lexer goes through all the defined hooks in type modules
constructor function allocates memory for the structure and until it finds one that accepts the constant.
sets its members to the values passed to the constructor.
type represents a type keyword. Each type module defines
One of the advantages of generating code for operators and a unique keyword for its type.
functions (as opposed to simply implementing them and gen-
erating function calls instead) is that this enables further Comments in the language begin with a # sign and continue
optimisations. until the end of the line.
The precedence of operators is shown in table 1.
Operators Precedence
; 1 (binds weakest)
=2
or 6
xor 7
and 8
==, <> 9
<, >, <=, >= 10
+, - 20
*, / 40 (binds strongest)
Table 1: Precedence of binary operators.
Currently, only two unary operators are implemented —
unary minus (-) and logical negation (not).
m a t c o s -1 3 Proceedings of the 2013 Mini-Conference on Applied Theoretical Computer Science 26
Koper, Slovenia, 10-11 October
written in our domain-specific language. This script de- stants. While looking for constants, the lexer gives full con-
scribes the data-flow graph (nodes and connections) using trol to such hooks. This means that the syntax of constant
standard function calls (this could be achieved in any other values can be completely arbitrary. One could even include
language if the system was converted into a library). an interpreter for another programming language and use
that to generate a constant by running a program.
First, the system loads and initialises I/O and operation
modules. This is followed by initialisation of the compiler, 3.2 Grammar
which also loads the type modules. Then the script is com-
piled into machine code in memory and executed. The script The grammar for our language in BNF [1]:
forms the data-flow graph to be executed in parallel. The
graph is executed after script execution finishes. identifierexpr ::= identifier | identifier ‘(’ expression * ‘)’
The main advantage of having external modules for I/O, constexpr ::= const
operations, and types is greater flexibility, as the main pro-
gram doesn’t need to be rewritten or recompiled to support parenexpr ::= ‘(’ expression ‘)’
new file formats, operations, and data types.
ifexpr ::= ‘if’ expression ‘then’ expression
The subsequent sections of this paper contain more detailed ‘else’ expression
explanations of the various subsystems.
forexpr ::= ‘for’ type identifier ‘=’ expression ‘,’
3. OUR DOMAIN-SPECIFIC LANGUAGE expression [‘,’ expression ] ‘do’ expression
The main purpose of the language we created is to make withexpr ::= ‘with’ type identifier [‘=’ expression ]
developing new graph operations easier, especially for people (‘,’ type identifier [‘=’ expression ])* ‘do’
who aren’t professional programmers. Its secondary purpose expression
is to provide a language in which users can describe the
data-flow graph, however, this functionality could easily be primary ::= identifierexpr
accomplished using an existing language. | constexpr
| parenexpr
Our language is functional, strongly typed and compiled into | ifexpr
machine code for greater performance. | forexpr
| withexpr
Optimisations and machine code generation are provided by
the LLVM framework (http://www.llvm.org/). The com- unary ::= primary | unop unary
piler we developed compiles source code into LLVM inter-
mediate representation, on which we run various LLVM op- binoprhs ::= ( op unary )*
timisation passes and finally generate native machine code
for the architecture of the machine the program is running expression ::= unary binoprhs
on. The code is compiled directly to memory, no temporary
files are created during compilation. prototype ::= type identifier ‘(’ [ type identifier [‘,’
type identifier ]*] ‘)’
3.1 Type system
func definition ::= ‘func’ prototype expression
All types in the language are implemented as external shared
objects (plugins). Each plugin module implements functions extern definition ::= ‘extern’ prototype
for generating code for operators (e.g. adding two real num-
bers, multiplying a matrix with a vector, etc.) and parsing program ::= ( func definition | extern definition )*
constants (optional). The code generation functions actually
generate LLVM intermediate representation instructions. The symbol identifier represents the name of a variable
or function. It can contain alphanumeric characters and
Apart from operators, each module can also generate code underscores, but cannot start with a digit.
for various library functions for that type (e.g. functions
for dot and cross product in the case of a 3D vector type). const represents a constant value, which can be parsed by
One of the library functions for composite types (e.g. vectors any of the type modules. When encountering this symbol,
and matrices) is also the constructor for that type. The the lexer goes through all the defined hooks in type modules
constructor function allocates memory for the structure and until it finds one that accepts the constant.
sets its members to the values passed to the constructor.
type represents a type keyword. Each type module defines
One of the advantages of generating code for operators and a unique keyword for its type.
functions (as opposed to simply implementing them and gen-
erating function calls instead) is that this enables further Comments in the language begin with a # sign and continue
optimisations. until the end of the line.
The precedence of operators is shown in table 1.
Operators Precedence
; 1 (binds weakest)
=2
or 6
xor 7
and 8
==, <> 9
<, >, <=, >= 10
+, - 20
*, / 40 (binds strongest)
Table 1: Precedence of binary operators.
Currently, only two unary operators are implemented —
unary minus (-) and logical negation (not).
m a t c o s -1 3 Proceedings of the 2013 Mini-Conference on Applied Theoretical Computer Science 26
Koper, Slovenia, 10-11 October