-
Notifications
You must be signed in to change notification settings - Fork 0
/
WACCLangSpec.tex
605 lines (487 loc) · 29.6 KB
/
WACCLangSpec.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
\documentclass[a4paper]{article}
\setcounter{tocdepth}{3}
% latex package inclusions here
\usepackage{fullpage}
\usepackage{hyperref}
\usepackage{tabulary}
\usepackage{amsthm}
% set up BNF generator
\usepackage{syntax}
\setlength{\grammarparsep}{10pt plus 1pt minus 1pt}
\setlength{\grammarindent}{10em}
% set up source code inclusion
\usepackage{listings}
\lstset{
tabsize=2,
basicstyle = \ttfamily\small,
columns=fullflexible
}
% in-line code styling
\newcommand{\shell}[1]{\lstinline{#1}}
\theoremstyle{definition}
\newtheorem{question}{Gap}
% tagged boxes for fill the gap exercise
\newcommand{\fillgap}[2]{
\begin{center}
\fbox{
\begin{minipage}{4in}
\begin{question}
{\it #1} \hfill ({\bf #2})
\end{question}
\end{minipage}
}
\end{center}
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{document}
\title{The WACC Language Specification}
\date{}
\author{
Second Year Computing Laboratory \\
Department of Computing \\
Imperial College London
}
\maketitle
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{What is WACC?}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
WACC (pronounced ``whack'') is a simple variant on the While family of languages encountered in many program reasoning/verification courses
(in particular in the Models of Computation course taught to our 2nd year undergraduates).
It features all of the common language constructs you would expect of a While-like language,
such as program variables, simple expressions, conditional branching, looping and no-ops.
It also features a rich set of extra constructs, such as simple types, functions, arrays and basic tuple creation on the heap.
The WACC language is intended to help unify the material taught in our more theoretical courses (such as Models of Computation)
with the material taught in our more practical courses (such as Compilers).
The core of the language should be simple enough to reason about
and the extensions should pose some interesting challenges and design choices for anyone implementing it.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{WACC Language Syntax}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
We give the syntax of the WACC language in Backus-Naur Form (BNF)
extended with some basic regular expression notation that simplifies the presentation:
\begin{itemize}
\item $(x$-$y)$ stands for `range', meaning any value from $x$ to $y$ inclusive;
\item $(x)$? stands for `optional', meaning that $x$ can occur zero or one times;
\item $(x)$+ stands for `repeatable', meaning that $x$ can occur one or more times;
\item $(x)$* stands for `optional and repeatable', meaning that $x$ can occur zero or more times.
\end{itemize}
\subsection{BNF}
%%%%%%%%%%%%%%%%%%
\begin{grammar}
<program> ::= `begin' <func>* <stat> `end'
<func> ::= <type> <ident> `(' <param-list>? `)' `is' <stat> `end'
<param-list> ::= <param> ( `,' <param> )*
<param> ::= <type> <ident>
<stat> ::= `skip'
\alt <type> <ident> `=' <assign-rhs>
\alt <assign-lhs> `=' <assign-rhs>
\alt `read' <assign-lhs>
\alt `free' <expr>
\alt `return' <expr>
\alt `exit' <expr>
\alt `print' <expr>
\alt `println' <expr>
\alt `if' <expr> `then' <stat> `else' <stat> `fi'
\alt `while' <expr> `do' <stat> `done'
\alt `begin' <stat> `end'
\alt <stat> `;' <stat>
<assign-lhs> ::= <ident>
\alt <array-elem>
\alt <pair-elem>
<assign-rhs> ::= <expr>
\alt <array-liter>
\alt `newpair' `(' <expr> `,' <expr> `)'
\alt <pair-elem>
\alt `call' <ident> `(' <arg-list>? `)'
<arg-list> ::= <expr> (`,' <expr> )*
<pair-elem> ::= `fst' <expr>
\alt `snd' <expr>
<type> ::= <base-type>
\alt <array-type>
\alt <pair-type>
<base-type> ::= `int'
\alt `bool'
\alt `char'
\alt `string'
<array-type> ::= <type> `[' `]'
<pair-type> ::= `pair' `(' <pair-elem-type> `,' <pair-elem-type> `)'
<pair-elem-type> ::= <base-type>
\alt <array-type>
\alt `pair'
<expr> ::= <int-liter>
\alt <bool-liter>
\alt <char-liter>
\alt <str-liter>
\alt <pair-liter>
\alt <ident>
\alt <array-elem>
\alt <unary-oper> <expr>
\alt <expr> <binary-oper> <expr>
\alt `(' <expr> `)'
<unary-oper> ::= `!' | `-' | `len' | `ord' | `chr'
<binary-oper> ::= `*' | `/' | `\%' | `+' | `-' | `>' | `>=' | `<' | `<=' | `==' | `!=' | `&&' | `||'
<ident> ::= ( `\_' | `a'-`z' | `A'-`Z' ) ( `\_' | `a'-`z' | `A'-`Z' | `0'-`9' )*
<array-elem> ::= <ident> `[' <expr> `]'
<int-liter> ::= <int-sign>? <digit>+
<digit> ::= (`0'-`9')
<int-sign> ::= `+' | `-'
<bool-liter> ::= `true' | `false'
<char-liter> ::= `\'' <character> `\''
<str-liter> ::= `\"' <character>* `\"'
<character> ::= "any-ASCII-character-except-`\\'-`\''-`\"'"
\alt `\\' <escaped-char>
<escaped-char> ::= `0' | `b' | `t' | `n' | `f' | `r' | `\"' | `\'' | `\\'
<array-liter> ::= `[' ( <expr> (`,' <expr>)* )? `]'
<pair-liter> ::= `null'
<comment> ::= `#' ("any-character-except-EOL")* <EOL>
\end{grammar}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{WACC Language Semantics}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
We now go through each of the language components and explain their behaviour and purpose in more detail.
\subsection{Types}
%%%%%%%%%%%%%%%%%%
The WACC language is both statically and strongly typed.
\begin{itemize}
\item Static, in that, once declared, the type of a variable is fixed for the duration of the program.
\item Strong, in that the compiler should not coerce between types.
\end{itemize}
There is also no explicit typecasting.
\paragraph{Basic Types:}
The basic types in the WACC language are:
\begin{itemize}
\item \lit*{int}: The Integer type. Integers in the WACC language can take any value from $-2^{31}$ to $2^{31} - 1$ inclusive.
\item \lit*{bool}: The Boolean type (\shell{true} or \shell{false}).
\item \lit*{char}: The Character type. The WACC language supports only the ASCII characters.
\item \lit*{string}: The String type.
\end{itemize}
We write \lit{T} to denote an arbitrary type.
\paragraph{Arrays:}
As well as the basic types given above, the WACC language also supports the array type.
We write \lit{T[]} to denote an array whose elements are of type \lit{T}.
Note that \lit{T} can be of any type, including another array type, which allows for nested arrays.
In the WACC language, arrays of characters are treated like strings.
Arrays are allocated on the heap.
As well as their elements, each array also tracks its length, which is set when it is created.
\paragraph{Pairs:}
Pairs are allocated on the heap and contain two elements that can be of any type.
We write \lit{pair(T$_1$, T$_2$)} to donate a pair whose first element is of type \lit{T$_1$} and second element is of type \lit{T$_2$}
(these need not be the same).
Note that if either \lit{T$_1$} or \lit{T$_2$} is a pair type, we do not write the type of the sub-elements.
For example, a pair whose first element is an integer
and whose second element is a pair of characters is written as \lit{pair(int, pair)} and not as \lit{pair(int, pair(char, char))}.
It is obvious that we lose some typing information in this way.
\subsection{Program Scopes}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%
The WACC language includes explicit scoping.
Various statements introduce new program scopes, which have an effect on the visibility of program variables.
Whenever a new variable is declared it is added to the current program scope.
When a program scope is exited, every variable created within that scope is destroyed.
This means that variables are not accessible by statements outside the scope of their creation,
although they are accessible in child scopes.
The main, or global scope is created at the start of a WACC program and is exited at the end of the program.
Functions can only be created at the beginning of this global scope, but they may be called from within any child scope.
We will see that several other program constructs, including functions, while loops and conditional branches,
introduce new program scopes during their execution.
\subsection{Programs}
%%%%%%%%%%%%%%%%%%%%%%
A WACC program \synt{program} consists of zero or more function definitions followed by the body of the main function.
The whole program is written between the \lit{begin} and \lit{end} tokens, denoting the main or global program scope.
\subsection{Function Definitions}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
A function definition \synt{func} consists of a return type, a function name
and zero or more typed parameters followed by the function's body.
A function's body is executed in its own scope, which is denoted by the \lit{is} and \lit{end} tokens.
Functions can only be defined at the beginning of the global scope, before the body of the main function.
Functions may, however, be both recursive and mutually recursive.
Any execution path through the body of the function should end with a \lit{return} statement
whose expression type should match the function's return type. Otherwise, it must end with an \lit{exit} statement.
\subsection{Statements}
%%%%%%%%%%%%%%%%%%%%%%%%
A statement \synt{stat} consists of:
a no-op,
a variable definition,
an assignment,
an input read,
a memory free,
a function return,
an exit call,
a print command,
a conditional branch,
a while loop,
a scope introduction
or the sequential composition of two statements.
We discuss each of these in more detail below.
\paragraph{No-op Statements:}
A no-op statement \lit{skip} does not do anything.
It is used where a statement is expected but we do not want to do anything.
For example, in an \lit*{if} statement where we want to have an empty \lit*{else} clause.
\paragraph{Variable Declaration Statements:}
A variable declaration statement creates a new program variable in the current scope setting its static type and initial value.
The statement must be given a valid WACC type \synt{type}, a variable name \synt{ident} and an initial assignment value \synt{assign-rhs}.
Variable names must not clash with any language keywords, function names or variables in the current scope.
They can consist of any number of letters, upper or lower-case, numbers, or underscores. However they cannot start with a number.
The initial assignment to a variable follows all of the assignment restrictions discussed in detail in the assignment statement section below.
A variable must be declared before any reference is made to it.
Any attempt to access an undeclared variable results in a semantic error, where the variable is not found in the scope.
Additionally, every use of a variable must match the type assigned to that variable when it was declared.
A variable can only be accessed within the scope of its declaration (or any child scope) and it is destroyed when exiting this scope.
Variables must be unique within their scope, so a variable cannot be redefined within the same scope.
Variables can, however, be redefined within a child scope.
In this case, if the value is modified, it will retain its new value. If its type is redefined, it will keep its new value and type for the child scope only.
If a variable is redeclared within a child scope, it is reverted to its original type once the child scope is exited, and the value of the variable will be reverted to the last value assigned before being redefined. If before being redeclared the variable was assigned a new value of the same type as in the parent scope, this value will be reassigned once the child scope terminates.
\paragraph{Assignment Statements:}
An assignment statement updates its target (the left-hand side of the \lit{=}) with a new value (the right-hand side of the \lit{=}).
The target of an assignment can be either a program variable, an array element or a pair element.
The assignment value can be one of five possible types: an expression, an array literal, a function call, a pair constructor or a pair element.
\begin{itemize}
\item If the assignment value is an expression \synt{expr} then
the target and the expression must have the same type.
The expression is then evaluated and the resulting value is copied into the target.
\item If the assignment value is an array literal \synt{array-liter} then the target and value arrays must have the same element type,
but the length can be different.
The value array is then allocated on the heap with each element initialised to the given value. After that, the reference of the value array is copied to the reference of the target array.
For more details on array literals, see the expressions section.
\item If the assignment value is a function call \lit{call} then the target and function return value must have the same type.
The number and types of the function's arguments must also match the function's definition.
A function is called by its name and its arguments are passed by value (for basic types) or by reference (for arrays and pairs).
When called, the function's body is executed in a new scope, not related to the current scope.
The only declared variables are the function's parameters, whose types are set by the function definition and whose values are set by the function call's arguments.
When the execution of the function body terminates, the function's return value is then copied into the assignment target.
\item If the assignment value is pair constructor \lit{newpair} then the target must be of type \lit{pair(T$_1$, T$_2$)}.
The pair constructor is passed two expressions that must match {\tt T}$_1$ and {\tt T}$_2$ respectively.
A \lit{newpair} assignment allocates enough memory on the heap to store the pair structure and its elements.
It then initialises each element of the pair using the evaluation of the first expression for the first element
and the evaluation of the second expression for the second element.
Pairs, in the WACC language, are always used by reference, so a reference to the pair is copied into the target, rather than the actual content of the pair.
\item If the assignment value is a pair element \synt{pair-elem} then the expression passed to the pair element must be of type \lit{pair}
and the target must have the same type as the first (or second) element of the pair when using \lit{fst} (or \lit{snd}) keyword.
The pair expression is evaluated to obtain a reference to a pair and this is dereferenced to find the corresponding pair element,
which is then copied into the target.
\end{itemize}
\paragraph{Read Statements:}
A read statement \lit{read} is a special assignment statement that takes its value from the standard input.
Unlike an assignment statement, it can only be either a program variable, an array element or a pair element.
The read statement determines how it will interpret the value from the standard input based on the type of the target.
For example, if the target is a variable of type \lit{int} then it will convert the input string into an Integer.
\paragraph{Memory Free Statements:}
A memory free statement \lit{free} is used to free the heap memory allocated for a pair or array and its immediate content.
The statement is given an expression that must be of type \lit{pair(T$_1$, T$_2$)} or \lit{T[]} (for some {\tt T}, {\tt T}$_1$, {\tt T}$_2$).
The expression must evaluate to a valid reference to a pair or array, otherwise a segmentation fault will occur at runtime.
If the reference is valid, then the memory for each element of the pair/array is freed, so long as the element is not a reference to another pair or another array
(i.e. free is not recursive).
Then the memory that stores the pair/array itself is also freed.
\paragraph{Function Return Statements:}
A return statement can only be present in the body of a non-main function and is used to return a value from that function.
The type of the expression given to the return statement must match the return type of the function.
Once the return statement is executed, the function is immediately exited.
\paragraph{Exit Statements:}
Exit statements are used to terminate the program instantly and return an exit code. The exit code given to it is an integer.
A semantic error will occur if anything other than an integer is passed to the exit call. If an error code value of negative one is returned,
the program has exited with an error. Any value \lit{n} passed to the exit call greater than 255 or less than 0 will be mapped to the absolute value of
\lit{n} \% 256.
\paragraph{Print Statements:}
There are two types of print command in the WACC language.
The \lit{print} command takes an expression and prints the result of its evaluation to the standard output.
The \lit{println} command is similar, but additionally prints out a new line afterwards.
The output representation of each expression evaluation depends on the type of the expression.
The behaviour of the print statements for each type of expression is shown in Table~\ref{tab:print}, along with some example cases.
%
\begin{table}
\centering
\begin{tabulary}{\textwidth}{C|L|C|C}
\hline
Expression Type & Behaviour & Example Expression & Example Output \\
\hline
\lit*{int} & Output the integer converted to a decimal string. & \lit*{10} & ``10'' \\
\hline
\lit*{bool} & Output ``true'' if the boolean is \lit*{true} and ``false'' otherwise. & \lit*{false} & ``false'' \\
\hline
\lit*{char} & Output a single-character string. & \lit*{'c'} & ``c'' \\
\hline
\lit*{string} or \lit*{char[]} & Outputs a series of characters & ``Hello'' & ``Hello'' \\
\hline
Other Array Types & Outputs the address of array on the heap & [1,2,1] & 0x12008 \\
\hline
\lit*{pair} & Outputs the address of the pair on the heap & \lit*{newpair(a, b)}\footnotemark[1] & 0x12016 \\
\hline
\end{tabulary}
\caption{The behaviour of the print statements for each type of expression.}
\label{tab:print}
\end{table}
\footnotetext[1]{This is not exactly an expression because it can only appears on the right hand side of an assignment. However, it gives the best example here.}
\paragraph{Conditional Branch Statements:}
A conditional branch statement \lit{if} evaluates an expression and determines which program path to follow.
The statement is given a condition expression, that must be of type \lit{bool}, and two body statements, one for the \lit{then} branch and one for the \lit{else} branch.
If the condition evaluates to \lit{true}, then the \lit{then} body statement is executed.
Otherwise, the \lit{else} body statement is executed.
Each of the program branches is executed in its own scope, which are denoted by the \lit{then} and \lit{else} tokens and the \lit{else} and \lit{fi} tokens, respectively.
\paragraph{While Loop Statements:}
A while loop statement \lit{while $(expr)$ do $(stat)$ done} will evaluate the given boolean expression $(expr)$, and if this evaluates to \lit{true}, the body statement $(stat)$ will be executed.
Once $(stat)$ terminates, $(expr)$ will be re-evaluated, and the process is repeated. Whenever the expressions evaluates to \lit{false}, the loop body will not be executed, the $(done)$ state is reached, and the program will execute the next statement.
\paragraph{Scoping Statements:}
A scoping statement introduces a new program scope, which is denoted by the \lit{begin} and \lit{end} tokens.
\paragraph{Sequential Composition:}
Sequential composition consists of two statements separated by a semicolon. The program will execute the first statement, then
proceed to immediately execute the next.
\subsection{Expressions}
%%%%%%%%%%%%%%%%%%%%%%%%%
A expression \synt{expr} consists of
a literal (integer, boolean, character, string or pair),
a variable,
an array element,
a unary expression,
a binary expression
or an expression enclosed by parenthesis.
We discuss the meaning of each of these expressions in more detail below.
The expressions of the WACC language have been chosen to be side-effect free.
When evaluating an expression, it cannot alter the state of the program.
\paragraph{Integer Literals:}
An integer literal \synt{int-liter} consists of a sequence of decimal digits.
Optionally, the sequence can be preceded by a \lit{+} or a \lit{-} symbol.
\paragraph{Boolean Literals:}
A boolean literal \synt{bool-liter} is either \lit{true} or \lit{false}.
\paragraph{Character Literals:}
A character literal \synt{char-liter} is a single ASCII character between two \lit{\char`'} symbols.
A \lit{\char`\\} can be used to escape the character that immediately follows the \lit{\char`\\}.
The meaning of each escaped character is shown in Table~\ref{tab:escapedcharacters}.
%
\begin{table}
\centering
\begin{tabular}{cclc}
\hline
Representation & ASCII Value & Description & Symbol \\
\hline
\lit*{\char`\\ 0} & \lit*{0x00} & null terminator & NUL \\
\lit*{\char`\\ b} & \lit*{0x08} & backspace & BS \\
\lit*{\char`\\ t} & \lit*{0x09} & horizontal tab & HT \\
\lit*{\char`\\ n} & \lit*{0x0a} & line feed & LF \\
\lit*{\char`\\ f} & \lit*{0x0c} & form feed & FF \\
\lit*{\char`\\ r} & \lit*{0x0d} & carriage return & CR \\
\lit*{\char`\\ "} & \lit*{0x22} & double quote & " \\
\lit*{\char`\\ '} & \lit*{0x27} & single quote & ' \\
\lit*{\char`\\ \char`\\} & \lit*{0x5c} & backslash & \textbackslash \\
\hline
\end{tabular}
\caption{The escaped-characters available in the WACC language.}
\label{tab:escapedcharacters}
\end{table}
%
\paragraph{String Literals:}
A string literal \synt{str-liter} is az sequence of characters between two \lit{"} symbols.
Each character in the string literal can be escaped in the same way as in character literal.
\paragraph{Pair Literals:}
The only pair literal \synt{pair-liter} is \lit{null} which represents a reference that does not point to any pair.
To see how pairs are created, read the \lit{newpair} case of the assignment statement.
\paragraph{Array Literals:} Array literals cannot occur directly in expressions, but they do occur in the WACC language as assignment values.
An array literal starts with a \lit{[} token and ends with a \lit{]} token.
The elements of the array (zero or more) are given between these brackets and are separated by \lit{,} tokens.
All elements of an array must be of the same type, so the type of any non-empty array literal can be statically determined.
If, however, an array literal is empty, we allow it to be of any array type.
For example, the array \lit{[]} can be of type \lit{int[]},\lit{bool[]}, \lit{char[]}, etc... depending on the context, but the array \lit{[1]} must be of type \lit{int[]}.
\paragraph{Variables:}
When a variable expression \synt{ident} is evaluated it returns the value of that variable.
If the variable is of type \lit{T} then the return type of the expression is also \lit{T}.
\paragraph{Array Elements:}
An array element expression evaluates to return an element from an array.
The expression consists of two sub-expressions, the first of which must be of type \lit{T[]} and the second of which must be of type \lit{int}.
The return type of the overall expression is \lit{T}.
The first expression is evaluated to find an array \lit{a} and the second is evaluated to find an index \lit{i}.
The overall expression returns the element at the index \lit{i} of array \lit{a}, that is, \lit{a[i]}.
If the array has length $l$ then the index \lit{i} must be between $0$ and $(l - 1)$,
otherwise the expression will generate a runtime error.
\paragraph{Unary Operators:}
A unary operator \synt{unary-oper} has a single sub-expression.
The unary operators available in the WACC language are shown in Table~\ref{tab:unary}.
All unary operators have the same precedence, they are evaluated from right to left.
%
\begin{table}
\centering
\begin{tabulary}{\textwidth}{CCCL}
\hline
Operator & Argument Type & Return Type & Meaning \\
\hline
\lit{!} & \lit*{bool} & \lit*{bool} & Logical Not \\
\lit{-} & \lit*{int} & \lit*{int} & Negation \\
\lit{len} & T\lit*{[]} & \lit*{int} & Array Length \\
\lit{ord} & \lit*{char} & \lit*{int} & Gives the ASCII numeric value of the character \\
\lit{chr} & \lit*{int} & \lit*{char} & Gives the character whose ASCII value matches the number given \\
\hline
\end{tabulary}
\caption{The unary operators of the WACC language with their types and meanings.}
\label{tab:unary}
\end{table}
\begin{itemize}
\item The \lit{!} operator performs a logical Not operation on the result of evaluating its sub-expression,
returning \lit{true} if the sub-expression evaluates to \lit{false} and vice-versa.
\item The \lit{-} operator inverts the sign of the evaluation of its sub-expression.
\item The \lit{len} operator returns the length of the array referenced by the evaluation of its sub-expression.
\item The \lit{ord} operator returns the ASCII value for the given character.
\item The \lit{chr} operator returns the ASCII character corresponding to the given integer. The integer given must be in the range [0, 256), since there are only 256 characters.
\end{itemize}
\paragraph{Binary Operators:}
A binary operator is used in in-fix style between two sub-expressions.
The binary operators available in the WACC language are shown in Table~\ref{tab:binary}.
The operators have different precedences, as illustrated in the table,
with 1 being the highest and 6 being the lowest.
%
\begin{table}
\centering
\begin{tabulary}{\textwidth}{CCCCCL}
\hline
Operator & Precedence & Argument 1 Type & Argument 2 Type & Return Type & Meaning \\
\hline
\lit{*} & 1 & \lit*{int} & \lit*{int} & \lit*{int} & Multiply \\
\lit{/} & 1 & \lit*{int} & \lit*{int} & \lit*{int} & Divide \\
\lit{\%} & 1 & \lit*{int} & \lit*{int} & \lit*{int} & Modulus \\
\lit{+} & 2 & \lit*{int} & \lit*{int} & \lit*{int} & Plus \\
\lit{-} & 2 & \lit*{int} & \lit*{int} & \lit*{int} & Minus \\
\lit{>} & 3 & \lit*{int},\lit*{char} & \lit*{int},\lit*{char} & \lit*{bool} & Greater Than \\
\lit{>=} & 3 & \lit*{int},\lit*{char} & \lit*{int},\lit*{char} & \lit*{bool} & Greater Than or Equal \\
\lit{<} & 3 & \lit*{int},\lit*{char} & \lit*{int},\lit*{char} & \lit*{bool} & Less Than \\
\lit{<=} & 3 & \lit*{int},\lit*{char} & \lit*{int},\lit*{char} & \lit*{bool} & Less Than or Equal \\
\lit{==} & 4 & \lit*{exp} & \lit*{exp} & \lit*{bool} & Equality \\
\lit{!=} & 4 & \lit*{exp} & \lit*{exp} & \lit*{bool} & Inequality \\
\lit{\&\&} & 5 & \lit*{bool} & \lit*{bool} & \lit*{bool} & Logical And \\
\lit{||} & 6 & \lit*{bool} & \lit*{bool} & \lit*{bool} & Logical Or \\
\hline
\end{tabulary}
\caption{The binary operators of the WACC language, with their types and meanings.}
\label{tab:binary}
\end{table}
%
\begin{itemize}
\item The \lit{*}, \lit{/}, \lit{\%}, \lit{+}, \lit{-}, \lit{>}, \lit{>=}, \lit{<} and \lit{<=} operators
all have their standard mathematical behaviour, where integer underflow/overflow results in a runtime error.
If the divisor of a division (\lit{/}) or modulus (\lit{\%}) operator is evaluated to \lit{0}, then this also results in a runtime error.
The result of a division operation is positive if both its dividend and divisor have the same sign, and negative otherwise.
The result of a modulus operation has the same sign as its dividend.
\item The \lit{==} operator performs an equality test on the evaluations of its sub-expressions.
It accepts any two expressions of the same type.
When applied to expressions of type \lit{int}, \lit{bool} or \lit{char}, the result is true iff the content of the two arguments are the same.
When applied to expressions of type \lit{T[]} or \lit{pair}, the result is true iff the two references point to the same object of the same type.
Otherwise, the result is false.
\item The \lit{!=} operator returns the opposite result to the \lit{==} operator.
\item The \lit{\&\&} operator performs a logical And operation on the result of evaluating its sub-expressions,
returning \lit{true} if both sub-expressions evaluate to \lit{true} and \lit{false} otherwise.
\item The \lit{||} operator performs a logical Or operation on the result of evaluating its sub-expressions,
returning \lit{true} if either sub-expression evaluates to \lit{true} and \lit{false} otherwise.
\end{itemize}
\paragraph{Parenthesis:}
We can introduce a pair of parenthesis around an expression to control its evaluation.
The expression in a parenthesis is always evaluated first, regardless of the operator precedence.
\subsection{Whitespace and Comments}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Whitespace is used in the WACC language to delimit keywords and variables.
For example, \lit{if a == 13} denotes the start of an \lit{if} statement with boolean condition \lit{a == 13},
whereas \lit{ifa == 13} denotes a boolean expression comparing the variable \lit{ifa} with the value \lit{13}.
Any other type of occurrence of whitespace is ignored by the compiler.
Note, in particular, that the code indentation in the example programs has no meaning, it simply aids readability.
Also note that whitespace inside a string or character literal is preserved by the compiler.
Comments are defined by a hash symbol followed by any number of characters except the End of Line Character which marks the end of a comment. Comments are ignored by the program, and are used as guidance when reading the code.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsubsection*{Acknowledgements}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
The WACC language was developed by Tienchai Wirojsaksaree as part of a UROP placement over the summer in 2013.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\end{document}