This is a simple yet powerful code translator that lets you write D programs using Python-like syntax.
It works by interpreting indentation and a few keywords, and translating the input into a standard D program.
Functions must be prefixed with def, so that the parser can tell they are functions. Other whitelisted constructs do not need a prefix, because they already start with a specific keyword and end with a colon.
This allows you to write D code like this:
    import std.stdio

    class A:
        public:
        int f
        def this():
            f = 4
            writefln("Constructor")
        def ~this():
            writefln("Deconstructor")
        def void times(int k):
            writefln("%d", k * g())
        private:
        def int g():
            return f + 5

    def int main(string[] args):
        auto a = new A()
        a.times(10)
        return 0
See the tests/test*.dt files for more examples.
Enjoy.
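For comparison, here is a hand-written plain D equivalent of the class example above. It is only a sketch of what the translation corresponds to, assuming dmt turns indented blocks into braces and appends semicolons; the exact text dmt emits may differ.

```d
import std.stdio;

class A {
public:
    int f;

    this() {
        f = 4;
        writefln("Constructor");
    }

    ~this() {
        writefln("Deconstructor");
    }

    // Prints k multiplied by the value returned by the private helper g().
    void times(int k) {
        writefln("%d", k * g());
    }

private:
    int g() {
        return f + 5;
    }
}

int main(string[] args) {
    auto a = new A();
    a.times(10);   // (4 + 5) * 10, so this prints 90
    return 0;
}
```

Every block that the dmt source opens with a colon and an indent shows up here as a pair of braces, and every plain line gains a semicolon.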
By default, the dmt tool converts the source code in files ending with the .dt extension and passes the result to a D compiler. The dmd compiler is used by default. The compiler can be changed with the DMD environment variable, for example: DMD=ldc2 dmt test4.dt. If multiple files are passed, they are all converted and passed together to dmd.
Options:
- --keep: keep the transformed temporary files (.d files)
- --convert: just convert and keep the temporary files, do not call dmd or remove files
- --overwrite: overwrite temporary files if they already exist
- --run: compile the first .dt source and run it, passing the remaining arguments
- --pipe: transform a single .dt file to .d and write it to standard output
*.d and *.o arguments, and all other options starting with a dash (for example -O, -inline, -c), are passed to dmd untouched, in the same order and the same position relative to the files as on the dmt command line.
Just add this as the first line of the "script":
    #!/usr/bin/env -S dmt --run
and make the script executable. Also make sure that dmt is in your PATH (or use an absolute path to dmt). Extra arguments from the execution will be passed to your script's main function as normal.
You can change dmd to ldc2 using the DMD=ldc2 environment variable, or do it explicitly in the script's #! line:
    #!/usr/bin/env -S DMD=ldc2 dmt --run
You can also run a script manually, by using -run when invoking dmt:
    dmt -run foo.dt
You can pass multiple source files if you wish.
Using import std to import the entire Phobos is handy, but if your script is on a network-connected file system (like sshfs), it will considerably slow down compilation, because the compiler first tries to find imported files relative to the currently compiled file, and also has to parse more files. It is OK for prototyping, but importing more specific modules is a better long-term solution (and less likely to break).
dmt has just one source file: dmt.d. Compile it into an executable however you like, using your favorite D compiler.
You can also just use make (if using dmd), or DMD=ldc2 DMDFLAGS= make (if using ldc), or DMD=gdc DMDFLAGS= make (if using gdc).
An indented block may only follow a line that starts with a specific keyword and ends with a colon.
For example:
    if (x):
        f()
    ...
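To make the rule above concrete, here is a toy converter written in D. It is not dmt's implementation, only a minimal sketch under simplifying assumptions: every line ending in a colon opens a block (no keyword whitelist, so attribute lines such as public: would be mishandled), indentation is measured by character count, and comments, line continuations and error reporting are ignored. The name toyConvert is hypothetical.

```d
import std.algorithm.searching : endsWith, startsWith;
import std.string : strip, stripLeft;

/// Toy sketch: turn colon-plus-indent blocks into brace blocks.
string toyConvert(string[] lines) {
    string output;
    size_t[] indents = [0];      // stack of currently open indentation widths
    bool expectIndent = false;   // did the previous line end with a colon?

    foreach (line; lines) {
        auto text = line.strip;
        if (text.length == 0)
            continue;
        immutable indent = line.length - line.stripLeft.length;

        // A block header followed by a non-indented line is an empty block.
        if (expectIndent && indent <= indents[$ - 1])
            output ~= "}\n";
        // Close every block that is deeper than the current line.
        while (indent < indents[$ - 1]) {
            indents = indents[0 .. $ - 1];
            output ~= "}\n";
        }
        // A deeper indent is only accepted right after a block header.
        if (expectIndent && indent > indents[$ - 1])
            indents ~= indent;

        expectIndent = text.endsWith(":");
        if (expectIndent) {
            auto head = text[0 .. $ - 1];   // drop the trailing colon
            if (head.startsWith("def "))
                head = head[4 .. $];        // "def" is only a marker
            output ~= head ~ " {\n";
        } else {
            output ~= text ~ ";\n";         // ordinary statement
        }
    }

    if (expectIndent)
        output ~= "}\n";
    while (indents.length > 1) {
        indents = indents[0 .. $ - 1];
        output ~= "}\n";
    }
    return output;
}

void main() {
    import std.stdio : write;
    write(toyConvert(["if (x):", "    f()", "g()"]));
}
```

The indent stack is what lets a single dedent close several nested blocks at once. A real converter also has to decide which keywords may open a block, which is what the keyword whitelist is for.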
Functions and methods should be prefixed with def:
    def int f(int b, int c):
        ...
    class C:
        def int f():
            ...
If you want to declare a function (e.g. a C function) or define an interface method, you can do that on a single line without using def; just be sure not to end it with a colon. The semicolon (;) is optional and should be avoided, as dmt will add one for you.
    extern(C) int g(int a, char* c)
    interface IFoo:
        int f()
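In plain D, body-less declarations of this kind look roughly as follows. This is a sketch with stand-ins: abs from the C standard library replaces the document's g so that the program links, and the class Foo is a hypothetical name added only to make the example runnable.

```d
// A C function declared without a body; the definition comes from libc.
extern (C) int abs(int x);

// An interface method is also just a declaration, with no body.
interface IFoo {
    int f();
}

// Hypothetical implementation, only here so the interface can be exercised.
class Foo : IFoo {
    int f() {
        return abs(-7);
    }
}

void main() {
    IFoo foo = new Foo;
    assert(foo.f() == 7);
}
```

Neither declaration opens a block, which is why the dmt forms need neither def nor a trailing colon.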
Keywords that can open an indented block:
- def
- if, else
- for, foreach, foreach_reverse
- while, do
- struct, union
- class, interface, abstract class, final class
- enum (but read the limitations section on how to use them)
- template
- mixin template definition
- in, out, body sections of a function, method or interface contract
- invariant
- try, catch, finally
- switch, final switch
- case and default can be indented or not, your choice
- with
- scope(exit), scope(failure), scope(success) - only these three, with no spaces between tokens
- synchronized
- static if, static foreach, static foreach_reverse
- version
- unittest
- asm
Note: scope class (a storage type constraint on class instances) is not supported, because it is an officially deprecated feature of the language and might be removed in the future. If you really want to use it, use
    def scope class ...:
Note: public:, private:, protected:, package:, export:, extern(...):, pragma(.....):, @nogc:, nothrow:, @trusted:, @safe:, pure:, @system:, @live:, @property:, static:, extern:, abstract:, final:, override:, auto: (yikes!), shared:, __gshared: and similar existing attributes used in the attribute: form, as well as user-defined attributes (UDAs), neither open a new indent level by themselves, nor introduce any {}-style block for subsequent statements. However, if set in some specific scope, they only apply to the definitions in that scope, at that nesting level, up to the end of that scope. These are the same semantics as in normal D.
This makes it easier to change attributes for subsequent symbols and declarations in modules, classes and other places, which IMHO is more common than block-style attribute changes.
So for example, in the following class:
    class A:
        private:
        int a
        float b
        public string c
        def auto sum():
            return a + b
members a, b and sum will be private, and c will be public.
If you want, you can still use the line continuation form together with def: to achieve a block-level change:
    class A:
        private \
        def:
            int a
            float b
        public string c
        def auto sum():
            return a + b
In this case, a and b will be private, but sum and c will be public (sum because that is the default visibility level, and c because it was explicitly set).
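The two styles correspond to the label form and the block form of attributes in plain D. The sketch below shows both side by side; the class names LabelStyle and BlockStyle are hypothetical, and it assumes that the private \ def: form translates to a private { ... } block.

```d
class LabelStyle {
private:                     // applies to everything below, up to the end of the class
    int a;
    float b;
    public string c;         // an explicit attribute on one declaration wins
    auto sum() { return a + b; }   // still private
}

class BlockStyle {
    private {                // applies only to the declarations inside the braces
        int a;
        float b;
    }
    public string c;
    auto sum() { return a + b; }   // default visibility, i.e. public
}

void main() {
    // Private members are accessible here because D's private is module-level.
    auto x = new BlockStyle;
    x.a = 1;
    x.b = 2.5f;
    assert(x.sum() == 3.5f);
}
```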
Unfortunately, arrays and associative arrays can't be formatted with alignment or indents at the moment:
    int[] a = [
        1,
        2,
        3,
    ]
will not work. Sorry.
Also, enum support is limited, but enums can be written with some workarounds:
    enum E { A, B, }
or
    enum E:
        A, \
        B \
Examples:
def auto f():
return x + \
y + z
and
def auto f():
return x + \
y + z
Examples:
writeln(a, b, \
c + 5)
int z = (a + b + \
c + d)
You can use spaces or tabs, or a mix of them, but consistency is required for matching indent levels.
When you open a new indent level, the next line must start with the same indent as the previous line, followed by the new indent.
So for example, this is allowed:
if (a):
\tif (b):
\t if (c):
\t f(a, b, c)
but this is not:
if (a):
\tif (b):
if (c)
\t f(a, b, c)
Also, if/else, in/out/body, case/default and try/catch/finally must match properly:
This is ok:
if (a):
if (b):
f(a, b)
else:
f(0, 0)
but this is not:
if (a):
if (b):
f(a, b)
else:
f(0, 0)
Indentation INSIDE the if/else (and other matching) blocks can differ if you want.
This is OK, but not recommended:
    if (a):
      f(a)
    else:
            f(0)
Multi-statement lines. You can put multiple statements on a line, simply by separating them with a semicolon:
    if (a):
        f(a); g(a)
    else:
        f(0); g(0)
You should avoid putting a semicolon (;) after the last statement, as this will most likely trigger a D compiler warning.
Empty blocks. You can write an empty block by not indenting the next line, or by manually putting {} as an empty statement:
    while (a-- && b--):
    writeln(a, b)
or, more readably:
    while (a-- && b--):
        {}
    writeln(a, b)
If you are unhappy with this form, create a no-op function called pass, and do:
    while (a-- && b--):
        pass
    writeln(a, b)
Non-indented forms. It is possible to write non-indented forms by omitting the colon at the end, like:
    if (a) f(a)
    while (a--) writeln(a)
Note that you can't use more than one statement this way, unless you actually put them in curly braces:
    if (a) f(a); g(a) // WRONG / MISLEADING
This will call g even if f is not called, because the code is translated to D literally, exactly as written in the input (plus a semicolon at the end).
Use instead:
    if (a) { f(a); g(a); } // BETTER
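The difference is easy to check in plain D. The following sketch counts how often g runs in each form; the names gCalls, misleading and braced are hypothetical helpers for this demonstration only.

```d
int gCalls;                 // counts calls to g

void f(int a) {}
void g(int a) { gCalls++; }

/// Same shape as the misleading one-liner: only f(a) is guarded by the if.
int misleading(int a) {
    gCalls = 0;
    if (a) f(a); g(a);
    return gCalls;
}

/// With braces, both calls are guarded.
int braced(int a) {
    gCalls = 0;
    if (a) { f(a); g(a); }
    return gCalls;
}

void main() {
    assert(misleading(0) == 1);   // g ran although the condition was false
    assert(braced(0) == 0);       // g was skipped together with f
    assert(misleading(1) == 1);
    assert(braced(1) == 1);
}
```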
For the same reason, you should be careful about opening curly braces:
    if (a) { f(a);
    g(a); }
It works, but defeats the entire purpose of dmt.
Also, at the moment dmt is kind of all or nothing. You can't just throw existing D code into it, because that code most likely contains indentation that will not work. At least dmt will detect it:
    if (a) {
        f(a); // UNEXPECTED INDENT
        g(a);
    }
In the future it might be possible to add pragma or comment-based directives to enable / disable dmt processing.
Another issue is commenting blocks of code:
def void f(int a):
/+
if (a):
return 5;
else:
if (a > 10):
return a * 10
return 1
+/
will not work, because of the unexpected indent in the processed lines.
The line continuation marker (\ at the end of the line) allows you to indent the next line arbitrarily, and does not introduce a { in the translated code. The line continuation can be carried over to subsequent lines, but should respect indentations and de-indentations.
writeln(a + (c \
+ d * (x \
- y) \
+ e))
should work. Additionally, line comments may be placed between such lines (block comments will lead to unintended results):
auto x = a \
+ b \
// foo bar
+ c \
+ d
(this works somewhat implicitly: a semicolon will be inserted, but because it ends up inside a comment, it will not be a problem).
If you want to split a template or function signature declaration across multiple lines, use this trick:
    int f(int a, int b, \
          int c, int d) \
    def:
        return a + b + c + d
(do can also be used instead of def).
- Refactor the convert API to allow unittests of it.
- Once refactored, make the convert functionality available as a CTFE-able function. Together with q{} strings and import() expressions, this could be really cool.
- Add directives, flags and environment variables to enforce indent style (i.e. tabs, spaces, amount, etc.)
- Make dmt self-hosting (dmt.d converted to dmt.dt, and provide a bootstrap binary, or a simplified / older implementation of dmt.d, to do a bootstrap process)
- Parse comments better and handle them properly.
- Improve support for function / method contracts (in, out, do, body); with line continuations it now works, but it would be nice to make it even better
- Syntax highlighting and auto-indent hints for mcedit, vim, emacs and vs code
- Convert to a dub package?
- Enforce the same alignment of case and default in switch. It should be possible to disable this though, because of the ability to add switch case cases using static foreach, for example
- Enforce catch and finally to have the same indent as try, similar to how else needs to have the same indent as if.
- Switch from body to do once the gdc compiler catches up and supports do (instead of body). DMD 2.097.0 will start producing deprecation notices about usage of body. Once this is in gdc, we can also start using body as an identifier, instead of bdy (i.e. in decompose and convert).
- Similarly, once the gdc compiler supports the "new" short-style function contracts (in (AssertExpression), out ([ref] ident; AssertExpression)), convert to using them.
No profiling or deeper optimization has been done on dmt yet, but on my machine, in release mode, it processes 1.14 million lines per second, or 37MB/s of input (this depends quite a lot on the average line length in the source file). Pretty good. This can certainly be improved, but it is plenty fast, and for big projects with many files the conversion process can be fully parallelized in the build system. A moderately complex module with 1000 lines converts in just 5ms.
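As a quick sanity check of these figures, the arithmetic can be written out in D. This is only a back-of-the-envelope sketch: it assumes 1 MB means 10^6 bytes, and the helper names bytesPerLine and linesPerSecond are hypothetical.

```d
import std.stdio : writefln;

/// Average line length implied by a byte rate and a line rate.
double bytesPerLine(double bytesPerSecond, double linesPerSec) {
    return bytesPerSecond / linesPerSec;
}

/// Line rate implied by converting a number of lines in a given time.
double linesPerSecond(double lines, double seconds) {
    return lines / seconds;
}

void main() {
    // 37 MB/s at 1.14 million lines/s is roughly 32 bytes per line.
    writefln("%.1f bytes per line", bytesPerLine(37e6, 1.14e6));
    // 1000 lines in 5 ms is about 200,000 lines per second.
    writefln("%.0f lines per second", linesPerSecond(1000, 0.005));
}
```

The single-module rate comes out lower than the bulk rate. One possible reason is a fixed per-run cost such as process startup, but the text does not say, so treat that as a guess.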
Note that some features familiar from Python are not implemented and not supported. They might be supported in the future, but that will require a more complex parser. Some examples are listed below.
def auto f(a, b,
c, d):
// ...
will not work.
This is unlikely to be implemented. It is quite limiting and can be annoying, but at the same time, some might argue it is a good thing. Just keep your lines reasonably short, or assign sub-expressions to their own variables.
This could be resolved easily for the majority of cases, but probably will not be implemented.
For statements and expressions, as a workaround, simply put everything on a single line, or use line continuations:
auto x = f(a, b, \
c, d)
For function, method, class, templates, and other definitions / declarations, maybe try this:
auto f(a, b, \
c, d) \
def:
// ...
or for functions and methods specifically:
auto f(a, b, \
c, d) \
do:
// ...
(do is equivalent to body).
Other example:
class A : B, \
C!int, \
D!int \
def:
def int f():
return 1
A trick might be to use a dummy private: or a comment.
    def class A:
        private:
    def interface I:
        // nothing
Using {} will not work, because declarations, not a BlockStatement, are expected directly inside aggregates.
Note, that this will definitely not work or be implemented:
if (a): f(a)
if (a): f(a); f(b)
while (a--): writefln(a); f(a)
because it is really tricky to parse without a full D language parser.
Multi-line block comments are limited. No line may be indented relative to the first line of the comment; all lines must be aligned with it:
    /** Foo
     * bar
     */
is not allowed, because the second line is indented more than the first one.
    /** Foo
    ** bar
    **/
could work. Another option is to do this:
    /*
    Foo
    bar
    */
Note, however, that dmt adds a semicolon (;) at the end of each line, so this is equivalent to:
    /*;
    Foo;
    bar;
    */;
Note the semicolon at the very end of the last line. This can matter if you put a comment after, for example, an if statement without curly braces:
    if (a)
    /*
    Foo
    bar
    */
    g()
In this case the body of the if will be EMPTY, and g will be called unconditionally.
Do not do silly things, and just use //-style comments if possible.
As mentioned before, you can opt out of some indenting features, but that makes the code awkward and not nice at all:
    if (a) {
    f(a)
    } else {
    g(a)
    }
Do not do that. Hopefully, in the future dmt will actually reject such constructs.
Multi-line string literals / raw literals, and quoted tokens, will often not work properly:
    string a = "foo
    bar
    baz"
will confuse dmt and generate an error.
The only option is to use the string concatenation operator (~) together with an explicit line continuation (to prevent dmt from emitting a semicolon):
    string a = "foo\n" \
        ~ "bar\n" \
        ~ "baz"
That, however, requires implementing multi-line continuation first in dmt.
Multi-line formatted arrays / lists, like this:
auto a = [
"foo": 1,
"bar": 3,
]
will not work. Sorry.
One of the easier options would be to introduce an explicit keyword, and maybe do this:
def_array auto a = [
]
Comments are not supported after a colon (:)
    foreach (a; l): // iterate list
        f(a)
will not work.
int a = 5
auto l = delegate int(int b):
return a + b
will not work.
int a = 5
auto l = delegate int(int b) {\
return a + b
}
is probably an option, and so is:
    int a = 5
    auto l = delegate int(int b) \
    def:
        return a + b
    ;
(Note the explicit semicolon ; after the def: block, which finishes the assignment statement.)
Another option is to abandon anonymous delegates and define a named inner function:
    int a = 5
    def int f(int b):
        return a + b
    auto l = &f
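For reference, these are the two plain D shapes involved: a delegate literal assigned to a variable, and a named inner function whose address is taken. The wrapper functions viaDelegateLiteral and viaInnerFunction are hypothetical names used only to make the sketch self-contained.

```d
/// Anonymous delegate literal capturing the local variable a.
int viaDelegateLiteral(int b) {
    int a = 5;
    auto l = delegate int(int x) { return a + x; };
    return l(b);
}

/// Named inner function; taking its address also yields a delegate.
int viaInnerFunction(int b) {
    int a = 5;
    int f(int x) { return a + x; }
    auto l = &f;
    return l(b);
}

void main() {
    assert(viaDelegateLiteral(3) == 8);
    assert(viaInnerFunction(3) == 8);
}
```

Both forms give the same kind of value, so the inner-function workaround only costs a name.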
extern (C++) interface IFoo:
...
private class X:
...
To work around this, simply add a def at the start:
def extern (C++) interface IFoo:
...
def private class X:
...
asm can be used, but all instructions and labels must be at the same level of indentation:
asm:
call L1
L1:
pop EBX
mov pc[EBP],EBX ; // pc now points to code at L1
There is no easy way to introduce a new local scope. Just use if (true) for the moment:
    if (true):
        auto x = 6
        writeln(x)
    // writeln(x) // x not in local scope.
or, better, def with nothing after it:
    def:
        auto x = 6
        writeln(x)
    // writeln(x) // x not in local scope.
It is a bit awkward, but works.
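In plain D a local scope is just a bare pair of braces, which is presumably what the def: form with nothing after it turns into. A minimal sketch (the function name scoped is hypothetical):

```d
/// Returns the value computed inside a nested scope.
int scoped() {
    int total = 0;
    {
        auto x = 6;      // x exists only inside this block
        total += x;
    }
    // x is not visible here; using it would be a compile error.
    return total;
}

void main() {
    assert(scoped() == 6);
}
```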
Named and anonymous enum definitions are not currently supported:
enum X:
A,
B,
C
The reason is that they require commas at the end of each line, but dmt inserts semicolons at the end of each such line. A possible solution would be to special-case enum indents, but that requires more than a simple one-level hack, because of things like this:
enum X:
A = 1
version(a):
B = 5
else:
B = 9
One of the options would be to explicitly prefix enumerations with some keyword:
enum X:
enumvalue A = 1
version(a):
enumvalue B = 5
else:
enumvalue B = 9
This should also be possible then (because trailing commas are ok):
enum X:
enumvalue A, B, C
enumvalue D, E, F
As a workaround use line continuations:
enum X:
A, \
B, \
C \
The line continuation marker is required after the last element of the enum too, even if it is at the end of the file. It is safe to de-indent on the next line:
enum X:
A, \
B, \
C \
enum Y:
F, \
To add attributes to unittests, use def:
def @safe nothrow unittest:
{}
/// Bzium
def private unittest:
{}
At the moment, it is required to put parentheses around the conditions, just like in D:
if (a > 5 && b > 3):
f()
but it should not be hard to allow also these forms:
if a > 5 && b > 3:
f()
At the moment this is not supported. It requires a bit of evaluation, so as not to hide possible coding errors like if (a = 5):, which are currently detected by D compilers.
Simply do not use def when declaring @disabled functions:
    class C:
        @disable int foo();
because def requires a colon and opens a new scope:
    class C:
        def @disable int foo():
            return int.init;
and that will most likely upset the compiler. Making the return type void could help.
Unfortunately, dmt at the moment does not support contracts directly.
def int f(int b):
in:
assert(b < 10)
out (ret):
assert(ret < 1000)
do:
return b * b * b
def int f(int b):
in (b < 10)
out (ret; ret < 1000)
body:
return b * b * b
unfortunately will not compile. A workaround is to use line continuations in a slightly hacky but reasonable way:
int f(int b) \
in:
assert(b < 10)
out (ret):
assert(ret < 1000)
do:
return b * b * b
def int f(int b) \
in (b < 10) \
out (ret; ret < 1000) \
body:
return b * b * b
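For orientation, this is what the two contract styles look like in plain D, which is the target the workaround has to reach. It is a sketch, not dmt output; the function names cubeBlock and cubeShort are hypothetical, and do is used in both because newer DMD versions deprecate body.

```d
/// Block-style contracts: in and out are statement blocks.
int cubeBlock(int b)
in {
    assert(b < 10);
}
out (ret) {
    assert(ret < 1000);
}
do {
    return b * b * b;
}

/// Expression-style ("short") contracts.
int cubeShort(int b)
in (b < 10)
out (ret; ret < 1000)
do {
    return b * b * b;
}

void main() {
    assert(cubeBlock(3) == 27);
    assert(cubeShort(9) == 729);
}
```

In both styles the signature, the contracts and the body form one declaration, which is why the dmt version has to join the signature to the first contract with a line continuation.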
A D code like this:
import std.stdio, std.array, std.algorithm;
void main() {
stdin
.byLineCopy
.array
.sort!((a, b) => a > b) // descending order
.each!writeln;
}
is somewhat tricky to convert to the dmt format without introducing ugly code:
import std.stdio, std.array, std.algorithm
def void main():
stdin \
.byLineCopy \
.array \
.sort!((a, b) => a > b) \
.each!writeln
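One way to sidestep long continuation chains is to name the intermediate results, so that every statement fits on one line. The sketch below shows the idea in plain D; the helper sortedDescending is a hypothetical name, and the same statements should carry over to dmt syntax without any trailing backslashes.

```d
import std.algorithm : each, sort;
import std.array : array;
import std.stdio : stdin, writeln;

/// Returns a copy of lines sorted in descending order.
string[] sortedDescending(string[] lines) {
    auto result = lines.dup;
    result.sort!((a, b) => a > b);   // sorts the copy in place
    return result;
}

void main() {
    auto lines = stdin.byLineCopy.array;   // read all input lines
    auto sorted = sortedDescending(lines);
    sorted.each!writeln;
}
```

This trades the fluent chain for a few extra names, which some readers may find clearer anyway.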