BooMake

Boo as a Build Language

I recently read Martin Fowler's article on Rake and became inspired to try using Boo as a build language. As you will see, it has many of the same advantages as Ruby as a host language for DSLs (Domain-Specific Languages).

Build languages like make or nant are classic DSLs, although the term little language has been around a lot longer and in fact is an important part of the Unix development philosophy of specialized tools that do one thing well. Whether a language is considered 'little' or 'big' is often a matter of opinion. LISP, Smalltalk, even C, are all small languages (measured in terms of syntax and minimum deployment size) but they have been used for Big Things. (A lame old-fashioned language like Fortran is bigger than C, since it implements I/O as part of the language, but that becomes a limitation.) People revel in fine distinctions and no doubt it is a harmless sport.

Rake is an embedded DSL because it is hosted within a big language. That offers big benefits to both developer and user; the developer doesn't have to muck around reinventing a high-level programming language, and the user has the comfort and power of a tested programming language. It's easy to do unusual things without having to extend the language. Consider make; its power comes from having all those zillions of Unix commands floating around - it is much less powerful in non-POSIX environments. And one usually gets a bizarre mixture of make and shell script which is (shall we say) less than straightforward to maintain. nant is certainly cleaner, and one can write custom tasks in CLI languages, but one still has to 'drop out of the DSL' (as Fowler puts it) to do non-trivial things. XML is excellent for specifying hierarchical data, but not so good as a syntax for programming.

It is self-evident that some languages are better at implementing DSLs than others. This is often characterized as a static/dynamic, build/script thing, but it is the more forgiving and less dogmatic syntax of dynamic languages which makes them shine at this task. Here is a build script written in Boo, to do a very make-ish task. (Note that although there is a lot of type-inference going on, it is a statically-typed program that can be built as an executable.)

import BooMake
OBJS = ['one.o','two.o']
if "-debug" in argv:
	CFLAGS = '-g'
else:
	CFLAGS = '-O2'
LFLAGS = ''
NAME = 'test-one.exe'

# rule to make .o files from .c files
rule('.c','.o') def(f):
	exec("gcc -c ${CFLAGS} ${f}")

target(NAME,OBJS) do:
	exec("gcc -o ${NAME} ${LFLAGS}",OBJS) 

# these use the generic rule defined above...
target('one.o',['one.c','one.h','two.h'])
target('two.o',['two.c','one.h'])

# unconditional targets can of course contain any Boo code.
# But it's nice to have a few 'builtins' like delete() available
# for common tasks.
target('clean') do:
	delete(OBJS)

go(argv)

Apart from the first line (import the necessary library) and the last line (start the program), it looks very much like a make script written using Python syntax. The conventions are the same; the first target is the executable, and this will be built by default. 'test-one.exe' explicitly depends on the object files 'one.o' and 'two.o', and the action is to shell out to GCC. The object files depend on their corresponding source files, and some header files; the action is implicitly supplied by a custom rule.
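
How an implicit rule might be matched to a target such as 'one.o' can be sketched outside Boo. The following is a minimal Python model, not BooMake's actual code: the names RULES, rule and find_action are hypothetical, and it assumes a rule is selected by the target's extension and applied to the first dependency with the matching source extension.

```python
import os

# Hypothetical registry: (source_ext, target_ext) -> action taking the source file.
RULES = {}

def rule(src_ext, tgt_ext, action):
    """Register an implicit rule, e.g. '.c' -> '.o'."""
    RULES[(src_ext, tgt_ext)] = action

def find_action(target, deps):
    """Return (action, source) for the first dependency whose extension
    pairs with the target's extension in a registered rule, else None."""
    tgt_ext = os.path.splitext(target)[1]
    for dep in deps:
        src_ext = os.path.splitext(dep)[1]
        action = RULES.get((src_ext, tgt_ext))
        if action is not None:
            return action, dep
    return None

# The action here only builds the command string instead of running it.
rule('.c', '.o', lambda f: "gcc -c -O2 " + f)

found = find_action('one.o', ['one.c', 'one.h', 'two.h'])
command = found[0](found[1])
```

Under these assumptions the headers are skipped because no ('.h', '.o') rule exists, and a target with no matching rule (such as 'clean') simply gets no implicit action.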

There is some extra syntax involved; we need to quote strings and put them in lists, and to remember not to say $(var), etc. But generally this is pretty readable; the syntax of Boo does not get in the way. Curiously enough, it is the Ruby-like parts of Boo which make it such an excellent host for this mini-language: the string interpolation (with $ instead of #) and the closures. It would not work half as well with Python, because Python closures ('lambdas') are restricted to a single expression and so cannot hold a multi-statement action. Why? Ultimately, because Guido can't see the point of multi-line lambdas. (It appears that he would like to get rid of lambdas altogether, but the protests would be too great.)

Making Builds Compiler-Agnostic

Non-trivial make tasks become easier because a BooMake script can contain arbitrary Boo code. Consider the task of building this little application with two very different compilers, GCC and Microsoft CL. The compiler-independent part looks like this:

NAME = 'test-one.exe'
OBJS = mapobj(['one','two'])

target(NAME,OBJS) do:
	build(NAME,OBJS)

# these use the generic rule defined above...
otarget('one',['one.c','one.h','two.h'])
otarget('two',['two.c','one.h'])

target('clean') do:
	delete(OBJS)

Since CL and GCC use different extensions for object files, they cannot be specified explicitly. We need a utility function and a custom target rule which put in the extension:

otarget = def(name as string,deps as List):
	target(name+EXT,deps)
	
mapobj = {ls | [f + EXT for f in ls]}

The implicit rule for building object files, and the explicit rule for building the executable can be specified conditionally:

USE_CL = '-ms' in argv
DEBUG = "-debug" in argv

if USE_CL:
	EXT = '.obj'
	if DEBUG:
		CFLAGS = '/Zi'
	else:
		CFLAGS = '/O2'		
	rule('.c','.obj') def(f):
		exec("cl /nologo /c ${CFLAGS} ${f}")
	build = def(name,obj as List):
		exec("cl /nologo /Fe${name} ",obj)
else:
	EXT = '.o'
	if DEBUG:
		CFLAGS = '-g'
	else:
		CFLAGS = '-O2'
	# rule to make .o files from .c files
	rule('.c','.o') def(f):
		exec("gcc -c ${CFLAGS} ${f}")
	build = def(name,obj as List):
		exec("gcc -o ${name} ",obj)

So everything except the compiler-independent part is reusable and hides the nasty details of building C programs on different platforms. A fully general solution would be tedious (there are a lot of flags for C compilers!) but it would be possible.

I don't doubt make could be persuaded to do this, with plenty of bash script. But the result would not be as pretty or as maintainable.

A Non-trivial .NET Program: Sciboo

Generally, BooMake is intended for building CLI programs, but the principles remain the same. Here is how I set about building Sciboo, a Boo application which uses ScintillaNET (a C# wrapper around the Scintilla edit control).

First, some constants are defined. BooMake supplies a function list which creates a list from a space-separated list of files. (Note the convenience of triple-double-quoted verbatim strings.)

SCIBOO = 'sciboo.exe'
SCIBOO_SRC = list("""
sciboo.boo findandrun.boo futils.boo modefile.boo commands.boo
persistance.boo scintillaex.boo tabform.boo
""")
SCIBOO_RESOURCES='sciboo.resources'
RESOURCEGEN = 'write-icon-resources.boo'
PLUGINS = list("ctags.scx complete.scx word-complete.scx scriptlet.scx macro.scx help.scx ctrlq.scx")
SCNETDLL = 'ScintillaNET.dll'
SCNETDIR = 'ScintillaNET'

Sciboo is extendable with plugins, which are .NET assemblies with a '.scx' extension. They are loaded dynamically and need to link to the Sciboo executable itself. Generally they need to be rebuilt after the main application. Here is an implicit rule for making extensions with Boo; you will always have the built-in variable TARGET available when implementing rules:

rule('.boo','.scx') def(f):
	exec("booc -out:${TARGET} -r:${SCIBOO} -r:${SCNETDLL} ${f}")

The make script needs to specify a target for rebuilding everything, and a specific rule for building Sciboo:

target("all",[SCIBOO] + PLUGINS)

target(SCIBOO,SCIBOO_SRC + [SCIBOO_RESOURCES,SCNETDLL]) do:
	exec("booc -t:winexe -out:${SCIBOO} -r:${SCNETDLL} -resources:${SCIBOO_RESOURCES}",SCIBOO_SRC)

exec is overloaded; this version takes an extra argument which is a list and expands it; it's equivalent to appending ' ' + join(ls) to the string.
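
The list-expanding form of exec can be modelled in a few lines. This is a Python sketch of the string handling only; the name expand_command is hypothetical, and the real exec presumably goes on to run the command, which is left out here.

```python
def expand_command(cmd, files=None):
    """Return cmd with the file list appended, separated by spaces."""
    if not files:
        return cmd
    return cmd + ' ' + ' '.join(files)

line = expand_command("booc -t:winexe -out:sciboo.exe",
                      ["sciboo.boo", "futils.boo"])
```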

ScintillaNET is built from the C# source:

target(SCNETDLL,glob(SCNETDIR,'*.cs',true)) do:
	exec(CSC  + targetname(SCNETDLL) + recurse(SCNETDIR) + unsafe())

The convenience function glob makes a list of all the specified files, recursing into directories if the last argument is true; the source files are the only dependencies of ScintillaNET.
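
For readers who want to see what such a helper involves, here is a rough Python equivalent. It is a sketch under assumptions: glob_files is a hypothetical name, and the argument order (directory, pattern, recurse flag) simply mirrors the call above rather than BooMake's source.

```python
import fnmatch
import os

def glob_files(directory, pattern, recursive=False):
    """List files under directory whose names match pattern.
    Subdirectories are searched only when recursive is true."""
    matches = []
    for root, dirs, files in os.walk(directory):
        for name in sorted(files):
            if fnmatch.fnmatch(name, pattern):
                matches.append(os.path.join(root, name))
        if not recursive:
            break          # only the top-level directory was requested
        dirs.sort()        # visit subdirectories in a stable order
    return matches
```

A call such as glob_files('ScintillaNET', '*.cs', True) would then produce the dependency list for the DLL target.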

When writing BooMake, I often missed named parameters. The closest Boo comes to this is the ability to put public field assignments in the constructor call for a class, which could be used here, although having to create a class to call a function seems inelegant. The exec call here shows a compromise; the detailed flag syntax is handled by convenience functions like targetname and recurse.

The resources are built in a somewhat eccentric fashion, by calling a Boo script.

target(SCIBOO_RESOURCES,[RESOURCEGEN]) do:
	exec("booi ${RESOURCEGEN}")

write-icon-resources.boo is not a complicated piece of code, and we could have put it inline at this point. The problem then would be how to tell when the resources were out-of-date.

Finally, we need to build the plugins. This is something that would strain make's build model, but in BooMake target is just a function, so it can be called in a loop for every plugin in the list.

# generate the plugin targets..
for f in PLUGINS:
	target(f,[change_extension(f,'.boo'),SCIBOO,SCNETDLL])

So a fairly complex custom build procedure can be written in 34 lines, which is cool. I don't believe in compression for its own sake, of course; line count has become a kind of inverse measure of programmer efficiency, and can thus become a perverse incentive. But this script reads as a straightforward description of the targets of the project, how they depend on each other, and how they must be built.

Building Boo without nant

Here's another non-trivial build task: rebuilding Boo from source. This is usually done by nant, and it has many complex subtasks, like running antlr, etc. This version just builds the assemblies from the source, which is contained in sub-directories of 'src'. These subdirectories are named after the assemblies.

import BooMake

die("not in the Boo bin directory!") unless exists("booc.exe")

SOURCE=combine('..','src')
CSC='csc /nologo'  # can override this...
TARGETS = [] # will be filled from the following targets...
# these all have corresponding directories under src:
BooLang = 'Boo.Lang.dll'
BooLangCompiler = 'Boo.Lang.Compiler.dll'
BooLangInterpreter = 'Boo.Lang.Interpreter.dll'
BooLangParser = 'Boo.Lang.Parser.dll'
BooLangUseful = 'Boo.Lang.Useful.dll'
Booc = "booc.exe"
Booi = "booi.exe"
Booish = "booish.exe"
Resources = "strings.resources"
ResourceSrc = "../src/Boo.Lang/Resources/strings.txt"

The die call is a nice example of Boo in its Perl mood; it ensures that this program will not be run unless we are in the Boo bin directory. Please note that TARGETS is empty; we'll have to fill it in!

The Boo source is mostly C#, but some assemblies are in Boo. Here is a custom target to handle the C# cases:

filter = def(list,ext):
	return [f for f in list if extension(f) == ext]

csctarget = def(name,deps):
	path = combine(SOURCE,filepart(name))
	TARGETS.Add(name)	
	res = filter(deps,".resources")
	refs = filter(deps,".dll")	
	target(name,glob(path,'*.cs',true) + (deps as List)) do:
		exec(CSC + targetname(name) + targetrefs(refs) + resources(res) + recurse(path))

For example, Boo.Lang.dll has a source directory '../src/Boo.Lang', and has ['strings.resources'] as its dependency list. So res has one member, and refs is empty; the BooMake functions targetrefs and resources return empty strings if they're passed empty lists.

There is one ugliness; Boo can't deduce that deps is a list on its own. We could have declared it explicitly in the parameter list, but a typecast does just as well. (All these examples are built with Boo's usual static typing - duck-typing has not been switched on.)

The custom Boo target is similar; the compiler doesn't understand '-recurse:' so we have to feed it the files explicitly:

bootarget = def(name,refs):
	path = combine(SOURCE,filepart(name))
	TARGETS.Add(name)
	files = glob(path,'*.boo',true)
	target(name,files + (refs as List)) do:
		exec('booc' + targetname(name) + targetrefs(refs), files)

(There are a few opportunities here for the obsessive refactorizer.)

Here's the rest:

target("all",TARGETS)

target(Resources,[ResourceSrc]) do:
	exec("resgen ${ResourceSrc} ${Resources}")
		
csctarget(BooLang,[Resources])
csctarget(BooLangCompiler,[BooLang])
bootarget(BooLangInterpreter,[BooLang])
bootarget(BooLangUseful,[BooLang,BooLangParser])
csctarget(Booc,[BooLang,BooLangCompiler,BooLangParser])
bootarget(Booi,[BooLang,BooLangCompiler])
bootarget(Booish,[BooLang,BooLangCompiler,BooLangParser])

What would be very interesting is to rewrite the full nant build (1091 lines) in BooMake, and see how this compact notation compares in readability.

The ability to do non-trivial programming in a build language leads to interesting applications. For instance, the makefile for SciTE (my favourite editor on Linux) includes a machine-generated file of dependencies in make format:

DirectorExtension.o: DirectorExtension.cxx \
  ../../scintilla/include/Platform.h ../../scintilla/include/PropSet.h \
  ../../scintilla/include/SString.h ../../scintilla/include/Scintilla.h \
  ../../scintilla/include/Accessor.h ../src/Extender.h \
  DirectorExtension.h ../src/SciTE.h ../src/SciTEBase.h
...

Now why not just read this file and make the target calls? The backslashes are no problem; here's a generator method Lines which will feed us the reconstructed full lines. Just for fun I've written it in 'whitespace agnostic Boo' syntax (which requires the -wsa compile flag):

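import System.IO    # StreamReader
import System.Text  # StringBuilder

# yield logical lines, joining physical lines that end with a backslash
def Lines(file as string):
	inf = StreamReader(file)
	sb = StringBuilder()
	for line in inf:
		if line.EndsWith("\\"):
			# drop the continuation backslash and keep accumulating
			sb.Append(line[0:-1])
		else:
			# last physical line of the entry: keep it whole
			sb.Append(line)
			yield sb.ToString()
			sb = StringBuilder()
		end
	end
end

for line in Lines('deps.mak'):
	targetFile, depStr = /:/.Split(line,2)
	target(targetFile,list(depStr))
	OBJS.Add(targetFile)
end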

Seen in this unfamiliar guise, Boo starts looking like Ruby on .NET.
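
For comparison, the same idea of joining continuation lines and splitting each entry into a target and its dependencies can be sketched in Python. The function names joined_lines and parse_dependency are hypothetical, and the sample data is a shortened version of the SciTE fragment shown earlier.

```python
def joined_lines(lines):
    """Yield logical lines, merging physical lines that end in a backslash."""
    parts = []
    for raw in lines:
        line = raw.rstrip('\n')
        if line.endswith('\\'):
            parts.append(line[:-1])      # drop the continuation backslash
        else:
            parts.append(line)
            yield ''.join(parts)
            parts = []
    if parts:                            # input ended on a continuation line
        yield ''.join(parts)

def parse_dependency(line):
    """Split 'target: dep1 dep2 ...' into (target, [deps])."""
    target, dep_text = line.split(':', 1)
    return target.strip(), dep_text.split()

sample = [
    "DirectorExtension.o: DirectorExtension.cxx \\\n",
    "  ../src/Extender.h \\\n",
    "  DirectorExtension.h ../src/SciTE.h\n",
]
parsed = [parse_dependency(l) for l in joined_lines(sample)]
```

Each (target, deps) pair could then be handed to whatever plays the role of target in the host build script.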

The full makefile for SciTE is an example of something which is very platform-specific. It has to do a little shell magic to find out the GTK version, and whether Gnome is around, etc. An interesting exercise would be to write one make script which can build SciTE on all platforms.

In Conclusion

BooMake is currently just under 300 lines, and performs well at its limited role in life. The actual meat of the code is in the first 140 lines, which define the classes Rule and Target. The basic operation is straightforwardly recursive: we look at all the dependencies of a target and ask them to update themselves. Then we ask whether the target file is older than any of its dependencies; if so, the action is fired. Afterwards, all targets are examined, and those that list the rebuilt target among their dependencies are updated.
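
The recursive update described above can be sketched as follows. This is a Python model written from that description, not a translation of BooMake's Target class: the registry TARGETS, the helper file_time and the return value of update are all assumptions made for illustration, and the final pass over dependent targets is omitted.

```python
import os

TARGETS = {}   # hypothetical registry: file name -> Target

def file_time(path):
    """Modification time, or -inf for a missing file so that it counts as stale."""
    return os.path.getmtime(path) if os.path.exists(path) else float('-inf')

class Target:
    def __init__(self, name, deps=(), action=None):
        self.name, self.deps, self.action = name, list(deps), action
        TARGETS[name] = self

    def update(self):
        """Update dependencies first, then fire the action if the target is stale.
        Returns True if the action ran."""
        for dep in self.deps:
            if dep in TARGETS:            # plain source files have no Target
                TARGETS[dep].update()
        if not self.deps:
            stale = True                  # unconditional target such as 'clean'
        else:
            mine = file_time(self.name)
            stale = (mine == float('-inf')
                     or any(file_time(d) > mine for d in self.deps))
        if stale and self.action is not None:
            self.action(self)
            return True
        return False
```

Calling update on the default target walks the whole dependency graph, so object files are refreshed before the executable that links them.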

There are some command-line flags which are handled specially; if '-test' is found, the actual commands will not be executed, but the target files touched instead. '-verbose' makes BooMake show every timestamp comparison.

'-debug' will result in '-debug+' being added when targetname expands.
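
The '-test' behaviour is essentially a dry run that keeps the timestamps consistent. A minimal Python sketch of that idea follows; the function name run and its dry_run parameter are hypothetical, and nothing here is taken from BooMake's source.

```python
import os
import subprocess

def run(command, target, dry_run=False):
    """Run a shell command, or in dry-run mode just touch the target file.
    Returns the command's exit code (0 for a dry run)."""
    if dry_run:
        with open(target, 'a'):
            os.utime(target, None)   # create if missing, then refresh its timestamp
        return 0
    return subprocess.call(command, shell=True)
```

Touching the target rather than doing nothing means a second dry run sees the target as up to date, which makes it possible to check the dependency logic without invoking a compiler.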

There are some gotchas which come from how Boo closures work. In my first attempt at the Sciboo build script, I put the explicit rule in like so:

for f in PLUGINS:
	sourceFile = change_extension(f,'.boo')
	target(f,[sourceFile,SCIBOO,SCNETDLL]) do:
		exec("booc /out:${f} /r:${SCIBOO} /r:${SCNETDLL} ${sourceFile}")

That's cute, but it doesn't work, for an interesting reason. The closure holds a reference to the variable sourceFile itself (and likewise f); it does not capture the value the variable has when the closure is created. The closures only run later, once the loop has finished, so every one of them sees the last value assigned, which is not what we need here!
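
Python closures bind late in the same way, so the problem and one common remedy can be demonstrated there. This is an illustrative sketch only: change_extension is reimplemented here as a stand-in for the BooMake helper, make_action is a hypothetical name, and the actions return the file name instead of running a compiler.

```python
def change_extension(name, ext):
    """Stand-in for the BooMake helper: replace the final extension."""
    return name.rsplit('.', 1)[0] + ext

PLUGINS = ['ctags.scx', 'complete.scx', 'macro.scx']

# Broken: every closure shares the loop's single source_file variable.
broken = []
for f in PLUGINS:
    source_file = change_extension(f, '.boo')
    broken.append(lambda: source_file)

# One fix: a factory function gives each closure its own variable.
def make_action(path):
    return lambda: path

fixed = [make_action(change_extension(f, '.boo')) for f in PLUGINS]

broken_results = [action() for action in broken]   # all see the last value
fixed_results = [action() for action in fixed]     # one value per plugin
```

The implicit-rule version shown earlier avoids the issue altogether, because the rule receives the file name as a parameter when it runs.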

To be a useful production tool, there are some gaps that need to be filled. It should know what .NET/Mono versions are installed and find the appropriate compiler (csc or mcs). Currently it's awkward switching between a plain exe and a winexe target. There are probably a few useful helper functions which need to go in to avoid the inconsistency of having to import System.IO explicitly. You can download the current source and examples here.

However, BooMake was not intended to be immediately useful; it is a demonstration that Boo is well suited to the task of creating embedded DSLs.