This is my multi-machine disassembler.

This is a compilation document; for a usage document, see README.

Ideally, it should work to just type "make".  But the ideal is not
achieved as often as it might be.  In roughly decreasing likelihood (in
my estimation) of their causing trouble, here are some of the issues
you may run into:

 - I've used a semi-private compiler extension (labeled control
   structure), which will produce syntax errors in standard C.  See
   below for more on the philosophy of this and how to deal with it.

 - The Makefile uses the "variable != command" assignment syntax.  If
   this doesn't work for you, either get a real make (like a recent
   Berkeley make) (heh, are my biases showing?) or just replace such
   lines with suitable variable settings, such as CC=gcc.

 - The Makefile runs a command that pipes nm output through sed and an
   awk script to rebuild machines.c.  If you don't have nm, sed, and
   awk, or if your nm produces output in a form that can't easily be
   massaged this way, you may have to do away with this rule and put
   in a hand-written machines.c.

 - The Makefile uses "ld -X -r" to combine multiple .o files into one.
   If you're not on a conventional Unix machine, this may cause
   trouble.  You can probably work around it by removing that target
   and listing the multiple files instead of the single file in the MO
   variable.

 - The code assumes a Unix-style I/O subsystem.  In particular (and,
   possibly, among other things), it does not make any distinction
   between binary and text files; this may cause trouble if you are
   saddled with an OS that does make such a distinction.

 - The Makefile compiles a program and runs it to produce another
   source file.  This normally will not be a problem, but it may cause
   trouble if your CC is a cross-compiler.

 - The code uses gcc-style nested functions.  If your compiler can't
   handle these, the simplest thing to do from my point of view is to
   get and use gcc instead.
   It is probably possible to un-nest them, but it may not be simple
   to do so; I haven't looked into it in any detail.  The
   philosophical remarks below about compiler extensions apply here
   too.

 - If you run into any other problems that you think belong on this
   list, please send me mail!

As for my use of compiler extensions...

The code uses two notable extensions to C, one being gcc-style nested
functions, the other being labeled control structure.

Whether it's appropriate to use nonstandard extensions is arguable; I
see arguments in both directions.  On the one hand, using extensions
makes the resulting code nonportable to compilers that don't also
implement them.  On the other hand, there's no point having extensions
if they never get used.

Using gccisms like nested functions doesn't bother me significantly.
gcc is widely enough available that there's comparatively little risk
it will be unavailable for a target platform (though I realize this
may be small consolation for someone wanting to build the code for a
platform where it's not), and the gccisms improve the expressive power
of the language greatly, making possible a number of fairly clean ways
of doing things that are relatively difficult in standard C.

Using labeled control structure bothers me a bit more.  Like nested
functions, it does not, technically, make possible anything that's
impossible in standard C, but it makes a number of things clearer.
Since my patches to gcc to implement labeled control structure are
publicly available, anyone who has gcc to start with (see the previous
point) can get labeled control structure support - though it may
require porting, or installing another gcc version, if the gcc version
at hand is too different from the ones I've added the support to.  And
I'd like to see such support spread; I think the extension is useful
enough that it deserves wider support.  Thus, even though it bothers
me more, I still come down in favor of using such tags.

For those who want the relevant gcc patches, they can be had in
various ways.  They can be found in my gitification of NetBSD's source
tree, which can be got via git clone on
git://git.rodents-montreal.org/Mouse/netbsd-fork/5.2/src,
.../4.0.1/src, or .../1.4T/src, depending on which version you want;
the relevant gcc versions are, respectively, 4.1.3, 4.1.2, and
egcs-1.1.2.  You can find the relevant commit by looking for a commit
whose message is "Add labeled control structure to gcc.".

For those who'd rather not grub about in their gcc, I've also written
a program designed to be run over preprocessor output, to convert most
uses of labeled control structure to something stock gcc can
understand.  This program is also available as a git repo,
git://git.rodents-montreal.org/lcs-cvt in this case.  It's designed to
be used as in

        gcc -E foo.c -o tmp.i
        lcs-cvt tmp.i
        gcc -c -o foo.o tmp.i

For those who would prefer to remove the control structure labels by
editing the code, or who want to put them into a different compiler,
here's a brief sketch of how they work.

Some control structure constructs - while, do, for, and switch - can
be labeled with strings.  These strings appear in < > immediately
following the introductory keyword, as in

        for <"foo"> (i=0; i<100; i++) { ... }
        switch <"my top"> (cmdchar()) { ... }
        do <"err"> { ... } while (0);

Certain other constructs - break and continue statements, case labels,
and default: labels - can have a similar tag:

        break <"err">;
        continue <"foo">;
        case <"my top"> 99:
        default <"main">:

In each case, the tag serves to indicate which control structure
statement the tagged construct is to apply to.  For example, in

        for <"foo"> (i=0;i<100;i++) {
            ...
            switch (array[i]) {
                ...
                break <"foo">;
                ...
            }
            ...
        }

the tag causes the break to be associated with the for loop rather
than the switch, thus obviating the need to either use a goto or add
an otherwise unnecessary boolean variable.

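For comparison, here is a rough sketch of what those two standard C
workarounds (a goto, or an extra boolean variable) can look like for a
loop of the shape shown above.  The function names and the counting
task are invented purely for illustration; neither is taken from the
disassembler's source, and this is only one plausible way to write
each form.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical task: count the entries that precede the first 0 in the
 * array, not counting entries equal to 1.  With labeled control
 * structure, the "case 0" arm would simply be: break <"foo">;
 */

/* Workaround 1: a goto to a label placed just after the loop. */
int count_until_stop_goto(const int *array, size_t n)
{
    size_t i;
    int count = 0;

    assert(array != NULL);
    for (i = 0; i < n; i++) {
        switch (array[i]) {
        case 0:
            goto done;          /* stands in for: break <"foo">; */
        case 1:
            break;              /* leaves only the switch */
        default:
            count++;
            break;
        }
    }
done:
    return count;
}

/* Workaround 2: an otherwise unnecessary boolean variable. */
int count_until_stop_flag(const int *array, size_t n)
{
    size_t i;
    int count = 0;
    int stop = 0;               /* exists only to get out of the loop */

    assert(array != NULL);
    for (i = 0; i < n && !stop; i++) {
        switch (array[i]) {
        case 0:
            stop = 1;           /* checked by the loop condition */
            break;
        case 1:
            break;
        default:
            count++;
            break;
        }
    }
    return count;
}
```

Both functions should return the same value for any given input; the
difference is only in how the exit from the loop is spelled.  Note
that in the flag version, any code placed after the switch inside the
loop body would still run once after the flag is set, unless it is
guarded as well.
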
When searching for a matching tag, inappropriate control structure
types are ignored.  For example, a tagged continue ignores switch()
tags, even if they match textually; a tagged case label ignores
everything but switches.  (I may change this someday, to error if the
nearest enclosing matching tagged structure is of the wrong type.)

The scope of a tag is the body of the tagged construct.  Two
constructs with the same name will never produce an error; if they are
not nested, there is no ambiguity, and if they are nested, a reference
to the tag from within both will refer to the innermost.

It is always possible to eliminate tags by sufficient use of gotos.
Tagged breaks and continues simply turn into a goto to a label placed
appropriately with respect to the matching construct; tagged case and
default labels can be moved to the outermost level of their containing
switch and then followed by a goto to the place they were moved from.
I don't like writing gotos in the original source instead, though:

 - The scope of a goto label is the entire containing function
   (admittedly, it's possible to restrict this in gcc).  This means
   that when reading the code, you can't know where control might goto
   that label from without reading the whole function.  It also means
   that you can't use the same label twice in any given function
   (problematic for use in macros).

 - gotos kill optimization more effectively than the corresponding
   "structured" constructs.

 - Even if a goto label is carefully and appropriately named, as in
   "goto break_array_search_loop", the risk exists that someone
   (perhaps not the person who initially wrote it) used it for some
   other purpose.  A tagged control structure tag _can't_ be used
   outside of its scope.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mouse@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B