ZAPF

Version 0.3, April 2010

Introduction

ZAPF, the Z-machine Assembler Program of the Future, is an assembler for the Z-machine interactive fiction platform. It provides nearly complete control over the Z-machine's memory layout and supports two assembly syntaxes: the default syntax, which is similar to the original ZAP used by Infocom, and an optional syntax similar to Inform's assembler.

ZAPF is a managed application and has been tested under Microsoft .NET (on Windows) as well as Mono (on Linux and Mac OS X).

To use ZAPF, you should be familiar with the Z-machine architecture and instruction set. Refer to the Z-Machine Standards Document if not, but note that the Standards Document uses the Inform opcode names (see the "-i" switch below).

Usage

The simplest way to assemble a file called "foo.zap" is with the command:

zapf foo.zap

Or, if using Mono:

mono zapf.exe foo.zap

This will use the default (Infocom) syntax and generate an output file named according to the Z-machine version, for example "foo.z3".

More options are available: start ZAPF with no parameters for details. In particular, you can change the output filename by specifying a new name after the input filename, and you can select the Inform syntax by specifying the "-i" switch before the input filename. You can change the Z-machine version with the "-v" switch, but the .NEW directive is preferred (see below).

Syntax

A ZAPF input file consists of comments, labels, directives, and instructions. One instruction or directive is allowed per line. Comments and labels may appear on any line, even lines with no instruction or directive. Blank lines are ignored.

Note: directives, instructions, labels, and all other names in ZAPF are case-sensitive.

Comments

Comments are ignored by the assembler. A comment begins with a semicolon and continues until the end of the line:

; This is a comment all by itself
ADD X,Y >Z        ; This is a comment after an instruction

Labels

Labels associate a name with a location in the output file. A label consists of a word followed by one or two colons. A label may appear before an instruction or directive, or by itself, but only one label may appear on a line.

A label with one colon is "local" and can only be referenced within the same routine (see the .FUNCT directive below). The name can be reused in other routines. Local labels may not be defined before the first .FUNCT directive.

A label with two colons is "global" and can be referenced from anywhere else, thus the name must be unique within the whole program. On Z-machine versions 3 and 4, certain global labels have special meaning and must be defined somewhere in the program: see "Version Considerations" below.

Directives

Directives are special commands to the assembler. Some directives cause data to be written to the output file; others merely affect how other parts of the file are interpreted. Directive names must always be given in uppercase.

Some directives take one or more expressions as parameters. Such an expression can be either a number (a positive or negative decimal integer), a global symbol (the name of a global label, object, constant, etc.), or the sum of two or more numbers or constant names connected by "+" signs.
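
As a rough illustration of the expression rules above, the following Python sketch evaluates a sum of decimal numbers and symbol names. It is not ZAPF's own code: the function name eval_expression and the sample symbols FLAG-A and FLAG-B are invented for the example. It splits only on "+", because a hyphen may be part of a name or the sign of a number.

```python
def eval_expression(text, symbols):
    """Evaluate terms joined by '+'; each term is a decimal integer or a symbol name."""
    total = 0
    for term in text.split("+"):
        term = term.strip()
        try:
            total += int(term)        # a positive or negative decimal integer
        except ValueError:
            total += symbols[term]    # otherwise, look the name up (KeyError if undefined)
    return total


# Hypothetical symbol table, as if defined earlier with NAME=value lines.
symbols = {"FLAG-A": 16, "FLAG-B": 4}
print(eval_expression("FLAG-A+FLAG-B+1", symbols))   # prints 21
```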

Some directives take a string as a parameter. Strings are delimited by quotation marks and may contain line breaks. If a string contains a quotation mark, the quotation mark must be doubled.

Some directives take one or more names as parameters. Names must be words containing only A-Z (uppercase or lowercase), digits 0-9, and specific punctuation: hyphen (-), dollar sign ($), hash mark (#), ampersand (&), or period (.). In the default syntax mode, question mark is also allowed, and apostrophe is allowed except at the beginning of the name; in Inform syntax mode, question mark and apostrophe are forbidden, but underscore is allowed instead.

= (equal sign)

<name>=<expression>

Defines the specified name as a global constant whose value is given by the expression. The name may then be used later in the file in place of the expression.

.BYTE

.BYTE <expression> [,<expression>,...]

Writes one or more data bytes to the output file.

If a variable name is given as one of the expressions, the variable's number will be written, not its value.

.END

.END

Marks the end of the program.

.ENDI

.ENDI

Marks the end of an inserted file.

.ENDT

.ENDT

Marks the end of a table. If an expected size was supplied in the matching .TABLE directive, and the actual size of the table doesn't match, ZAPF will print a warning message.

.FSTR

.FSTR <name>,"string"

Writes an encoded string to the output file, and defines the specified name as a global symbol pointing to it (a word address, suitable for use in the WORDS table). If necessary, a zero byte will be written first to ensure that the string starts at an even address.

The string is also entered into the internal abbreviation table and automatically used to abbreviate game text. All abbreviations must be defined before any code or data that contains strings.

Note: this directive should not be used inside the WORDS table.

.FUNCT

.FUNCT <routine name> [,<local name> [=<expression>],...]

Writes a routine header to the output file, and defines the specified name as a global symbol pointing to it (a packed address, suitable for use with a CALL instruction). If necessary, one or more zero bytes will be written first to ensure that the routine starts at an address divisible by 2, 4, or 8 (depending on the Z-machine version).

This directive also clears any local symbols that were defined previously. If any additional names are specified after the routine name, they will be defined as local variables. On Z-machine versions 3 and 4, expressions may also be given to define the initial values for the local variables; on later versions, local variables are always initialized to zero, and ZAPF will print a warning if any default values are given.

.GSTR

.GSTR <name>,"string"

Writes an encoded string to the output file, and defines the specified name as a global symbol pointing to it (a packed address, suitable for use with a PRINT instruction). If necessary, one or more zero bytes will be written first to ensure that the string starts at an address divisible by 2, 4, or 8 (depending on the Z-machine version).
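
The padding that .FUNCT and .GSTR insert can be sketched in Python as follows. The scale factors are taken from the Z-Machine Standards Document rather than from ZAPF's source, and versions 6 and 7 are left out because their packed addresses also involve a header offset. The names PACK_SCALE and align_packed are invented for this example.

```python
# Packed-address scale factor per Z-machine version (Standards Document).
# Versions 6 and 7 are omitted: they add a header offset to packed addresses.
PACK_SCALE = {3: 2, 4: 4, 5: 4, 8: 8}


def align_packed(address, version):
    """Return (zero bytes of padding, packed address) for data placed at or after `address`."""
    scale = PACK_SCALE[version]
    padding = -address % scale            # bytes needed to reach the next multiple of scale
    return padding, (address + padding) // scale


print(align_packed(0x1235, 3))   # prints (1, 2331)
print(align_packed(0x1235, 5))   # prints (3, 1166)
print(align_packed(0x1238, 8))   # prints (0, 583)
```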

.GVAR

.GVAR <name> [=<expression>]

Defines the specified name as a global symbol pointing to the next unused global variable slot, and writes the variable's initial value to the output file. If an expression is given, it will be used as the initial value; otherwise the initial value will be zero. This directive should be used in the GLOBAL table.

.INSERT

.INSERT "filename"

Assembles the specified file in place of this directive, then resumes at the next line of the current file. The inserted file should end with a .ENDI directive.

If a file with this exact name is not found, ZAPF will try adding a ".zap" or ".xzap" extension before finally giving up.

.LEN

.LEN "string"

Encodes a string (without writing it to the output file), then writes a byte to the output file indicating the number of words taken up by the encoded form of the string.

.NEW

.NEW <expression>

Sets the Z-machine version number. Acceptable values range from 3 to 8.

.OBJECT

Z-machine version 3:

.OBJECT <name>,<flags1>,<flags2>,<parent>,<sibling>,<child>,<properties>

Z-machine versions 4 and up:

.OBJECT <name>,<flags1>,<flags2>,<flags3>,<parent>,<sibling>,<child>,<properties>

Writes an object record to the output file, and defines the specified name as a global symbol pointing to the next unused object number. This directive should be used in the OBJECT table.

All parameters after the name are expressions whose values are written into the object record. Typically, flags1, flags2, and flags3 are constants or sums of constants; parent, sibling, and child are object names; and properties is a global label pointing to a property table defined elsewhere.

.PROP

.PROP <length>,<number>

Writes a property header to the output file. The parameters are expressions giving the length (in bytes) of the property data which follows and the property number, respectively. This directive should be used in property tables referenced by the .OBJECT directive.

Note: this directive does not begin or end the property table. The property table must begin with a length-prefixed string (see .STRL) and end with .BYTE 0.

.STR

.STR "string"

Writes an encoded string to the output file.

.STRL

.STRL "string"

Writes an encoded string to the output file, prefixed by a byte indicating the number of words taken up by the encoded string. This is equivalent to .LEN followed by .STR for the same string.
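
To make the relationship between .LEN, .STR, and .STRL concrete, here is a simplified Python sketch of the string encoding described in the Z-Machine Standards Document: three 5-bit Z-characters per 16-bit word, padding with Z-character 5, and the top bit set on the last word. It is not ZAPF's implementation. It handles only lowercase letters and spaces, ignores abbreviations, and its treatment of an empty string (one word of padding) is an assumption. The function names are invented for the example.

```python
import struct


def zchars_for(text):
    """Map lowercase letters and spaces to Z-characters (alphabet A0 only)."""
    out = []
    for ch in text:
        if ch == " ":
            out.append(0)
        elif "a" <= ch <= "z":
            out.append(ord(ch) - ord("a") + 6)   # 'a' is Z-character 6
        else:
            raise ValueError("this sketch handles only a-z and space")
    return out


def pack_words(zchars):
    """Pack Z-characters three to a word and mark the final word."""
    z = list(zchars)
    while not z or len(z) % 3:
        z.append(5)                               # pad (empty string: assumed one padded word)
    words = [(z[i] << 10) | (z[i + 1] << 5) | z[i + 2] for i in range(0, len(z), 3)]
    words[-1] |= 0x8000                           # end-of-string bit
    return words


def strl(text):
    """Bytes that .STRL would emit: a word count, then the encoded words (big-endian)."""
    words = pack_words(zchars_for(text))
    return bytes([len(words)]) + b"".join(struct.pack(">H", w) for w in words)


print(strl("hello").hex())   # prints 023551c685
```

In this model, the leading byte is what .LEN would write on its own, and the remaining bytes are what .STR would write.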

.TABLE

.TABLE [<expression>]

Begins a table definition, which must be ended later with .ENDT. The expression, if specified, indicates the length of the table in bytes; .ENDT will print a warning if the table size is incorrect.

Table definitions may not be nested.

.VOCBEG

.VOCBEG <record length>,<key length>

Begins a block of sorted records, which must be ended later with .VOCEND. Record length and key length are expressions giving the length (in bytes) of each record and of its sort key, respectively. The sort key must appear at the beginning of each record.

Records within the block will be rearranged in increasing order of their sort keys, treating the key as a big-endian number. Within the block, labels may only appear at the beginning of a record: that is, at a multiple of record length bytes after .VOCBEG. The labels will be updated as the records are moved.

Typically this directive is used in the VOCAB table to sort dictionary words. In this case, record length should be the length of an entire dictionary entry, and key length should be the length (in bytes!) of a dictionary word for the selected Z-machine version (4 in version 3, or 6 in all later versions).

Sorted blocks may not be nested.
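
The reordering performed between .VOCBEG and .VOCEND can be modeled with the short Python sketch below. It only illustrates the sorting rule (fixed-size records, big-endian key at the start of each record); it does not model how labels are updated, and the function name sort_records is invented for the example.

```python
def sort_records(block, record_length, key_length):
    """Sort fixed-size records by their leading key, read as a big-endian number."""
    if len(block) % record_length:
        raise ValueError("block is not a whole number of records")
    records = [block[i:i + record_length] for i in range(0, len(block), record_length)]
    records.sort(key=lambda rec: int.from_bytes(rec[:key_length], "big"))
    return b"".join(records)


# Three 3-byte records, each with a 2-byte key followed by one data byte.
block = b"\x02\x00A" + b"\x01\xffB" + b"\x01\x00C"
print(sort_records(block, 3, 2))   # prints b'\x01\x00C\x01\xffB\x02\x00A'
```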

.VOCEND

.VOCEND

Ends a block of sorted records started with .VOCBEG.

.WORD

[.WORD] <expression> [,<expression>,...]

Writes one or more data words to the output file.

Note: the .WORD directive itself is optional. If one or more expressions separated by commas are written on a line, without a directive or instruction name in front, ZAPF will write them to the output file as data words.

.ZWORD

.ZWORD "string"

Writes an encoded string to the output file as a dictionary word. The string will be padded or truncated to contain the correct number of Z-characters for the Z-machine version (6 in version 3, or 9 in all later versions).
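
The Z-character counts above correspond to the key lengths in bytes given under .VOCBEG, because three Z-characters are packed into each two-byte word. The following Python sketch shows the arithmetic; the function name dictionary_word_size is invented for the example.

```python
def dictionary_word_size(version):
    """Return (Z-characters, bytes) for an encoded dictionary word."""
    zchars = 6 if version == 3 else 9     # counts stated for .ZWORD
    words = zchars // 3                   # three Z-characters per 16-bit word
    return zchars, words * 2


print(dictionary_word_size(3))   # prints (6, 4)
print(dictionary_word_size(5))   # prints (9, 6)
```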

Debugging Directives

These directives cause ZAPF to generate records for a debug information file which can be loaded into an interpreter for source-level debugging. Refer to the Inform Technical Manual for the format of this file and the purposes of these records. A debug information file will be generated if, and only if, at least one debugging directive is present.

.DEBUG-ACTION <expression>,"name"
.DEBUG-ARRAY <expression>,"name"
.DEBUG-ATTR <expression>,"name"
.DEBUG-CLASS "name",<file1>,<line1>,<col1>,<file2>,<line2>,<col2>
.DEBUG-FAKE-ACTION <expression>,"name"
.DEBUG-FILE <num>,"include name","file path"
.DEBUG-GLOBAL <expression>,"name"
.DEBUG-LINE <file>,<line>,<col>
.DEBUG-MAP "key name" = <expression>
.DEBUG-OBJECT <expression>,"name",<file1>,<line1>,<col1>,<file2>,<line2>,<col2>
.DEBUG-PROP <expression>,"name"
.DEBUG-ROUTINE <file>,<line>,<col>,"routine name" [,"param name",...]
.DEBUG-ROUTINE-END <file>,<line>,<col>

Note that "file" expressions must be given as file numbers, referencing a .DEBUG-FILE directive given previously.

.DEBUG-ROUTINE and .DEBUG-ROUTINE-END should appear before and after a routine, respectively. Any .DEBUG-LINE directives in between will be associated with that routine.

Instructions

Two Syntaxes

The "-i" switch affects instructions in two ways. First, it changes the general syntax of operands, stores, and branches, as shown in the following table.

                    Default syntax        Inform syntax
Plain instruction   MOVE x,y              insert_obj x y
Store               ADD x,y >r            add x y -> r
Branch              EQUAL? x,y /label     je x y ?label
Negated branch      EQUAL? x,y \label     je x y ?~label
Branch to return    ZERO? x /TRUE         jz x ?rtrue
Second, it changes the opcode names from Infocom's original names to the names used in the Z-Machine Standards Document, as shown in the following table. Note that opcode names are case-sensitive in both modes. (Also note that CHECKU and PRINTU were not in Infocom's original design.)

Default syntax  Inform syntax     Default syntax  Inform syntax     Default syntax  Inform syntax
ADD             add               HLIGHT          set_text_style    PRINTT          print_table
ASHIFT          art_shift         ICALL1          call_1n           PRINTU          print_unicode
ASSIGNED?       check_arg_count   ICALL2          call_2n           PTSIZE          get_prop_len
BAND            and               ICALL           call_vn           PUSH            push
BCOM            not               IGRTR?          inc_chk           PUTB            storeb
BOR             or                INC             inc               PUTP            put_prop
BTST            test              IN?             jin               PUT             storew
BUFOUT          buffer_mode       INPUT           read_char         QUIT            quit
CALL1           call_1s           INTBL?          scan_table        RANDOM          random
CALL2           call_2s           IRESTORE        restore_undo      READ            aread / sread
CALL            call_vs           ISAVE           save_undo         REMOVE          remove_obj
CATCH           catch             IXCALL          call_vn2          RESTART         restart
CHECKU          check_unicode     JUMP            jump              RESTORE         restore
CLEAR           erase_window      LESS?           jl                RETURN          ret
COLOR           set_colour        LEX             tokenise          RFALSE          rfalse
COPYT           copy_table        LOC             get_parent        RSTACK          ret_popped
CRLF            new_line          MARGIN          set_margins       RTRUE           rtrue
CURGET          get_cursor        MENU            make_menu         SAVE            save
DCLEAR          erase_picture     MOD             mod               SCREEN          set_window
DEC             dec               MOUSE-INFO      read_mouse        SCROLL          scroll_window
DIRIN           input_stream      MOUSE-LIMIT     mouse_window      SET             store
DIROUT          output_stream     MOVE            insert_obj        SHIFT           log_shift
DISPLAY         draw_picture      MUL             mul               SOUND           sound_effect
DIV             div               NEXT?           get_sibling       SPLIT           split_window
DLESS?          dec_chk           NEXTP           get_next_prop     SUB             sub
EQUAL?          je                NOOP            nop               THROW           throw
ERASE           erase_line        ORIGINAL?       piracy            USL             show_status
FCLEAR          clear_attr        PICINF          picture_data      VALUE           load
FIRST?          get_child         PICSET          picture_table     VERIFY          verify
FONT            set_font          POP             pull              WINATTR         window_style
FSET            set_attr          PRINTB          print_addr        WINGET          get_wind_prop
FSET?           test_attr         PRINTC          print_char        WINPOS          move_window
FSTACK          pop / pop_stack   PRINTD          print_obj         WINPUT          put_wind_prop
GETB            loadb             PRINTF          print_form        WINSIZE         window_size
GET             loadw             PRINTI          print             XCALL           call_vs2
GETP            get_prop          PRINTN          print_num         XPUSH           push_stack
GETPT           get_prop_addr     PRINT           print_paddr       ZERO?           jz
GRTR?           jg                PRINTR          print_ret         ZWSTR           encode_text

Indirect Variable Operands

Some opcodes (SET, VALUE, INC, DEC, IGRTR?, DLESS?) take the number of a variable as their first parameter. However, unlike Inform's assembler, ZAPF does not treat these parameters specially. This instruction stores 10 into the variable whose number is in "X":

SET X,10

To store 10 into the variable "X" itself, prefix the variable name with an apostrophe:

SET 'X,10

Even in Inform mode, the apostrophe is still necessary:

store 'x 10

Default Store Target

If the target of a store instruction is omitted, the result will be stored to the stack by default.

Version Considerations

Header

Versions 3 and 4

In these versions, ZAPF automatically assembles the game header. Therefore, certain global labels must be defined:

ENDLOD  Marks the end of low memory and the beginning of high memory. Some interpreters might conserve RAM by leaving high memory on the disk, so frequently used constant data should be (and all mutable data must be) located before this label.
IMPURE  Marks the end of "impure" (dynamic) memory and the beginning of "pure" (static) memory. This must be defined before ENDLOD.
START   Marks the instruction where the program begins.
VOCAB   Marks the beginning of the dictionary (vocabulary) table. See the Z-Machine Standards Document for the format of this table.
OBJECT  Marks the beginning of the object table. See the Z-Machine Standards Document for the format of this table, and see the .OBJECT directive above. This must be defined before ENDLOD.
GLOBAL  Marks the beginning of the global variable table, which consists of up to 240 words corresponding to the Z-machine's global variables. See the .GVAR directive above. This must be defined before ENDLOD.
WORDS   Marks the beginning of the abbreviation table, which consists of 96 word addresses (byte addresses divided by 2) pointing to abbreviation strings. See the .FSTR directive above.

Optionally, the constant RELEASEID may be defined to set the release number of the output file. If it is omitted, the release number will be 0.

Versions 5 and up

In these versions, ZAPF does not automatically create a game header. The input file must start with data directives to assemble one: refer to the Z-Machine Standards Document for the format of the header. ZAPF will, however, fill in the Z-code version, serial number, length, checksum, and creator ID (a.k.a. "Inform version") fields.
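
For reference, the length and checksum header fields can be computed as in the Python sketch below. It follows the conventions of the Z-Machine Standards Document (length stored divided by a version-dependent constant; checksum is the sum of all bytes from address 0x40 onward, modulo 0x10000) and is not taken from ZAPF's source. The function name is invented, and the sketch assumes the file size is already a multiple of the divisor.

```python
def header_length_and_checksum(story, version):
    """Return the (length field, checksum field) values for a story file image."""
    divisor = 2 if version <= 3 else 4 if version <= 5 else 8
    length_field = len(story) // divisor       # assumes len(story) is a multiple of divisor
    checksum = sum(story[0x40:]) & 0xFFFF      # bytes after the 64-byte header
    return length_field, checksum


# A toy image: a 64-byte header followed by four bytes of data.
story = bytes(0x40) + bytes([1, 2, 3, 0xFF])
print(header_length_and_checksum(story, 5))   # prints (17, 261)
```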

License

ZAPF is distributed under the terms of the GNU General Public License version 3 (GPLv3). See COPYING.txt for details.

ZAPF History

0.3 — April 18, 2010

0.2 — July 21, 2009

0.1 — July 2, 2009