
diff docs/manual.docbook.sgml @ 0:2c24602be78f
Initial import from lwtools 3.0.1 version, with new hand built build system and file reorganization
author: lost@l-w.ca
date: Wed, 19 Jan 2011 22:27:17 -0700
children: fd1ecc5d6e69
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/docs/manual.docbook.sgml	Wed Jan 19 22:27:17 2011 -0700
@@ -0,0 +1,2180 @@
+<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V4.5//EN">
+<book>
+<bookinfo>
+<title>LW Tool Chain</title>
+<author><firstname>William</firstname><surname>Astle</surname></author>
+<copyright><year>2009, 2010</year><holder>William Astle</holder></copyright>
+</bookinfo>
+<chapter>
+
+<title>Introduction</title>
+
+<para>
+The LW tool chain provides utilities for building binaries for MC6809 and
+HD6309 CPUs. The tool chain includes a cross-assembler and a cross-linker
+which support several styles of output.
+</para>
+
+<section>
+<title>History</title>
+<para>
+For a long time, I have had an interest in creating an operating system for
+the Coco3. I finally started working on that project around the beginning of
+2006. I had a number of assemblers I could choose from. Eventually, I settled
+on one and started tinkering. After a while, I realized that assembler was not
+going to be sufficient due to lack of macros and issues with forward references.
+Then I tried another which handled forward references correctly but still did
+not support macros. I looked around at other assemblers and they all lacked
+one feature or another that I really wanted for creating my operating system.
+</para>
+
+<para>
+The solution seemed clear at that point. I am a fair programmer so I figured
+I could write an assembler that would do everything I wanted an assembler to
+do. Thus the LWASM project was born. After more than two years of on and off
+work, version 1.0 of LWASM was released in October of 2008.
+</para>
+
+<para>
+As the aforementioned operating system project progressed further, it became
+clear that while assembling the whole project through a single file was doable,
+it was not practical. When I found myself playing some fancy games with macros
+in a bid to simulate sections, I realized I needed a means of assembling
+source files separately and linking them later. This spawned a major development
+effort to add object file support to LWASM. It also spawned the LWLINK
+project to provide a means to actually link the files.
+</para>
+
+</section>
+
+</chapter>
+
+<chapter>
+<title>Output Formats</title>
+
+<para>
+The LW tool chain supports multiple output formats. Each format has its
+advantages and disadvantages. Each format is described below.
+</para>
+
+<section>
+<title>Raw Binaries</title>
+<para>
+A raw binary is simply a string of bytes. There are no headers or other
+niceties. Both LWLINK and LWASM support generating raw binaries. ORG directives
+in the source code only serve to set the addresses that will be used for
+symbols but otherwise have no direct impact on the resulting binary.
+</para>
+
+</section>
+<section>
+<title>DECB Binaries</title>
+
+<para>A DECB binary is compatible with the LOADM command in Disk Extended
+Color Basic on the CoCo. It is also compatible with CLOADM from Extended
+Color Basic. These binaries include the load address of the binary as well
+as encoding an execution address. These binaries may contain multiple loadable
+sections, each of which has its own load address.</para>
+
+<para>
+Each binary starts with a preamble. Each preamble is five bytes long. The
+first byte is zero. The next two bytes specify the number of bytes to load
+and the last two bytes specify the address to load the bytes at. Then, a
+string of bytes follows. After this string of bytes, there may be another
+preamble or a postamble. A postamble is also five bytes in length. The first
+byte of the postamble is $FF, the next two are zero, and the last two are
+the execution address for the binary.
+</para>
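+
+<para>
+As a worked illustration of the layout described above, a three byte program
+(LDA #$2A followed by RTS) loaded and executed at $2000 could be encoded as
+shown here. This is a sketch only: the program and addresses are made up for
+the example, and it assumes the two byte fields are stored high byte first,
+which is the native 6809 byte order but is not stated above.
+</para>
+
+<programlisting>
+00 00 03 20 00    preamble: load 3 bytes at $2000
+86 2A 39          the bytes to load (LDA #$2A, RTS)
+FF 00 00 20 00    postamble: execution address $2000
+</programlisting>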
+
+<para>
+Both LWASM and LWLINK can output this format.
+</para>
+</section>
+
+<section>
+<title>OS9 Modules</title>
+<para>
+
+Since version 2.5, LWASM is able to generate OS9 modules. The syntax is
+basically the same as for other assemblers.  A module starts with the MOD
+directive and ends with the EMOD directive.  The OS9 directive is provided
+as a shortcut for writing system calls.
+
+</para>
+
+<para>
+
+LWASM does NOT provide an OS9Defs file. You must provide your own. Also note
+that the common practice of using "ifp1" around the inclusion of the OS9Defs
+file is discouraged as it is pointless and can lead to unintentional
+problems and phasing errors.  Because LWASM reads each file exactly once,
+there is no benefit to restricting the inclusion to the first assembly pass.
+
+</para>
+
+<para>
+
+It is also critical to understand that unlike many OS9 assemblers, LWASM
+does NOT maintain a separate data address counter.  Thus, you must define
+all your data offsets and so on outside of the mod/emod segment.  It is,
+therefore, likely that source code targeted at other assemblers will require
+edits to build correctly.
+
+</para>
+
+<para>
+
+LWLINK does not, yet, have the ability to create OS9 modules from object
+files.
+
+</para>
+</section>
+
+<section>
+<title>Object Files</title>
+<para>LWASM supports generating a proprietary object file format which is
+described in <xref linkend="objchap">. LWLINK is then used to link these
+object files into a final binary in any of LWLINK's supported binary
+formats.</para>
+
+<para>Object files also support the concept of sections which are not valid
+for other output types. This allows related code from each object file
+linked to be collapsed together in the final binary.</para> 
+
+<para>
+Object files are very flexible in that they allow references that are not
+known at assembly time to be resolved at link time.  However, because the
+addresses of such references are not known at assembly time, there is no way
+for the assembler to deduce that an eight bit addressing mode is possible. 
+That means the assembler will default to using sixteen bit addressing
+whenever an external or cross-section reference is used.
+</para>
+
+<para>
+As of LWASM 2.4, it is possible to force direct page addressing for an
+external reference.  Care must be taken to ensure the resulting addresses
+are really in the direct page since the linker does not know what the direct
+page is supposed to be and does not emit errors for byte overflows.
+</para>
+
+<para>
+It is also possible to use external references in an eight bit immediate
+mode instruction.  In this case, only the low order eight bits will be used. 
+Again, no byte overflows will be flagged.
+</para>
+
+
+</section>
+
+</chapter>
+
+<chapter>
+<title>LWASM</title>
+<para>
+The LWTOOLS assembler is called LWASM. This chapter documents the various
+features of the assembler. It is not, however, a tutorial on 6x09 assembly
+language programming.
+</para>
+
+<section>
+<title>Command Line Options</title>
+<para>
+The binary for LWASM is called "lwasm". Note that the binary is in lower
+case. lwasm takes the following command line arguments.
+</para>
+
+<variablelist>
+
+<varlistentry>
+<term><option>--6309</option></term>
+<term><option>-3</option></term>
+<listitem>
+<para>
+This will cause the assembler to accept the additional instructions available
+on the 6309 processor. This is the default mode; this option is provided for
+completeness and to override preset command arguments.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--6809</option></term>
+<term><option>-9</option></term>
+<listitem>
+<para>
+This will cause the assembler to reject instructions that are only available
+on the 6309 processor.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--decb</option></term>
+<term><option>-b</option></term>
+<listitem>
+<para>
+Select the DECB output format target. Equivalent to <option>--format=decb</option>.
+</para>
+<para>While this is currently the default output format, it is not safe to rely
+on that fact. Future versions may have different defaults. It is also trivial
+to modify the source code to change the default. Thus, it is recommended to specify
+this option if you need DECB output.</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--format=type</option></term>
+<term><option>-f type</option></term>
+<listitem>
+<para>
+Select the output format. Valid values are <option>obj</option> for the
+object file target, <option>decb</option> for the DECB LOADM format,
+<option>os9</option> for creating OS9 modules, and <option>raw</option> for
+a raw binary.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--list[=file]</option></term>
+<term><option>-l[file]</option></term>
+<listitem>
+<para>
+Cause LWASM to generate a listing. If <option>file</option> is specified,
+the listing will go to that file. Otherwise it will go to the standard output
+stream. By default, no listing is generated. Unless <option>--symbols</option>
+is specified, the listing will not include the symbol table.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--symbols</option></term>
+<term><option>-s</option></term>
+<listitem>
+<para>
+Causes LWASM to generate a list of symbols when generating a listing.
+It has no effect unless a listing is being generated.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--obj</option></term>
+<listitem>
+<para>
+Select the proprietary object file format as the output target.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--output=FILE</option></term>
+<term><option>-o FILE</option></term>
+<listitem>
+<para>
+This option specifies the name of the output file. If not specified, the
+default is <option>a.out</option>.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--pragma=pragma</option></term>
+<term><option>-p pragma</option></term>
+<listitem>
+<para>
+Specify assembler pragmas. Multiple pragmas are separated by commas. The
+pragmas accepted are the same as for the PRAGMA assembler directive described
+below.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--raw</option></term>
+<term><option>-r</option></term>
+<listitem>
+<para>
+Select raw binary as the output target.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--includedir=path</option></term>
+<term><option>-I path</option></term>
+<listitem>
+<para>
+Add <option>path</option> to the end of the include path.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--help</option></term>
+<term><option>-?</option></term>
+<listitem>
+<para>
+Present a help screen describing the command line options.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--usage</option></term>
+<listitem>
+<para>
+Provide a summary of the command line options.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--version</option></term>
+<term><option>-V</option></term>
+<listitem>
+<para>
+Display the software version.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--debug</option></term>
+<term><option>-d</option></term>
+<listitem>
+<para>
+Increase the debugging level. Only really useful to people hacking on the
+LWASM source code itself.
+</para>
+</listitem>
+</varlistentry>
+
+</variablelist>
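+
+<para>
+For example, an invocation along the following lines combines several of the
+options above to produce a DECB binary and a listing that includes the symbol
+table. The file names are placeholders, and the example assumes the source
+file is given as the final argument, which is not spelled out in the list
+above.
+</para>
+
+<programlisting>
+lwasm --6809 --format=decb --output=hello.bin --list=hello.lst --symbols hello.asm
+</programlisting>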
+
+</section>
+
+<section>
+<title>Dialects</title>
+<para>
+LWASM supports all documented MC6809 instructions as defined by Motorola. 
+It also supports all known HD6309 instructions.  While there is general
+agreement on the mnemonics for most of the 6309 instructions, there is some
+variance with the block transfer instructions. TFM for all four variations
+seems to have gained the most traction and, thus, this is the form that is
+recommended for LWASM. However, it also supports COPY, COPY-, IMP, EXP,
+TFRP, TFRM, TFRS, and TFRR. It further adds COPY+ as a synonym for COPY,
+IMPLODE for IMP, and EXPAND for EXP.
+</para>
+
+<para>By default, LWASM accepts 6309 instructions. However, using the
+<parameter>--6809</parameter> parameter, you can cause it to throw errors on
+6309 instructions instead.</para>
+
+<para>
+The standard addressing mode specifiers are supported. These are the
+hash sign ("#") for immediate mode, the less than sign ("&lt;") for forced
+eight bit modes, and the greater than sign ("&gt;") for forced sixteen bit modes.
+</para>
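+
+<para>
+A brief sketch of the three specifiers applied to the same operand follows.
+The address $10 is arbitrary and chosen only for illustration.
+</para>
+
+<programlisting>
+        LDA #$10        ; immediate mode: load the constant $10
+        LDA &lt;$10        ; forced eight bit mode
+        LDA &gt;$10        ; forced sixteen bit mode
+</programlisting>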
+
+<para>
+Additionally, LWASM supports using the asterisk ("*") to indicate
+base page addressing. This should not be used in hand-written source code,
+however, because it is non-standard and may or may not be present in future
+versions of LWASM.
+</para>
+
+</section>
+
+<section>
+<title>Source Format</title>
+
+<para>
+LWASM accepts plain text files in a relatively free form. It can handle
+lines terminated with CR, LF, CRLF, or LFCR which means it should be able
+to assemble files on any platform on which it compiles.
+</para>
+<para>
+Each line may start with a symbol. If a symbol is present, there must not
+be any whitespace preceding it. It is legal for a line to contain nothing
+but a symbol.</para>
+<para>
+The op code is separated from the symbol by whitespace. If there is
+no symbol, there must be at least one white space character preceding it.
+If applicable, the operand follows separated by whitespace. Following the
+opcode and operand is an optional comment.
+</para>
+
+<para> It is important to note that operands cannot contain any whitespace
+except in the case of delimited strings.  This is because the first
+whitespace character will be interpreted as the separator between the
+operand column and the comment.  This behaviour is required for approximate
+source compatibility with other 6x09 assemblers.  </para>
+
+<para>
+A comment can also be introduced with a * or a ;. The comment character is
+optional for end of statement comments. However, if a symbol is the only
+thing present on the line other than the comment, the comment character is
+mandatory to prevent the assembler from interpreting the comment as an opcode.
+</para>
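+
+<para>
+The rules above can be illustrated with a short fragment. The symbol names
+and addresses are invented for the example.
+</para>
+
+<programlisting>
+start   LDA #$41        ; a symbol starts in the first column
+        STA $0400       the comment character is optional here
+done    ; a symbol alone on the line needs the comment character
+        RTS
+</programlisting>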
+
+<para>
+For compatibility with the output generated by some C preprocessors, LWASM
+will also ignore lines that begin with a #. This should not be used as a general
+comment character, however.
+</para>
+
+<para>
+Opcodes are not case sensitive. Neither are register names in
+the operand fields. Symbols, however, are case sensitive.
+</para>
+
+<para> As of version 2.6, LWASM supports files with line numbers.  If line
+numbers are present, the line must start with a digit.  The line number
+itself must consist only of digits.  The line number must then be followed
+by either the end of the line or exactly one white space character.  After
+that white space character, the lines are interpreted exactly as above. 
+</para>
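+
+<para>
+As a sketch of the line number rule, the fragment below should be read the
+same way as its unnumbered equivalent. The first line has a single space
+after the number, so the symbol immediately follows it. The remaining lines
+have two spaces: one is consumed as the separator and the other serves as
+the leading whitespace that indicates there is no symbol. The symbol name
+and address are invented for the example.
+</para>
+
+<programlisting>
+10 start LDA #$41
+20  STA $0400
+30  RTS
+</programlisting>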
+
+</section>
+
+<section>
+<title>Symbols</title>
+
+<para>
+Symbols have no length restriction. They may contain letters, numbers, dots,
+dollar signs, and underscores. They must start with a letter, dot, or
+underscore.
+</para>
+
+<para>
+LWASM also supports the concept of a local symbol. A local symbol is one
+which contains either a "?" or a "@", which can appear anywhere in the symbol.
+The scope of a local symbol is determined by a number of factors. First,
+each included file gets its own local symbol scope. A blank line will also
+be considered a local scope barrier. Macros each have their own local symbol
+scope as well (which has a side effect that you cannot use a local symbol
+as an argument to a macro). There are other factors as well. In general,
+a local symbol is restricted to the block of code it is defined within.
+</para>
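+
+<para>
+For instance, relying on the blank line barrier described above, the two
+routines in the following sketch can each use the local symbol loop@
+without conflict. The routine names and addresses are made up for the
+example.
+</para>
+
+<programlisting>
+clear   LDX #$0400
+loop@   CLR ,X+
+        CMPX #$0600
+        BNE loop@
+        RTS
+
+fill    LDX #$0400
+loop@   STA ,X+
+        CMPX #$0600
+        BNE loop@
+        RTS
+</programlisting>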
+
+<para>
+By default, unless assembling to the os9 target, a "$" in the symbol will
+also make it local.  This can be controlled by the "dollarlocal" and
+"nodollarlocal" pragmas.  In the absence of a pragma to the contrary, for
+the os9 target, a "$" in the symbol will not make it considered local while
+for all other targets it will.
+</para>
+
+</section>
+
+<section>
+<title>Numbers and Expressions</title>
+<para>
+
+Numbers can be expressed in binary, octal, decimal, or hexadecimal. Binary
+numbers may be prefixed with a "%" symbol or suffixed with a "b" or "B".
+Octal numbers may be prefixed with "@" or suffixed with "Q", "q", "O", or
+"o". Hexadecimal numbers may be prefixed with "$", "0x" or "0X", or suffixed
+with "H". No prefix or suffix is required for decimal numbers but they can
+be prefixed with "&amp;" if desired. Any constant which begins with a letter
+must be expressed with the correct prefix base identifier or be prefixed
+with a 0. Thus hexadecimal FF would have to be written either 0FFH or $FF.
+Numbers are not case sensitive.
+
+</para>
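+
+<para>
+As an illustration of the notations above, every line in the following
+fragment is intended to express the same value, decimal 42. The final line
+shows a hexadecimal constant that would otherwise begin with a letter.
+</para>
+
+<programlisting>
+        FCB %00101010   ; binary with prefix
+        FCB 00101010B   ; binary with suffix
+        FCB @52         ; octal with prefix
+        FCB 52Q         ; octal with suffix
+        FCB 42          ; decimal
+        FCB &amp;42         ; decimal with optional prefix
+        FCB $2A         ; hexadecimal with prefix
+        FCB 0x2A        ; hexadecimal with 0x prefix
+        FCB 2AH         ; hexadecimal with suffix
+        FCB 0FFH        ; leading 0 required because FF begins with a letter
+</programlisting>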
+
+<para> A symbol may appear at any point where a number is acceptable. The
+special symbol "*" can be used to represent the starting address of the
+current source line within expressions. </para>
+
+<para>The ASCII value of a character can be included by prefixing it with a
+single quote ('). The ASCII values of two characters can be included by
+prefixing the characters with a quote (").</para>
+
+<para>
+
+LWASM supports the following basic binary operators: +, -, *, /, and %. 
+These represent addition, subtraction, multiplication, division, and
+modulus.  It also supports unary negation and unary 1's complement (- and ^
+respectively).  It is also possible to use ~ for the unary 1's complement
+operator.  For completeness, a unary positive (+) is supported though it is
+a no-op.  LWASM also supports using |, &amp;, and ^ for bitwise or, bitwise and,
+and bitwise exclusive or respectively.
+
+</para>
+
+<para>
+
+Operator precedence follows the usual rules. Multiplication, division, and
+modulus take precedence over addition and subtraction.  Unary operators take
+precedence over binary operators.  Bitwise operators have lower precedence
+than addition and subtraction.  To force a specific order of evaluation,
+parentheses can be used in the usual manner.
+
+</para>
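+
+<para>
+A few expressions worked through by hand according to the rules above are
+shown here as a sketch. The values in the comments follow from those rules
+and are not captured assembler output. Note that the expressions contain no
+whitespace.
+</para>
+
+<programlisting>
+        FDB 2+3*4       ; 14: multiplication before addition
+        FDB (2+3)*4     ; 20: parentheses force the addition first
+        FDB -2+3        ; 1: unary negation before addition
+        FDB 1|2+1       ; 3: addition before bitwise or
+</programlisting>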
+
+<para>
+
+As of LWASM 2.5, the operators &amp;&amp; and || are recognized for boolean and and
+boolean or respectively.  They will return either 0 or 1 (false or true). 
+They have the lowest precedence of all the binary operators.
+
+</para>
+
+</section>
+
+<section>
+<title>Assembler Directives</title>
+<para>
+Various directives can be used to control the behaviour of the
+assembler or to include non-code/data in the resulting output. Those directives
+that are not described in detail in other sections of this document are
+described below.
+</para>
+
+<section>
+<title>Data Directives</title>
+<variablelist>
+<varlistentry><term>FCB <parameter>expr[,...]</parameter></term>
+<term>.DB <parameter>expr[,...]</parameter></term>
+<term>.BYTE <parameter>expr[,...]</parameter></term>
+<listitem>
+<para>Include one or more constant bytes (separated by commas) in the output.</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>FDB <parameter>expr[,...]</parameter></term>
+<term>.DW <parameter>expr[,...]</parameter></term>
+<term>.WORD <parameter>expr[,...]</parameter></term>
+<listitem>
+<para>Include one or more words (separated by commas) in the output.</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>FQB <parameter>expr[,...]</parameter></term>
+<term>.QUAD <parameter>expr[,...]</parameter></term>
+<term>.4BYTE <parameter>expr[,...]</parameter></term>
+<listitem>
+<para>Include one or more double words (separated by commas) in the output.</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>FCC <parameter>string</parameter></term>
+<term>.ASCII <parameter>string</parameter></term>
+<term>.STR <parameter>string</parameter></term>
+<listitem>
+<para>
+Include a string of text in the output. The first character of the operand
+is the delimiter which must appear as the last character and cannot appear
+within the string. The string is included with no modifications.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>FCN <parameter>string</parameter></term>
+<term>.ASCIZ <parameter>string</parameter></term>
+<term>.STRZ <parameter>string</parameter></term>
+<listitem>
+<para>
+Include a NUL terminated string of text in the output. The first character of
+the operand is the delimiter which must appear as the last character and
+cannot appear within the string. A NUL byte is automatically appended to
+the string.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>FCS <parameter>string</parameter></term>
+<term>.ASCIS <parameter>string</parameter></term>
+<term>.STRS <parameter>string</parameter></term>
+<listitem>
+<para>
+Include a string of text in the output with bit 7 of the final byte set. The
+first character of the operand is the delimiter which must appear as the last
+character and cannot appear within the string.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry><term>ZMB <parameter>expr</parameter></term>
+<listitem>
+<para>
+Include a number of NUL bytes in the output. The number must be fully resolvable
+during pass 1 of assembly so no forward or external references are permitted.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry><term>ZMD <parameter>expr</parameter></term>
+<listitem>
+<para>
+Include a number of zero words in the output. The number must be fully
+resolvable during pass 1 of assembly so no forward or external references are
+permitted.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry><term>ZMQ <parameter>expr</parameter></term>
+<listitem>
+<para>
+Include a number of zero double-words in the output. The number must be fully
+resolvable during pass 1 of assembly so no forward or external references are
+permitted.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>RMB <parameter>expr</parameter></term>
+<term>.BLKB <parameter>expr</parameter></term>
+<term>.DS <parameter>expr</parameter></term>
+<term>.RS <parameter>expr</parameter></term>
+<listitem>
+<para>
+Reserve a number of bytes in the output. The number must be fully resolvable
+during pass 1 of assembly so no forward or external references are permitted.
+The value of the bytes is undefined.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry><term>RMD <parameter>expr</parameter></term>
+<listitem>
+<para>
+Reserve a number of words in the output. The number must be fully
+resolvable during pass 1 of assembly so no forward or external references are
+permitted. The value of the words is undefined.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry><term>RMQ <parameter>expr</parameter></term>
+<listitem>
+<para>
+Reserve a number of double-words in the output. The number must be fully
+resolvable during pass 1 of assembly so no forward or external references are
+permitted. The value of the double-words is undefined.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>INCLUDEBIN <parameter>filename</parameter></term>
+<listitem>
+<para>
+Treat the contents of <parameter>filename</parameter> as a string of bytes to
+be included literally at the current assembly point. This has the same effect
+as converting the file contents to a series of FCB statements and including
+those at the current assembly point.
+</para>
+
+<para> If <parameter>filename</parameter> begins with a /, the file name
+will be taken as absolute.  Otherwise, the current directory will be
+searched followed by the search path in the order specified.</para>
+
+<para> Please note that absolute path detection including drive letters will
+not function correctly on Windows platforms.  Non-absolute inclusion will
+work, however.</para>
+
+</listitem>
+</varlistentry>
+
+</variablelist>
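+
+<para>
+The following sketch shows several of the data directives together with the
+bytes each one would be expected to produce. The byte sequences are worked
+out by hand from the descriptions above and assume multi-byte values are
+stored high byte first, which is the native 6809 order. The symbol name
+buffer is invented for the example.
+</para>
+
+<programlisting>
+        FCB 1,2,3       ; 01 02 03
+        FDB $1234       ; 12 34
+        FQB $12345678   ; 12 34 56 78
+        FCC "HELLO"     ; 48 45 4C 4C 4F
+        FCN /HELLO/     ; 48 45 4C 4C 4F 00
+        FCS "HELLO"     ; 48 45 4C 4C CF (bit 7 set on the final byte)
+        ZMB 4           ; 00 00 00 00
+buffer  RMB 16          ; sixteen bytes reserved, contents undefined
+</programlisting>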
+
+</section>
+
+<section>
+<title>Address Definition</title>
+<para>The directives in this section all control the addresses of symbols
+or the assembly process itself.</para>
+
+<variablelist>
+<varlistentry><term>ORG <parameter>expr</parameter></term>
+<listitem>
+<para>Set the assembly address. The address must be fully resolvable on the
+first pass so no external or forward references are permitted. ORG is not
+permitted within sections when outputting to object files. For the DECB
+target, each ORG directive after which output is generated will cause
+a new preamble to be output. ORG is only used to determine the addresses
+of symbols when the raw target is used.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><parameter>sym</parameter> EQU <parameter>expr</parameter></term>
+<term><parameter>sym</parameter> = <parameter>expr</parameter></term>
+<listitem>
+<para>Define the value of <parameter>sym</parameter> to be <parameter>expr</parameter>.</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><parameter>sym</parameter> SET <parameter>expr</parameter></term>
+<listitem>
+<para>Define the value of <parameter>sym</parameter> to be <parameter>expr</parameter>.
+Unlike EQU, SET permits symbols to be defined multiple times as long as SET
+is used for all instances. Use of the symbol before the first SET statement
+that sets its value is undefined.</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>SETDP <parameter>expr</parameter></term>
+<listitem>
+<para>Inform the assembler that it can assume the DP register contains
+<parameter>expr</parameter>. This directive is only advice to the assembler
+to determine whether an address is in the direct page and has no effect
+on the contents of the DP register. The value must be fully resolved during
+the first assembly pass because it affects the sizes of subsequent instructions.
+</para>
+<para>This directive has no effect in the object file target.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>ALIGN <parameter>expr</parameter>[,<parameter>value</parameter>]</term>
+<listitem>
+
+<para>Force the current assembly address to be a multiple of
+<parameter>expr</parameter>.  If <parameter>value</parameter> is not
+specified, a series of NUL bytes is output to force the alignment, if
+required.  Otherwise, the low order 8 bits of <parameter>value</parameter>
+will be used as the fill.  The alignment value must be fully resolved on the
+first pass because it affects the addresses of subsequent instructions. 
+However, <parameter>value</parameter> may include forward references; as
+long as it resolves to a constant for the second pass, the value will be
+accepted.</para>
+
+<para>Unless <parameter>value</parameter> is specified as something like $12,
+this directive is not suitable for inclusion in the middle of actual code. 
+The default padding value is $00 which is intended to be used within data
+blocks.  </para>
+
+</listitem>
+</varlistentry>
+
+</variablelist>
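+
+<para>
+A short sketch using the directives in this section follows. All symbol
+names and addresses are invented for the example. Note that SETDP only
+informs the assembler; the program itself remains responsible for loading
+the DP register.
+</para>
+
+<programlisting>
+SCREEN  EQU $0400       ; fixed value, may be defined only once
+        ORG $2000       ; assemble starting at $2000
+        SETDP $20       ; the assembler may assume DP contains $20
+count   SET 1           ; SET symbols may be redefined
+count   SET count+1     ; count is now 2
+        ALIGN 256       ; pad with NUL bytes to a multiple of 256
+        ALIGN 4,$12     ; pad with $12 bytes to a multiple of 4
+</programlisting>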
+
+</section>
+
+<section>
+<title>Conditional Assembly</title>
+<para>
+Portions of the source code can be excluded or included based on conditions
+known at assembly time. Conditionals can be nested arbitrarily deeply. The
+directives associated with conditional assembly are described in this section.
+</para>
+<para>All conditionals must be fully bracketed. That is, every conditional
+statement must eventually be followed by an ENDC at the same level of nesting.
+</para>
+<para>Conditional expressions are only evaluated on the first assembly pass.
+It is not possible to game the assembly process by having a conditional
+change its value between assembly passes. Due to the underlying architecture
+of LWASM, there is no possible utility to IFP1 and IFP2, nor can they, as of LWASM 3.0, actually
+be implemented meaningfully. Thus there is not and never will
+be any equivalent of IFP1 or IFP2 as provided by other assemblers. Use of those opcodes
+will throw a warning and be ignored.</para>
+
+<para>It is important to note that if a conditional does not resolve to a constant
+during the first parsing pass, an error will be thrown. This is unavoidable because the assembler
+must make a decision about which source to include and which source to exclude at this stage.
+Thus, expressions that work normally elsewhere will not work for conditions.</para>
+
+<variablelist>
+<varlistentry>
+<term>IFEQ <parameter>expr</parameter></term>
+<listitem>
+<para>If <parameter>expr</parameter> evaluates to zero, the conditional
+will be considered true.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>IFNE <parameter>expr</parameter></term>
+<term>IF <parameter>expr</parameter></term>
+<listitem>
+<para>If <parameter>expr</parameter> evaluates to a non-zero value, the conditional
+will be considered true.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>IFGT <parameter>expr</parameter></term>
+<listitem>
+<para>If <parameter>expr</parameter> evaluates to a value greater than zero, the conditional
+will be considered true.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>IFGE <parameter>expr</parameter></term>
+<listitem>
+<para>If <parameter>expr</parameter> evaluates to a value greater than or equal to zero, the conditional
+will be considered true.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>IFLT <parameter>expr</parameter></term>
+<listitem>
+<para>If <parameter>expr</parameter> evaluates to a value less than zero, the conditional
+will be considered true.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>IFLE <parameter>expr</parameter></term>
+<listitem>
+<para>If <parameter>expr</parameter> evaluates to a value less than or equal to zero, the conditional
+will be considered true.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>IFDEF <parameter>sym</parameter></term>
+<listitem>
+<para>If <parameter>sym</parameter> is defined at this point in the assembly
+process, the conditional
+will be considered true.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>IFNDEF <parameter>sym</parameter></term>
+<listitem>
+<para>If <parameter>sym</parameter> is not defined at this point in the assembly
+process, the conditional
+will be considered true.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>ELSE</term>
+<listitem>
+<para>
+If the preceding conditional at the same level of nesting was false, the
+statements following will be assembled. If the preceding conditional at
+the same level was true, the statements following will not be assembled.
+Note that the preceding conditional might have been another ELSE statement
+although this behaviour is not guaranteed to be supported in future versions
+of LWASM.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>ENDC</term>
+<listitem>
+<para>
+This directive marks the end of a conditional construct. Every conditional
+construct must end with an ENDC directive.
+</para>
+</listitem>
+</varlistentry>
+
+</variablelist>
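+
+<para>
+The directives above might be combined as in the following sketch. DEBUG,
+BUFSIZE, and trace are hypothetical names chosen for the example. DEBUG is
+defined before it is tested so that the condition resolves to a constant on
+the first pass.
+</para>
+
+<programlisting>
+DEBUG   EQU 1
+
+        IFNE DEBUG
+        JSR trace       ; assembled only when DEBUG is non-zero
+        ELSE
+        NOP             ; assembled only when DEBUG is zero
+        ENDC
+
+        IFNDEF BUFSIZE
+BUFSIZE EQU 256         ; supply a default if none was defined earlier
+        ENDC
+</programlisting>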
+</section>
+
+<section>
+<title>OS9 Target Directives</title>
+
+<para>This section includes directives that apply solely to the OS9
+target.</para>
+
+<variablelist>
+
+<varlistentry>
+<term>OS9 <parameter>syscall</parameter></term>
+<listitem>
+<para>
+
+This directive generates a call to the specified system call. <parameter>syscall</parameter> may be an arbitrary expression.
+
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>MOD <parameter>size</parameter>,<parameter>name</parameter>,<parameter>type</parameter>,<parameter>flags</parameter>,<parameter>execoff</parameter>,<parameter>datasize</parameter></term>
+<listitem>
+<para>
+
+This tells LWASM that the beginning of the actual module is here. It will
+generate a module header based on the parameters specified.  It will also
+begin calculating the module CRC.
+
+</para>
+
+<para>
+
+The precise meaning of the various parameters is beyond the scope of this
+document since it is not a tutorial on OS9 module programming.
+
+</para>
+
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>EMOD</term>
+<listitem>
+<para>
+
+This marks the end of a module and causes LWASM to emit the calculated CRC
+for the module.
+
+</para>
+</listitem>
+</varlistentry>
+
+</variablelist>
+</section>
+
+<section>
+<title>Miscellaneous Directives</title>
+
+<para>This section includes directives that do not fit into the other
+categories.</para>
+
+<variablelist>
+
+<varlistentry>
+<term>INCLUDE <parameter>filename</parameter></term>
+<term>USE <parameter>filename</parameter></term>
+
+<listitem> <para> Include the contents of <parameter>filename</parameter> at
+this point in the assembly as though it were a part of the file currently
+being processed.  Note that if whitespace appears in the name of the file,
+you must enclose <parameter>filename</parameter> in quotes.
+</para>
+
+<para>
+Note that the USE variation is provided only for compatibility with other
+assemblers. It is recommended to use the INCLUDE variation.</para>
+
+<para>If <parameter>filename</parameter> begins with a &quot;/&quot;, it is
+interpreted as an absolute path. If it does not, the search path will be used
+to find the file. First, the directory containing the file that contains this
+directive is searched. (Includes within an included file are relative to the
+included file, not the file that included it.) If the file is not found there,
+the include path is searched. If it is still not found, an error will be thrown.
+Note that the current directory as understood by your shell or operating system
+is not searched.
+</para>
+
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>END <parameter>[expr]</parameter></term>
+<listitem>
+<para>
+This directive causes the assembler to stop assembling immediately as though
+it ran out of input. For the DECB target only, <parameter>expr</parameter>
+can be used to set the execution address of the resulting binary. For all
+other targets, specifying <parameter>expr</parameter> will cause an error.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>ERROR <parameter>string</parameter></term>
+<listitem>
+<para>
+Causes a custom error message to be printed at this line. This will cause
+assembly to fail. This directive is most useful inside conditional constructs
+to cause assembly to fail if some condition that is known bad happens. Everything
+from the directive to the end of the line is considered the error message.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>WARNING <parameter>string</parameter></term>
+<listitem>
+<para>
+Causes a custom warning message to be printed at this line. This will not cause
+assembly to fail. This directive is most useful inside conditional constructs
+or include files to alert the programmer to a deprecated feature being used
+or some other condition that may cause trouble later, but which may, in fact,
+not cause any trouble.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>.MODULE <parameter>string</parameter></term>
+<listitem>
+<para>
+This directive is ignored for most output targets. If the output target
+supports encoding a module name into it, <parameter>string</parameter>
+will be used as the module name.
+</para>
+<para>
+As of version 3.0, no supported output targets support this directive.
+</para>
+</listitem>
+</varlistentry>
+
+</variablelist>
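+
+<para>As an illustrative sketch, an include file might use ERROR inside a
+conditional to refuse to assemble when a required symbol is missing. The
+symbol SCRBASE is a hypothetical name. No comment appears on the ERROR line
+because everything up to the end of the line becomes part of the message.</para>
+
+<programlisting>
+          IFNDEF SCRBASE
+          ERROR SCRBASE must be defined before including this file
+          ENDC
+</programlisting>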
+</section>
+
+</section>
+
+<section>
+<title>Macros</title>
+<para>
+LWASM is a macro assembler. A macro is simply a name that stands in for a
+series of instructions. Once a macro is defined, it is used like any other
+assembler directive. Defining a macro can be considered equivalent to adding
+additional assembler directives.
+</para>
+<para>Macros may accept parameters. These parameters are referenced within
+a macro by a backslash ("\") followed by a digit 1 through 9 for the first
+through ninth parameters. They may also be referenced by enclosing the
+decimal parameter number in braces ("{num}"). These parameter references
+are replaced with the verbatim text of the parameter passed to the macro. A
+reference to a non-existent parameter will be replaced by an empty string.
+Macro parameters are expanded everywhere on each source line. That means
+the parameter to a macro could be used as a symbol or it could even appear
+in a comment or could cause an entire source line to be commented out
+when the macro is expanded.
+</para>
+<para>
+Parameters passed to a macro are separated by commas and the parameter list
+is terminated by any whitespace. This means that neither a comma nor whitespace
+may be included in a macro parameter.
+</para>
+<para>
+Macro expansion is done recursively. That is, within a macro, macros are
+expanded. This can lead to infinite loops in macro expansion. If the assembler
+hangs for a long time while assembling a file that uses macros, this may be
+the reason.</para>
+
+<para>Each macro expansion receives its own local symbol context which is not
+inherited by any macros called by it nor is it inherited from the context
+the macro was instantiated in. That means it is possible to use local symbols
+within macros without having them collide with symbols in other macros or
+outside the macro itself. However, this also means that using a local symbol
+as a parameter to a macro, while legal, will not do what it would seem to do
+as it will result in looking up the local symbol in the macro's symbol context
+rather than the enclosing context where it came from, likely yielding either
+an undefined symbol error or bizarre assembly results.
+</para>
+<para>
+Note that there is no way to define a macro as local to a symbol context. All
+macros are part of the global macro namespace. However, macros have a separate
+namespace from symbols so it is possible to have a symbol with the same name
+as a macro.
+</para>
+
+<para>
+Macros are defined only during the first pass. Macro expansion also
+only occurs during the first pass. On the second pass, the macro
+definition is simply ignored. Macros must be defined before they are used.
+</para>
+
+<para>The following directives are used when defining macros.</para>
+
+<variablelist>
+<varlistentry>
+<term><parameter>macroname</parameter> MACRO</term>
+<listitem>
+<para>This directive is used to begin the definition of a macro called
+<parameter>macroname</parameter>. If <parameter>macroname</parameter> already
+exists, it is considered an error. Attempting to define a macro within a
+macro is undefined. It may work and it may not so the behaviour should not
+be relied upon.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>ENDM</term>
+<listitem>
+<para>
+This directive indicates the end of the macro currently being defined. It
+causes the assembler to resume interpreting source lines as normal.
+</para>
+</listitem>
+</varlistentry>
+</variablelist>
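+
+<para>As a minimal sketch of defining and using a macro with parameters,
+consider the following. The macro name store16 and the operand values are
+purely illustrative.</para>
+
+<programlisting>
+store16   MACRO
+          ldd #\1
+          std \2
+          ENDM
+
+          store16 $1234,$2000
+</programlisting>
+
+<para>Going by the substitution rules described above, the invocation should
+assemble as though "ldd #$1234" followed by "std $2000" had been written in
+its place. Comments are deliberately left out of the macro body because
+parameter references are expanded inside comments as well.</para>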
+
+</section>
+
+<section>
+<title>Structures</title>
+<para>
+
+Structures are used to group related data in a fixed layout. A structure
+consists of a number of fields, defined in sequential order, each of which
+takes up a specified size.  The assembler does not enforce any means of access
+within a structure; it assumes that whatever you are doing, you intended to do.
+There are two pseudo ops that are used for defining structures.
+
+</para>
+
+<variablelist>
+<varlistentry>
+<term><parameter>structname</parameter> STRUCT</term>
+<listitem>
+<para>
+
+This directive is used to begin the definition of a structure with name
+<parameter>structname</parameter>.  Subsequent statements all form part of
+the structure definition until the end of the structure is declared.
+
+</para>
+</listitem>
+</varlistentry>
+<varlistentry>
+<term>ENDSTRUCT</term>
+<term>ENDS</term>
+<listitem>
+<para>
+This directive ends the definition of the structure. ENDSTRUCT is the
+preferred form. Prior to version 3.0 of LWASM, ENDS was used to end a
+section instead of a structure.
+</para>
+</listitem>
+</varlistentry>
+</variablelist>
+
+<para>
+
+Within a structure definition, only reservation pseudo ops are permitted.
+Anything else will cause an assembly error.
+</para>
+
+<para> Once a structure is defined, you can reserve an area of memory with the
+same layout by using the structure name as the opcode.  Structures can
+also contain fields that are themselves structures.  See the example
+below.</para>
+
+<programlisting>
+tstruct2  STRUCT
+f1        rmb 1
+f2        rmb 1
+          ENDSTRUCT
+
+tstruct   STRUCT
+field1    rmb 2
+field2    rmb 3
+field3    tstruct2
+          ENDSTRUCT
+
+          ORG $2000
+var1      tstruct
+var2      tstruct2
+</programlisting>
+
+<para>Fields are referenced using a dot (.) as a separator. To refer to the
+generic offset within a structure, use the structure name to the left of the
+dot.  If referring to a field within an actual variable, use the variable's
+symbol name to the left of the dot.</para>
+
+<para>You can also refer to the actual size of a structure (or a variable
+declared as a structure) using the special symbol sizeof{structname} where
+structname will be the name of the structure or the name of the
+variable.</para>
+
+<para>Essentially, structures are a shortcut for defining a vast number of
+symbols.  When a structure is defined, the assembler creates symbols for the
+various fields in the form structname.fieldname as well as the appropriate
+sizeof{structname} symbol.  When a variable is declared as a structure, the
+assembler does the same thing using the name of the variable.  You will see
+these symbols in the symbol table when the assembler is instructed to
+provide a listing.  For instance, the above listing will create the
+following symbols (symbol values in parentheses): tstruct2.f1 (0),
+tstruct2.f2 (1), sizeof{tstruct2} (2), tstruct.field1 (0), tstruct.field2
+(2), tstruct.field3 (5), tstruct.field3.f1 (5), tstruct.field3.f2 (6),
+sizeof{tstruct.field3} (2), sizeof{tstruct} (7), var1 ($2000), var1.field1
+($2000), var1.field2 ($2002), var1.field3 ($2005), var1.field3.f1 ($2005),
+var1.field3.f2 ($2006), sizeof{var1.field3} (2), sizeof{var1} (7), var2
+($2007), var2.f1 ($2007), var2.f2 ($2008), sizeof{var2} (2).  </para>
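+
+<para>As an illustrative sketch, the symbols generated for the listing above
+could be used as follows. The instructions themselves are arbitrary; only the
+symbol names and values come from the example.</para>
+
+<programlisting>
+          lda var1.field3.f2      ; absolute address $2006
+          ldx #var2               ; X points at a tstruct2 instance
+          ldb tstruct2.f2,x       ; generic offset 1 from X
+          ldd #sizeof{tstruct}    ; immediate value 7
+</programlisting>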
+
+</section>
+
+<section>
+<title>Object Files and Sections</title>
+<para>
+The object file target is very useful for large projects because it allows
+multiple files to be assembled independently and then linked into the final
+binary at a later time. It allows only the small portion of the project
+that was modified to be re-assembled rather than requiring the entire set
+of source code to be available to the assembler in a single assembly process.
+This can be particularly important if there are a large number of macros,
+symbol definitions, or other metadata that uses resources at assembly time.
+By far the largest benefit, however, is keeping the source files small enough
+for a mere mortal to find things in them.
+</para>
+
+<para>
+With multi-file projects, there needs to be a means of resolving references to
+symbols in other source files. These are known as external references. The
+addresses of these symbols cannot be known until the linker joins all the
+object files into a single binary. This means that the assembler must be
+able to output the object code without knowing the value of the symbol. This
+places some restrictions on the code generated by the assembler. For
+example, the assembler cannot generate direct page addressing for instructions
+that reference external symbols because the address of the symbol may not
+be in the direct page. Similarly, relative branches and PC relative addressing
+cannot be used in their eight bit forms. Everything that must be resolved
+by the linker must be assembled to use the largest address size possible to
+allow the linker to fill in the correct value at link time. Note that the
+same problem applies to absolute address references as well, even those in
+the same source file, because the address is not known until link time.
+</para>
+
+<para>
+It is often desired in multi-file projects to have code of various types grouped
+together in the final binary generated by the linker as well. The same applies
+to data. In order for the linker to do that, the bits that are to be grouped
+must be tagged in some manner. This is where the concept of sections comes in.
+Each chunk of code or data is part of a section in the object file. Then,
+when the linker reads all the object files, it coalesces all sections of the
+same name into a single section and then considers it as a unit.
+</para>
+
+<para>
+The existence of sections, however, raises a problem for symbols even
+within the same source file: the assembler cannot know where the linker will
+place one section relative to another. Thus, the assembler must treat symbols from
+different sections within the same source file in the same manner as external
+symbols. That is, it must leave them for the linker to resolve at link time,
+with all the limitations that entails.
+</para>
+
+<para>
+In the object file target mode, LWASM requires all source lines that
+cause bytes to be output to be inside a section. Any directives that do
+not cause any bytes to be output can appear outside of a section. This includes
+such things as EQU or RMB. Even ORG can appear outside a section. ORG, however,
+makes no sense within a section because it is the linker that determines
+the starting address of the section's code, not the assembler.
+</para>
+
+<para>
+All symbols defined globally in the assembly process are local to the 
+source file and cannot be exported. All symbols defined within a section are
+considered local to the source file unless otherwise explicitly exported.
+Symbols referenced from external source files must be declared external,
+either explicitly or by asking the assembler to assume that all undefined
+symbols are external.
+</para>
+
+<para>
+It is often handy to define a number of memory addresses that will be
+used for data at run-time but which need not be included in the binary file.
+These memory addresses are not initialized until run-time, either by the
+program itself or by the program loader, depending on the operating environment.
+Such sections are often known as BSS sections. LWASM supports generating
+sections with a BSS attribute set. This causes the section definition, including
+symbols exported from that section and those symbols required to resolve
+references from the local file, to be included in the object file but with no
+actual code.
+It is illegal for any source lines within a BSS flagged section to cause any
+bytes to be output.
+</para>
+
+<para>The following directives apply to section handling.</para>
+
+<variablelist>
+<varlistentry>
+<term>SECTION <parameter>name[,flags]</parameter></term>
+<term>SECT <parameter>name[,flags]</parameter></term>
+<term>.AREA <parameter>name[,flags]</parameter></term>
+<listitem>
+<para>
+Instructs the assembler that the code following this directive is to be
+considered part of the section <parameter>name</parameter>. A section name
+may appear multiple times in which case it is as though all the code from
+all the instances of that section appeared adjacent within the source file.
+However, <parameter>flags</parameter> may only be specified on the first
+instance of the section.
+</para>
+<para>There is a single flag supported in <parameter>flags</parameter>. The
+flag <parameter>bss</parameter> will cause the section to be treated as a BSS
+section and, thus, no code will be included in the object file nor will any
+bytes be permitted to be output.</para>
+<para>
+If the section name is "bss" or ".bss" in any combination of upper and
+lower case, the section is assumed to be a BSS section. In that case,
+the flag <parameter>!bss</parameter> can be used to override this assumption.
+</para>
+<para>
+If assembly is already happening within a section, the section is implicitly
+ended and the new section started. This is not considered an error although
+it is recommended that all sections be explicitly closed.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>ENDSECTION</term>
+<term>ENDSECT</term>
+<listitem>
+<para>
+This directive ends the current section. This puts assembly outside of any
+sections until the next SECTION directive. ENDSECTION is the preferred form.
+Prior to version 3.0 of LWASM, ENDS could also be used to end a section but
+as of version 3.0, it is now an alias for ENDSTRUCT instead.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><parameter>sym</parameter> EXTERN</term>
+<term><parameter>sym</parameter> EXTERNAL</term>
+<term><parameter>sym</parameter> IMPORT</term>
+<listitem>
+<para>
+This directive defines <parameter>sym</parameter> as an external symbol.
+This directive may occur at any point in the source code. EXTERN definitions
+are resolved on the first pass so an EXTERN definition anywhere in the
+source file is valid for the entire file. The use of this directive is
+optional when the assembler is instructed to assume that all undefined
+symbols are external. In fact, in that mode, if the symbol is referenced
+before the EXTERN directive, an error will occur.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><parameter>sym</parameter> EXPORT</term>
+<term><parameter>sym</parameter> .GLOBL</term>
+
+<term>EXPORT <parameter>sym</parameter></term>
+<term>.GLOBL <parameter>sym</parameter></term>
+
+<listitem>
+<para>
+This directive defines <parameter>sym</parameter> as an exported symbol.
+This directive may occur at any point in the source code, even before the
+definition of the exported symbol.
+</para>
+<para>
+Note that <parameter>sym</parameter> may appear as the operand or as the
+statement's symbol. If there is a symbol on the statement, that will
+take precedence over any operand that is present.
+</para>
+</listitem>
+
+</varlistentry>
+
+<varlistentry>
+<term><parameter>sym</parameter> EXTDEP</term>
+<listitem>
+
+<para>This directive forces an external dependency on
+<parameter>sym</parameter>, even if it is never referenced anywhere else in
+this file.</para>
+
+</listitem>
+</varlistentry>
+</variablelist>
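+
+<para>The directives above might be combined as in the following sketch for
+the object file target. The section names, the symbols start, helper, and
+buffer, and the choice of instructions are all hypothetical.</para>
+
+<programlisting>
+          SECTION code
+          EXPORT start
+helper    EXTERN
+start     ldx #buffer       ; buffer is in another section; resolved at link time
+          jsr helper        ; helper is defined in another source file
+          rts
+          ENDSECTION
+
+          SECTION vars,bss
+buffer    rmb 16            ; reserves space only; no bytes are output
+          ENDSECTION
+</programlisting>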
+
+</section>
+
+<section>
+<title>Assembler Modes and Pragmas</title>
+<para>
+There are a number of options that affect the way assembly is performed.
+Some of these options can only be specified on the command line because
+they determine something absolute about the assembly process. These include
+such things as the output target. Other things may be switchable during
+the assembly process. These are known as pragmas and are, by definition,
+not portable between assemblers.
+</para>
+
+<para>LWASM supports a number of pragmas that affect code generation or
+otherwise affect the behaviour of the assembler. These may be specified by
+way of a command line option or by assembler directives. The directives
+are as follows.
+</para>
+
+<variablelist>
+<varlistentry>
+<term>PRAGMA <parameter>pragma[,...]</parameter></term>
+<listitem>
+<para>
+Specifies that the assembler should bring into force all <parameter>pragma</parameter>s
+specified. Any unrecognized pragma will cause an assembly error. The new
+pragmas will take effect immediately. This directive should be used when
+the program will assemble incorrectly if the pragma is ignored or not supported.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>*PRAGMA <parameter>pragma[,...]</parameter></term>
+<listitem>
+<para>
+This is identical to the PRAGMA directive except no error will occur with
+unrecognized or unsupported pragmas. This directive, by virtue of starting
+with a comment character, will also be ignored by assemblers that do not
+support this directive. Use this variation if the pragma is not required
+for correct functioning of the code.
+</para>
+</listitem>
+</varlistentry>
+</variablelist>
+
+<para>Each pragma supported has a positive version and a negative version.
+The positive version enables the pragma while the negative version disables
+it. The negative version is simply the positive version with "no" prefixed
+to it. For instance, "pragma" vs. "nopragma". Only the positive version is
+listed below.</para>
+
+<para>Pragmas are not case sensitive.</para>
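+
+<para>As a brief sketch of the two directive forms, the first line below
+enables one pragma and disables another, and fails if either is not
+recognized. The second line requests a pragma without causing an error if it
+is unsupported. The particular pragmas chosen are only an illustration, and
+the second line is written starting in the first column on the assumption
+that other assemblers will then treat it as a comment.</para>
+
+<programlisting>
+          PRAGMA cescapes,nodollarlocal
+*PRAGMA undefextern
+</programlisting>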
+
+<variablelist>
+<varlistentry>
+<term>index0tonone</term>
+<listitem>
+<para>
+When in force, this pragma enables an optimization affecting indexed addressing
+modes. When the offset expression in an indexed mode evaluates to zero but is
+not explicitly written as 0, this will replace the operand with the equivalent
+no offset mode, thus creating slightly faster code. Because of the advantages
+of this optimization, it is enabled by default.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>cescapes</term>
+<listitem>
+<para>
+This pragma will cause strings in the FCC, FCS, and FCN pseudo operations to
+have C-style escape sequences interpreted. The one departure from the official
+spec is that unrecognized escape sequences will return either the character
+immediately following the backslash or some undefined value. Do not rely
+on the behaviour of undefined escape sequences.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>importundefexport</term>
+<listitem>
+<para>
+This pragma is only valid for targets that support external references. When
+in force, it will cause the EXPORT directive to act as IMPORT if the symbol
+to be exported is not defined.  This is provided for compatibility with the
+output of gcc6809 and should not be used in hand written code.  Because of
+the confusion this pragma can cause, it is disabled by default.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>undefextern</term>
+<listitem>
+<para>
+This pragma is only valid for targets that support external references. When in
+force, if the assembler sees an undefined symbol on the second pass, it will
+automatically define it as an external symbol. This automatic definition will
+apply for the remainder of the assembly process, even if the pragma is
+subsequently turned off. Because this behaviour would be potentially surprising,
+this pragma defaults to off.
+</para>
+<para>
+The primary use for this pragma is for projects that share a large number of
+symbols between source files. In such cases, it is impractical to enumerate
+all the external references in every source file. This allows the assembler
+and linker to do the heavy lifting while not preventing a particular source
+module from defining a local symbol of the same name as an external symbol
+if it does not need the external symbol. (This pragma will not cause an
+automatic external definition if there is already a locally defined symbol.)
+</para>
+<para>
+This pragma will often be specified on the command line for large projects.
+However, depending on the specific dynamics of the project, it may be sufficient
+for one or two files to use this pragma internally.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>dollarlocal</term>
+<listitem>
+
+<para>When set, a "$" in a symbol makes it local. When not set, "$" does not
+cause a symbol to be local.  It is set by default except when using the OS9
+target.</para>
+
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>dollarnotlocal</term>
+<listitem>
+
+<para> This is the same as the "dollarlocal" pragma except its sense is
+reversed.  That is, "dollarlocal" and "nodollarnotlocal" are equivalent and
+"nodollarlocal" and "dollarnotlocal" are equivalent.  </para>
+
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>pcaspcr</term>
+<listitem>
+
+<para> Normally, LWASM makes a distinction between PC and PCR in program
+counter relative addressing. In particular, the use of PC means an absolute
+offset from PC while PCR causes the assembler to calculate the offset to the
+specified operand and use that as the offset from PC. By setting this
+pragma, you can have PC treated the same as PCR. </para>
+
+
+</listitem>
+</varlistentry>
+
+</variablelist>
+
+</section>
+
+</chapter>
+
+<chapter>
+<title>LWLINK</title>
+<para>
+The LWTOOLS linker is called LWLINK. This chapter documents the various features
+of the linker.
+</para>
+
+<section>
+<title>Command Line Options</title>
+<para>
+The binary for LWLINK is called "lwlink". Note that the binary is in lower
+case. lwlink takes the following command line arguments.
+</para>
+<variablelist>
+<varlistentry>
+<term><option>--decb</option></term>
+<term><option>-b</option></term>
+<listitem>
+<para>
+Selects the DECB output format target. This is equivalent to <option>--format=decb</option>.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--output=FILE</option></term>
+<term><option>-o FILE</option></term>
+<listitem>
+<para>
+This option specifies the name of the output file. If not specified, the
+default is <option>a.out</option>.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--format=TYPE</option></term>
+<term><option>-f TYPE</option></term>
+<listitem>
+<para>
+This option specifies the output format. Valid values are <option>decb</option>
+and <option>raw</option>.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--raw</option></term>
+<term><option>-r</option></term>
+<listitem>
+<para>
+This option specifies the raw output format.
+It is equivalent to <option>--format=raw</option>
+and <option>-f raw</option>.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--script=FILE</option></term>
+<term><option>-s FILE</option></term>
+<listitem>
+<para>
+This option allows specifying a linking script to override the linker's
+built in defaults.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--section-base=SECT=BASE</option></term>
+<listitem>
+<para>
+Cause section SECT to load at base address BASE. This will be prepended
+to the built-in link script. It is ignored if a link script is provided.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--map=FILE</option></term>
+<term><option>-m FILE</option></term>
+<listitem>
+<para>
+This will output a description of the link result to FILE.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--library=LIBSPEC</option></term>
+<term><option>-l LIBSPEC</option></term>
+<listitem>
+<para>
+Load a library using the library search path. LIBSPEC will have "lib" prepended
+and ".a" appended.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--library-path=DIR</option></term>
+<term><option>-L DIR</option></term>
+<listitem>
+<para>
+Add DIR to the library search path.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--debug</option></term>
+<term><option>-d</option></term>
+<listitem>
+<para>
+This option increases the debugging level. It is only useful for LWTOOLS
+developers.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--help</option></term>
+<term><option>-?</option></term>
+<listitem>
+<para>
+This provides a listing of command line options and a brief description
+of each.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--usage</option></term>
+<listitem>
+<para>
+This will display a usage summary
+of each command line option.
+</para>
+</listitem>
+</varlistentry>
+
+
+<varlistentry>
+<term><option>--version</option></term>
+<term><option>-V</option></term>
+<listitem>
+<para>
+This will display the version of LWLINK.
+</para>
+</listitem>
+</varlistentry>
+
+</variablelist>
+
+</section>
+
+<section>
+<title>Linker Operation</title>
+
+<para>
+
+LWLINK takes one or more files in supported input formats and links them
+into a single binary. Currently supported formats are the LWTOOLS object
+file format and the archive format used by LWAR. While the precise method is
+slightly different, linking can be conceptualized as the following steps.
+
+</para>
+
+<orderedlist>
+<listitem>
+<para>
+First, the linker loads a linking script. If no script is specified, it
+loads a built-in default script based on the output format selected. This
+script tells the linker how to lay out the various sections in the final
+binary.
+</para>
+</listitem>
+
+<listitem>
+<para>
+Next, the linker reads all the input files into memory. At this time, it
+flags any format errors in those files. It constructs a table of symbols
+for each object at this time.
+</para>
+</listitem>
+
+<listitem>
+<para>
+The linker then proceeds with organizing the sections loaded from each file
+according to the linking script. As it does so, it is able to assign addresses
+to each symbol defined in each object file. At this time, the linker may
+also collapse different instances of the same section name into a single
+section by appending the data from each subsequent instance of the section
+to the first instance of the section.
+</para>
+</listitem>
+
+<listitem>
+<para>
+Next, the linker looks through every object file for every incomplete reference.
+It then attempts to fully resolve that reference. If it cannot do so, it
+throws an error. Once a reference is resolved, the value is placed into
+the binary code at the specified section. It should be noted that an
+incomplete reference can reference either a symbol internal to the object
+file or an external symbol which is in the export list of another object
+file.
+</para>
+</listitem>
+
+<listitem>
+<para>
+If all of the above steps are successful, the linker opens the output file
+and actually constructs the binary.
+</para>
+</listitem>
+</orderedlist>
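+
+<para>As a sketch, the steps above would be carried out by an invocation such
+as the following, which uses only the options documented in the previous
+section. The object and output file names are hypothetical.</para>
+
+<programlisting>
+lwlink --format=decb --output=game.bin --map=game.map main.o video.o
+</programlisting>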
+
+</section>
+
+<section>
+<title>Linking Scripts</title>
+<para>
+A linker script is used to instruct the linker about how to assemble the
+various sections into a completed binary. It consists of a series of
+directives which are considered in the order they are encountered.
+</para>
+<para>
+The sections will appear in the resulting binary in the order they are
+specified in the script file. If a referenced section is not found, the linker will behave as though the
+section did exist but had a zero size, no relocations, and no exports.
+A section should only be referenced once. Any subsequent references will have
+an undefined effect.
+</para>
+
+<para>
+All numbers in linking scripts are specified in hexadecimal. All directives
+are case sensitive although the hexadecimal numbers are not.
+</para>
+
+<para>If a section name is specified as "*", any section not
+already matched by the script will be matched. The "*" can also be followed
+by a comma and a flag to narrow down the set of matching sections.
+If the flag is "!bss", then any section that is not flagged as a bss section
+will be matched. If the flag is "bss", then any section that is flagged as
+bss will be matched.
+</para>
+
+<para>The following directives are understood in a linker script.</para>
+<variablelist>
+<varlistentry>
+<term>section <parameter>name</parameter> load <parameter>addr</parameter></term>
+<listitem><para>
+
+This causes the section <parameter>name</parameter> to load at
+<parameter>addr</parameter>. For the raw target, only one "load at" entry is
+allowed for non-bss sections and it must be the first one. For raw targets,
+it affects the addresses the linker assigns to symbols but has no other
+effect on the output. bss sections may all have separate load addresses; this
+is harmless because they do not appear in the binary.
+</para><para>
+For the decb target, each "load" entry will cause a new "block" to be
+output to the binary which will contain the load address. It is legal for
+sections to overlap in this manner - the linker assumes the loader will sort
+everything out.
+</para></listitem>
+</varlistentry>
+
+<varlistentry>
+<term>section <parameter>name</parameter></term>
+<listitem><para>
+
+This will cause the section <parameter>name</parameter> to load after the previously listed
+section.
+</para></listitem></varlistentry>
+<varlistentry>
+<term>exec <parameter>addr or sym</parameter></term>
+<listitem>
+<para>
+This will cause the execution address (entry point) to be the address
+specified (in hex) or the specified symbol name. The symbol name must
+match a symbol that is exported by one of the object files being linked.
+This has no effect for targets that do not encode the entry point into the
+resulting file. If not specified, the entry point is assumed to be address 0,
+which is probably not what you want. The default link scripts for targets
+that support this directive automatically set the entry point to the beginning
+of the first section (usually "init" or "code") that is emitted in the binary.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term>pad <parameter>size</parameter></term>
+<listitem><para>
+This will cause the output file to be padded with NUL bytes to be exactly
+<parameter>size</parameter> bytes in length. This only makes sense for a raw target.
+</para>
+</listitem>
+</varlistentry>
+</variablelist>
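+<para>To show how the directives fit together, here is a minimal Python
+sketch that reads a small script into a list of tuples. The parser, the
+function name <code>parse_link_script</code>, the symbol name "start" and the
+example script itself are all illustrative assumptions; this is not how
+LWLINK is implemented and the script is not one of the built-in defaults.
+The sketch also assumes that the <parameter>size</parameter> given to "pad"
+is written in hexadecimal like the addresses.
+</para>

```python
EXAMPLE_SCRIPT = """\
section init load 2000
section code
section *,!bss
section *,bss
exec start
"""


def parse_link_script(text):
    """Turn the directives of a linker script into a list of tuples."""
    entries = []
    for raw in text.splitlines():
        words = raw.split()
        if not words:
            continue  # skip blank lines
        kind = words[0]
        if kind == "section" and len(words) == 4 and words[2] == "load":
            # section <name> load <addr>; addr is hexadecimal
            entries.append(("section", words[1], int(words[3], 16)))
        elif kind == "section" and len(words) == 2:
            # section <name>; loads after the previously listed section
            entries.append(("section", words[1], None))
        elif kind == "exec" and len(words) == 2:
            # exec <addr or sym>; this sketch tries a hex number first,
            # so a symbol such as "face" would be read as a number here
            try:
                target = int(words[1], 16)
            except ValueError:
                target = words[1]
            entries.append(("exec", target))
        elif kind == "pad" and len(words) == 2:
            # pad <size>; hexadecimal size is an assumption of this sketch
            entries.append(("pad", int(words[1], 16)))
        else:
            raise ValueError("unrecognized directive: " + raw)
    return entries


entries = parse_link_script(EXAMPLE_SCRIPT)
```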
+
+
+
+</section>
+
+</chapter>
+
+<chapter>
+<title>Libraries and LWAR</title>
+
+<para>
+LWTOOLS also includes a tool for managing libraries. These are analogous to
+the static libraries created with the "ar" tool on POSIX systems. Each library
+file contains one or more object files. The linker will treat the object
+files within a library as though they had been specified individually on
+the command line except when resolving external references. External references
+are looked up first within the object files within the library and then, if
+not found, the usual lookup based on the order the files are specified on
+the command line occurs.
+</para>
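+<para>The lookup order for external references can be sketched in Python as
+follows. The function name <code>resolve_external</code>, the file names and
+the data layout are assumptions made for the example and do not reflect the
+actual LWLINK code.
+</para>

```python
def resolve_external(symbol, home_library, link_order):
    """Return the name of the object file that defines symbol, or None.

    home_library -- list of (object_name, exported_symbols) for the library
                    that contains the referencing object file, or None if
                    the reference comes from a file outside any library
    link_order   -- list of (object_name, exported_symbols) for every object
                    file, in the order given on the command line
    """
    # First look inside the library the reference came from.
    if home_library is not None:
        for obj_name, exports in home_library:
            if symbol in exports:
                return obj_name
    # Then fall back to the usual command line order.
    for obj_name, exports in link_order:
        if symbol in exports:
            return obj_name
    return None


main_o = ("main.o", {"start", "helper"})
lib = [("a.o", {"foo"}), ("b.o", {"helper"})]
link_order = [main_o] + lib

from_library = resolve_external("helper", lib, link_order)
from_main = resolve_external("helper", None, link_order)
```

+<para>Here a reference to "helper" from an object file inside the library
+resolves to "b.o", while the same reference from "main.o" resolves to
+"main.o" itself.
+</para>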
+
+<para>
+The tool for creating these library files is called LWAR.
+</para>
+
+<section>
+<title>Command Line Options</title>
+<para>
+The binary for LWAR is called "lwar". Note that the name is in lower
+case. The options lwar understands are listed below. For archive manipulation
+options, the first non-option argument is the name of the archive. All other
+non-option arguments are the names of files to operate on.
+</para>
+
+<variablelist>
+<varlistentry>
+<term><option>--add</option></term>
+<term><option>-a</option></term>
+<listitem>
+<para>
+This option specifies that an archive is going to have files added to it.
+If the archive does not already exist, it is created. New files are added
+to the end of the archive.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--create</option></term>
+<term><option>-c</option></term>
+<listitem>
+<para>
+This option specifies that an archive is going to be created and have files
+added to it. If the archive already exists, it is truncated.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--merge</option></term>
+<term><option>-m</option></term>
+<listitem>
+<para>
+If specified, any files specified to be added to an archive will be checked
+to see if they are archives themselves. If so, their constituent members are
+added to the archive. This is useful for avoiding archives containing archives.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--list</option></term>
+<term><option>-l</option></term>
+<listitem>
+<para>
+This will display a list of the files contained in the archive.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--debug</option></term>
+<term><option>-d</option></term>
+<listitem>
+<para>
+This option increases the debugging level. It is only useful for LWTOOLS
+developers.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--help</option></term>
+<term><option>-?</option></term>
+<listitem>
+<para>
+This provides a listing of command line options and a brief description
+of each.
+</para>
+</listitem>
+</varlistentry>
+
+<varlistentry>
+<term><option>--usage</option></term>
+<listitem>
+<para>
+This will display a usage summary
+of each command line option.
+</para>
+</listitem>
+</varlistentry>
+
+
+<varlistentry>
+<term><option>--version</option></term>
+<term><option>-V</option></term>
+<listitem>
+<para>
+This will display the version of LWAR.
+</para>
+</listitem>
+</varlistentry>
+</variablelist>
+
+</section>
+
+</chapter>
+
+<chapter id="objchap">
+<title>Object Files</title>
+<para>
+LWTOOLS uses a proprietary object file format. It is proprietary in the sense
+that it is specific to LWTOOLS, not that it is a hidden format. It would be
+hard to keep it hidden in an open source tool chain anyway. This chapter
+documents the object file format.
+</para>
+
+<para>
+An object file consists of a series of sections each of which contains a
+list of exported symbols, a list of incomplete references, and a list of
+"local" symbols which may be used in calculating incomplete references. Each
+section will obviously also contain the object code.
+</para>
+
+<para>
+Each exported symbol must be completely resolved to an address within the
+section it is exported from. That is, an exported symbol must be a constant
+rather than defined in terms of other symbols.</para>
+
+<para>
+Each object file starts with a magic number and version number. The magic
+number is the string "LWOBJ16" for this 16 bit object file format. The only
+defined version number is currently 0. Thus, the first 8 bytes of the object
+file are <code>4C574F424A313600</code>.
+</para>
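+<para>As a quick check of the header bytes, the following Python sketch
+builds the magic number from the string and version byte and compares it
+with the hexadecimal form given above. The names <code>MAGIC</code> and
+<code>is_lwobj16</code> are made up for the example.
+</para>

```python
# "LWOBJ16" in ASCII followed by the version byte, currently 0.
MAGIC = b"LWOBJ16" + bytes([0])


def is_lwobj16(data):
    """Return True if data starts with the 8 byte object file header."""
    return data[:8] == MAGIC


header_hex = MAGIC.hex().upper()
```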
+
+<para>
+Each section has the following items in order:
+</para>
+
+<itemizedlist>
+<listitem><para>section name</para></listitem>
+<listitem><para>flags</para></listitem>
+<listitem><para>list of local symbols (and addresses within the section)</para></listitem>
+<listitem><para>list of exported symbols (and addresses within the section)</para></listitem>
+<listitem><para>list of incomplete references along with the expressions to calculate them</para></listitem>
+<listitem><para>the actual object code (for non-BSS sections)</para></listitem>
+</itemizedlist>
+
+<para>
+The section starts with the name of the section with a NUL termination
+followed by a series of flag bytes terminated by NUL. There are only two
+flag bytes defined. A NUL (0) indicates no more flags and a value of 1
+indicates the section is a BSS section. For a BSS section, no actual
+code is included in the object file.
+</para>
+
+<para>
+Either an empty section name (a lone NUL byte) or end of file indicates that
+there are no more sections.
+</para>
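+<para>A minimal Python sketch of reading the start of a section, following
+the description above. The function name
+<code>read_section_header</code> and its return values are assumptions made
+for the example; flag bytes other than 1 are simply skipped since no others
+are defined here.
+</para>

```python
FLAG_BSS = 1  # the only flag value defined besides the terminating NUL


def read_section_header(data, pos):
    """Read a section name and its flags starting at offset pos.

    Returns (name, is_bss, next_pos). name is None when there are no
    more sections (end of data or an empty section name).
    """
    if pos >= len(data):
        return None, False, pos
    end = data.index(b"\x00", pos)
    name = data[pos:end].decode("ascii")
    pos = end + 1
    if name == "":
        return None, False, pos
    is_bss = False
    while data[pos] != 0:  # flag bytes run until a NUL
        if data[pos] == FLAG_BSS:
            is_bss = True
        pos += 1
    return name, is_bss, pos + 1  # step over the terminating NUL


code_header = read_section_header(b"code\x00\x00", 0)
bss_header = read_section_header(b"vars\x00\x01\x00", 0)
```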
+
+<para>
+Each entry in the exported and local symbols table consists of the symbol
+(NUL terminated) followed by two bytes which contain the value in big endian
+order. The end of a symbol table is indicated by an empty symbol name (a lone NUL byte).
+</para>
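+<para>The symbol table layout can be illustrated with a short Python sketch.
+The helper names <code>read_cstring</code> and
+<code>read_symbol_table</code>, and the sample symbols, are hypothetical.
+</para>

```python
import struct


def read_cstring(data, pos):
    """Read a NUL terminated string; return (text, position after the NUL)."""
    end = data.index(b"\x00", pos)
    return data[pos:end].decode("ascii"), end + 1


def read_symbol_table(data, pos):
    """Read symbol entries until an empty name; return (symbols, next_pos)."""
    symbols = {}
    while True:
        name, pos = read_cstring(data, pos)
        if name == "":
            return symbols, pos  # an empty name ends the table
        (value,) = struct.unpack_from(">H", data, pos)  # 16 bit, big endian
        symbols[name] = value
        pos += 2


table = b"start\x00\x20\x00" + b"loop\x00\x20\x10" + b"\x00"
symbols, next_pos = read_symbol_table(table, 0)
```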
+
+<para>
+Each entry in the incomplete references table consists of an expression
+followed by a 16 bit offset where the reference goes. Expressions are
+defined as a series of terms up to an "end of expression" term. Each term
+consists of a single byte which identifies the type of term (see below)
+followed by any data required by the term. The end of the list is flagged
+by an empty expression (only an end of expression term).
+</para>
+
+<table frame="all"><title>Object File Term Types</title>
+<tgroup cols="2">
+<thead>
+<row>
+<entry>TERMTYPE</entry>
+<entry>Meaning</entry>
+</row>
+</thead>
+<tbody>
+<row>
+<entry>00</entry>
+<entry>end of expression</entry>
+</row>
+
+<row>
+<entry>01</entry>
+<entry>integer (16 bit in big endian order follows)</entry>
+</row>
+<row>
+<entry>02</entry>
+<entry>external symbol reference (NUL terminated symbol name follows)</entry>
+</row>
+
+<row>
+<entry>03</entry>
+<entry>local symbol reference (NUL terminated symbol name follows)</entry>
+</row>
+
+<row>
+<entry>04</entry>
+<entry>operator (1 byte operator number)</entry>
+</row>
+<row>
+<entry>05</entry>
+<entry>section base address reference</entry>
+</row>
+
+<row>
+<entry>FF</entry>
+<entry>This term will set flags for the expression. Each one of these terms will set a single flag. All of them should be specified first in an expression. If they are not, the behaviour is undefined. The byte following is the flag. Flag 01 indicates an 8 bit relocation. Flag 02 indicates a zero-width relocation (see the EXTDEP pseudo op in LWASM).</entry>
+</row>
+</tbody>
+</tgroup>
+</table>
+
+
+<para>
+External references are resolved using other object files while local
+references are resolved using the local symbol table(s) from this file. This
+allows local symbols that are not exported to have the same names as
+exported symbols or external references.
+</para>
+
+<table frame="all"><title>Object File Operator Numbers</title>
+<tgroup cols="2">
+<thead>
+<row>
+<entry>Number</entry>
+<entry>Operator</entry>
+</row>
+</thead>
+<tbody>
+<row>
+<entry>01</entry>
+<entry>addition (+)</entry>
+</row>
+<row>
+<entry>02</entry>
+<entry>subtraction (-)</entry>
+</row>
+<row>
+<entry>03</entry>
+<entry>multiplication (*)</entry>
+</row>
+<row>
+<entry>04</entry>
+<entry>division (/)</entry>
+</row>
+<row>
+<entry>05</entry>
+<entry>modulus (%)</entry>
+</row>
+<row>
+<entry>06</entry>
+<entry>integer division (\) (same as division)</entry>
+</row>
+
+<row>
+<entry>07</entry>
+<entry>bitwise and</entry>
+</row>
+
+<row>
+<entry>08</entry>
+<entry>bitwise or</entry>
+</row>
+
+<row>
+<entry>09</entry>
+<entry>bitwise xor</entry>
+</row>
+
+<row>
+<entry>0A</entry>
+<entry>boolean and</entry>
+</row>
+
+<row>
+<entry>0B</entry>
+<entry>boolean or</entry>
+</row>
+
+<row>
+<entry>0C</entry>
+<entry>unary negation, 2's complement (-)</entry>
+</row>
+
+<row>
+<entry>0D</entry>
+<entry>unary 1's complement (^)</entry>
+</row>
+</tbody>
+</tgroup>
+</table>
+
+<para>
+An expression is represented in a postfix manner with both operands for
+binary operators preceding the operator and the single operand for unary
+operators preceding the operator.
+</para>
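+<para>To make the postfix encoding concrete, here is a Python sketch that
+decodes one expression and evaluates it with a stack. It is an illustration
+only, not the LWLINK implementation, and it rests on several assumptions
+that the format description does not settle: the symbol dictionaries hold
+final addresses, the section base term (05) carries no extra data, the left
+operand of a binary operator is the one pushed first, results wrap to
+16 bits, and the boolean operators yield 1 or 0. The name
+<code>evaluate</code> and the sample symbols are made up.
+</para>

```python
import struct

BINARY_OPS = {
    0x01: lambda a, b: a + b,
    0x02: lambda a, b: a - b,
    0x03: lambda a, b: a * b,
    0x04: lambda a, b: a // b,
    0x05: lambda a, b: a % b,
    0x06: lambda a, b: a // b,  # integer division, same as division
    0x07: lambda a, b: a & b,
    0x08: lambda a, b: a | b,
    0x09: lambda a, b: a ^ b,
    0x0A: lambda a, b: int(bool(a) and bool(b)),  # 1 or 0 is an assumption
    0x0B: lambda a, b: int(bool(a) or bool(b)),
}
UNARY_OPS = {
    0x0C: lambda a: -a,  # 2's complement negation
    0x0D: lambda a: ~a,  # 1's complement
}


def evaluate(expr, local_syms, external_syms, section_base):
    """Evaluate one encoded expression; return (value, flags, next_pos)."""
    stack, flags, pos = [], [], 0
    while True:
        term = expr[pos]
        pos += 1
        if term == 0x00:  # end of expression
            break
        elif term == 0x01:  # 16 bit big endian integer
            (value,) = struct.unpack_from(">H", expr, pos)
            pos += 2
            stack.append(value)
        elif term in (0x02, 0x03):  # external or local symbol reference
            end = expr.index(b"\x00", pos)
            name = expr[pos:end].decode("ascii")
            pos = end + 1
            table = external_syms if term == 0x02 else local_syms
            stack.append(table[name])
        elif term == 0x04:  # operator, one byte operator number follows
            op = expr[pos]
            pos += 1
            if op in UNARY_OPS:
                stack.append(UNARY_OPS[op](stack.pop()) & 0xFFFF)
            else:
                right = stack.pop()
                left = stack.pop()
                stack.append(BINARY_OPS[op](left, right) & 0xFFFF)
        elif term == 0x05:  # section base address (no data assumed)
            stack.append(section_base)
        elif term == 0xFF:  # expression flag, one byte flag follows
            flags.append(expr[pos])
            pos += 1
        else:
            raise ValueError("unknown term type %02X" % term)
    return stack.pop(), flags, pos


# "table + 2" with table as a local symbol: 03 "table" 01 0002 04 01 00
expr = b"\x03table\x00" + b"\x01\x00\x02" + b"\x04\x01" + b"\x00"
value, flags, next_pos = evaluate(expr, {"table": 0x2100}, {}, 0)
```

+<para>The returned position points just past the end of expression term,
+which is where the 16 bit offset of the reference would be read by a caller.
+</para>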
+
+</chapter>
+</book>
+