view doc/internals.txt @ 126:269ef87192ad
Fixed stupid logic problem reading input files
author | lost |
---|---|
date | Fri, 23 Jan 2009 05:10:33 +0000 |
parents | 7fbccdd1defb |
children | ebff3a3e8fa6 |
LWASM Internals
===============

LWASM is a table-driven assembler that notionally uses two passes. However,
it implements its assembly in several passes as follows.

Pass 1 - Preprocessing & Parsing
--------------------------------

This pass reads the source file and all included source files. It handles
macro definition and expansion.

As it reads the various lines, it also identifies any symbol associated with
the line, the operation code, and, based on the operation code, the operand,
if any. Upon examination of the operand, any expressions are stored in an
internal postfix notation for later evaluation.

During this pass, preliminary values are assigned to all symbols using the
largest possible instruction size. A table of lines that reference every
symbol is generated to be used in the following pass. Note that any symbols
for which the value is known with no uncertainty factor will be generated
with the smallest possible instruction.

At this stage, simple optimizations are performed on expressions. This
includes coalescing constants (1+2+x => 3+x). It also includes some basic
algebra (x+x => 2*x, 2*x+4*x => 6*x, x-x => 0).

Pass 2 - Optimization
---------------------

This pass sweeps the code looking for operations which could use a shorter
instruction. If it finds one, it must then re-define all symbols defined
subsequently and all symbols defined in terms of one of those symbols in a
cascade. This process is repeated until no more possible reductions are
discovered.

If, in the process of implementing an instruction reduction, a phasing error
or other conflict is encountered, the reduction is backed out and marked as
forced.

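The sweep-and-cascade behaviour of Pass 2 can be illustrated with a minimal
sketch. This is not LWASM's code: the names Line, assign_addresses and
reduce_until_stable, the restriction to relative branches, and the sizes in
the example are all assumptions made for illustration. Every line starts at
its largest size, a reduction is tried, all symbols are redefined, and the
reduction is backed out if the offset no longer fits. Unlike the real
assembler, the sketch does not mark a backed-out line as forced; it simply
retries it on the next sweep.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Line:
    label: Optional[str]   # symbol defined on this line, if any
    target: Optional[str]  # symbol referenced by a relative branch, if any
    size: int              # current encoded size in bytes (starts at largest)
    short_size: int        # size of the 8-bit relative form


def assign_addresses(lines, origin=0):
    """Redefine every symbol from the current instruction sizes."""
    symbols, addrs, pc = {}, [], origin
    for ln in lines:
        addrs.append(pc)
        if ln.label is not None:
            symbols[ln.label] = pc
        pc += ln.size
    return symbols, addrs


def reduce_until_stable(lines):
    """Sweep repeatedly until a full sweep finds no reduction.

    Returns the number of sweeps performed.
    """
    sweeps = 0
    changed = True
    while changed:
        changed = False
        sweeps += 1
        for i, ln in enumerate(lines):
            if ln.target is None or ln.size == ln.short_size:
                continue
            long_size = ln.size
            ln.size = ln.short_size  # tentative reduction
            symbols, addrs = assign_addresses(lines)
            # Offset is taken relative to the end of the branch instruction.
            offset = symbols[ln.target] - (addrs[i] + ln.size)
            if -128 <= offset <= 127:
                changed = True       # keep it; later symbols have moved
            else:
                ln.size = long_size  # back the reduction out
    return sweeps


# Hypothetical program: two long branches to "end" around 125 bytes of code.
program = [
    Line("start", "end", 3, 2),
    Line(None, None, 125, 125),
    Line(None, "end", 3, 2),
    Line("end", None, 1, 1),
]
print(reduce_until_stable(program), [ln.size for ln in program])
```

In this example the first branch cannot be shortened on the first sweep, but
shortening the second branch moves "end" one byte closer, so the first branch
fits on the following sweep. That is the cascade described above.
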
The following may be candidates for reduction, depending on assembler
options:

- extended addressing -> direct addressing (not in obj target)
- 16 bit offset -> 8 bit offset (indirect indexed)
- 16 bit offset -> 8 bit or 5 bit offset (direct indexed)
- 16 bit offset -> no offset (indexed)
- 16 bit relative -> 8 bit relative (depending on configuration)
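
The indexed-offset reductions in the list above amount to picking the
narrowest offset field that can hold a known value. The following is a
minimal sketch of that choice, not LWASM's implementation: the function name
smallest_offset_bits is hypothetical, the ranges are the usual signed
two's-complement ranges for each width, and range checking of the 16 bit
case is left out.

```python
def smallest_offset_bits(offset: int, indirect: bool = False) -> int:
    """Return the narrowest offset width (0, 5, 8 or 16 bits) for `offset`.

    `indirect` selects indirect indexed addressing, for which the list
    above offers no 5 bit form.
    """
    if offset == 0:
        return 0    # 16 bit offset -> no offset
    if not indirect and -16 <= offset <= 15:
        return 5    # 5 bit signed offset, direct indexed only
    if -128 <= offset <= 127:
        return 8    # 8 bit signed offset
    return 16       # fall back to the largest form


for value, ind in [(0, False), (10, False), (10, True), (200, False)]:
    print(value, ind, smallest_offset_bits(value, ind))
```

A reduction of this kind is only safe once the offset's value is known with
certainty, which is why Pass 1 starts from the 16 bit form whenever a symbol
is still unresolved.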