LWASM Internals
===============

LWASM is a table-driven assembler that is notionally two-pass. In practice,
it performs its assembly in several passes, as follows.

Pass 1 - Preprocessing & Parsing
--------------------------------

This pass reads the source file and all included source files. It handles
macro definition and expansion.

As each line is read, this pass also identifies the symbol associated with
the line (if any), the operation code, and, based on the operation code, the
operand, if any. When the operand is examined, any expressions are stored in
an internal postfix notation for later evaluation. During this pass,
preliminary values are assigned to all symbols using the largest possible
instruction size. For every symbol, a table of the lines that reference it
is generated for use in the following pass. Note that where a symbol's value
is known with no uncertainty, the instruction using it is generated in its
smallest possible form.

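The idea of storing an expression in postfix form and evaluating it later
can be illustrated with a minimal Python sketch. This is not LWASM's actual
implementation: the function names ``to_postfix`` and ``eval_postfix``, the
token format, and the restriction to the binary operators ``+ - * /`` (with
``/`` treated as integer division) are all assumptions made for illustration.

```python
import re

# Operator precedence for the small operator set assumed in this sketch.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

def to_postfix(expr):
    """Convert an infix expression string to a postfix token list."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[-+*/()]", expr)
    output, stack = [], []
    for tok in tokens:
        if tok in PRECEDENCE:
            # Pop operators of equal or higher precedence (left-associative).
            while (stack and stack[-1] in PRECEDENCE
                   and PRECEDENCE[stack[-1]] >= PRECEDENCE[tok]):
                output.append(stack.pop())
            stack.append(tok)
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()  # discard the "("
        else:
            # Numbers become ints; anything else is a symbol name.
            output.append(int(tok) if tok.isdigit() else tok)
    while stack:
        output.append(stack.pop())
    return output

def eval_postfix(postfix, symbols):
    """Evaluate a postfix list; return None if a symbol is still undefined."""
    ops = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
           "*": lambda a, b: a * b, "/": lambda a, b: a // b}
    stack = []
    for tok in postfix:
        if isinstance(tok, int):
            stack.append(tok)
        elif tok in ops:
            b, a = stack.pop(), stack.pop()
            stack.append(ops[tok](a, b))
        elif tok in symbols:
            stack.append(symbols[tok])
        else:
            return None  # value not known yet; evaluate in a later pass
    return stack[0]
```

In this sketch, ``to_postfix("start+2*(len-1)")`` yields the list
``["start", 2, "len", 1, "-", "*", "+"]``, which can be kept with the line
and evaluated once ``start`` and ``len`` have values.
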
At this stage, simple optimizations are performed on expressions. These
include coalescing constants (1+2+x => 3+x) and some basic algebra
(x+x => 2*x, 2*x+4*x => 6*x, x-x => 0).

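The simplifications listed above all amount to combining like terms in a
sum. The following Python sketch shows one way to do that. The
representation (a list of ``(coefficient, symbol)`` pairs, with ``None`` as
the symbol of a plain constant) and the names ``simplify`` and ``render``
are assumptions for illustration, not LWASM's internal data structures.

```python
def simplify(terms):
    """Combine like terms in a sum.

    `terms` is a list of (coefficient, symbol) pairs; symbol is None for a
    plain constant. Returns a simplified list in the same form.
    """
    constant = 0
    coeffs = {}  # symbol -> accumulated coefficient, in first-seen order
    for coeff, sym in terms:
        if sym is None:
            constant += coeff          # coalesce constants: 1+2 => 3
        else:
            coeffs[sym] = coeffs.get(sym, 0) + coeff  # x+x => 2*x
    result = []
    if constant != 0:
        result.append((constant, None))
    # Terms whose coefficients cancel out (x-x) are dropped entirely.
    result.extend((c, s) for s, c in coeffs.items() if c != 0)
    return result or [(0, None)]

def render(terms):
    """Format a term list as text, e.g. [(3, None), (1, "x")] -> "3+x"."""
    parts = []
    for coeff, sym in terms:
        if sym is None:
            parts.append(str(coeff))
        elif coeff == 1:
            parts.append(sym)
        else:
            parts.append(f"{coeff}*{sym}")
    return "+".join(parts).replace("+-", "-")
```

For example, ``render(simplify([(2, "x"), (4, "x")]))`` gives ``6*x``, and
``render(simplify([(1, "x"), (-1, "x")]))`` gives ``0``.
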
Pass 2 - Optimization
---------------------

This pass sweeps the code looking for operations that could use a shorter
instruction. When it finds one, it must re-define all symbols defined
subsequently, along with all symbols defined in terms of any of those
symbols, in a cascade. This process is repeated until no further reductions
are found.

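The repeated sweep can be sketched in Python on a toy model. Everything in
this model is an assumption made for illustration: a program is a list of
``("branch", label)``, ``("data", size)`` and ``("label", name)`` entries,
a long branch takes 3 bytes and a short one 2 bytes, and a short branch
needs a displacement in the range -128..127 measured from the end of the
instruction. LWASM's real data structures are not shown here.

```python
LONG, SHORT = 3, 2  # assumed sizes in bytes of a long and a short branch

def layout(program, sizes):
    """Assign an address to every entry and a value to every label."""
    addr, addrs, labels = 0, [], {}
    for i, (kind, arg) in enumerate(program):
        addrs.append(addr)
        if kind == "label":
            labels[arg] = addr
        else:
            addr += sizes[i]
    return addrs, labels

def relax(program):
    """Start from the largest sizes and shrink until nothing changes."""
    sizes = [LONG if kind == "branch" else (arg if kind == "data" else 0)
             for kind, arg in program]
    changed = True
    while changed:
        changed = False
        for i, (kind, arg) in enumerate(program):
            if kind != "branch" or sizes[i] == SHORT:
                continue
            # Try the short form and re-define every symbol (the cascade).
            trial = sizes[:i] + [SHORT] + sizes[i + 1:]
            addrs, labels = layout(program, trial)
            disp = labels[arg] - (addrs[i] + SHORT)
            if -128 <= disp <= 127:
                sizes = trial
                changed = True
    _, labels = layout(program, sizes)
    return sizes, labels

program = [("branch", "end"), ("data", 125), ("branch", "end"),
           ("label", "end")]
sizes, labels = relax(program)
```

In this example the first branch cannot be shortened on the first sweep,
because its displacement would be 128. Once the second branch has been
shortened, the displacement drops to 127 and the next sweep shortens the
first branch too, which is why the process has to repeat.
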
If a phasing error or other conflict is encountered while applying an
instruction reduction, the reduction is backed out and marked as forced.

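The back-out behaviour can be sketched as a trial that is either committed
or discarded. In the Python below, ``try_reduction``, the ``forced`` set and
the ``is_consistent`` callback are hypothetical names; the callback stands
in for whatever check detects a phasing error or other conflict, which is
not specified here.

```python
def try_reduction(sizes, forced, index, new_size, is_consistent):
    """Attempt to shrink one instruction; back out and mark it on conflict.

    sizes         -- list of current instruction sizes
    forced        -- set of indexes that must keep their current size
    is_consistent -- hypothetical callback: takes a trial size list and
                     returns False on a phasing error or other conflict
    """
    if index in forced:
        return sizes                 # never retry a forced instruction
    trial = list(sizes)              # work on a copy so backing out is free
    trial[index] = new_size
    if is_consistent(trial):
        return trial                 # commit the reduction
    forced.add(index)                # back out and remember the failure
    return sizes
```

Working on a copy keeps the original sizes untouched, so backing out is
simply a matter of discarding the trial.
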
The following may be candidates for reduction, depending on assembler
options:

- extended addressing -> direct addressing (not in obj target)
- 16 bit offset -> 8 bit offset (indirect indexed)
- 16 bit offset -> 8 bit or 5 bit offset (direct indexed)
- 16 bit offset -> no offset (indexed)
- 16 bit relative -> 8 bit relative (depending on configuration)

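The indexed offset reductions in the list above can be sketched as a small
Python selection function. It assumes two's complement ranges (-16..15 for
a 5 bit offset, -128..127 for an 8 bit offset) and, following the list,
allows the 5 bit form only when the access is not indirect. The name
``indexed_offset_bits`` is hypothetical, and assembler options and forced
sizes are ignored.

```python
def indexed_offset_bits(offset, indirect=False):
    """Return the smallest offset width in bits (0, 5, 8 or 16)."""
    if offset == 0:
        return 0                      # 16 bit offset -> no offset
    if not indirect and -16 <= offset <= 15:
        return 5                      # 5 bit form: direct indexed only
    if -128 <= offset <= 127:
        return 8
    return 16                         # no reduction possible
```

For example, ``indexed_offset_bits(15)`` returns 5, while
``indexed_offset_bits(15, indirect=True)`` returns 8.
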