comparison doc/internals.txt @ 333:ebff3a3e8fa6

Updated internals to describe the multi-pass architecture
author lost
date Tue, 02 Mar 2010 00:44:18 +0000
parents ed3553296580
children e7885b3ee266
332:67224d8d1024 333:ebff3a3e8fa6
2 =============== 2 ===============
3 3
4 LWASM is a table-driven assembler that notionally uses two passes. However, 4 LWASM is a table-driven assembler that notionally uses two passes. However,
5 it implements its assembly in several passes as follows. 5 it implements its assembly in several passes as follows.
6 6
7 Pass 1 - Preprocessing & Parsing 7 Pass 1
8 -------------------------------- 8 ------
9 9
10 This pass reads the source file and all included source files. It handles 10 This pass reads the entire source code and parses each line into an internal
11 macro definition and expansion. 11 representation. Macros, file inclusions, and conditional assembly
12 instructions are resolved at this point as well.
12 13
13 As it reads the various lines, it also identifies any symbol associated with 14 Pass 2
14 the line, the operation code, and, based on the operation code, the operand, 15 ------
15 if any. Upon examination of the operand, any expressions are stored in an
16 internal postfix notation for later evaluation. During this pass,
17 preliminary values are assigned to all symbols using the largest possible
18 instruction size. A table of lines that reference every symbol is generated
19 to be used in the following pass. Note that any symbols for which the value
20 is known with no uncertainty factor will be generated with the smallest
21 possible instruction.
22 16
23 At this stage, simple optimizations are performed on expressions. This 17 This pass assigns instruction sizes to all invariate instructions. Invariate
24 includes coalescing constants (1+2+x => 3+x). It also includes some basic 18 instructions are any instructions with a fixed size, including those with
25 algebra (x+x => 2*x, 2*x+4*x => 6*x, x-x => 0). 19 forced addressing modes.
26 20
27 Pass 2 - Optimization 21 Pass 3
28 --------------------- 22 ------
29 23
30 This pass sweeps the code looking for operations which could use a shorter 24 This pass resolves all instruction sizes that can be resolved without
31 instruction. If it finds one, it must then re-define all symbols defined 25 setting addresses for instructions. This process is repeated until no
32 subsequently and all symbols defined in terms of one of those symbols in a 26 further instruction sizes are resolved.
33 cascade. This process is repeated until no more possible reductions are
34 discovered.
35 27
36 If, in the process of implementing an instruction reduction, a phasing error 28 Pass 4
37 or other conflict is encountered, the reduction is backed out and marked as 29 ------
38 forced.
39 30
40 The following may be candidates for reduction, depending on assembler 31 This pass assigns addresses to all symbols where values are known. It does
41 options: 32 the same for instructions. Then a repeat of similar algorithms as in the
33 previous pass is used to resolve as many operands as possible.
42 34
43 - extended addressing -> direct addressing (not in obj target) 35 This pass is repeated multiple times until no further instructions or
44 - 16 bit offset -> 8 bit offset (indirect indexed) 36 symbols are resolved.
45 - 16 bit offset -> 8 bit or 5 bit offset (direct indexed) 37
46 - 16 bit offset -> no offset (indexed) 38 Pass 5
47 - 16 bit relative -> 8 bit relative (depending on configuration) 39 ------
40
41 Finalization of all instruction sizes by forcing them to the maximum
42 addressing mode. Then all remaining instruction addresses and symbol values
43 are resolved.
44
45 Pass 6
46 ------
47
48 This pass does actual code generation.
48 49
49 50
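The repeat-until-nothing-changes behaviour of Passes 3 and 4, followed by the forced maximum sizes of Pass 5, can be sketched roughly as below. This is only an illustration in Python, not LWASM's code: the `Line` record, the `sweep` and `resolve` helpers, and the rule "2 bytes if the operand value is below 256, otherwise 3" are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Line:
    """Hypothetical parsed line; not LWASM's internal representation."""
    label: Optional[str] = None    # symbol defined at this line, if any
    size: Optional[int] = None     # None means "not resolved yet"
    operand: Optional[str] = None  # symbol whose value decides the size
    max_size: int = 3              # size used when Pass 5 forces the issue


def sweep(lines: List[Line], symbols: Dict[str, int]) -> bool:
    """Run one sweep; return True if any symbol or size was newly resolved."""
    progress = False
    address: Optional[int] = 0     # becomes None after an unresolved size
    for line in lines:
        if line.label and line.label not in symbols and address is not None:
            symbols[line.label] = address
            progress = True
        if line.size is None and line.operand in symbols:
            # Assumed rule: small values fit the short form.
            line.size = 2 if symbols[line.operand] < 256 else 3
            progress = True
        if address is None or line.size is None:
            address = None
        else:
            address += line.size
    return progress


def resolve(lines: List[Line]) -> Dict[str, int]:
    symbols: Dict[str, int] = {}
    while sweep(lines, symbols):   # repeat until a sweep resolves nothing
        pass
    for line in lines:             # Pass 5 analogue: force the maximum size
        if line.size is None:
            line.size = line.max_size
    while sweep(lines, symbols):   # remaining addresses now fall out
        pass
    return symbols


program = [
    Line(label="start", size=1),   # fixed size, known up front
    Line(operand="start"),         # backward reference
    Line(operand="end"),           # forward reference
    Line(label="end", size=1),
]
table = resolve(program)
sizes = [line.size for line in program]
```

In this sketch the backward reference is sized during the first sweep, while the forward reference stays unresolved until it is forced to the maximum size, after which the address of `end` can be assigned.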
51 Expression Evaluation
52 =====================
53
54 Each expression carries a certainty flag. Any expression in which any term
55 is flagged as uncertain is, itself, uncertain. There are a few specific
56 cases where such uncertainty can cancel out. For instance, X-X where X is
57 uncertain is guaranteed to be 0 and so there is no uncertainty.
58
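A minimal sketch of how a certainty flag might propagate, including the X-X cancellation case, is shown below. It is written in Python for illustration only and handles nothing beyond sums of terms with integer coefficients; the tuple layout and the `is_certain` name are assumptions, not LWASM's expression representation.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical term layout: (coefficient, symbol name, certain flag).
Term = Tuple[int, str, bool]


def is_certain(terms: Iterable[Term]) -> bool:
    """Return True if the sum of the given terms has no uncertainty."""
    net: Dict[str, int] = {}       # net coefficient per symbol
    flags: Dict[str, bool] = {}    # certainty per symbol
    for coeff, name, certain in terms:
        net[name] = net.get(name, 0) + coeff
        flags[name] = flags.get(name, True) and certain
    # A symbol whose net coefficient is zero has cancelled out, so its
    # uncertainty no longer affects the result.
    return all(flags[name] for name, coeff in net.items() if coeff != 0)


x_minus_x = is_certain([(1, "X", False), (-1, "X", False)])
x_plus_y = is_certain([(1, "X", False), (1, "Y", True)])
y_only = is_certain([(1, "Y", True)])
```

Here `x_minus_x` comes out certain because the uncertain symbol cancels, while `x_plus_y` stays uncertain because one uncertain term survives.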