comparison doc/internals.txt @ 333:ebff3a3e8fa6
Updated internals to describe the multi-pass architecture
author | lost |
---|---|
date | Tue, 02 Mar 2010 00:44:18 +0000 |
parents | ed3553296580 |
children | e7885b3ee266 |
332:67224d8d1024 | 333:ebff3a3e8fa6 |
---|---|
2 =============== | 2 =============== |
3 | 3 |
4 LWASM is a table-driven assembler that notionally uses two passes. However, | 4 LWASM is a table-driven assembler that notionally uses two passes. However, |
5 it implements its assembly in several passes as follows. | 5 it implements its assembly in several passes as follows. |
6 | 6 |
7 Pass 1 - Preprocessing & Parsing | 7 Pass 1 |
8 -------------------------------- | 8 ------ |
9 | 9 |
10 This pass reads the source file and all included source files. It handles | 10 This pass reads the entire source code and parses each line into an internal |
11 macro definition and expansion. | 11 representation. Macros, file inclusions, and conditional assembly |
| | 12 instructions are resolved at this point as well. |
12 | 13 |
13 As it reads the various lines, it also identifies any symbol associated with | 14 Pass 2 |
14 the line, the operation code, and, based on the operation code, the operand, | 15 ------ |
15 if any. Upon examination of the operand, any expressions are stored in an | |
16 internal postfix notation for later evaluation. During this pass, | |
17 preliminary values are assigned to all symbols using the largest possible | |
18 instruction size. A table of lines that reference every symbol is generated | |
19 to be used in the following pass. Note that any symbols for which the value | |
20 is known with no uncertainty factor will be generated with the smallest | |
21 possible instruction. | |
22 | 16 |
23 At this stage, simple optimizations are performed on expressions. This | 17 This pass assigns instruction sizes to all invariate instructions. Invariate |
24 includes coalescing constants (1+2+x => 3+x). It also includes some basic | 18 instructions are any instructions with a fixed size, including those with |
25 algebra (x+x => 2*x, 2*x+4*x => 6*x, x-x => 0). | 19 forced addressing modes. |
26 | 20 |
27 Pass 2 - Optimization | 21 Pass 3 |
28 --------------------- | 22 ------ |
29 | 23 |
30 This pass sweeps the code looking for operations which could use a shorter | 24 This pass resolves all instruction sizes that can be resolved without |
31 instruction. If it finds one, it must then re-define all symbols defined | 25 setting addresses for instructions. This process is repeated until no |
32 subsequently and all symbols defined in terms of one of those symbols in a | 26 further instruction sizes are resolved. |
33 cascade. This process is repeated until no more possible reductions are | |
34 discovered. | |
35 | 27 |
36 If, in the process of implementing an instruction reduction, a phasing error | 28 Pass 4 |
37 or other conflict is encountered, the reduction is backed out and marked as | 29 ------ |
38 forced. | |
39 | 30 |
40 The following may be candidates for reduction, depending on assembler | 31 This pass assigns addresses to all symbols where values are known. It does |
41 options: | 32 the same for instructions. Then a repeat of similar algorithms as in the |
| | 33 previous pass is used to resolve as many operands as possible. |
42 | 34 |
43 - extended addressing -> direct addressing (not in obj target) | 35 This pass is repeated multiple times until no further instructions or |
44 - 16 bit offset -> 8 bit offset (indirect indexed) | 36 symbols are resolved. |
45 - 16 bit offset -> 8 bit or 5 bit offset (direct indexed) | 37 |
46 - 16 bit offset -> no offset (indexed) | 38 Pass 5 |
47 - 16 bit relative -> 8 bit relative (depending on configuration) | 39 ------ |
| | 40 |
| | 41 Finalization of all instruction sizes by forcing them to the maximum |
| | 42 addressing mode. Then all remaining instruction addresses and symbol values |
| | 43 are resolved. |
| | 44 |
| | 45 Pass 6 |
| | 46 ------ |
| | 47 |
| | 48 This pass does actual code generation. |
48 | 49 |
49 | 50 |
| | 51 Expression Evaluation |
| | 52 ===================== |
| | 53 |
| | 54 Each expression carries a certainty flag. Any expression in which any term |
| | 55 is flagged as uncertain is, itself, uncertain. There are a few specific |
| | 56 cases where such uncertainty can cancel out. For instance, X-X where X is |
| | 57 uncertain is guaranteed to be 0 and so there is no uncertainty. |
| | 58 |
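
The certainty rule in the new "Expression Evaluation" section can be illustrated with a small sketch. This is not LWASM's code (LWASM is written in C, and its internal representation is not shown in this changeset); the `Term` type and the `add` and `subtract` helpers below are hypothetical names chosen for illustration. The sketch assumes each term carries a value and a certainty flag, that a result is certain only if both operands are, and that subtracting a symbol from itself is the one special case handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """Hypothetical expression term: a named symbol or an anonymous result."""
    name: str      # symbol name, or "" for a constant or computed value
    value: int     # current (possibly preliminary) value
    certain: bool  # True if the value can no longer change


def add(a: Term, b: Term) -> Term:
    # The sum is uncertain if either operand is uncertain.
    return Term("", a.value + b.value, a.certain and b.certain)


def subtract(a: Term, b: Term) -> Term:
    # X - X is 0 whatever X finally resolves to, so the uncertainty cancels.
    if a.name and a.name == b.name:
        return Term("", 0, True)
    # Otherwise the difference is uncertain if either operand is uncertain.
    return Term("", a.value - b.value, a.certain and b.certain)


x = Term("X", 0x1000, certain=False)  # address not yet fixed
two = Term("", 2, certain=True)       # literal constant

print(add(x, two).certain)    # False
print(subtract(x, x))         # Term(name='', value=0, certain=True)
```

A real evaluator would need to recognize cancellation in larger expressions (for example X+2-X), which this sketch does not attempt.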