Post-Assembly Program Relocation: Relocation Dictionary (RLD) and External Symbol Dictionary (ESD)

1. Terminology: Relocatable vs. Not Relocatable

"Relocatable" refers to something that can be moved without adjustment; i.e., the program semantics are not changed by moving the item. To say something is not relocatable means that if you move it, you will have to adjust it. For simple SIC, most instructions use absolute referencing, and so most operands are not relocatable. In contrast, most SIC/XE operands are relocatable, the exception being 4-byte instructions (those written with a "+" prefix), which usually have operands that are not relocatable.

Symbols such as statement labels and self-defining literals have a value that is relative to the START statement. These are called relative references. When a relative reference is used in an absolute context, it is not relocatable; e.g., for

    +LDA #STUFF

the operand STUFF (which is a relative reference) is not relocatable.

A reference whose value does not depend on the START statement is called an absolute reference. Actual numbers are always absolute references, so in contrast,

    +LDA #1234

has a relocatable operand. Note that LDA 1234 is relocatable for either simple SIC or SIC/XE.

2. Load Process

The initial load process for a relocating loader is the same as that for an absolute loader (such as the one used for the SIC Simulator); i.e., the loader simply loads the object module into memory starting from a specified memory location. For an absolute loader, this location is specified as part of the load module, and since the assembler has determined all addresses in the module using this location, no addressing adjustments are required. For a relocating loader, the memory location is not specified until load time (presumably provided by the operating system or some other control program); i.e., the load module is not bound to a specific load location in memory until it is actually loaded.
Since this location was not known at the time the object module was assembled, any relative references used in an absolute context (i.e., the non-relocatable operands) must be adjusted to reflect the load location in order for the loaded code to work as planned. This process is called program relocation.

3. Relocation Dictionary (RLD)

If programs are to be dynamically relocated in memory, the assembler must generate a table of those program locations that are not relocatable, called the Relocation Dictionary or RLD. For discussion purposes, we assume that the program START is 0. It is a simple adjustment otherwise. The RLD is used by the loader, which (for a START of 0) simply adds the program load point to the 3-byte value stored at each (relative) location listed in the RLD. In contrast to SIC/XE, where most operands are relocatable, the RLD for a simple SIC program would need to specify almost every instruction.

The RLD is normally generated "on the fly" during pass 2 of the assembler, with each entry derived from the current value of the location counter. The only information needed in an RLD entry is the (relative) location of the 3-byte word within the object module that needs to be adjusted for the object code to work. In particular, for the most common case in SIC/XE, a 4-byte instruction with a non-relocatable operand, the assembler simply adds locctr + 1 to the RLD as part of the process of generating code for the instruction. Note that it is during code generation that the assembler determines whether generated code is relocatable or non-relocatable.

Example: Assume that the WORD storage directive allows symbolic references in the operand field and suppose that a program has the following lines:

    loc    assembly language statement    object code
    2B1    LOOP   LDB    ADDR,X           6BA010
           .
    2C4    ADDR   WORD   STUFF            0052A1
           .
    410C          +JSUB  LOOP             4B1002B1
    4110          +LDB   #12345           69103039
           .
    52A1   STUFF  RESW   12000

Since LOOP is a label in the program, the statement +JSUB LOOP has a relative reference (LOOP) used in an absolute context, and so its operand is not relocatable. This is determined during code generation in pass 2, and the location of the operand, 410C + 1 = 00410D, is added to the RLD. In contrast, for the statement +LDB #12345 the operand is absolute, so no RLD entry is generated.

For the statement ADDR WORD STUFF, the relative reference STUFF also occurs in an absolute context, which means that the 3-byte word generated by the assembler is not relocatable. Hence, the location of this particular WORD directive needs to be added to the RLD. Just as with "+" instructions, this is normally determined "on the fly" during pass 2 of the assembler, in this case when the WORD statement is resolved. Here the location counter points directly to the location of the operand; hence, the value 0002C4 is added to the RLD (which for this example also happens to be the value of ADDR).

4. Externally Defined Symbols, Control Sections (CSECT)

Suppose that a program consists of a main routine and 2 subroutines and that these are being written independently. Without further support, the source files must essentially be amalgamated in order to assemble these routines, because each may use symbolic references defined in one of the other routines. As program size increases, so does the need to work with subroutines and assemble them independently, so it is advantageous to automate this process. The mechanism employed is that of a control section (CSECT).

A control section is a block of code that can be assembled independently. The first line of the block is

    <label> CSECT <initial-locctr>

The block ends when the END statement or another CSECT is encountered. The START statement serves as the first CSECT. If a CSECT is labeled, the label is usually referred to as the name of the control section. If no initial location counter is provided, assembly starts from 0 for the control section.
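
Because a control section assembled from 0 is relocated exactly as described in section 3, the loader's use of the RLD can be sketched in a few lines. The Python below is only an illustrative sketch: the function name relocate, the byte-array model of memory, and the load point 0x5000 are assumptions made for this example and are not part of the notes. It applies the two RLD entries from the section 3 example (0002C4 and 00410D).

```python
def relocate(image: bytearray, rld: list[int], load_point: int) -> None:
    """Add load_point to the 3-byte word at each module-relative offset in rld.

    Assumes the module was assembled with START 0, as in section 3.
    """
    for offset in rld:
        word = int.from_bytes(image[offset:offset + 3], "big")
        word = (word + load_point) & 0xFFFFFF  # keep the result within 3 bytes
        image[offset:offset + 3] = word.to_bytes(3, "big")


# Toy module image holding only the two non-relocatable items from section 3.
image = bytearray(0x4114)
image[0x2C4:0x2C7] = bytes.fromhex("0052A1")      # ADDR  WORD  STUFF
image[0x410C:0x4110] = bytes.fromhex("4B1002B1")  # +JSUB LOOP

rld = [0x2C4, 0x410D]          # 410D = locctr + 1 for the +JSUB instruction
relocate(image, rld, load_point=0x5000)  # 0x5000 is a hypothetical load point

print(image[0x2C4:0x2C7].hex().upper())    # 00A2A1
print(image[0x410C:0x4110].hex().upper())  # 4B1052B1
```

After relocation, the word at ADDR holds 52A1 + 5000 = A2A1 and the address field of +JSUB holds 2B1 + 5000 = 52B1, while the opcode and flag bits of the instruction are unchanged.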
Since it can be assembled independently, each control section has its own RLD. There are two types of symbolic references used in a control section:

    1. symbols defined in the control section that are referenced only within the section (local symbols)
    2. symbols used in the control section that are not defined within the section (externally defined symbols)

A local symbol name may appear in more than one control section, since its reference within each section is unambiguous.

Example: P 87

5. EXTREF and EXTDEF Statements

An externally defined symbol used within a control section must be identified by using EXTREF statements within the control section; e.g.,

    EXTREF SUB1, SUB2

The EXTREF only needs to be issued before the first use of the symbol in an operand, although good form is to place all EXTREF statements at the beginning of the CSECT. It is an error for a control section to have an EXTREF for a symbol that is also defined within the section.

A control section identifies symbols that are to be made available to other control sections by using EXTDEF statements; e.g.,

    EXTDEF TABSIZE, ADDR1

If there is a label on the CSECT statement, it is automatically included as an EXTDEF (i.e., specification as EXTDEF is inferred).

6. External Symbol Dictionary (ESD)

The RLD provides the information needed for the loader to relocate a control section. The means for dealing with externally defined symbols is called the external symbol dictionary or ESD. In contrast to the RLD, the ESD for each control section must contain both symbolic and location information. There are 2 basic parts to the ESD:

    1. EXTDEF Part: finalized in pass 1 as pairs consisting of
           (<EXTDEF-symbol>, <value>)
       [<value> = value of the symbol in the symbol table]
    2. EXTREF Part: finalized in pass 2 as triples consisting of
           (<EXTREF-symbol>, <location>, <operation>)
       [<location> = (relative) location of operand referencing the symbol]
       [<operation> = +, -, *, / ]

Example: EXTDEF part

Given a control section labeled SUB1, the statement EXTDEF TABSIZE, ADDR1 might generate as the EXTDEF part of the ESD the table:

    SUB1      000000
    TABSIZE   0000F3
    ADDR1     0000DD

where each EXTDEF address is taken straight from the symbol table at the end of pass 1.

Each entry in the EXTDEF part of an ESD provides the value of a symbolic reference. It is an error if a symbol appears in the EXTDEF part of more than one ESD. Note that the references in the EXTDEF part are not relocatable; i.e., their values must be adjusted at load time by adding on the load point for their associated module.

The EXTDEF part is set up by EXTDEF statements and can be finished at the end of pass 1. In contrast, the EXTREF part can only be constructed incrementally as external operands are encountered during pass 2 code generation. At module load time, each location in the EXTREF part must be adjusted by adding on the load point for the associated module (as is also the case for the EXTDEF part). If operand arithmetic is not supported, the <operation> entry is redundant (it defaults to "+"), because the value of the external reference, once known, is just added to the 3-byte value at the operand location.

Example: EXTREF part

Suppose that TABSIZE is an EXTREF for some module, and the assembler (in pass 2) encounters the statement

    +LDA TABSIZE

with the location counter at 12B. The pass 2 object code generated is then 03100000 and the entry for the EXTREF part of the ESD is

    TABSIZE   00012C   +      (the operand is at locctr+1)

In essence, in generating the object code, the external reference to TABSIZE in this example is treated as an absolute reference (0), to be resolved at load time once the (relocated) value of TABSIZE becomes known.
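
The way a loader might use the two parts of the ESD can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not a definitive loader: the function names, the dictionary layout of a module, the module name MAIN, and the load points 0x1000 and 0x2000 are all hypothetical. Only the ESD entries and the object code 03100000 come from the examples above, and only the "+" and "-" operations are handled.

```python
def build_global_table(modules):
    """Relocate every EXTDEF value by its module's load point."""
    table = {}
    for mod in modules:
        for symbol, value in mod["extdef"]:
            if symbol in table:  # a symbol may be EXTDEF'd by only one module
                raise ValueError(f"duplicate EXTDEF symbol: {symbol}")
            table[symbol] = value + mod["load_point"]
    return table


def resolve_extrefs(memory, modules, table):
    """Combine each external symbol's value with the 3-byte word it patches."""
    for mod in modules:
        for symbol, location, op in mod["extref"]:
            addr = mod["load_point"] + location  # relocate the EXTREF location
            word = int.from_bytes(memory[addr:addr + 3], "big")
            if op == "+":
                word += table[symbol]
            elif op == "-":
                word -= table[symbol]
            else:
                raise NotImplementedError(f"operation {op!r} not handled here")
            memory[addr:addr + 3] = (word & 0xFFFFFF).to_bytes(3, "big")


# Hypothetical load points; MAIN is an assumed name for the referencing module.
main = {"load_point": 0x1000, "extdef": [("MAIN", 0x0)],
        "extref": [("TABSIZE", 0x12C, "+")]}
sub1 = {"load_point": 0x2000,
        "extdef": [("SUB1", 0x0), ("TABSIZE", 0xF3), ("ADDR1", 0xDD)],
        "extref": []}
modules = [main, sub1]

memory = bytearray(0x3000)
memory[0x112B:0x112F] = bytes.fromhex("03100000")  # +LDA TABSIZE at 12B in MAIN

table = build_global_table(modules)
resolve_extrefs(memory, modules, table)

print(hex(table["TABSIZE"]))                 # 0x20f3
print(memory[0x112B:0x112F].hex().upper())   # 031020F3
```

With these load points, TABSIZE relocates to F3 + 2000 = 20F3, and adding that value to the 3-byte word at the relocated operand location turns 03100000 into 031020F3.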
