Nr. 2..5, Mar..Jun-1996 (revised 07-Sep-2009: added Chap. 12)
Copyright (1996-2009) by Günter Dotzel and Hartmut Goebel
The Oberon System is an object-oriented programming framework supporting persistent objects and run-time extensibility. This paper describes the development of Alpha Oberon, an implementation of the Oberon System V4 for Alpha AXP under OpenVMS with an X11 server. The port is based on ETH Zuerich's Oberon System for MIPS/Unix and ModulaWare's stand-alone Oberon-2 compiler for OpenVMS AXP. The processor- and operating-system-specific parts of the Oberon System and its boot-loader were rewritten from scratch in Oberon-2. Details are provided for the module-loader, the bootstrap mechanism, the garbage collector, the system service support of OpenVMS versus Unix for exception and stack processing, procedure calling conventions, data alignment issues, and run-time data structures.
Keywords: Oberon, Alpha, AXP, OpenVMS, Compiler, OOP, Operating System, Exception handler, Garbage collector
This paper describes the development of Alpha Oberon, the Oberon System V4 for OpenVMS AXP with OSF/Motif (X11), with respect to the implementation of the module-loader, the bootstrap mechanism, the garbage collector (GC), the exception handler (EH), and the modifications to the stand-alone Oberon compiler.
Oberon is a programming language as well as an operating system [Rei91], [Wir92]. The programming language Oberon is simpler than Modula-2, but supports object-oriented programming through extensible records and polymorphism. Oberon-2 [Mös91] adds type-bound procedures (methods) and dynamic arrays. The Oberon System provides an extensible but compact framework for Oberon-2, where all resources are used economically.
The Oberon System is a single-process multitasking operating system with automatic garbage collection and dynamic module-loading, which allows run-time extensibility. See also [Rei91], [Rei92], [Wir88], and [Wir92].
ETH Zuerich offers two versions of the Oberon System, V3 and V4. The concept of V3 seems more powerful, because it is based entirely on persistent objects [Gut94]. However, the interfaces of V4 are stable, whereas V3 is still evolving. Also, the sources of V3 were not available at the start of the project, which ruled out starting with V3. There is, however, evidence from other implementors that migrating from V4 to V3 is not complicated.
Alpha Oberon is based on DECOberon, an implementation of Oberon System V4 for MIPS/Unix (DEC Ultrix). All processor- and platform-specific parts of the Oberon System, i.e. the module-loader, the boot-loader and the EH, were rewritten from scratch.
DECOberon uses X11 as its graphical user interface; X11 offers the drawing functions and the mouse- and keyboard-event processing needed by the Oberon System. Except for the foreign procedure type declarations, module X11 remained almost unchanged in Alpha Oberon.
In DECOberon, the calls to the file I/O interface and to other Unix service routines are located in module Unix. The interface of module Unix was essentially kept unchanged in Alpha Oberon, so that higher-level Oberon System modules didn't need any changes. The implementation of module Unix translates all Unix functions needed by the Oberon System to functionally equivalent OpenVMS calls. Those Unix services which don't have a direct equivalent in OpenVMS are emulated in module Unix. Such an emulation is not always trivial. For example, Alpha Oberon also supports reading OpenVMS' variable record length files, which involves a temporary file copy in Unix.Open.
Compared to other RISC processors, the AXP architecture [AXP] has some architectural peculiarities: there are no symbolic references in the program code, there is no program status word, and there are 48 different modes for each floating-point instruction. The processor can execute multiple instructions in one cycle. In high-performance mode, it is not possible to locate the instruction that caused a trap (e.g. division by zero or access violation). The processor also requires a special EH mechanism and natural data alignment to avoid execution-time penalties.
For other implementations of the Oberon System, a dedicated compiler was developed. For the Alpha Oberon port, a stand-alone, native-code Oberon-2 compiler, A2O, was already at hand [Dot94]. During the development of Alpha Oberon only three small compiler errors were detected. A2O directly generates AXP machine code in OpenVMS object-file format. The A2O data sheet is available on ModulaWare's homepage (file /h2odat.txt).
Only a relatively small extension of A2O's compiler-backend (linker-interface) was necessary to generate Oberon load-files (OLF) instead of OpenVMS object-files (for stand-alone programs) and to support the GC (see chap. 7).
Unlike other implementations such as HP-Oberon [Sup94], no distinction has to be made between calling internal procedures (available with the Oberon System) and external procedures (shareable image), because A2O uses the OpenVMS AXP calling conventions [VMS2]. A2O generates identical code for the Oberon System and for stand-alone applications. Any run-time system (RTS) routine of A2O can easily be remapped to an external procedure, a feature which is used in the boot-loader (see chap. 5.1 and 6).
This chapter summarises those concepts whose implementation details are presented in the chapters 5 to 8.
Dynamic module-loading allows a module to be loaded only when it is needed. A module neither has to be present during system development, nor has to be declared. No linking is required; such an unnecessary additional step would limit the extensibility of the system. If the user develops and compiles a module, this module is immediately available by loading it. The module-loader loads each module into the heap of the Oberon System, resolves all references and initializes the module body. If a module imports other modules, they are also loaded, unless they are already located in the heap. This process runs recursively to make sure that all indirectly used modules are available. The so-called module-key guarantees the version consistency of the module interfaces.
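The recursive loading scheme can be illustrated with a small model. The following Python sketch is not the Alpha Oberon loader; the dictionary layout and the names load and KeyMismatch are assumptions made for illustration only. It shows that imports are loaded before the importer, that resident modules are not loaded twice, and that a stale module-key is rejected.

```python
class KeyMismatch(Exception):
    """Raised when an importer expects a different interface key."""


def load(name, object_files, loaded, order=None):
    """Load `name` and, recursively, everything it imports.

    object_files: name -> {"key": int, "imports": {name: expected_key}}
    loaded:       name -> key of the modules already resident in the heap
    Returns the list of modules whose bodies were initialized, in order.
    """
    if order is None:
        order = []
    if name in loaded:                  # already in the heap: nothing to do
        return order
    obj = object_files[name]
    for imp, expected_key in obj["imports"].items():
        load(imp, object_files, loaded, order)    # imports come first
        if loaded[imp] != expected_key:           # interface changed since compilation
            raise KeyMismatch(f"{name} imports {imp} with a stale key")
    loaded[name] = obj["key"]
    order.append(name)                  # stands in for running the module body
    return order
```

Because the import graph of Oberon modules is acyclic, the sketch needs no cycle detection.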
The GC automatically collects all unused storage units so that they can be re-used for allocation. The GC recognizes a storage unit as unused if it is no longer directly or indirectly referenced. The GC uses the so-called mark-sweep method: first all reachable objects are marked, then the heap is examined sequentially and the storage of all non-marked objects is released. This requires that the heap consists of a contiguous storage area. The boot-loader allocates this storage area from the operating system and makes it available to the Oberon System. It isn't necessary to initialize this (possibly large) storage area with NIL, because this is done at each allocation with Oberon's NEW. As in most Oberon Systems which are built on top of an existing operating system, the GC is also responsible for closing all files. More details are presented in [Wir92], [Pfi91], [Tem91] and chap. 7.
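The two phases of the mark-sweep method can be sketched as follows. This is a simplified Python model, not the Alpha Oberon GC: the heap is assumed to be a list of blocks in address order, and the block layout with the fields refs and marked is hypothetical.

```python
def mark_sweep(heap, roots):
    """heap: list of blocks in address order, each {"refs": [indices], "marked": bool}.
    roots: indices of blocks referenced from outside the heap.
    Returns the indices of the blocks that were released."""
    # Mark phase: follow references from the roots (iteratively, no recursion).
    stack = list(roots)
    while stack:
        i = stack.pop()
        if not heap[i]["marked"]:
            heap[i]["marked"] = True
            stack.extend(heap[i]["refs"])
    # Sweep phase: one sequential pass over the whole heap.
    freed = []
    for i, block in enumerate(heap):
        if block["marked"]:
            block["marked"] = False     # reset the mark for the next collection
        else:
            freed.append(i)
    return freed
```

Blocks that reference each other but cannot be reached from a root are released as well, which a pure reference count would not achieve.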
Programming errors (traps) have to be handled within the system; otherwise the system would terminate and control would return to the underlying operating system (if there is one). An example illustrates why an exception handler is needed: In a conventional operating system, an editor is activated by a command. This command puts the computer into the "edit" mode. If an error occurs during execution which isn't handled by the editor, the editor (program) terminates. In most cases the modifications to the text are lost. In the Oberon System there is no edit mode: there is only a class 'Text' and a set of editing commands which operate on this class. Loading a text file is also just a command (Edit.Open), which loads the text and opens a so-called viewer to map part of that text to a display window. After execution of Edit.Open, control returns to the system's main loop, Oberon.Loop. If an error occurs during command execution which isn't handled by the command itself, control is also given back to the main loop. Thus the text is kept and can still be edited.
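The effect of handling traps in the main loop can be modelled in a few lines. In this Python sketch a Python exception stands in for a processor trap; main_loop, commands and log are hypothetical names and do not correspond to the interface of Oberon.Loop.

```python
def main_loop(commands, inputs, log):
    """Run each requested command; a failing command must not end the loop."""
    for name in inputs:
        try:
            commands[name]()
        except Exception as trap:       # stands in for a processor trap
            log.append(f"TRAP in {name}: {trap}")
        # control is back in the loop either way; the system state is kept
```

A command that fails leaves the data of earlier commands intact, and later commands continue to operate on it.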
There are three module categories: core-modules, system-modules and tools. The bootstrap loads the Oberon System into the process' main storage. There are two boot-phases: the first is under control of the boot-loader (module BootLoader) and the second under control of the Oberon core (module System). The boot-loader loads the code, reference and constant sections of all core-modules into the heap. During the load, all references are resolved, the type descriptors for run-time type information are constructed and the global variable sections are reserved. Then the boot-loader transfers program control to the body of the module at the top of the import hierarchy for initialisation. At this point, the Oberon System is initialized and the remaining system-modules are loaded.
The architecture of the original Oberon System is inherently designed for 32 Bit. In Oberon, the 32 bit dependency can be illustrated for example by the function SYSTEM.ADR, which is of type LONGINT (32 Bit). The Oberon System assumes at many places, that LONGINT has the same size as an address. Low-level modules like Kernel and the module-loader perform a lot of address computations using variables of type LONGINT. There are many 32 bit dependencies, which can't easily be detected and hence it seems impossible to have 64 bit addressing without also changing the pervasive type LONGINT to 64 bit integer.
The AXP processor has 64 bit addresses, but OpenVMS processes are currently restricted to a 32 bit virtual address space. The 64 bit address representation is a canonical extension of the sign bit 31 into bits 32 to 63. This technique makes system services and data types compatible between VAX and AXP, which eases migration to AXP. DEC is currently extending OpenVMS to allow a 64-bit virtual process space.
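The canonical form is an ordinary sign extension: bit 31 of the 32 bit address is copied into bits 32 to 63. A minimal Python sketch, with the hypothetical helper names to_canonical64 and is_canonical, makes the arithmetic explicit:

```python
def to_canonical64(addr32):
    """Sign-extend a 32-bit address into 64 bits (bit 31 copied to bits 32..63)."""
    addr32 &= 0xFFFFFFFF
    if addr32 & 0x80000000:
        return addr32 | 0xFFFFFFFF00000000
    return addr32


def is_canonical(addr64):
    """True if bits 32..63 all equal bit 31, i.e. the value fits a 32-bit address."""
    addr64 &= 0xFFFFFFFFFFFFFFFF
    return to_canonical64(addr64 & 0xFFFFFFFF) == addr64
```

For example, the 32 bit address 80000000H becomes FFFFFFFF80000000H, while addresses below 80000000H are unchanged.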
One of the goals of Alpha Oberon was to prepare for the 64 bit address extension in the low-level modules (module-loader, GC). This was done by introducing two types: ADDRESS = LONGINT, for variables which contain a 32 bit address, and ADDRESS64 = SYSTEM.SIGNED_64 (the 64-bit integer type of A2O), for variables which already contain a 64 bit address in canonical form. The heap data structures of Alpha Oberon are already prepared for 64 bit. Where possible, ADDRESS64 was used; otherwise fields of type ADDRESS are padded with fill-bytes.
A2O can generate OpenVMS object-files, which allow bindings to so-called shareable images. This feature makes it possible to connect at run-time to procedures which are not known at compile time. This is done using an operating system service (LIB$FIND_IMAGE_SYMBOL) via the symbol name, e.g. "Module.Procedure".
Using shareable images to implement the Oberon System's module-loader would allow a common object file format for stand-alone and embedded Oberon programs. But this possibility turned out to be impractical for several reasons. Therefore the A2O back-end (linker interface) was extended to allow the generation of syntactically simple Oberon load-files (OLF).
On instruction level, the generated code is identical for both formats. The reference section is of course different. What the OpenVMS linker is supposed to resolve at link time to produce a static linkage section, as required by the so-called base pointer architecture, needs to be done by the module-loader at run-time, with identical storage layout. Identical mapping is also required for the storage sections of constants and variables, and for the program code itself. Constants and program code are transferred one-to-one from the OLF and copied into the heap. Storage for the variable-section is allocated in the heap.
An OLF contains all information, necessary for module binding by the module-loader: imported modules, exported procedures and commands. The GC needs pointer-offsets and the exception- and trap-handler (post-mortem dump viewer) needs the symbolic references (RefBlk; see [Wir92]). For each loaded module a structure ModuleDesc is allocated to store its descriptor:
Name *= ARRAY 32 OF CHAR;
Module* = POINTER TO ModuleDesc;
ModuleDesc* = RECORD
next-: Module;
refcnt-: LONGINT;
key: ModuleKey;
imports: POINTER TO ARRAY OF (* Module *) ADDRESS;
linkage: LinkageSectionPtr;
procDescs-: POINTER TO ARRAY OF ProcDesc;
data-, const, code-: POINTER TO ARRAY OF LONGINT;
entries: POINTER TO ARRAY OF INTEGER;
cmds-: POINTER TO ARRAY OF Cmd;
ptrTab: POINTER TO ARRAY OF ADDRESS;
tdescs-: POINTER TO ARRAY OF (* Kernel.Tag *) ADDRESS64;
refs-: RefBlock;
rtsFlags: SET;
name-: Name;
init: BOOLEAN;
END ;
The types ModuleKey, Cmd and RefBlock aren't described here, because they
are not needed for the understanding of the presented concepts. The most
important task of the module-loader is the generation of the linkage section and
the type descriptor section.
Most CISCs and 32-bit RISCs call a procedure directly via an address whose immediate value follows the call instruction. The 64-bit AXP RISC processor does not have any linker-relocatable addresses in the program section. Each processor instruction has a fixed width of 32 bit and there is no room to store any 64 bit virtual addresses in the instruction format. Data is accessed relative to base pointers and procedures are accessed via a so-called procedure linkage pair. The called procedure finds its context and base pointers by a pointer to its own procedure descriptor [VMS2].
Data base pointers, linkage pairs and procedure descriptors are stored in the linkage-section. The references, all 64 bit wide, are resolved by the binder or loader. The procedure descriptor stores information about the start address of the procedure code (entry), parameter types, stack-frame size, and the EH, if present. A linkage-pair consists of the procedure code entry-address and the address of its procedure descriptor.
Before a procedure is called, in each case the corresponding procedure descriptor has to be loaded into a reserved register. Only by means of its own procedure descriptor, a procedure can load the required base registers to address its own or foreign global variables, constants, objects, and procedures.
With A2O, the linkage-section of a module contains the pointers to its own sections (constant, data, objects, run-time system), procedure descriptors of the own procedures, pointers to sections of imported modules (constant, data, objects), linkage pairs for the run-time system procedures and linkage-pairs for all used procedures from the own and from external modules.
These data structures are defined and documented in module A2OLayout.
LinkageSection = RECORD (* <modulename>_$link$ = modula2_$vector$ *)
linkdata: RECORD
data : ADDRESS64; (* modula2_$data$ *)
const : ADDRESS64; (* modula2_$strings$ *)
rts : ADDRESS64; (* Modula2$RunTimeSystem_$data$ *)
object : ADDRESS64; (* oberon2_$objects$ *)
unused1: ADDRESS64;
unused2: ADDRESS64;
END;
procDesc : ARRAY nofOwnProcs OF ProcDesc;
impMods : ARRAY nofImportedModules OF ImportedModuleData;
rts : ARRAY 16 OF ProcLinkage;
own : ARRAY nofOwnProcs OF ProcLinkage;
imp : ARRAY nofImportedProcs OF ProcLinkage;
END;
Pseudo-definition of the linkage-section
OpenVMS defines different formats of procedure descriptors; more details are in [VMS2]. For the module-loader this definition of the so-called full-frame procedure descriptor is sufficient:
ProcDescPtr = POINTER TO ProcDesc;
ProcDesc = RECORD
data1 : SYSTEM.QUADWORD;
entry : ADDRESS64;
data2 : ARRAY 2 OF SYSTEM.QUADWORD;
handlerData : ARRAY 2 OF ADDRESS64;
END;
ProcDesc = data1:8 entry:Num data2:16 handlerEntry:Num handlerData:Num .
Block(86X); i := 0; t := S.ADR(m.code[0]);
WHILE i < nofOwnProcs DO
Files.ReadBytes(R, m.procDescs[i].data1, 8);
Files.ReadNum (R, adr); m.procDescs[i].entry := t + adr;
Files.ReadBytes(R, m.procDescs[i].data2, 16);
Files.ReadNum (R, adr); m.procDescs[i].handlerData[0] := adr;
Files.ReadNum (R, adr); m.procDescs[i].handlerData[1] := adr;
INC(i);
END;
Definition of procedure descriptor, OLF-representation and read routine
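The read routine can be mirrored in a short Python sketch. It assumes that Files.ReadNum uses the customary Oberon compact number format (7 bits per continuation byte, signed last byte); this encoding is an assumption and is not stated in the text. The names read_num and read_proc_descs are hypothetical.

```python
import io


def read_num(stream):
    """Decode one compact integer (assumed: the customary Oberon 7-bit format)."""
    n, shift = 0, 0
    b = stream.read(1)[0]
    while b >= 128:                     # continuation bytes carry 7 bits each
        n += (b - 128) << shift
        shift += 7
        b = stream.read(1)[0]
    return n + ((b % 64 - (b // 64) * 64) << shift)    # the last byte is signed


def read_proc_descs(stream, count, code_base):
    """Read `count` descriptors; only `entry` is relocated, the rest is opaque."""
    descs = []
    for _ in range(count):
        data1 = stream.read(8)
        entry = code_base + read_num(stream)    # the OLF stores a code-section offset
        data2 = stream.read(16)
        handler = [read_num(stream), read_num(stream)]
        descs.append({"data1": data1, "entry": entry,
                      "data2": data2, "handlerData": handler})
    return descs
```

The sketch shows the only relocation involved: the entry offset from the OLF is added to the address of the loaded code section, while data1 and data2 are copied without interpretation.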
The data sections data1 and data2 are each read in one block and copied into storage without interpretation. For the entry field, the offset within the code-section is stored in the OLF. handlerData is used for the EH and is currently always zero. The procedure descriptors are followed by the data pointers for the imported modules: one for the variable/data section, one for the constant section and one for the type descriptor/object section. They are used if the data of an imported module is referenced. Because all imported modules are already loaded, the pointers can be copied from their linkage-sections.
ImportedModuleData = RECORD
data : ADDRESS64;
const : ADDRESS64;
object: ADDRESS64;
END;
i := 0;
WHILE i < nofimp DO
m1 := imports[i]; m.imports[i] := S.VAL(ADDRESS,m1);
IF m1 # foreignMod THEN
impMods[i].data := m1.linkage.linkdata.data;
impMods[i].const := m1.linkage.linkdata.const;
impMods[i].object:= m1.linkage.linkdata.object;
END;
INC(i);
END;
The last part of the linkage-section contains the linkage-pairs. A linkage-pair has
the following structure:
LinkagePair = RECORD
entryAdr : ADDRESS64;
procDesc : ADDRESS64; (*ProcDescPtr;*)
END;
Actually the item entryAdr is redundant, because it is also registered in the
procedure descriptor. It serves to avoid an additional dereferencing during
procedure call. Besides the redundancy, it also facilitates the construction of the
linkage-pairs: only the address of the procedure descriptor is needed.
PROCEDURE FillLinkagePair(VAR lp: LinkagePair; procDesc: ADDRESS);
VAR pdesc: ProcDescPtr;
BEGIN
pdesc := SYSTEM.VAL(ProcDescPtr, procDesc);
lp.entryAdr := pdesc.entry;
lp.procDesc := procDesc;
END FillLinkagePair;
The linkage-pairs section is divided into three sub-sections: the linkage-pairs of
the RTS-routines, of the own procedures, and of the imported procedures. The
module-loader processes Oberon procedures and foreign language procedures
differently. In A2O, foreign procedures are declared in so-called foreign
interface modules. Like all external symbols, references to foreign procedures
are resolved symbolically by procedure Unix.dllsym (see chap. 6.2) via an
operating system service.
Maybe the only 'hack' in Alpha Oberon is the installation of the storage allocation routine. O2NewCode is the only RTS-routine which is currently represented by an Oberon procedure. For stand-alone programs (OpenVMS linker), it is resolved to Storage.ALLOCATE directly. The module-loader 'patches' this entry to Kernel.ALLOCATE. This binding is done symbolically, by 'misuse' of the RefBlk of module Kernel: a description of ALLOCATE is searched in Kernel to resolve the reference. More information about RefBlk can be found in [Wir92].
This 'hack' is necessary, in order to guarantee that the procedure Kernel.ALLOCATE is referenced in the boot-loader from the subsequently loaded module Kernel. SYSTEM.VAL(ADDRESS, Kernel.ALLOCATE) would deliver a reference to the statically linked module Kernel. However, the system wouldn't work with that procedure. Also, the work-around for module Storage, as used for the storage management in the boot-loader itself, wouldn't work with it. The OLF does not contain data for the linkage pairs of the procedures of the own module. Their sequence corresponds to the order of the procedure descriptors, from which they are calculated.
i := 0;
WHILE i < nofOwnProcs DO
FillLinkagePairP(ownLinkage, S.ADR(m.procDescs[i]));
INC(i);
END;
For the linkage pairs of the imported procedures the module number and the
number 'num' (see below) of the export-entry are stored in the OLF. The module
number specifies from which of the imported modules the procedure is derived.
In the export list of this module, the first entry contains the number of the
corresponding procedure descriptor. From that the linkage-pair is calculated.
i := 0;
WHILE i < nofImpProcs DO
Files.ReadNum(R,num);
IF num = -1 THEN
Files.ReadString(R,name);
Unix.dllsym(name, t); FillLinkagePairP(ownLinkage, t);
ELSE
m1 := S.VAL(Module, m.imports[num-1]);
Files.ReadNum(R,num);
FillLinkagePairP(ownLinkage, S.ADR(m1.procDescs[m1.entries[num]]));
END;
INC(i);
END;
If the module number is equal to -1, a foreign procedure is indicated, which is
to be referenced symbolically by its name (without module name). The symbol
stored in the OLF is passed to Unix.dllsym, which resolves it and returns the
symbol's address. By definition, the value of a procedure symbol is identical to
the address of the procedure descriptor. Hence the return value of
Unix.dllsym is used directly for the generation of the linkage-pair. More
information about dllsym can be found in chap. 6.2.
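The two ways of obtaining a linkage-pair for an imported procedure can be summarized in a small model. In this Python sketch, addresses are plain integers and memory is a dictionary from descriptor address to descriptor; these structures, the field name proc_desc_addrs and the function names are illustrative assumptions.

```python
def fill_linkage_pair(proc_desc_addr, memory):
    """A linkage pair is (entry address, descriptor address); the entry is
    copied out of the descriptor so that a call needs one dereference less."""
    return (memory[proc_desc_addr]["entry"], proc_desc_addr)


def resolve_import(mod_num, entry_or_name, imports, memory, find_symbol):
    """mod_num == -1 marks a foreign procedure which is looked up by name;
    otherwise mod_num is a 1-based index into the importer's module list."""
    if mod_num == -1:
        desc_addr = find_symbol(entry_or_name)    # symbol value = descriptor address
    else:
        module = imports[mod_num - 1]
        desc_index = module["entries"][entry_or_name]    # export entry -> descriptor no.
        desc_addr = module["proc_desc_addrs"][desc_index]
    return fill_linkage_pair(desc_addr, memory)
```

In both cases only the address of the procedure descriptor has to be determined; the entry address follows from it.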
As described in chap. 7.3, the type descriptors have to be allocated in the heap individually. The layout corresponds to the structures used in DECOberon, however the addresses were expanded to 64 Bit. Here is the pseudo-definition of a type descriptor:
TypeDesc = RECORD
tag0 : ADDRESS64;
tdSize : INTEGER64;
sentinel : INTEGER64;
self : ADDRESS64; (* points to recSize *)
ext : RECORD extlev, pad: LONGINT; END;
module : ADDRESS64; (* points to the types Modules.ModuleDesc *)
name : Name;
methods : ARRAY nofMethods OF LinkagePair;
keys : ARRAY 32 (*=maxExts*) OF ADDRESS64;
tag : ADDRESS64; (* points to tdSize *)
recSize : INTEGER64;
ptrs : ARRAY nofPtrs+1 OF LONGINT;
END;
The base address of a type descriptor is SYSTEM.ADR(recSize). It represents
the tag of all records of this type. The field elements tag0 to self correspond to
the structure which is generated at the beginning of the allocated storage area.
Solely the element self doesn't point to tdSize, but to recSize. tag and self
connect type descriptor and type information, which are used for persistent
objects. The OLF contains all information, necessary to generate the type
descriptors.
TypeBlock = 89X maxIdentLen:Num {Type} -1:Num .
Type = recsize:Num nofmethods:Num nofptr:Num extLev:Num
String BaseType Methods {offset:Num}*nofptr .
BaseType = [ -1:Num | modnum:Num offset:Num nofinhMeth:Num ] .
Methods = {methNum:Num entryNum:Num} -1:Num .
The storage requirement for the type descriptor is determined by nofmethods
and nofptr. All other parts have constant size. The generated type descriptor must
contain the keys of all base types, as well as the methods inherited from them.
There is a simple way to construct the type descriptor, because it differs from its
direct base type only by an additional key value and the additional or overridden
methods. Consequently, the keys and methods of the base type can be copied
and then completed. Because the base type always stems from the current module
or a directly imported module, it can be referenced simply by a number. nofinhMeth
counts the base type's methods. This number determines the size of
the area to be copied. The corresponding section of LoadTypes is given below:
Files.ReadNum(R, i); (* modnum *)
IF i # -1 THEN (* inherit/copy data from basetype *)
IF i = 0 THEN m1 := m; ELSE m1 := imports[i-1]; END;
Files.ReadNum(R, i); S.GET(S.ADR(m1.tdescs[i]), i); (* base tdadr *)
Files.ReadNum(R, j); j := j*AL.LnkPLSize + keySize + qwSize;
(* inherited methods + keys + tag *)
S.MOVE(i-j, t-j, j-qwSize);
END;
S.PUT(t - (qwSize + (tdd.ext.extlev+1)*tagSize), t); (* implant own key *)
p := S.VAL(ProcLinkageArrPtr, t-(qwSize+keySize+(nofmeth+1)*AL.LnkPLSize));
LOOP
Files.ReadNum(R, i); IF (i = -1) THEN EXIT; END;
Files.ReadNum(R, j);
FillLinkagePair(p[nofmeth-i], S.ADR(m.procDescs[j]));
END;
After copying the base type data, the address of the own type descriptor is
written as a key into the key table. The last loop generates the linkage-pairs for
the methods. Methods can only be defined in the same module in which the type is
defined. That's why only the method number and the number of the procedure
descriptor are stored in the OLF. The linkage-pairs of the methods are stored in
the type descriptor, because methods are called via the address of the type
descriptor plus an offset. The procedure descriptors are in the linkage-section of
the module. A2O also stores the associated linkage-pairs there, but they aren't
needed.
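The copy-then-complete construction can be modelled without address arithmetic. In the following Python sketch a type descriptor is a dictionary with a key table and a method table; the representation and the names build_type_desc and is_extension_of are assumptions for illustration. The type test via the key table is the customary Oberon technique and is included only to show what the keys are used for.

```python
MAX_EXTS = 32   # size of the key table, as in the pseudo-definition above


def build_type_desc(name, ext_level, n_methods, base=None, own_methods=None):
    """base: the base type's descriptor or None; own_methods: {method no.: proc}."""
    td = {"name": name, "keys": [None] * MAX_EXTS, "methods": [None] * n_methods}
    if base is not None:
        td["keys"][:ext_level] = base["keys"][:ext_level]          # inherited keys
        td["methods"][:len(base["methods"])] = base["methods"]     # inherited methods
    td["keys"][ext_level] = td          # implant the own key at the own level
    for num, proc in (own_methods or {}).items():
        td["methods"][num] = proc       # new or overriding method
    return td


def is_extension_of(td, other, other_level):
    """Type test: one table lookup, independent of the depth of the hierarchy."""
    return td["keys"][other_level] is other
```

An overriding method simply replaces the inherited entry at the same method number, so callers of the base type's method slot reach the extension's procedure.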
Because the module-loader runs within the Oberon System, the module-loader itself and all modules it uses have to be loaded into storage first. Then the Oberon System has to be started for self-booting. For most Oberon implementations the bootstrap works as follows: With a special tool, the so-called boot-linker, the module-loader and all other used modules are combined into a boot-file. The boot-file contains a direct memory map of these modules with their data-sections, with all references in the code already relocated. The boot-loader requests the Oberon heap from the operating system, loads the boot-file into the heap and resolves the address references. Then it jumps to the entry-point of the boot-file and the System boots. Only now is the heap initialized, in Kernel.Boot. Because the data of the boot-file consists of a contiguous block of data, Kernel.Boot recognizes the end of the boot-file data and the beginning of free storage.
One disadvantage of this method is the code duplication caused by the boot-linker: it must contain the load routines, which are also in the module-loader. Also, the boot-linker must generate an exact image of the heap, by duplicating the storage management-routines from module Kernel. There is of course a good reason for having a boot-linker, namely cross development. Normally there is no stand-alone Oberon-2 compiler available on the target machine. The boot-loader is then written in another language for which a compiler is available.
Target machine     Boot-loader language   Module-loader language
_________________________________________________________________
Ceres              NS32000 Assembler      Oberon
Mac II             -                      MC68000 Assembler
SPARCStation       -                      Modula-2
DECstation         C                      Oberon
RISC System/6000   -                      C
Chameleon          Oberon                 Oberon
Mithril            Modula-2               Oberon
Alpha AXP          Oberon                 Oberon
Implementation language of the loader, extended from [Bra92]
Thanks to A2O, the Alpha Oberon bootstrap could be simplified. The boot-loader doesn't need a special boot-file; it reads each module as an OLF. Therefore the boot-loader uses the same algorithm as the module-loader. Code duplication could be limited to one single procedure (BootLoader.LoadModule). But one problem remains to be solved: How do the modules get into the heap without leaving references to the boot-loader? Such references would confuse the GC.
The solution is to use the Oberon heap only and to cut all remaining references. A2O maps the functions NEW and SYSTEM.NEW to Storage.ALLOCATE, and therefore the boot-loader could use another storage module, which maps Storage.ALLOCATE to Kernel.ALLOCATE. So the boot-loader allocates and initializes the Oberon heap, and all demands for storage are satisfied by Kernel.ALLOCATE from there.
PROCEDURE InitHeap; (* types are taken from Kernel *)
TYPE
FreePtr = POINTER TO RECORD
(* off-8*) tag : ADDRESS;
pad0: LONGINT; (* todo 64 bit: remove pad0 *)
(* off0 *) size: LONGINT; (* field size aligned to 8-byte boundary,
size MOD B = B-8 *)
pad1: LONGINT; (* todo 64 bit: remove pad1 *)
(* off8 *) next: ADDRESS;
pad2: LONGINT; (* todo 64 bit: remove pad2 *)
END ;
VAR size, firstBlock, endBlock: LONGINT; rest: FreePtr;
BEGIN
heapSize := heapSize*1024*1024;
heapAdr := Unix.Malloc(heapSize);
firstBlock := heapAdr + (-heapAdr-8) MOD B;
size := heapAdr + heapSize - firstBlock;
DEC(size, size MOD B);
IF size = heapSize THEN DEC(size,B) END; (* make room for rest^ *)
endBlock := firstBlock + size;
heapAdr := firstBlock; heapSize := size;
(* save re-calculation in Kernel.Boot *)
rest := S.VAL(FreePtr, firstBlock);
rest.tag := S.VAL(LONGINT, S.VAL(SET, S.ADR(rest.size)) + free);
rest.size := S.VAL(LONGINT, endBlock) - S.VAL(LONGINT, rest) - 8;
rest.next := 0;
Kernel.FindRoots := TheEmptyProc; (* there's nothing to collect *)
Kernel.Boot;
END InitHeap;
As soon as the boot-loader has obtained storage from the operating system, the
storage is initialized as a single free block. Then Kernel.Boot is called and from
there the GC. Because there are no objects in the heap yet, this run only serves to
generate the free-list. During the load process of module Unix, the variable
Unix.dllsym is initialized with the procedure BootLoader.dllsym. Because
dllsym is the first variable in Unix, this can be done without importing
Unix. It would not make sense to import Unix, because the import would refer to
the statically linked module and not to the dynamically loaded module Unix.
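The address arithmetic of InitHeap can be checked with a small calculation. The following Python sketch repeats the computations of firstBlock, size and rest.size; the block granularity B = 32 is an assumed value chosen for illustration, since the value of B is not given here.

```python
def init_heap_bounds(heap_adr, heap_size, B=32):
    """Return (first_block, size) following the arithmetic of InitHeap.
    B is the block granularity; 32 is an assumed value for illustration."""
    first_block = heap_adr + (-heap_adr - 8) % B    # first_block + 8 is B-aligned
    size = heap_adr + heap_size - first_block
    size -= size % B                                # whole number of B-sized units
    if size == heap_size:
        size -= B                                   # keep room for the free block
    return first_block, size


def free_block_size(first_block, size):
    """Value stored in rest.size: the block without its 8-byte tag field."""
    end_block = first_block + size
    return end_block - first_block - 8
```

With heap_adr = 1000 and heap_size = 4096 the sketch yields first_block = 1016 and size = 4064; the stored free-block size is 4056, which satisfies the invariant size MOD B = B-8 noted in the comment of FreePtr.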
After the System modules have been loaded into the heap, the heap also contains file buffers and other storage blocks. But the Oberon System should only take over the loaded modules. Removing the other blocks isn't done by the boot-loader, but by the system itself during boot-phase two. All references to the boot-loader have to be removed from the heap. "Module" is the only data type which should be taken over. Thus only the tags of the ModuleDesc records have to be patched to the corresponding tag in the Oberon System:
PROCEDURE PatchTags;
VAR m: Module; i: LONGINT; t: ADDRESS;
td: POINTER TO RECORD filler: ARRAY 5 OF INTEGER64; name: Name END;
BEGIN
m := modules;
WHILE (m # NIL) & (m.name # "Modules") DO m := m.next; END ;
ASSERT(m # NIL);
i := LEN(m.tdescs^);
REPEAT
DEC(i);
t := SHORT(m.tdescs[i]); (* todo 64bit: remove SHORT *)
S.GET(t - 8, td);
UNTIL td.name = "ModuleDesc";
ASSERT(t # 0);
m := foreignMod;
WHILE (m # NIL) DO
S.PUT(S.VAL(LONGINT,m) - qwSize, t);
m := m.next;
END;
END PatchTags;
The type 'ModuleDesc' is searched among the type descriptors of 'Modules', using
the information for persistent objects. Then the module list is traversed and the
tag of each entry is overwritten by the newly determined value. The precondition
is that 'Modules' has been loaded directly or indirectly, so that the tag can be
determined. This isn't a limitation: 'Modules' is an essential part of the Oberon
System, because it implements the module-loader.
It was said above that "all references to the boot-loader" have to be removed, but strictly speaking, Unix.dllsym must still point to BootLoader.dllsym. Because this reference is located within the variable section of a module and procedure variables aren't traced by the GC, this reference doesn't disturb the GC. At this point the initializing body of 'Modules' is called and boot-phase two starts.
When the body of 'Modules' gets control, it performs its initialization and calls Kernel.Boot. From there the GC is called and all modules are marked. Because the pointers in the variable sections of the modules were set to NIL, no further objects are marked. Thus all undesired blocks, e.g. the file buffers of the boot-loader, are released during the sweep phase. At the same time the free-list in the heap is produced. The free-list of the statically linked module Kernel is located in the variable section of the boot-loader and is hence inaccessible.
PROCEDURE Boot*; (* is called from Modules immediately after booting *)
BEGIN
IF ~ booted THEN booted := TRUE; (* avoid user call *)
(* heap has been set up by boot-loader, so just get the values *)
Unix.dllsym("heapAdr", S.VAL(ADDRESS, heapAdr));
Unix.dllsym("heapSize", S.VAL(ADDRESS, heapSize));
firstBlock := heapAdr;
endBlock := firstBlock + heapSize;
firstTry := TRUE;
GCenabled := TRUE;
GC(FALSE);
END;
Unix.Init;
END Boot;
Kernel.Boot
Now the heap contains only the data of the modules. By initializing the
remaining loaded modules, boot-phase two is completed.
Boot-phase three is started by loading and initializing 'Modules', as shown by the
following sequence:
VAR modPtr, cmdPtr: POINTER TO RECORD name: Name END;
...
Unix.dllsym("modPtr", S.VAL(ADDRESS, modPtr));
Unix.dllsym("cmdPtr", S.VAL(ADDRESS, cmdPtr));
loop := ThisCommand(ThisMod(modPtr.name), cmdPtr.name)(*default: Oberon.Loop*)
modPtr and cmdPtr point to character strings which are specified by the boot-loader command qualifiers /Module[="Oberon"] and /Command[="Loop"]. With these defaults, module Oberon is loaded and its body is executed. The body calls module Configuration; control then returns to module 'Modules' and passes to Oberon.Loop, which waits for input.
Some problems remain to be solved: How does the reloaded module Kernel learn where the heap starts and how large it is? How can the options of the boot-loader be transferred to the System? How can the module-loader reference system services?
All of this is done by BootLoader.dllsym. As mentioned in chap. 4.4, the procedure variable dllsym of the reloaded module Unix is assigned that procedure. This establishes the connection from the Oberon System to the outside world. dllsym transfers all required data to the Oberon System, so it is not necessary to reserve further variables in the Oberon System for the boot-loader. Instead, the System requests data from the boot-loader. This minimal interface makes the System more robust against changes:
PROCEDURE dllsym* (name: ARRAY OF CHAR; VAR res: ADDRESS);
  (* TRUE if name starts with pat and the symbol is found in library *)
  PROCEDURE ResSym(pat, library: ARRAY OF CHAR): BOOLEAN;
    VAR i, len: LONGINT;
  BEGIN i := 0; len := LEN(name); IF LEN(pat) < len THEN len := LEN(pat) END;
    (* test the index first, so that name[i] and pat[i] are never read out of range *)
    WHILE (i < len) & (pat[i] # CHR(0)) & (name[i] = pat[i]) DO INC(i) END;
    IF (i < LEN(pat)) & (pat[i] = CHR(0)) THEN
      IF ODD(lib.LIB$FIND_IMAGE_SYMBOL(library,name,res)) THEN RETURN TRUE END;
    END;
    RETURN FALSE
  END ResSym;
BEGIN
  res := 0;
  (* resolve our symbols first *)
  IF name = "heapAdr" THEN res := S.VAL(ADDRESS,heapAdr);
  ELSIF name = "heapSize" THEN res := S.VAL(ADDRESS,heapSize);
  ...
  ELSE (* foreign symbol *)
    IF ResSym("SYS$","SYS$SSISHR")
    ...
    OR ResSym("","LIBRTL")
    THEN (* okay *)
    ELSE
      Console.Str("Error: Can't resolve symbol ");
      Console.Str(name); Console.Ln; HALT(20);
    END;
  END;
END dllsym;
The procedure receives the name of the symbol to be resolved. If the symbol is defined by the boot-loader, e.g. heapAdr, its value is returned directly. Otherwise the operating system service LIB$FIND_IMAGE_SYMBOL tries to resolve the symbol. To find out which shareable image contains the symbol, the beginning of the symbol name is compared with a pattern: for example, all symbols starting with "SYS$" are searched for in the shareable image SYS$SSISHR.EXE. If no pattern matches, the symbol is searched for in LIBRTL.EXE, X11 and other shareable image libraries.
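The lookup order described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionaries stand in for the shareable images and for LIB$FIND_IMAGE_SYMBOL, all addresses are made up, and only the two patterns visible in the listing ("SYS$" and the empty catch-all pattern) are modelled.

```python
# Hypothetical stand-in for shareable images: image name -> {symbol: address}.
IMAGES = {
    "SYS$SSISHR": {"SYS$QIOW": 0x1000},
    "LIBRTL": {"LIB$SIGNAL": 0x2000},
}

# Symbols answered by the boot-loader itself (values are made up).
BOOT_SYMBOLS = {"heapAdr": 0x00100000, "heapSize": 0x00400000}

# (pattern, image) pairs in search order; the empty pattern matches any name.
PATTERNS = [("SYS$", "SYS$SSISHR"), ("", "LIBRTL")]


def find_image_symbol(image, name):
    """Stand-in for LIB$FIND_IMAGE_SYMBOL: returns an address or None."""
    return IMAGES.get(image, {}).get(name)


def dllsym(name):
    # Resolve the boot-loader's own symbols first.
    if name in BOOT_SYMBOLS:
        return BOOT_SYMBOLS[name]
    # Foreign symbol: try each image whose pattern is a prefix of the name.
    for pattern, image in PATTERNS:
        if name.startswith(pattern):
            address = find_image_symbol(image, name)
            if address is not None:
                return address
    raise LookupError("Can't resolve symbol " + name)


print(hex(dllsym("heapAdr")), hex(dllsym("SYS$QIOW")), hex(dllsym("LIB$SIGNAL")))
```

As in the Oberon-2 listing, a pattern that matches but whose image does not contain the symbol simply falls through to the next entry.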
During development it turned out that the system would not work properly without GC. As described in [Bra92], the GC is responsible for closing files, because the Oberon System has no means of closing files physically, i.e. at the level of the underlying file system. Without GC, a file could not be opened a second time after it had been closed. The GC of DECOberon, which itself is based on SPARC-Oberon [Tem91], was used as the base. This GC is able to trace objects referenced from the stack. This is important because the GC is also called implicitly, e.g. when a file is opened but storage is insufficient.
The original GC of Oberon System V1 did not need to search for objects on the stack, because it is not called implicitly but only as a command, i.e. when the stack is empty. To follow pointers in dynamically allocated arrays, a special type descriptor has to be used.
Target machine      GC language          Search on stack   Follows dyn. arrays
______________________________________________________________________________
Ceres NS32000       NS32000 Assembler    no                no
Mac II              MC68000 Assembler    no                no
SPARCStation        Modula-2             yes               no
DECstation          Oberon               yes               yes
RISC System/6000    Oberon               yes               yes
Chameleon           Oberon               yes               yes
Mithril             Modula-2             yes               yes
Alpha AXP           Oberon               yes               yes
Implementation language of the GC and search strategies, expanded from [Bra92]
The GC needs compiler support regarding the alignment of record elements, storage allocation, and the layout of type descriptors. The GC must also be adapted to the particular heap structures, i.e. to the address extension to 64 bit.
The GC also has to trace objects referenced from the stack. The stack is therefore scanned in 4-byte increments, and the data retrieved is passed to the GC as pointer candidates (see also [Szy92]). Because local variables are 8-byte aligned, all local pointer variables are found. A pointer that is an element of a local record variable, however, could not be found reliably at first, because record elements were byte-aligned. The alignment of record elements was therefore changed so that pointers within a record are always aligned to a 4-byte boundary and hence can be found by the GC.
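Why byte-aligned record elements defeat such a scan can be shown with a small simulation. The Python sketch below is an illustration only: the heap bounds, the frame size and the little-endian 32-bit candidate words are assumptions, not values taken from the actual GC.

```python
import struct

# Hypothetical heap bounds used to decide whether a word is a pointer candidate.
HEAP_START = 0x00100000
HEAP_END = 0x00200000


def scan_stack(stack, step=4):
    """Return every 32-bit word at a multiple of step that points into the heap."""
    candidates = []
    for offset in range(0, len(stack) - 3, step):
        (word,) = struct.unpack_from("<I", stack, offset)
        if HEAP_START <= word < HEAP_END:
            candidates.append(word)
    return candidates


def frame_with_pointer(offset, pointer, size=16):
    """Build a zeroed stack frame holding one pointer at the given byte offset."""
    frame = bytearray(size)
    struct.pack_into("<I", frame, offset, pointer)
    return bytes(frame)


pointer = 0x00123450
aligned = frame_with_pointer(8, pointer)    # pointer on a 4-byte boundary
unaligned = frame_with_pointer(5, pointer)  # pointer at a byte-aligned offset

print(scan_stack(aligned) == [pointer])  # the aligned pointer is found
print(scan_stack(unaligned) == [])       # the unaligned pointer is missed
```

In the unaligned frame the pointer straddles two scan positions, so neither word read by the scanner equals the pointer value.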
The GC requires information about the object size in the memory. This information has to be stored at allocation time. With DECOberon this is done by three procedures in module Kernel: one for records, one for dynamic arrays, and one for SYSTEM.NEW. As mentioned above, A2O maps NEW and SYSTEM.NEW to only one RTS-routine. Changing A2O to call different allocation routines would have been too much work. To avoid that change, the compiler was modified to generate inline code, to store the required data for the GC.
The inline code has one further advantage: The compiler can generate a tight loop (using loop-unrolling) to clear the allocated memory, so that pointer variables are initialised to NIL, which is required by the GC.
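The two tasks of the inline allocation code, recording the object size and clearing the block, can be sketched as follows. This Python model is an assumption-laden illustration: the 8-byte size header, the rounding to 8 bytes, the bump allocation and the class name Heap are hypothetical and do not describe the code A2O actually generates.

```python
import struct

HEADER = 8  # hypothetical header holding the block size as a 64-bit value


class Heap:
    def __init__(self, size):
        # Fill with a non-zero pattern to model stale data in unused memory.
        self.mem = bytearray(b"\xAA" * size)
        self.next = 0

    def new(self, size):
        """Allocate a block, store its size for the GC, and clear its body."""
        size = (size + 7) & ~7  # round up to a multiple of 8 (assumption)
        start = self.next
        if start + HEADER + size > len(self.mem):
            raise MemoryError("heap exhausted")
        struct.pack_into("<Q", self.mem, start, size)
        body = start + HEADER
        # Clearing the body makes every pointer field start out as NIL (0).
        self.mem[body:body + size] = bytes(size)
        self.next = body + size
        return body

    def size_of(self, body):
        """What a collector would read back to learn the object size."""
        return struct.unpack_from("<Q", self.mem, body - HEADER)[0]


heap = Heap(64)
p = heap.new(12)
print(heap.size_of(p), heap.mem[p:p + 16] == bytes(16))
```

A collector that later sweeps the heap can step from block to block using only the stored sizes, which is why the size must be written at allocation time.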
In the stand-alone version, A2O generated the static type descriptors sequentially in the type descriptor section, followed by the area supporting persistent objects. Type descriptors were accessed by base (type descriptor section) + offset (type). In the Oberon System, however, type descriptors have to be allocated individually, because objects which need a descriptor may still be in the heap even though the corresponding module has already been released (see [Wir92]). The layout of the type descriptor section was therefore changed: it now starts with an array of pointers to the actual type descriptors, followed by the type descriptors themselves.
The Alpha AXP processor uses 64-bit addressing. OpenVMS AXP is said to become available with a 64-bit extension from spring 1996. At present each address managed by the operating system has 32 bit, sign-extended to 64 bit. AOS (and also A2O) is designed to allow the 64-bit address extension. In particular, the heap structures already assume 8-byte addresses. Naturally, the GC is highly optimized for speed. Although it is written in Oberon, it was difficult to understand and modify, because it uses low-level operations.
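The sign extension of a 32-bit address to 64 bit can be made concrete with a short worked example. The Python function below is a generic sketch of two's-complement sign extension; the function name is hypothetical and the sample addresses are arbitrary.

```python
def sign_extend_32_to_64(address32):
    """Return the 64-bit bit pattern obtained by sign-extending a 32-bit value."""
    address32 &= 0xFFFFFFFF
    if address32 & 0x80000000:
        # Bit 31 is set: replicate it into the upper 32 bits.
        return address32 | 0xFFFFFFFF00000000
    return address32


for address in (0x00010000, 0x7FFFFFFF, 0x80000000):
    print(hex(address), "->", hex(sign_extend_32_to_64(address)))
```

Addresses below 2^31 are unchanged, while addresses with bit 31 set map to the top of the 64-bit address space.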
As already mentioned in chap. 4.3, traps should be handled within the Oberon System. There are two classes of traps: (1) those raised by the Oberon run-time system (e.g. array index out-of-range) and (2) those raised by OpenVMS (e.g. access violation, device full). A2O reports class 1 errors by a call to the operating system service LIB$SIGNAL, so they can be processed like class 2 errors.
Exceptions are normally handled by the command interpreter. The OpenVMS message utility [VMS4] translates the error numbers to textual error messages with the SYS$PUTMSG service, which outputs the message. The EH of Alpha Oberon also uses the OpenVMS message utility, but specifies a call-back routine as the optional parameter of SYS$PUTMSG to receive the message text. Thus error messages are displayed in an ordinary Oberon text viewer called System.Trap.
In contrast to other ports of the Oberon System, this has the advantage that message generation need not be dealt with: there is no need to know which errors can occur and what the associated messages are. This allows private shareable images (dllsym) to raise their own exceptions, or the compiler's run-time errors to be extended, without any modification to module System.
After displaying the error message, the EH generates a list of procedures in the call sequence (procedure trace). The EH also displays the values of local variables according to their corresponding data type. This corresponds to the "facility for symbolic debugging" as described in [Wir92].
What happens if a trap occurs before the EH is installed by module System? The trap would be handled by the handler of the command interpreter. If the trap occurred after control had been transferred to the Oberon System, there would be no possibility of locating it. The boot-loader therefore installs its own handler, which outputs the call trace on the console.
The actual EH System.Trap is installed twice: (1) in the initialization part of module System and (2) in Oberon.Loop. The first installation catches traps during the boot process. Because the handler is installed as a so-called stack-frame handler, it is removed again when the initialization body of module System is left. Oberon.Loop therefore installs it again, because control shall be transferred back to Oberon.Loop if a trap occurs.
Processing the call-chain is highly system-dependent. Unix and most other operating systems require manipulating the stack-pointer directly. Under OpenVMS AXP, two services provide the invocation context (INVO_CONTEXT) of the current procedure and of all its predecessors:
PROCEDURE LIB$GET_CURRENT_INVO_CONTEXT (VAR invoContext$N: Invo_Context);
PROCEDURE LIB$GET_PREV_INVO_CONTEXT (VAR invoContext$N: Invo_Context): BOOLEAN;
LIB$GET_CURRENT_INVO_CONTEXT fills the parameter invoContext with the context of the currently active routine. For a given context K of a procedure P, LIB$GET_PREV_INVO_CONTEXT delivers the context K' of the calling procedure P'. Each call thus moves one step up the call-chain. If such a context K' doesn't exist, the system service returns FALSE. (Syntax note: the suffix $N in the parameter name prevents A2O from passing the Oberon type tag of the record type Invo_Context to the system service.)
TYPE
InvoContext = RECORD
length: LONGINT;
flags: SET; (* upper 8 bits are version byte *)
procDesc: ADDRESS64;
progCnt: ADDRESS64;
procStatus: SYSTEM.QUADWORD; (* 64 bit flag field *)
iReg: ARRAY 31 OF SYSTEM.SIGNED_64;
fReg: ARRAY 31 OF SYSTEM.QUADWORD;
END;
For the call trace, only the procedure descriptor 'procDesc' and the frame-pointer 'iReg[29]' of this structure are used. Without the above-mentioned system services, however, it would be more complicated to obtain both values. By using the system services, the EH of the boot-loader consists of a few lines only:
EX.LIB$GET_CURRENT_INVO_CONTEXT(invoContext);
WHILE ~ OutProcName(SHORT(invoContext.procDesc))
& EX.LIB$GET_PREV_INVO_CONTEXT(invoContext) DO END;
WHILE EX.LIB$GET_PREV_INVO_CONTEXT(invoContext)
& OutProcName(SHORT(invoContext.procDesc)) DO END;
RETURN EX.SS$_RESIGNAL;
OutProcName displays the name of the procedure belonging to an INVO_CONTEXT. If the procedure is not found, OutProcName returns FALSE. The first while-loop walks the call-chain until a procedure known to the Oberon System is found. The second while-loop executes as long as the procedures are known to the Oberon System; it terminates when an unknown procedure is found.
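The behaviour of the two loops can be traced with a small model of the call-chain. In this Python sketch the call-chain is a plain list of procedure descriptors (innermost first) and the table of known procedures is a dictionary; both representations and all names in them are assumptions for illustration, and list indexing takes the place of LIB$GET_PREV_INVO_CONTEXT.

```python
def call_trace(call_chain, known):
    """Return the names printed by the two-loop walk over call_chain."""
    printed = []

    def out_proc_name(index):
        # Models OutProcName: print the name if known, else return False.
        name = known.get(call_chain[index])
        if name is None:
            return False
        printed.append(name)
        return True

    index = 0
    # First loop: skip unknown procedures until a known one has been printed.
    while not out_proc_name(index):
        if index + 1 >= len(call_chain):  # no previous context exists
            return printed
        index += 1
    # Second loop: continue as long as a caller exists and is known.
    while index + 1 < len(call_chain):
        index += 1
        if not out_proc_name(index):
            break
    return printed


# Hypothetical descriptors: two unknown frames, three Oberon frames, then
# an unknown frame that ends the trace even though a known one follows.
KNOWN = {0x30: "Files.Open", 0x40: "Edit.Open", 0x50: "Oberon.Loop", 0x70: "Other.Proc"}
CHAIN = [0x10, 0x20, 0x30, 0x40, 0x50, 0x60, 0x70]

print(call_trace(CHAIN, KNOWN))
```

The leading unknown frames model the handler and run-time routines that are active when the trap is reported; the trailing unknown frame models the boundary to code outside the Oberon System.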
Because the boot-loader has to terminate execution if an error occurred, the handler returns the status value SS$_RESIGNAL. OpenVMS then passes the exception to the next handler in the call-chain. In this way the EH of the command interpreter is reached and the boot-loader terminates. The symbolic information belonging to each procedure is looked up using its procedure descriptor. The symbolic name, type and offset data are contained in the RefBlk of the module. At the end of the EH the statement
RETURN EX.SYS$UNWIND(mechArgs.depth,0);
transfers control to the procedure which installed the handler. The stack is reset, the saved register values are restored, and Oberon.Loop continues.
Despite its enormous functionality, the Oberon System is relatively easy to port. In a first step, once the bootstrap was done, it quickly became possible to do basic work with the System without GC and exception handler. GC and EH are, however, required to work properly with the System. Exactly these low-level parts make the port difficult, because they are hard to debug. The OpenVMS EH facilities proved to be a powerful instrument. Compared to Unix, the available system services sufficiently support establishing exception handlers. Preparing for the 64-bit extensions also posed a challenge. The original sources frequently contained constant literals instead of named constants. This required many modifications all over the source text. A big success is that most extensions like Edit, Draw, the hypertext elements and many other applications ran after simple recompilation.
[AXP] Richard L. Sites (ed.): Alpha Architecture Reference Manual. Burlington, 1992.
[Bra92] Marc Brandis, et al: The Oberon System Family. Department Informatik, ETH Zuerich, Report No. 174 (1992)
[Dot94] Guenter Dotzel: Alpha AXP/OpenVMS Modula-2 and Oberon-2 Compiler Project. In: Peter Schulthess (ed.): Proceedings of the Joint Modular Languages Conference, University of Ulm, Germany, 28-30 September 1994. Universitätsverlag Ulm, 1994. An updated version of this paper is here: http://www.modulaware.com/max_sum.htm
[Gut94] Juerg Gutknecht: Oberon - Perspectives of Evolution. In: Peter Schulthess (ed.): Proceedings of the Joint Modular Languages Conference, University of Ulm, Germany, 28-30 September 1994. Universitätsverlag Ulm, 1994
[Kna94] Markus Knasmueller: Oberon Dialogs, User's Guide and Programming Interface. Institut fuer Informatik, Johannes Kepler Universität Linz, Report No. 1 (1994)
[M~os91] Hanspeter Mössenböck: The Programming Language Oberon-2. Department Informatik, ETH Zuerich, Report No. 160 (1991)
[M~os92] Hanspeter Mössenböck: Object Oriented Programming in Oberon-2. Springer Verlag, 1992
[Pfi91] Cuno Pfister (ed.), et al: Oberon Technical Notes. Department Informatik, ETH Zuerich, Report No. 156 (1991)
[Rei91] Martin Reiser: The Oberon System, User Guide and Programmer's Manual. Addison-Wesley, 1991
[Rei92] Martin Reiser, Niklaus Wirth: Programming in Oberon, steps beyond Pascal and Modula. Addison-Wesley, 1992
[Sup94] Jacques Supcik: HP Oberon, The Oberon Implementation for Hewlett-Packard Apollo 9000 Series 700. Department Informatik, ETH Zuerich, Report No. 212 (1994)
[Szy92] Clemens A. Szyperski: Insight ETHOS: On Object-Orientation in Operating Systems. ETH Zuerich Dissertation (1992)
[Tem91] Josef Templ: Design and Implementation of SPARC-Oberon. Structured Programming, 15:12, 197-205. Dec. 1991
[VMS1] Digital Equipment Corporation: OpenVMS Programming Concepts Manual. Maynard, Massachusetts, 1994
[VMS2] Digital Equipment Corporation: OpenVMS Calling Standard. Maynard, Massachusetts, 1994
[VMS3] Digital Equipment Corporation: OpenVMS DCL Dictionary. Maynard, Massachusetts, 1994
[VMS4] Digital Equipment Corporation: OpenVMS Command Definition, Librarian, and Message Utilities Manual. Maynard, Massachusetts, 1993
[Wir88] Niklaus Wirth: The Oberon System. Department Informatik, ETH Zuerich, Report No. 88 (1988)
[Wir92] Niklaus Wirth, Juerg Gutknecht: Project Oberon, The Design of an Operating System and Compiler. Addison-Wesley, 1992
Thanks to Josef Templ for his hints concerning the GC and to A. Schuhmacher who helped with the translation from German into English.
A more detailed description with several illustrations is contained in the German description of this project: "64-Bit-Portierung des Alpha-Oberon-Systems und des Oberon-2-Compilers" by Hartmut Goebel.