by Günter Dotzel, Apr-2001
Preface: This is an article I prepared in July 1999, but I decided to delay publication after the discussion on this topic turned offensive in news:comp.lang.oberon at that time.
Definition is restriction. Thus language reports do not define the size of pervasive data types. This is the task of implementation notes.
A programming language standard defines neither the size, nor the internal representation, nor the storage allocation, nor the alignment of abstract whole number, real number, or any other data types. Any such specification would unnecessarily restrict compiler implementations on different processors and operating systems. In contrast, compiler implementation notes define such low-level details.
High-level programming languages provide an abstraction between the application programs and the hardware/operating system. The goal of a programming language specification is to provide this abstraction while keeping the programs' semantics independent of any specific compiler implementation. Application programs developed without assuming any implementation-specific details are portable.
For example, programs developed with portability in mind on 32 bit machines using a 32 bit compiler should compile without modification on a 64 bit machine with a 64 bit compiler.
Should this migration present any problems, this is mostly because the developer assumed a specific pervasive data type size or used low-level facilities. Relying on specific data type sizes, i.e., upon the implementation notes, renders a program unportable. Low-level facilities should only be used in well-encapsulated, system-dependent modules.
In the following, only Oberon-2 is considered. Oberon-2 is a modern, imperative, object-oriented programming language. Compilers for Oberon-2 are available for all popular processors and operating systems. A large number of free applications with complete source code are available.
Most compilers are 32 bit implementations. But there are also compilers for 64 bit machines (see "64 bit Oberon"), single-chip computers, and digital signal processors. Here is a summary of many available compiler implementations.
To clarify the term 64 bit Oberon-2 compiler, see What is a 64 bit compiler?
For example, a change in whole number data types from SIZE(LONGINT)=4 to SIZE(LONGINT)=8 breaks only the lowest-level source code. The rest of the programs developed with 32 bit compilers, as long as they do not use other implementation-specific features such as module SYSTEM, remain unchanged.
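The difference between size-dependent and size-independent code can be sketched as follows. This is only an illustration: the module SizeDemo and its procedures are hypothetical names, module Out is assumed to be available (as in the Oakwood guidelines), and SIZE is assumed to count 8 bit bytes.

```oberon
MODULE SizeDemo;
(* Hypothetical sketch, not taken from any existing library. *)
IMPORT Out;

CONST
  BitsPerByte = 8;  (* assumption: SIZE counts 8 bit bytes *)

(* Not portable: hard-codes the upper bound of a 32 bit LONGINT. *)
PROCEDURE IsLargest32 (x: LONGINT): BOOLEAN;
BEGIN
  RETURN x = 2147483647
END IsLargest32;

(* Portable: asks the compiler for the bound of the abstract type. *)
PROCEDURE IsLargest (x: LONGINT): BOOLEAN;
BEGIN
  RETURN x = MAX(LONGINT)
END IsLargest;

PROCEDURE Show*;
BEGIN
  Out.String("LONGINT bits: ");
  Out.Int(SIZE(LONGINT) * BitsPerByte, 0); Out.Ln;
  Out.String("MAX(LONGINT): ");
  Out.Int(MAX(LONGINT), 0); Out.Ln
END Show;

END SizeDemo.
```

IsLargest32 silently changes its meaning when LONGINT grows to 64 bit, whereas IsLargest and Show keep their meaning with any LONGINT size.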
The Oberon System V4 is the largest Oberon-2 source base. V4 comprises an extensible programming system with a graphical user interface, a programming library, and many tools and applications.
When working on a 64 bit machine, one wants to use the flat 64 bit address space and 64 bit whole number arithmetic. This was the reason to construct 64 bit processors.
A 64 bit Oberon-2 compiler on a 64 bit machine thus requires that (1) pointers, addresses, and the pervasive type LONGINT are 64 bit wide, (2) the operating system provides a 64 bit storage allocation procedure, (3) 64 bit range constant integer expressions are allowed, (4) there are no 32 bit restrictions (e.g., in indexing arrays or in the size of structured data), and (5) the separate library and run-time system can handle 64 bit data types. 32 bit whole numbers are only needed where memory space considerations matter or where foreign language interfaces require 32 bit data types. To remain compatible with existing source code, neither the language nor the library interface may be changed.
Library modules which contain procedures such as
PROCEDURE WriteInt (i, n: LONGINT);
PROCEDURE ReadInt (VAR i: LONGINT);
do not require any interface change, although the procedures' implementation will require some modification to provide for the larger 64 bit integer range.
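As an illustration of such an implementation, the following sketch converts a LONGINT to decimal without assuming any particular size. It is not the code of any existing library: the module name IntConv and the constant MaxDigits are hypothetical, output goes through module Out (assumed available), and DIV/MOD are assumed to behave as defined in the Oberon-2 report (0 <= x MOD y < y for positive y).

```oberon
MODULE IntConv;
(* Hypothetical sketch of a size-independent WriteInt. *)
IMPORT Out;

CONST
  MaxDigits = 40;  (* assumption: generous, enough even for 128 bit *)

PROCEDURE WriteInt* (i, n: LONGINT);
  VAR d: ARRAY MaxDigits OF CHAR; k, r: LONGINT; neg: BOOLEAN;
BEGIN
  neg := i < 0; k := 0;
  REPEAT
    r := i MOD 10; i := i DIV 10;
    (* DIV rounds towards minus infinity; correct this for negative
       values instead of negating i, so MIN(LONGINT) needs no
       special case and no size-specific literal. *)
    IF neg & (r # 0) THEN r := 10 - r; INC(i) END;
    d[k] := CHR(ORD("0") + r); INC(k)
  UNTIL i = 0;
  IF neg THEN d[k] := "-"; INC(k) END;
  WHILE n > k DO Out.Char(" "); DEC(n) END;  (* pad to width n *)
  REPEAT DEC(k); Out.Char(d[k]) UNTIL k = 0
END WriteInt;

END IntConv.
```

The interface is exactly the one shown above; only the body avoids size-specific shortcuts such as a digit buffer dimensioned for 32 bit values or a hard-coded string for the smallest 32 bit number.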
A2O is a 64 bit native code Oberon-2 compiler for the OpenVMS Alpha operating system.
AOS is a 64 bit implementation of the Oberon System V4 on 64 bit Alpha OpenVMS.
OOC is an Oberon-2 to C translator for Unix.
The ISO Modula-2 library is written in Modula-2. To create stand-alone Oberon-2 applications with A2O, the ISO Modula-2 lib (32 bit and 64 bit implementation) is used.
Under AOS, the ISO M2 lib can only be used via a foreign language interface. Within AOS, M2 is a foreign language, just like C and Pascal. (If it were a goal to have the ISO Modula-2 lib under the Oberon System, it would have to be transpiled from Modula-2 to Oberon-2.)
AOS uses a special object file format which allows dynamic link load + metaprogramming + fine grained symbol files. The library of AOS is the Oberon System V4 API.
There is only one compiler (one source/one exe-file), called A2O, which is both,
A2O can generate 32 bit and 64 bit code (size of pointer/proc-var/LONGINT) for both (1) the OpenVMS linker and (2) the AOS linker/loader.
To compile a module, module Compiler.Compile calls A2O from AOS (Alpha Oberon System) only as a 64 bit compiler, because AOS is a 64 bit Oberon System.
AOS is a set of several hundred Oberon-2 modules which compiles with A2O in either 32 bit or 64 bit mode without source code change. If you compile all modules in 32 bit mode you'd get a 32 bit Oberon System.
According to the Oberon-2 report, SHORTINT/INTEGER/LONGINT are abstract whole number data types, only defined by the type inclusion SHORTINT <= INTEGER <= LONGINT. Data types with specific sizes are imported from SYSTEM.
LONGINT gets you the maximal integer range with any given compiler. If a program worked with a 32 bit compiler, it will also work with a 64 bit compiler, provided that the size of pointers and of SYSTEM.PTR matches the size of LONGINT and that the result type of ENTIER() and SYSTEM.ADR() is LONGINT.
If you follow these rules, no source code changes are required to existing programs, except where specific data type sizes are needed. These are imported from module SYSTEM.
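A minimal sketch of this division of labour follows. The type name SYSTEM.SIGNED_32 is the one this article mentions for A2O; other compilers may name their fixed-size types differently. The module Header and its record types are hypothetical.

```oberon
MODULE Header;
(* Hypothetical sketch: fixed sizes only where the layout demands them. *)
IMPORT SYSTEM;

TYPE
  Int32 = SYSTEM.SIGNED_32;  (* assumption: A2O's name for a 32 bit integer *)

  (* Externally defined layout: every field must be exactly 32 bit,
     whatever the size of LONGINT is. *)
  FileHeader = RECORD
    magic, length: Int32
  END;

  (* Ordinary program data: uses the abstract type and simply
     follows the compiler's LONGINT size. *)
  Node = POINTER TO NodeDesc;
  NodeDesc = RECORD
    count: LONGINT;
    next: Node
  END;

END Header.
```

Only FileHeader depends on module SYSTEM; code that works with Node stays free of any size assumption.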
If someone had reason to use the pervasive type SHORTINT, she knew about the range of this type and that it most probably matches the type size of CHAR and SYSTEM.BYTE.
If someone had reason to use INTEGER instead of LONGINT, she most probably knew that at least this data type size is needed to represent, for example, ORD(char).
All 32/64 bit compatible language extensions are described in "64 bit Oberon". Not a single language change was required which would invalidate existing source code.
Download the free AOS for OpenVMS Alpha
The port of the Oberon System V4 to the Compaq 64 Alpha under OpenVMS is one example of a successful 64 bit migration with more than 300 modules; it is described in the article entitled "64 bit Oberon".
To clarify the 64 bit Alpha Oberon System (AOS) port: No source change was required for 99% of all existing Oberon System V4 applications.
Note, only 1% of all modules required changes, not 1% of the source code. This is mostly because in some low-level modules, the programmers assumed that SIZE(LONGINT) would always be 4 (32 bits). At the time when the Oberon System was developed, this was a practical, but nevertheless unnecessary assumption. Still, 99% of the modules simply compiled and worked without our even looking at the source code. (We would not have had the time to even look at the huge amount of source code already publicly available at that time.)
Only low-level modules needed some modifications. All these 32 bit dependencies were easily tracked down after being flagged at compile time; usually only a few source lines had to be changed. Such changes can be made in a way that the module still compiles with a 32 bit LONGINT size.
Our approach was confirmed by the success of porting a huge collection of applications within a few weeks.
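One kind of change that keeps a low-level module compilable with either LONGINT size is to replace a literal element size with SIZE(LONGINT). The following is an illustrative sketch only, not an actual AOS source change; the module Peek and the procedure Sum are hypothetical, and SYSTEM.ADR is assumed to return LONGINT as described in this article.

```oberon
MODULE Peek;
(* Hypothetical sketch of a low-level routine without a 32 bit dependency. *)
IMPORT SYSTEM;

PROCEDURE Sum* (VAR a: ARRAY OF LONGINT): LONGINT;
  VAR adr, limit, x, s: LONGINT;
BEGIN
  s := 0;
  adr := SYSTEM.ADR(a[0]);
  limit := adr + LEN(a) * SIZE(LONGINT);
  WHILE adr < limit DO
    SYSTEM.GET(adr, x); INC(s, x);
    (* A 32 bit dependent version would say INC(adr, 4) here. *)
    INC(adr, SIZE(LONGINT))
  END;
  RETURN s
END Sum;

END Peek.
```

With a 32 bit compiler SIZE(LONGINT) is 4 and the code behaves as before; with a 64 bit compiler the same source steps through the array in units of 8.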
Nevertheless, there are other proposals to provide for 64 bit extensions in Oberon-2:
One Oberon-2 compiler implementation (OOC) added a new pervasive integer called HUGEINT and the author even proposed this extension in an effort to standardize the Oberon-2 language and -- even worse -- trying to standardize the pervasive integer type sizes. But the HUGEINT-approach is flawed because it changes the de-facto Oberon-2 language report by
I also checked the latest source of OOC (summer 1999): it is a 32 bit compiler, even when run on a 64 bit machine, because (1) the scanner can't even parse 64 bit integer literals, (2) the library provides only 32 bit int/str/int conversion, and (3) all formal procedure parameters of type LONGINT must be converted to HUGEINT, if they want to migrate to 64 bit. In addition, OOC does not have any number conversion from real to HUGEINT; the ENTIER() result type is always LONGINT. The compiler itself internally uses only LONGINT, which does not allow it to store 64 bit constant integer values. OOC has 32 bit restrictions all over the place (size of structures, maximal index in array declarations, maximal index in dynamic arrays, size of local and global variables, size in SYSTEM.COPY, MOVE).
The compiler would have to use HUGEINT instead of LONGINT in most places. The resulting source of OOC would no longer compile on 32 bit machines, which proves that their 64 bit concept results in 32 bit incompatibility. MAX/MIN(HUGEINT) is even set to MAX/MIN(LONGINT) in the ANSI C back-end; there is no other (native code) back-end. OOC has a long way to go, because its HUGEINT extension does not allow seamless migration from 32 to 64 bit.
My observations lead me to conclude that they don't have any 64 bit experience, but they want to standardize their implementation notes (concerning the size of pervasive integer types, SYSTEM.ADR(): SYSTEM.ADDRESS, the magic mapping from LONGINT or HUGEINT to ADDRESS, etc., apart from LONGCHAR, LONGCHR(), ...).
Note, this is the state of summer 1999; I don't know if OOC was modified in this respect since then.
Q: What is a 32 bit restriction?
A: You've got a 32 bit restriction, if for example:
VAR l: LONGINT; BEGIN l := SYSTEM.ADR(l);
Does this compile with both oo2c_32 (32 bit implementation) and oo2c_64 (64 bit implementation)?
I guess it does not, because the result type of SYSTEM.ADR is SYSTEM.ADDRESS; on 32 bit systems this is an alias for LONGINT, on 64 bit systems an alias for HUGEINT. Later it was stated that SYSTEM.ADDRESS was not a synonym for HUGEINT. So are alias and synonym not identical? Still later it was clarified: "[in OOC], on systems with 32 bit pointers, ADDRESS is an alias for LONGINT, whereas on systems with 64 bit pointers, it is an alias for HUGEINT."
Anyway, this is a language change and it breaks existing code, because a new data type HUGEINT is needed to take advantage of 64 bit arithmetic/addressing.
And then why are there two different compiler names for OOC?
A2O does not need such a distinction, because it is both (1) a 32 bit and (2) a 64 bit compiler without 32 bit restrictions; it's just a compilation switch. This is possible because A2O always runs on a 64 bit machine. (The 32 bit mode is no longer needed for AOS; it was only kept for stand-alone programs.) And if you want
This is what we did. It is simple. It has been proven. It works. It allows migration from 32 bit to 64 bit pointer/address/LONGINT with maximal upward and backward compatibility. It does not require any change to the Oberon-2 language report; for example, the result type of SYSTEM.ADR() remains LONGINT. We are not pushing our language, because we did not make any language change to Oberon-2. (As for the language extensions: even without the new LONGSET type and 64 bit hex literals which we introduced in A2O, all I said above is still valid.)
One ironic tragedy remains: If the majority adopted OOC as a de facto standard, it would create a lot of work to modify existing source code, which in turn could decrease the unemployment rate. ;-) And after all, the majority is always right. Right?
But OOCists really debated source code changes:
Q: "Why are you so against code changes? These modifications solve the problem once and for all."
A: Because the number of applications and tools that exist for the Oberon System is too large and life is too short to even look at them. I compiled and used them (the applications and tools) on a 64 bit Oberon System and they worked. This proves that our 64 bit concept (you might call it a 64 bit extension) works.
In addition, you have to consider Oberon as language and environment. The Oberon[-2] language is only one piece in extensible programming created by Wirth/Gutknecht. Any attempt to make the Oberon-2 language incompatible with existing source code developed for the Oberon System neglects their achievements.
An OOCist complained that
will output different values, depending on size of LONGINT.
A: But it always outputs MAX(LONGINT), as required, e.g., 2^31-1 using a 32 bit compiler and 2^63-1 using a 64 bit compiler. LONGINT is an abstract whole number type and implementation dependent; only MAX(SYSTEM.SIGNED_32) is always 2^31-1.
An OOCist disputed our source base as a test of the practicality of our simple approach:
"Once and for all. The Oberon Systems are a research product of various universities to test new aproaches in operating system design. Cool stuff. However the commerical effect is nearly zero, nothing. ... Oberon Systems are a dead end for my purposes. The official Oberon people have missed that badly."
A: There are real world applications for OS V4 and S3. I use Gisela's spreadsheet [which she developed for V4] under AOS. (Too bad that her source package is difficult to find on the web.) I use Kepler to draw illustrations, and other tools. I know of someone (other than the author) who uses Nepros (artificial simulation system). I use AOS for programming. My friend plays Tetris under AOS. These are all really useful, real world applications, and there are more.
Q: Can you compile your programs using X11 without interface or program code changes when your data type sizes change as a result of your type model?
A: Yes. The Alpha Oberon System (AOS) uses X11 in many lower-level modules such as Font and Display. The whole AOS together with all its tools and applications can be compiled with LONGINT being 32 bit or 64 bit wide without any source code change. This was one of the goals of the port. Just for fun. The only limitation is that you can't use 64 bit addressing in 32 bit AOS, which is fair enough. 64 bit AOS can compile/run everything 32 bit AOS can, so there is no need for a 32 bit AOS.
Implementation Notes: The procedure calling conventions might be different on different operating systems or processors, but this does not change the interface seen by the Oberon programmer, except if you are talking about different releases of X11 (each time they changed a lot).
Because not all Oberon System implementations are based on X11, module X11 is normally not directly used in Oberon System. Such programs would not be portable. X11 is used to implement the modules Display, Font, etc. on Oberon Systems based on X11 (linux/unix and openvms ports).
Last time I checked, X11 was still 32 bit only (pointers etc.). I guess this is because a 64 bit version of X11 would break too much existing C/C++ code, so they can't come up with a 64 bit X11 version. This is the reason why 64 bit OpenVMS has a 32 bit X11 implementation.
In AOS, global variables and the heap are always located above the 32 bit address space. The full 32 bit address space is reserved for the main stack and the coroutine stacks (OpenVMS stack restriction).
Because of the 32 bit X11 restrictions, AOS copies all data which goes to or comes from X11 via 32 bit memory, which is allocated on the local procedure stack (auxiliary local variables). For open array formal parameters, the procedure stack is enlarged dynamically. (This extra copying is not noticeable even on the slowest existing Alpha workstation.)
The copying must be done for all modules directly making data transfers to/from X11.
By the way, everything in AOS is written in O2. Not a single line of assembly, C language or any other foreign language is used -- in the case of AOS, even the primary Oberon System bootstrap loader is written in O2.
A2O can also generate code for stand-alone programs, which by the way means that our concept even works for so-called command line compilers, i.e., outside the Oberon System. And yes, we can also use X11 in stand-alone 64 bit Oberon-2 programs in what you call the "real world", and we can also import the 64 bit ISO Modula-2 library modules in Oberon-2.
We used the same concept in our 64 bit OpenVMS Alpha Modula-2 compiler (8 byte sized INTEGER/CARDINAL/subranges and SYSTEM.[UN]SIGNED_32 and _64).
The advantage of an extensible operating system interface is that the application program interface is mostly 64 bit ready. OpenVMS Alpha is one such example. For the I/O size it does not matter if you have a 32 bit or 64 bit actual parameter. The immediate value is always passed in a 64 bit register (or on the stack from the 7th parameter on); the same applies to addresses. The exception is VAR parameters, where the callee needs to know the size of the actual parameter (good that Oberon requires type identity for VAR params).
Of course X11 is still restricted to 32 bit. But this is due to the inflexibility of C's type system (pointer size/long int). It seems that C's "long int" is not an abstract integer type, but 32 bits. (I wonder whether this size is really defined in the ANSI C or C++ standard.) So Oberon programmers have to live with C's flawed language design when inheriting applications/standards written in C.
In 1998, I looked for a student who would port VisualOberon (VO) to AOS in the summer holidays, just out of curiosity and in order to prove that it is possible, but I couldn't find one. Most of the work would have gone into overcoming the 32 bit X11 restrictions. (The resulting port would again be under GPL of course, although I always had and still have mixed feelings about the GPL in general.)
On a 128 bit compiler, the size of LONGINT and pointers will be 16 bytes. Again, no changes to existing source code will be required.
From what I've seen, the OOC compiler is great (apart from the fact that there is only an ANSI C back-end). One could make a 64 bit compiler from OOC by migrating LONGINT to 64 bit without changing too much (discard HUGEINT and put 32 bit ints into SYSTEM). But this is exactly what they oppose. They live in a 32 bit world, claiming that to interface foreign language libraries they need LONGINT to be 32 bit. But that will change on 64 bit operating systems which use 64 bit whole numbers in the API (size of an I/O transfer, size of a memory allocation, etc.).
"64 Bit Address Extension of the Alpha Oberon-2 Compiler" was published at our web-site in Sep-1996.
"64 bit Oberon" was published at our web-site in summer 1997 and in the ACM SIGPLAN Notices in early 1998.
Most of the ModulaTor back-issues are available from http://www.modulaware.com/mdltr_.htm