The ModulaTor Erlangen's First Independent Modula-2 Journal! Nr. 1/Aug-1990 _____________________________________________________________ Does Modula-2 Generate Racehorses? Comparison of Compiler Generated Code Quality for Floating Point Arithmetic by Guenter Dotzel, ModulaWare A simple example serves to demonstrate the code quality generated by two different Modula-2 compilers. In this paper the two compilers are called the Zuerich/Hamburger and the Erlanger. 1. The Zuerich/Hamburger: ETH-Zuerich's DEC VAX/VMS implementation of the Modula/R compiler V1.2 from the LIDAS distribution kit of 1986, which itself is an extension of the DEC VAX/VMS Modula-2 compiler implementation V3.2 of the University of Hamburg. While the scanner and parser of Zuerich's compiler is based upon the Lilith Modula/R version B20, it uses the same machine code generator as the Hamburger. 2. The Erlanger: ModulaWare's Modula-2 & Modula/R compiler is based upon the Zuerich/Hamburger. The current release for DEC VAX/VMS V5.3 is called MVR V2.12 and is the result of a three years work with code generator improvements, extensive maintenance and many language extensions in accordance with the upcoming ISO/ANSI/BSI/DIN-Modula-2 Standard. The Erlanger allows standard functions in module and procedure declaration part (constant expression), record- and array-constructors (aggregates) and has more standard functions, namely CAST, FLOATD, FLOATG, FLOATH, LENGTH, MAX, MIN, and SHIFT. First, the Modula-2 source program is given in Listing 1. Listing 2 shows an excerpt of the machine code listing generated by the Erlanger and Listing 3 shows that of the Zuerich/Hamburger. Then the differences in code quality concerning instruction count, code size and execution speed is discussed and summarized. Listing 4 shows the main program module used for the benchmarks. The paper closes with a short description of the development history of Modula-2 and Modula/R. Listing 1: 1 DEFINITION MODULE DoSetALoop; 2 TYPE rCoord = RECORD x,y: REAL END; 3 PROCEDURE InnerLoop(p: rCoord; iterations: INTEGER): INTEGER; 4 END DoSetALoop. 1 IMPLEMENTATION MODULE DoSetALoop; 2 (* Mandelbrot set calculations inner loop *) 3 4 PROCEDURE InnerLoop (p: rCoord; iterations: INTEGER): INTEGER; 5 VAR a, a2: rCoord; count: INTEGER; 6 BEGIN 7 a := p; a2.x := a.x * a.x; a2.y := a.y * a.y; count:=0; 8 REPEAT 9 a.y := 2. * a.x * a.y + p.y; 10 a.x := a2.x - a2.y + p.x; 11 a2.x := a.x * a.x; 12 a2.y := a.y * a.y; 13 INC(count); 14 UNTIL (a2.x + a2.y >= 4.0) OR (count>=iterations); 15 RETURN count; 16 END InnerLoop; 17 18 END DoSetALoop. Listing 2: Code generated by the Erlanger The numbers with a colon at the end of the line indicate the line numbers of the Modula-2 source listing above. Note, that the parameters p and iterations are defined to be value parameters and hence have to be copied on the stack. By default the VMS procedure calling standard passes a parameter by reference if no prefix %IMMED is specified for the formal parameter. To be compatible with VMS, actually first the address of iterations and then the address of p is pushed on the stack before calling the procedure InnerLoop using the instruction CALLS #2,DoSetALoop.InnerLoop. This explains the displacement deferred addressing mode, indicated by @ in the first operand of the MOVQ and MOVL instructions. These instructions are marked by an * in Listing 1 and 2. All other local variables a, a2 and count are also allocated on the stack. The function result is returned by register R0 which is the VMS standard method. The procedure InnerLoop uses only one working register R10. The literals #16 and #24 are encoded short floating literals with the values 2.0 and 4.0. All parameters are accessed via argument pointer AP and the local variables are accessed via frame pointer FP. The offset +4 addresses the y-component of the variables of record type rCoord, whereas the x-component has offset 0. .TITLE DoSetALoop ; Program code section .PSECT MODULA2.$CODE$,PIC,REL,SHR,RD,EXE,LONG ; Symbols for procedure InnerLoop FFFFFFF8 p = -8 FFFFFFF4 iterations = -12 FFFFFFEC a = -20 FFFFFFE4 a2 = -28 FFFFFFE0 count = -32 00000 InnerLoop: 0400 00000 .ENTRY DoSetALoop.InnerLoop,^M<R10> ;6 5E 20 C2 00002 SUBL2 #32, SP F8 AD 04 BC 7D 00005 * MOVQ @4(AP), p(FP) F4 AD 08 BC D0 0000A * MOVL @8(AP), iterations(FP) EC AD F8 AD 7D 0000F MOVQ p(FP), a(FP) ;7 E4 AD EC AD EC AD 45 00014 MULF3 a(FP), a(FP), a2(FP) E8 AD F0 AD F0 AD 45 0001B MULF3 a+4(FP), a+4(FP), a2+4(FP) E0 AD D4 00022 CLRL count(FP) 5A 10 EC AD 45 00025 2$: MULF3 a(FP), #16, R10 ;9 5A F0 AD 44 0002A MULF2 a+4(FP), R10 F0 AD 5A FC AD 41 0002E ADDF3 p+4(FP), R10, a+4(FP) 5A E4 AD E8 AD 43 00034 SUBF3 a2+4(FP), a2(FP), R10 ;10 EC AD 5A F8 AD 41 0003A ADDF3 p(FP), R10, a(FP) E4 AD EC AD EC AD 45 00040 MULF3 a(FP), a(FP), a2(FP) ;11 E8 AD F0 AD F0 AD 45 00047 MULF3 a+4(FP), a+4(FP), a2+4(FP) ;12 E0 AD D6 0004E INCL count(FP) ;13 5A E4 AD E8 AD 41 00051 ADDF3 a2+4(FP), a2(FP), R10 ;14 18 5A 51 00057 CMPF R10, #24 09 18 0005A BGEQ 3$ F4 AD E0 AD D1 0005C CMPL count(FP), iterations(FP) 02 18 00061 BGEQ 3$ C0 11 00063 1$: BRB 2$ 50 E0 AD D0 00065 3$: MOVL count(FP), R0 ;15 04 00069 RET Listing 3: Code generated by Zuerich/Hamburger .TITLE DoSetALoop ; Program code section .PSECT MODULA2.$CODE$,PIC,REL,SHR,RD,EXE,LONG ; Symbols for procedure InnerLoop FFFFFFF8 p = -8 FFFFFFF4 iterations = -12 FFFFFFEC a = -20 FFFFFFE4 a2 = -28 FFFFFFE0 count = -32 00000 InnerLoop: 0000 00000 .ENTRY DoSetALoop.InnerLoop,^M<> ; 6 5E 20 C2 00002 SUBL2 #32,SP F8 AD 04 BC 7D 00005 * MOVQ @4(AP),p(FP) F4 AD 08 BC D0 0000A * MOVL @8(AP),iterations(FP) EC AD F8 AD 7D 0000F MOVQ p(FP),a(FP) ;7 7E EC AD 50 00014 MOVF a(FP),-(SP) 7E EC AD 50 00018 MOVF a(FP),-(SP) 6E 8E 44 0001C MULF2 (SP)+,(SP) E4 AD 8E 50 0001F MOVF (SP)+,a2(FP) 7E F0 AD 50 00023 MOVF a+4(FP),-(SP) 7E F0 AD 50 00027 MOVF a+4(FP),-(SP) 6E 8E 44 0002B MULF2 (SP)+,(SP) E8 AD 8E 50 0002E MOVF (SP)+,a2+4(FP) E0 AD D4 00032 CLRL count(FP) ;8 00004100 8F DD 00035 2$: PUSHL #16640 ;9 7E EC AD 50 0003B MOVF a(FP),-(SP) 6E 8E 44 0003F MULF2 (SP)+,(SP) 7E F0 AD 50 00042 MOVF a+4(FP),-(SP) 6E 8E 44 00046 MULF2 (SP)+,(SP) 7E FC AD 50 00049 MOVF p+4(FP),-(SP) 6E 8E 40 0004D ADDF2 (SP)+,(SP) F0 AD 8E 50 00050 MOVF (SP)+,a+4(FP) 7E E4 AD 50 00054 MOVF a2(FP),-(SP) ;10 7E E8 AD 50 00058 MOVF a2+4(FP),-(SP) 6E 8E 42 0005C SUBF2 (SP)+,(SP) 7E F8 AD 50 0005F MOVF p(FP),-(SP) 6E 8E 40 00063 ADDF2 (SP)+,(SP) EC AD 8E 50 00066 MOVF (SP)+,a(FP) 7E EC AD 50 0006A MOVF a(FP),-(SP) ;11 7E EC AD 50 0006E MOVF a(FP),-(SP) 6E 8E 44 00072 MULF2 (SP)+,(SP) E4 AD 8E 50 00075 MOVF (SP)+,a2(FP) 7E F0 AD 50 00079 MOVF a+4(FP),-(SP) ;12 7E F0 AD 50 0007D MOVF a+4(FP),-(SP) 6E 8E 44 00081 MULF2 (SP)+,(SP) E8 AD 8E 50 00084 MOVF (SP)+,a2+4(FP) E0 AD D6 00088 INCL count(FP) ;13 7E E4 AD 50 0008B MOVF a2(FP),-(SP) ;14 7E E8 AD 50 0008F MOVF a2+4(FP),-(SP) 6E 8E 40 00093 ADDF2 (SP)+,(SP) 00004180 8F DD 00096 PUSHL #16768 8E 8E 51 0009C CMPF (SP)+,(SP)+ 09 15 0009F BLEQ 3$ F4 AD E0 AD D1 000A1 CMPL count(FP),iterations(FP) 02 18 000A6 BGEQ 3$ 8B 11 000A8 1$: BRB 2$ 50 E0 AD D0 000AA 3$: MOVL count(FP),R0 ;15 04 000AE RET Discussion of the differences in code quality The Zuerich/Hamburger code generator used no working register at all and all expressions were evaluated on the stack. The VAX-11 architecture allows certain floating point constant values to be coded in one byte and to be used as immediate operand in floating point instructions as shown in Listing 2. The Erlanger makes use of short floating literals where possible, whereas the Zuerich/Hamburger always needed 4 bytes and pushed the constants and variables on the stack before use. While the code generated by the Erlanger has 14 instructions (program counter 065H-025H) for the statement sequence of the repeat construct (source line numbers 9 to 14), the Zuerich/Hamburger generated 32 (program counter 0AAH-035H). The code size of the Erlanger is 64 bytes versus 117 bytes. The Erlanger generated 7 three-operand and 1 two-operand floating point instructions, whereas the Zuerich/Hamburger generated no three-operand and 8 two-operand floating point instructions. Summary The improvement of the Erlanger compared to the Zuerich/Hamburger is 56% less instructions and 45% less code. To be able to produce such tight code is due to the genuine VAX-11 architecture which allows easy code generation for high-level languages. Since the Erlanger produces reliable code in a straight-forward fashion without applying special optimizing techniques, like assigning machine registers for temporary variables, the VMS debugger can still be used to examine those variables. To get an instruction count of less than 13 (one branch instruction could be saved at label 1$: in Listing 2) for the repeat-construct of Listing 1 is a challenge for an experienced assembly programmer. Using the Mandelbrot set starting at coordinate (-0.75, -0.23) over a range of 0.04 with 256 iterations and 196608 calls of InnerLoop (512 * 384 pixels), the code generated by the Erlanger runs about 44% faster. This was measured on a DEC VAXstationII, where the total running time is 907 seconds for the Erlanger versus 1633 seconds using the Zuerich/Hamburger - a speed-up of 1.8. Listing 4: Main program DoSetA For those trying to compete using their favorite compiler, the main program DosetA is also given here. In order to do a fair comparison, note, that the parameters of InnerLoop are passed by value and not by reference which, in the case of Fortran or C would be the default parameter passing mechanism. 1 MODULE DoSetA; 2 FROM DoSetALoop IMPORT InnerLoop, rCoord; 3 4 TYPE iCoord = RECORD x,y: INTEGER END; 5 VAR c, size, step: iCoord; 6 p, start, range, gap: rCoord; 7 iterations, count: INTEGER; 8 9 BEGIN 10 size := iCoord [1024, 768]; 11 step := iCoord [2, 2]; 12 iterations := 255; 13 start := rCoord [-0.75, -0.23]; 14 range := rCoord [0.04, 0.04]; 15 gap := rCoord [range.x/FLOAT(size.x)*FLOAT(step.x), 16 range.y/FLOAT(size.y)*FLOAT(step.y)]; 17 18 p.y := start.y + range.y; 19 c.y:=1; WHILE c.y<=size.y DO 20 p.x := start.x; 21 c.x:=1; WHILE c.x<=size.x DO 22 count:=InnerLoop(p, iterations); 23 INC(p.x, gap.x); 24 INC(c.x, step.x); 25 END; 26 DEC(p.y, gap.y); 27 INC(c.y, step.y); 28 END; 29 END DoSetA. The module DoSetA is input and output-free. The iteration count is only assigned to the variable count which may be used to draw a dot on the screen, e.g. by DrawColouredDot (drawing_mode := replace, color := count, position_x := c.x, position_y := c.y) The example uses so-called record-constructors (aggregates). The notation for an aggregates is type_ident "[" expression_list "]" which can be used to construct aggregates of user defined record or array type. The statement gap := rCoord [range.x/FLOAT(size.x)*FLOAT(step.x), range.y/FLOAT(size.y)*FLOAT(step.y)]; for example is equivalent to the statement sequence gap.x := range.x/FLOAT(size.x)*FLOAT(step.x); gap.y := range.y/FLOAT(size.y)*FLOAT(step.y); History of the programming languages Modula-2 and Modula/R Modula-2 was designed by N. Wirth at the Institut fuer Informatik, ETH-Zuerich in Switzerland as the logical successor of Pascal. Modula/R is a database programming language and is a true superset of Modula-2. Modula/R extends Modula-2 by fully embedded relational database language constructs such as data structure RELATION, by a predicate-oriented selection mechanism for relation elements by altering operators and by a transaction concept. The extension was designed and first implemented by the LIDAS Group of ETH-Zuerich at the Institut fuer Informatik (J. Koch, M. Mall, P. Putfarken, M. Reimer, J.W. Schmidt, C.A. Zehnder) with the intention not to introduce too many additional concepts into the language Modula-2. The origin of Modula/R is Pascal/R. Pascal/R is a database language designed at the University of Hamburg by J.W. Schmidt that extends Pascal by very similar concepts. A group under direction of J.W. Schmidt also derived the first Modula-2 Compiler for the VAX from the ETH-Zuerich's PDP-11 implementation.

IMPRESSUM: The ModulaTor is an unrefereed journal. Technical papers are to be
taken as working papers and personal rather than organizational statements.
Items are printed at the discretion of the Editor based upon his judgement on
the interest and relevancy to the readership. Letters, announcements, and
other items of professional interest are selected on the same basis. Office of
publication: The Editor of The ModulaTor is Guenter Dotzel; he can be reached
by tel/fax: [removed due to abuse] or by
mailto:[email deleted due to spam]

ModulaWare home page
The ModulaTor download