-------------------------------------------------------
Hewlett-Packard Company
SPEC CPU2000 FLAG DESCRIPTIONS

  - INTEL C++ AND FORTRAN COMPILERS 7.0

    - hp-20030630-ICL70-Windows.txt
-------------------------------------------------------




Description of compiler flags for Intel C Compiler 7.0
-------------------------------------------------------

-O2	Optimizes for speed. The -O2 option has the same effect as specifying
       	the following options: -Og, -Oi, -Ot, -Oy, -Ob1, -Gf, -Gs, and -Gy.
       	This option is ON by default.

-O3    	Optimizes for speed. Enables high-level optimization. This level does
       	not guarantee higher performance, and using this option may increase
       	the compilation time. The impact on performance is application
       	dependent; some applications may not see a performance improvement.

-Oa[-] 	Assumes [does not assume] no aliasing.
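
To show why the aliasing assumption matters, here is a minimal sketch in C.
The function names add_first and add_first_hoisted are hypothetical and do
not come from any benchmark. When dst and src may overlap, each store to
dst[i] could change src[0], so the compiler has to reload it on every
iteration. Under a no-aliasing assumption the compiler may treat the first
function like the second one.

```c
#include <stddef.h>

/* Hypothetical example: adds src[0] to every element of dst.
 * If dst and src may alias, src[0] must be reloaded after each store. */
void add_first(int *dst, const int *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] += src[0];
    }
}

/* The same loop with the load hoisted by hand. It matches add_first
 * only when dst and src do not overlap. */
void add_first_hoisted(int *dst, const int *src, size_t n)
{
    const int first = src[0];
    for (size_t i = 0; i < n; i++) {
        dst[i] += first;
    }
}
```

The two functions agree on separate arrays but differ when the same array is
passed as both arguments, which is the case the option asks the programmer to
rule out.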

-Obn      	Controls the compiler's inline expansion. The amount of inline
                expansion performed varies with the value of n as follows:
		0:  Disables inlining.
		1:  Enables (default) inlining of functions declared with the
                    __inline keyword. Also enables inlining according to the
                    C++ language.
		2:  Enables inlining of any function.  However, the 
                    compiler decides which functions to inline.  Enables 
                    interprocedural optimizations and has the same effect as 
                    -Qip.

-Og	Enables global optimizations.

-Ot	Enables all speed optimizations.

-Oi[-] 	Enables [disables] inline expansion of intrinsic functions.

-Ow[-]	Assumes [does not assume] no cross-function aliasing.

-Oy[-]	Enables [disables] the use of the EBP register in optimizations. When
	you disable with -Oy-, the EBP register is used as the frame pointer.

-Gf	Enables string-pooling optimization.
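
As a rough illustration of string pooling, the sketch below checks whether
two identical string literals end up at the same address. The function names
are hypothetical. C leaves it to the compiler whether identical literals
share storage, so either answer is valid; pooling simply makes the shared
case the expected one.

```c
#include <string.h>

/* Returns 1 when the two identical literals share one address (pooled),
 * 0 when the compiler kept separate copies. Both results are valid C. */
int literals_are_pooled(void)
{
    const char *first = "SPEC CPU2000";
    const char *second = "SPEC CPU2000";
    return first == second;
}

/* The contents compare equal regardless of the pooling decision. */
int literals_have_same_text(void)
{
    return strcmp("SPEC CPU2000", "SPEC CPU2000") == 0;
}
```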

-Gs[n]	Disables stack-checking for routines with n or more bytes of local
	variables and compiler temporaries. Default: n=4096

-Gy	Packages functions to enable linker optimization.

-Qax{i|M|K|W}	Generates specialized code for the processor-specific codes
		i, M, K, and W while also generating generic IA-32 code.
    	i  = Pentium Pro and Pentium II processor instructions
    	M  = MMX(TM) instructions
    	K  = streaming SIMD extensions
    	W  = Pentium 4 processor instructions
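
Conceptually, generating both specialized and generic code means the program
picks a code path at run time. The sketch below models that idea by hand in
C. All names are hypothetical, the processor check is reduced to a plain
integer argument, and the "specialized" routine is ordinary C standing in for
processor-specific instructions. It is not a description of the code the
compiler actually emits.

```c
#include <stddef.h>

typedef float (*sum_fn)(const float *, size_t);

/* Generic path: meant to run on any IA-32 processor. */
static float sum_generic(const float *v, size_t n)
{
    float total = 0.0f;
    for (size_t i = 0; i < n; i++) {
        total += v[i];
    }
    return total;
}

/* Stand-in for a specialized path. It uses two accumulators, the way a
 * SIMD version might, but is written in portable C. */
static float sum_specialized(const float *v, size_t n)
{
    float even = 0.0f;
    float odd = 0.0f;
    size_t i = 0;
    for (; i + 2 <= n; i += 2) {
        even += v[i];
        odd += v[i + 1];
    }
    if (i < n) {
        even += v[i];
    }
    return even + odd;
}

/* Selects a path. cpu_supports_extension is an assumed flag that a real
 * program would derive from a processor identification check. */
sum_fn select_sum(int cpu_supports_extension)
{
    return cpu_supports_extension ? sum_specialized : sum_generic;
}
```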
    
-Qx{i|M|K|W}	Generates specialized code to run exclusively on processors
            supporting the extensions indicated by the codes
            described above.

-QxK, -QaxK, -QxW, -QaxW ensure consistent floating point arithmetic.


-Qip        Enables interprocedural optimizations within a single file.

-Qipo       Enables multi-file interprocedural optimizations, which include:
              - inline function expansion
              - interprocedural constant propagation
	      - monitoring module-level static variables
              - dead code elimination
              - propagation of function characteristics
              - passing arguments in registers
              - loop-invariant code motion

-Qprof_gen       Instruments the program for profiling, to collect the
		 execution count of each basic block.

-Qprof_use       Enables the use of profiling dynamic feedback information
		 during optimization.

-Qrcd            Enables fast conversion of floating-point values to
		 integers. This option does not guarantee that
		 any particular rounding mode will be used.
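
The practical consequence is that a conversion may no longer truncate toward
zero, which is what standard C requires for a cast. The sketch below compares
the standard truncating cast with a hand-written round-to-nearest (ties to
even) conversion. Round-to-nearest is only an assumed example of a mode other
than truncation, and both function names are hypothetical.

```c
/* Standard C conversion: the fractional part is discarded. */
int convert_truncate(double x)
{
    return (int)x;
}

/* Hand-written round-to-nearest, ties to even. Assumes x is well within
 * the range of int. Illustrates a result that differs from truncation. */
int convert_nearest_even(double x)
{
    int t = (int)x;               /* truncated value */
    double frac = x - (double)t;  /* in (-1, 1), same sign as x */

    if (frac > 0.5 || (frac == 0.5 && t % 2 != 0)) {
        return t + 1;
    }
    if (frac < -0.5 || (frac == -0.5 && t % 2 != 0)) {
        return t - 1;
    }
    return t;
}
```

Code that depends on truncation, for example when computing an array index
from a floating-point value, is the kind that can change behavior when the
rounding mode is not guaranteed.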

-GR[-]           Enables [disables] C++ Run Time Type Information (RTTI).
		 Default is -GR-

-GX[-]           Enables [disables] C++ Exception Handling. Default is -GX-

/Qfp_port   	 Rounds floating-point results at assignments and casts
		 (some speed impact).

/Qprefetch       Ignored by the Intel C/C++ Compiler, which issues a warning.

-Qunroll[n]	Specifies the maximum number of times to unroll a loop. n=0 disables
		loop unrolling.
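
For readers unfamiliar with the transformation, the sketch below shows a loop
and a version unrolled four times by hand. The function names are
hypothetical, and the compiler's own unrolling may look different. The
unrolled version needs a cleanup loop for the elements left over when the
trip count is not a multiple of four.

```c
#include <stddef.h>

/* Original loop: one addition per iteration. */
long sum_rolled(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += v[i];
    }
    return total;
}

/* Unrolled by four: fewer loop-control operations per element. */
long sum_unrolled4(const int *v, size_t n)
{
    long total = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        total += v[i];
        total += v[i + 1];
        total += v[i + 2];
        total += v[i + 3];
    }
    /* Cleanup loop for the remaining 0 to 3 elements. */
    for (; i < n; i++) {
        total += v[i];
    }
    return total;
}
```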

shlW32M.lib:    MicroQuill SmartHeap Library 6.03 available from 
                 http://www.microquill.com/

-Zp{1|2|4|8|16}	 Specifies the strictest alignment constraint for structure and union 
		 types as 1, 2, 4, 8, or 16 bytes. Default is 16.
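
The effect of a stricter packing limit can be seen by comparing structure
sizes. The sketch below uses #pragma pack to apply 1-byte packing to a single
structure so that both layouts fit in one file. Treating the pragma as
equivalent to the command-line option for that one structure is an
assumption, and the structure and function names are hypothetical.

```c
#include <stddef.h>

/* Natural layout: the compiler may insert padding after tag so that
 * value is aligned. */
struct natural {
    char tag;
    int value;
};

/* 1-byte packing: no padding between members. */
#pragma pack(push, 1)
struct packed1 {
    char tag;
    int value;
};
#pragma pack(pop)

size_t natural_size(void)
{
    return sizeof(struct natural);
}

size_t packed1_size(void)
{
    return sizeof(struct packed1);
}
```

Tighter packing reduces memory use but can make member accesses unaligned,
which is why the setting can affect benchmark performance.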


Description of compiler flags for Intel Fortran Compiler 7.0
------------------------------------------------------

-O2	Optimizes for maximum speed. The -O2 option has the same effect as 
       	-Ox.  This option is ON by default.

-O3    	Enables -O2 option with more aggressive optimization, for example, 
	loop transformation. Optimizes for maximum speed but may not improve
	performance for some programs.

-Oa[-] 	Assumes [does not assume] no aliasing.

-Ob{0|1|2}	Controls the compiler's inline expansion. The amount of inline
                expansion performed varies as follows:
		-Ob0:  Disables inlining.
		-Ob1:  Disables (default) inlining unless -Qip or -Ob2 is
		       specified. Enables inlining of functions.
		-Ob2:  Enables inlining of any function.  However, the 
                       compiler decides which functions to inline.  Enables 
                       interprocedural optimizations and has the same effect as 
                       -Qip.

-Og	Enables global optimizations.

-Ot	Enables all speed optimizations.

-Oi[-] 	Enables [disables] inline expansion of intrinsic functions.

-Ow[-]	Assumes [does not assume] no cross-function aliasing.

-Ox   	Same as the -O2 option: enables -Gs, -Ob1, -Og, -Oy, -Ot, and -Oi.

-Oy[-]	Enables [disables] the use of the EBP register in optimizations. When
	you disable with -Oy-, the EBP register is used as the frame pointer.

-Gf	Enables string-pooling optimization.

-Gs[n]	Disables stack-checking for routines with n or more bytes of local
	variables and compiler temporaries. Default: n=4096

-Gy	Packages functions to enable linker optimization.

-Qax{i|M|K|W}	Generates specialized code for the processor-specific codes
		i, M, K, and W while also generating generic IA-32 code.
    	i  = Pentium Pro and Pentium II processor instructions
    	M  = MMX(TM) instructions
    	K  = streaming SIMD extensions
    	W  = Pentium 4 processor instructions
    
-Qx{i|M|K|W}	Generates specialized code to run exclusively on processors
            supporting the extensions indicated by the codes
            described above.

-QxK, -QaxK, -QxW, -QaxW ensure consistent floating point arithmetic.


-Qip        Enables interprocedural optimizations within a single file.

-Qipo       Enables multi-file interprocedural optimizations, which include:
              - inline function expansion
              - interprocedural constant propagation
	      - monitoring module-level static variables
              - dead code elimination
              - propagation of function characteristics
              - passing arguments in registers
              - loop-invariant code motion

-Qprof_gen       Instruments the program for profiling, to collect the
		 execution count of each basic block.

-Qprof_use       Enables the use of profiling dynamic feedback information
		 during optimization.

-Qrcd            Enables fast conversion of floating-point values to
		 integers. This option does not guarantee that
		 any particular rounding mode will be used.

-GR[-]           Enables [disables] C++ Run Time Type Information (RTTI).
		 Default is -GR-

-GX[-]           Enables [disables] C++ Exception Handling. Default is -GX-

-Qscalar_rep[-]	 Enables [disables] scalar replacement performed during loop
		 transformations (requires -O3).

-Qunroll[n]	Specifies the maximum number of times to unroll a loop. n=0 disables
		loop unrolling.

-Qprefetch[-]    Enables or disables prefetch insertion (requires -O3).

shlW32M6.lib:    MicroQuill SmartHeap Library 6.0 available from 
                 http://www.microquill.com/

-Zp{1|2|4|8|16}	 Specifies the strictest alignment constraint for structure and union 
		 types as 1, 2, 4, 8, or 16 bytes. Default is 16.



Other Notes: 
------------
Both "/" and "-" are valid prefixes for flags passed to the compiler;
for example, -QxK and /QxK are identical switches.


Portability options for CPU2000:
-------------------------------
176.gcc:     
   -Dalloca=_alloca 	  :  Uses the built-in optimized alloca.
   /F10000000       	  :  176.gcc uses alloca, and this option tells the
			     linker to pre-allocate 10MB of stack. The
			     default stack allocation is not enough, and
			     without it 176.gcc crashes with a run-time error.

178.galgel: 
   -FI                    : Fixed-format F90 source code. 
   /F32000000             : As with 176.gcc, pre-allocates a larger stack,
                            here 32MB.

186.crafty: 
   -DNT_i386              :  Specifies a Windows NT Intel processor-based
                             system, which makes the compiler use "_int64"
                             as the 64-bit integer type that 186.crafty needs.
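
A pre-define of this kind typically selects a type through conditional
compilation. The sketch below shows the general pattern. It is not taken
from the 186.crafty source, the names demo_u64 and set_bit are hypothetical,
and the Windows branch uses the __int64 spelling of the Microsoft-specific
keyword as an assumption.

```c
#if defined(NT_i386)
/* Windows compilers: Microsoft-specific 64-bit integer keyword. */
typedef unsigned __int64 demo_u64;
#else
/* Other platforms: the standard C99 64-bit capable type. */
typedef unsigned long long demo_u64;
#endif

/* Sets bit number `square` (0 to 63) in a 64-bit board mask. */
demo_u64 set_bit(demo_u64 board, int square)
{
    return board | ((demo_u64)1 << square);
}
```

A chess program needs one bit per square, so a type narrower than 64 bits
would silently lose squares 32 to 63.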

253.perlbmk: 
   -DSPEC_CPU2000_NTOS    :  Enables the code changes needed for the
                             Windows port.
   -DPERLDLL              :  On Windows, we need a single perl.exe instead
                             of a perl.exe plus perl.dll. This pre-define
                             enables the changes necessary to build a
                             single, UNIX-style executable, avoiding the
                             indirect calls that can cause a 10%
                             performance degradation. This allows the
                             Windows-based executable to be as close as
                             possible to the UNIX-based one.
   /MT                    :  Uses the static multi-threaded library;
                             otherwise the benchmark will not compile.

254.gap:
   -DSYS_HAS_CALLOC_PROTO :  
   -DSYS_HAS_MALLOC_PROTO :  These two pre-defines indicate that prototypes
                             for malloc and calloc already exist.