Description of compiler flags for Intel C++ Compiler 5.0.1
----------------------------------------------------------
-O1    Optimizes for speed, but disables some optimizations that increase
       code size for a small speed benefit. Includes inline expansion
       (except for intrinsic functions), global optimizations, and string
       pooling optimizations.

-O2    This is the default level of optimization.  
       Optimizes for speed. The -O2 option includes the -O1 optimizations
       and in addition enables inlining of intrinsics and more speed 
       optimizations.


-O3    Builds on the -O1 and -O2 optimizations by enabling high-level
       optimization. This level does not guarantee higher performance
       unless loop and memory access transformations take place. In
       conjunction with -QaxK/-QxK and -QaxW/-QxW, this switch causes the
       compiler to perform more aggressive data dependency analysis than
       for -O2. This may result in longer compilation times.

-Oa[-] Assume [do not assume] that there is no aliasing in the program.
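
The effect of the no-aliasing assumption can be illustrated with a small C
sketch. The function name add_bias is hypothetical and is not taken from any
benchmark source; the comments describe what a compiler may do, not what this
particular compiler is known to do.

```c
#include <stddef.h>

/* Hypothetical example: adds *bias to each of the n elements of dst.
 *
 * Without -Oa the compiler must allow for bias pointing into dst, so a
 * store to dst[i] could change *bias and the value has to be reloaded on
 * every iteration. With -Oa it may load *bias once before the loop. */
void add_bias(int *dst, const int *bias, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] += *bias;
}
```

A call such as add_bias(a, &a[0], 3) passes overlapping pointers. Code of
that kind is where a no-aliasing assumption would be unsafe.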


-Qax<codes> Generate code specialized for processor extensions
            specified by <codes> while also generating generic IA-32 code.
            <codes> includes one or more of the following characters:
                i  Pentium Pro and Pentium II processor instructions
                M  MMX(TM) instructions
                K  Streaming SIMD Extensions (implies i and M above)
                W  Pentium 4 processor with Streaming SIMD Extensions 2
                   (implies i, M and K above)
    
-Qx<codes>  generate specialized code to run exclusively on processors
            supporting the extensions indicated by <codes> as 
            described above.
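
The difference between -Qax and -Qx can be pictured as follows. This is only
a conceptual sketch in C: the flag cpu_has_sse and the functions dot_generic,
dot_specialized and dot are hypothetical names, and the code the compiler
actually generates is not shown in this document.

```c
#include <stddef.h>

/* Hypothetical flag; a real dispatcher would query the processor (CPUID). */
static int cpu_has_sse = 0;

/* Generic IA-32 path: plain scalar loop. */
static float dot_generic(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;
    size_t i;
    for (i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Stand-in for the specialized path. A compiler would emit SIMD
 * instructions here; this sketch simply reuses the scalar loop. */
static float dot_specialized(const float *a, const float *b, size_t n)
{
    return dot_generic(a, b, n);
}

/* -Qax style: both paths exist and one is chosen at run time.
 * -Qx style: only the specialized path would be present. */
float dot(const float *a, const float *b, size_t n)
{
    return cpu_has_sse ? dot_specialized(a, b, n) : dot_generic(a, b, n);
}
```

The practical consequence is that a -Qax binary still runs on processors
without the extension, while a -Qx binary does not.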

-Ob{0|1|2}  Controls the compiler's inline expansion.
            0:  disables inlining.
            1:  disables inlining unless -Qip or -Ob2 is specified.
            2:  enables inlining of any function. However, the
                compiler decides which functions are inlined. This
                option enables interprocedural optimizations and has
                the same effect as specifying the -Qip option.


-Qip        enable single-file IP optimizations 
           (within files, same as -Ob2)

-Qipo       Enables multi-file interprocedural (IP) optimizations, which include:
              - inline function expansion
              - interprocedural constant propagation
              - dead code elimination
              - propagation of function characteristics
              - passing arguments in registers
              - loop-invariant code motion
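
As an illustration of the first two items in the list above, consider this
small C sketch. The functions scale and sum_scaled and the file names in the
comments are hypothetical; the two functions are shown together here but
should be imagined in separate source files.

```c
/* util.c (hypothetical file) */
int scale(int x, int factor)
{
    return x * factor;
}

/* main.c (hypothetical file) */
int sum_scaled(const int *v, int n)
{
    int total = 0;
    int i;
    for (i = 0; i < n; i++)
        total += scale(v[i], 4);   /* factor is 4 at every call site */
    return total;
}
```

When each file is compiled on its own, the compiler sees only a call to an
external function. With multi-file IP optimization it can see that factor is
always 4, propagate that constant into scale, and expand the call inline.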

-Qwp_ipo         enable multi-file IP optimizations (between files) and make
                 "whole program" assumption that all variables and functions seen
                 in the compiled sources are referenced only within those sources;
                 the user must guarantee that this assumption is safe

-Qprof_gen       Instruments the program for profiling, the first phase of
                 two-phase profile-guided optimization.

-Qprof_use       Instructs the compiler to produce a profile-optimized
                 executable and to merge the available dynamic information
                 (.dyn) files into a pgopti.dpi file. If you perform multiple
                 executions of the instrumented program, -Qprof_use merges
                 the dynamic information files again and overwrites the
                 previous pgopti.dpi file.
                 Without any other options, the current directory is
                 searched for .dyn files.

-Qrcd           Improves the performance of code that requires
                floating-point-to-integer conversions.

                The system default floating-point rounding mode is
                round-to-nearest. This means that values are rounded during
                floating-point calculations. However, the C language requires
                floating-point values to be truncated when they are converted
                to an integer. To do this, the compiler must change the
                rounding mode to truncation before each
                floating-point-to-integer conversion and change it back
                afterwards.

                The -Qrcd option disables this change of the rounding mode
                to truncation for all floating-point calculations, including
                floating-point-to-integer conversions. Turning on this option
                can improve performance, but floating-point conversions to
                integer will not conform to C semantics.
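
The difference in results can be sketched in C. Both function names below
are hypothetical. The second function emulates in software a conversion
carried out in round-to-nearest mode (ties to even) and does not show the
code the compiler generates under -Qrcd.

```c
/* C semantics: a cast truncates toward zero. */
long to_long_truncate(double x)
{
    return (long)x;
}

/* Hypothetical helper: emulates a conversion done in round-to-nearest
 * mode, with ties going to the even integer.
 * Valid only for values that fit in a long. */
long to_long_nearest(double x)
{
    long t = (long)x;               /* truncated value */
    double frac = x - (double)t;    /* fraction, same sign as x */
    if (frac > 0.5 || (frac == 0.5 && (t & 1)))
        return t + 1;
    if (frac < -0.5 || (frac == -0.5 && (t & 1)))
        return t - 1;
    return t;
}
```

For an input of 2.7 the first function yields 2 and the second yields 3, so
a program that relies on truncation may change behavior when the rounding
mode is left unchanged.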

-GX             Enables full C++ exception handling unwind semantics.


-GR             Enables C++ Runtime Type Information (RTTI). 


shlW32M.lib:    MicroQuill SmartHeap Library 5.0 available from 
                http://www.microquill.com/


Description of compiler flags for Intel FORTRAN Compiler 5.0.1
--------------------------------------------------------------
-O1    Optimizes for speed, but disables some optimizations that increase
       code size for a small speed benefit. Includes inline expansion
       (except for intrinsic functions), global optimizations, and string
       pooling optimizations.

-O2    This is the default level of optimization.  
       Optimizes for speed. The -O2 option includes the -O1 optimizations
       and in addition enables inlining of intrinsics and more speed 
       optimizations.


-O3    Builds on the -O1 and -O2 optimizations by enabling high-level
       optimization. This level does not guarantee higher performance
       unless loop and memory access transformations take place. In
       conjunction with -QaxK/-QxK and -QaxW/-QxW, this switch causes the
       compiler to perform more aggressive data dependency analysis than
       for -O2. This may result in longer compilation times.


-Qax<codes> Generate code specialized for processor extensions
            specified by <codes> while also generating generic IA-32 code.
            <codes> includes one or more of the following characters:
                i  Pentium Pro and Pentium II processor instructions
                M  MMX(TM) instructions
                K  Streaming SIMD Extensions (implies i and M above)
                W  Pentium 4 processor with Streaming SIMD Extensions 2
                   (implies i, M and K above)

-Qx<codes>  generate specialized code to run exclusively on processors
            supporting the extensions indicated by <codes> as 
            described above.

-Qip        enable single-file IP optimizations (within files, same as -Ob2)

-Qipo       Enables multi-file interprocedural (IP) optimizations, which include:
              - inline function expansion
              - interprocedural constant propagation
              - dead code elimination
              - propagation of function characteristics
              - passing arguments in registers
              - loop-invariant code motion

-Qwp_ipo         enable multi-file IP optimizations (between files) and make
                 "whole program" assumption that all variables and functions seen
                 in the compiled sources are referenced only within those sources;
                 the user must guarantee that this assumption is safe

-Qprof_gen       Instruments the program for profiling, the first phase of
                 two-phase profile-guided optimization.

-Qprof_use       Instructs the compiler to produce a profile-optimized
                 executable and to merge the available dynamic information
                 (.dyn) files into a pgopti.dpi file. If you perform multiple
                 executions of the instrumented program, -Qprof_use merges
                 the dynamic information files again and overwrites the
                 previous pgopti.dpi file.
                 Without any other options, the current directory is
                 searched for .dyn files.


-Qrcd            Enables fast float-to-int conversion.

-Qscalar_rep[-]  Enables [disables] scalar replacement performed during loop
                 transformations (requires -O3).
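
Scalar replacement can be illustrated by a hand-written before and after
pair. The sketch is in C rather than Fortran, the function names are
hypothetical, and it shows the general technique rather than the exact
transformation this compiler applies.

```c
/* Before: c[i] is read from and written to memory on every inner
 * iteration. */
void matvec_plain(int n, const double *a, const double *b, double *c)
{
    int i, j;
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            c[i] += a[i * n + j] * b[j];
}

/* After scalar replacement: the running sum is held in the scalar t,
 * which can stay in a register, and c[i] is stored once per row. */
void matvec_scalar(int n, const double *a, const double *b, double *c)
{
    int i, j;
    for (i = 0; i < n; i++) {
        double t = c[i];
        for (j = 0; j < n; j++)
            t += a[i * n + j] * b[j];
        c[i] = t;
    }
}
```

The two versions give the same result only if c does not overlap a or b.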

-Qauto           Causes all variables to be allocated on the stack, rather than 
                 in local static storage. Does not affect variables that appear in an 
                 EQUIVALENCE or SAVE statement, or those that are in COMMON. Makes all 
                 local variables AUTOMATIC, same as /4Ya.
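
The two storage classes can be compared with a rough C analogy. The function
names are hypothetical, and the analogy covers only where the variable lives,
not the full Fortran rules for SAVE and initialization.

```c
/* Local static storage: one copy exists for the whole run, so the value
 * persists between calls. */
int count_static(void)
{
    static int calls = 0;
    calls++;
    return calls;
}

/* Automatic (stack) storage: the variable is created anew on every call. */
int count_auto(void)
{
    int calls = 0;
    calls++;
    return calls;
}
```

Code that silently depends on a local keeping its value between calls can
therefore behave differently once its locals become automatic.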


Other Notes: 
------------
"/" and "-" are both allowable starting tokens for flags passed to the
compiler; for example, -QxK and /QxK are identical switches.


Portability options for CPU2000:
--------------------------------
176.gcc:     
         -Dalloca=_alloca : so as to use the built-in optimized alloca
         -Fn              : 176.gcc uses alloca, and this option tells
                            the linker to pre-allocate n bytes of stack.
                            The default amount of stack allocated is not
                            enough, and 176.gcc crashes with a run-time
                            error.
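
Why alloca puts pressure on the stack can be seen in a short C sketch. The
function checksum_copy is hypothetical and is not part of 176.gcc; the
#define on Windows mirrors what -Dalloca=_alloca does on the command line.

```c
#include <stddef.h>
#include <string.h>
#ifdef _WIN32
#include <malloc.h>
#define alloca _alloca   /* same effect as passing -Dalloca=_alloca */
#else
#include <alloca.h>
#endif

/* Hypothetical helper: copies n bytes into a temporary buffer taken from
 * the stack and returns the sum of the copied bytes. The buffer is
 * released automatically when the function returns. */
unsigned checksum_copy(const unsigned char *src, size_t n)
{
    unsigned char *tmp = alloca(n);
    unsigned sum = 0;
    size_t i;
    memcpy(tmp, src, n);
    for (i = 0; i < n; i++)
        sum += tmp[i];
    return sum;
}
```

Every byte requested this way comes out of the thread's stack, so a program
that makes large or deeply nested alloca requests needs a larger stack than
the default.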

178.galgel: 
   -FI                    : Fixed-format F90 source code. 
   -F32000000             : Same as with 176.gcc, pre-allocates a 32MB 
                            stack

186.crafty: 
   -DNT_i386              : Specifies that the target is a Windows NT Intel
                            processor-based system, which makes the code
                            use "long long" as the 64-bit type that
                            186.crafty needs.

253.perlbmk: 
   -DSPEC_CPU2000_NTOS    : Enables the code changes needed for the port to
                            Windows.
   -DPERLDLL              : On Windows, we need a single perl.exe instead of
                            a perl.exe and perl.dll. This define enables
                            the changes necessary to build a single,
                            UNIX-style executable without the
                            indirect calls that can cause a 10% performance
                            degradation. This allows the Windows-based
                            executable to be as close as possible to
                            the Unix-based one.
   -MT                    : Use the static multi-threaded library; otherwise
                            the benchmark will not compile.

254.gap:
   -DSYS_HAS_CALLOC_PROTO :  
   -DSYS_HAS_MALLOC_PROTO : These two defines indicate that prototypes for
                            malloc and calloc already exist.
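
How a define of this kind is typically consumed can be sketched in C. This
is an assumption about the general pattern, not a copy of the 254.gap source,
and the helper make_counter is hypothetical.

```c
#include <stddef.h>

#ifdef SYS_HAS_MALLOC_PROTO
#include <stdlib.h>              /* the system header declares malloc */
#else
extern void *malloc(size_t);     /* fallback declarations in the source */
extern void free(void *);
#endif

/* Hypothetical helper: allocates one int and sets it to zero.
 * Returns NULL if the allocation fails. */
int *make_counter(void)
{
    int *p = malloc(sizeof *p);
    if (p != NULL)
        *p = 0;
    return p;
}
```

If the source supplied its own declaration while the system headers also
declared the function with a different type, the compiler would report
conflicting declarations. Passing the define avoids that clash.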