Description of compiler flags for Intel C Compiler 5.0
------------------------------------------------------
/O3: In conjunction with /QaxK or /QxK, this switch causes the compiler to perform more 
     aggressive data dependency analysis than /O2 does. This may result in longer 
     compilation times. 

/Oa[-] assume no aliasing in program

/Ow    assume no aliasing in program but assume aliasing across function calls. This switch
       tells the compiler that no aliasing occurs within function bodies but might occur
       across function calls. After each function call, pointer variables must be 
       reloaded from memory. 
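
The effect of the aliasing assumptions can be seen in a small C sketch. The
function name add_twice is hypothetical and is not taken from any benchmark.
If the two pointers refer to the same object, the second statement must reload
*b after the first store; a compiler told to assume no aliasing would be free
to load *b only once and could produce a different result.

```c
/* Hypothetical example: adds *b to *a twice.
   If a and b may point to the same object, *b has to be reloaded
   from memory after the first store to *a. */
void add_twice(int *a, const int *b)
{
    *a += *b;   /* may change *b when a == b */
    *a += *b;   /* must see the updated value in that case */
}
```

With distinct objects holding 1 and 1, the result is 3. With both pointers
referring to the same object holding 1, standard C semantics give 4.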

/Qax<codes> generate code specialized for the processor extensions
            specified by <codes> while also generating generic IA-32 code.
            <codes> includes one or more of the following characters:
              i  Pentium Pro and Pentium II processor instructions
              M  MMX(TM) instructions
              K  Streaming SIMD Extensions

/Qx<codes>  generate specialized code to run exclusively on processors
            supporting the extensions indicated by <codes> as 
            described above.

/QIfist[-]  enable/disable(DEFAULT) fast float-to-int conversions (*M)

/Qip        enable single-file interprocedural (IP) optimizations (within files, same as /Ob2)

/Qipo       enable multi-file IP optimizations (between files)

/Qipo_wp    a higher level of IP optimization that first verifies that the
            whole-program optimizations listed below are possible. These
            optimizations happen only at link time, when it is known that
            an executable is being generated. If the conditions for the
            listed optimizations are not met, the link-time compilation
            does the equivalent of /Qipo.
 
            Whole program optimizations done at link time:
              - Data alignment within common blocks
              - Data layout within common blocks
              - Elimination of external functions that are never called
              - Interprocedural constant propagation
              - No alternate entry needed for stack-aligned external function entries
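
As an illustration of interprocedural constant propagation, the following C
sketch shows a helper that every call site invokes with the same constant.
The names scale, scale_all and scale_all_propagated are hypothetical, and the
second function is only a hand-written approximation of what an optimizer
might produce, not the actual output of this compiler. In a real whole-program
case the helper and its callers would typically live in different source files.

```c
/* Hypothetical helper: every call site passes factor == 4. */
static int scale(int x, int factor)
{
    return x * factor;
}

/* Original form: the constant is passed through a call. */
int scale_all(int *v, int n)
{
    int i, sum = 0;
    for (i = 0; i < n; i++) {
        v[i] = scale(v[i], 4);
        sum += v[i];
    }
    return sum;
}

/* Hand-written equivalent after propagating the constant 4 into
   the helper and inlining it (an assumption about the result). */
int scale_all_propagated(int *v, int n)
{
    int i, sum = 0;
    for (i = 0; i < n; i++) {
        v[i] = v[i] * 4;
        sum += v[i];
    }
    return sum;
}
```

Both functions compute the same values; only the form of the code differs.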

/Qprof_dir <d>   specify directory for profiling output files 
                 (*.dyn and *.dpi)
/Qprof_file <f>  specify file name for profiling summary file
/Qprof_gen[x]    instrument program for profiling; with the x qualifier,
                 extra information is gathered for use with the 
                 PROFORDER tool
/Qprof_use       enable use of profiling information during optimization

/Qrcd            same as /QIfist

/Qunroll[n]      set maximum number of times to unroll loops.  Omit n to use
                 default heuristics.  Use n=0 to disable loop unroller.
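
For readers unfamiliar with loop unrolling, the C sketch below shows a loop
and a hand-unrolled version with an unroll factor of 4. The function names are
hypothetical, and the unrolled form is only an approximation of the kind of
code an unroller generates; the code actually produced under /Qunroll may
differ.

```c
/* Straightforward loop: one addition per iteration. */
int sum_rolled(const int *v, int n)
{
    int i, sum = 0;
    for (i = 0; i < n; i++)
        sum += v[i];
    return sum;
}

/* Hand-unrolled by 4: fewer loop-control branches per element.
   A remainder loop handles the last n % 4 elements. */
int sum_unrolled4(const int *v, int n)
{
    int i = 0, sum = 0;
    for (; i + 3 < n; i += 4)
        sum += v[i] + v[i + 1] + v[i + 2] + v[i + 3];
    for (; i < n; i++)
        sum += v[i];
    return sum;
}
```

Both versions return the same sum for any n >= 0, provided the sum does not
overflow.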

shlW32M.lib: MicroQuill SmartHeap Library 5.0 available from http://www.microquill.com/


Description of compiler flags for Intel FORTRAN Compiler 5.0
------------------------------------------------------------
/O1    optimize for maximum speed, but disable some optimizations which
       increase code size for a small speed benefit: /Gs /Ob1gysi-
/O2    optimize for maximum speed (same as /Ox)
/O3    enable /O2 plus more aggressive optimizations that may not improve
       performance for all programs

/Qax<codes> generate code specialized for the processor extensions
            specified by <codes> while also generating generic IA-32 code.
            <codes> includes one or more of the following characters:
              i  Pentium Pro and Pentium II processor instructions
              M  MMX(TM) instructions
              K  Streaming SIMD Extensions

/Qx<codes>  generate specialized code to run exclusively on processors
            supporting the extensions indicated by <codes> as 
            described above.

/QIfist[-]  enable/disable(DEFAULT) fast float-to-int conversions (*M)

/Qip        enable single-file IP optimizations (within files, same as /Ob2)

/Qipo       enable multi-file interprocedural (ip) optimizations (between files)

/Qipo_wp    a higher level of IP optimization that first verifies that the
            whole-program optimizations listed below are possible. These
            optimizations happen only at link time, when it is known that
            an executable is being generated. If the conditions for the
            listed optimizations are not met, the link-time compilation
            does the equivalent of /Qipo.
 
            Whole program optimizations done at link time:
              - Data alignment within common blocks
              - Data layout within common blocks
              - Elimination of external functions that are never called
              - Interprocedural constant propagation
              - No alternate entry needed for stack-aligned external function entries

/Qpc 32          set internal FPU precision to a 24-bit significand

/Qprefetch[-]    enable(DEFAULT)/disable prefetch insertion (requires /O3)

/Qprof_dir <d>   specify directory for profiling output files 
                 (*.dyn and *.dpi)
/Qprof_file <f>  specify file name for profiling summary file
/Qprof_gen[x]    instrument program for profiling; with the x qualifier,
                 extra information is gathered for use with the 
                 PROFORDER tool
/Qprof_use       enable use of profiling information during optimization


/Qrcd            same as /QIfist 

/Qscalar_rep[-]  enable(DEFAULT)/disable scalar replacement (requires /O3)
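
Scalar replacement in general means keeping a repeatedly referenced array
element in a scalar temporary (ideally a register) instead of reloading and
storing it on every iteration. The sketch below shows the idea in C rather
than Fortran; the function names are hypothetical and the second version is a
hand-written approximation, not the output of /Qscalar_rep. The rewrite is
only valid when out does not overlap m.

```c
/* Original form: out[i] is read and written on every inner iteration. */
void row_sums(const double *m, double *out, int rows, int cols)
{
    int i, j;
    for (i = 0; i < rows; i++) {
        out[i] = 0.0;
        for (j = 0; j < cols; j++)
            out[i] += m[i * cols + j];
    }
}

/* After scalar replacement (by hand): the running sum lives in the
   scalar t and is stored to out[i] once per row. */
void row_sums_scalar(const double *m, double *out, int rows, int cols)
{
    int i, j;
    for (i = 0; i < rows; i++) {
        double t = 0.0;
        for (j = 0; j < cols; j++)
            t += m[i * cols + j];
        out[i] = t;
    }
}
```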

-Zi              enable the generation of debugging information

Portability options for CPU2000:
--------------------------------
176.gcc:     
         -Dalloca=_alloca : maps alloca to _alloca so that the built-in, optimized alloca is used
         /Fn              : 176.gcc uses alloca, and this option tells the linker to pre-allocate 
                            n bytes of stack. The default amount of stack allocated is not enough, and 
                            without this option 176.gcc crashes with a run-time error.
         -Op              : restricts optimization to maintain declared precision and to ensure that 
                            fp arithmetic conforms more closely to the ANSI and IEEE standards. This option adversely
                            affects performance. It is needed because 176.gcc depends on certain fp operations 
                            losing precision, and hence changing value, when rounded to their declared type. 
                            Using Intel's native 80-bit arithmetic causes the wrong results to be generated. 
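
The kind of precision loss described for -Op can be sketched in C as follows.
This is a generic illustration, not code from 176.gcc; the function name
add_as_float is hypothetical. The sum 16777216 + 1 fits in an 80-bit extended
register but not in the 24-bit significand of a float, so forcing the result
into a declared float changes its value. The volatile qualifier is used here
only to force the store on any compiler.

```c
/* Hypothetical example: returns a + b rounded to declared float
   precision. The volatile store prevents the result from being kept
   in a wider register. */
float add_as_float(float a, float b)
{
    volatile float r = a + b;   /* rounded to a 24-bit significand here */
    return r;
}
```

For example, add_as_float(16777216.0f, 1.0f) returns 16777216.0f, because
16777217 is not representable as a float and rounds to the nearest even value.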

178.galgel: 
   -FI                    : Specifies that the source code is fixed-format F90. 
   -link -stack:32000000  : As with /Fn for 176.gcc, pre-allocates stack space, here 32,000,000 bytes (about 32MB)

186.crafty: 
   -DNT_i386              : Specifies that the target is a Windows NT Intel processor-based system, so that 
                            "long long" is used as the 64-bit type that 186.crafty needs.

253.perlbmk: 
   -DSPEC_CPU2000_NTOS    : Enables the code changes needed for the port to Windows. 
   -DPERLDLL              : On Windows, we need a single perl.exe instead of a perl.exe plus a perl.dll. This
                            pre-define makes the changes necessary to get a single, UNIX-style executable
                            without the indirect calls that can cause a 10% performance degradation. This
                            allows the Windows-based executable to be as close as possible to the UNIX-based one. 
   /MT                    : Use the static multi-threaded library; otherwise the benchmark will not compile.

254.gap:
   -DSYS_HAS_CALLOC_PROTO :  
   -DSYS_HAS_MALLOC_PROTO : These two pre-defines indicate that prototypes for calloc and malloc already exist.