Description of compiler flags for Intel C++ Compiler 9.1
--------------------------------------------------------
-O1    Optimizes for speed, but disables some optimizations which increase 
       code size for a small speed benefit. Includes global optimizations, 
       string pooling optimizations, and inline expansion (except for 
       intrinsic functions).

-O2    This is the default level of optimization.  
       Optimizes for speed. The -O2 option includes -O1 optimizations 
       and in addition enables inlining of intrinsics and more speed 
       optimizations. 

-O3    Builds on -O1 and -O2 optimizations by enabling high-level 
       optimization. This level does not guarantee higher performance 
       unless loop and memory access transformations take place. In 
       conjunction with -QaxK/-QxK and -QaxW/-QxW, this switch causes the 
       compiler to perform more aggressive data dependency analysis than 
       for -O2. This may result in longer compilation times. 
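
As a rough illustration of the kind of loop and memory access
transformation mentioned under -O3, the C++ sketch below performs a loop
interchange by hand. The function names are hypothetical, and nothing here
is taken from the compiler's implementation; whether the compiler applies
such a transformation to a given loop nest is not guaranteed.

```cpp
#include <cstddef>

// Inner loop walks down a column of a row-major matrix, so consecutive
// accesses are `cols` elements apart in memory.
void scale_column_order(const double* src, double* dst,
                        std::size_t rows, std::size_t cols) {
    for (std::size_t j = 0; j < cols; ++j)
        for (std::size_t i = 0; i < rows; ++i)
            dst[i * cols + j] = 2.0 * src[i * cols + j];
}

// Same computation after interchanging the two loops: the inner loop now
// touches adjacent elements, which is friendlier to caches and prefetching.
void scale_row_order(const double* src, double* dst,
                     std::size_t rows, std::size_t cols) {
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            dst[i * cols + j] = 2.0 * src[i * cols + j];
}
```

Both functions write the same values; only the order of memory accesses
differs.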

-Oa[-] assume [do not assume] no aliasing in program


-Qax<codes> generate code specialized for processors specified by <codes>
            while also generating generic IA-32 code.  <codes> includes
            one or more of the following characters:
    K  Intel Pentium III and compatible Intel processors
    W  Intel Pentium 4 and compatible Intel processors
    N  Intel Pentium 4 and compatible Intel processors.  Enables new
       optimizations in addition to Intel processor-specific optimizations
    P  Intel Pentium 4 processors and compatible Intel processors with Streaming SIMD Extensions 3
    B  Intel Pentium M and compatible Intel processors
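
Conceptually, a binary built with -Qax carries more than one version of a
routine and picks one at run time. The C++ sketch below imitates that idea
by hand; it does not show how the compiler implements dispatch. The names
CpuLevel, dot_generic, dot_specialized, and select_dot are hypothetical,
and the processor level is passed in as a parameter rather than detected.

```cpp
#include <cstddef>

// Hypothetical capability levels. A real program would detect the
// processor at startup; that step is omitted from this sketch.
enum class CpuLevel { Generic, Sse2 };

typedef float (*DotFn)(const float*, const float*, std::size_t);

// Baseline version intended to run on any processor.
float dot_generic(const float* a, const float* b, std::size_t n) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) sum += a[i] * b[i];
    return sum;
}

// Stand-in for a processor-specific version. It is written in portable
// C++ (four partial sums, as a vectorizer might arrange them).
float dot_specialized(const float* a, const float* b, std::size_t n) {
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i] * b[i];
        s1 += a[i + 1] * b[i + 1];
        s2 += a[i + 2] * b[i + 2];
        s3 += a[i + 3] * b[i + 3];
    }
    float sum = (s0 + s1) + (s2 + s3);
    for (; i < n; ++i) sum += a[i] * b[i];  // leftover elements
    return sum;
}

// Choose the code path once, based on the (assumed) processor level.
DotFn select_dot(CpuLevel level) {
    return level == CpuLevel::Sse2 ? dot_specialized : dot_generic;
}
```

Because the two versions add terms in a different order, their
floating-point results can differ slightly for general inputs.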
    
-Qx<processor> (Windows)
-x<processor> (Linux)
		Generate specialized code for the processor specified by <processor>.
		Unlike -Qax, no generic IA-32 code path is generated.

		<processor> is the processor for which you want to target your program. 
		Possible values are: 

  		K: Code is optimized for Intel® Pentium® III and compatible Intel processors. 
  		W: Code is optimized for Intel Pentium 4 and compatible Intel processors. 
  		N: Code is optimized for Intel Pentium 4 and compatible Intel processors 
		   with Streaming SIMD Extensions 2. The resulting code may contain 
		   unconditional use of features that are not supported on other 
		   processors.
		   This option also enables new optimizations in addition to Intel 
		   processor-specific optimizations including advanced data layout and 
		   code restructuring optimizations to improve memory accesses for Intel 
		   processors. 
  		B: Code is optimized for Intel Pentium M and compatible Intel processors. 
		   This option also enables new optimizations in addition to Intel 
	 	   processor-specific optimizations. 
  		P: Code is optimized for Intel® Core™ Duo processors, Intel® Core™ Solo 
		   processors, Intel® Pentium® 4 processors with Streaming SIMD 
		   Extensions 3, and compatible Intel processors with Streaming SIMD 
		   Extensions 3. The resulting code may contain unconditional use of 
		   features that are not supported on other processors. 
		   This option also enables new optimizations in addition to Intel 
		   processor-specific optimizations including advanced data layout and 
		   code restructuring optimizations to improve memory accesses for Intel 
		   processors.  

		Additional Notes on <codes> N and P:
		------------------------------------
		The N and P options target your program to run on Intel Pentium 4
		and compatible Intel processors.  The resulting code might
		contain unconditional use of features that are not supported
		on other processors.  Programs in which the function main() is
		compiled with one of these options detect incompatible processors
		and generate an error message during execution. These options also
		enable new optimizations in addition to Intel processor-specific
		optimizations, including advanced data layout and code restructuring
		optimizations to improve memory accesses for Intel processors.

/arch:{SSE|SSE2}
     same as -QxK and -QxW respectively


-Ob{0|1|2}	Controls the compiler's inline expansion.
		0:  disable inlining.
		1:  inline functions declared with __inline, and perform C++ inlining.
		2:  inline any function, at the compiler's discretion (same as -Qip).


-Qip       enable single-file IP optimizations 
           (within files, same as -Ob2)

-Qipo       multi-file IP optimizations that include:
              - inline function expansion
              - interprocedural constant propagation
              - dead code elimination
              - propagation of function characteristics
              - passing arguments in registers
              - loop-invariant code motion
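
Two of the items listed above, interprocedural constant propagation and
dead code elimination, can be pictured with a small C++ sketch. The
functions apply_mode, process, and process_after_ipo are hypothetical. In a
real program the first two would live in separate source files; whether
the compiler actually performs this simplification depends on the code and
the options used.

```cpp
// Imagine this function is defined in one source file...
int apply_mode(int value, int mode) {
    if (mode == 0) return value + 1;
    return value * 2;  // never reached if every caller passes mode == 0
}

// ...and this caller in another. Every call passes the constant 0.
int process(int value) {
    return apply_mode(value, 0);
}

// What process() could effectively be reduced to once the constant is
// propagated into apply_mode(), the untaken branch is removed, and the
// call is inlined.
int process_after_ipo(int value) {
    return value + 1;
}
```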

-fast            The -fast option enhances execution speed across the entire program 
                 by including the following options that can improve run-time performance:

                     -O3 (maximum speed and high-level optimizations) 
                     -Qipo (enables interprocedural optimizations across files) 
                     -QxP (generate code specialized for Intel Pentium 4 processor 
                           and compatible Intel processors with Streaming SIMD Extensions 3)
                     -Qprec-div- (disable -Qprec-div)
                      where -Qprec-div improves precision of FP divides (some speed impact)

                 To override one of the options set by -fast, specify that option after the 
                 -fast option on the command line. The options set by -fast may change from 
                 release to release.
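
To see why -Qprec-div trades speed for precision, note that a compiler
allowed to relax division may compute a/b as a * (1/b). The C++ sketch
below writes both forms by hand. The function names are hypothetical, and
the code makes no claim about the instructions the compiler actually emits.

```cpp
// Ordinary division: a single rounded operation.
float divide_precise(float a, float b) {
    return a / b;
}

// Reciprocal form: two rounded operations (the reciprocal, then the
// product), so the result can differ from a / b in the last bits.
float divide_via_reciprocal(float a, float b) {
    float reciprocal = 1.0f / b;
    return a * reciprocal;
}
```

When b is a power of two the reciprocal is exact and both forms agree; for
other divisors a small relative difference is possible.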

-Qansi_alias[-]  enable/disable use of ANSI aliasing rules in
                 optimizations; the user asserts that the program adheres to
                 these rules.  The default for C++ is -Qansi_alias-,
                 meaning that adherence to the aliasing rules is not assumed.
                 The default for the Fortran compiler is -Qansi_alias, as
                 described in the next section.  For C++, the -Qansi_alias
                 flag enables optimizations that would otherwise be
                 prevented by potential aliasing.
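
The aliasing rules in question say, roughly, that an object should be
accessed only through its own type (or a character type). The C++ sketch
below shows a function whose pointer parameters cannot legally alias under
those rules, followed by a conforming way to inspect the bits of a float.
The function names are hypothetical, and the bit pattern mentioned in the
comment assumes a 32-bit IEEE 754 float.

```cpp
#include <cstdint>
#include <cstring>

// Under the ANSI aliasing rules, *count (an int) and *scale (a float)
// cannot refer to the same object, so a compiler relying on those rules
// may keep *scale in a register across the store to *count.
float bump_and_read(int* count, const float* scale) {
    float before = *scale;
    *count += 1;
    return before + *scale;  // may be computed as 2 * before
}

// Conforming type punning: copy the bytes instead of casting the pointer.
// With IEEE 754 floats, float_bits(1.0f) is 0x3F800000.
std::uint32_t float_bits(float value) {
    static_assert(sizeof(std::uint32_t) == sizeof(float),
                  "sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

Code that instead reads a float through an int pointer breaks the rules,
and is the kind of program for which -Qansi_alias would be unsafe.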

-Qprof_gen       instrument program for profiling for the first phase of 
                 two-phase profile-guided optimization

-Qprof_use       Instructs the compiler to produce a profile-optimized 
                 executable and merges available dynamic information (.dyn) 
                 files into a pgopti.dpi file. If you perform multiple 
                 executions of the instrumented program, -Qprof_use merges 
                 the dynamic information files again and overwrites the 
                 previous pgopti.dpi file.
                 Without any other options, the current directory is 
                 searched for .dyn files.

-Qrcd           The Intel compiler uses the -Qrcd option to improve the
                performance of code that requires floating-point-to-integer                        
                conversions. 

                The system default floating point rounding mode is
                round-to-nearest. This means that values are rounded during 
                floating point calculations. However, the C language requires 
                floating point values to be truncated when a conversion to an                      
                integer is involved. To do this, the compiler must change the 
                rounding mode to truncation before each floating 
                point-to-integer conversion and change it back afterwards.

                The -Qrcd option disables the change to truncation of the 
                rounding mode for all floating point calculations, including                       
                floating point-to-integer conversions. Turning on this option 
                can improve performance, but floating point conversions to 
                integer will not conform to C semantics.
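
The difference between the two behaviors can be seen in ordinary C++
without the flag. In the sketch below, a cast truncates as the language
requires, while std::lrint converts using the current rounding mode
(round-to-nearest by default), which resembles a conversion performed
with the rounding mode left unchanged. The function names are
hypothetical; the sketch illustrates the semantics only and does not
reproduce the compiler's generated code.

```cpp
#include <cmath>

// C/C++ conversion semantics: the fractional part is discarded
// (rounding toward zero).
int convert_truncating(double x) {
    return static_cast<int>(x);
}

// Conversion that honors the current floating-point rounding mode.
// In the default round-to-nearest mode, 2.7 becomes 3 rather than 2.
long convert_current_mode(double x) {
    return std::lrint(x);
}
```

A program that relies on truncation, for example to compute an array
index from a positive floating-point value, can therefore see different
results when conversions follow the rounding mode instead.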

-Qunroll[n]     Specifies the maximum number of times to unroll a loop. Use
                n = 0 to disable the unroller. If n is omitted, the compiler
                decides whether to unroll each loop and automatically chooses
                the maximum number of times to unroll it.
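
For readers unfamiliar with unrolling, the C++ sketch below shows a loop
and a hand-unrolled equivalent with an unroll factor of 4. The function
names are hypothetical, and the compiler's own unrolling may be structured
differently.

```cpp
#include <cstddef>

// Original loop: one addition and one loop test per element.
long sum_rolled(const int* data, std::size_t n) {
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

// Unrolled by 4: one loop test per four elements, plus a cleanup loop
// for the elements left over when n is not a multiple of 4.
long sum_unrolled4(const int* data, std::size_t n) {
    long total = 0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        total += data[i];
        total += data[i + 1];
        total += data[i + 2];
        total += data[i + 3];
    }
    for (; i < n; ++i) total += data[i];
    return total;
}
```

Unrolling reduces loop overhead at the cost of larger code, which is why
a limit on the unroll count can be useful.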

-Qcxx_features  Enables both -GX and -GR, described below, so that C++ Exception 
                Handling and Runtime Type Information are both enabled.

-GX             Enables the full C++ Exception Handling unwind semantics. 

-GR             Enables C++ Runtime Type Information (RTTI). 

-Zp{1|2|4|8|16} Specifies the strictest alignment constraint for structure and union 
                types as one of the following: 1, 2, 4, 8, or 16 (default) bytes.
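
The effect of a structure alignment limit can be sketched in C++ with
#pragma pack, which many compilers accept and which constrains member
alignment for the enclosed declarations in a way comparable to -Zp1. This
is an analogy, not a description of how -Zp is implemented. The struct
names are hypothetical.

```cpp
#include <cstdint>

// Default layout: padding is typically inserted after `tag` so that
// `value` starts on a 4-byte boundary.
struct NaturalLayout {
    char tag;
    std::int32_t value;
};

// Members aligned to at most 1 byte: no padding between `tag` and
// `value`, so the struct occupies 1 + 4 = 5 bytes.
#pragma pack(push, 1)
struct PackedLayout {
    char tag;
    std::int32_t value;
};
#pragma pack(pop)
```

Tighter packing saves space but can make member accesses unaligned, which
may be slower on some processors.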

-Qprefetch[-]   Enables [disables] the insertion of software prefetching by the compiler. 
                Default is -Qprefetch. 

+FDO		PASS1=-Qprof_gen  PASS2=-Qprof_use

		Using feedback-directed optimization, a profile is generated 
		on the first pass of compilation and used on the second pass.

shlW32M.lib:    MicroQuill SmartHeap Library 8.0 available from 
                http://www.microquill.com/


Description of compiler flags for Intel FORTRAN Compiler 9.1
-------------------------------------------------------------
-O1    Optimizes for speed, but disables some optimizations which increase 
       code size for a small speed benefit. Includes global optimizations, 
       string pooling optimizations, and inline expansion (except for 
       intrinsic functions).

-O2    This is the default level of optimization.  
       Optimizes for speed. The -O2 option includes -O1 optimizations 
       and in addition enables inlining of intrinsics and more speed 
       optimizations.


-O3    Builds on -O1 and -O2 optimizations by enabling high-level 
       optimization. This level does not guarantee higher performance 
       unless loop and memory access transformations take place. In 
       conjunction with -QaxK/-QxK and -QaxW/-QxW, this switch causes the 
       compiler to perform more aggressive data dependency analysis than 
       for -O2. This may result in longer compilation times. 


-Qax<codes> generate code specialized for processors specified by <codes>
            while also generating generic IA-32 code.  <codes> includes
            one or more of the following characters:
    K  Intel Pentium III and compatible Intel processors
    W  Intel Pentium 4 and compatible Intel processors
    N  Intel Pentium 4 and compatible Intel processors.  Enables new
       optimizations in addition to Intel processor-specific optimizations
    P  Intel Pentium 4 processors and compatible Intel processors with Streaming SIMD Extensions 3
    B  Intel Pentium M and compatible Intel processors
    
-Qx<processor> (Windows)
-x<processor> (Linux)
		Generate specialized code for the processor specified by <processor>.
		Unlike -Qax, no generic IA-32 code path is generated.

		<processor> is the processor for which you want to target your program. 
		Possible values are: 

  		K: Code is optimized for Intel® Pentium® III and compatible Intel processors. 
  		W: Code is optimized for Intel Pentium 4 and compatible Intel processors. 
  		N: Code is optimized for Intel Pentium 4 and compatible Intel processors 
		   with Streaming SIMD Extensions 2. The resulting code may contain 
		   unconditional use of features that are not supported on other 
		   processors.
		   This option also enables new optimizations in addition to Intel 
		   processor-specific optimizations including advanced data layout and 
		   code restructuring optimizations to improve memory accesses for Intel 
		   processors. 
  		B: Code is optimized for Intel Pentium M and compatible Intel processors. 
		   This option also enables new optimizations in addition to Intel 
	 	   processor-specific optimizations. 
  		P: Code is optimized for Intel® Core™ Duo processors, Intel® Core™ Solo 
		   processors, Intel® Pentium® 4 processors with Streaming SIMD 
		   Extensions 3, and compatible Intel processors with Streaming SIMD 
		   Extensions 3. The resulting code may contain unconditional use of 
		   features that are not supported on other processors. 
		   This option also enables new optimizations in addition to Intel 
		   processor-specific optimizations including advanced data layout and 
		   code restructuring optimizations to improve memory accesses for Intel 
		   processors.  

		Additional Notes on <codes> N and P:
		------------------------------------
		The N and P options target your program to run on Intel Pentium 4
		and compatible Intel processors.  The resulting code might
		contain unconditional use of features that are not supported
		on other processors.  Programs in which the main program is
		compiled with one of these options detect incompatible processors
		and generate an error message during execution. These options also
		enable new optimizations in addition to Intel processor-specific
		optimizations, including advanced data layout and code restructuring
		optimizations to improve memory accesses for Intel processors.

/arch:{SSE|SSE2}
     same as -QxK and -QxW respectively


-Qip        enable single-file IP optimizations (within files, same as -Ob2)

-Qipo       multi-file IP optimizations that include:
              - inline function expansion
              - interprocedural constant propagation
              - dead code elimination
              - propagation of function characteristics
              - passing arguments in registers
              - loop-invariant code motion

-fast            The -fast option enhances execution speed across the entire program 
                 by including the following options that can improve run-time performance:

                 -O3   (maximum speed and high-level optimizations) 
                 -Qipo (enables interprocedural optimizations across files) 
                 -QxP (generate code specialized for Intel Pentium 4 processor 
                       and compatible Intel processors with Streaming SIMD Extensions 3)
                 -Qprec-div- (disable -Qprec-div)
                 where -Qprec-div improves precision of FP divides (some speed impact)

                 To override one of the options set by -fast, specify that option after the 
                 -fast option on the command line. The options set by -fast may change from 
                 release to release.

-Qansi_alias     Enables (default) or disables the compiler's assumption that the 
                 program adheres to the ANSI Fortran type aliasability rules. For 
                 example, an object of type real cannot be accessed as an integer. 
                 See the ANSI standard for the complete set of rules.  The default 
                 for this flag is the reverse in C++, as noted in the previous section.

-Qprof_gen       instrument program for profiling for the first phase of 
                 two-phase profile-guided optimization

-Qprof_use       Instructs the compiler to produce a profile-optimized 
                 executable and merges available dynamic information (.dyn) 
                 files into a pgopti.dpi file. If you perform multiple 
                 executions of the instrumented program, -Qprof_use merges 
                 the dynamic information files again and overwrites the 
                 previous pgopti.dpi file.
                 Without any other options, the current directory is 
                 searched for .dyn files.

-Qrcd            Enables fast float-to-int conversion.

-Qscalar_rep[-]  Enables [disables] scalar replacement performed during loop 
                 transformations (requires -O3).  Such replacement is disabled by
                 default.
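
Scalar replacement keeps a value that would otherwise be re-read from an
array in a scalar temporary, which can then live in a register. The sketch
below is written in C++ rather than Fortran, the function names are
hypothetical, and the hand-written second version only approximates what
an optimizer might do. It assumes the input and output arrays do not
overlap.

```cpp
#include <cstddef>

// Each iteration reads in[i] and in[i + 1]; the element read as
// in[i + 1] is read again as in[i] on the next iteration.
void pair_sums(const double* in, double* out, std::size_t n) {
    for (std::size_t i = 0; i + 1 < n; ++i)
        out[i] = in[i] + in[i + 1];
}

// After scalar replacement: the value carried between iterations is held
// in `current`, so each array element is loaded only once.
void pair_sums_scalar_replaced(const double* in, double* out,
                               std::size_t n) {
    if (n < 2) return;
    double current = in[0];
    for (std::size_t i = 0; i + 1 < n; ++i) {
        double next = in[i + 1];
        out[i] = current + next;
        current = next;
    }
}
```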

-Qauto           Causes all variables to be allocated on the stack, rather than 
                 in local static storage. Does not affect variables that appear in an 
                 EQUIVALENCE or SAVE statement, or those that are in COMMON. Makes all 
                 local variables AUTOMATIC, same as /4Ya.

-Qunroll[n]     Specifies the maximum number of times to unroll a loop. Use
                n = 0 to disable the unroller. If n is omitted, the compiler
                decides whether to unroll each loop and automatically chooses
                the maximum number of times to unroll it.

-Qprefetch[-]   Enables [disables] the insertion of software prefetching by the compiler. 
                Default is -Qprefetch. 


Other Notes: 
------------
"/" and "-" are both allowable starting tokens for flags passed to the 
compiler; for example, -QxK and /QxK are identical switches. 


Portability options for CPU2000:
-------------------------------
176.gcc:     
         -Dalloca=_alloca : so as to use the built-in optimized alloca
         -Fn              : 176.gcc uses alloca, and this option tells
                            the linker to pre-allocate n bytes of stack. 
                            The default amount of stack allocated is not 
                            enough, and 176.gcc crashes with a run-time 
                            error.

178.galgel: 
   -FI                    : Fixed-format F90 source code. 
   -F32000000             : Same as with 176.gcc, pre-allocates a 32MB 
                            stack

186.crafty: 
   -DNT_i386              : Specifies a Windows NT Intel processor-based 
                            system, which makes the compiler use "long long" 
                            as the 64-bit variable type that 186.crafty 
                            needs.

252.eon:
   -DHAS_ERRLIST          : The programming environment provides a 
                            specification for "sys_errlist[]".

253.perlbmk: 
   -DSPEC_CPU2000_NTOS    : Ensures that the code changes necessary for
                            compilation on Windows are included.

   -DPERLDLL              : For Windows, SPEC modified the original perl source
                            code to allow building a monolithic executable
                            rather than the executable and DLL that is
                            standard on Windows.
                            -DPERLDLL is simply a signal to the Perl source
                            code that it is being built as if it were to go
                            into the DLL.
                             
                            The SPEC version of perl, in 253.perlbmk, will not
                            build on Windows without this flag.  
 
   /MT                    : Use the static multi-threaded library; otherwise 
                            the benchmark will not compile.

254.gap:
   -DSYS_HAS_CALLOC_PROTO :  
   -DSYS_HAS_MALLOC_PROTO : These two defines indicate that prototypes for
                            malloc and calloc are available.