----------------------------
-Super Micro Computer, Inc.- 
----------------------------

Description of compiler flags for Intel C++ Compiler 9.0
------------------------------------------------------------


-O2	
		Optimizes for speed. The -O2 option includes the following options: 
		-Og, -Oi-, -Ot, -Oy, -Ob1, and -Gs.  This option defaults to ON.
		This option also enables:
		* inlining of intrinsics
		* Intra-file interprocedural optimizations including:
		  * inlining
		  * constant propagation
		  * forward substitution
		  * routine attribute propagation
		  * variable address-taken analysis
		  * dead static function elimination
		  * removal of unreferenced variables.
		* The following performance optimizations:
		  * copy propagation
		  * dead-code elimination
		  * global register allocation
		  * global instruction scheduling and control speculation
		  * loop unrolling
		  * optimized code selection
		  * partial redundancy elimination
		  * strength reduction/induction variable simplification
		  * variable renaming
		  * exception handling optimizations
		  * tail recursions
		  * peephole optimizations
		  * structure assignment lowering and optimizations
		  * dead store elimination

-O3    	
		Optimizes for speed. Enables high-level optimization. This level does 
		not guarantee higher performance. Using this option may increase the
		compilation time. Impact on performance is application dependent; some
		applications may not see a performance improvement.  The optimizations
		include:
		* All optimizations done with -O2
		* loop unrolling, including instruction scheduling
		* code replication to eliminate branches
		* padding the size of certain power-of-two arrays to allow more efficient
		  cache use.
		* When used with -Qax or -Qx, it causes the compiler to perform more aggressive
		  data dependency analysis than for -O2.

-Oa[-] 	
		Assumes [does not assume] no aliasing.  Default: disabled.

-Obn      	
		Controls the compiler's inline expansion. The amount of inline
		expansion performed varies with the value of n as follows:
		0:  Disables inlining.  Statement functions are always inlined.
		1:  Enables (default) inlining of functions declared with the
		    __inline keyword. Also enables inlining according to the
		    C++ language.
		2:  Enables inlining of any function.  However, the 
		    compiler decides which functions to inline.  Enables 
		    interprocedural optimizations and has the same effect as 
		    -Qip.
		Default n=2.

-Og	
		Enables global optimizations.  Default ON.

-Ot	
		Enables all speed optimizations.

-Oi[-] 	
		Enables/disables inline expansion of intrinsic functions.  Default Enabled.

-Ow[-]	
		Assumes [does not assume] no cross-function aliasing.

-Oy[-]	
		Enables [disables] the use of the EBP register in optimizations. When
		you disable with -Oy-, the EBP register is used as the frame pointer.  -Oy- has
		the effect of reducing the number of available general-purpose registers by 1,
		and can produce slightly less efficient code.
		Default Enabled.

-Gf	
		Enables string-pooling optimization.

-Gs[n]	
		Disables stack-checking for routines with n or more bytes of local
		variables and compiler temporaries. Default: n=4096

-Gy	
		Packages functions to enable linker optimization.  Default ON.

-Qax{K|W|N|P}	
		Generates specialized code for the processor-specific codes 
		K, W, N, P while also generating generic IA-32 code. 
		K  = Intel Pentium III and compatible Intel processors
		W  = Intel Pentium 4 and compatible Intel processors
		N  = Intel Pentium 4 and compatible Intel processors. These options also enable
		     advanced data layout and code restructuring optimizations to improve memory
		     accesses for Intel processors.
		P  = Intel Pentium 4 processor with Streaming SIMD Extensions 3

-Qx{K|W|N|P} 
		Generate specialized code to run exclusively on processors
		supporting the extensions indicated by <codes> as 
		described above.
    
-Qip        
		Enables interprocedural optimizations within a single file.

-Qipo       
		Enables multi-file interprocedural (IP) optimizations, which allow inline function
		expansion for calls to functions defined in separate files.  The compiler decides
		whether to create one or more object files based on an estimate of the size of the
		application.  It generates one object file for small applications and two for large ones.

-Qprof_gen       
		Instruments the program for profiling, to collect the execution
		count of each basic block.

-Qprof_use       
		Enables the use of profiling dynamic feedback information
		during optimization.  Turns on -Qfnsplit.  Forces function grouping.

-Qrcd            
		Enables fast floating-point-to-integer conversions. This option
		does not guarantee that any particular rounding mode will be used.

-Qansi_alias[-]  
		-Qansi_alias directs the compiler to assume the following: 
		    - Arrays are not accessed out of bounds. 
		    - Pointers are not cast to non-pointer types, and vice-versa. 
		    - References to objects of two different scalar types cannot alias. 
		      For example, an object of type int cannot alias with an object 
		      of type float, or an object of type float cannot alias with an 
		      object of type double. 
		If your program satisfies the above conditions, setting the -Qansi_alias 
		flag will help the compiler better optimize the program. However, if your
		program does not satisfy one of the above conditions, the -Qansi_alias
		flag may lead the compiler to generate incorrect code.

-Qcxx_features
		Enables both -GX and -GR, described below, so that C++ Run Time Type
		Information (RTTI) and Exception Handling are both enabled.

-GR[-]           
		Enables[disables] C++ Run Time Type Information (RTTI).

-GX[-]           
		Enables[disables] C++ Exception Handling.  Default Disabled.

-fast            
		Maximize speed across the entire program. Turns on -O3, -Qipo,
		-Qprec-div-,  and -QxP.

		To override one of the options set by -fast, specify that option after the 
		-fast option on the command line. The options set by -fast may change from 
		release to release.


-Qfp_port   	 
		Rounds floating-point results at assignments and casts (some speed impact).

-Qprefetch       
		Enable prefetch insertion.  Default ON.

-Qunroll[n]	 
		Specifies the maximum number of times to unroll a loop. n=0 disables
		loop unrolling.  Default: the compiler uses default heuristics when 
		unrolling loops.

-Qoption,tool,optlist 
		-Qoption passes an option specified by optlist to a tool, where
		optlist is a comma-separated list of options.

		tool		Description
		------------------------------------  
		cpp		Specifies the compiler front-end preprocessor 
		c		Specifies the C++ compiler
		asm		Specifies the assembler
		link		Specifies the linker
		optlist		Indicates one or more valid argument strings for the
				designated program. If the argument is a command-line
				option, you must include the hyphen. If the argument
				contains a space or tab character, you must enclose the
				entire argument in quotation characters (""). You must
				separate multiple arguments with commas.

		NOTE: If 'tool' is incorrectly specified, the compiler gives a
		warning and the option is ignored. For example, if
		-Qoption,f,...  is used with the Intel C++ compiler, the
		option is ignored with a warning.
		      
		-Qoption can be used with the -Qipo flag to refine IPO. The valid options
		that can be used for this purpose are:

		-ip_args_in_regs=0        
				Disables the passing of arguments in registers.

		-ip_ninl_max_stats=n      
				Sets the valid max number of intermediate
				language statements for a function that is 
				expanded in line. The number n is a positive
				integer. The number of intermediate language
				statements usually exceeds the actual number of
				source language statements. The default value
				for n is 230. The compiler uses a larger limit
				for user inline functions. 
						
		-ip_ninl_min_stats=n
				Sets the valid min number of intermediate 
				language statements for a function that is 
				expanded in line. The number n is a positive 
				integer. The default values for 
				ip_ninl_min_stats are: 
				IA-32 compiler: ip_ninl_min_stats = 7 
 
		-ip_ninl_max_total_stats=n
				Sets the maximum increase in size of a function,
				measured in intermediate language statements, 
				due to inlining. n is a positive integer whose 
				default value is 2000. 

		      


shlW32M.lib:    
		MicroQuill SmartHeap Library 7.0 available from 
		http://www.microquill.com/

-Zp{1|2|4|8|16}	 
		Specifies the strictest alignment constraint for structure and union 
		types as 1, 2, 4, 8, or 16 bytes. Default is 16.


-arch:SSE        
		Enables the compiler to use SSE instructions.

-arch:SSE2       
		Enables the compiler to use SSE2 instructions.

-Qprec-div[-]    
		Enables[disables] improved precision of floating-point divides.  Disabling may
		slightly improve speed.  Default Enabled.

-Qpc64           
		Enables floating-point significand precision control.  The value is used to round
		the significand to the correct number of bits.  The value must be either 32, 64, 
		or 80.  Default ON.






Description of compiler flags for Intel Fortran Compiler 9.0
------------------------------------------------------------

-O2	
		Optimizes for speed. The -O2 option includes the following options: 
		-Og, -Ot, -Oy, -Ob1, and -Gs.  This option defaults to ON.
		This option also enables:
		* inlining of intrinsics
		* Intra-file interprocedural optimizations including:
		  * inlining
		  * constant propagation
		  * forward substitution
		  * routine attribute propagation
		  * variable address-taken analysis
		  * dead static function elimination
		  * removal of unreferenced variables.
		* The following performance optimizations:
		  * copy propagation
		  * dead-code elimination
		  * global register allocation
		  * global instruction scheduling and control speculation
		  * loop unrolling
		  * optimized code selection
		  * partial redundancy elimination
		  * strength reduction/induction variable simplification
		  * variable renaming
 		  * exception handling optimizations
		  * tail recursions
		  * peephole optimizations
		  * structure assignment lowering and optimizations
		  * dead store elimination

-O3    	
		Optimizes for speed. Enables high-level optimization. This level does 
		not guarantee higher performance. Using this option may increase the
		compilation time. Impact on performance is application dependent; some
		applications may not see a performance improvement.  The optimizations
		include:
		* All optimizations done with -O2
		* loop unrolling, including instruction scheduling
		* code replication to eliminate branches
		* padding the size of certain power-of-two arrays to allow more efficient
		  cache use.
		* When used with -Qax or -Qx, it causes the compiler to perform more aggressive
		  data dependency analysis than for -O2.

-Oa[-] 	
		Assumes [does not assume] no aliasing.

-Ob{0|1|2}	
		Controls the compiler's inline expansion. The amount of inline
		expansion performed varies as follows:
		-Ob0:  Disable inlining.
		-Ob1:  Disables (default) inlining unless -Qip or -Ob2 is
		       specified. Enables inlining of functions.
		-Ob2:  Enables inlining of any function.  However, the 
		       compiler decides which functions to inline.  Enables 
		       interprocedural optimizations and has the same effect as 
		       -Qip.

-Og	
		Enables global optimizations.

-Ot	
		Enables all speed optimizations.

-Oi[-] 	
		Enables/disables inline expansion of intrinsic functions

-Ow[-]	
		Assumes [does not assume] no cross-function aliasing.

-Ox   	
		Same as the -O2 option: enables -Gs, -Ob1, -Og, -Oy, and -Ot.

-Oy[-]	
		Enables [disables] the use of the EBP register in optimizations. When
		you disable with -Oy-, the EBP register is used as frame pointer.

-auto   
		Determines whether local variables are put on the run-time stack.

-Gf	
		Enables string-pooling optimization.

-Gs[n]	
		Disables stack-checking for routines with n or more bytes of local
		variables and compiler temporaries. Default: n=4096

-Gy	
		Packages functions to enable linker optimization.

-fast   
		Maximize speed across the entire program. Turns on -O3, -Qprec-div-, -QxP, and -Qipo.

-Qax{K|W|N|P} 
		Generates specialized code for processor specific codes 
		K, W, N, P while also generating generic IA-32 code. 
		K  = Intel Pentium III and compatible Intel processors
		W  = Intel Pentium 4 and compatible Intel processors
		N  = Intel Pentium 4 and compatible Intel processors. This option also enables
		     advanced data layout and code restructuring optimizations to improve memory
		     accesses for Intel processors.
		P  = Intel Pentium 4 processor with Streaming SIMD Extensions 3 (SSE3) support. This
		     option also enables advanced data layout and code restructuring optimizations
		     to improve memory accesses for Intel processors.
    
-Qx{K|W|N|P} 
		Generate specialized code to run exclusively on processors
		supporting the extensions indicated by <codes> as 
		described above.


-Qip        
		Enables interprocedural optimizations within a single file.

-Qipo      
		Enables multi-file interprocedural (IP) optimizations that include:
		- inline function expansion
		- interprocedural constant propagation
		- monitoring module-level static variables
		- dead code elimination
		- propagation of function characteristics
		- passing arguments in registers
		- loop-invariant code motion

-Qprof_gen       
		Instruments the program for profiling, to collect the execution
		count of each basic block.

-Qprof_use       
		Enables the use of profiling dynamic feedback information
		during optimization.

-Qrcd            
		Enables fast floating-point-to-integer conversions. This option
		does not guarantee that any particular rounding mode will be used.

-Qansi_alias     
		Directs the compiler to assume (default) or not assume that the program 
		adheres to the ANSI Fortran type aliasability rules. For example, an object 
		of type real cannot be accessed as an integer. See the ANSI Standard for
		the complete set of rules.


-Qscalar_rep[-]	 
		Enables[disables] scalar replacement performed during loop
		transformations (requires -O3).

-Qauto
		Causes all variables to be allocated on the stack, rather than 
		in local static storage. Does not affect variables that appear in an 
		EQUIVALENCE or SAVE statement, or those that are in COMMON. Makes all 
		local variables AUTOMATIC; same as /4Ya.


-Qunroll[n]	
		Specifies the maximum number of times to unroll a loop. n=0 disables
		loop unrolling.

-Qprefetch[-]    
		Enables[disables] prefetch insertion (requires -O3).

-Qoption,tool,optlist 
		-Qoption passes an option specified by optlist to a tool, where
		optlist is a comma-separated list of options.

		tool		Description
		------------------------------------  
		fpp		Specifies the Fortran preprocessor 
		f		Specifies the Fortran compiler
		asm		Specifies the assembler
		link		Specifies the linker
		optlist		Indicates one or more valid argument strings for the
				designated tool. You must separate multiple arguments with commas.
		      
		-Qoption can be used with the -Qipo flag to refine IPO. The valid options
		that can be used for this purpose are:

		-ip_args_in_regs=0        
				Disables the passing of arguments in registers.

		-ip_ninl_max_stats=n      
				Sets the valid max number of intermediate
				language statements for a function that is 
				expanded in line. The number n is a positive
				integer. The number of intermediate language
				statements usually exceeds the actual number of
				source language statements. The default value
				for n is 230. The compiler uses a larger limit
				for user inline functions. 
						
		-ip_ninl_min_stats=n      
				Sets the valid min number of intermediate 
				language statements for a function that is 
				expanded in line. The number n is a positive 
				integer. The default values for 
				ip_ninl_min_stats are: 
				IA-32 compiler: ip_ninl_min_stats = 7 
 
		-ip_ninl_max_total_stats=n 
				Sets the maximum increase in size of a function,
				measured in intermediate language statements, 
				due to inlining. n is a positive integer whose 
				default value is 2000. 


shlW32M.lib:    
		MicroQuill SmartHeap Library 7.0 available from 
		http://www.microquill.com/

-Zp{1|2|4|8|16}	 
		Specifies the strictest alignment constraint for structure and union 
		types as 1, 2, 4, 8, or 16 bytes. Default is 16.

-Qprec-div[-]    
		Enables[disables] improved precision of floating-point divides.  Disabling may
		slightly improve speed.  Default Enabled.


Description of compiler flags for Intel C++ Compiler 8.0
-------------------------------------------------------------------------------

-O2
		Optimizes for speed. The -O2 option has the same effect as specifying
		the following options: -Og, -Oi, -Ot, -Oy, -Ob1, -Gf, -Gs, and -Gy.
		This option defaults to ON.

-O3    	
		Optimizes for speed. Enables high-level optimization. This level does 
		not guarantee higher performance. Using this option may increase the
		compilation time. Impact on performance is application dependent; some
		applications may not see a performance improvement.

-Oa[-] 	
		Assumes [does not assume] no aliasing.

-Obn      	
		Controls the compiler's inline expansion. The amount of inline
		expansion performed varies with the value of n as follows:
		0:  Disables inlining.
		1:  Enables (default) inlining of functions declared with the
		    __inline keyword. Also enables inlining according to the
		    C++ language.
		2:  Enables inlining of any function.  However, the 
		    compiler decides which functions to inline.  Enables 
		    interprocedural optimizations and has the same effect as 
		    -Qip.
		Default n=1.

-Og	
		Enables global optimizations.  Default ON.

-Ot	
		Enables all speed optimizations.  Overrides -Os.

-Oi[-] 	
		Enables/disables inline expansion of intrinsic functions.  Default Enabled.

-Ow[-]	
		Assumes [does not assume] no aliasing within functions, but assumes aliasing
		across calls.

-Oy[-]	
		Enables [disables] the use of the EBP register in optimizations. When
		you disable with -Oy-, the EBP register is used as frame pointer. 
		Default Enabled.

-Gf	
		Enables string-pooling optimization.  Default ON.

-Gs[n]	
		Disables stack-checking for routines with n or more bytes of local
		variables and compiler temporaries. Default: n=4096

-Gy	
		Packages functions to enable linker optimization.  Default ON.

-Qax{K|W|N}
		Generates specialized code for processor specific codes 
		K, W, N while also generating generic IA-32 code. 
		K  = Intel Pentium III and compatible Intel processors
		W  = Intel Pentium 4 and compatible Intel processors
		N  = Intel Pentium 4 and compatible Intel processors. These options also enable
		     advanced data layout and code restructuring optimizations to improve memory
		     accesses for Intel processors.
    
-Qx{K|W|N}	
		Generate specialized code to run exclusively on processors
		supporting the extensions indicated by <codes> as 
		described above.


-Qip        
		Enables interprocedural optimizations within a single file.  
		Same as -Ob2.

-Qipo       
		Enables multi-file interprocedural (IP) optimizations that include:
		- inline function expansion
		- interprocedural constant propagation
		- monitoring module-level static variables
		- dead code elimination
		- propagation of function characteristics
		- passing arguments in registers
		- loop-invariant code motion

-Qprof_gen       
		Instruments the program for profiling, to collect the execution
		count of each basic block.

-Qprof_use       
		Enables the use of profiling dynamic feedback information
		during optimization.  Turns on -Qfnsplit.

-Qrcd            
		Enables fast floating-point-to-integer conversions. This option
		does not guarantee that any particular rounding mode will be used.

-Qansi_alias[-]  
		-Qansi_alias directs the compiler to assume [not assume] the following: 
		    - Arrays are not accessed out of bounds. 
		    - Pointers are not cast to non-pointer types, and vice-versa. 
		    - References to objects of two different scalar types cannot alias. 
		      For example, an object of type int cannot alias with an object 
		      of type float, or an object of type float cannot alias with an 
		      object of type double. 
		If your program satisfies the above conditions, setting the -Qansi_alias 
		flag will help the compiler better optimize the program. However, if your
		program does not satisfy one of the above conditions, the -Qansi_alias
		flag may lead the compiler to generate incorrect code.
       		 

-GR[-]           
		Enables[disables] C++ Run Time Type Information (RTTI).

-GX[-]           
		Enables[disables] C++ Exception Handling.

-fast            
		Maximize speed across the entire program. Turns on -O3 and -Qipo.

-Qfp_port   	 
		Rounds floating-point results at assignments and casts (some speed impact).

-Qprefetch       
		Enable prefetch insertion.  Default ON.

-Qunroll[n]	 
		Specifies the maximum number of times to unroll a loop. n=0 disables
		loop unrolling.

-Qoption,tool,optlist 
		-Qoption passes an option specified by optlist to a tool, where
		optlist is a comma-separated list of options.

		tool		Description
		------------------------------------  
		cpp		Specifies the compiler front-end preprocessor 
		c		Specifies the C++ compiler
		asm		Specifies the assembler
		link		Specifies the linker
		optlist		Indicates one or more valid argument strings for the
				designated program. If the argument is a command-line
				option, you must include the hyphen. If the argument
				contains a space or tab character, you must enclose the
				entire argument in quotation characters (""). You must
				separate multiple arguments with commas.
		      

		-Qoption can be used with the -Qipo flag to refine IPO. The valid options
		that can be used for this purpose are:

		-ip_args_in_regs=0		
				Disables the passing of arguments in registers.
		
		-ip_ninl_max_stats=n		
				Sets the valid max number of intermediate
				language statements for a function that is 
				expanded in line. The number n is a positive
				integer. The number of intermediate language
				statements usually exceeds the actual number of
				source language statements. The default value
				for n is 230. The compiler uses a larger limit
				for user inline functions. 
						
		-ip_ninl_min_stats=n      
				Sets the valid min number of intermediate 
				language statements for a function that is 
				expanded in line. The number n is a positive 
				integer. The default values for 
				ip_ninl_min_stats are: 
				IA-32 compiler: ip_ninl_min_stats = 7 
 
		-ip_ninl_max_total_stats=n 
				Sets the maximum increase in size of a function,
				measured in intermediate language statements, 
				due to inlining. n is a positive integer whose 
				default value is 2000. 

		      


shlW32M.lib:    
		MicroQuill SmartHeap Library 7.0 available from 
		http://www.microquill.com/

-Zp{1|2|4|8|16}	 
		Specifies the strictest alignment constraint for structure and union 
		types as 1, 2, 4, 8, or 16 bytes. Default is 16.


-arch:SSE        
		Enables the compiler to use SSE instructions.

-arch:SSE2
		Enables the compiler to use SSE2 instructions.

-EHc             
		Specifies that C functions do not throw exceptions. Default ON.

-G7              
		Target optimization to Intel Pentium 4 processors.  Default ON.

-ML              
		Compiles and links with the static, single-thread C run time library.  Default ON.

-QA              
		Enables all predefined macros and all assertions.  Default ON.

-Qfnsplit        
		Enables function splitting.  Default ON.

-Qms1            
		Instructs the compiler to enable most Microsoft compatibility bugs.  Default ON.

-Qmspp           
		Enables Microsoft C++ 6.0 Processor Pack binary compatibility.  Default ON.

-Qpc64           
		Enables floating-point significand precision control.  The value is used to round
		the significand to the correct number of bits.  The value must be either 32, 64, 
		or 80.  Default ON.

-Qpchi           
		Enables precompiled header files coexistence to reduce build time.  Default ON.

-Qsfalign8	 
		May align the stack for functions with 8- or 16-byte variables.  Default ON.

-Qvc7            
		Enables compatibility with Visual C++ .NET.  Default ON.

-Qvec_report1    
		Indicate vectorized loops in diagnostic information.  Default ON.

-vmb             
		Selects the smallest representation for pointers to members. Use this
		option if you define each class before you declare a pointer to a member of the class.
		Default ON.





Description of compiler flags for Intel C++ Compiler 8.1
----------------------------------------------------------------------------------
-O1    Optimizes for speed, but disables some optimizations that increase 
       code size for a small speed benefit. Includes inline expansion 
       except for intrinsic functions, global optimizations, and string 
       pooling optimizations.  

-O2    This is the default level of optimization.  
       Optimizes for speed. The -O2 option includes -O1 optimizations 
       and in addition enables inlining of intrinsics and more speed 
       optimizations.


-O3    Builds on -O1 and -O2 optimizations by enabling high-level 
       optimization. This level does not guarantee higher performance 
       unless loop and memory access transformations take place. In 
       conjunction with -QaxK/-QxK and -QaxW/-QxW, this switch causes the 
       compiler to perform more aggressive data dependency analysis than 
       for -O2. This may result in longer compilation times. 

-Oa[-] Assumes [does not assume] no aliasing in the program.


-Qax<codes> Generates code specialized for processor extensions 
            specified by <codes> while also generating generic IA-32 code. 
            <codes> includes one or more of the following characters:
    i  Pentium Pro and Pentium II processor instructions
    M  MMX(TM) instructions
    K  streaming SIMD extensions (implies i and M above)
    W  Pentium 4 processor with Streaming SIMD Extensions 2 
       (implies i, M and K)
    N  Pentium 4 processor with Streaming SIMD Extensions 2 
    P  Pentium 4 processor with Streaming SIMD Extensions 3 
    
-Qx<codes>  Generates specialized code to run exclusively on processors
            supporting the extensions indicated by <codes> as 
            described above.

----------------------------------------------------------------------------------
Additional Notes on /QxN and /QxP:
----------------------------------------------------------------------------------
-Qx{N|P}   The /QxN and /QxP options target your program to run on Intel Pentium 4 
           and compatible Intel processors.  The resulting code might 
           contain unconditional use of features that are not supported 
           on other processors.  Programs in which the function main() is 
           compiled with this option will detect incompatible processors 
           and generate an error message during execution.  This option 
           also enables new optimizations in addition to Intel processor 
           specific optimizations.

           These options also enable advanced data layout and code restructuring
           optimizations to improve memory accesses for Intel processors.
----------------------------------------------------------------------------------

-Ob{0|1|2}	Controls the compiler's inline expansion.
		0:  Disables inlining.
		1:  Disables inlining unless -Qip or -Ob2 is specified.
		2:  Enables inlining of any function.  However, the 
		    compiler decides which functions are inlined.  This 
		    option enables interprocedural optimizations and has
		    the same effect as specifying the -Qip option.


-Qip        Enables single-file IP optimizations 
            (within a file; same as -Ob2).

-Qipo       Enables multi-file interprocedural (IP) optimizations that include:
              - inline function expansion
              - interprocedural constant propagation
              - dead code elimination
              - propagation of function characteristics
              - passing arguments in registers
              - loop-invariant code motion

-fast            The /fast option enhances execution speed across the entire program 
                 by including the following options that can improve run-time performance:

                     /O3 (maximum speed and high-level optimizations) 
                     /Qipo (enables interprocedural optimizations across files) 
                     /QxP (generate code specialized for Intel Pentium 4 processor with 
                           Streaming SIMD Extensions 3)

                 To override one of the options set by /fast, specify that option after the 
                 /fast option on the command line. The options set by /fast may change from 
                 release to release.

-Qansi_alias     Directs the compiler to assume that the program
                 adheres to the type-based aliasing rules defined in Section 6.5 of the ISO C
                 Standard.  If your program adheres to these rules, this option will allow
                 the compiler to optimize more aggressively.  If it doesn't adhere to these
                 rules, it can cause the compiler to generate incorrect code.
 

-Qprof_gen       Instruments the program for profiling, as the first phase of 
                 two-phase profile-guided optimization.

-Qprof_use       Instructs the compiler to produce a profile-optimized 
                 executable and merges available dynamic information (.dyn) 
                 files into a pgopti.dpi file. If you perform multiple 
                 executions of the instrumented program, -Qprof_use merges 
                 the dynamic information files again and overwrites the 
                 previous pgopti.dpi file.
                 Without any other options, the current directory is 
                 searched for .dyn files.

-Qrcd           The Intel compiler uses the -Qrcd option to improve the
                performance of code that requires floating-point-to-integer                        
                conversions. 

                The system default floating point rounding mode is
                round-to-nearest. This means that values are rounded during 
                floating point calculations. However, the C language requires 
                floating point values to be truncated when a conversion to an                      
                integer is involved. To do this, the compiler must change the 
                rounding mode to truncation before each floating 
                point-to-integer conversion and change it back afterwards.

                The -Qrcd option disables the change to truncation of the 
                rounding mode for all floating point calculations, including                       
                floating point-to-integer conversions. Turning on this option 
                can improve performance, but floating point conversions to 
                integer will not conform to C semantics.

-Qunroll[n]     Specifies the maximum number of times to unroll a loop. Omit n to 
                let the compiler decide whether to perform unrolling or not. Use
                n = 0 to disable the unroller. 
                If n is not specified, the compiler automatically chooses the maximum 
                number of times to unroll a loop.

-GX             Enables the full C++ Exception Handling unwind semantics. 

-GR             Enables C++ Runtime Type Information (RTTI). 

-Qcxx_features  Enables both -GX and -GR as described above, so that C++ Exception
                Handling and Runtime Type Information are both enabled.

-Zp{1|2|4|8|16} Specifies the strictest alignment constraint for structure and union 
                types as one of the following: 1, 2, 4, 8, or 16 (default) bytes.
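
Example (a sketch under stated assumptions): the effect of a 1-byte alignment
constraint on structure layout. It uses #pragma pack as a source-level
stand-in for -Zp1; treating the two as equivalent for this struct is an
assumption, and the struct and function names are hypothetical.

```c
#include <stddef.h>

/* Default alignment: the compiler usually inserts padding after 'tag'
   so that 'value' starts on its natural boundary. */
struct record_default {
    char tag;
    int  value;
};

/* 1-byte packing: no padding between members. */
#pragma pack(push, 1)
struct record_packed1 {
    char tag;
    int  value;
};
#pragma pack(pop)

size_t size_default(void)  { return sizeof(struct record_default); }
size_t size_packed1(void)  { return sizeof(struct record_packed1); }

/* Offset of 'value' in the packed layout: directly after 'tag'. */
size_t value_offset_packed1(void)
{
    return offsetof(struct record_packed1, value);
}
```

The packed struct occupies sizeof(char) + sizeof(int) bytes; the default
layout is at least that large. Packed members may be misaligned, which can
make accesses slower.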

-Qprefetch[-]   Enables [disables] the insertion of software prefetching by the compiler. 
                Default is /Qprefetch. 

shlW32M.lib:    MicroQuill SmartHeap Library 6.0 available from 
                http://www.microquill.com/


Description of compiler flags for Intel FORTRAN Compiler 8.1
-------------------------------------------------------------
-O1    Optimizes for speed, but disables some optimizations that increase
       code size for a small speed benefit. Includes inline expansion
       except for intrinsic functions, global optimizations, and string
       pooling optimizations.

-O2    This is the default level of optimization.  
       Optimizes for speed. The -O2 option includes O1 optimizations 
       and in addition enables inlining of intrinsics and more speed 
       optimizations.


-O3    Builds on -O1 and -O2 optimizations by enabling high-level
       optimization. This level does not guarantee higher performance
       unless loop and memory access transformations take place. In
       conjunction with -QaxK/-QxK and -QaxW/-QxW, this switch causes the
       compiler to perform more aggressive data dependency analysis than
       for -O2. This may result in longer compilation times.


-Qax<codes> generate code specialized for processor extensions 
specified by <codes> while also generating generic IA-32 code. 
<codes> includes one or more of the following characters:
    i  Pentium Pro and Pentium II processor instructions
    M  MMX(TM) instructions
    K  streaming SIMD extensions (implies i and M above)
    W  Pentium 4 processor with Streaming SIMD Extensions 2 
       (implies i, M and K)
    N  Pentium 4 processor with Streaming SIMD Extensions 2 
    P  Pentium 4 processor with Streaming SIMD Extensions 3 
    
-Qx<codes>  generate specialized code to run exclusively on processors
            supporting the extensions indicated by <codes> as 
            described above.

----------------------------------------------------------------------------------
Additional Notes on /QxN and /QxP:
----------------------------------------------------------------------------------
-Qx{N|P}   The /QxN and /QxP options target your program to run on Intel Pentium 4
           and compatible Intel processors.  The resulting code might
           contain unconditional use of features that are not supported
           on other processors.  Programs in which the function main() is
           compiled with this option detect incompatible processors
           and generate an error message during execution.  This option
           also enables new optimizations in addition to Intel processor
           specific optimizations.

           These options also enable advanced data layout and code restructuring
           optimizations to improve memory accesses for Intel processors.
----------------------------------------------------------------------------------

-Qip        enable single-file IP optimizations (within files, same as -Ob2)

-Qipo       Multi-file IP optimizations that include:
              - inline function expansion
              - interprocedural constant propagation
              - dead code elimination
              - propagation of function characteristics
              - passing arguments in registers
              - loop-invariant code motion

-fast            The /fast option enhances execution speed across the entire program 
                 by including the following options that can improve run-time performance:

                 -O3   (maximum speed and high-level optimizations) 
                 -Qipo (enables interprocedural optimizations across files) 
                 -QxP  (generate code specialized for Intel Pentium 4 processor with 
                        Streaming SIMD Extensions 3)

                 To override one of the options set by /fast, specify that option after the 
                 /fast option on the command line. The options set by /fast may change from 
                 release to release.

-Qansi_alias     Enables (default) or disables the compiler's assumption that the program
                 adheres to the ANSI Fortran type aliasability rules. For example, an object
                 of type real cannot be accessed as an integer. See the ANSI
                 standard for the complete set of rules.

-Qprof_gen       Instruments the program for profiling, for the first phase of
                 two-phase profile-guided optimization.

-Qprof_use       Instructs the compiler to produce a profile-optimized 
                 executable and merges available dynamic information (.dyn) 
                 files into a pgopti.dpi file. If you perform multiple 
                 executions of the instrumented program, -Qprof_use merges 
                 the dynamic information files again and overwrites the 
                 previous pgopti.dpi file.
                 Without any other options, the current directory is 
                 searched for .dyn files.

-Qrcd            Enables fast float-to-int conversion.

-Qscalar_rep[-]  Enables [disables] scalar replacement performed during loop
                 transformations (requires /O3).

-Qauto           Causes all variables to be allocated on the stack, rather than 
                 in local static storage. Does not affect variables that appear in an 
                 EQUIVALENCE or SAVE statement, or those that are in COMMON. Makes all 
                 local variables AUTOMATIC, same as /4Ya.

-Qprefetch[-]   Enables [disables] the insertion of software prefetching by the compiler. 
                Default is /Qprefetch. 


Other Notes: 
------------
"/" and "-" are both allowable starting tokens for flags passed to the 
compiler; for example, -QxK and /QxK are identical switches.


Compiler options for PGI Fortran compiler 6.0 for Windows XP IA32
-----------------------------------------------------------------

The optimization levels and their meanings are as follows:	

-lacml  
		Link with the AMD Core Math Library 2.5.3, packaged with the
		compiler. Also available at www.amd.com

-O0	
		A basic block is generated for each Fortran statement.  No scheduling
		is done between statements.  No global optimizations are performed.

-O1	
		Scheduling within extended basic blocks is performed.  Some register 
		allocation is performed.  No global optimizations are performed.

-O2	
		All level 1 optimizations are performed.  In addition,  scalar
		optimizations such as induction recognition and loop invariant motion 
		are performed by the global optimizer. 
                
-O3	
		This level performs all level-one and level-two optimizations and 
		enables more aggressive hoisting and scalar replacement optimizations.

-fast	 
		Equivalent to "-O2 -Munroll=c:1 -Mnoframe -Mlre" 

-fastsse 
		Equivalent to "-fast -Mscalarsse -Mvect=sse -Mcache_align -Mflushz" 

-Mpfi    
		Generate profile feedback instrumentation; this
		includes extra code to collect run-time statistics to
		be used in a subsequent compile; -Mpfi must also appear
		when the program is linked.  When the program is run, a
		profile feedback file pgfi.out will be generated; see
		-Mpfo.

-Mpfo    
		Enable profile feedback optimizations; there must be a
		profile feedback file pgfi.out in the current
		directory, which contains the result of an execution of
		the program compiled with -Mpfi.

-Mcache_align    
		Align unconstrained objects of length greater than or equal to 16 bytes on
		cache-line boundaries. An unconstrained object is a data object that is not
		a member of an aggregate structure or common block. This option does
		not affect the alignment of allocatable or automatic arrays.

		Note: To effect cache-line alignment of stack-based local variables, the
		main program or function must be compiled with -Mcache_align.

-Mfixed 
		Process source using Fortran fixed-form specifications.

-Mflushz 	 
		Set SSE MXCSR register to flush-to-zero mode.

-Mipa=[option]  
		Enables interprocedural analysis with the specified option. The valid options are:

-Mipa=align  
		Instructs the IPA to recognize when pointer targets are all cache-line 
		aligned, allowing better SSE code generation.

-Mipa=arg  
		Instructs the IPA to remove arguments replaced by -Mipa=ptr,const 

-Mipa=const  
		Enable propagation of constants across procedure calls.

-Mipa=fast  
		Equivalent to: -Mipa=align,arg,const,globals,f90ptr,shape,localarg,ptr,vestigial 

-Mipa=f90ptr
		Enable Fortran 90 pointer disambiguation across procedure calls.
              	
-Mipa=globals  
		Instructs the IPA to optimize references to globals when not used in procedure calls.

-Mipa=inline
		Automatically determine which functions to inline

-Mipa=safe
		Assume unknown function references are safe

-Mipa=localarg  
		Externalizes local variables for use with -Mipa=arg

-Mipa=ptr  
		Instructs the IPA to perform pointer disambiguation across procedure calls.

-Mipa=vestigial  
		Instructs the IPA to eliminate functions that are not called.

-Mlre
		Enables loop-carried redundancy elimination.
	
-Mnoframe  
		Eliminate operations that set up a true stack frame pointer for functions.

-Mnovect
		Disables the vectorizer.

-Mscalarsse   
		Utilize SSE (Streaming SIMD Extensions, where SIMD stands for Single
		Instruction, Multiple Data) and SSE2 instructions to perform the
		operations coded. This implies -Mflushz.

-Munix   
		Use UNIX calling conventions, no trailing underscores.

-Munroll  
		Invokes the loop unroller.  This also sets the optimization level to 2 
		if the level is set to less than 2.
			
		c:m	Instructs the compiler to completely unroll loops with a
			constant loop count less than or equal to m, a supplied constant.
			If this value is not supplied, the m count is set to 4.

		n:u	Instructs the compiler to unroll u times a loop that is
			not completely unrolled or that has a non-constant loop count.
			If u is not supplied, the unroller computes the number of times a
			candidate loop is unrolled.

-Mvect=sse  
		Instructs the vectorizer to search for loops, and where possible,
		use the SSE or SSE2 and prefetch instructions
		(depending on which processor is targeted).


Compiler options for PGI C compiler 6.0 for Windows XP
------------------------------------------------------

The optimization levels and their meanings are as follows:	

-lacml  
		Link with the AMD Core Math Library 2.5.3. Available from www.amd.com

-O0	
		A basic block is generated for each C statement.  No scheduling 
		is done between statements.  No global optimizations are performed.

-O1	
		Scheduling within extended basic blocks is performed.  Some register 
		allocation is performed.  No global optimizations are performed.

-O2	
		All level 1 optimizations are performed.  In addition,  scalar
		optimizations such as induction recognition and loop invariant motion 
		are performed by the global optimizer. 
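
Example (a hand-written sketch, not compiler output): the kind of rewrite that
loop invariant motion and induction variable handling aim for. The function
names are hypothetical; whether the compiler performs exactly this
transformation on a given loop is not guaranteed.

```c
#include <stddef.h>

/* As written: x * y is recomputed and i * stride is multiplied
   in every iteration. */
void fill_naive(int *out, size_t n, int x, int y, int stride)
{
    for (size_t i = 0; i < n; i++)
        out[i] = x * y + (int)i * stride;
}

/* After the rewrite: the invariant product is hoisted out of the loop,
   and the multiplication by i becomes a running addition. */
void fill_hoisted(int *out, size_t n, int x, int y, int stride)
{
    int base = x * y;   /* loop invariant, computed once */
    int step = 0;       /* induction variable standing in for i * stride */

    for (size_t i = 0; i < n; i++) {
        out[i] = base + step;
        step += stride;
    }
}
```

Both functions store the same values; the second does less work per
iteration.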
                
-O3	
		This level performs all level-one and level-two optimizations and 
		enables more aggressive hoisting and scalar replacement optimizations.

-fast	 
		Equivalent to "-O2 -Munroll=c:1 -Mnoframe -Mlre" 

-fastsse 
		Equivalent to "-fast -Mscalarsse -Mvect=sse -Mcache_align -Mflushz" 

-Mpfi    
		Generate profile feedback instrumentation; this
		includes extra code to collect run-time statistics to
		be used in a subsequent compile; -Mpfi must also appear
		when the program is linked.  When the program is run, a
		profile feedback file pgfi.out will be generated; see
		-Mpfo.

-Mpfo    
		Enable profile feedback optimizations; there must be a
		profile feedback file pgfi.out in the current
		directory, which contains the result of an execution of
		the program compiled with -Mpfi.

-Mcache_align    
		Align unconstrained objects of length greater than or equal to 16 bytes on
		cache-line boundaries. An unconstrained object is a data object that is not
		a member of an aggregate structure or common block. This option does
		not affect the alignment of allocatable or automatic arrays.

		Note: To effect cache-line alignment of stack-based local variables, the
		main program or function must be compiled with -Mcache_align.


-Mflushz 	 
		Set SSE MXCSR register to flush-to-zero mode.

-Mipa=[option]  
		Enables interprocedural analysis with the specified option. The valid options are:

-Mipa=align  
		Instructs the IPA to recognize when pointer targets are all cache-line 
		aligned, allowing better SSE code generation.

-Mipa=arg  
		Instructs the IPA to remove arguments replaced by -Mipa=ptr,const 

-Mipa=const  
		Enable propagation of constants across procedure calls.

-Mipa=fast  
		Equivalent to: -Mipa=align,arg,const,globals,f90ptr,shape,localarg,ptr,vestigial 

-Mipa=f90ptr
		Enable Fortran 90 pointer disambiguation across procedure calls.
              	
-Mipa=globals  
		Instructs the IPA to optimize references to globals when not used in procedure calls.

-Mipa=inline
		Automatically determine which functions to inline

-Mipa=safe
		Assume unknown function references are safe

-Mipa=localarg  
		Externalizes local variables for use with -Mipa=arg

-Mipa=ptr  
		Instructs the IPA to perform pointer disambiguation across procedure calls.

-Mipa=vestigial  
		Instructs the IPA to eliminate functions that are not called.
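
Example (an illustrative sketch): what -Mipa=const together with -Mipa=arg
could achieve, written out by hand. All names are hypothetical, and the
specialized form is only a picture of the idea, not the compiler's actual
output.

```c
/* General routine: the multiplier arrives as an argument. */
int scale(int v, int factor)
{
    return v * factor;
}

/* Hypothetical caller: its only call site always passes the constant 8. */
int total_general(const int *a, int n)
{
    int s = 0;
    for (int i = 0; i < n; i++)
        s += scale(a[i], 8);
    return s;
}

/* After propagating the constant across the call, the argument is no
   longer needed and the callee can use the constant directly. */
int scale_by_8(int v)
{
    return v * 8;
}

int total_specialized(const int *a, int n)
{
    int s = 0;
    for (int i = 0; i < n; i++)
        s += scale_by_8(a[i]);
    return s;
}
```

If scale() is then never called anywhere else, it becomes a candidate for
removal of the kind described for -Mipa=vestigial.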

-Mlre
		Enables loop-carried redundancy elimination.
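
Example (a hand-written sketch of the idea): a value loaded in one iteration
is needed again in the next, so it can be carried in a scalar instead of being
reloaded. The function names are hypothetical; b is assumed to hold n + 1
elements and not to overlap a.

```c
#include <stddef.h>

/* As written: b[i + 1] is loaded in iteration i and loaded again
   as b[i] in iteration i + 1. */
void pair_sums(int *a, const int *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + b[i + 1];
}

/* With the redundancy removed: each element of b is loaded once and
   carried over to the next iteration in 'prev'. */
void pair_sums_carried(int *a, const int *b, size_t n)
{
    if (n == 0)
        return;

    int prev = b[0];
    for (size_t i = 0; i < n; i++) {
        int next = b[i + 1];
        a[i] = prev + next;
        prev = next;
    }
}
```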
	
-Mnoframe  
		Eliminate operations that set up a true stack frame pointer for functions.

-Mnovect
		Disables the vectorizer.

-Mscalarsse   
		Utilize SSE (Streaming SIMD Extensions, where SIMD stands for Single
		Instruction, Multiple Data) and SSE2 instructions to perform the
		operations coded. This implies -Mflushz.

-Munix   
		Use UNIX calling conventions, no trailing underscores.

-Munroll  
		Invokes the loop unroller.  This also sets the optimization level to 2 
		if the level is set to less than 2.
			
		c:m	Instructs the compiler to completely unroll loops with a
			constant loop count less than or equal to m, a supplied constant.
			If this value is not supplied, the m count is set to 4.

		n:u	Instructs the compiler to unroll u times a loop that is
			not completely unrolled or that has a non-constant loop count.
			If u is not supplied, the unroller computes the number of times a
			candidate loop is unrolled.
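
Example (an illustrative sketch): complete unrolling of a loop whose constant
count is 4, which is within the default limit for c:m described above. The
function names are hypothetical, and the second function is hand-written, not
compiler output.

```c
/* Loop with a constant count of 4. */
int dot4(const int *x, const int *y)
{
    int s = 0;
    for (int i = 0; i < 4; i++)
        s += x[i] * y[i];
    return s;
}

/* Completely unrolled: no loop counter and no branch remain. */
int dot4_unrolled(const int *x, const int *y)
{
    return x[0] * y[0] + x[1] * y[1] + x[2] * y[2] + x[3] * y[3];
}
```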

-Mvect=sse  
		Instructs the vectorizer to search for loops, and where possible,
		use the SSE or SSE2 and prefetch instructions
		(depending on which processor is targeted).


Description of the 'start' command used for rate runs:
------------------------------------------------------
Starts a separate window to run a specified program or command.

START ["title"] [/D path] [/I] [/MIN] [/MAX] [/SEPARATE | /SHARED]
      [/LOW | /NORMAL | /HIGH | /REALTIME | /ABOVENORMAL | /BELOWNORMAL]
      [/AFFINITY <hex affinity>] [/WAIT] [/B] [command/program]
      [parameters]

    "title"     Title to display in  window title bar.
    path        Starting directory
    B           Start application without creating a new window. The
                application has ^C handling ignored. Unless the application
                enables ^C processing, ^Break is the only way to interrupt
                the application
    I           The new environment will be the original environment passed
                to the cmd.exe and not the current environment.
    MIN         Start window minimized
    MAX         Start window maximized
    SEPARATE    Start 16-bit Windows program in separate memory space
    SHARED      Start 16-bit Windows program in shared memory space
    LOW         Start application in the IDLE priority class
    NORMAL      Start application in the NORMAL priority class
    HIGH        Start application in the HIGH priority class
    REALTIME    Start application in the REALTIME priority class
    ABOVENORMAL Start application in the ABOVENORMAL priority class
    BELOWNORMAL Start application in the BELOWNORMAL priority class
    AFFINITY    The new application will have the specified processor
                affinity mask, expressed as a hexadecimal number.
    WAIT        Start application and wait for it to terminate
    command/program
                If it is an internal cmd command or a batch file then
                the command processor is run with the /K switch to cmd.exe.
                This means that the window will remain after the command
                has been run.

                If it is not an internal cmd command or batch file then
                it is a program and will run as either a windowed application
                or a console application.

    parameters  These are the parameters passed to the command/program
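
Example (a minimal sketch): building the hexadecimal value for /AFFINITY from
a list of processor numbers. It assumes the usual convention that bit i of the
mask stands for logical processor i; the function names are hypothetical and
are not part of any Windows API.

```c
#include <stdio.h>

/* Build an affinity mask with one bit set per listed processor index.
   Indices outside 0..63 are ignored. */
unsigned long long affinity_mask(const int *cpus, int count)
{
    unsigned long long mask = 0;
    for (int i = 0; i < count; i++)
        if (cpus[i] >= 0 && cpus[i] < 64)
            mask |= 1ULL << cpus[i];
    return mask;
}

/* Write the mask as hexadecimal digits into buf.
   Returns the value returned by snprintf. */
int format_affinity(unsigned long long mask, char *buf, size_t size)
{
    return snprintf(buf, size, "%llX", mask);
}
```

For a rate run that pins one copy to each of processors 0 through 3, the
masks are 1, 2, 4 and 8; a copy allowed on processors 0 and 2 would get
mask 5.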





Portability options for CPU2000:
-------------------------------
176.gcc:     
         -Dalloca=_alloca : so as to use the built-in optimized alloca
         /Fn              : 176.gcc uses alloca, and this option tells
                            the linker to pre-allocate n bytes of stack.
                            The default amount of stack allocated is not
                            enough, and 176.gcc crashes with a run-time
                            error without it.

178.galgel: 
   -Mfixed                : Assume fixed-format source


186.crafty: 
   -DNT_i386              : Specifies that it is a Windows NT Intel 
                            processor-based system which makes the compiler 
                            use "long long" as the 64-bit variable that 
                            186.crafty needs.        

253.perlbmk: 
   -DSPEC_CPU2000_NTOS    : Enables the code changes needed for porting to
                            Windows.
   -DPERLDLL              : On Windows, we need a single perl.exe instead of a
                            perl.exe plus perl.dll. This pre-define enables
                            the changes necessary to get a single,
                            UNIX-style executable without the
                            indirect calls that can cause a 10% performance
                            degradation. This allows the Windows-based
                            executable to be as close as possible to
                            the Unix-based one.
   /MT                    : Use the static multi-threaded library; otherwise
                            the benchmark will not compile.

254.gap:
   -DSYS_HAS_CALLOC_PROTO :  
   -DSYS_HAS_MALLOC_PROTO : These two pre-defines tell of the existence 
                            of malloc and calloc prototypes.