Description of compiler	flags for Intel	C++ Compiler 8.0
--------------------------------------------------------

-O2
		Optimizes for speed. The -O2 option has	the same effect	as specifying
		the following options: -Og, -Oi, -Ot, -Oy, -Ob1, -Gf, -Gs, and -Gy.
		This option defaults to ON.

-O3
		Optimizes for speed. Enables high-level	optimization. This level does
		not guarantee higher performance. Using	this option may	increase the
		compilation time. Impact on performance is application dependent; some
		applications may not see a performance improvement.

-Oa[-]
		Assume [not assume] no aliasing

-Obn
		Controls the compiler's	inline expansion. The amount of	inline
		expansion performed varies with	the value of n as follows:
		0:  Disables inlining.
		1:  Enables (default) inlining of functions declared with the
		    __inline keyword. Also enables inlining according to the
		    C++	language.
		2:  Enables inlining of	any function.  However,	the
		    compiler decides which functions to	inline.	 Enables
		    interprocedural optimizations and has the same effect as
		    -Qip.
		Default	n=1.

-Og
		Enables	global optimizations.  Default ON.

-Ot
		Enables	all speed optimizations.  Overrides -Os

-Oi[-]
		Enables/disables inline	expansion of intrinsic functions.  Default Enabled.

-Ow[-]
		Assume[not assume] no aliasing within functions, but assume aliasing
		across calls.

-Oy[-]
		Enables	[disables] the use of the EBP register in optimizations. When
		you disable with -Oy-, the EBP register	is used	as frame pointer.
		Default	Enabled.

-Gf
		Enables	string-pooling optimization.  Default ON.

-Gs[n]
		Disables stack-checking	for routines with n or more bytes of local
		variables and compiler temporaries. Default: n=4096

-Gy
		Packages functions to enable linker optimization.  Default ON.

-Qax{K|W|N}
		Generates specialized code for the processors indicated by the
		codes K, W, N while also generating generic IA-32 code.
		K  = Intel Pentium III and compatible Intel processors
		W  = Intel Pentium 4 and compatible Intel processors
		N  = Intel Pentium 4 and compatible Intel processors. These options also enable
		     advanced data layout and code restructuring optimizations to improve memory
		     accesses for Intel processors.

-Qx{K|W|N}
		Generate specialized code to run exclusively on	processors
		supporting the extensions indicated by <codes> as
		described above.


-Qip
		Enables interprocedural optimizations within a single file.
		Same as	-Ob2.

-Qipo
		Enables multi-file interprocedural (IP) optimizations that include:
		- inline function expansion
		- interprocedural constant propagation
		- monitoring module-level static variables
		- dead code elimination
		- propagation of function characteristics
		- passing arguments in registers
		- loop-invariant code motion
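
		As a rough illustration of one item in the list above, loop-invariant
		code motion, the sketch below shows the same computation before and
		after the invariant expression is moved out of the loop. It is written
		in Python only for readability; the function names are hypothetical,
		and the compiler applies the transformation to its own intermediate
		representation, not to source text.

```python
def scale_naive(values, a, b):
    """Recomputes a * b + 1 on every iteration although it never changes."""
    out = []
    for v in values:
        factor = a * b + 1  # loop-invariant: depends only on a and b
        out.append(v * factor)
    return out


def scale_hoisted(values, a, b):
    """Same result with the invariant expression computed once, before the loop."""
    factor = a * b + 1  # hoisted out of the loop
    out = []
    for v in values:
        out.append(v * factor)
    return out
```

		Both functions return identical results; only the amount of repeated
		work inside the loop differs.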

-Qprof_gen
		Instruments the program for profiling, to obtain the execution
		count of each basic block.

-Qprof_use
		Enables	the use	of profiling dynamic feedback information
		during optimization.  Turns on -Qfnsplit.

-Qrcd
		Enables [disables] fast floating-point to integer
		conversions. This option does not guarantee that
		any particular rounding mode will be used.

-Qansi_alias[-]
		-Qansi_alias directs the compiler to assume[not	assume]	the following:
		    - Arrays are not accessed out of bounds.
		    - Pointers are not cast to non-pointer types, and vice-versa.
		    - References to objects of two different scalar types cannot alias.
		      For example, an object of	type int cannot	alias with an object
		      of type float, or	an object of type float	cannot alias with an
		      object of	type double.
		If your	program	satisfies the above conditions,	setting	the -Qansi_alias
		flag will help the compiler better optimize the	program. However, if your
		program	does not satisfy one of	the above conditions, the -Qansi_alias
		flag may lead the compiler to generate incorrect code.
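
		To make the third condition concrete, the following sketch shows what
		it means for an int and a float to share storage: the same four bytes
		give unrelated values depending on the type used to read them. This is
		a Python model, not compiler code; the helper name float_bits_as_int
		is hypothetical, and little-endian IEEE 754 single precision is
		assumed. C or C++ code that reads a float object through an int
		pointer does the equivalent, which is what -Qansi_alias assumes never
		happens.

```python
import struct


def float_bits_as_int(value):
    """Reinterpret the 4 bytes of a single-precision float as a signed 32-bit int."""
    raw = struct.pack("<f", value)      # 4 bytes holding the float
    return struct.unpack("<i", raw)[0]  # the same 4 bytes read as an int


# The float 1.0 and the integer read from the same bytes are unrelated values.
print(hex(float_bits_as_int(1.0)))  # prints 0x3f800000
```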


-GR[-]
		Enables[disables] C++ Run Time Type Information	(RTTI).

-GX[-]
		Enables[disables] C++ Exception	Handling.

-fast
		Maximize speed across the entire program. Turns	on -O3 and -Qipo.

-Qfp_port
		Rounds floating-point results at assignments and casts (some speed impact).

-Qprefetch
		Enable prefetch	insertion.  Default ON.

-Qunroll[n]
		Specifies the maximum number of	times to unroll	a loop.	n=0 disables
		loop unrolling.
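
		The sketch below illustrates what unrolling a loop four times means
		conceptually: the body is replicated so that fewer loop-control
		checks are executed, and a remainder loop handles the leftover
		iterations. It is a hand-written Python model with hypothetical
		function names, not output of the compiler, and the factor 4 is chosen
		only for the example.

```python
def sum_rolled(values):
    """Reference loop: one element per iteration."""
    total = 0
    for v in values:
        total += v
    return total


def sum_unrolled_by_4(values):
    """Same reduction with the loop body replicated four times."""
    total = 0
    n = len(values)
    i = 0
    # Main loop: four elements per iteration.
    while i + 4 <= n:
        total += values[i] + values[i + 1] + values[i + 2] + values[i + 3]
        i += 4
    # Remainder loop: handles the last n % 4 elements.
    while i < n:
        total += values[i]
        i += 1
    return total
```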

-Qoption,tool,optlist
		-Qoption passes	an option specified by optlist to a tool, where
		optlist	is a comma-separated list of options.

		tool		Description
		------------------------------------
		cpp		Specifies the compiler front-end preprocessor
		c		Specifies the C++ compiler
		asm		Specifies the assembler
		link		Specifies the linker
		optlist		Indicates one or more valid argument strings for the
				designated program. If the argument is a command-line
				option, you must include the hyphen. If the argument
				contains a space or tab character, you must enclose the
				entire argument in quotation characters (""). You must
				separate multiple arguments with commas.

		NOTE: If 'tool' is incorrectly specified, the compiler gives a
		warning and the option is ignored. For example, if
		-Qoption,f,... is used with the Intel C++ compiler, the
		option is ignored with a warning.

		-Qoption can be	used with the -Qipo flag to refine IPO.	The valid options
		that can be used for this purpose are:

		-ip_args_in_regs=0
				Disables the passing of	arguments in registers.

		-ip_ninl_max_stats=n
				Sets the valid max number of intermediate
				language statements for	a function that	is
				expanded in line. The number n is a positive
				integer. The number of intermediate language
				statements usually exceeds the actual number of
				source language	statements. The	default	value
				for n is 230. The compiler uses	a larger limit
				for user inline	functions.

		-ip_ninl_min_stats=n
				Sets the valid min number of intermediate
				language statements for	a function that	is
				expanded in line. The number n is a positive
				integer. The default values for
				ip_ninl_min_stats are:
				IA-32 compiler:	ip_ninl_min_stats = 7

		-ip_ninl_max_total_stats=n
				Sets the maximum increase in size of a function,
				measured in intermediate language statements,
				due to inlining. n is a	positive integer whose
				default	value is 2000.
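
		As a minimal sketch of the -Qoption,tool,optlist syntax described
		above, the helper below joins a tool name and its options with commas
		and quotes any argument that contains a space or tab. The function
		name qoption and the numeric values 500 and 3000 are made up for the
		example; they are not recommended settings.

```python
def qoption(tool, *options):
    """Join a tool name and its options into one -Qoption argument."""
    quoted = []
    for opt in options:
        # Arguments containing a space or tab are enclosed in quotation marks.
        if " " in opt or "\t" in opt:
            opt = '"' + opt + '"'
        quoted.append(opt)
    return ",".join(["-Qoption", tool] + quoted)


print(qoption("c", "-ip_ninl_max_stats=500", "-ip_ninl_max_total_stats=3000"))
# prints -Qoption,c,-ip_ninl_max_stats=500,-ip_ninl_max_total_stats=3000
```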




shlW32M.lib:
		MicroQuill SmartHeap Library 7.0 available from
		http://www.microquill.com/

-Zp{1|2|4|8|16}
		Specifies the strictest	alignment constraint for structure and union
		types as 1, 2, 4, 8, or 16 bytes. Default is 16.
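
		The effect of the alignment limit can be sketched with a small layout
		calculation. The model below assumes every member is a scalar whose
		natural alignment equals its size, and caps that alignment at the -Zp
		value; the function name struct_size is hypothetical, and real
		compilers handle more cases (nested structures, arrays, bit-fields)
		than this sketch does.

```python
def struct_size(member_sizes, pack):
    """Size in bytes of a structure whose member alignment is capped at 'pack'."""
    offset = 0
    max_align = 1
    for size in member_sizes:
        align = min(size, pack)  # natural alignment, limited by the -Zp value
        offset = (offset + align - 1) // align * align  # pad up to the alignment
        offset += size
        max_align = max(max_align, align)
    # Trailing padding so that arrays of the structure stay aligned.
    return (offset + max_align - 1) // max_align * max_align


# A structure holding a char (1 byte), an int (4 bytes), and a double (8 bytes).
for pack in (1, 2, 4, 8, 16):
    print(pack, struct_size([1, 4, 8], pack))
```

		Under these assumptions the structure occupies 13 bytes with a limit
		of 1, 14 bytes with a limit of 2, and 16 bytes with a limit of 4, 8,
		or 16.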


-arch:SSE
		Enables	the compiler to	use SSE	instructions.

-arch:SSE2
		Enables	the compiler to	use SSE2 instructions.

-EHc
		Specifies that C functions do not throw	exceptions. Default ON.

-G7
		Target optimization to Intel Pentium 4 processors.  Default ON.

-ML
		Compiles and links with	the static, single-thread C run	time library.  Default ON.

-QA
		Enables	all predefined macros and all assertions.  Default ON.

-Qfnsplit
		Enables	function splitting.  Default ON.

-Qms1
		Instructs the compiler to enable most Microsoft compatibility bugs.  Default ON.

-Qmspp
		Enables Microsoft C++ 6.0 Processor Pack binary compatibility.  Default ON.

-Qpc64
		Enables	floating-point significand precision control.  The value is used to round
		the significand	to the correct number of bits.	The value must be either 32, 64,
		or 80.	Default	ON.

-Qpchi
		Enables	precompiled header files coexistence to	reduce build time.  Default ON.

-Qsfalign8
		May align stack	for functions with 8 or	16 byte	vars.  Default ON.

-Qvc7
		Enables compatibility with Visual C++ .NET.  Default ON.

-Qvec_report1
		Indicate vectorized loops in diagnostic	information.  Default ON.

-vmb
		Selects	the smallest representation for	pointers to members. Use this
		option if you define each class	before you declare a pointer to	a member of the	class.
		Default	ON.




Description of compiler	flags for Intel	C++ Compiler 9.0
--------------------------------------------------------

-O2
		Optimizes for speed. The -O2 option includes the following options:
		-Og, -Oi-, -Os, -Oy, -Ob1, and -Gs. This option defaults to ON.
		This option also enables:
		* inlining of intrinsics
		* Intra-file interprocedural optimizations including:
		  * inlining
		  * constant propagation
		  * forward substitution
		  * routine attribute propagation
		  * variable address-taken analysis
		  * dead static	function elimination
		  * removal of unreferenced variables
		* The following performance optimizations:
		  * copy propagation
		  * dead-code elimination
		  * global register allocation
		  * global instruction scheduling and control speculation
		  * loop unrolling
		  * optimized code selection
		  * partial redundancy elimination
		  * strength reduction/induction variable simplification
		  * variable renaming
		  * exception handling optimizations
		  * tail recursions
		  * peephole optimizations
		  * structure assignment lowering and optimizations
		  * dead store elimination

-O3
		Optimizes for speed. Enables high-level	optimization. This level does
		not guarantee higher performance. Using	this option may	increase the
		compilation time. Impact on performance is application dependent; some
		applications may not see a performance improvement.  The optimizations
		include:
		* All optimizations done with -O2
		* loop unrolling, including instruction	scheduling
		* code replication to eliminate	branches
		* padding the size of certain power-of-two arrays to allow more	efficient
		  cache	use.
		* When used with -Qax or -Qx, it causes	the compiler to	perform	more aggressive
		  data dependency analysis than	for -O2.

-Oa[-]
		Assume [not assume] no aliasing.  Default Disabled.

-Obn
		Controls the compiler's	inline expansion. The amount of	inline
		expansion performed varies with	the value of n as follows:
		0:  Disables inlining.	Statement functions are	always inlined.
		1:  Enables (default) inlining of functions declared with the
		    __inline keyword. Also enables inlining according to the
		    C++	language.
		2:  Enables inlining of	any function.  However,	the
		    compiler decides which functions to	inline.	 Enables
		    interprocedural optimizations and has the same effect as
		    -Qip.
		Default	n=2.

-Og
		Enables	global optimizations.  Default ON.

-Ot
		Enables	all speed optimizations.  Overrides -Os

-Oi[-]
		Enables/disables inline	expansion of intrinsic functions.  Default Enabled.

-Ow[-]
		Assume[not assume] no cross function aliasing.

-Oy[-]
		Enables	[disables] the use of the EBP register in optimizations. When
		you disable with -Oy-, the EBP register is used as frame pointer.  -Oy- has
		the effect of reducing the number of available general-purpose registers by 1,
		and can produce slightly less efficient code.
		Default	Enabled.

-Gf
		Enables	string-pooling optimization.

-Gs[n]
		Disables stack-checking	for routines with n or more bytes of local
		variables and compiler temporaries. Default: n=4096

-Gy
		Packages functions to enable linker optimization.  Default ON.

-Qax{K|W|N}
		Generates specialized code for the processors indicated by the
		codes K, W, N while also generating generic IA-32 code.
		K  = Intel Pentium III and compatible Intel processors
		W  = Intel Pentium 4 and compatible Intel processors
		N  = Intel Pentium 4 and compatible Intel processors. These options also enable
		     advanced data layout and code restructuring optimizations to improve memory
		     accesses for Intel	processors.

-Qx{K|W|N}
		Generate specialized code to run exclusively on	processors
		supporting the extensions indicated by <codes> as
		described above.


-Qip
		Enables interprocedural optimizations within a single file.

-Qipo
		Enables multi-file interprocedural (IP) optimizations, which allow inline function expansion for
		calls to functions defined in separate files.  The compiler decides whether to create
		one or more object files based on an estimate of the size of the application.  It
		generates one object file for small applications and two for large ones.

-Qprof_gen
		Instruments the program for profiling, to obtain the execution
		count of each basic block.

-Qprof_use
		Enables	the use	of profiling dynamic feedback information
		during optimization.  Turns on -Qfnsplit.  Forces function grouping.

-Qrcd
		Enables [disables] fast floating-point to integer
		conversions. This option does not guarantee that
		any particular rounding mode will be used.

-Qansi_alias[-]
		-Qansi_alias directs the compiler to assume the	following:
		    - Arrays are not accessed out of bounds.
		    - Pointers are not cast to non-pointer types, and vice-versa.
		    - References to objects of two different scalar types cannot alias.
		      For example, an object of	type int cannot	alias with an object
		      of type float, or	an object of type float	cannot alias with an
		      object of	type double.
		If your	program	satisfies the above conditions,	setting	the -Qansi_alias
		flag will help the compiler better optimize the	program. However, if your
		program	does not satisfy one of	the above conditions, the -Qansi_alias
		flag may lead the compiler to generate incorrect code.


-GR[-]
		Enables[disables] C++ Run Time Type Information	(RTTI).

-GX[-]
		Enables[disables] C++ Exception	Handling.  Default Disabled.

-fast
		Maximize speed across the entire program. Turns	on -O3,	-Qipo,
		-Qprec-div-,  and -QxP.

-Qfp_port
		Rounds floating-point results at assignments and casts (some speed impact).

-Qprefetch
		Enable prefetch	insertion.  Default ON.

-Qunroll[n]
		Specifies the maximum number of	times to unroll	a loop.	n=0 disables
		loop unrolling.	 Default: the compiler uses default heuristics when
		unrolling loops.

-Qoption,tool,optlist
		-Qoption passes	an option specified by optlist to a tool, where
		optlist	is a comma-separated list of options.

		tool		Description
		------------------------------------
		cpp		Specifies the compiler front-end preprocessor
		c		Specifies the C++ compiler
		asm		Specifies the assembler
		link		Specifies the linker
		optlist		Indicates one or more valid argument strings for the
				designated program. If the argument is a command-line
				option, you must include the hyphen. If the argument
				contains a space or tab character, you must enclose the
				entire argument in quotation characters (""). You must
				separate multiple arguments with commas.

		-Qoption can be	used with the -Qipo flag to refine IPO.	The valid options
		that can be used for this purpose are:

		-ip_args_in_regs=0
				Disables the passing of	arguments in registers.

		-ip_ninl_max_stats=n
				Sets the valid max number of intermediate
				language statements for	a function that	is
				expanded in line. The number n is a positive
				integer. The number of intermediate language
				statements usually exceeds the actual number of
				source language	statements. The	default	value
				for n is 230. The compiler uses	a larger limit
				for user inline	functions.

		-ip_ninl_min_stats=n
				Sets the valid min number of intermediate
				language statements for	a function that	is
				expanded in line. The number n is a positive
				integer. The default values for
				ip_ninl_min_stats are:
				IA-32 compiler:	ip_ninl_min_stats = 7

		-ip_ninl_max_total_stats=n
				Sets the maximum increase in size of a function,
				measured in intermediate language statements,
				due to inlining. n is a	positive integer whose
				default	value is 2000.




shlW32M.lib:
		MicroQuill SmartHeap Library 7.0 available from
		http://www.microquill.com/

-Zp{1|2|4|8|16}
		Specifies the strictest	alignment constraint for structure and union
		types as 1, 2, 4, 8, or 16 bytes. Default is 16.


-arch:SSE
		Enables	the compiler to	use SSE	instructions.

-arch:SSE2
		Enables	the compiler to	use SSE2 instructions.

-Qprec-div[-]
		Enables[disables] improved precision of	floating-point divides.	 Disabling may
		slightly improve speed.	 Default Enabled.

-Qpc64
		Enables	floating-point significand precision control.  The value is used to round
		the significand	to the correct number of bits.	The value must be either 32, 64,
		or 80.	Default	ON.






Description of compiler	flags for Intel	Fortran	Compiler 9.0
------------------------------------------------------------

-O2
		Optimizes for speed. The -O2 option includes the following options:
		-Og, -Ot, -Oy, -Ob1, and -Gs. This option defaults to ON.
		This option also enables:
		* inlining of intrinsics
		* Intra-file interprocedural optimizations including:
		  * inlining
		  * constant propagation
		  * forward substitution
		  * routine attribute propagation
		  * variable address-taken analysis
		  * dead static	function elimination
		  * removal of unreferenced variables
		* The following performance optimizations:
		  * copy propagation
		  * dead-code elimination
		  * global register allocation
		  * global instruction scheduling and control speculation
		  * loop unrolling
		  * optimized code selection
		  * partial redundancy elimination
		  * strength reduction/induction variable simplification
		  * variable renaming
		  * exception handling optimizations
		  * tail recursions
		  * peephole optimizations
		  * structure assignment lowering and optimizations
		  * dead store elimination

-O3
		Optimizes for speed. Enables high-level	optimization. This level does
		not guarantee higher performance. Using	this option may	increase the
		compilation time. Impact on performance is application dependent; some
		applications may not see a performance improvement.  The optimizations
		include:
		* All optimizations done with -O2
		* loop unrolling, including instruction	scheduling
		* code replication to eliminate	branches
		* padding the size of certain power-of-two arrays to allow more	efficient
		  cache	use.
		* When used with -Qax or -Qx, it causes	the compiler to	perform	more aggressive
		  data dependency analysis than	for -O2.

-Oa[-]
		Assume [not assume] no aliasing

-Ob{0|1|2}
		Controls the compiler's	inline expansion. The amount of	inline
		expansion performed varies as follows:
		-Ob0:  Disable inlining.
		-Ob1:  Disables	(default) inlining unless -Qip or -Ob2 is
		       specified. Enables inlining of functions.
		-Ob2:  Enables inlining	of any function.  However, the
		       compiler	decides	which functions	to inline.  Enables
		       interprocedural optimizations and has the same effect as
		       -Qip.

-Og
		Enables	global optimizations.

-Ot
		Enables	all speed optimizations.

-Oi[-]
		Enables/disables inline	expansion of intrinsic functions

-Ow[-]
		Assume[not assume] no cross-function aliasing.

-Ox
		Same as the -O2 option: enables -Gs, -Ob1, -Og, -Oy, and -Ot.

-Oy[-]
		Enables	[disables] the use of the EBP register in optimizations. When
		you disable with -Oy-, the EBP register	is used	as frame pointer.

-auto
		Determines whether local variables are put on the run-time stack.

-Gf
		Enables	string-pooling optimization.

-Gs[n]
		Disables stack-checking	for routines with n or more bytes of local
		variables and compiler temporaries. Default: n=4096

-Gy
		Packages functions to enable linker optimization.

-fast
		Maximize speed across the entire program. Turns	on -O3,	-Qprec-div-, -QxP, and -Qipo.

-Qax{K|W|N|P}
		Generates specialized code for the processors indicated by the
		codes K, W, N, P while also generating generic IA-32 code.
		K  = Intel Pentium III and compatible Intel processors
		W  = Intel Pentium 4 and compatible Intel processors
		N  = Intel Pentium 4 and compatible Intel processors. These options also enable
		     advanced data layout and code restructuring optimizations to improve memory
		     accesses for Intel processors.
		P  = Intel Pentium 4 processor with Streaming SIMD Extensions 3 (SSE3) support.
		     These options also enable advanced data layout and code restructuring
		     optimizations to improve memory accesses for Intel processors.

-Qx{K|W|N|P}
		Generate specialized code to run exclusively on	processors
		supporting the extensions indicated by <codes> as
		described above.


-Qip
		Enables interprocedural optimizations within a single file.

-Qipo
		Enables multi-file interprocedural (IP) optimizations that include:
		- inline function expansion
		- interprocedural constant propagation
		- monitoring module-level static variables
		- dead code elimination
		- propagation of function characteristics
		- passing arguments in registers
		- loop-invariant code motion

-Qprof_gen
		Instruments the program for profiling, to obtain the execution
		count of each basic block.

-Qprof_use
		Enables	the use	of profiling dynamic feedback information
		during optimization.

-Qrcd
		Enables [disables] fast floating-point to integer
		conversions. This option does not guarantee that
		any particular rounding mode will be used.

-Qansi_alias
		Enables (default) or disables the compiler's assumption that the program
		adheres to the ANSI Fortran type aliasability rules. For example, an object
		of type real cannot be accessed as an integer. See the ANSI Standard for
		the complete set of rules.


-Qscalar_rep[-]
		Enables[disables] scalar replacement performed during loop
		transformations (requires -O3).

-Qunroll[n]
		Specifies the maximum number of	times to unroll	a loop.	n=0 disables
		loop unrolling.

-Qprefetch[-]
		Enables[disables] prefetch insertion (requires -O3).

-Qoption,tool,optlist
		-Qoption passes	an option specified by optlist to a tool, where
		optlist	is a comma-separated list of options.

		tool		Description
		------------------------------------
		fpp		Specifies the Fortran preprocessor
		f		Specifies the Fortran compiler
		asm		Specifies the assembler
		link		Specifies the linker
		optlist		Indicates one or more valid argument strings for the
				designated tool. You must separate multiple arguments with commas.

		-Qoption can be used with the -Qipo flag to refine IPO. The valid options
		that can be used for this purpose are:

		-ip_args_in_regs=0
				Disables the passing of	arguments in registers.

		-ip_ninl_max_stats=n
				Sets the valid max number of intermediate
				language statements for	a function that	is
				expanded in line. The number n is a positive
				integer. The number of intermediate language
				statements usually exceeds the actual number of
				source language	statements. The	default	value
				for n is 230. The compiler uses	a larger limit
				for user inline	functions.

		-ip_ninl_min_stats=n
				Sets the valid min number of intermediate
				language statements for	a function that	is
				expanded in line. The number n is a positive
				integer. The default values for
				ip_ninl_min_stats are:
				IA-32 compiler:	ip_ninl_min_stats = 7

		-ip_ninl_max_total_stats=n
				Sets the maximum increase in size of a function,
				measured in intermediate language statements,
				due to inlining. n is a	positive integer whose
				default	value is 2000.


shlW32M.lib:
		MicroQuill SmartHeap Library 7.0 available from
		http://www.microquill.com/

-Zp{1|2|4|8|16}
		Specifies the strictest	alignment constraint for structure and union
		types as 1, 2, 4, 8, or 16 bytes. Default is 16.

-Qprec-div[-]
		Enables[disables] improved precision of	floating-point divides.	 Disabling may
		slightly improve speed.	 Default Enabled.



Other Notes:
------------
"/" and	"-" are	both allowable starting	tokens for flags passed	to the
compiler; for example, -QxK and /QxK are identical switches.


Compiler options for PGI Fortran compiler 6.0 for Windows XP IA32
-----------------------------------------------------------------

The optimization levels	and their meanings are as follows:

-lacml
		Link with the AMD Core Math Library 2.5.3, packaged with the
		compiler. Also available at www.amd.com

-O0
		A basic block is generated for each Fortran statement.  No scheduling
		is done between statements.  No global optimizations are performed.

-O1
		Scheduling within extended basic blocks	is performed.  Some register
		allocation is performed.  No global optimizations are performed.

-O2
		All level 1 optimizations are performed.  In addition,	scalar
		optimizations such as induction	recognition and	loop invariant motion
		are performed by the global optimizer.

-O3
		This level performs all	level-one and level-two	optimizations and
		enables	more aggressive	hoisting and scalar replacement	optimizations.

-fast
		Equivalent to "-O2 -Munroll=c:1	-Mnoframe -Mlre"

-fastsse
		Equivalent to "-fast -Mscalarsse -Mvect=sse -Mcache_align -Mflushz"

-Mpfi
		Generate profile feedback instrumentation; this
		includes extra code to collect run-time	statistics to
		be used	in a subsequent	compile; -Mpfi must also appear
		when the program is linked.  When the program is run, a
		profile	feedback file pgfi.out will be generated; see
		-Mpfo.

-Mpfo
		Enable profile feedback	optimizations; there must be a
		profile	feedback file pgfi.out in the current
		directory, which contains the result of	an execution of
		the program compiled with -Mpfi.

-Mcache_align
		Align unconstrained objects of length greater than or equal to 16 bytes	on
		cache-line boundaries. An unconstrained	object is a data object	that is	not
		a member of an aggregate structure or common block. This option	does
		not affect the alignment of allocatable	or automatic arrays.

		Note: To effect	cache-line alignment of	stack-based local variables, the
		main program or	function must be compiled with -Mcache_align.

-Mfixed
		Process source using Fortran fixed-form specifications.

-Mflushz
		Set SSE	MXCSR register to flush-to-zero	mode.
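
		As a rough software model of flush-to-zero, the sketch below replaces
		any single-precision denormal value with a zero of the same sign. The
		real mode is a bit in the MXCSR register that affects the results of
		SSE operations in hardware; the function name flush_to_zero_f32 is
		hypothetical and the code only imitates the visible effect.

```python
import math
import struct

SMALLEST_NORMAL_F32 = 2.0 ** -126  # smallest positive normal single-precision value


def flush_to_zero_f32(x):
    """Model flush-to-zero: a denormal single-precision value becomes a signed zero."""
    if x != 0.0 and abs(x) < SMALLEST_NORMAL_F32:
        return math.copysign(0.0, x)
    return x


# Bit pattern 0x00000001 is the smallest positive single-precision denormal.
tiny = struct.unpack("<f", struct.pack("<I", 1))[0]
print(tiny > 0.0, flush_to_zero_f32(tiny))
```

		Values at or above the smallest normal magnitude pass through
		unchanged; only the denormal range collapses to zero.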

-Mipa=[option]
		Enables	interprocedural	analysis with the specified option. The	valid options are:

-Mipa=align
		Instructs the IPA to recognize when pointer targets are	all cache-line
		aligned, allowing better SSE code generation.

-Mipa=arg
		Instructs the IPA to remove arguments replaced by -Mipa=ptr,const

-Mipa=const
		Enable propagation of constants	across procedure calls.

-Mipa=fast
		Equivalent to: -Mipa=align,arg,const,globals,f90ptr,shape,localarg,ptr,vestigial

-Mipa=f90ptr
		Enable Fortran 90 pointer disambiguation across procedure calls.

-Mipa=globals
		Instructs the IPA to optimize references to globals when not used in procedure calls.

-Mipa=inline
		Automatically determine	which functions	to inline

-Mipa=safe
		Assume unknown function	references are safe

-Mipa=localarg
		Externalizes local variables for use with -Mipa=arg

-Mipa=ptr
		Instructs the IPA to perform pointer disambiguation across procedure calls.

-Mipa=vestigial
		Instructs the IPA to eliminate functions that are not called.

-Mlre
		Enables	loop-carried redundancy	elimination.

-Mnoframe
		Eliminate operations that set up a true	stack frame pointer for	functions.

-Mnovect
		Disables the vectorizer.

-Mscalarsse
		Use the SSE (Streaming SIMD Extensions, where SIMD stands for Single
		Instruction Multiple Data) and SSE2 instructions to perform the
		operations coded. This implies -Mflushz.

-Munix
		Use UNIX calling conventions, no trailing underscores.

-Munroll
		Invokes	the loop unroller.  This also sets the optimization level to 2
		if the level is	set to less than 2.

		c:m	Instructs the compiler to completely unroll loops with a
			constant loop count less than or equal to m, a supplied	constant.
			If this	value is not supplied, the m count is set to 4.

		n:u	Instructs the compiler to unroll u times a loop that is
			not completely unrolled or that has a non-constant loop count.
			If u is not supplied, the unroller computes the number of times a
			candidate loop is unrolled.

-Mvect=sse
		Instructs the vectorizer to search for loops, and where	possible,
		use the	SSE or SSE2 and	prefetch instructions
		(depending on which processor is targeted).


Compiler options for PGI C compiler 6.0	for Windows XP
------------------------------------------------------

The optimization levels	and their meanings are as follows:

-lacml
		Link with the AMD Core Math Library 2.5.3. Available from www.amd.com

-O0
		A basic	block is generated for each C statement.  No scheduling
		is done	between	statements.  No	global optimizations are performed.

-O1
		Scheduling within extended basic blocks	is performed.  Some register
		allocation is performed.  No global optimizations are performed.

-O2
		All level 1 optimizations are performed.  In addition,	scalar
		optimizations such as induction	recognition and	loop invariant motion
		are performed by the global optimizer.

-O3
		This level performs all	level-one and level-two	optimizations and
		enables	more aggressive	hoisting and scalar replacement	optimizations.

-fast
		Equivalent to "-O2 -Munroll=c:1	-Mnoframe -Mlre"

-fastsse
		Equivalent to "-fast -Mscalarsse -Mvect=sse -Mcache_align -Mflushz"

-Mpfi
		Generate profile feedback instrumentation; this
		includes extra code to collect run-time	statistics to
		be used	in a subsequent	compile; -Mpfi must also appear
		when the program is linked.  When the program is run, a
		profile	feedback file pgfi.out will be generated; see
		-Mpfo.

-Mpfo
		Enable profile feedback	optimizations; there must be a
		profile	feedback file pgfi.out in the current
		directory, which contains the result of	an execution of
		the program compiled with -Mpfi.

-Mcache_align
		Align unconstrained objects of length greater than or equal to 16 bytes	on
		cache-line boundaries. An unconstrained	object is a data object	that is	not
		a member of an aggregate structure or common block. This option	does
		not affect the alignment of allocatable	or automatic arrays.

		Note: To effect	cache-line alignment of	stack-based local variables, the
		main program or	function must be compiled with -Mcache_align.


-Mflushz
		Set SSE	MXCSR register to flush-to-zero	mode.

-Mipa=[option]
		Enables	interprocedural	analysis with the specified option. The	valid options are:

-Mipa=align
		Instructs the IPA to recognize when pointer targets are	all cache-line
		aligned, allowing better SSE code generation.

-Mipa=arg
		Instructs the IPA to remove arguments replaced by -Mipa=ptr,const

-Mipa=const
		Enable propagation of constants	across procedure calls.

-Mipa=fast
		Equivalent to: -Mipa=align,arg,const,globals,f90ptr,shape,localarg,ptr,vestigial

-Mipa=f90ptr
		Enable Fortran 90 pointer disambiguation across procedure calls.

-Mipa=globals
		Instructs the IPA to optimize references to globals when not used in procedure calls.

-Mipa=inline
		Automatically determine	which functions	to inline

-Mipa=safe
		Assume unknown function	references are safe

-Mipa=localarg
		Externalizes local variables for use with -Mipa=arg

-Mipa=ptr
		Instructs the IPA to perform pointer disambiguation across procedure calls.

-Mipa=vestigial
		Instructs the IPA to eliminate functions that are not called.

-Mlre
		Enables	loop-carried redundancy	elimination.

-Mnoframe
		Eliminate operations that set up a true	stack frame pointer for	functions.

-Mnovect
		Disables the vectorizer.

-Mscalarsse
		Use the SSE (Streaming SIMD Extensions, where SIMD stands for Single
		Instruction Multiple Data) and SSE2 instructions to perform the
		operations coded. This implies -Mflushz.

-Munix
		Use UNIX calling conventions, no trailing underscores.

-Munroll
		Invokes	the loop unroller.  This also sets the optimization level to 2
		if the level is	set to less than 2.

		c:m	Instructs the compiler to completely unroll loops with a
			constant loop count less than or equal to m, a supplied	constant.
			If this	value is not supplied, the m count is set to 4.

		n:u	Instructs the compiler to unroll u times a loop that is
			not completely unrolled or that has a non-constant loop count.
			If u is not supplied, the unroller computes the number of times a
			candidate loop is unrolled.

-Mvect=sse
		Instructs the vectorizer to search for loops, and where	possible,
		use the	SSE or SSE2 and	prefetch instructions
		(depending on which processor is targeted).


Description of the 'start' command used	for rate runs:
------------------------------------------------------
Starts a separate window to run	a specified program or command.

START ["title"]	[/D path] [/I] [/MIN] [/MAX] [/SEPARATE	| /SHARED]
      [/LOW | /NORMAL |	/HIGH |	/REALTIME | /ABOVENORMAL | /BELOWNORMAL]
      [/AFFINITY <hex affinity>] [/WAIT] [/B] [command/program]
      [parameters]

    "title"	Title to display in  window title bar.
    path	Starting directory
    B		Start application without creating a new window. The
		application has	^C handling ignored. Unless the	application
		enables	^C processing, ^Break is the only way to interrupt
		the application
    I		The new	environment will be the	original environment passed
		to the cmd.exe and not the current environment.
    MIN		Start window minimized
    MAX		Start window maximized
    SEPARATE	Start 16-bit Windows program in	separate memory	space
    SHARED	Start 16-bit Windows program in	shared memory space
    LOW		Start application in the IDLE priority class
    NORMAL	Start application in the NORMAL	priority class
    HIGH	Start application in the HIGH priority class
    REALTIME	Start application in the REALTIME priority class
    ABOVENORMAL	Start application in the ABOVENORMAL priority class
    BELOWNORMAL	Start application in the BELOWNORMAL priority class
    AFFINITY	The new	application will have the specified processor
		affinity mask, expressed as a hexadecimal number.
    WAIT	Start application and wait for it to terminate
    command/program
		If it is an internal cmd command or a batch file then
		the command processor is run with the /K switch	to cmd.exe.
		This means that	the window will	remain after the command
		has been run.

		If it is not an	internal cmd command or	batch file then
		it is a	program	and will run as	either a windowed application
		or a console application.

    parameters	These are the parameters passed	to the command/program





Portability options for	CPU2000:
-------------------------------
176.gcc:
	 -Dalloca=_alloca : so as to use the built-in optimized	alloca
	 /Fn		  : 176.gcc uses alloca, and this option tells
			    the linker to pre-allocate n bytes of stack.
			    The default stack allocation is not large
			    enough, and 176.gcc crashes with a run-time
			    error.

178.galgel:
   -Mfixed		  : Assume fixed-format source


186.crafty:
   -DNT_i386		  : Specifies that it is a Windows NT Intel
			    processor-based system which makes the compiler
			    use	"long long" as the 64-bit variable that
			    186.crafty needs.

253.perlbmk:
   -DSPEC_CPU2000_NTOS	  : Includes the code changes needed for porting
			    to Windows.
   -DPERLDLL		  : On Windows, we need a single perl.exe instead
			    of a perl.exe plus a perl.dll. This pre-define
			    makes the changes necessary to get a single,
			    UNIX-style executable, avoiding the
			    indirect calls that can cause a 10% performance
			    degradation. This allows the Windows-based
			    executable to be as close as possible to
			    the Unix-based one.
   /MT			  : Use the static multi-threaded library; otherwise
			    253.perlbmk will not compile.

254.gap:
   -DSYS_HAS_CALLOC_PROTO :
   -DSYS_HAS_MALLOC_PROTO : These two pre-defines tell of the existence
			    of malloc and calloc prototypes.

Other:
------
submit= specperl -e "system sprintf qq{start /b /wait /affinity %x %s}, (1<<$SPECUSERNUM), qq{ $command }"

'start /b /wait /affinity' is used in the SPEC submit command to bind processes to CPU(s).
When running multiple copies of benchmarks, the SPEC config file feature
submit is sometimes used to cause individual jobs to be bound to specific
processors:

submit= causes the SPEC tools to use this line when submitting jobs.

$SPECUSERNUM: the SPEC tools-assigned number for this copy of the benchmark.
     
specperl -e "system sprintf".... : generates the command line from $SPECUSERNUM and
                                   the actual SPEC run command.
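
The command line that the submit entry builds can be sketched as follows.
This is a Python illustration of the Perl sprintf in the submit line, not
part of the SPEC tools; the function names and the sample command
"benchmark.exe" are hypothetical. It assumes that $SPECUSERNUM starts at 0
and that each copy is bound to the single processor whose bit is set in
the mask.

```python
def submit_command(copy_number, command):
    """Mirror sprintf qq{start /b /wait /affinity %x %s}: the mask is
    1 shifted left by the copy number, printed in hexadecimal. The spaces
    around the command come from qq{ $command } in the submit line."""
    if copy_number < 0:
        raise ValueError("copy number must be non-negative")
    return "start /b /wait /affinity %x %s" % (1 << copy_number,
                                               " " + command + " ")


def processors_in_mask(hex_mask):
    """Return the processor numbers whose bits are set in a hex mask."""
    mask = int(hex_mask, 16)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


if __name__ == "__main__":
    # Copies 0..4 get masks 1, 2, 4, 8 and 10 (hexadecimal).
    for copy in range(5):
        print(submit_command(copy, "benchmark.exe"))
```

Because the mask is printed with %x, copy 4 receives the mask 10
(hexadecimal), which selects processor 4 only.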

START 
starts a program in another session or window.  START offers a large
number of switches to control the session you start.

	/B
	The program is started without creating a new window or console. Normally,
	the application is started in its own window.

	/AFFINITY <mask>
	On multiple processor machines, set the processor affinity
	for this process. The value is a hexadecimal bit mask in which bit k
	selects processor k, for k from 0 to n-1 (where n is the number of
	available processors).

	/WAIT
	Wait for the new session or window to finish before continuing.