-----------------------------------------------------------------------------------
Fujitsu PRIMEPOWER flags/tunables description			(May.09 2005)

(Each section is sorted in case-insensitive alphabetical order)

Table of Contents
[1] Fujitsu Parallelnavi 2.3 compiler flag description
[2] Sun Studio 9 flag description
[3] Environment Variables
[4] Kernel Parameters (/etc/system)
[5] Commands for feedback control

-----------------------------------------------------------------------------------
[1] Fujitsu Parallelnavi 2.3 compiler flag description

Compiler options        Remark
-----------------------------------------------------------------------------------
[Fortran]
-Am                     Required if a source file contains modules which
                        will be referenced by USE statements in other
                        source files or if a source file contains USE
                        statements that reference modules in another
                        source file. The -Am option creates a module
                        information file (module_name.mod) for each module
                        compiled, either in the current directory or in a
                        directory specified by the -Mdirectory option.
                        frt searches for module information files in the
                        current directory and also in directories speci-
                        fied by the -Mdirectory and/or -Idirectory
                        options.

-dy/-dn                 -dy specifies dynamic linking in the linker.  -dn
                        specifies static linking in the linker.  The default
                        is -dy.  This option and its argument are passed to
                        the linker.

-f omitmsg              Set the level of diagnostic messages output and inhibit
                        specific messages.
                        omitmsg can be one of the characters i, w, or s, and/or
                        a list of msgnum.  If several arguments are specified,
                        they must be delimited by commas.

                        i    All messages are output.  This is the default.

                        w    i level messages are not output.

                        s    i and w level messages are not output.

                        msgnum
                        Message number msgnum is inhibited.  msgnum must
                        be an i or w level message.

-Fixed                  Specifies that Fortran source programs are written in
                        fixed source form.
                        If file.f or file.F is specified as a Fortran source
                        program, the -Fixed option is effective by default.

-K opt                  Control specific optimizations and code generation.
                        If several of these are specified at the same time, they
                        must be delimited by commas.

   alignc[=N]           Aligns global data on an N-byte boundary.
                        N can be specified from 1 to 32768.

   alignl[=N]           Aligns local data on an N-byte boundary.
                        N can be specified from 1 to 32768.

   commonpad[=N]        Inserts padding elements in common blocks for
                        efficient use of cache.  N can be specified from 4
                        to 4096 bytes.  When N is omitted, the compiler
                        automatically determines a suitable value.

   dalign               Generates instructions assuming that eight-byte
                        integer data, double-precision real data, double-
                        precision complex data, quadruple-precision real
                        data or quadruple-precision complex data referred
                        to by dummy arguments or pointers is aligned on
                        eight-byte boundaries.

   eval                 Optimizes by changing the method of operator
                        evaluation.  Specifying this option may cause side
                        effects (precision errors and runtime exceptions)
                        that lead to unintended execution results.  This
                        option is effective only if the -O option is also
                        specified.

   fast_GP2[={0|1|2|3}] Specifies the best optimization level suitable for
                        the system equipped with SPARC64 V. 0, 1, 2 or 3
                        can be specified for the argument level.  If the
                        level is not specified, 1 is used.
                        Moreover, -Kprefetch_model=kind is automatically
                        chosen according to the compiling machine.
                        -Kprefetch_model=L is chosen when compiling with
                        the system which is not equipped with SPARC64 V.

                        0      Induces -O4 -Kfsimple,dalign,ns,fuse,mfunc,
                               prefetch,SPARC64_GP2,V8PLUS,VIS1,gs
                               options.

                        1      Induces -O4 -Kfsimple,dalign,ns,fuse,mfunc,
                               prefetch,SPARC64_GP2,V8PLUS,VIS1,gs
                               options.

                        2      Induces -O4 -Kfsimple,dalign,ns,fuse,mfunc,
                               prefetch,SPARC64_GP2,V8PLUS,VIS1,gs,eval
                               options.
        
                        3      Induces -O4 -Kfsimple,dalign,ns,fuse,mfunc,
                               prefetch,SPARC64_GP2,V8PLUS,VIS1,gs,eval,
                               preex options.
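
The level table above can be read as a base option set plus per-level
additions.  The following is a minimal sketch of that reading, not part of
the compiler: the helper name expand_fast_gp2 is hypothetical, and writing
each suboption as a separate -K flag assumes that -Ka,b is equivalent to
-Ka -Kb.

```python
# Option sets transcribed from the Fortran -Kfast_GP2 table above.
BASE = ["-O4", "-Kfsimple", "-Kdalign", "-Kns", "-Kfuse", "-Kmfunc",
        "-Kprefetch", "-KSPARC64_GP2", "-KV8PLUS", "-KVIS1", "-Kgs"]

# Levels 0 and 1 add nothing; level 2 adds eval; level 3 adds eval and preex.
EXTRA = {0: [], 1: [], 2: ["-Keval"], 3: ["-Keval", "-Kpreex"]}


def expand_fast_gp2(level=1):
    """Return the options induced by -Kfast_GP2=level (hypothetical helper).

    The default of 1 mirrors the rule that an omitted level means 1.
    """
    if level not in EXTRA:
        raise ValueError("level must be 0, 1, 2 or 3")
    return BASE + EXTRA[level]


if __name__ == "__main__":
    print(" ".join(expand_fast_gp2(3)))
```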

   FMADD                Uses the combined multiply-add/subtract
                        floating-point instructions.  Either the -KV8PLUS
                        option or the -KV9 option must also be specified.

   frecipro             Converts a floating-point division into a
                        multiplication by the reciprocal.

   fsimple              Creates an object program by applying
                        optimizations that simplify floating-point
                        operations.

   fuse                 Fuses neighboring loops.
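
As an illustration of loop fusion in general (a sketch, not the compiler's
own transformation), two adjacent loops over the same index range can be
merged into one loop, so each element is visited once.  The function names
are hypothetical.

```python
def separate_loops(a, b):
    """Two neighboring loops over the same range."""
    n = len(a)
    scaled = [0.0] * n
    for i in range(n):
        scaled[i] = 2.0 * a[i]
    total = [0.0] * n
    for i in range(n):
        total[i] = scaled[i] + b[i]
    return scaled, total


def fused_loop(a, b):
    """The same computation after fusing the two loops into one."""
    n = len(a)
    scaled = [0.0] * n
    total = [0.0] * n
    for i in range(n):
        scaled[i] = 2.0 * a[i]
        total[i] = scaled[i] + b[i]
    return scaled, total
```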


   GREG_SYSTEM          Makes global registers g5 through g7 (g6 and g7
                        when -KV9 is in effect) subject to register
                        allocation at compile time.  These registers are
                        reserved for the system.

   gs                   Performs global instruction scheduling. It is
                        ignored if the -O5 option is specified after it.

   largepage[=level]    Creates an executable file that uses the
                        Parallelnavi largepage functionality.  1 or 2 can
                        be specified for the argument level.  If the level
                        is not specified, 1 is used.

                        1      The largepage functionality applies to the
                               data and the heap areas.

                        2      In addition to -Klargepage=1, the largepage
                               functionality applies to the stack area.

   mfunc                Converts intrinsic functions (including the power
                        operation) into multi-operation functions.
                        Single precision real type LOG and EXP, and double
                        precision real type LOG, EXP and power operation
                        are targets for this optimization.  Either the
                        -KV8PLUS option or the -KV9 option must also be
                        specified.

   NOFMADD              Suppresses use of the combined multiply-
                        add/subtract floating-point instructions.
                        -KNOFMADD is default.

   nomfunc              Suppresses conversion of an intrinsic function or
                        an operation into a multi-operation function.
                        -Knomfunc is default.

   noprefetch           Suppresses use of the prefetch instruction.

   nounroll             Prevents loop unrolling optimizations.

   ns                   Initialize the FPU in non-standard mode of operation.
                        Used mostly to suppress underflow interruptions.

   preex                Optimizes by moving the evaluation of invariant
                        expressions ahead of branch instructions.

   pg                   Generates instructions to produce a profile file for
                        subsequent optimization (global instruction scheduling
                        etc.).

   prefetch[=level]     Generates prefetch instructions at the specified
                        level.  If -KSPARC64_GP is in effect and level is
                        not specified, the compiler selects level 2.
                        If -KSPARC64_GP2 is in effect and level is not
                        specified, the compiler selects level 3.

                        1: Basic prefetching for array elements in the
                           innermost loop only.

                        2: In addition to -Kprefetch=1, generates prefetch
                           instructions in the loop pre-header for array
                           elements accessed in the first iteration of the
                           loop.

                        3: In addition to -Kprefetch=2, when the access
                           stride for array elements is larger than the
                           cache line size, the compiler generates a
                           prefetch instruction for each cache line
                           accessed.

                        4: In addition to -Kprefetch=3, prefetch with
                           address calculation is executed.

                        5: In addition to -Kprefetch=4, prefetching is
                           applied to array data which are accessed
                           indirectly.

   prefetch_cache_level=N
                        Specifies the cache level into which data is
                        prefetched.  This option applies when -KSPARC64_GP2
                        and -Kprefetch={2|3|4|5} are in effect.  N can be
                        specified as follows:

                        1      Data is prefetched into the first-level
                               cache using the normal prefetch instruction.

                        2      Data is prefetched into the second-level
                               cache only, using a prefetch instruction
                               that targets only the second-level cache.

                        3      Both 1 and 2 are in effect.  Two kinds of
                               prefetch instruction are used, so that
                               prefetching is more aggressive.

   prefetch_infer       The compiler assumes that memory accesses are
                        contiguous and generates prefetch instructions.

   pu                   Optimizes by using an existing profile information
                        file.

   SPARC64_GP2          Optimization for SPARC64 V is applied.

   unroll[=N]           Performs loop unrolling.  N is the upper limit on
                        the number of times a loop body is expanded, and
                        must be from 2 to 100.  When N is omitted, the
                        compiler automatically determines a suitable
                        value.  Default is -Knounroll if -O0 or -O1 is
                        specified, and -Kunroll if -O2 or higher is
                        specified.
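
To make the unrolling factor N concrete, the sketch below unrolls a
summation loop by hand with a factor of 4 and a remainder loop for the
leftover iterations.  It illustrates the general technique only; the
function names and the factor are assumptions chosen for the example.

```python
def sum_rolled(x):
    """Original loop: one element per iteration."""
    total = 0
    for i in range(len(x)):
        total += x[i]
    return total


def sum_unrolled4(x):
    """The same loop unrolled by a factor of 4."""
    total = 0
    n = len(x)
    main = n - n % 4              # largest multiple of 4 not exceeding n
    for i in range(0, main, 4):   # four elements per iteration
        total += x[i]
        total += x[i + 1]
        total += x[i + 2]
        total += x[i + 3]
    for i in range(main, n):      # remainder loop for the last n % 4 elements
        total += x[i]
    return total
```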

   V8PLUS               Indicates that SPARC V8+ instructions should be used.

   V9                   Indicates that SPARC V9 instructions should be used.

   VIS1                 Generates Visual Instruction Set (VIS) version 1.0
                        instructions.  Either the -KV8PLUS option or the
                        -KV9 option must also be specified.

-O[level]               Specifies the optimization level.
                        0, 1, 2, 3, 4 or 5 can be specified for level.  If
                        level is omitted, level 3 is assumed.  If the -O option
                        is not specified, -O2 is assumed.  If the -O option is
                        specified together with the -g or -Kcover option, the
                        level is treated as 0 and optimization is not done.  In
                        addition to the -O option, the Fortran compiler
                        supports the -K and -x options for optimization.

                        0    No optimization.

                        1    Basic optimization.

                        2    Loop unrolling in addition to -O1.

                        3    Global instruction scheduling, loop tiling and
                             restructuring of nested loop in addition to -O2.

                        4    Further loop restructuring, such as full
                             unrolling and loop splitting to promote loop
                             interchange, in addition to -O3.

                        5    Creates an object program by applying further
                             optimizations of register allocation in addition
                             to -O4.

-SSL2                   Adds the whole set of routines from SSL II, SSL II
                        Thread-Parallel Capabilities and BLAS/LAPACK to the
                        link-edit libraries.

-x inline               Expands calls to external, internal and module
                        procedures inline at the corresponding lines in the
                        calling procedure.
                        -, pgm1[,pgm2] ..., stno, dsizeK or
                        dir=dirname1[,dir=dirname2] ...  can be specified in
                        the argument inline.  If several are specified they
                        must be delimited by commas.

-                       Expands user-defined procedures which have 30 or
                        fewer executable statements.

pgm                     Only expands the procedures specified by the
                        argument pgm.  For an external procedure, specify
                        the external procedure name.  For an internal
                        procedure, specify host procedure name + '.' +
                        internal procedure name.  For an internal procedure
                        in a module procedure, specify module name + '.' +
                        module procedure name + '.' + internal procedure
                        name.  pgm may be combined with stno or dsizeK.

-x stno                 Expands user-defined procedures which have stno or
                        fewer executable statements.  stno may be combined
                        with pgm or dsizeK.

-x dir=dirname          Performs inline expansion of procedures that are
                        defined in files under the directory dirname and
                        referenced in the file currently being compiled.
                        Only files under the directory whose suffixes are
                        .f, .for, .f90 or .f95 are targets of this
                        optimization.  The suboption dir=dirname can be
                        specified multiple times, delimited by commas, in
                        which case files under all of the directories are
                        targets of this optimization.

[C]

-K opt                  The -K option can use multiple parameters. For example,
                        -Klib,PIC can be used instead of -Klib -KPIC.
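
The comma rule can be sketched as a small expansion helper.  This is only
an illustration of the equivalence stated above; the function name
split_k_option is hypothetical and is not part of any compiler driver.

```python
def split_k_option(option):
    """Expand a combined -K option into one -K flag per suboption."""
    if not option.startswith("-K"):
        raise ValueError("expected an option of the form -Ka[,b...]")
    # Text after -K is a comma-delimited list of suboptions.
    return ["-K" + sub for sub in option[2:].split(",") if sub]


if __name__ == "__main__":
    print(split_k_option("-Klib,PIC"))
```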

   cfunc                Uses the high-speed mathematical functions and
                        library functions (malloc, calloc, realloc, free)
                        provided by this compilation system.  This option
                        functionally includes the -Kmfunc option.  This
                        option is effective if the -Kmfunc option is also
                        specified.  This option is ignored if the -mt,
                        -KOMP or -Kparallel option is also specified.

   crossfile            Specifies crossfile optimization.  If a program
                        consists of several files, the compiler reads
                        these files together and analyzes data dependency
                        and control relations across them.  This option is
                        effective only if the -O option is also specified.
                        When this option is specified, -Kiopt and -Kxi=N
                        are assumed.  When N is omitted, the compiler
                        automatically determines a suitable value.  This
                        option is ignored if the -g, -KV9, -KOMP or
                        -Kparallel option is also specified.  This
                        function can be used under the Parallelnavi
                        environment.

   dalign               Generates instructions assuming that eight-byte
                        integer data, double-precision real data, double-
                        precision complex data, quadruple-precision real
                        data or quadruple-precision complex data referred
                        to by dummy arguments or pointers is aligned on
                        eight-byte boundaries.

   eval                 Optimizes by changing the method of operator
                        evaluation.  Specifying this option may cause side
                        effects (precision errors and runtime exceptions)
                        that lead to unintended execution results.  This
                        option is effective only if the -O option is also
                        specified.

   fast_GP2[={0|1|2|3}] Performs optimization for the SPARC64 V series.
                        When -Kfast_GP2 is specified without a level,
                        -Kfast_GP2=1 is assumed.
                        This option is ignored if the -g or -Kcover option
                        is also specified.  This option overrides the -O0,
                        -O1, -O2 and -KV8 options.  -Kfast_GP2=3 can be
                        used under the Parallelnavi environment.

                        -Kfast_GP2=0
                        Performs the same optimization as the -O3 -Klib
                        -Kdalign -KSPARC64_GP2 -KV8PLUS -Kgs options.

                        -Kfast_GP2=1
                        Performs the same optimization as the -O3 -Klib
                        -Kdalign -KSPARC64_GP2 -KV8PLUS -Kgs options.

                        -Kfast_GP2=2
                        Applies the -Keval option in addition to
                        -Kfast_GP2=1.

                        -Kfast_GP2=3
                        Applies the -Kcrossfile option in addition to
                        -Kfast_GP2=1.  -Kfast_GP2=1 is assumed if the
                        -KV9, -KOMP or -Kparallel option is also
                        specified.

   GREG                 The global registers g2 through g7 (g2, g3, g6 and
                        g7 when the -KV9 option is in effect) are subject
                        to register allocation at compile time.  This is
                        equivalent to specifying the
                        -KGREG_APPLI,GREG_SYSTEM option.

   gs                   Performs global instruction scheduling. It is
                        ignored if the -O5 option is specified after it.

   lib                  Recognizes the operation of the standard functions
                        and replaces them with faster, inline-expanded
                        versions.  If a user-defined function with the
                        same name as a standard function is used,
                        unintended results may occur.  This option is
                        effective only if the -O option is also specified.

   pg                   Generates an instruction sequence that produces
                        profile information, which the compiler later
                        uses to perform optimization (global instruction
                        scheduling, etc.).  This option is effective only
                        if the -O option is also specified.

   preex                Optimizes by moving the evaluation of invariant
                        expressions ahead of branches.  Specifying this
                        option may cause side effects in the execution
                        results, leading to results unintended by the
                        user.  This option is effective only if the -O
                        option is also specified.
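
The side effect mentioned for preex can be pictured with a guarded
division.  In the sketch below, Python's ZeroDivisionError stands in for a
runtime exception; the two functions are hand-written models of code before
and after the expression is moved, not compiler output.

```python
def guarded(n, d):
    """The expression n / d is evaluated only when the branch is taken."""
    if d != 0.0:
        return n / d
    return 0.0


def hoisted(n, d):
    """The same expression evaluated ahead of the branch."""
    q = n / d          # now executed even when d == 0.0
    if d != 0.0:
        return q
    return 0.0


if __name__ == "__main__":
    print(guarded(1.0, 0.0))   # the guard protects the division
    try:
        hoisted(1.0, 0.0)
    except ZeroDivisionError:
        print("hoisted evaluation raised an exception")
```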

   pu[=file]            Performs optimization (global instruction
                        scheduling, etc.) using the program runtime
                        profile information obtained by specifying the
                        -Kpg option.  If both the -Kpg and -Kcrossfile
                        options are specified, the name of the profile
                        file produced with -Kpg must be specified as
                        file.  The number of CPUs and the maximum number
                        of threads must not change between the runs made
                        with -Kpg and -Kpu.  This option is effective only
                        if the -O option is also specified.

   SPARC64_GP2          Optimization for SPARC64 V is applied.

   V8PLUS               Indicates that SPARC V8+ instructions should be used.

   V9                   Indicates that SPARC V9 instructions should be used.

-O[n]                   In n, specify the level of optimization as 0, 1, 2, 3,
                        4 or 5.  When -O is specified without n, -O3 is
                        assumed.  The higher the level of optimization, the
                        shorter the execution time and the longer the
                        compile time.  The higher levels of optimization
                        functionally include the lower levels.  This option
                        does not apply to .s files.

                        Optimization level 0
                        No optimization is performed.  This is equivalent
                        to not specifying the -O option.

                        Optimization level 1
                        Optimization is performed through detailed
                        analysis of program control flow.

                        Optimization level 2
                        In addition to the optimization of level 1, the
                        following optimization is performed:
                        - Loop unrolling

                        This may increase the object size.

                        Optimization level 3
                        In addition to the optimization of level 2, the
                        following optimizations are performed:
                        - Loop unrolling (expanded)
                        - Software pipelining
                        - Repeated application of optimization functions

                        Repeated application of optimization functions
                        means that the optimization functions performed in
                        optimization level 1 are repeatedly performed
                        until there is no room for further optimization.

                        Optimization level 4
                        In addition to the optimization of level 3, the
                        following optimizations are performed:
                        - Loop splitting to promote loop interchange
                        - -KGREG_APPLI option is assumed.

                        Optimization level 5
                        In addition to the optimization of level 4, the
                        following optimization is performed:
                        - Register allocation (expanded)

-----------------------------------------------------------------------------------
[2] Sun Studio 9 flag description

Compiler options	Remark
-----------------------------------------------------------------------------------

cc			Invoke the Sun Studio 9 Compiler C 
(C compiler)

CC			Invoke the Sun Studio 9 Compiler C++ 
(C++ compiler)

-crit			Enable optimization of critical control paths 
(optimizer)

-dalign			Assume data is naturally aligned. 
(C, C++, Fortran)

-Dalloca=__builtin_alloca
(Portability flag)	Portability switch, used for 176.gcc:
			allow use of compiler's internal builtin alloca.

-depend			Synonym for -xdepend.
(Fortran)

-DHOST_WORDS_BIG_ENDIAN	Portability switch, used for 176.gcc:
(Portability flag)	controls how bytes are numbered within a word. 

-D__MATHERR_ERRNO_DONTCARE	
(C)			Allows the compiler to assume that your code
			does not rely on setting of the errno variable.

-DSPEC_CPU2000_SOLARIS	Portability switch, used for 253.perlbmk:
(Portability flag)	selects header files and code paths compatible
			with Solaris.
			

-DSUN			Portability switch, used for 186.crafty:
(Portability flag)	selects header files and code paths
			compatible with Solaris. 

-DSYS_HAS_CALLOC_PROTO	Portability switch, used for 254.gap:
(Portability flag)	allows use of the designated prototype.

-DSYS_HAS_IOCTL_PROTO	Portability switch, used for 254.gap:
(Portability flag)	allows use of the designated prototype.

-DSYS_HAS_SIGNAL_PROTO	Portability switch, used for 254.gap: 
(Portability flag)	allows use of the designated prototype.

-DSYS_HAS_TIME_PROTO	Portability switch, used for 254.gap:
(Portability flag)	allows use of the designated prototype.

-DSYS_IS_USG		Portability switch, used for 254.gap:
(Portability flag)	selects code compatible with USG-based systems. 

-e			Portability switch, used for 178.galgel:
(Portability, Fortran)	allows source lines to be up to 132 characters long. 

f90			Invoke the Sun Studio 9 Compiler Fortran 90
(Fortran compiler)

-fast			A convenience option, this switch selects the
(C)			following switches that are defined elsewhere
			in this page: 

			-D__MATHERR_ERRNO_DONTCARE
			-fns
			-fsimple=2
			-fsingle
			-xalias_level=basic
			-xbuiltin=%all
			-xdepend
			-xlibmil
			-xlibmopt
			-xmemalign=8s
			-xO5
			-xprefetch=auto,explicit
			-xtarget=native

-fast			A convenience option, this switch selects the
(C++)			following switches that are defined elsewhere
			in this page: 

			-dalign
			-fns
			-fsimple=2
			-ftrap=%none
			-xbuiltin=%all
			-xlibmil
			-xlibmopt
			-xO5
			-xtarget=native

-fast			A convenience option, this switch selects the
(Fortran)		following switches that are defined elsewhere
			in this page: 

			-dalign
			-depend
			-fns
			-fsimple=2
			-ftrap=common
			-xlibmil
			-xlibmopt
			-xO5
			-xpad=local
			-xprefetch=auto,explicit
			-xtarget=native
			-xvector=yes

-fixed			Portability switch, used for 178.galgel:
(Portability, Fortran)	assume fixed-format source input.

-fns			Selects faster (but nonstandard) handling of
(C, C++, Fortran)	floating point arithmetic exceptions and
			gradual underflow.
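
Gradual underflow means that results smaller than the smallest normal
number are kept as subnormal values instead of becoming zero.  The sketch
below contrasts the two behaviors in IEEE 754 double precision.  The helper
flush_to_zero is a hypothetical model of nonstandard handling, assumed here
to replace subnormal values with zero; it does not change the hardware
mode.

```python
import sys

# Smallest positive normal double (2**-1022).
SMALLEST_NORMAL = sys.float_info.min


def flush_to_zero(x):
    """Model of nonstandard mode: subnormal values are replaced by zero."""
    if 0.0 < abs(x) < SMALLEST_NORMAL:
        return 0.0
    return x


if __name__ == "__main__":
    half = SMALLEST_NORMAL / 2.0   # subnormal under gradual underflow
    print(half > 0.0, flush_to_zero(half) == 0.0)
```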

-fsimple=<n>		Controls simplifying assumptions for
(C, C++, Fortran)	floating point arithmetic:

	    -fsimple=0	Permits no simplifying assumptions.
			Preserves strict IEEE 754 conformance. 

	    -fsimple=1	Allows the optimizer to assume: 
			The IEEE 754 default rounding/trapping
			modes do not change after process initialization. 
			Computations producing no visible result other
			than potential floating-point exceptions may
			be deleted. Computations with Infinity or NaNs
			as operands need not propagate NaNs to their
			results. For example, x*0 may be replaced by 0. 
			Computations do not depend on sign of zero. 

	    -fsimple=2	Permits more aggressive floating point
			optimizations that may cause programs to
			produce different numeric results due to
			changes in rounding. Even with -fsimple=2,
			the optimizer still is not permitted to
			introduce a floating point exception in a
			program that otherwise produces none. 
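
The assumptions listed for -fsimple can be checked against IEEE 754
double-precision arithmetic.  The sketch below shows why replacing x*0 by 0
is not value-preserving for Infinity, NaN, or negative operands, and, as an
assumed example of a -fsimple=2 style change, how regrouping an addition
alters rounding.  The function names are illustrative only.

```python
import math


def times_zero_strict(x):
    """x*0 evaluated as written."""
    return x * 0.0


def times_zero_simplified(x):
    """x*0 after the simplification that replaces it with 0."""
    return 0.0


def sum_left(a, b, c):
    return (a + b) + c


def sum_right(a, b, c):
    # Regrouped addition (an assumed example of a rounding-changing rewrite).
    return a + (b + c)


if __name__ == "__main__":
    print(times_zero_strict(float("inf")), times_zero_simplified(float("inf")))
    print(sum_left(1e20, -1e20, 1.0), sum_right(1e20, -1e20, 1.0))
```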

-fsingle		Evaluate float expressions as single precision. 
(C)

-ftrap=common		Sets the IEEE 754 trapping mode to common exceptions
(C, C++, Fortran)	(invalid, division by zero, and overflow).

-ftrap=%none		Turns off all IEEE 754 trapping modes.
(C, C++, Fortran)

-library=iostream	Portability switch, used for 252.eon:
(Portability, C++)	allow use of the classic iostream library.

-ll2amm			Include a library containing chip specific
(linker)		memory routines.

-lm			Include the math library.
(linker)

-lmopt			Include the optimized math library. This option
(linker)		usually generates faster code, but may produce
			slightly different results. Usually these results
			will differ only in the last bit.

-noex			Do not allow C++ exceptions. A throw specification
(C++)			on a function is accepted but ignored; the compiler
			does not generate exception code.

-O			A synonym for -xO3.
(Fortran)

-Qoption <phase> <flags>
			Pass flags along to compiler phase:

			f90comp	Fortran first pass
			iropt	Global optimizer
			cg	Code Generator

-Qoption cg <flags>	See -Wc,<flags> below. (The code generator
(code generator)	phase is addressed via -Qoption cg in
			Fortran and C++; and via -Wc in C.)

-Qoption cg -Qeps:enabled=1
(code generator)	See -Wc,-Qeps:enabled=1


-Qoption cg -Qeps:ws=<n>
(code generator)	See -Wc,-Qeps:ws=<n>

-Qoption cg -Qgsched-T<n>
(code generator)	See -Wc,-Qgsched-T<n>

-Qoption cg -Qgsched-trace_late=1
(code generator)	See -Wc,-Qgsched-trace_late=1

-Qoption iropt <flags>	See -W2,<flags> below. (The optimizer is
(optimizer)		addressed via -Qoption iropt in Fortran and
			C++, and via -W2 in C.)

-Qoption iropt -Addint:sf=<n>		
(optimizer)		When considering whether to interchange loops, set memory
			store operation weight to n. A higher value of n indicates
			a greater performance cost for stores.

-Qoption iropt -Ainline[:cp=<n>][:cs=<n>][:inc=<n>][:irs=<n>][:mi][:recursion=1]
(optimizer)		See -W2,-Ainline[:cp=<n>][:cs=<n>][:inc=<n>][:irs=<n>][:mi][:recursion=1]

-Qoption iropt -Apf:llist=<n>:noinnerllist
(optimizer)		Do speculative prefetching for linked-list data structures:
			llist=<n>	perform prefetching n iterations ahead
			noinnerllist	do not attempt this for innermost loops.

-Qoption iropt -Atile:skewp[:b<n>]
(optimizer)		Perform loop tiling which is enabled by loop skewing.
			Loop skewing is a transformation that transforms a
			non-fully interchangeable loop nest to a fully
			interchangeable loop nest. The optional b<n> sets the
			tiling block size to n.

-Qoption iropt -Aujam:inner=g		
(optimizer)		Increase the probability that small-trip-count inner
			loops will be fully unrolled.

RM_SOURCES = lapak.f90	This option allows building the benchmark 178.galgel
(SPEC tools)		without its copy of the lapak sources; instead,
			the lapak entry points in the sunperf library are used.

rm -rf ./feedback.profile ./SunWS_cache		
(Unix)			Remove any profile feedback information from previous runs. 

-W<phase>,<flags>	Pass flags along to compiler phase (2=optimizer,
			c=code generator).
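
The correspondence between the C form (-W2, -Wc) and the Fortran/C++ form
(-Qoption iropt, -Qoption cg) can be sketched as a small shell helper. This
is only an illustration: the function name w_to_qoption and the numeric
values in the sample calls are hypothetical, and only the two phases listed
above are handled.

```shell
# Translate a C-style -W<phase>,<flags> option into the -Qoption form
# used by Fortran and C++ (2 -> iropt, c -> cg).
w_to_qoption() {
    opt=$1
    case $opt in
        -W2,*) printf '%s\n' "-Qoption iropt ${opt#-W2,}" ;;
        -Wc,*) printf '%s\n' "-Qoption cg ${opt#-Wc,}" ;;
        *)     printf '%s\n' "$opt" ;;   # not a phase option: pass through
    esac
}

# Sample calls (the values 4 and 8 are made up for illustration).
w_to_qoption '-W2,-Apf:llist=4:noinnerllist'
w_to_qoption '-Wc,-Qeps:ws=8'
```

Options that do not start with -W2, or -Wc, are returned unchanged.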

-W2,-Abcopy		Increase the probability that the compiler will
(optimizer)		perform memcpy/memset transformations. 

-W2,-Ainline[:cp=<n>][:cs=<n>][:inc=<n>][:irs=<n>][:mi][:recursion=1]
(optimizer)		Control the optimizer's inliner:

     (without a value)	Perform inlining based on Inter-Procedural Analysis (IPA).

		cp=<n>	Set the minimum call-site frequency count required
			for a routine to be considered for inlining.

		cs=<n>	Set inline callee size limit to n. The unit roughly
			corresponds to the number of instructions.

	       inc=<n>	The inliner is allowed to increase the size of the
			program by up to n%.

	       irs=<n>	Allow routines to grow by up to n. The unit
			roughly corresponds to the number of instructions.

		    mi	Perform maximum inlining (without considering code
			size increase). 

	   recursion=1	Allow routines that are called recursively to still
			be eligible for inlining. 

-W2,-crit		Enable optimization of critical control paths.
(optimizer)

-W2,-Apf:llist=<n>:noinnerllist		
(optimizer)		Do speculative prefetching for linked-list data structures:
			llist=<n>     prefetch n iterations ahead
			noinnerllist  do not attempt this for innermost loops

-W2,-Ashort_ldst	Convert multiple short memory operations into
(optimizer)		single long memory operations.

-W2,-whole		Do whole program optimizations.
(optimizer)

-Wc,-Qdepgraph-early_cross_call=1	
(code generator)	There are several scheduling passes in the compiler.
			This option allows early passes to move instructions
			across call instructions.

-Wc,-Qeps:enabled=1	Use enhanced pipeline scheduling (EPS)
(code generator)	and selective scheduling algorithms for
			instruction scheduling.

-Wc,-Qeps:ws=<n>	Set the EPS window size, that is, the number
(code generator)	of instructions it will consider across all
			paths when trying to find independent
			instructions to schedule as a parallel group.
			Larger values may result in better run time,
			at the cost of increased compile time.

-Wc,-Qgsched-T<n>	Sets the aggressiveness of the trace
(code generator)	formation, where n is 4, 5, or 6. 
			The higher the value of n, the lower
			the branch probability needed to include
			a basic block in a trace.

-Wc,-Qgsched-trace_late=1
(code generator)	Turns on the late trace scheduler.

-Wc,-Qipa:valueprediction	
(code generator)	Use profile feedback data to predict values and attempt
			to generate faster code along these control paths,
			even at the expense of possibly slower code along paths
			leading to different values. Correct code is generated
			for all paths.

-Wc,-Qlp=<n>[-av=<n>][-t=<n>][-fa=<n>][-fl=<n>]
(code generator) 	Control irregular loop prefetching:

		lp=<n>	Turns the module on (1) or off (0)
			(default is on for F90; off for C/C++)

	       -av=<n>	Sets the prefetch look ahead distance, in bytes.
			Default is 256.

		-t=<n>	Sets the number of attempts at prefetching. If not
			specified, t=2 if -xprefetch_level=3 has been set;
			otherwise, defaults to t=1.

	       -fa=<n>	1=Force user settings to override internally computed values. 
    
	       -fl=<n>	1=Force the optimization to be turned on for all languages. 

-Wc,-Qms_pipe-pref	Turn off prefetching within modulo scheduling.
(code generator)

-xalias_level=[basic|std|strong]
(C)			Allows the compiler to perform type-based alias analysis
			at the specified alias level:

		 basic	Assume that memory references that involve
			different C basic types do not alias each other.

		   std	Assume aliasing rules described in the ISO 1999 C
			standard.

		strong	In addition to the restrictions at the std level,
			assume that pointers of type char * are used only
			to access an object of type char; and assume that
			there are no interior pointers.

-xalias_level=compatible
(C++)			Allows the compiler to assume that
			layout-incompatible types are not aliased.

-xarch=<a>		Limit the set of instructions the compiler may use.
(C, C++, Fortran)	a must be one of: generic, generic64, native, native64,
			v7, v8a, v8, v8plus, v8plusa, v8plusb, v9, v9a, v9b.
			Typical settings include:

				UltraSPARC-II, 32-bit mode: v8plusa
				UltraSPARC-II, 64-bit mode: v9a
				UltraSPARC-III, 32-bit mode: v8plusb
				UltraSPARC-III, 64-bit mode: v9b

			For more information, see the Fortran User's Guide
			at docs.sun.com
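
The typical settings above can be written as a lookup, shown here as a
minimal shell sketch. The function name xarch_for is hypothetical, and the
keys ultra2 and ultra3 are borrowed from the -xchip value names purely as
labels for UltraSPARC-II and UltraSPARC-III; any other combination is
rejected rather than guessed.

```shell
# Map a processor family and address width to the typical -xarch value
# from the table above.
xarch_for() {
    case "$1:$2" in
        ultra2:32) printf '%s\n' "-xarch=v8plusa" ;;
        ultra2:64) printf '%s\n' "-xarch=v9a" ;;
        ultra3:32) printf '%s\n' "-xarch=v8plusb" ;;
        ultra3:64) printf '%s\n' "-xarch=v9b" ;;
        *) echo "xarch_for: no typical setting for $1 in $2-bit mode" >&2
           return 1 ;;
    esac
}

xarch_for ultra3 64
```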

-xbuiltin=%all		Substitute intrinsic functions or inline system
(C, C++)		functions where profitable for performance. 

-xchip=<c>		Specifies the target processor for use by the
(C, C++, Fortran)	optimizer. c must be one of: generic, native,
			old, super, super2, micro, micro2, hyper, hyper2,
			powerup, ultra, ultra2, ultra2i, ultra3, ultra3cu,
			ultra3i, ultra4, 386, 486, pentium, pentium_pro,
			pentium3, pentium4

-xcache=<c>		Defines the cache properties for use by the
(C, C++, Fortran)	optimizer. c must be one of the following:

				* native (set parameters for the host environment)
				* s1/l1/a1
				* s1/l1/a1:s2/l2/a2
				* s1/l1/a1:s2/l2/a2:s3/l3/a3

			The si/li/ai are defined as follows:

				si	The size of the data cache
					at level i, in kilobytes.
				li	The line size of the data cache
					at level i, in bytes.
				ai	The associativity of the data cache
					at level i.
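
As an illustration of the s/l/a syntax, the following shell sketch
assembles an -xcache value from one to three size/line/associativity
triples. The function name xcache_flag and the cache parameters in the
sample call are hypothetical and do not describe any particular system.

```shell
# Build an -xcache value from size(KB) line(bytes) associativity triples,
# one triple per cache level (1 to 3 levels).
xcache_flag() {
    if [ $# -ne 3 ] && [ $# -ne 6 ] && [ $# -ne 9 ]; then
        echo "usage: xcache_flag s1 l1 a1 [s2 l2 a2 [s3 l3 a3]]" >&2
        return 1
    fi
    spec=""
    while [ $# -ge 3 ]; do
        # Levels are joined with ':', fields within a level with '/'.
        spec="${spec:+$spec:}$1/$2/$3"
        shift 3
    done
    printf '%s\n' "-xcache=$spec"
}

# Hypothetical two-level hierarchy: 64 KB, 32-byte lines, 4-way L1;
# 8192 KB, 512-byte lines, 2-way L2.
xcache_flag 64 32 4 8192 512 2
```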

-xdepend		Analyze loops for inter-iteration data dependencies,
(C, Fortran)		and do loop restructuring.

-xinline=		Turn off inlining.
(C, C++, Fortran)

-xipo[=2]		Perform optimizations across all object files in the
(C, C++, Fortran)	link step:

			0=off
			1=on
			2=performs whole-program detection and analysis

-xlibmil		Use inline expansion for math library, libm.
(C, C++, Fortran)

-xlibmopt		Select the optimized math library.
(C++, Fortran)

-xlic_lib=sunperf	Link with Sun supplied licensed sunperf library.
(C, C++, Fortran)

-xlinkopt		Perform link-time optimizations, such as branch
(C, C++, Fortran)	optimization and cache coloring.

-xO<n>			Specify optimization level n:
(C, C++, Fortran)

		  -xO1	Do only basic local optimizations (peephole).

		  -xO2	Do basic local and global optimizations, such as
			induction variable elimination, common
			subexpression elimination, constant propagation,
			register allocation, and basic block merging.

		  -xO3	Add global optimizations at the function level,
			loop unrolling, and software pipelining.

		  -xO4	Add automatic inlining of functions in the
			same file.

		  -xO5	Use optimization algorithms that may take
			significantly more compilation time or that
			do not have as high a probability of improving
			execution time, such as speculative code motion.

-xpad=common[:<n>]	If multiple same-sized arrays are placed in common,
(Fortran)		insert padding between them for better use of cache.
			n specifies the amount of padding to apply, in units
			that are the same size as the array elements. If no
			parameter is specified then the compiler selects one
			automatically.

-xpad=local		Pad local variables, for better use of cache.
(Fortran)

-xpagesize=<n>		Set the preferred page size for running the program.
(C, C++, Fortran)

-xprefetch=auto,explicit
(C, C++, Fortran)	Allow generation of prefetch instructions. -xprefetch and
			-xprefetch=yes are synonyms for -xprefetch=auto,explicit.

-xprefetch=latx:<n>	Adjust the compiler's assumptions about prefetch latency
(C, C++, Fortran)	by the specified factor. Typically values in the range of
			0.5 to 2.0 will be useful. A lower number might indicate
			that data will usually be cache resident; a higher number
			might indicate a relatively larger gap between the
			processor speed and the memory speed (compared to the
			assumptions built into the compiler).

-xprefetch=no%auto	Turn off prefetch instruction generation.
(C, C++, Fortran) 

-xprefetch_level=<n>	Control the level of searching that the compiler does
(C, C++, Fortran)	for prefetch opportunities by setting n to 1, 2, or 3,
			where higher numbers mean to do more searching.
			The default is 2.

-xprofile=collect:./feedback
(C, C++, Fortran)	Collect profile data for feedback-directed optimization,
			and store it in a subdirectory of the current directory,
			named ./feedback.

-xprofile=use:./feedback
(C, C++, Fortran)	Use data collected for profile feedback. Look for it in
			a subdirectory of the current directory, named ./feedback.
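
The collect and use options are normally applied in two compile passes
with a training run in between. The sketch below only prints the commands
it would run (a dry run), so it is safe on a machine without the compiler.
The names prog.c, prog and train.in, the cc driver invocation, and the
choice of -xO5 are assumptions made for illustration; the cleanup line is
the one listed earlier in this section.

```shell
# Dry-run helper: print each command instead of executing it.
# Replace the body with "$@" to execute the commands for real.
run() { printf '%s\n' "$*"; }

fdo_build() {
    # 1. Remove profile data left over from previous runs.
    run rm -rf ./feedback.profile ./SunWS_cache
    # 2. Build an instrumented binary.
    run cc -xO5 -xprofile=collect:./feedback -o prog prog.c
    # 3. Training run, which writes the profile data.
    run ./prog train.in
    # 4. Rebuild using the collected profile.
    run cc -xO5 -xprofile=use:./feedback -o prog prog.c
}

fdo_build
```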

-xregs=syst		Allows use of the system reserved registers %g6 and
(C, C++, Fortran)	%g7, and %g5 if not already allowed by -xarch value.

-xrestrict		Treat pointer-valued function parameters as
(C)			restricted pointers.

-xsafe=mem		Enables the use of non-faulting loads when used in
(C, C++, Fortran)	conjunction with -xarch=v8plus. Assumes that no memory
			based traps will occur.

-xsfpconst		Represents unsuffixed floating-point constants
(C, C++, Fortran)	as single precision.

-xtarget=[system_name]	Selects options appropriate for the system where
(C, C++, Fortran)	the compile is taking place, including architecture,
			chip, and cache sizes. (These can also be controlled
			separately, via -xarch, -xchip, and -xcache, respectively.) 

-xunroll=n		Specifies whether or not the compiler optimizes
(C, C++, Fortran)	(unrolls) loops.  n is a positive integer. When n is
			1, it is a command and the compiler unrolls no loops.
			When n is greater than 1, -xunroll=n merely suggests
			to the compiler that it unroll loops n times.

-xvector		Allow the compiler to transform math library calls within
(C, Fortran)		loops into calls to the vector math library.

-----------------------------------------------------------------------------------
[3] Environment Variables

Setting			Remark
-----------------------------------------------------------------------------------
LD_LIBRARY_PATH=<p>	Specify the locations to resolve dynamic link dependencies.

LD_PRELOAD=mpss.so.1	Allow use of the mpss.so.1 shared object, which provides
			a means by which preferred stack and/or heap page sizes
			can be selected.

MPSSHEAP=<n>		Specify the preferred page size for heap. The specified
			page size is applied to all created processes.

MPSSSTACK=<n>		Specify the preferred page size for stack. The specified
			page size is applied to all created processes.

ulimit -s unlimited	Allow stack size to grow without limit.
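
The three page-size settings above are meant to be used together. A
minimal sketch of how they might be combined on one command line follows;
it prints the command line rather than executing it, so it can be tried on
a system that has no mpss.so.1. The function name mpss_cmdline, the sizes
4M and 64K, and the program name ./prog are hypothetical.

```shell
# Print the command line that would run a program with the preferred
# heap and stack page sizes selected through mpss.so.1.
mpss_cmdline() {
    heap=$1; stack=$2; shift 2
    printf 'LD_PRELOAD=mpss.so.1 MPSSHEAP=%s MPSSSTACK=%s %s\n' \
        "$heap" "$stack" "$*"
}

# Hypothetical sizes and program name.
mpss_cmdline 4M 64K ./prog
```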

-----------------------------------------------------------------------------------
[4] Kernel Parameters (/etc/system)

System Tunable		Remark
-----------------------------------------------------------------------------------
autoup			The frequency of file system sync operations.

consistent_coloring	Controls the page coloring policy. It can be set to
			one of the following:

		     0	(default) dynamic (uses various vaddr bits)
		     1	static (virtual=paddr)

tune_t_fsflushr		The number of seconds between fsflush invocations for
			checking dirty memory.

--------------------------------------------------------------------------------
[5] Commands for feedback control

Command			Remark
--------------------------------------------------------------------------------
Parallelnavi compiler:
fdo_pre0 = rm -rf `pwd`*.f.d
fdo_pre0 = rm -rf `pwd`*.fbk
			Remove the profile data generated at the last
			feedback-optimized compilation.

Sun Studio 9 compiler:
fdo_pre0 = rm -rf `pwd`/..feedback.profile
fdo_pre0 = rm -rf `pwd`/SunWS_cache
			Remove the profile data generated at the last
			feedback-optimized compilation.