FLAG DESCRIPTIONS
SUN C, C++ AND FORTRAN
Forte[tm] Developer 6 update 2, and Forte[tm] Developer 7 Early Access
          11/5/01

Compiler Flags

Flag                               Description

-D                                 Set definition for preprocessor.

-dalign                            Assume double-type data is double
                                   aligned.

-dn                                Specify static binding.

-e                                 Accept extended (132 character) input
                                   source lines (FORTRAN).

-fast                              This is a convenience option for
                                   selecting a set of optimizations for
                                   performance, and it chooses:

                                   o The -native best machine
                                   characteristics option (-xarch=native,
                                   -xchip=native, -xcache=native)

                                   o Optimization level: -xO5

                                   o A set of inline expansion templates
                                   (-libmil)

                                   o The -fsimple=2 option

                                   o The -dalign option

                                   o The -xalias_level=basic option (C
                                   only)

                                   o The -xlibmopt option

                                   o The -xdepend option (FORTRAN only)

                                   o The -xprefetch option (FORTRAN only)

                                   o Options to turn off all trapping
                                   (-fns -ftrap=%none)

-fixed                             Accept fixed-format input source files
                                   (FORTRAN).

-fns                               Select non-standard floating point
                                   mode.

                                   This flag causes the nonstandard
                                   floating point mode to be enabled when
                                   a program begins execution. By default,
                                   the nonstandard floating point mode
                                   will not be enabled automatically.

                                   On some SPARC systems, the nonstandard
                                   floating point mode disables "gradual
                                   underflow", causing tiny results to be
                                   flushed to zero rather than producing
                                   subnormal numbers. It also causes
                                   subnormal operands to be silently
                                   replaced by zero. On those SPARC
                                   systems that do not support gradual
                                   underflow and subnormal numbers in
                                   hardware, use of this option can
                                   significantly improve the performance
                                   of some programs.

                                   Warning: When nonstandard mode is
                                   enabled, floating point arithmetic may
                                   produce results that do not conform to
                                   the requirements of the IEEE 754
                                   standard. See the Numerical Computation
                                   Guide for more information.

-fsimple=0                         Permits no simplifying assumptions.
                                   Preserves strict IEEE 754 conformance.

-fsimple=1                         With -fsimple=1, the optimizer can
                                   assume the following:

                                   o The IEEE 754 default
                                   rounding/trapping modes do not change
                                   after process initialization.

                                   o Computations producing no visible
                                   result other than potential
                                   floating-point exceptions may be
                                   deleted.

                                   o Computations with Infinity or NaNs as
                                   operands need not propagate NaNs to
                                   their results. For example, x*0 may be
                                   replaced by 0.

                                   o Computations do not depend on sign of
                                   zero.

-fsimple=2                         Permits aggressive floating point
                                   optimizations that may cause programs
                                   to produce different numeric results
                                   due to changes in rounding. Even with
                                   -fsimple=2, the optimizer still is not
                                   permitted to introduce a floating point
                                   exception in a program that otherwise
                                   produces none.

-fsimple[=n]                       Allows the compiler to make simplifying
                                   assumptions concerning floating-point
                                   arithmetic.

-ftrap=t                           Sets the IEEE 754 trapping mode in
                                   effect at startup.

                                   t is a comma-separated list that
                                   consists of one or more of the
                                   following: %all, %none, common,
                                   [no%]invalid, [no%]overflow,
                                   [no%]underflow, [no%]division,
                                   [no%]inexact.

                                   The default is -ftrap=%none.

                                   This option sets the IEEE 754 trapping
                                   modes that are established at program
                                   initialization. Processing is
                                   left-to-right. The common exceptions,
                                   by definition, are invalid, division by
                                   zero, and overflow.

                                   o %none, the default, turns off all
                                   trapping modes.

                                   Do not use this option for programs
                                   that depend on IEEE standard exception
                                   handling; you can get different
                                   numerical results, premature program
                                   termination, or unexpected SIGFPE
                                   signals.

-libmil                            Use inline expansion templates for
                                   libm.

-library=iostream                  Use the "classic" (pre-1998 C++
                                   standard) iostream library.

                                   Prior to the C++ standard (1998), there
                                   was one iostream library, now often
                                   called "classic" iostreams. The C++
                                   standard defines a different, but
                                   similar, iostream library, which we
                                   call "standard" iostreams. To get
                                   classic iostreams in standard (default)
                                   mode, use the option
                                   "-library=iostream".

-ll2amm                            Library containing chip-specific memory
                                   routines.

-lm                                Link with math library

-lmopt                             This chooses the math library that is
                                   optimized for speed

-lprism32                          Library to enable ISM (4MB page) usage.

-lsunperf                          Link with the Sun Performance Library
                                   (netlib and SIAM routines)

-native                            Select native machine characteristics
                                   for optimization.

-Qicache-chbab=1                   Turn on optimization to reduce branch
                                   after branch penalty

-Qoption <phase> <flags>           Pass flags along to compiler phase:

                                   f90comp  Fortran first pass

                                   iropt    Global optimizer

                                   cg       Code generator

-Qoption cg <flags>                See -Wc,<flags> below.

-Qoption cg                        Control irregular loop prefetching.
-Qlp=1-av=<nav>-t=<nt>-fa=1-fl=1

                                   lp  lp=1 turns on the module (default
                                       is on for F90; off for C/C++).

                                   fa  fa=1 forces user settings to
                                       override internally computed
                                       values.

                                   fl  fl=1 forces the optimization to be
                                       turned on for all languages.

                                   t   Make <nt> attempts at prefetching.

                                   av  Set the prefetch look-ahead to
                                       <nav>.

-Qoption f90comp                   Enable padding of f90 arrays by n.
-array_pad_rows,<n>

-Qoption f90comp -expansion        Enable f90 array expansion.

-Qoption f90comp -O3               This reduces the optimization level of
                                   the f90 front/middle end to O3. The
                                   effect is to turn off loop cloning and
                                   unrolling (Note that it has no effect
                                   on cg's loop unrolling).

-Qoption iropt <flags>             See -W2,<flags> below.

-Qoption iropt -Adata_access       Enable optimizations based on data
                                   access patterns.

-Qoption iropt -Addint:sf=9        Set memory store operation weight for
                                   loop interchange to 9

-Qoption iropt -Amemopt            See -W2,-Amemopt

-Qoption iropt -Ma<n>              See -W2,-Ma<n>

-Qoption iropt -Mm<n>              See -W2,-Mm<n>

-Qoption iropt -MR                 Do not inline calls when parameters are
                                   arrays and actual array dimensions and
                                   formal array dimensions are mismatched

-Qoption iropt -Mr<n>              See -W2,-Mr<n>

-Qoption iropt -O4+scalarrep       Disable scalar replacement optimization.

-Qoption iropt -Rscalarrep,-MR     Same as -Qoption iropt -Rscalarrep plus
                                   -Qoption iropt -MR

-Qoption iropt -whole              See -W2,-whole

-stackvar                          Allocate routine local variables on
                                   stack (FORTRAN).

-W<phase>,<flags>                  Pass flags along to compiler phase
                                   (2=optimizer, c=code generator)

-W2,-Abopt                         Enable aggressive optimizations of all
                                   branches.

-W2,-Adata_access                  Enable optimizations based on data
                                   access patterns.

-W2,-Aheap                         Allows the compiler to recognize
                                   malloc-like memory allocation
                                   functions.

-W2,-Ainline                       Perform IPA-based inlining.

-W2,-Aivel:duplicate_loops         More aggressive strength reduction by
                                   replicating loops.

-W2,-Amemopt                       Memory access optimization. This does
                                   whole-program mode inter-procedural
                                   memory access analysis, merges memory
                                   allocations, and performs cache
                                   conscious data layout program
                                   transformations.

-W2,-Amemopt:arrayloc              Reconstruct array subscripts during
                                   memory allocation merging and data
                                   layout program transformation

-W2,-Ashort_ldst                   Convert multiple short memory
                                   operations into single long memory
                                   operations.

-W2,-Aunroll                       Enables outer-loop unrolling.

-W2,-crit                          Enable optimization of critical control
                                   paths

-W2,-Ma<n>                         Enable inlining of routines with frame
                                   size up to n.

-W2,-Mm<n>                         Maximum module increase limit for
                                   inlining.

-W2,-Mp<n>                         Procedures with entry counts equal or
                                   greater than n become candidates for
                                   inlining.

-W2,-Mr<n>                         Maximum code increase due to inlining
                                   is limited to n triples.

-W2,-Ms<n>                         Maximum level of recursive inlining.

-W2,-Mt<n>                         The maximum size of a routine body
                                   eligible for inlining is limited to n
                                   triples.

-W2,-O4+restrict_g                 Assume that different global pointer
                                   variables point to their own memory
                                   locations.

-W2,-reroll=1                      Turns on loop rerolling.

-W2,-whole                         Do whole program optimizations.

-Wc,-Qdepgraph-early_cross_call=1  Enable early cross-call instruction
                                   scheduling.

-Wc,-Qeps:do_spec_load=1           Allow generating speculative load
                                   during EPS.

-Wc,-Qeps:enabled=1                Use enhanced pipeline scheduling (EPS)
                                   and selective scheduling algorithms for
                                   instruction scheduling.

-Wc,-Qeps:rp_filtering_margin=100  Turn off register pressure heuristic in
                                   EPS.

-Wc,-Qgsched-T4                    Sets the aggressiveness of the trace
                                   formation.

-Wc,-Qgsched-trace_late=1          Turns on the late trace scheduler.

-Wc,-Qgsched-trace_spec_load=1     Turns on the conversion of loads to
                                   non-faulting loads inside the trace.

-Wc,-Qinline_memcpy=<n>            Inline calls to memcpy with n bytes or
                                   fewer being copied

-Wc,-Qipa:valueprediction          Use profile feedback data to predict
                                   values and attempt to generate faster
                                   code along these control paths, even at
                                   the expense of possibly slower code
                                   along paths leading to different
                                   values. Correct code is generated for
                                   both paths.

-Wc,-Qiselect-funcalign=<n>        Do function entry alignment at n-byte
                                   boundaries.

-Wc,-Qiselect-sw_pf_tbl_th=<n>     Peels the most frequent test
                                   branches/cases off a switch until the
                                   branch probability reaches less than
                                   1/n. This is effective only when
                                   profile feedback is used

-Wc,-Qms_pipe+intdivusefp          Use fp divide for signed integer
                                   division

-Wc,-Qms_pipe-pref                 Turn off prefetching within modulo
                                   scheduling

-Wc,-Qpeep-Sh0                     Disables the max live base registers
                                   algorithm for sethi hoisting.

-Xa                                Assume ANSI C conformance, allow K & R
                                   extensions. (default mode)

-xalias_level=<a>                  Allows compiler to perform type-based
                                   alias analysis at the given alias
                                   level.

                                   basic assume ISO C9X aliasing rules for
                                   basic types only.

                                   std assume ISO C9X aliasing rules.

                                   strong assume all pointers are type
                                   safe (strongly typed).

-xarch=<a>                         Limit the set of instructions the
                                   compiler may use.

-Xc                                Assume strict ANSI C conformance.

-xcache=<c>                        Defines the cache properties for use by
                                   the optimizer.

                                   c must be one of the following:

                                   o native (set parameters for the host
                                   environment)

                                   o s1/l1/a1

                                   o s1/l1/a1:s2/l2/a2

                                   o s1/l1/a1:s2/l2/a2:s3/l3/a3

                                   The si/li/ai are defined as follows:

                                   si The size of the data cache at level
                                   i, in kilobytes.

                                   li The line size of the data cache at
                                   level i, in bytes.

                                   ai The associativity of the data cache
                                   at level i.

-xchip=<c>                         Specifies the target processor for use
                                   by the optimizer. c must be one of:
                                   generic, generic64, native, native64,
                                   old, super, super2, micro, micro2,
                                   hyper, hyper2, powerup, ultra, ultra2,
                                   ultra2i, ultra3, 386, 486, pentium,
                                   pentium_pro, 603, 604.

-xcrossfile                        Enable cross-file inlining.

-xdepend                           Analyze loops for data dependencies.

-xF                                Allow function reordering by the
                                   WorkShop Performance Analyzer

-xinline=                          Turn off inlining

-xipo=n                            Performs optimizations across all
                                   object files in the link step: 0=off,
                                   1=on, 2=performs whole-program
                                   detection and analysis

-xlibmopt                          This chooses the math library that is
                                   optimized for speed.

-xO1                               Does basic local optimization
                                   (peephole).

-xO2                               -xO1 plus more local and global
                                   optimizations.

-xO3                               Besides what -xO2 does, it optimizes
                                   references or definitions for external
                                   variables. Loop unrolling and software
                                   pipelining are also performed.

-xO4                               -xO3 plus function inlining.

-xO5                               Besides what -xO4 does, it enables
                                   speculative code motion.

-xpad=common[:<n>]                 Pad common block variables, for better
                                   use of cache. n specifies the amount of
                                   padding to apply. If no parameter is
                                   specified then the compiler selects one
                                   automatically.

-xpad=local[:<n>]                  Pad local variables only, for better
                                   use of cache. n specifies the amount of
                                   padding to apply. If no parameter is
                                   specified then the compiler selects one
                                   automatically.

-xparallel                         Use parallel processing to improve
                                   performance.

-xprefetch[=value]                 Enable prefetch instructions on those
                                   architectures that support prefetch,
                                   such as UltraSPARC II (-xarch=v8plus,
                                   v8plusa, v8plusb, v9, v9a, or v9b).

                                   auto

                                   Enable automatic generation of prefetch
                                   instructions

                                   no%auto

                                   Disable automatic generation of
                                   prefetch instructions

                                   explicit

                                   Enable explicit prefetch macros

                                   no%explicit

                                   Disable explicit prefetch macros

                                   yes

                                   -xprefetch=yes is the same as
                                   -xprefetch=auto,explicit

                                   no

                                   -xprefetch=no is the same as
                                   -xprefetch=no%auto,no%explicit

                                   Defaults

                                   If -xprefetch is not specified,
                                   -xprefetch=no%auto,explicit is assumed.

                                   If only -xprefetch is specified,
                                   -xprefetch=auto,explicit is assumed.

-xprofile=collect                  Collect profile data for feedback
                                   directed optimizations.

-xprofile=use                      Use data collected for profile
                                   feedback.

-xreduction                        Parallelize loops containing
                                   reductions.

-xregs=syst                        Allows use of the system reserved
                                   registers %g6 and %g7, and %g5 if not
                                   already allowed by -xarch value.

-xrestrict[=f1,...,f2,%all,        Treat pointer-valued function
%none]                             parameters as restricted pointers. The
                                   default is %none. Specifying -xrestrict
                                   is equivalent to specifying
                                   -xrestrict=%all.

-xsafe=mem                         Enables the use of non-faulting loads
                                   when used in conjunction with
                                   -xarch=v8plus. Assumes that no memory
                                   based traps will occur.

-xsfpconst                         Represents unsuffixed floating-point
                                   constants as single precision

-Xt                                Assume K & R conformance, allow ANSI C.

-xtarget=native                    Same as -native


------------------------------------------------------------------
Kernel Parameters

Parameter                          Description

shmsys:shminfo_shmmin              Minimum size of System V shared memory
                                   segment that can be created.

shmsys:shminfo_shmmax              Maximum size of System V shared memory
                                   segment that can be created. This
                                   parameter is an upper limit that is
                                   checked before the system sees if it
                                   actually has the physical resources to
                                   create the requested memory segment.

shmsys:shminfo_shmmni              System-wide limit on number of shared
                                   memory segments that can be created.

shmsys:shminfo_shmseg              Limit on the number of shared memory
                                   segments that any one process can
                                   create.


------------------------------------------------------------------
Environment Variables

Variable                           Description

PRISM_HEAP=<n>                     Set the heap size limit for large pages

PRISM_MODE=2                       Large page mode: Attempt to put text,
                                   data and heap all into large pages.