This document describes the stabs debugging symbol tables.
Copyright 1992, 1993 Free Software Foundation, Inc.
-Contributed by Cygnus Support. Written by Julia Menapace.
+Contributed by Cygnus Support. Written by Julia Menapace, Jim Kingdon,
+and David MacKenzie.
Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission notice
@menu
* Overview:: Overview of stabs
-* Program structure:: Encoding of the structure of the program
+* Program Structure:: Encoding of the structure of the program
* Constants:: Constants
* Variables::
* Types:: Type definitions
-* Symbol tables:: Symbol information in symbol tables
+* Symbol Tables:: Symbol information in symbol tables
* Cplusplus:: Appendixes:
-* Stab types:: Symbol types in a.out files
-* Symbol descriptors:: Table of symbol descriptors
-* Type descriptors:: Table of type descriptors
-* Expanded reference:: Reference information by stab type
+* Stab Types:: Symbol types in a.out files
+* Symbol Descriptors:: Table of symbol descriptors
+* Type Descriptors:: Table of type descriptors
+* Expanded Reference:: Reference information by stab type
* Questions:: Questions and anomalies
-* XCOFF differences:: Differences between GNU stabs in a.out
+* XCOFF Differences:: Differences between GNU stabs in a.out
and GNU stabs in XCOFF
-* Sun differences:: Differences between GNU stabs and Sun
+* Sun Differences:: Differences between GNU stabs and Sun
native stabs
-* Stabs in ELF:: Stabs in an ELF file.
-* Symbol Types Index:: Index of symbolic stab symbol type names.
+* Stabs In ELF:: Stabs in an ELF file.
+* Symbol Types Index:: Index of symbolic stab symbol type names.
@end menu
@end ifinfo
@node Overview
-@chapter Overview of stabs
+@chapter Overview of Stabs
@dfn{Stabs} refers to a format for information that describes a program
to a debugger. This format was apparently invented by
This document is one of the few published sources of documentation on
stabs. It is believed to be comprehensive for stabs used by C. The
-lists of symbol descriptors (@pxref{Symbol descriptors}) and type
-descriptors (@pxref{Type descriptors}) are believed to be completely
+lists of symbol descriptors (@pxref{Symbol Descriptors}) and type
+descriptors (@pxref{Type Descriptors}) are believed to be completely
comprehensive. Stabs for COBOL-specific features and for variant
records (used by Pascal and Modula-2) are poorly documented here.
@menu
* Flow:: Overview of debugging information flow
-* Stabs format:: Overview of stab format
-* String field:: The @code{.stabs} @var{string} field
-* C example:: A simple example in C source
-* Assembly code:: The simple example at the assembly level
+* Stabs Format:: Overview of stab format
+* String Field:: The string field
+* C Example:: A simple example in C source
+* Assembly Code:: The simple example at the assembly level
@end menu
@node Flow
-@section Overview of debugging information flow
+@section Overview of Debugging Information Flow
The GNU C compiler compiles C source in a @file{.c} file into assembly
language in a @file{.s} file, which the assembler translates into
table. Debuggers use the symbol and string tables in the executable as
a source of debugging information about the program.
-@node Stabs format
-@section Overview of stab format
+@node Stabs Format
+@section Overview of Stab Format
There are three overall formats for stab assembler directives,
differentiated by the first word of the stab. The name of the directive
The overall format of each class of stab is:
@example
-.stabs "@var{string}",@var{type},0,@var{desc},@var{value}
-.stabn @var{type},0,@var{desc},@var{value}
-.stabd @var{type},0,@var{desc}
+.stabs "@var{string}",@var{type},@var{other},@var{desc},@var{value}
+.stabn @var{type},@var{other},@var{desc},@var{value}
+.stabd @var{type},@var{other},@var{desc}
.stabx "@var{string}",@var{value},@var{type},@var{sdb-type}
@end example
@c what is the correct term for "current file location"? My AIX
@c assembler manual calls it "the value of the current location counter".
For @code{.stabn} and @code{.stabd}, there is no @var{string} (the
-@code{n_strx} field is zero; see @ref{Symbol tables}). For
+@code{n_strx} field is zero; see @ref{Symbol Tables}). For
@code{.stabd}, the @var{value} field is implicit and has the value of
the current file location. For @code{.stabx}, the @var{sdb-type} field
-is unused for stabs and can always be set to zero.
+is unused for stabs and can always be set to zero. The @var{other}
+field is almost always unused and can be set to zero.
The number in the @var{type} field gives some basic information about
which type of stab this is (or whether it @emph{is} a stab, as opposed
to an ordinary symbol). Each valid type number defines a different stab
type; further, the stab type defines the exact interpretation of, and
possible values for, any remaining @var{string}, @var{desc}, or
-@var{value} fields present in the stab. @xref{Stab types}, for a list
+@var{value} fields present in the stab. @xref{Stab Types}, for a list
in numeric order of the valid @var{type} field values for stab directives.
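
As a rough illustration of the directive formats above, the following Python sketch splits a @code{.stabs}, @code{.stabn}, or @code{.stabd} line into named fields. The function name @code{parse_stab_directive} is made up for this example and is not part of any assembler or debugger. The sketch assumes one directive per line with no trailing comment, and it does not handle @code{.stabx}, whose fields come in a different order.

```python
import re


def parse_stab_directive(line):
    """Split one stab directive into a dict of its fields (illustrative only)."""
    m = re.match(r'\s*\.(stabs|stabn|stabd)\s+(.*)$', line)
    if m is None:
        raise ValueError("not a .stabs/.stabn/.stabd directive")
    kind, rest = m.group(1), m.group(2)
    fields = dict(directive=kind)
    if kind == "stabs":
        # Only .stabs has a string field; it is quoted and may contain commas.
        sm = re.match(r'"((?:[^"\\]|\\.)*)"\s*,\s*(.*)$', rest)
        if sm is None:
            raise ValueError("missing quoted string field")
        fields["string"] = sm.group(1)
        rest = sm.group(2)
    names = ["type", "other", "desc"]
    if kind != "stabd":
        # For .stabd the value is implicit (the current file location).
        names.append("value")
    parts = [p.strip() for p in rest.split(",")]
    if len(parts) != len(names):
        raise ValueError("wrong number of fields for ." + kind)
    fields.update(zip(names, parts))
    return fields
```

Applied to the line @code{.stabs "foo:F1",36,0,0,_foo}, the sketch returns the string @code{foo:F1}, type 36, other 0, desc 0, and value @code{_foo}. Every field is kept as text because the value may be a symbol rather than a number.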
-@node String field
-@section The @code{.stabs} @var{string} field
+@node String Field
+@section The String Field
-For @code{.stabs} the @var{string} field holds the meat of the
-debugging information. The generally unstructured nature of this field
-is what makes stabs extensible. For some stab types the @var{string} field
+For most stabs the string field holds the meat of the
+debugging information. The flexible nature of this field
+is what makes stabs extensible. For some stab types the string field
contains only a name. For other stab types the contents can be a great
deal more complex.
-The overall format is of the @var{string} field is:
+The overall format of the string field for most stab types is:
@example
"@var{name}:@var{symbol-descriptor} @var{type-information}"
character that tells more specifically what kind of symbol the stab
represents. If the @var{symbol-descriptor} is omitted, but type
information follows, then the stab represents a local variable. For a
-list of symbol descriptors, see @ref{Symbol descriptors}. The @samp{c}
+list of symbol descriptors, see @ref{Symbol Descriptors}. The @samp{c}
symbol descriptor is an exception in that it is not followed by type
information. @xref{Constants}.
non-numeric then it is a @var{type-descriptor}, and tells what kind of
type is about to be defined. Any other values following the
@var{type-descriptor} vary, depending on the @var{type-descriptor}.
-@xref{Type descriptors}, for a list of @var{type-descriptor} values. If
+@xref{Type Descriptors}, for a list of @var{type-descriptor} values. If
a number follows the @samp{=} then the number is a @var{type-reference}.
For a full description of types, see @ref{Types}.
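
The layout of the string field can be sketched in Python as follows. This is a simplified illustration, not the parser any debugger uses: the helper name @code{split_string_field} is hypothetical, and the sketch assumes that type information always begins with a decimal digit, so any other character after the colon is taken to be the symbol descriptor. Names that themselves contain a colon are not handled.

```python
def split_string_field(string_field):
    """Split "name:symbol-descriptor type-information" into its parts."""
    name, colon, rest = string_field.partition(":")
    if not colon:
        # Some stab types carry only a name in the string field.
        return dict(name=string_field, descriptor=None, type_info="")
    if rest and not rest[0].isdigit():
        # A non-digit after the colon is the one-character symbol descriptor.
        descriptor, type_info = rest[0], rest[1:]
    else:
        # Descriptor omitted: per the text, this denotes a local variable.
        descriptor, type_info = "", rest
    return dict(name=name, descriptor=descriptor, type_info=type_info)
```

For @code{foo:F1} this gives the name @code{foo}, the descriptor @code{F}, and the type information @code{1}. For a string with no descriptor, such as @code{x:1}, the descriptor comes back empty.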
expense of speed.
@end table
-All of this can make the @var{string} field quite long. All
+All of this can make the string field quite long. All
versions of GDB, and some versions of dbx, can handle arbitrarily long
strings. But many versions of dbx cretinously limit the strings to
about 80 characters, so compilers which must work with such dbx's need
to split the @code{.stabs} directive into several @code{.stabs}
directives. Each stab duplicates exactly all but the
-@var{string} field. The @var{string} field of
+string field. The string field of
every stab except the last is marked as continued with a
double-backslash at the end. Removing the backslashes and concatenating
-the @var{string} fields of each stab produces the original,
+the string fields of each stab produces the original,
long string.
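
The reassembly step described above might look like the following Python sketch. It assumes the string fields are given as they appear in the assembler source, so each continued piece ends in two backslash characters. The function name @code{join_continued_strings} is invented for the example.

```python
def join_continued_strings(string_fields):
    """Rebuild one long string field from a run of continued stabs."""
    marker = "\\\\"  # two backslash characters, as written in the .s file
    pieces = []
    last = len(string_fields) - 1
    for i, field in enumerate(string_fields):
        if i != last:
            # Every piece except the final one must carry the marker.
            if not field.endswith(marker):
                raise ValueError("continued piece lacks trailing backslashes")
            field = field[:-len(marker)]
        pieces.append(field)
    return "".join(pieces)
```

A reader working from the object file's string table rather than the assembler source would probably need to adjust the marker, since the assembler may already have reduced the escaped pair to a single backslash.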
-@node C example
-@section A simple example in C source
+@node C Example
+@section A Simple Example in C Source
To get the flavor of how stabs describe source information for a C
program, let's look at the simple program:
to parts of the @file{.s} file in the description of the stabs that
follows.
-@node Assembly code
-@section The simple example at the assembly level
+@node Assembly Code
+@section The Simple Example at the Assembly Level
This simple ``hello world'' example demonstrates several of the stab
types used to describe C language source files.
52 .stabn 224,0,0,LBE2
@end example
-@node Program structure
-@chapter Encoding the structure of the program
+@node Program Structure
+@chapter Encoding the Structure of the Program
The elements of the program structure that stabs encode include the name
of the main function, the names of the source and include files, the
blocks of code.
@menu
-* Main program:: Indicate what the main program is
-* Source files:: The path and name of the source file
-* Include files:: Names of include files
-* Line numbers::
+* Main Program:: Indicate what the main program is
+* Source Files:: The path and name of the source file
+* Include Files:: Names of include files
+* Line Numbers::
* Procedures::
-* Nested procedures::
-* Block structure::
+* Nested Procedures::
+* Block Structure::
@end menu
-@node Main program
-@section Main program
+@node Main Program
+@section Main Program
@findex N_MAIN
Most languages allow the main program to have any name. The
@code{N_MAIN} stab type tells the debugger the name that is used in this
-program. Only the @var{string} field is significant; it is the name of
+program. Only the string field is significant; it is the name of
a function which is the main program. Most C compilers do not use this
stab (they expect the debugger to assume that the name is @code{main}),
but some C compilers emit an @code{N_MAIN} stab for the @code{main}
function.
-@node Source files
-@section Paths and names of the source files
+@node Source Files
+@section Paths and Names of the Source Files
@findex N_SO
Before any other stabs occur, there must be a stab specifying the source
file. This information is contained in a symbol of stab type
-@code{N_SO}; the @var{string} field contains the name of the file. The
-@var{value} of the symbol is the start address of the portion of the
+@code{N_SO}; the string field contains the name of the file. The
+value of the symbol is the start address of the portion of the
text section corresponding to that file.
-With the Sun Solaris2 compiler, the @var{desc} field contains a
+With the Sun Solaris2 compiler, the desc field contains a
source-language code.
@c Do the debuggers use it? What are the codes? -djm
directive which assembles to a standard COFF @code{.file} symbol;
explaining this in detail is outside the scope of this document.
-@node Include files
-@section Names of include files
+@node Include Files
+@section Names of Include Files
There are several schemes for dealing with include files: the
traditional @code{N_SOL} approach, Sun's @code{N_BINCL} approach, and the
@findex N_SOL
An @code{N_SOL} symbol specifies which include file subsequent symbols
-refer to. The @var{string} field is the name of the file and the
-@var{value} is the text address corresponding to the start of the
-previous include file and the start of this one. To specify the main
-source file again, use an @code{N_SOL} symbol with the name of the main
-source file.
+refer to. The string field is the name of the file and the value is the
+text address corresponding to the end of the previous include file and
+the start of this one. To specify the main source file again, use an
+@code{N_SOL} symbol with the name of the main source file.
@findex N_BINCL
@findex N_EINCL
@findex N_EXCL
The @code{N_BINCL} approach works as follows. An @code{N_BINCL} symbol
specifies the start of an include file. In an object file, only the
-@var{string} is significant; the Sun linker puts data into some of the
+string is significant; the Sun linker puts data into some of the
other fields. The end of the include file is marked by an
-@code{N_EINCL} symbol (which has no @var{string} field). In an object
+@code{N_EINCL} symbol (which has no string field). In an object
file, there is no significant data in the @code{N_EINCL} symbol; the Sun
linker puts data into some of the fields. @code{N_BINCL} and
@code{N_EINCL} can be nested.
directive, which generates a @code{C_BINCL} symbol. A @file{.ei}
directive, which generates a @code{C_EINCL} symbol, denotes the end of
the include file. Both directives are followed by the name of the
-source file in quotes, which becomes the @var{string} for the symbol.
-The @var{value} of each symbol, produced automatically by the assembler
+source file in quotes, which becomes the string for the symbol.
+The value of each symbol, produced automatically by the assembler
and linker, is the offset into the executable of the beginning
(inclusive, as you'd expect) or end (inclusive, as you would not expect)
of the portion of the COFF line table that corresponds to this include
file. @code{C_BINCL} and @code{C_EINCL} do not nest.
-@node Line numbers
-@section Line numbers
+@node Line Numbers
+@section Line Numbers
@findex N_SLINE
An @code{N_SLINE} symbol represents the start of a source line. The
-@var{desc} field contains the line number and the @var{value} field
+desc field contains the line number and the value
contains the code address for the start of that source line. On most
machines the address is absolute; for Sun's stabs-in-ELF, it is relative
to the function in which the @code{N_SLINE} symbol occurs.
to @code{N_SLINE} but are relocated differently by the linker. They
were intended to be used to describe the source location of a variable
declaration, but I believe that GCC2 actually puts the line number in
-the @var{desc} field of the stab for the variable itself. GDB has been
-ignoring these symbols (unless they contain a @var{string} field) since
+the desc field of the stab for the variable itself. GDB has been
+ignoring these symbols (unless they contain a string field) since
at least GDB 3.5.
For single source lines that generate discontiguous code, such as flow
the same source line. In this case there is a line number entry at the
start of each code range, each with the same line number.
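
To show how a debugger could use these entries, here is a minimal Python sketch that builds an address-to-line table from (desc, value) pairs taken from @code{N_SLINE} stabs and looks up the line for a program counter. It assumes absolute addresses, so function-relative values such as those in Sun's stabs-in-ELF would first need the function's start address added. The names @code{build_line_table} and @code{line_for_address} are hypothetical.

```python
import bisect


def build_line_table(sline_stabs):
    """sline_stabs: iterable of (desc, value) pairs, i.e. (line, address)."""
    return sorted((address, line) for line, address in sline_stabs)


def line_for_address(table, pc):
    """Return the source line whose code range contains pc, or None."""
    addresses = [address for address, _ in table]
    # Find the last entry that starts at or before pc.
    i = bisect.bisect_right(addresses, pc) - 1
    if i < 0:
        return None
    return table[i][1]
```

Because the table is keyed by address, a source line that generates two discontiguous code ranges simply appears twice, once per range.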
-XCOFF uses COFF line numbers, which are outside the scope of this
-document, ammeliorated by adequate marking of include files
-(@pxref{Include files}).
+XCOFF does not use stabs for line numbers. Instead, it uses COFF line
+numbers (which are outside the scope of this document). Standard COFF
+line numbers cannot deal with include files, but in XCOFF this is fixed
+with the @code{C_BINCL} method of marking include files (@pxref{Include
+Files}).
@node Procedures
@section Procedures
-@findex N_FUN
+@findex N_FUN, for functions
@findex N_FNAME
@findex N_STSYM, for functions (Sun acc)
@findex N_GSYM, for functions (Sun acc)
A function is represented by an @samp{F} symbol descriptor for a global
(extern) function, and @samp{f} for a static (local) function. The
-@var{value} field is the address of the start of the function (absolute
-for @code{a.out}; relative to the start of the file for Sun's
-stabs-in-ELF). The type information of the stab represents the return
-type of the function; thus @samp{foo:f5} means that foo is a function
-returning type 5. There is no need to try to get the line number of the
-start of the function from the stab for the function; it is in the next
+value is the address of the start of the function. For @code{a.out}, it
+is already relocated. For stabs in ELF, the SunPRO compiler version
+2.0.1 and GCC put out an address which gets relocated by the linker. In
+a future release SunPRO is planning to put out zero, in which case the
+address can be found from the ELF (non-stab) symbol. Because looking
+things up in the ELF symbols would probably be slow, I'm not sure how to
+find which symbol of that name is the right one, and this doesn't
+provide any way to deal with nested functions, it would probably be
+better to make the value of the stab an address relative to the start of
+the file. @xref{Stabs In ELF}, for more information on linker
+relocation of stabs in ELF files.
+
+The type information of the stab represents the return type of the
+function; thus @samp{foo:f5} means that foo is a function returning type
+5. There is no need to try to get the line number of the start of the
+function from the stab for the function; it is in the next
@code{N_SLINE} symbol.
@c FIXME: verify whether the "I suspect" below is true or not.
type of the function, followed by the arguments, each preceded by
@samp{;}, as in a stab with symbol descriptor @samp{f} or @samp{F}.
This use of symbol descriptor @samp{P} can be distinguished from its use
-for register parameters (@pxref{Register parameters}) by the fact that it has
+for register parameters (@pxref{Register Parameters}) by the fact that it has
symbol type @code{N_FUN}.
The AIX documentation also defines symbol descriptor @samp{J} as an
stabs describe the procedure's parameters, its block local variables, and
its block structure.
-@node Nested procedures
-@section Nested procedures
+@node Nested Procedures
+@section Nested Procedures
For any of the symbol descriptors representing procedures, after the
symbol descriptor and the type information is optionally a scope
.stabs "foo:F1",36,0,0,_foo
@end example
-@node Block structure
-@section Block structure
+@node Block Structure
+@section Block Structure
@findex N_LBRAC
@findex N_RBRAC
defined inside a block precede the @code{N_LBRAC} symbol for most
compilers, including GCC. Other compilers, such as the Convex, Acorn
RISC machine, and Sun @code{acc} compilers, put the variables after the
-@code{N_LBRAC} symbol. The @var{value} fields of the @code{N_LBRAC} and
+@code{N_LBRAC} symbol. The values of the @code{N_LBRAC} and
@code{N_RBRAC} symbols are the start and end addresses of the code of
the block, respectively. For most machines, they are relative to the
starting address of this source file. For the Gould NP1, they are
scope of a procedure are located after the @code{N_FUN} stab that
represents the procedure itself.
-Sun documents the @var{desc} field of @code{N_LBRAC} and
+Sun documents the desc field of @code{N_LBRAC} and
@code{N_RBRAC} symbols as containing the nesting level of the block.
-However, dbx seems to not care, and GCC always sets @var{desc} to
+However, dbx seems to not care, and GCC always sets desc to
zero.
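
Since the desc field cannot be relied on for the nesting level, a reader of the stabs can recover it by pairing the brackets. The Python sketch below does so with a stack. It takes (type, value) pairs and uses the conventional a.out type numbers 0xc0 for @code{N_LBRAC} and 0xe0 for @code{N_RBRAC}. The function name @code{block_ranges} is made up for the example.

```python
N_LBRAC = 0xC0  # 192
N_RBRAC = 0xE0  # 224


def block_ranges(stabs):
    """Return (nesting_level, start, end) for each block, in closing order."""
    open_blocks = []  # start values of blocks not yet closed
    result = []
    for stab_type, value in stabs:
        if stab_type == N_LBRAC:
            open_blocks.append(value)
        elif stab_type == N_RBRAC:
            if not open_blocks:
                raise ValueError("N_RBRAC without matching N_LBRAC")
            start = open_blocks.pop()
            # Blocks still open are the enclosing ones, so their count
            # is this block's nesting level (0 for an outermost block).
            result.append((len(open_blocks), start, value))
    if open_blocks:
        raise ValueError("N_LBRAC without matching N_RBRAC")
    return result
```

Stabs of any other type are skipped, so the full stab list for a function can be passed in unchanged.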
@node Constants
@item e @var{type-information} , @var{value}
Constant whose value can be represented as integral.
@var{type-information} is the type of the constant, as it would appear
-after a symbol descriptor (@pxref{String field}). @var{value} is the
+after a symbol descriptor (@pxref{String Field}). @var{value} is the
numeric value of the constant. GDB 4.9 does not actually get the right
value if @var{value} does not fit in a host @code{int}, but it does not
do anything violent, and future debuggers could be extended to accept
@item S @var{type-information} , @var{elements} , @var{bits} , @var{pattern}
Set constant. @var{type-information} is the type of the constant, as it
-would appear after a symbol descriptor (@pxref{String field}).
+would appear after a symbol descriptor (@pxref{String Field}).
@var{elements} is the number of elements in the set (does this mean
how many bits of @var{pattern} are actually used, which would be
redundant with the type, or perhaps the number of bits set in
statically, or as arguments to a function.
@menu
-* Stack variables:: Variables allocated on the stack.
-* Global variables:: Variables used by more than one source file.
-* Register variables:: Variables in registers.
-* Common blocks:: Variables statically allocated together.
+* Stack Variables:: Variables allocated on the stack.
+* Global Variables:: Variables used by more than one source file.
+* Register Variables:: Variables in registers.
+* Common Blocks:: Variables statically allocated together.
* Statics:: Variables local to one source file.
+* Based Variables:: Fortran pointer based variables.
* Parameters:: Variables for arguments to functions.
@end menu
-@node Stack variables
-@section Automatic variables allocated on the stack
+@node Stack Variables
+@section Automatic Variables Allocated on the Stack
If a variable's scope is local to a function and its lifetime is only as
long as that function executes (C calls such variables
@dfn{automatic}), it can be allocated in a register (@pxref{Register
-variables}) or on the stack.
+Variables}) or on the stack.
@findex N_LSYM
Each variable allocated on the stack has a stab with the symbol
guarantee that type descriptors are distinct from symbol descriptors.
Stabs for stack variables use the @code{N_LSYM} stab type.
-The @var{value} of the stab is the offset of the variable within the
+The value of the stab is the offset of the variable within the
local variables. On most machines this is an offset from the frame
pointer and is negative. The location of the stab specifies which block
-it is defined in; see @ref{Block structure}.
+it is defined in; see @ref{Block Structure}.
For example, the following C code:
@end example
@xref{Procedures}, for more information on the @code{N_FUN} stab, and
-@ref{Block structure} for more information on the @code{N_LBRAC} and
+@ref{Block Structure} for more information on the @code{N_LBRAC} and
@code{N_RBRAC} stabs.
-@node Global variables
-@section Global variables
+@node Global Variables
+@section Global Variables
@findex N_GSYM
A variable whose scope is not specific to just one source file is
represented by the @samp{G} symbol descriptor. These stabs use the
@code{N_GSYM} stab type. The type information for the stab
-(@pxref{String field}) gives the type of the variable.
+(@pxref{String Field}) gives the type of the variable.
For example, the following source code:
the @code{.global _g_foo} and @code{_g_foo:} lines tell the assembler to
produce an external symbol.
-@node Register variables
-@section Register variables
+@node Register Variables
+@section Register Variables
@findex N_RSYM
@c According to an old version of this manual, AIX uses C_RPSYM instead
@c of C_RSYM. I am skeptical; this should be verified.
Register variables have their own stab type, @code{N_RSYM}, and their
-own symbol descriptor, @samp{r}. The stab's @var{value} field contains the
+own symbol descriptor, @samp{r}. The stab's value is the
number of the register where the variable data will be stored.
@c .stabs "name:type",N_RSYM,0,RegSize,RegNumber (Sun doc)
then the stab may be emitted at the end of the object file, with
the other bss symbols.
-@node Common blocks
-@section Common blocks
+@node Common Blocks
+@section Common Blocks
A common block is a statically allocated section of memory which can be
referred to by several source files. It may contain several variables.
@findex N_BCOMM
@findex N_ECOMM
+@findex C_BCOMM
+@findex C_ECOMM
An @code{N_BCOMM} stab begins a common block and an @code{N_ECOMM} stab
ends it. The only field that is significant in these two stabs is the
-@var{string}, which names a normal (non-debugging) symbol that gives the
-address of the common block.
+string, which names a normal (non-debugging) symbol that gives the
+address of the common block. According to IBM documentation, only the
+@code{N_BCOMM} has the name of the common block (even though their
+compiler actually puts it both places).
@findex N_ECOML
-Each stab between the @code{N_BCOMM} and the @code{N_ECOMM} specifies a
-member of that common block; its @var{value} is the offset within the
-common block of that variable. The @code{N_ECOML} stab type is
-documented for this purpose, but Sun's Fortran compiler uses
-@code{N_GSYM} instead. The test case I looked at had a common block
-local to a function and it used the @samp{V} symbol descriptor; I assume
-one would use @samp{S} if not local to a function (that is, if a common
-block @emph{can} be anything other than local to a function).
+@findex C_ECOML
+The stabs for the members of the common block are between the
+@code{N_BCOMM} and the @code{N_ECOMM}; the value of each stab is the
+offset within the common block of that variable. IBM uses the
+@code{C_ECOML} stab type, and there is a corresponding @code{N_ECOML}
+stab type, but Sun's Fortran compiler uses @code{N_GSYM} instead. The
+variables within a common block use the @samp{V} symbol descriptor (I
+believe this is true of all Fortran variables). Other stabs (at least
+type declarations using @code{C_DECL}) can also be between the
+@code{N_BCOMM} and the @code{N_ECOMM}.
@node Statics
-@section Static variables
+@section Static Variables
Initialized static variables are represented by the @samp{S} and
@samp{V} symbol descriptors. @samp{S} means file scope static, and
@c find the variables)
@findex N_STSYM
@findex N_LCSYM
-In a.out files, @code{N_STSYM} means the data segment, @code{N_FUN}
-means the text segment, and @code{N_LCSYM} means the bss segment.
+@findex N_FUN, for variables
+@findex N_ROSYM
+In a.out files, @code{N_STSYM} means the data section, @code{N_FUN}
+means the text section, and @code{N_LCSYM} means the bss section. For
+those systems with a read-only data section separate from the text
+section (Solaris), @code{N_ROSYM} means the read-only data section.
For example, the source lines:
@end example
In XCOFF files, each symbol has a section number, so the stab type
-need not indicate the segment.
+need not indicate the section.
In ECOFF files, the storage class is used to specify the section, so the
-stab type need not indicate the segment.
+stab type need not indicate the section.
+
+In ELF files, for the SunPRO compiler version 2.0.1, symbol descriptor
+@samp{S} means that the address is absolute (the linker relocates it)
+and symbol descriptor @samp{V} means that the address is relative to the
+start of the relevant section for that compilation unit. SunPRO has
+plans to have the linker stop relocating stabs; I suspect that their
+debugger gets the address from the corresponding ELF (not stab) symbol.
+I'm not sure how to find which symbol of that name is the right one.
+The clean way to do all this would be to have the value of a symbol
+descriptor @samp{S} symbol be an offset relative to the start of the
+file, just like everything else, but that introduces obvious
+compatibility problems. For more information on linker stab relocation,
+see @ref{Stabs In ELF}.
+
+@node Based Variables
+@section Fortran Based Variables
-@c In ELF files, it apparently is a big mess. See kludge in dbxread.c
-@c in GDB. FIXME: Investigate where this kludge comes from.
-@c
-@c This is the place to mention N_ROSYM; I'd rather do so once I can
-@c coherently explain how this stuff works for stabs-in-ELF.
+Fortran (at least, the Sun and SGI dialects of FORTRAN-77) has a feature
+which allows allocating arrays with @code{malloc}, but which avoids
+blurring the line between arrays and pointers the way that C does. In
+stabs such a variable uses the @samp{b} symbol descriptor.
+
+For example, the Fortran declarations
+
+@example
+real foo, foo10(10), foo10_5(10,5)
+pointer (foop, foo)
+pointer (foo10p, foo10)
+pointer (foo105p, foo10_5)
+@end example
+
+produce the stabs
+
+@example
+foo:b6
+foo10:bar3;1;10;6
+foo10_5:bar3;1;5;ar3;1;10;6
+@end example
+
+In this example, @code{real} is type 6 and type 3 is an integral type
+which is the type of the subscripts of the array (probably
+@code{integer}).
+
+The @samp{b} symbol descriptor is like @samp{V} in that it denotes a
+statically allocated symbol whose scope is local to a function; see
+@ref{Statics}. The value of the symbol, instead of being the address
+of the variable itself, is the address of a pointer to that variable.
+So in the above example, the value of the @code{foo} stab is the address
+of a pointer to a real, the value of the @code{foo10} stab is the
+address of a pointer to a 10-element array of reals, and the value of
+the @code{foo10_5} stab is the address of a pointer to a 5-element array
+of 10-element arrays of reals.
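
The extra level of indirection can be sketched in Python as follows. The helper @code{locate_variable} and its @code{read_pointer} callback are hypothetical: the callback stands in for whatever a debugger uses to read a pointer-sized word from the target.

```python
def locate_variable(descriptor, stab_value, read_pointer):
    """Return the address of the variable's storage (illustrative only)."""
    if descriptor == "b":
        # Based variable: the stab value is the address of a pointer,
        # so the variable itself lives wherever that pointer points.
        return read_pointer(stab_value)
    # For descriptors such as V, the stab value is the variable's address.
    return stab_value
```

With target memory modelled as a mapping from address to stored word, passing that mapping's lookup function as @code{read_pointer} is enough to try the sketch out.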
@node Parameters
@section Parameters
@findex N_PSYM
Parameters passed on the stack use the symbol descriptor @samp{p} and
-the @code{N_PSYM} symbol type. The @var{value} of the symbol is an offset
+the @code{N_PSYM} symbol type. The value of the symbol is an offset
used to locate the parameter on the stack; its exact meaning is
machine-dependent, but on most machines it is an offset from the frame
pointer.
@code{argv} (type 20) is pointer to type 21.
@c FIXME: figure out what these mean and describe them coherently.
-The following are also said to go with @code{N_PSYM}:
+The following symbol descriptors are also said to go with @code{N_PSYM}.
+The value of the symbol is said to be an offset from the argument
+pointer (I'm not sure whether this is true or not).
@example
-"name" -> "param_name:#type"
- -> pP (<<??>>)
- -> pF Fortran function parameter
- -> X (function result variable)
- -> b (based variable)
-
-value -> offset from the argument pointer (positive).
+pP (<<??>>)
+pF Fortran function parameter
+X (function result variable)
@end example
@menu
-* Register parameters::
-* Local variable parameters::
-* Reference parameters::
-* Conformant arrays::
+* Register Parameters::
+* Local Variable Parameters::
+* Reference Parameters::
+* Conformant Arrays::
@end menu
-@node Register parameters
-@subsection Passing parameters in registers
+@node Register Parameters
+@subsection Passing Parameters in Registers
If the parameter is passed in a register, then traditionally there are
two symbols for each argument:
Because that approach is kind of ugly, some compilers use symbol
descriptor @samp{P} or @samp{R} to indicate an argument which is in a
register. Symbol type @code{C_RPSYM} is used with @samp{R} and
-@code{N_RSYM} is used with @samp{P}. The symbol @var{value} field is
+@code{N_RSYM} is used with @samp{P}. The symbol's value is
the register number. @samp{P} and @samp{R} mean the same thing; the
difference is that @samp{P} is a GNU invention and @samp{R} is an IBM
(XCOFF) invention. As of version 4.9, GDB should handle either one.
access"; I don't know the source for this information), but I don't know
details or what compilers or debuggers use it, if any (not GDB or GCC).
It is not clear to me whether this case needs to be dealt with
-differently than parameters passed by reference (@pxref{Reference parameters}).
+differently than parameters passed by reference (@pxref{Reference Parameters}).
-@node Local variable parameters
-@subsection Storing parameters as local variables
+@node Local Variable Parameters
+@subsection Storing Parameters as Local Variables
There is a case similar to an argument in a register, which is an
argument that is actually stored as a local variable. Sometimes this
stores it as a local variable. If possible, the compiler should claim
that it's in a register, but this isn't always done.
+If a parameter is passed as one type and converted to a smaller type by
+the prologue (for example, the parameter is declared as a @code{float},
+but the calling conventions specify that it is passed as a
+@code{double}), then GCC2 (sometimes) uses a pair of symbols. The first
+symbol uses symbol descriptor @samp{p} and the type which is passed.
+The second symbol has the type and location which the parameter actually
+has after the prologue. For example, suppose the following C code
+appears with no prototypes involved:
+
+@example
+void
+subr (f)
+ float f;
+@{
+@end example
+
+if @code{f} is passed as a double at stack offset 8, and the prologue
+converts it to a float in register number 0, then the stabs look like:
+
+@example
+.stabs "f:p13",160,0,3,8 # @r{160 is @code{N_PSYM}, here 13 is @code{double}}
+.stabs "f:r12",64,0,3,0 # @r{64 is @code{N_RSYM}, here 12 is @code{float}}
+@end example
+
+In both stabs 3 is the line number where @code{f} is declared
+(@pxref{Line Numbers}).
+
@findex N_LSYM, for parameter
-Some compilers use the pair of symbols approach described above
-(@samp{@var{arg}:p} followed by @samp{@var{arg}:}); this includes GCC1
-(not GCC2) on the sparc when passing a small structure and GCC2
-(sometimes) when the argument type is @code{float} and it is passed as a
-@code{double} and converted to @code{float} by the prologue (in the
-latter case the type of the @samp{@var{arg}:p} symbol is @code{double}
-and the type of the @samp{@var{arg}:} symbol is @code{float}). GCC, at
-least on the 960, uses a single @samp{p} symbol descriptor for an
-argument which is stored as a local variable but uses @code{N_LSYM}
-instead of @code{N_PSYM}. In this case, the @var{value} of the symbol
-is an offset relative to the local variables for that function, not
-relative to the arguments; on some machines those are the same thing,
-but not on all.
-
-@node Reference parameters
-@subsection Passing parameters by reference
+GCC, at least on the 960, has another solution to the same problem. It
+uses a single @samp{p} symbol descriptor for an argument which is stored
+as a local variable but uses @code{N_LSYM} instead of @code{N_PSYM}. In
+this case, the value of the symbol is an offset relative to the local
+variables for that function, not relative to the arguments; on some
+machines those are the same thing, but not on all.
+
+@c This is mostly just background info; the part that logically belongs
+@c here is the last sentence.
+On the VAX or on other machines in which the calling convention includes
+the number of words of arguments actually passed, the debugger (GDB at
+least) uses the parameter symbols to keep track of whether it needs to
+print nameless arguments in addition to the formal parameters which it
+has printed because each one has a stab. For example, in
+
+@example
+extern int fprintf (FILE *stream, char *format, @dots{});
+@dots{}
+fprintf (stdout, "%d\n", x);
+@end example
+
+there are stabs for @code{stream} and @code{format}. On most machines,
+the debugger can only print those two arguments (because it has no way
+of knowing that additional arguments were passed), but on the VAX or
+other machines with a calling convention which indicates the number of
+words of arguments, the debugger can print all three arguments. To do
+so, the parameter symbol (symbol descriptor @samp{p}) needs to contain
+the actual type as passed (for example, @code{double} not @code{float}
+if it is passed as a double and converted to a float). Symbols with
+descriptor @samp{r}, or with the symbol descriptor omitted, need not.
+
+@node Reference Parameters
+@subsection Passing Parameters by Reference
If the parameter is passed by reference (e.g., Pascal @code{VAR}
parameters), then the symbol descriptor is @samp{v} if it is in the
respectively. I believe @samp{a} is an AIX invention; @samp{v} is
supported by all stabs-using systems as far as I know.
-@node Conformant arrays
-@subsection Passing conformant array parameters
+@node Conformant Arrays
+@subsection Passing Conformant Array Parameters
@c Is this paragraph correct? It is based on piecing together patchy
@c information and some guesswork
languages, in which the size of an array parameter is not known to the
called function until run-time. Such parameters have two stabs: a
@samp{x} for the array itself, and a @samp{C}, which represents the size
-of the array. The @var{value} of the @samp{x} stab is the offset in the
+of the array. The value of the @samp{x} stab is the offset in the
argument list where the address of the array is stored (is this right?
-it is a guess); the @var{value} of the @samp{C} stab is the offset in the
+it is a guess); the value of the @samp{C} stab is the offset in the
argument list where the size of the array (in elements? in bytes?) is
stored.
@node Types
-@chapter Defining types
+@chapter Defining Types
The examples so far have described types as references to previously
defined types, or defined in terms of subranges of or pointers to
descriptors that may follow the @samp{=} in a type definition.
@menu
-* Builtin types:: Integers, floating point, void, etc.
-* Miscellaneous types:: Pointers, sets, files, etc.
-* Cross-references:: Referring to a type not yet defined.
+* Builtin Types:: Integers, floating point, void, etc.
+* Miscellaneous Types:: Pointers, sets, files, etc.
+* Cross-References:: Referring to a type not yet defined.
* Subranges:: A type with a specific range.
* Arrays:: An aggregate type of same-typed elements.
* Strings:: Like an array but also has a length.
* Structures:: An aggregate type of different-typed elements.
* Typedefs:: Giving a type a name.
* Unions:: Different types sharing storage.
-* Function types::
+* Function Types::
@end menu
-@node Builtin types
-@section Builtin types
+@node Builtin Types
+@section Builtin Types
Certain types are built in (@code{int}, @code{short}, @code{void},
@code{float}, etc.); the debugger recognizes these types and knows how
formats. The following sections describe each of these formats.
@menu
-* Traditional builtin types:: Put on your seatbelts and prepare for kludgery
-* Builtin type descriptors:: Builtin types with special type descriptors
-* Negative type numbers:: Builtin types using negative type numbers
+* Traditional Builtin Types:: Put on your seatbelts and prepare for kludgery
+* Builtin Type Descriptors:: Builtin types with special type descriptors
+* Negative Type Numbers:: Builtin types using negative type numbers
@end menu
-@node Traditional builtin types
-@subsection Traditional builtin types
+@node Traditional Builtin Types
+@subsection Traditional Builtin Types
This is the traditional, convoluted method for defining builtin types.
There are several classes of such type definitions: integer, floating
point, and @code{void}.
@menu
-* Traditional integer types::
-* Traditional other types::
+* Traditional Integer Types::
+* Traditional Other Types::
@end menu
-@node Traditional integer types
-@subsubsection Traditional integer types
+@node Traditional Integer Types
+@subsubsection Traditional Integer Types
Often types are defined as subranges of themselves. If the bounding values
fit within an @code{int}, then they are given normally. For example:
.stabs "unsigned int:t4=r1;0;-1;",128,0,0,0
@end example
-For larger types, GCC 2.4.5 puts out bounds in octal, with a leading 0.
-In this case a negative bound consists of a number which is a 1 bit
-followed by a bunch of 0 bits, and a positive bound is one in which a
-bunch of bits are 1. All known versions of dbx and GDB version 4 accept
-this, but GDB 3.5 refuses to read the whole file containing such
-symbols. So GCC 2.3.3 did not output the proper size for these types.
-@c FIXME: How about an example?
+For larger types, GCC 2.4.5 puts out bounds in octal, with one or more
+leading zeroes. In this case a negative bound consists of a number
+which is a 1 bit (for the sign bit) followed by a 0 bit for each bit in
+the number (except the sign bit), and a positive bound is one which is a
+1 bit for each bit in the number (except possibly the sign bit). All
+known versions of dbx and GDB version 4 accept this (at least in the
+sense of not refusing to process the file), but GDB 3.5 refuses to read
+the whole file containing such symbols. So GCC 2.3.3 did not output the
+proper size for these types. As an example of octal bounds, the string
+fields of the stabs for 64 bit integer types look like:
+
+@c .stabs directives, etc., omitted to make it fit on the page.
+@example
+long int:t3=r1;001000000000000000000000;000777777777777777777777;
+long unsigned int:t5=r1;000000000000000000000000;001777777777777777777777;
+@end example
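
The rule for octal bounds can be sketched in Python as follows. This is only an illustration of the description above, not the algorithm any debugger actually uses. The function names @code{split_range} and @code{classify_octal_range} are invented here, and the sketch assumes both bounds are given in octal.

```python
def split_range(string_field):
    """'long int:t3=r1;LOWER;UPPER;' -> ('long int', LOWER, UPPER)."""
    name, _, rest = string_field.partition(":")
    _, _, range_part = rest.partition("=r")
    _base, lower, upper = range_part.split(";")[:3]
    return name, lower, upper


def classify_octal_range(lower_text, upper_text):
    """Guess signedness and width in bits from a pair of octal bounds."""
    lower = int(lower_text, 8)
    upper = int(upper_text, 8)
    if lower == 0 and upper > 0 and (upper & (upper + 1)) == 0:
        # Upper bound has a 1 bit for every bit in the type: unsigned.
        return ("unsigned", upper.bit_length())
    if lower > 0 and (lower & (lower - 1)) == 0 and upper == lower - 1:
        # Lower bound is a lone sign bit; upper bound sets every other bit.
        return ("signed", lower.bit_length())
    return ("unknown", None)


# The two string fields shown in the example above.
for field in (
    "long int:t3=r1;001000000000000000000000;000777777777777777777777;",
    "long unsigned int:t5=r1;000000000000000000000000;001777777777777777777777;",
):
    name, lower, upper = split_range(field)
    print(name, classify_octal_range(lower, upper))
```

Leading zeroes do not affect the result, since only the position of the highest 1 bit is used to determine the width.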
If the lower bound of a subrange is 0 and the upper bound is negative,
the type is an unsigned integral type whose size in bytes is the
subrange, the type should be a subrange of itself. I'm not sure whether
this is the case for Convex.
-@node Traditional other types
-@subsubsection Traditional other types
+@node Traditional Other Types
+@subsubsection Traditional Other Types
If the upper bound of a subrange is 0 and the lower bound is positive,
the type is a floating point type, and the lower bound of the subrange
I'm not sure how a boolean type is represented.
-@node Builtin type descriptors
-@subsection Defining builtin types using builtin type descriptors
+@node Builtin Type Descriptors
+@subsection Defining Builtin Types Using Builtin Type Descriptors
This is the method used by Sun's @code{acc} for defining builtin types.
These are the type descriptors to define builtin types:
of bits in the type.
Note that type descriptor @samp{b} used for builtin types conflicts with
-its use for Pascal space types (@pxref{Miscellaneous types}); they can
+its use for Pascal space types (@pxref{Miscellaneous Types}); they can
be distinguished because the character following the type descriptor
will be a digit, @samp{(}, or @samp{-} for a Pascal space type, or
@samp{u} or @samp{s} for a builtin type.
@item w
Documented by AIX to define a wide character type, but their compiler
-actually uses negative type numbers (@pxref{Negative type numbers}).
+actually uses negative type numbers (@pxref{Negative Type Numbers}).
@item R @var{fp-type} ; @var{bytes} ;
Define a floating point type. @var{fp-type} has one of the following values:
@item g @var{type-information} ; @var{nbits}
Documented by AIX to define a floating type, but their compiler actually
-uses negative type numbers (@pxref{Negative type numbers}).
+uses negative type numbers (@pxref{Negative Type Numbers}).
@item c @var{type-information} ; @var{nbits}
Documented by AIX to define a complex type, but their compiler actually
-uses negative type numbers (@pxref{Negative type numbers}).
+uses negative type numbers (@pxref{Negative Type Numbers}).
@end table
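
The rule for telling apart the two uses of type descriptor @samp{b}, as noted in the table above, can be sketched in Python. The function name @code{classify_b} and the returned strings are invented for this illustration; the sketch looks only at the character following the descriptor and makes no attempt to parse the rest of the definition.

```python
def classify_b(type_def):
    """Decide which meaning of type descriptor 'b' a definition uses."""
    if len(type_def) < 2 or not type_def.startswith("b"):
        raise ValueError("not a 'b' type definition: %r" % type_def)
    following = type_def[1]
    if following in "us":
        # 'u' or 's' marks a builtin integral type.
        return "builtin integral type"
    if following.isdigit() or following in "(-":
        # A digit, '(' or '-' begins type information: Pascal space type.
        return "Pascal space type"
    return "unknown"


print(classify_b("bs4;0;32;"))
print(classify_b("b-1;4"))
```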
The C @code{void} type is defined as a signed integral type 0 bits long:
I'm not sure how a boolean type is represented.
-@node Negative type numbers
-@subsection Negative type numbers
+@node Negative Type Numbers
+@subsection Negative Type Numbers
This is the method used in XCOFF for defining builtin types.
Since the debugger knows about the builtin types anyway, the idea of
negative type numbers is simply to give a special type number which
indicates the builtin type. There is no stab defining these types.
-I'm not sure whether anyone has tried to define what this means if
-@code{int} can be other than 32 bits (or if other types can be other than
-their customary size). If @code{int} has exactly one size for each
-architecture, then it can be handled easily enough, but if the size of
-@code{int} can vary according the compiler options, then it gets hairy.
-The best way to do this would be to define separate negative type
-numbers for 16-bit @code{int} and 32-bit @code{int}; therefore I have
-indicated below the customary size (and other format information) for
-each type. The information below is currently correct because AIX on
-the RS6000 is the only system which uses these type numbers. If these
-type numbers start to get used on other systems, I suspect the correct
-thing to do is to define a new number in cases where a type does not
-have the size and format indicated below (or avoid negative type numbers
-in these cases).
-
-Part of the definition of the negative type number is
-the name of the type. Types with identical size and format but
-different names have different negative type numbers.
+There are several subtle issues with negative type numbers.
+
+One is the size of the type. A builtin type (for example the C types
+@code{int} or @code{long}) might have different sizes depending on
+compiler options, the target architecture, the ABI, etc. This issue
+doesn't come up for IBM tools since (so far) they just target the
+RS/6000; the sizes indicated below for each type are what the IBM
+RS/6000 tools use. To deal with differing sizes, either define separate
+negative type numbers for each size (which works but requires changing
+the debugger, and, unless you get both AIX dbx and GDB to accept the
+change, introduces an incompatibility), or use a type attribute
+(@pxref{String Field}) to define a new type with the appropriate size
+(which merely requires a debugger which understands type attributes,
+like AIX dbx). For example,
+
+@example
+.stabs "boolean:t10=@@s8;-16",128,0,0,0
+@end example
+
+defines an 8-bit boolean type, and
+
+@example
+.stabs "boolean:t10=@@s64;-16",128,0,0,0
+@end example
+
+defines a 64-bit boolean type.
+
+A similar issue is the format of the type. This comes up most often for
+floating-point types, which could have various formats (particularly
+extended doubles, which vary quite a bit even among IEEE systems).
+Again, it is best to define a new negative type number for each
+different format; changing the format based on the target system has
+various problems. One such problem is that the Alpha has both VAX and
+IEEE floating types. One can easily imagine one library using the VAX
+types and another library in the same executable using the IEEE types.
+Another example is that the interpretation of whether a boolean is true
+or false can be based on the least significant bit, most significant
+bit, whether it is zero, etc., and different compilers (or different
+options to the same compiler) might provide different kinds of boolean.
+
+The last major issue is the names of the types. The name of a given
+type depends @emph{only} on the negative type number given; these do not
+vary depending on the language, the target system, or anything else.
+One can always define separate type numbers---in the following list you
+will see for example separate @code{int} and @code{integer*4} types
+which are identical except for the name. But compatibility can be
+maintained by not inventing new negative type numbers and instead just
+defining a new type with a new name. For example:
+
+@example
+.stabs "CARDINAL:t10=-8",128,0,0,0
+@end example
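
To make the shape of these type definitions concrete, here is a small Python sketch that pulls apart the string field of such a stab. The names @code{parse_type_stab} and @code{size_in_bits} are invented for this example, and it handles only the simple forms shown in this section. Note that the string as the debugger sees it contains a single @samp{@@}; the doubling is only Texinfo quoting.

```python
def parse_type_stab(string_field):
    """Split 'NAME:tNUM=[@attr;...]TYPE' into its parts."""
    name, _, rest = string_field.partition(":")
    if not rest.startswith("t"):
        raise ValueError("expected symbol descriptor 't'")
    number, _, definition = rest[1:].partition("=")
    attributes = []
    # '@' followed by a digit, '(' or '-' is a type descriptor rather than
    # a type attribute, so it is left in the definition untouched.
    while (len(definition) > 1 and definition[0] == "@"
           and not (definition[1].isdigit() or definition[1] in "(-")):
        attribute, _, definition = definition[1:].partition(";")
        attributes.append(attribute)
    return name, int(number), attributes, definition


def size_in_bits(attributes):
    """Return the size given by an 's' attribute, or None if absent."""
    for attribute in attributes:
        if attribute.startswith("s"):
            return int(attribute[1:])
    return None


name, number, attributes, base = parse_type_stab("boolean:t10=@s8;-16")
print(name, number, size_in_bits(attributes), base)
```

A definition with no attributes, such as the @code{CARDINAL} example, simply yields an empty attribute list and the negative type number as the base type.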
+
+Here is the list of negative type numbers. The phrase @dfn{integral
+type} is used to mean twos-complement (I strongly suspect that all
+machines which use stabs use twos-complement; most machines use
+twos-complement these days).
@table @code
@item -1
Unicode?).
@end table
-@node Miscellaneous types
-@section Miscellaneous types
+@node Miscellaneous Types
+@section Miscellaneous Types
@table @code
@item b @var{type-information} ; @var{bytes}
Pascal space type. This is documented by IBM; what does it mean?
This use of the @samp{b} type descriptor can be distinguished
-from its use for builtin integral types (@pxref{Builtin type
-descriptors}) because the character following the type descriptor is
+from its use for builtin integral types (@pxref{Builtin Type
+Descriptors}) because the character following the type descriptor is
always a digit, @samp{(}, or @samp{-}.
@item B @var{type-information}
Multiple instance type. The type seems to be composed of @var{length}
repetitions of @var{type-information}, for example @code{character*3} is
represented by @samp{M-2;3}, where @samp{-2} is a reference to a
-character type (@pxref{Negative type numbers}). I'm not sure how this
+character type (@pxref{Negative Type Numbers}). I'm not sure how this
differs from an array. This appears to be a Fortran feature.
@var{length} is a bound, like those in range types; see @ref{Subranges}.
Pointer to @var{type-information}.
@end table
-@node Cross-references
-@section Cross-references to other types
+@node Cross-References
+@section Cross-References to Other Types
A type can be used before it is defined; one common way to deal with
that situation is just to use a type reference to a type which has not
that it identifies the module; I don't understand whether the name of
the type given here is always just the same as the name we are giving
it, or whether this type descriptor is used with a nameless stab
-(@pxref{String field}), or what. The symbol ends with @samp{;}.
+(@pxref{String Field}), or what. The symbol ends with @samp{;}.
@node Subranges
-@section Subrange types
+@section Subrange Types
The @samp{r} type descriptor defines a type as a subrange of another
type. It is followed by type information for the type of which it is a
There is no bound.
@end table
-Subranges are also used for builtin types; see @ref{Traditional builtin types}.
+Subranges are also used for builtin types; see @ref{Traditional Builtin Types}.
@node Arrays
-@section Array types
+@section Array Types
Arrays use the @samp{a} type descriptor. Following the type descriptor
is the type of the index and the type of the array elements. If the
It is well established, and widely used, that the type of the index,
unlike most types found in the stabs, is merely a type definition, not
-type information (@pxref{String field}) (that is, it need not start with
+type information (@pxref{String Field}) (that is, it need not start with
@samp{@var{type-number}=} if it is defining a new type). According to a
comment in GDB, this is also true of the type of the array elements; it
gives @samp{ar1;1;10;ar1;1;10;4} as a legitimate way to express a two
example, an array of 3-byte objects might, if unpacked, have each
element aligned on a 4-byte boundary, but if packed, have no padding.
One way to specify that something is packed is with type attributes
-(@pxref{String field}). In the case of arrays, another is to use the
+(@pxref{String Field}). In the case of arrays, another is to use the
@samp{P} type descriptor instead of @samp{a}. Other than specifying a
packed array, @samp{P} is identical to @samp{a}.
is determined by the architecture (normally all enumerations types are
32 bits). There should be a way to specify an enumeration type of
another size; type attributes would be one way to do this. @xref{Stabs
-format}.
+Format}.
@node Structures
@section Structures
contains a type definition for an element which is a pointer to type 16.
@node Typedefs
-@section Giving a type a name
+@section Giving a Type a Name
To give a type a name, use the @samp{t} symbol descriptor. The type
-is specified by the type information (@pxref{String field}) for the stab.
+is specified by the type information (@pxref{String Field}) for the stab.
For example,
@example
AIX provides a type descriptor to specify it. The type descriptor is
@samp{o} and is followed by a name. I don't know what the name
means---is it always the same as the name of the type, or is this type
-descriptor used with a nameless stab (@pxref{String field})? There
+descriptor used with a nameless stab (@pxref{String Field})? There
optionally follows a comma followed by type information which defines
the type of this type. If omitted, a semicolon is used in place of the
comma and the type information, and the type is much like a generic
@end example
@samp{-20} specifies where the variable is stored (@pxref{Stack
-variables}).
+Variables}).
-@node Function types
-@section Function types
+@node Function Types
+@section Function Types
Various types can be defined for function variables. These types are
not used in defining functions (@pxref{Procedures}); they are used for
The variable defines a new type, 24, which is a pointer to another new
type, 25, which is a function returning @code{int}.
-@node Symbol tables
-@chapter Symbol information in symbol tables
+@node Symbol Tables
+@chapter Symbol Information in Symbol Tables
This chapter describes the format of symbol table entries
and how stab assembler directives map to them. It also describes the
transformations that the assembler and linker make on data from stabs.
@menu
-* Symbol table format::
-* Transformations on symbol tables::
+* Symbol Table Format::
+* Transformations On Symbol Tables::
@end menu
-@node Symbol table format
-@section Symbol table format
+@node Symbol Table Format
+@section Symbol Table Format
Each time the assembler encounters a stab directive, it puts
each field of the stab into a corresponding field in a symbol table
-entry of its output file. If the stab contains a @var{string} field, the
+entry of its output file. If the stab contains a string field, the
symbol table entry for that stab points to a string table entry
containing the string data from the stab. Assembler labels become
relocatable addresses. Symbol table entries in a.out have the format:
@};
@end example
-For @code{.stabs} directives, the @code{n_strx} field holds the offset
-in bytes from the start of the string table to the string table entry
-containing the @var{string} field. For other classes of stabs
-(@code{.stabn} and @code{.stabd}) this field is zero.
+If the stab has a string, the @code{n_strx} field holds the offset in
+bytes of the string within the string table. The string is terminated
+by a NUL character. If the stab lacks a string (for example, it was
+produced by a @code{.stabn} or @code{.stabd} directive), the
+@code{n_strx} field is zero.
Symbol table entries with @code{n_type} field values greater than 0x1f
originated as stabs generated by the compiler (with one random
exception). The other entries were placed in the symbol table of the
executable by the assembler or the linker.
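
As a sketch of how a tool might read such an entry, the Python fragment below unpacks one symbol table entry and looks up its string. It assumes a 32-bit little-endian a.out layout of 12 bytes per entry (@code{n_strx}, @code{n_type}, @code{n_other}, @code{n_desc}, @code{n_value}); other hosts may differ in byte order or field width. The names @code{NLIST} and @code{read_entry} are invented for this example.

```python
import struct

# Assumed layout: 4-byte n_strx, 1-byte n_type, 1-byte n_other,
# 2-byte n_desc, 4-byte n_value, little-endian, no padding.
NLIST = struct.Struct("<IBbhI")


def read_entry(symtab, index, strtab):
    """Decode entry number `index` of a raw symbol table."""
    n_strx, n_type, n_other, n_desc, n_value = NLIST.unpack_from(
        symtab, index * NLIST.size)
    string = None
    if n_strx != 0:
        # The string runs from offset n_strx up to the terminating NUL.
        end = strtab.index(b"\0", n_strx)
        string = strtab[n_strx:end].decode("ascii")
    return {
        "string": string,
        "n_type": n_type,
        "n_other": n_other,
        "n_desc": n_desc,
        "n_value": n_value,
        # Types above 0x1f originated as stabs (with one exception).
        "is_stab": n_type > 0x1f,
    }


# Hypothetical data: one stab with a string, one without.
strtab = b"\0\0\0\0" + b"g_foo:G2\0"
symtab = NLIST.pack(4, 0x20, 0, 0, 0) + NLIST.pack(0, 0x44, 0, 12, 0x100)
print(read_entry(symtab, 0, strtab))
print(read_entry(symtab, 1, strtab))
```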
-@node Transformations on symbol tables
-@section Transformations on symbol tables
+@node Transformations On Symbol Tables
+@section Transformations on Symbol Tables
The linker concatenates object files and does fixups of externally
defined symbols.
low 5 bits are @code{N_ABS}, which tells the linker not to relocate the
value.
-Where the @var{value} field of a stab contains an assembly language label,
+Where the value of a stab contains an assembly language label,
it is transformed by each build step. The assembler turns it into a
relocatable address and the linker turns it into an absolute address.
@menu
-* Transformations on static variables::
-* Transformations on global variables::
+* Transformations On Static Variables::
+* Transformations On Global Variables::
+* ELF Transformations:: In ELF, things are a bit different.
@end menu
-@node Transformations on static variables
-@subsection Transformations on static variables
+@node Transformations On Static Variables
+@subsection Transformations on Static Variables
This source line defines a static variable at file scope:
0000e00c - 00 0000 STSYM s_g_repeat:S1
@end example
-@node Transformations on global variables
-@subsection Transformations on global variables
+@node Transformations On Global Variables
+@subsection Transformations on Global Variables
Stabs for global variables do not contain location information. In
this case, the debugger finds location information in the assembler or
file (see below). The first one originated as a stab. The second one
is an external symbol. The upper case @samp{D} signifies that the
@code{n_type} field of the symbol table contains 7, @code{N_DATA} with
-local linkage. The @var{value} field is empty for the stab entry. For
-the linker symbol, it contains the relocatable address corresponding to
-the variable.
+external linkage. The stab's value is zero since the value is not used for
+@code{N_GSYM} stabs. The value of the linker symbol is the relocatable
+address corresponding to the variable.
@example
00000000 - 00 0000 GSYM g_foo:G2
0000e008 D _g_foo
@end example
+@node ELF Transformations
+@subsection Transformations of Stabs in ELF Files
+
+For ELF files, use @code{objdump --stabs} instead of @code{nm} to show
+the stabs in an object or executable file. @code{objdump} is a GNU
+utility; Sun does not provide any equivalent.
+
+The following example is for a stab whose value is an address
+relative to the compilation unit (@pxref{Stabs In ELF}). For example,
+if the source line
+
+@example
+static int ld = 5;
+@end example
+
+appears within a function, then the assembly language output from the
+compiler contains:
+
+@example
+.Ddata.data:
+@dots{}
+ .stabs "ld:V(0,3)",0x26,0,4,.L18-Ddata.data # @r{0x26 is N_STSYM}
+@dots{}
+.L18:
+ .align 4
+ .word 0x5
+@end example
+
+Because the value is formed by subtracting one symbol from another, the
+value is absolute, not relocatable, and so the object file contains
+
+@example
+Symnum n_type n_othr n_desc n_value n_strx String
+31 STSYM 0 4 00000004 680 ld:V(0,3)
+@end example
+
+without any relocations, and the executable file also contains
+
+@example
+Symnum n_type n_othr n_desc n_value n_strx String
+31 STSYM 0 4 00000004 680 ld:V(0,3)
+@end example
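
For scripts that post-process this listing, a row can be split by column as in the Python sketch below. The function name @code{parse_objdump_row} is invented here, and the sketch assumes the column layout shown in the listings above, with @code{n_value} in hexadecimal and the other numeric columns in decimal. Real @code{objdump} output may contain other rows that it does not handle.

```python
def parse_objdump_row(row):
    """Split one data row of the listing above into named fields."""
    fields = row.split(None, 6)
    if len(fields) == 6:
        # A stab without a string has nothing in the last column.
        fields.append("")
    symnum, n_type, n_othr, n_desc, n_value, n_strx, string = fields
    return {
        "symnum": int(symnum),
        "n_type": n_type,
        "n_othr": int(n_othr),
        "n_desc": int(n_desc),
        "n_value": int(n_value, 16),
        "n_strx": int(n_strx),
        "string": string.rstrip(),
    }


row = parse_objdump_row("31     STSYM  0      4      00000004 680    ld:V(0,3)")
print(row)
```

Comparing the parsed @code{n_value} from the object file with that from the executable confirms that the linker left the value unchanged.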
+
@node Cplusplus
-@chapter GNU C++ stabs
+@chapter GNU C++ Stabs
@menu
-* Basic cplusplus types::
-* Simple classes::
-* Class instance::
+* Basic Cplusplus Types::
+* Simple Classes::
+* Class Instance::
* Methods:: Method definition
* Protections::
-* Method modifiers::
-* Virtual methods::
+* Method Modifiers::
+* Virtual Methods::
* Inheritence::
-* Virtual base classes::
-* Static members::
+* Virtual Base Classes::
+* Static Members::
@end menu
Type descriptors added for C++ descriptions:
gibberish. Can anyone say what really goes here?).
Note that there is a conflict between this and type attributes
-(@pxref{String field}); both use type descriptor @samp{@@}.
+(@pxref{String Field}); both use type descriptor @samp{@@}.
Fortunately, the @samp{@@} type descriptor used in this C++ sense always
will be followed by a digit, @samp{(}, or @samp{-}, and type attributes
never start with those things.
@end table
-@node Basic cplusplus types
-@section Basic types for C++
+@node Basic Cplusplus Types
+@section Basic Types For C++
<< the examples that follow are based on a01.C >>
.stabs "$vtbl_ptr_type:T17",128,0,0,0
@end example
-@node Simple classes
-@section Simple class definition
+@node Simple Classes
+@section Simple Class Definition
The stabs describing C++ language features are an extension of the
stabs describing C. Stabs representing C++ class types elaborate
.stabs "baseA:T20",128,0,0,0
@end smallexample
-@node Class instance
-@section Class instance
+@node Class Instance
+@section Class Instance
As shown above, describing even a simple C++ class definition is
accomplished by massively extending the stab format used in C to
@end example
@node Methods
-@section Method defintion
+@section Method Definition
The class definition shown above declares Ameth. The C++ source below
defines Ameth:
pubMeth::24=##12;:f;2A.;;",128,0,0,0
@end smallexample
-@node Method modifiers
-@section Method modifiers (@code{const}, @code{volatile}, @code{const volatile})
+@node Method Modifiers
+@section Method Modifiers (@code{const}, @code{volatile}, @code{const volatile})
<< based on a6.C >>
ConstVolMeth::23=##12;:f;2D.;;",128,0,0,0
@end example
-@node Virtual methods
-@section Virtual methods
+@node Virtual Methods
+@section Virtual Methods
<< The following examples are based on a4.C >>
28;;D_virt::32:i;2A*-2147483646;31;;;~%20;",128,0,0,0
@end smallexample
-@node Virtual base classes
-@section Virtual base classes
+@node Virtual Base Classes
+@section Virtual Base Classes
A derived class object consists of a concatenation in memory of the data
areas defined by each base class, starting with the leftmost and ending
virtual base pointer for @code{B} at 128, and @code{Ddat} at 160.
-@node Static members
-@section Static members
+@node Static Members
+@section Static Members
The data area for a class is a concatenation of the space used by the
data members of the class. If the class has virtual methods, a vtable
<< How is this reflected in stabs? See Cygnus bug #677 for some info. >>
-@node Stab types
-@appendix Table of stab types
+@node Stab Types
+@appendix Table of Stab Types
-The following are all the possible values for the stab @var{type} field, for
-@code{a.out} files, in numeric order. This does not apply to XCOFF.
+The following are all the possible values for the stab type field, for
+@code{a.out} files, in numeric order. This does not apply to XCOFF, but
+it does apply to stabs in ELF. Stabs in ECOFF use these values but add
+0x8f300 to distinguish them from non-stab symbols.
The symbolic names are defined in the file @file{include/aout/stabs.def}.
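
The ECOFF offset mentioned above amounts to simple arithmetic, sketched here in Python. The constant and function names are invented for this illustration, and the sketch does not say anything about which ECOFF field carries the resulting number.

```python
ECOFF_STAB_OFFSET = 0x8f300  # offset given in the text above


def to_ecoff(stab_type):
    """Map an a.out stab type value to its ECOFF form."""
    return stab_type + ECOFF_STAB_OFFSET


def from_ecoff(value):
    """Recover the a.out stab type value from its ECOFF form."""
    return value - ECOFF_STAB_OFFSET


print(hex(to_ecoff(0x20)))
```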
@menu
-* Non-stab symbol types:: Types from 0 to 0x1f
-* Stab symbol types:: Types from 0x20 to 0xff
+* Non-Stab Symbol Types:: Types from 0 to 0x1f
+* Stab Symbol Types:: Types from 0x20 to 0xff
@end menu
-@node Non-stab symbol types
-@appendixsec Non-stab symbol types
+@node Non-Stab Symbol Types
+@appendixsec Non-Stab Symbol Types
The following types are used by the linker and assembler, not by stab
directives. Since this document does not attempt to describe aspects of
File name of a @file{.o} file
@end table
-@node Stab symbol types
-@appendixsec Stab symbol types
+@node Stab Symbol Types
+@appendixsec Stab Symbol Types
The following symbol types indicate that this is a stab. This is the
full list of stab numbers, including stab types that are used in
@table @code
@item 0x20 N_GSYM
-Global symbol; see @ref{Global variables}.
+Global symbol; see @ref{Global Variables}.
@item 0x22 N_FNAME
Function name (for BSD Fortran); see @ref{Procedures}.
BSS segment file-scope variable; see @ref{Statics}.
@item 0x2a N_MAIN
-Name of main routine; see @ref{Main program}.
+Name of main routine; see @ref{Main Program}.
-@c FIXME: discuss this in the Statics node where we talk about
-@c the fact that the n_type indicates the section.
@item 0x2c N_ROSYM
Variable in @code{.rodata} section; see @ref{Statics}.
Debugger options (Solaris2).
@item 0x40 N_RSYM
-Register variable; see @ref{Register variables}.
+Register variable; see @ref{Register Variables}.
@item 0x42 N_M2C
Modula-2 compilation unit; see @ref{N_M2C}.
@item 0x44 N_SLINE
-Line number in text segment; see @ref{Line numbers}.
+Line number in text segment; see @ref{Line Numbers}.
@item 0x46 N_DSLINE
-Line number in data segment; see @ref{Line numbers}.
+Line number in data segment; see @ref{Line Numbers}.
@item 0x48 N_BSLINE
-Line number in bss segment; see @ref{Line numbers}.
+Line number in bss segment; see @ref{Line Numbers}.
@item 0x48 N_BROWS
Sun source code browser, path to @file{.cb} file; see @ref{N_BROWS}.
Last stab for module (Solaris2).
@item 0x64 N_SO
-Path and name of source file; see @ref{Source files}.
+Path and name of source file; see @ref{Source Files}.
@item 0x80 N_LSYM
-Stack variable (@pxref{Stack variables}) or type (@pxref{Typedefs}).
+Stack variable (@pxref{Stack Variables}) or type (@pxref{Typedefs}).
@item 0x82 N_BINCL
-Beginning of an include file (Sun only); see @ref{Include files}.
+Beginning of an include file (Sun only); see @ref{Include Files}.
@item 0x84 N_SOL
-Name of include file; see @ref{Include files}.
+Name of include file; see @ref{Include Files}.
@item 0xa0 N_PSYM
Parameter variable; see @ref{Parameters}.
@item 0xa2 N_EINCL
-End of an include file; see @ref{Include files}.
+End of an include file; see @ref{Include Files}.
@item 0xa4 N_ENTRY
Alternate entry point; see @ref{N_ENTRY}.
@item 0xc0 N_LBRAC
-Beginning of a lexical block; see @ref{Block structure}.
+Beginning of a lexical block; see @ref{Block Structure}.
@item 0xc2 N_EXCL
-Place holder for a deleted include file; see @ref{Include files}.
+Place holder for a deleted include file; see @ref{Include Files}.
@item 0xc4 N_SCOPE
Modula2 scope information (Sun linker); see @ref{N_SCOPE}.
@item 0xe0 N_RBRAC
-End of a lexical block; see @ref{Block structure}.
+End of a lexical block; see @ref{Block Structure}.
@item 0xe2 N_BCOMM
-Begin named common block; see @ref{Common blocks}.
+Begin named common block; see @ref{Common Blocks}.
@item 0xe4 N_ECOMM
-End named common block; see @ref{Common blocks}.
+End named common block; see @ref{Common Blocks}.
@item 0xe8 N_ECOML
-Member of a common block; see @ref{Common blocks}.
+Member of a common block; see @ref{Common Blocks}.
@c FIXME: How does this really work? Move it to main body of document.
@item 0xea N_WITH
@tableindent=.8in
@end iftex
-@node Symbol descriptors
-@appendix Table of symbol descriptors
+@node Symbol Descriptors
+@appendix Table of Symbol Descriptors
-These tell in the @code{.stabs} @var{string} field what kind of symbol
-the stab represents. They follow the colon which follows the symbol
-name. @xref{String field}, for more information about their use.
+The symbol descriptor is the character which follows the colon in many
+stabs, and which tells what kind of stab it is. @xref{String Field},
+for more information about its use.
@c Please keep this alphabetical
@table @code
@item @var{digit}
@itemx (
@itemx -
-Variable on the stack; see @ref{Stack variables}.
+Variable on the stack; see @ref{Stack Variables}.
@item a
-Parameter passed by reference in register; see @ref{Reference parameters}.
+Parameter passed by reference in register; see @ref{Reference Parameters}.
+
+@item b
+Based variable; see @ref{Based Variables}.
@item c
Constant; see @ref{Constants}.
@item C
Conformant array bound (Pascal, maybe other languages); see @ref{Conformant
-arrays}. Name of a caught exception (GNU C++). These can be
+Arrays}. Name of a caught exception (GNU C++). These can be
distinguished because the latter uses @code{N_CATCH} and the former uses
another symbol type.
@item d
-Floating point register variable; see @ref{Register variables}.
+Floating point register variable; see @ref{Register Variables}.
@item D
-Parameter in floating point register; see @ref{Register parameters}.
+Parameter in floating point register; see @ref{Register Parameters}.
@item f
File scope function; see @ref{Procedures}.
Global function; see @ref{Procedures}.
@item G
-Global variable; see @ref{Global variables}.
+Global variable; see @ref{Global Variables}.
@item i
-@xref{Register parameters}.
+@xref{Register Parameters}.
@item I
-Internal (nested) procedure; see @ref{Nested procedures}.
+Internal (nested) procedure; see @ref{Nested Procedures}.
@item J
-Internal (nested) function; see @ref{Nested procedures}.
+Internal (nested) function; see @ref{Nested Procedures}.
@item L
Label name (documented by AIX, no further information known).
Static Procedure; see @ref{Procedures}.
@item R
-Register parameter; see @ref{Register parameters}.
+Register parameter; see @ref{Register Parameters}.
@item r
-Register variable; see @ref{Register variables}.
+Register variable; see @ref{Register Variables}.
@item S
File scope variable; see @ref{Statics}.
Enumeration, structure, or union tag; see @ref{Typedefs}.
@item v
-Parameter passed by reference; see @ref{Reference parameters}.
+Parameter passed by reference; see @ref{Reference Parameters}.
@item V
Procedure scope static variable; see @ref{Statics}.
@item x
-Conformant array; see @ref{Conformant arrays}.
+Conformant array; see @ref{Conformant Arrays}.
@item X
Function return variable; see @ref{Parameters}.
@end table
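As a rough illustration of the rule stated at the top of this appendix,
the following Python sketch pulls the symbol descriptor out of a stab
string. The function name @code{symbol_descriptor} is hypothetical, and
the sketch assumes that the first colon ends the symbol name, which does
not hold for C++ names containing @samp{::}.

```python
def symbol_descriptor(stab_string):
    """Return (name, descriptor, rest) for a stab string such as 'x:G1'.

    descriptor is None when the character after the colon is a digit,
    '(' or '-', meaning the type information starts immediately
    (a variable on the stack).
    """
    # Assumption: the first colon separates the name from the rest.
    name, colon, rest = stab_string.partition(":")
    if not colon or not rest:
        raise ValueError("no symbol descriptor in %r" % stab_string)
    first = rest[0]
    if first.isdigit() or first in "(-":
        return name, None, rest
    return name, first, rest[1:]
```

The returned descriptor can then be looked up in the table above.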
-@node Type descriptors
-@appendix Table of type descriptors
+@node Type Descriptors
+@appendix Table of Type Descriptors
-These tell in the @code{.stabs} @var{string} field what kind of type is being
-defined. They follow the type number and an equals sign.
-@xref{String field}, for more information about their use.
+The type descriptor is the character which follows the type number and
+an equals sign. It specifies what kind of type is being defined.
+@xref{String Field}, for more information about its use.
@table @code
@item @var{digit}
@itemx (
-Type reference; see @ref{String field}.
+Type reference; see @ref{String Field}.
@item -
-Reference to builtin type; see @ref{Negative type numbers}.
+Reference to builtin type; see @ref{Negative Type Numbers}.
@item #
Method (C++); see @ref{Cplusplus}.
@item *
-Pointer; see @ref{Miscellaneous types}.
+Pointer; see @ref{Miscellaneous Types}.
@item &
Reference (C++).
@item @@
-Type Attributes (AIX); see @ref{String field}. Member (class and variable)
+Type Attributes (AIX); see @ref{String Field}. Member (class and variable)
type (GNU C++); see @ref{Cplusplus}.
@item a
Open array; see @ref{Arrays}.
@item b
-Pascal space type (AIX); see @ref{Miscellaneous types}. Builtin integer
-type (Sun); see @ref{Builtin type descriptors}.
+Pascal space type (AIX); see @ref{Miscellaneous Types}. Builtin integer
+type (Sun); see @ref{Builtin Type Descriptors}.
@item B
-Volatile-qualified type; see @ref{Miscellaneous types}.
+Volatile-qualified type; see @ref{Miscellaneous Types}.
@item c
-Complex builtin type; see @ref{Builtin type descriptors}.
+Complex builtin type; see @ref{Builtin Type Descriptors}.
@item C
COBOL Picture type. See AIX documentation for details.
@item d
-File type; see @ref{Miscellaneous types}.
+File type; see @ref{Miscellaneous Types}.
@item D
N-dimensional dynamic array; see @ref{Arrays}.
N-dimensional subarray; see @ref{Arrays}.
@item f
-Function type; see @ref{Function types}.
+Function type; see @ref{Function Types}.
@item F
-Pascal function parameter; see @ref{Function types}
+Pascal function parameter; see @ref{Function Types}.
@item g
-Builtin floating point type; see @ref{Builtin type descriptors}.
+Builtin floating point type; see @ref{Builtin Type Descriptors}.
@item G
COBOL Group. See AIX documentation for details.
@item i
-Imported type; see @ref{Cross-references}.
+Imported type; see @ref{Cross-References}.
@item k
-Const-qualified type; see @ref{Miscellaneous types}.
+Const-qualified type; see @ref{Miscellaneous Types}.
@item K
COBOL File Descriptor. See AIX documentation for details.
@item M
-Multiple instance type; see @ref{Miscellaneous types}.
+Multiple instance type; see @ref{Miscellaneous Types}.
@item n
String type; see @ref{Strings}.
Opaque type; see @ref{Typedefs}.
@item p
-Procedure; see @ref{Function types}.
+Procedure; see @ref{Function Types}.
@item P
Packed array; see @ref{Arrays}.
Range type; see @ref{Subranges}.
@item R
-Builtin floating type; see @ref{Builtin type descriptors} (Sun). Pascal
-subroutine parameter; see @ref{Function types} (AIX). Detecting this
+Builtin floating type; see @ref{Builtin Type Descriptors} (Sun). Pascal
+subroutine parameter; see @ref{Function Types} (AIX). Detecting this
conflict is possible with careful parsing (hint: a Pascal subroutine
parameter type will always contain a comma, and a builtin type
descriptor never will).
Structure type; see @ref{Structures}.
@item S
-Set type; see @ref{Miscellaneous types}.
+Set type; see @ref{Miscellaneous Types}.
@item u
Union; see @ref{Unions}.
union within a struct in C. See AIX documentation for details.
@item w
-Wide character; see @ref{Builtin type descriptors}.
+Wide character; see @ref{Builtin Type Descriptors}.
@item x
-Cross-reference; see @ref{Cross-references}.
+Cross-reference; see @ref{Cross-References}.
@item z
gstring; see @ref{Strings}.
@end table
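The position of the type descriptor, after the type number and an
equals sign, can be sketched in Python as follows. The function name
@code{type_descriptor} is hypothetical, and the sketch does no
validation of the type number itself.

```python
def type_descriptor(type_def):
    """Split a type definition such as '19=*1' into its parts.

    Returns (type_number, descriptor, rest).  When the descriptor is a
    digit, '(' or '-', it is the first character of a referenced type
    number, so rest keeps that character.
    """
    type_number, equals, rest = type_def.partition("=")
    if not equals or not rest:
        raise ValueError("not a type definition: %r" % type_def)
    first = rest[0]
    if first.isdigit() or first in "(-":
        return type_number, first, rest
    return type_number, first, rest[1:]
```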
-@node Expanded reference
-@appendix Expanded reference by stab type
+@node Expanded Reference
+@appendix Expanded Reference by Stab Type
@c FIXME: This appendix should go away; see N_PSYM or N_SO for an example.
For a full list of stab types, and cross-references to where they are
-described, see @ref{Stab types}. This appendix just duplicates certain
+described, see @ref{Stab Types}. This appendix just duplicates certain
information from the main body of this document; eventually the
information will all be in one place.
<<?>>
"path to associated @file{.cb} file"
-Note: @var{type} field value overlaps with N_BSLINE.
+Note: N_BROWS has the same value as N_BSLINE.
@end deffn
@node N_DEFD
@findex N_DEFD
GNU Modula2 definition module dependency.
-GNU Modula-2 definition module dependency. @var{value} is the modification
-time of the definition file. @var{other} is non-zero if it is imported with
-the GNU M2 keyword @code{%INITIALIZE}. Perhaps @code{N_M2C} can be used
-if there are enough empty fields?
+GNU Modula-2 definition module dependency. The value is the
+modification time of the definition file. The other field is non-zero
+if it is imported with the GNU M2 keyword @code{%INITIALIZE}. Perhaps
+@code{N_M2C} can be used if there are enough empty fields?
@end deffn
@node N_EHDECL
@findex N_CATCH
GNU C++ @code{catch} clause
-GNU C++ @code{catch} clause. @code{value} is its address. @code{desc}
+GNU C++ @code{catch} clause. The value is its address. The desc field
is nonzero if this entry is immediately followed by a @code{CAUGHT} stab
saying what exception was caught. Multiple @code{CAUGHT} stabs mean
-that multiple exceptions can be caught here. If @code{desc} is 0, it
-means all exceptions are caught here.
+that multiple exceptions can be caught here. If desc is 0, it means all
+exceptions are caught here.
@end deffn
@node N_SSYM
@findex N_SSYM
Structure or union element.
-@code{value} is offset in the structure.
+The value is the offset in the structure.
<<?looking at structs and unions in C I didn't see these>>
@end deffn
@deffn @code{.stabn} N_ENTRY
@findex N_ENTRY
Alternate entry point.
-@code{value} is its address.
+The value is its address.
<<?>>
@end deffn
@deffn @code{.stabn} N_LENG
@findex N_LENG
Second symbol entry containing a length-value for the preceding entry.
-The @var{value} is the length.
+The value is the length.
@end deffn
@node Questions
-@appendix Questions and anomalies
+@appendix Questions and Anomalies
@itemize @bullet
@item
@c I think this is changed in GCC 2.4.5 to put the line number there.
For GNU C stabs defining local and global variables (@code{N_LSYM} and
-@code{N_GSYM}), the @var{desc} field is supposed to contain the source
-line number on which the variable is defined. In reality the @var{desc}
+@code{N_GSYM}), the desc field is supposed to contain the source
+line number on which the variable is defined. In reality the desc
field is always 0. (This behavior is defined in @file{dbxout.c} and
-putting a line number in @var{desc} is controlled by @samp{#ifdef
+putting a line number in desc is controlled by @samp{#ifdef
WINNING_GDB}, which defaults to false). GDB supposedly uses this
information if you say @samp{list @var{var}}. In reality, @var{var} can
be a variable defined in the program and GDB says @samp{function
@c dbx?
@end itemize
-@node XCOFF differences
-@appendix Differences between GNU stabs in a.out and GNU stabs in XCOFF
+@node XCOFF Differences
+@appendix Differences Between GNU Stabs in a.out and GNU Stabs in XCOFF
@c FIXME: Merge *all* these into the main body of the document.
The AIX/RS6000 native object file format is XCOFF with stabs. This
@c used (I suspect not), explain clearly, and move to node Statics.
Exception: initialized static @code{N_STSYM} and uninitialized static
@code{N_LCSYM} both map to the @code{C_STSYM} storage class. But the
-destinction is preserved because in XCOFF @code{N_STSYM} and
+distinction is preserved because in XCOFF @code{N_STSYM} and
@code{N_LCSYM} must be emitted in a named static block. Begin the block
with @samp{.bs s[RW] data_section_name} for @code{N_STSYM} or @samp{.bs
s bss_section_name} for @code{N_LCSYM}. End the block with @samp{.es}.
N_LENG unknown
@end example
-@node Sun differences
-@appendix Differences between GNU stabs and Sun native stabs
+@node Sun Differences
+@appendix Differences Between GNU Stabs and Sun Native Stabs
@c FIXME: Merge all this stuff into the main body of the document.
@code{N_LSYM}. Sun doc talks about using @code{N_GSYM} too.
@item
-Sun C stabs use type number pairs in the format (@var{a},@var{b}) where
-@var{a} is a number starting with 1 and incremented for each sub-source
-file in the compilation. @var{b} is a number starting with 1 and
+Sun C stabs use type number pairs in the format
+(@var{file-number},@var{type-number}) where @var{file-number} is a
+number starting with 1 and incremented for each sub-source file in the
+compilation. @var{type-number} is a number starting with 1 and
incremented for each new type defined in the compilation. GNU C stabs
use the type number alone, with no source file number.
@end itemize
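A reader of stabs has to accept both forms of type number described in
the last item. The following Python sketch shows one way to do that.
The name @code{parse_type_number} is hypothetical, and negative
(builtin) type numbers are deliberately not handled.

```python
import re

_PAIR = re.compile(r"\((\d+),(\d+)\)")   # Sun form: (file-number,type-number)
_PLAIN = re.compile(r"\d+")              # GNU form: type number alone

def parse_type_number(text):
    """Parse a type number at the start of text.

    Returns ((file_number, type_number), remainder).  file_number is
    None for the plain form with no source file number.
    """
    m = _PAIR.match(text)
    if m:
        return (int(m.group(1)), int(m.group(2))), text[m.end():]
    m = _PLAIN.match(text)
    if m:
        return (None, int(m.group(0))), text[m.end():]
    raise ValueError("no type number at start of %r" % text)
```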
-@node Stabs in ELF
-@appendix Using stabs with the ELF object file format
+@node Stabs In ELF
+@appendix Using Stabs With The ELF Object File Format
The ELF object file format allows tools to create object files with
custom sections containing any arbitrary data. To use stabs in ELF
of the ELF file itself, as determined from the @code{EI_DATA} field in
the @code{e_ident} member of the ELF header.
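The @code{EI_DATA} lookup mentioned above can be sketched in Python as
follows. The index and the two encoding values are those of the ELF
specification. The function name @code{elf_byte_order} is hypothetical.

```python
EI_DATA = 5          # index of the data-encoding byte in e_ident
ELFDATA2LSB = 1      # little-endian
ELFDATA2MSB = 2      # big-endian

def elf_byte_order(e_ident):
    """Return 'little' or 'big' given the leading bytes of an ELF file."""
    if e_ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    encoding = e_ident[EI_DATA]
    if encoding == ELFDATA2LSB:
        return "little"
    if encoding == ELFDATA2MSB:
        return "big"
    raise ValueError("unknown EI_DATA value %d" % encoding)
```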
-@c Is "source file" the right term for this concept? We don't mean that
-@c there is a separate one for include files (but "object file" or
-@c "object module" isn't quite right either; the output from ld -r is a
-@c single object file but contains many source files).
-The first stab in the @code{.stab} section for each source file is
+The first stab in the @code{.stab} section for each compilation unit is
synthetic, generated entirely by the assembler, with no corresponding
@code{.stab} directive as input to the assembler. This stab contains
the following fields:
header @code{sh_type} member set to @code{SHT_STRTAB} to mark it as a
string table.
-Because the linker does not process the @code{.stab} section in any
-special way, none of the addresses in the @code{n_value} field of the
-stabs are relocated by the linker. Instead they are relative to the
-source file (or some entity smaller than a source file, like a
-function). To find the address of each section corresponding to a given
-source file, the (compiler? assembler?) puts out symbols giving the
-address of each section for a given source file. Since these are normal
-ELF symbols, the linker can relocate them correctly. They are
-named @code{Bbss.bss} for the bss section, @code{Ddata.data} for
-the data section, and @code{Drodata.rodata} for the rodata section. I
-haven't yet figured out how the debugger gets the address for the text
-section.
+To keep linking fast, it is a bad idea to have the linker relocating
+stabs, so (except for a few cases, see below) none of the addresses in
+the @code{n_value} field of the stabs are relocated by the linker.
+Instead they are relative to the source file (or some entity smaller
+than a source file, like a function). To find the address of each
+section corresponding to a given source file, the compiler puts out
+symbols giving the address of each section for a given source file.
+Since these are ELF (not stab) symbols, the linker relocates them
+correctly without having to touch the stabs section. They are named
+@code{Bbss.bss} for the bss section, @code{Ddata.data} for the data
+section, and @code{Drodata.rodata} for the rodata section. For the text
+section, there is no such symbol (but there should be, see below). For
+an example of how these symbols work, see @ref{ELF Transformations}. GCC
+does not provide these symbols; it instead relies on the stabs getting
+relocated, which slows down linking. Thus addresses which would
+normally be relative to @code{Bbss.bss}, etc., are already relocated.
+The Sun linker provided with Solaris 2.2 and earlier relocates stabs
+using normal ELF relocation information, as it would do for any section.
+Sun has been threatening to kludge their linker to not do this (to speed
+up linking), even though the correct way to avoid having the linker do
+these relocations is to have the compiler no longer output relocatable
+values. Last I heard they had been talked out of the linker kludge.
+See Sun point patch 101052-01 and Sun bug 1142109. This affects
+@samp{S} symbol descriptor stabs (@pxref{Statics}) and functions
+(@pxref{Procedures}). In the latter case, to adopt the clean solution
+(making the value of the stab relative to the start of the compilation
+unit), it would be necessary to invent a @code{Ttext.text} symbol,
+analogous to the @code{Bbss.bss}, etc., symbols. I recommend this
+rather than using a zero value and getting the address from the ELF
+symbols.
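For a compiler that does emit the section symbols, a debugger could
recover an absolute address roughly as in the following Python sketch.
The mapping from stab type to section symbol, the @code{N_ROSYM} entry,
and the name @code{absolute_address} are assumptions made for
illustration. The sketch does not apply to GCC output, where the values
are already relocated.

```python
# Assumed mapping from stab type to the per-file ELF symbol that the
# stab's n_value is relative to.
SECTION_SYMBOL = {
    "N_STSYM": "Ddata.data",
    "N_LCSYM": "Bbss.bss",
    "N_ROSYM": "Drodata.rodata",
}

def absolute_address(stab_type, n_value, elf_symbols):
    """Add the relocated section symbol to a section-relative n_value.

    elf_symbols maps ELF symbol names to their relocated addresses for
    the source file being processed.
    """
    base = elf_symbols[SECTION_SYMBOL[stab_type]]
    return base + n_value
```

There is no entry for the text section because, as noted above, no
@code{Ttext.text} symbol exists.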
@node Symbol Types Index
@unnumbered Symbol Types Index