@end ifinfo
@ifinfo
-This document describes GNU stabs (debugging symbol tables) in a.out files.
+This document describes the stabs debugging symbol tables.
Copyright 1992 Free Software Foundation, Inc.
Contributed by Cygnus Support. Written by Julia Menapace.
* Overview:: Overview of stabs
* Program structure:: Encoding of the structure of the program
* Constants:: Constants
-* Simple types::
* Example:: A comprehensive example in C
* Variables::
-* Aggregate Types::
+* Types:: Type definitions
* Symbol tables:: Symbol information in symbol tables
-* GNU Cplusplus stabs::
+* Cplusplus::
Appendixes:
* Example2.c:: Source code for extended example
* Example2.s:: Assembly code for extended example
-* Quick reference:: Various refernce tables
+* Stab types:: Table A: Symbol types from stabs
+* Assembler types:: Table B: Symbol types from assembler and linker
+* Symbol Descriptors:: Table C
+* Type Descriptors:: Table D
* Expanded reference:: Reference information by stab type
* Questions:: Questions and anomalies
* xcoff-differences:: Differences between GNU stabs in a.out
the University of California at Berkeley, for the @code{pdx} Pascal
debugger; the format has spread widely since then.
+This document is one of the few published sources of documentation on
+stabs. It is believed to be completely comprehensive for stabs used by
+C. The lists of symbol descriptors (@pxref{Symbol Descriptors}) and
+type descriptors (@pxref{Type Descriptors}) are believed to be completely
+comprehensive. There are known to be stabs for C++ and COBOL which are
+poorly documented here. Stabs specific to other languages (e.g. Pascal,
+Modula-2) are probably not as well documented as they should be.
+
+Other sources of information on stabs are @cite{dbx and dbxtool
+interfaces}, 2nd edition, by Sun, circa 1988, and @cite{AIX Version 3.2
+Files Reference}, Fourth Edition, September 1992, "dbx Stabstring
+Grammar" in the a.out section, page 2-31. This document is believed to
+incorporate the information from those two sources except where it
+explicitly directs you to them for more information.
+
@menu
* Flow:: Overview of debugging information flow
-* Stabs format:: Overview of stab format
+* Stabs Format:: Overview of stab format
* C example:: A simple example in C source
* Assembly code:: The simple example at the assembly level
@end menu
table. Debuggers use the symbol and string tables in the executable as
a source of debugging information about the program.
-@node Stabs format
+@node Stabs Format
@section Overview of stab format
There are three overall formats for stab assembler directives
@var{name} is the name of the symbol represented by the stab.
@var{name} can be omitted, which means the stab represents an unnamed
-object. For example, @code{":t10=*2"} defines type 10 as a pointer to
+object. For example, @samp{:t10=*2} defines type 10 as a pointer to
type 2, but does not give the type a name. Omitting the @var{name}
field is supported by AIX dbx and GDB after about version 4.8, but not
other debuggers.
character that tells more specifically what kind of symbol the stab
represents. If the @var{symbol_descriptor} is omitted, but type
information follows, then the stab represents a local variable. For a
-list of symbol_descriptors, see @ref{Symbol descriptors,,Table C: Symbol
+list of symbol descriptors, see @ref{Symbol Descriptors,,Table C: Symbol
descriptors}.
The @samp{c} symbol descriptor is an exception in that it is not
There is an AIX extension for type attributes. Following the @samp{=}
is any number of type attributes. Each one starts with @samp{@@} and
ends with @samp{;}. Debuggers, including AIX's dbx, skip any type
-attributes they do not recognize. The attributes are:
+attributes they do not recognize. GDB 4.9 does not do this--it will
+ignore the entire symbol containing a type attribute. Hopefully this
+will be fixed in the next GDB release. Because of a conflict with C++
+(@pxref{Cplusplus}), new attributes should not be defined which begin
+with a digit, @samp{(}, or @samp{-}; GDB may be unable to distinguish
+those from the C++ type descriptor @samp{@@}. The attributes are:
@table @code
@item a@var{boundary}
-@var{boundary} is an integer specifying the alignment. I assume that
+@var{boundary} is an integer specifying the alignment. I assume it
applies to all variables of this type.
@item s@var{size}
-Size in bits of a variabe of this type.
+Size in bits of a variable of this type.
@item p@var{integer}
Pointer class (for checking). Not sure what this means, or how
These symbol descriptors are unusual in that they are not followed by
type information.
-After the symbol descriptor and the type information, there is
-optionally a comma, followed by the name of the procedure, followed by a
-comma, followed by a name specifying the scope. The first name is local
-to the scope specified. I assume then that the name of the symbol
-(before the @samp{:}), if specified, is some sort of global name. I
-assume the name specifying the scope is the name of a function
-specifying that scope. This feature is an AIX extension, and this
-information is based on the manual; I haven't actually tried it.
+For any of the above symbol descriptors, after the symbol descriptor and
+the type information, there is optionally a comma, followed by the name
+of the procedure, followed by a comma, followed by a name specifying the
+scope. The first name is local to the scope specified. I assume then
+that the name of the symbol (before the @samp{:}), if specified, is some
+sort of global name. I assume the name specifying the scope is the name
+of a function specifying that scope. This feature is an AIX extension,
+and this information is based on the manual; I haven't actually tried
+it.
The stab representing a procedure is located immediately following the
code of the procedure. This stab is in turn directly followed by a
@item e@var{type-information},@var{value}
Enumeration constant. @var{type-information} is the type of the
constant, as it would appear after a symbol descriptor
-(@pxref{Overview}). @var{value} is the numeric value of the constant.
+(@pxref{Stabs Format}). @var{value} is the numeric value of the constant.
@item i@var{value}
Integer constant. @var{value} is the numeric value.
@item S@var{type-information},@var{elements},@var{bits},@var{pattern}
Set constant. @var{type-information} is the type of the constant, as it
-would appear after a symbol descriptor (@pxref{Overview}).
+would appear after a symbol descriptor (@pxref{Stabs Format}).
@var{elements} is the number of elements in the set (is this just the
number of bits set in @var{pattern}? Or redundant with the type? I
don't get it), @var{bits} is the number of bits in the constant (meaning
This information is followed by @samp{;}.
-@node Simple types
-@chapter Simple types
-
-@menu
-* Basic types:: Basic type definitions
-* Range types:: Range types defined by min and max value
-* Float "range" types:: Range type defined by size in bytes
-@end menu
-
-@node Basic types
-@section Basic type definitions
-
-@table @strong
-@item Directive:
-@code{.stabs}
-@item Type:
-@code{N_LSYM}
-@item Symbol Descriptor:
-@code{t}
-@end table
-
-The basic types for the language are described using the @code{N_LSYM} stab
-type. They are boilerplate and are emited by the compiler for each
-compilation unit. Basic type definitions are not always a complete
-description of the type and are sometimes circular. The debugger
-recognizes the type anyway, and knows how to read bits as that type.
-
-Each language and compiler defines a slightly different set of basic
-types. In this example we are looking at the basic types for C emited
-by the GNU compiler targeting the Sun4. Here the basic types are
-mostly defined as range types.
-
-
-@node Range types
-@section Range types defined by min and max value
-
-@table @strong
-@item Type Descriptor:
-@code{r}
-@end table
-
-When defining a range type, if the number after the first semicolon is
-smaller than the number after the second one, then the two numbers
-represent the smallest and the largest values in the range.
-
-@example
-4 .text
-5 Ltext0:
-
-.stabs "@var{name}:
- @var{descriptor} @r{(type)}
- @var{type-def}=
- @var{type-desc}
- @var{type-ref};
- @var{low-bound};
- @var{high-bound};
- ",
- N_LSYM, NIL, NIL, NIL
-
-6 .stabs "int:t1=r1;-2147483648;2147483647;",128,0,0,0
-7 .stabs "char:t2=r2;0;127;",128,0,0,0
-@end example
-
-Here the integer type (@code{1}) is defined as a range of the integer
-type (@code{1}). Likewise @code{char} is a range of @code{char}. This
-part of the definition is circular, but at least the high and low bound
-values of the range hold more information about the type.
-
-Here short unsigned int is defined as type number 8 and described as a
-range of type @code{int}, with a minimum value of 0 and a maximum of 65535.
-
-@example
-13 .stabs "short unsigned int:t8=r1;0;65535;",128,0,0,0
-@end example
-
-@node Float "range" types
-@section Range type defined by size in bytes
-
-@table @strong
-@item Type Descriptor:
-@code{r}
-@end table
-
-In a range definition, if the first number after the semicolon is
-positive and the second is zero, then the type being defined is a
-floating point type, and the number after the first semicolon is the
-number of bytes needed to represent the type. Note that this does not
-provide a way to distinguish 8-byte real floating point types from
-8-byte complex floating point types.
-
-@example
-.stabs "@var{name}:
- @var{desc}
- @var{type-def}=
- @var{type-desc}
- @var{type-ref};
- @var{bit-count};
- 0;
- ",
- N_LSYM, NIL, NIL, NIL
-
-17 .stabs "float:t12=r1;4;0;",128,0,0,0
-18 .stabs "double:t13=r1;8;0;",128,0,0,0
-19 .stabs "long double:t14=r1;8;0;",128,0,0,0
-@end example
-
-Cosmically enough, the @code{void} type is defined directly in terms of
-itself.
-
-@example
-.stabs "@var{name}:
- @var{symbol-desc}
- @var{type-def}=
- @var{type-ref}
- ",N_LSYM,NIL,NIL,NIL
-
-20 .stabs "void:t15=15",128,0,0,0
-@end example
-
-
@node Example
@chapter A Comprehensive Example in C
none
@end table
-
In addition to describing types, the @code{N_LSYM} stab type also
describes locally scoped automatic variables. Refer again to the body
of @code{main} in @file{example2.c}. It allocates two automatic
@exdent @code{N_LSYM} (128): automatic variable, scoped locally to @code{main}
.stabs "@var{name}:
- @var{type-ref}",
+ @var{type information}",
N_LSYM, NIL, NIL,
@var{frame-pointer-offset}
@exdent @code{N_LSYM} (128): automatic variable, scoped locally to the @code{for} loop
.stabs "@var{name}:
- @var{type-ref}",
+ @var{type information}",
N_LSYM, NIL, NIL,
@var{frame-pointer-offset}
101 .stabn 192,0,0,LBB3 ## begin `for' loop N_LBRAC
@end example
-Since the character in the string field following the colon is not a
-letter, there is no symbol descriptor. This means that the stab
-describes a local variable, and that the number after the colon is a
-type reference. In this case it a a reference to the basic type @code{int}.
-Notice also that the frame pointer offset is negative number for
-automatic variables.
-
+The symbol descriptor is omitted for automatic variables. Since type
+information should begin with a digit, @samp{-}, or @samp{(}, only
+digits, @samp{-}, and @samp{(} are precluded from being used for symbol
+descriptors by this fact. However, the Acorn RISC machine (ARM) is said
+to get this wrong: it puts out a mere type definition here, without the
+preceding @code{@var{typenumber}=}. This is a bad idea; there is no
+guarantee that type descriptors are distinct from symbol descriptors.
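The rule above can be sketched in a few lines of Python. This is only an
illustration, not how any particular debugger does it: the function name
@code{split_stab_string} is hypothetical, and the sketch assumes the name
itself contains no @samp{:} (which is not true of every C++ name).

```python
def split_stab_string(stab):
    """Split a stab string into (name, symbol_descriptor, type_information).

    Hypothetical helper: assumes the first ':' ends the name.
    """
    name, sep, rest = stab.partition(":")
    if not sep:
        raise ValueError("stab string has no ':' separator")
    # Type information begins with a digit, '-', or '('.  If the text
    # after the colon starts that way, the symbol descriptor was omitted.
    if rest and (rest[0].isdigit() or rest[0] in "-("):
        return name, None, rest
    # Otherwise treat the first character as the symbol descriptor.
    return name, rest[:1], rest[1:]
```

For a stab string such as @samp{char_vec:G19=ar1;0;2;2} this yields the
name @samp{char_vec}, the symbol descriptor @samp{G}, and the remaining
type information; for a made-up automatic variable @samp{x:1} the
descriptor comes back as @code{None}.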
@node Global Variables
@section Global Variables
@node Register variables
@section Register variables
+@c According to an old version of this manual, AIX uses C_RPSYM instead
+@c of C_RSYM. I am skeptical; this should be verified.
Register variables have their own stab type, @code{N_RSYM}, and their
own symbol descriptor, @code{r}. The stab's value field contains the
number of the register where the variable data will be stored.
AIX defines a separate symbol descriptor @samp{d} for floating point
registers. This seems incredibly stupid--why not just give
-floating point registers different register numbers.
+floating point registers different register numbers? I have not
+verified whether the compiler actually uses @samp{d}.
If the register is explicitly allocated to a global variable, but not
initialized, as in
@example
"name" -> "param_name:#type"
-> pP (<<??>>)
- -> pF (<<??>>)
+ -> pF FORTRAN function parameter
-> X (function result variable)
-> b (based variable)
type definitions. Type 21 is pointer to type 2 (char) and argv (type 20) is
pointer to type 21.
-@node Aggregate Types
-@chapter Aggregate Types
+@node Types
+@chapter Type definitions
Now let's look at some variable definitions involving complex types.
This involves understanding better how types are described. In the
type definition.
@menu
-* Arrays::
-* Enumerations::
-* Structure tags::
-* Typedefs::
+* Builtin types:: Integers, floating point, void, etc.
+* Miscellaneous Types:: Pointers, sets, files, etc.
+* Cross-references:: Referring to a type not yet defined.
+* Subranges:: A type with a specific range.
+* Arrays:: An aggregate type of same-typed elements.
+* Strings:: Like an array but also has a length.
+* Enumerations:: Like an integer but the values have names.
+* Structures:: An aggregate type of different-typed elements.
+* Typedefs:: Giving a type a name
* Unions::
* Function types::
@end menu
-@node Arrays
-@section Array types
+@node Builtin types
+@section Builtin types
-@table @strong
-@item Directive:
-@code{.stabs}
-@item Types:
-@code{N_GSYM}, @code{N_LSYM}
-@item Symbol Descriptor:
-@code{T}
-@item Type Descriptor:
-@code{a}
+Certain types are built in (@code{int}, @code{short}, @code{void},
+@code{float}, etc.); the debugger recognizes these types and knows how
+to handle them. Thus don't be surprised if some of the following ways
+of specifying builtin types do not specify everything that a debugger
+would need to know about the type---in some cases they merely specify
+enough information to distinguish the type from other types.
+
+The traditional way to define builtin types is convoluted, so new ways
+have been invented to describe them. Sun's ACC uses the @samp{b} and
+@samp{R} type descriptors, and IBM uses negative type numbers. GDB can
+accept all three, as of version 4.8; dbx just accepts the traditional
+builtin types and perhaps one of the other two formats.
+
+@menu
+* Traditional Builtin Types:: Put on your seatbelts and prepare for kludgery
+* Builtin Type Descriptors:: Builtin types with special type descriptors
+* Negative Type Numbers:: Builtin types using negative type numbers
+@end menu
+
+@node Traditional Builtin Types
+@subsection Traditional Builtin types
+
+Often types are defined as subranges of themselves. If the bounds of the
+subrange fit within an @code{int}, then they are given normally. For example:
+
+@example
+.stabs "int:t1=r1;-2147483648;2147483647;",128,0,0,0 ; 128 is N_LSYM
+.stabs "char:t2=r2;0;127;",128,0,0,0
+@end example
+
+Builtin types can also be described as subranges of @code{int}:
+
+@example
+.stabs "unsigned short:t6=r1;0;65535;",128,0,0,0
+@end example
+
+If the upper bound of a subrange is -1, it means that the type is an
+integral type whose bounds are too big to describe in an @code{int}.
+Traditionally this is only used for @code{unsigned int} and
+@code{unsigned long}; GCC also uses it for @code{long long} and
+@code{unsigned long long}, and the only way to tell those types apart is
+to look at their names. On other machines GCC puts out bounds in octal,
+with a leading 0. In this case a negative bound consists of a number
+which is a 1 bit followed by a bunch of 0 bits, and a positive bound is
+one in which a bunch of bits are 1.
+
+@example
+.stabs "unsigned int:t4=r1;0;-1;",128,0,0,0
+.stabs "long long int:t7=r1;0;-1;",128,0,0,0
+@end example
+
+If the upper bound of a subrange is 0, it means that this is a floating
+point type, and the lower bound of the subrange indicates the number of
+bytes in the type:
+
+@example
+.stabs "float:t12=r1;4;0;",128,0,0,0
+.stabs "double:t13=r1;8;0;",128,0,0,0
+@end example
+
+However, GCC writes @code{long double} the same way it writes
+@code{double}; the only way to distinguish them is by the name:
+
+@example
+.stabs "long double:t14=r1;8;0;",128,0,0,0
+@end example
+
+Complex types are defined the same way as floating-point types; the only
+way to distinguish a single-precision complex from a double-precision
+floating-point type is by the name.
+
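The conventions for traditional range definitions described above might
be applied roughly as follows. This is a sketch under stated
assumptions: the name @code{classify_range} and the result labels are
invented for illustration, and only decimal bounds are handled (the
octal form that GCC emits on some machines is not).

```python
def classify_range(definition):
    """Classify a traditional builtin definition such as 'r1;0;65535;'.

    Hypothetical helper; handles decimal bounds only.
    """
    if not definition.startswith("r"):
        raise ValueError("not a range type definition")
    fields = definition[1:].rstrip(";").split(";")
    if len(fields) != 3:
        raise ValueError("expected base type, lower bound, upper bound")
    low, high = int(fields[1]), int(fields[2])
    if high == 0 and low > 0:
        # Upper bound 0: floating point; lower bound is the size in bytes.
        return ("float", low)
    if high == -1:
        # Upper bound -1: integral type too big to describe in an int.
        return ("big-integer", None)
    # Ordinary case: the two numbers are the smallest and largest values.
    return ("integer", (low, high))
```

Note that the result for @samp{r1;8;0;} is the same whether the stab is
named @code{double} or @code{long double}, which is exactly the
ambiguity mentioned above; a real reader has to consult the name.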
+The C @code{void} type is defined as itself:
+
+@example
+.stabs "void:t15=15",128,0,0,0
+@end example
+
+I'm not sure how a boolean type is represented.
+
+@node Builtin Type Descriptors
+@subsection Defining Builtin Types using Builtin Type Descriptors
+
+There are various type descriptors to define builtin types:
+
+@table @code
+@c FIXME: clean up description of width and offset
+@item b @var{signed} @var{char-flag} @var{width} ; @var{offset} ; @var{nbits} ;
+Define an integral type. @var{signed} is @samp{u} for unsigned or
+@samp{s} for signed. @var{char-flag} is @samp{c} which indicates this
+is a character type, or is omitted. I assume this is to distinguish an
+integral type from a character type of the same size, for example it
+might make sense to set it for the C type @code{wchar_t} so the debugger
+can print such variables differently (Solaris does not do this). Sun
+sets it on the C types @code{signed char} and @code{unsigned char} which
+arguably is wrong. @var{width} and @var{offset} appear to be for small
+objects stored in larger ones, for example a @code{short} in an
+@code{int} register. @var{width} is normally the number of bytes in the
+type. @var{offset} seems to always be zero. @var{nbits} is the number
+of bits in the type.
+
+Note that type descriptor @samp{b} used for builtin types conflicts with
+its use for Pascal space types (@pxref{Miscellaneous Types}); they can
+be distinguished because the character following the type descriptor
+will be a digit, @samp{(}, or @samp{-} for a Pascal space type, or
+@samp{u} or @samp{s} for a builtin type.
+
+@item w
+Documented by AIX to define a wide character type, but their compiler
+actually uses negative type numbers (@pxref{Negative Type Numbers}).
+
+@item R @var{details} ; @var{bytes} ;
+@c FIXME: What does @var{details} mean?
+Define a floating point type. @var{details} is a number which has
+details about the type, for example whether it is complex. @var{bytes}
+is the number of bytes occupied by the type.
+
+@item g @var{type-information} ; @var{nbits}
+Documented by AIX to define a floating type, but their compiler actually
+uses negative type numbers (@pxref{Negative Type Numbers}).
+
+@item c @var{type-information} ; @var{nbits}
+Documented by AIX to define a complex type, but their compiler actually
+uses negative type numbers (@pxref{Negative Type Numbers}).
+@end table
+
+The C @code{void} type is defined as a signed integral type 0 bits long:
+@example
+.stabs "void:t19=bs0;0;0",128,0,0,0
+@end example
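As a rough illustration of the @samp{b} descriptor layout given in the
table above, the following Python sketch picks the fields apart. The
function name @code{parse_builtin_integral} and the dictionary keys are
hypothetical, and the trailing semicolon is treated as optional only
because the @code{void} example above omits it.

```python
import re

# b, then 'u' or 's', an optional 'c', then width ; offset ; nbits
_B_PATTERN = re.compile(r"b([us])(c?)(\d+);(\d+);(\d+);?")

def parse_builtin_integral(text):
    """Parse a 'b' builtin integral type such as 'bs0;0;0'.

    Hypothetical helper; does not handle the Pascal space type use of 'b'.
    """
    match = _B_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError("not a builtin integral type descriptor")
    signed, char_flag, width, offset, nbits = match.groups()
    return {
        "signed": signed == "s",
        "char": char_flag == "c",
        "width": int(width),
        "offset": int(offset),
        "nbits": int(nbits),
    }
```

Because the pattern requires @samp{u} or @samp{s} directly after the
@samp{b}, a Pascal space type (which continues with a digit, @samp{(},
or @samp{-}) is rejected rather than misread.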
+
+I'm not sure how a boolean type is represented.
+
+@node Negative Type Numbers
+@subsection Negative Type numbers
+
+Since the debugger knows about the builtin types anyway, the idea of
+negative type numbers is simply to give a special type number which
+indicates the built in type. There is no stab defining these types.
+
+I'm not sure whether anyone has tried to define what this means if
+@code{int} can be other than 32 bits (or other types can be other than
+their customary size). If @code{int} has exactly one size for each
+architecture, then it can be handled easily enough, but if the size of
+@code{int} can vary according to the compiler options, then it gets hairy.
+I guess the consistent way to do this would be to define separate
+negative type numbers for 16-bit @code{int} and 32-bit @code{int};
+therefore I have indicated below the customary size (and other format
+information) for each type. The information below is currently correct
+because AIX on the RS6000 is the only system which uses these type
+numbers. If these type numbers start to get used on other systems, I
+suspect the correct thing to do is to define a new number in cases where
+a type does not have the size and format indicated below.
+
+@table @code
+@item -1
+@code{int}, 32 bit signed integral type.
+
+@item -2
+@code{char}, 8 bit type holding a character. Both GDB and dbx on AIX
+treat this as signed. GCC uses this type whether @code{char} is signed
+or not, which seems like a bad idea. The AIX compiler (xlc) seems to
+avoid this type; it uses -5 instead for @code{char}.
+
+@item -3
+@code{short}, 16 bit signed integral type.
+
+@item -4
+@code{long}, 32 bit signed integral type.
+
+@item -5
+@code{unsigned char}, 8 bit unsigned integral type.
+
+@item -6
+@code{signed char}, 8 bit signed integral type.
+
+@item -7
+@code{unsigned short}, 16 bit unsigned integral type.
+
+@item -8
+@code{unsigned int}, 32 bit unsigned integral type.
+
+@item -9
+@code{unsigned}, 32 bit unsigned integral type.
+
+@item -10
+@code{unsigned long}, 32 bit unsigned integral type.
+
+@item -11
+@code{void}, type indicating the lack of a value.
+
+@item -12
+@code{float}, IEEE single precision.
+
+@item -13
+@code{double}, IEEE double precision.
+
+@item -14
+@code{long double}, IEEE extended, RS6000 format.
+
+@item -15
+@code{integer}. Pascal, I assume. 32 bit signed integral type.
+
+@item -16
+Boolean. Only one bit is used, not sure about the actual size of the
+type.
+
+@item -17
+@code{short real}. Pascal, I assume. IEEE single precision.
+
+@item -18
+@code{real}. Pascal, I assume. IEEE double precision.
+
+@item -19
+A Pascal Stringptr. @xref{Strings}.
+
+@item -20
+@code{character}, 8 bit unsigned type.
+
+@item -21
+@code{logical*1}, 8 bit unsigned integral type.
+
+@item -22
+@code{logical*2}, 16 bit unsigned integral type.
+
+@item -23
+@code{logical*4}, 32 bit unsigned integral type.
+
+@item -24
+@code{logical}, 32 bit unsigned integral type.
+
+@item -25
+A complex type consisting of two IEEE single-precision floating point values.
+
+@item -26
+A complex type consisting of two IEEE double-precision floating point values.
+
+@item -27
+@code{integer*1}, 8 bit signed integral type.
+
+@item -28
+@code{integer*2}, 16 bit signed integral type.
+
+@item -29
+@code{integer*4}, 32 bit signed integral type.
+
+@item -30
+Wide character. AIX appears not to use this for the C type
+@code{wchar_t}; instead it uses an integral type of the appropriate
+size.
+@end table
+
+@node Miscellaneous Types
+@section Miscellaneous Types
+
+@table @code
+@item b @var{type-information} ; @var{bytes}
+Pascal space type. This is documented by IBM; what does it mean?
+
+Note that this use of the @samp{b} type descriptor can be distinguished
+from its use for builtin integral types (@pxref{Builtin Type
+Descriptors}) because the character following the type descriptor is
+always a digit, @samp{(}, or @samp{-}.
+
+@item B @var{type-information}
+A volatile-qualified version of @var{type-information}. This is a Sun
+extension. A volatile-qualified type means that references and stores
+to a variable of that type must not be optimized or cached; they must
+occur as the user specifies them.
+
+@item d @var{type-information}
+File of type @var{type-information}. As far as I know this is only used
+by Pascal.
+
+@item k @var{type-information}
+A const-qualified version of @var{type-information}. This is a Sun
+extension. A const-qualified type means that a variable of this type
+cannot be modified.
+
+@item M @var{type-information} ; @var{length}
+Multiple instance type. The type seems to be composed of @var{length}
+repetitions of @var{type-information}, for example @code{character*3} is
+represented by @samp{M-2;3}, where @samp{-2} is a reference to a
+character type (@pxref{Negative Type Numbers}). I'm not sure how this
+differs from an array. This appears to be a FORTRAN feature.
+@var{length} is a bound, like those in range types (@pxref{Subranges}).
+
+@item S @var{type-information}
+Pascal set type. @var{type-information} must be a small type such as an
+enumeration or a subrange, and the type is a bitmask whose length is
+specified by the number of elements in @var{type-information}.
+
+@item * @var{type-information}
+Pointer to @var{type-information}.
@end table
-As an example of an array type consider the global variable below.
+@node Cross-references
+@section Cross-references to other types
+
+If a type is used before it is defined, one common way to deal with this
+is just to use a type reference to a type which has not yet been
+defined. The debugger is expected to be able to deal with this.
+
+Another way is with the @samp{x} type descriptor, which is followed by
+@samp{s} for a structure tag, @samp{u} for a union tag, or @samp{e} for
+an enumerator tag, followed by the name of the tag, followed by @samp{:}.
+For example, the following C declarations:
@example
-15 char char_vec[3] = @{'a','b','c'@};
+struct foo;
+struct foo *bar;
@end example
-Since the array is a global variable, it is described by the N_GSYM
-stab type. The symbol descriptor G, following the colon in stab's
-string field, also says the array is a global variable. Following the
-G is a definition for type (19) as shown by the equals sign after the
-type number.
+produce
+
+@example
+.stabs "bar:G16=*17=xsfoo:",32,0,0,0
+@end example
+
+Not all debuggers support the @samp{x} type descriptor, so on some
+machines GCC does not use it. I believe that for the above example it
+would just emit a reference to type 17 and never define it, but I
+haven't verified that.
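A reader for the @samp{x} type descriptor could look something like the
sketch below. The names @code{TAG_KINDS} and
@code{parse_cross_reference} are made up for this illustration, and the
sketch assumes the tag name contains no @samp{:}.

```python
# Letter following 'x' -> kind of tag being referred to.
TAG_KINDS = {"s": "struct", "u": "union", "e": "enum"}

def parse_cross_reference(text):
    """Parse a cross-reference such as 'xsfoo:'.

    Hypothetical helper; returns (kind, tag_name, remaining_text).
    """
    if len(text) < 2 or text[0] != "x" or text[1] not in TAG_KINDS:
        raise ValueError("not an 'x' cross-reference")
    name, sep, rest = text[2:].partition(":")
    if not sep:
        raise ValueError("cross-reference is missing its closing ':'")
    return TAG_KINDS[text[1]], name, rest
```

Applied to the @samp{xsfoo:} part of the stab above, this reports a
reference to the structure tag @code{foo} with nothing left over.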
+
+Modula-2 imported types, at least on AIX, use the @samp{i} type
+descriptor, which is followed by the name of the module from which the
+type is imported, followed by @samp{:}, followed by the name of the
+type. There is then optionally a comma followed by type information for
+the type (this differs from merely naming the type (@pxref{Typedefs}) in
+that it identifies the module; I don't understand whether the name of
+the type given here is always just the same as the name we are giving
+it, or whether this type descriptor is used with a nameless stab
+(@pxref{Stabs Format}), or what). The symbol ends with @samp{;}.
-After the equals sign is a type descriptor, a, which says that the type
-being defined is an array. Following the type descriptor for an array
-is the type of the index, a semicolon, and the type of the array elements.
+@node Subranges
+@section Subrange types
+
+The @samp{r} type descriptor defines a type as a subrange of another
+type. It is followed by type information for the type which it is a
+subrange of, a semicolon, an integral lower bound, a semicolon, an
+integral upper bound, and a semicolon. The AIX documentation does not
+specify the trailing semicolon; I believe it is confused.
+
+AIX allows the bounds to be one of the following instead of an integer:
+
+@table @code
+@item A @var{offset}
+The bound is passed by reference on the stack at offset @var{offset}
+from the argument list. @xref{Parameters}, for more information on such
+offsets.
+
+@item T @var{offset}
+The bound is passed by value on the stack at offset @var{offset} from
+the argument list.
+
+@item a @var{register-number}
+The bound is passed by reference in register number
+@var{register-number}.
+
+@item t @var{register-number}
+The bound is passed by value in register number @var{register-number}.
+
+@item J
+There is no bound.
+@end table
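The bound forms listed in the table above can be told apart by their
first character. The following Python sketch shows one way to do so;
@code{parse_bound} and the labels it returns are hypothetical, and the
inputs in the usage note are invented, since this information is based
on the AIX documentation rather than on compiler output.

```python
# Leading letter of a non-constant bound -> how the bound is passed.
BOUND_KINDS = {
    "A": "stack-reference",     # by reference, at a stack offset
    "T": "stack-value",         # by value, at a stack offset
    "a": "register-reference",  # by reference, in a register
    "t": "register-value",      # by value, in a register
}

def parse_bound(text):
    """Parse one subrange bound into (kind, number).

    Hypothetical helper; decimal numbers only.
    """
    if text == "J":
        return ("none", None)  # there is no bound
    if text[:1] in BOUND_KINDS:
        return (BOUND_KINDS[text[0]], int(text[1:]))
    return ("constant", int(text))
```

An ordinary bound such as @samp{65535} comes back as a constant, while a
made-up input like @samp{T8} is reported as passed by value at stack
offset 8.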
+
+Subranges are also used for builtin types; see @ref{Traditional Builtin Types}.
+
+@node Arrays
+@section Array types
+
+Arrays use the @samp{a} type descriptor. Following the type descriptor
+is the type of the index and the type of the array elements. The two
+types are not separated by any sort of delimiter; if the type of
+the index does not end in a semicolon I don't know what is supposed to
+happen. IBM documents a semicolon between the two types. For the
+common case (a range type), this ends up as being the same since IBM
+documents a range type as not ending in a semicolon, but the latter does
+not accord with common practice, in which range types do end with
+semicolons.
The type of the index is often a range type, expressed as the letter r
-and some parameters. It defines the size of the array. In in the
-example below, the range @code{r1;0;2;} defines an index type which is
-a subrange of type 1 (integer), with a lower bound of 0 and an upper
-bound of 2. This defines the valid range of subscripts of a
-three-element C array.
+and some parameters. It defines the size of the array. In the example
+below, the range @code{r1;0;2;} defines an index type which is a
+subrange of type 1 (integer), with a lower bound of 0 and an upper bound
+of 2. This defines the valid range of subscripts of a three-element C
+array.
-The array definition above generates the assembly language that
-follows.
+For example, the definition
@example
-@exdent <32> N_GSYM - global variable
-@exdent .stabs "name:sym_desc(global)type_def(19)=type_desc(array)
-@exdent index_type_ref(range of int from 0 to 2);element_type_ref(char)";
-@exdent N_GSYM, NIL, NIL, NIL
+char char_vec[3] = @{'a','b','c'@};
+@end example
-32 .stabs "char_vec:G19=ar1;0;2;2",32,0,0,0
-33 .global _char_vec
-34 .align 4
-35 _char_vec:
-36 .byte 97
-37 .byte 98
-38 .byte 99
+@noindent
+produces the output
+
+@example
+.stabs "char_vec:G19=ar1;0;2;2",32,0,0,0
+ .global _char_vec
+ .align 4
+_char_vec:
+ .byte 97
+ .byte 98
+ .byte 99
+@end example
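To make the layout of the array type concrete, here is a small Python
sketch that reads the common case, an @samp{a} descriptor whose index is
a range type with constant decimal bounds. The function name
@code{parse_array} and the result keys are hypothetical; other index
types are deliberately not handled.

```python
import re

# a, then a range index "r<type>;<low>;<high>;", then the element type.
_ARRAY_PATTERN = re.compile(r"ar(\d+);(-?\d+);(-?\d+);(.+)")

def parse_array(text):
    """Parse an array type such as 'ar1;0;2;2'.

    Hypothetical helper; assumes a range index with constant bounds.
    """
    match = _ARRAY_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError("not an array with a constant range index")
    low, high = int(match.group(2)), int(match.group(3))
    return {
        "index_base": int(match.group(1)),  # type the index is a range of
        "low": low,
        "high": high,
        "count": high - low + 1,            # number of elements
        "element": match.group(4),          # element type information
    }
```

For @samp{ar1;0;2;2} from the @code{char_vec} stab, the sketch reports
three elements of type 2, indexed by a subrange of type 1.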
+
+If an array is @dfn{packed}, it means that the elements are spaced more
+closely than normal, saving memory at the expense of speed. For
+example, an array of 3-byte objects might, if unpacked, have each
+element aligned on a 4-byte boundary, but if packed, have no padding.
+One way to specify that something is packed is with type attributes
+(@pxref{Stabs Format}); in the case of arrays, another is to use the
+@samp{P} type descriptor instead of @samp{a}. Other than specifying a
+packed array, @samp{P} is identical to @samp{a}.
+
+@c FIXME-what is it? A pointer?
+An open array is represented by the @samp{A} type descriptor followed by
+type information specifying the type of the array elements.
+
+@c FIXME: what is the format of this type? A pointer to a vector of pointers?
+An N-dimensional dynamic array is represented by
+
+@example
+D @var{dimensions} ; @var{type-information}
+@end example
+
+@c Does dimensions really have this meaning? The AIX documentation
+@c doesn't say.
+@var{dimensions} is the number of dimensions; @var{type-information}
+specifies the type of the array elements.
+
+@c FIXME: what is the format of this type? A pointer to some offsets in
+@c another array?
+A subarray of an N-dimensional array is represented by
+
+@example
+E @var{dimensions} ; @var{type-information}
@end example
+@c Does dimensions really have this meaning? The AIX documentation
+@c doesn't say.
+@var{dimensions} is the number of dimensions; @var{type-information}
+specifies the type of the array elements.
+
+@node Strings
+@section Strings
+
+Some languages, like C or the original Pascal, do not have string types;
+they just have related things like arrays of characters. But most
+Pascals and various other languages have string types, which are
+indicated as follows:
+
+@table @code
+@item n @var{type-information} ; @var{bytes}
+@var{bytes} is the maximum length. I'm not sure what
+@var{type-information} is; I suspect that it means that this is a string
+of @var{type-information} (thus allowing a string of integers, a string
+of wide characters, etc., as well as a string of characters). Not sure
+what the format of this type is. This is an AIX feature.
+
+@item z @var{type-information} ; @var{bytes}
+Just like @samp{n} except that this is a gstring, not an ordinary
+string. I don't know the difference.
+
+@item N
+Pascal Stringptr. What is this? This is an AIX feature.
+@end table
+
@node Enumerations
@section Enumerations
-@table @strong
-@item Directive:
-@code{.stabs}
-@item Type:
-@code{N_LSYM}
-@item Symbol Descriptor:
-@code{T}
-@item Type Descriptor:
-@code{e}
-@end table
+Enumerations are defined with the @samp{e} type descriptor.
+@c FIXME: Where does this information properly go? Perhaps it is
+@c redundant with something we already explain.
The source line below declares an enumeration type. It is defined at
file scope between the bodies of main and s_proc in example2.c.
-Because the N_LSYM is located after the N_RBRAC that marks the end of
+The type definition is located after the N_RBRAC that marks the end of
the previous procedure's block scope, and before the N_FUN that marks
-the beginning of the next procedure's block scope, the N_LSYM does not
-describe a block local symbol, but a file local one. The source line:
+the beginning of the next procedure's block scope. Therefore it does not
+describe a block local symbol, but a file local one.
+
+The source line:
@example
-29 enum e_places @{first,second=3,last@};
+enum e_places @{first,second=3,last@};
@end example
@noindent
-generates the following stab, located just after the N_RBRAC (close
-brace stab) for main. The type definition is in an N_LSYM stab
-because type definitions are file scope not global scope.
-
-@display
- <128> N_LSYM - local symbol
- .stab "name:sym_dec(type)type_def(22)=sym_desc(enum)
- enum_name:value(0),enum_name:value(3),enum_name:value(4),;",
- N_LSYM, NIL, NIL, NIL
-@end display
+generates the following stab
@example
-104 .stabs "e_places:T22=efirst:0,second:3,last:4,;",128,0,0,0
+.stabs "e_places:T22=efirst:0,second:3,last:4,;",128,0,0,0
@end example
The symbol descriptor (T) says that the stab describes a structure,
the e is a list of the elements of the enumeration. The format is
name:value,. The list of elements ends with a ;.
-@node Structure tags
-@section Structure Tags
+There is no standard way to specify the size of an enumeration type; it
+is determined by the architecture (normally all enumeration types are
+32 bits). There should be a way to specify an enumeration type of
+another size; type attributes would be one way to do this
+(@pxref{Stabs Format}).
+
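As a rough illustration of the @samp{e} format described above, the
following C sketch walks the text that follows the @samp{e} type
descriptor and extracts each name and value. The names
@code{parse_enum_stab} and @code{struct enum_elem}, and the fixed size
of the name buffer, are assumptions made for this example; they are not
taken from any existing debugger.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record for one enumeration element. */
struct enum_elem {
    char name[64];
    long value;
};

/* Parse the text after the 'e' type descriptor, for example
   "first:0,second:3,last:4,;".  Stores up to max elements in out and
   returns the number parsed, or -1 if the text is malformed or holds
   more than max elements.  */
int parse_enum_stab(const char *p, struct enum_elem *out, int max)
{
    int n = 0;

    /* The list of elements ends with a semicolon. */
    while (*p != ';') {
        const char *colon = strchr(p, ':');
        char *end;
        size_t len;

        if (colon == NULL || n >= max)
            return -1;
        len = (size_t)(colon - p);
        if (len == 0 || len >= sizeof out[n].name)
            return -1;
        memcpy(out[n].name, p, len);
        out[n].name[len] = '\0';

        /* The value is a decimal integer terminated by a comma. */
        out[n].value = strtol(colon + 1, &end, 10);
        if (end == colon + 1 || *end != ',')
            return -1;
        p = end + 1;
        n++;
    }
    return n;
}
```

Given the text @samp{first:0,second:3,last:4,;} from the stab above,
this sketch would report three elements, the second being
@code{second} with value 3.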
+@node Structures
+@section Structures
@table @strong
@item Directive:
@code{.stabs}
@item Type:
-@code{N_LSYM}
+@code{N_LSYM} or @code{C_DECL}
@item Symbol Descriptor:
@code{T}
@item Type Descriptor:
definition for an element which is a pointer to type 16.
@node Typedefs
-@section Typedefs
-
-@table @strong
-@item Directive:
-@code{.stabs}
-@item Type:
-@code{N_LSYM}
-@item Symbol Descriptor:
-@code{t}
-@end table
-
-Here is the stab for the typedef equating the structure tag with a
-type.
+@section Giving a type a name
-@display
- <128> N_LSYM - type definition
- .stabs "name:sym_desc(type name)type_ref(struct_tag)",N_LSYM,NIL,NIL,NIL
-@end display
+To give a type a name, use the @samp{t} symbol descriptor. For example,
@example
-31 .stabs "s_typedef:t16",128,0,0,0
+.stabs "s_typedef:t16",128,0,0,0
@end example
-And here is the code generated for the structure variable.
-
-@display
- <32> N_GSYM - global symbol
- .stabs "name:sym_desc(global)type_ref(struct_tag)",N_GSYM,NIL,NIL,NIL
-@end display
-
-@example
-136 .stabs "g_an_s:G16",32,0,0,0
-137 .common _g_an_s,20,"bss"
-@end example
+specifies that @code{s_typedef} refers to type number 16. Such stabs
+have symbol type @code{N_LSYM} or @code{C_DECL}.
-Notice that the structure tag has the same type number as the typedef
-for the structure tag. It is impossible to distinguish between a
-variable of the struct type and one of its typedef by looking at the
-debugging information.
+If you are instead giving a name to a tag for a structure, union, or
+enumeration, use the @samp{T} symbol descriptor. I believe C is
+the only language with this feature.
+If the type is an opaque type (I believe this is a Modula-2 feature),
+AIX provides a type descriptor to specify it. The type descriptor is
+@samp{o} and is followed by a name. I don't know what the name
+means---is it always the same as the name of the type, or is this type
+descriptor used with a nameless stab (@pxref{Stabs Format})? There
+optionally follows a comma followed by type information which defines
+the type of this type. If omitted, a semicolon is used in place of the
+comma and the type information, and the type is much like a generic
+pointer type---it has a known size but little else about it is
+specified.
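To make the @samp{t} and @samp{T} forms in this section concrete, here
is a minimal C sketch that splits such a stab string into its name,
symbol descriptor, and type number. The function name
@code{parse_type_name_stab} is hypothetical. The sketch assumes a plain
decimal type number directly after the descriptor and a name that
contains no colon; it is a simplification, not a complete stab string
parser.

```c
#include <stdlib.h>
#include <string.h>

/* Split a stab string of the form NAME:tNUMBER or NAME:TNUMBER.
   Copies the name into name (at most name_size bytes including the
   terminating NUL), stores the symbol descriptor in *desc and the
   type number in *type_num.  Returns 0 on success, or -1 if the
   string does not have this form.  */
int parse_type_name_stab(const char *stab, char *name, size_t name_size,
                         char *desc, long *type_num)
{
    const char *colon = strchr(stab, ':');
    char *end;
    size_t len;

    if (colon == NULL)
        return -1;
    len = (size_t)(colon - stab);
    if (len >= name_size)
        return -1;

    /* Only the type-name ('t') and tag ('T') descriptors are handled. */
    if (colon[1] != 't' && colon[1] != 'T')
        return -1;

    memcpy(name, stab, len);
    name[len] = '\0';
    *desc = colon[1];

    /* Anything after the number, such as "=e...", is ignored here. */
    *type_num = strtol(colon + 2, &end, 10);
    if (end == colon + 2)
        return -1;
    return 0;
}
```

Applied to @samp{s_typedef:t16}, the sketch yields the name
@code{s_typedef}, descriptor @samp{t}, and type number 16.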
@node Unions
@section Unions
-@table @strong
-@item Directive:
-@code{.stabs}
-@item Type:
-@code{N_LSYM}
-@item Symbol Descriptor:
-@code{T}
-@item Type Descriptor:
-@code{u}
-@end table
-
Next let's look at unions. In example2 this union type is declared
locally to a procedure and an instance of the union is defined.
@node Function types
@section Function types
-@display
-type descriptor f
-@end display
+There are various types for function variables. These types are not
+used in defining functions (see symbol descriptor @samp{f}); they are
+used for things like pointers to functions.
-The last type descriptor in C which remains to be described is used
-for function types. Consider the following source line defining a
-global function pointer.
+The simple, traditional type is type descriptor @samp{f}, which is
+followed by type information for the return type of the function,
+followed by a semicolon.
+
+This does not deal with functions whose type includes the number and
+types of their parameters, as found in Modula-2 or ANSI C. AIX
+provides extensions to specify these, using the @samp{f}, @samp{F},
+@samp{p}, and @samp{R} type descriptors.
+
+First comes the type descriptor. Then, if it is @samp{f} or @samp{F},
+this is a function, and the type information for the return type of the
+function follows, followed by a comma. Then comes the number of
+parameters to the function and a semicolon. Then, for each parameter,
+there is the name of the parameter followed by a colon (this is only
+present for type descriptors @samp{R} and @samp{F} which represent
+Pascal function or procedure parameters), type information for the
+parameter, a comma, @samp{0} if passed by reference or @samp{1} if
+passed by value, and a semicolon. The type definition ends with a
+semicolon.
+
+For example,
@example
-4 int (*g_pf)();
+int (*g_pf)();
@end example
-It generates the following code. Since the variable is not
-initialized, the code is located in the common area at the end of the
-file.
-
-@display
- <32> N_GSYM - global variable
- .stabs "name:sym_desc(global)type_def(24)=ptr_to(25)=
- type_def(func)type_ref(int)
-@end display
+@noindent
+generates the following code:
@example
-134 .stabs "g_pf:G24=*25=f1",32,0,0,0
-135 .common _g_pf,4,"bss"
+.stabs "g_pf:G24=*25=f1",32,0,0,0
+ .common _g_pf,4,"bss"
@end example
-Since the variable is global, the stab type is N_GSYM and the symbol
-descriptor is G. The variable defines a new type, 24, which is a
-pointer to another new type, 25, which is defined as a function
-returning int.
+The variable defines a new type, 24, which is a pointer to another new
+type, 25, which is defined as a function returning int.
@node Symbol tables
@chapter Symbol information in symbol tables
215 0000e008 D _g_foo
@end example
-@node GNU Cplusplus stabs
+@node Cplusplus
@chapter GNU C++ stabs
@menu
* Static Members::
@end menu
-
-@subsection Symbol descriptors added for C++ descriptions:
-
-@display
-P - register parameter.
-@end display
-
@subsection type descriptors added for C++ descriptions
@table @code
@item #
method type (two ## if minimal debug)
-@item xs
-cross-reference
+@item @@
+Member (class and variable) type. It is followed by type information
+for the offset basetype, a comma, and type information for the type of
+the field being pointed to. (FIXME: this is acknowledged to be
+gibberish. Can anyone say what really goes here?).
+
+Note that there is a conflict between this and type attributes
+(@pxref{Stabs Format}); both use type descriptor @samp{@@}.
+Fortunately, the @samp{@@} type descriptor used in this C++ sense always
+will be followed by a digit, @samp{(}, or @samp{-}, and type attributes
+never start with those things.
@end table
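The rule above for telling the two uses of @samp{@@} apart can be
sketched in C as a one-character lookahead. The function name
@code{at_is_member_type} is an assumption made for this example, and the
sketch relies only on the rule stated above: the C++ member type is
always followed by a digit, @samp{(}, or @samp{-}.

```c
#include <ctype.h>

/* p points into a stab string.  Returns 1 if the '@' at p introduces
   a GNU C++ member type, 0 if it introduces a type attribute, and -1
   if p does not point at '@' at all.  */
int at_is_member_type(const char *p)
{
    if (p[0] != '@')
        return -1;

    /* A member type is followed by type information, which starts
       with a digit, '(' or '-'; type attributes never start so. */
    return isdigit((unsigned char)p[1]) || p[1] == '(' || p[1] == '-';
}
```

A reader of stab strings would call this when it meets @samp{@@} where
a type descriptor is expected, and then choose between the member-type
and type-attribute parsing paths.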
-
@node Basic Cplusplus types
@section Basic types for C++
137 .common _g_an_s,20,"bss"
@end example
-
-@node Quick reference
-@appendix Quick reference
-
-@menu
-* Stab types:: Table A: Symbol types from stabs
-* Assembler types:: Table B: Symbol types from assembler and linker
-* Symbol descriptors:: Table C
-* Type Descriptors:: Table D
-@end menu
-
@node Stab types
-@section Table A: Symbol types from stabs
+@appendix Table A: Symbol types from stabs
Table A lists stab types sorted by type number. Stab type numbers are
32 and greater. This is the full list of stab numbers, including stab
@end smallexample
@node Assembler types
-@section Table B: Symbol types from assembler and linker
+@appendix Table B: Symbol types from assembler and linker
Table B shows the types of symbol table entries that hold assembler
and linker symbols.
31 0x1f N_FN file name of a .o file
@end smallexample
-@node Symbol descriptors
-@section Table C: Symbol descriptors
+@node Symbol Descriptors
+@appendix Table C: Symbol descriptors
@c Please keep this alphabetical
@table @code
-@item (empty)
+@item @var{(digit)}
+@itemx (
+@itemx -
Local variable, @xref{Automatic variables}.
@item a
Constant, @xref{Constants}.
@item C
-Conformant array bound, @xref{Parameters}.
+Conformant array bound (Pascal, maybe other languages),
+@xref{Parameters}. Name of a caught exception (GNU C++). These can be
+distinguished because the latter uses N_CATCH and the former uses
+another symbol type.
@item d
Floating point register variable, @xref{Register variables}.
Module, @xref{Procedures}.
@item p
-Argument list parameter @xref{Parameters}.
+Argument list parameter, @xref{Parameters}.
@item pP
@xref{Parameters}.
@item pF
-@xref{Parameters}.
+FORTRAN Function parameter, @xref{Parameters}.
@item P
-Global Procedure (AIX), @xref{Procedures}.
-Register parameter (GNU), @xref{Parameters}.
+Global Procedure (AIX), @xref{Procedures}. Register parameter (GNU),
+@xref{Parameters}. These two uses can be distinguished because a
+register parameter uses N_PSYM and a procedure uses some other symbol
+type. Prototype of function referenced by this file (Sun acc) (have not
+yet investigated this conflict. FIXME).
@item Q
Static Procedure, @xref{Procedures}.
Type name, @xref{Typedefs}.
@item T
-enumeration, struct or union tag, @xref{Unions}.
+enumeration, struct or union tag, @xref{Typedefs}.
@item v
-Call by reference, @xref{Parameters}.
+Parameter passed by reference, @xref{Parameters}.
@item V
Static procedure scope variable @xref{Initialized statics},
@end table
@node Type Descriptors
-@section Table D: Type Descriptors
+@appendix Table D: Type Descriptors
@table @code
-@item (digits)
-Type reference, @xref{Overview}.
+@item @var{digit}
+@itemx (
+Type reference, @xref{Stabs Format}.
+
+@item -
+Reference to builtin type, @xref{Negative Type Numbers}.
+
+@item #
+Method (C++), @xref{Cplusplus}.
@item *
-Pointer type.
+Pointer, @xref{Miscellaneous Types}.
+
+@item &
+Reference (C++).
@item @@
-Type Attributes (AIX), @xref{Overview}.
-Some C++ thing (GNU).
+Type Attributes (AIX), @xref{Stabs Format}. Member (class and variable)
+type (GNU C++), @xref{Cplusplus}.
@item a
-Array type.
+Array, @xref{Arrays}.
+
+@item A
+Open array, @xref{Arrays}.
+
+@item b
+Pascal space type (AIX), @xref{Miscellaneous Types}. Builtin integer
+type (Sun), @xref{Builtin Type Descriptors}.
+
+@item B
+Volatile-qualified type, @xref{Miscellaneous Types}.
+
+@item c
+Complex builtin type, @xref{Builtin Type Descriptors}.
+
+@item C
+COBOL Picture type. See AIX documentation for details.
+
+@item d
+File type, @xref{Miscellaneous Types}.
+
+@item D
+N-dimensional dynamic array, @xref{Arrays}.
@item e
-Enumeration type.
+Enumeration type, @xref{Enumerations}.
+
+@item E
+N-dimensional subarray, @xref{Arrays}.
@item f
-Function type.
+Function type, @xref{Function types}.
+
+@item g
+Builtin floating point type, @xref{Builtin Type Descriptors}.
+
+@item G
+COBOL Group. See AIX documentation for details.
+
+@item i
+Imported type, @xref{Cross-references}.
+
+@item k
+Const-qualified type, @xref{Miscellaneous Types}.
+
+@item K
+COBOL File Descriptor. See AIX documentation for details.
+
+@item n
+String type, @xref{Strings}.
+
+@item N
+Stringptr, @xref{Strings}.
+
+@item M
+Multiple instance type, @xref{Miscellaneous Types}.
+
+@item o
+Opaque type, @xref{Typedefs}.
+
+@item P
+Packed array, @xref{Arrays}.
@item r
-Range type.
+Range type, @xref{Subranges}.
+
+@item R
+Builtin floating type, @xref{Builtin Type Descriptors}.
@item s
-Structure type.
+Structure type, @xref{Structures}.
+
+@item S
+Set type, @xref{Miscellaneous Types}.
@item u
-Union specifications.
+Union, @xref{Unions}.
+
+@item v
+Variant record. This is a Pascal and Modula-2 feature which is like a
+union within a struct in C. See AIX documentation for details.
+
+@item w
+Wide character, @xref{Builtin Type Descriptors}.
+
+@item x
+Cross-reference, @xref{Cross-references}.
+@item z
+gstring, @xref{Strings}.
@end table
@node Expanded reference
@appendix Expanded reference by stab type.
+@c FIXME: For most types this should be much shorter and much sweeter,
+@c see N_PSYM for an example. For stuff like N_SO where the stab type
+@c really is the important thing, the information can stay here.
+
+@c FIXME: It probably should be merged with Tables A and B.
+
Format of an entry:
The first line is the symbol type expressed in decimal, hexadecimal,
block. Begin the block with .bs s[RW] data_section_name for N_STSYM
or .bs s bss_section_name for N_LCSYM. End the block with .es
-@item
-xcoff stabs describing tags and typedefs use the N_DECL (0x8c)instead
-of N_LSYM stab type.
-
-@item
-xcoff uses N_RPSYM (0x8e) instead of the N_RSYM stab type for register
-variables. If the register variable is also a value parameter, then
-use R instead of P for the symbol descriptor.
-
-6.
-xcoff uses negative numbers as type references to the basic types.
-There are no boilerplate type definitions emited for these basic
-types. << make table of basic types and type numbers for C >>
-
-@item
-xcoff .stabx sometimes don't have the name part of the string field.
-
@item
xcoff uses a .file stab type to represent the source file name. There
is no stab for the path to the source file.