+++ /dev/null
-\input texinfo @c -*-texinfo-*-
-@c %**start of header
-@setfilename g++int.info
-@settitle G++ internals
-@setchapternewpage odd
-@ifinfo
-@dircategory Programming
-@direntry
-* G++ internals: (g++int). G++ Internals.
-@end direntry
-@end ifinfo
-@c %**end of header
-
-@node Top, Limitations of g++, (dir), (dir)
-@chapter Internal Architecture of the Compiler
-
-This is meant to describe the C++ front-end for gcc in detail.
-Questions and comments to Jason Merrill @email{jason@@redhat.com} and
-Mark Mitchell @email{mark@@codesourcery.com}.
-
-@menu
-* Limitations of g++::
-* Routines::
-* Implementation Specifics::
-* Glossary::
-* Macros::
-* Typical Behavior::
-* Coding Conventions::
-* Templates::
-* Access Control::
-* Error Reporting::
-* Parser::
-* Exception Handling::
-* Free Store::
-* Mangling:: Function name mangling for C++ and Java
-* Concept Index::
-@end menu
-
-@node Limitations of g++, Routines, Top, Top
-@section Limitations of g++
-
-@itemize @bullet
-@item
-Limitations on input source code: at most 240 nesting levels with the
-parser stack size (YYSTACKSIZE) set to 500 (the default); each nesting
-level requires around 16.4k of swap space. The parser needs about 2.09 *
-number of nesting levels worth of stack space.
-
-@cindex pushdecl_class_level
-@item
-I suspect there are other uses of pushdecl_class_level that do not call
-set_identifier_type_value in tandem with the call to
-pushdecl_class_level. It would seem to be an omission.
-
-@end itemize
-
-@node Routines, Implementation Specifics, Limitations of g++, Top
-@section Routines
-
-This section describes some of the routines used in the C++ front-end.
-
-@code{build_vtable} and @code{prepare_fresh_vtable} are used only within
-the @file{cp-class.c} file, and only in @code{finish_struct} and
-@code{modify_vtable_entries}.
-
-@code{build_vtable}, @code{prepare_fresh_vtable}, and
-@code{finish_struct} are the only routines that set @code{DECL_VPARENT}.
-
-@code{finish_struct} can steal the virtual function table from parents;
-this prevents related_vslot from working. When finish_struct steals,
-we know that
-
-@example
-get_binfo (DECL_FIELD_CONTEXT (CLASSTYPE_VFIELD (t)), t, 0)
-@end example
-
-@noindent
-will get the related binfo.
-
-@code{layout_basetypes} does something with the VIRTUALS.
-
-Supposedly (according to Tiemann) most of the breadth first searching
-done, like in @code{get_base_distance} and in @code{get_binfo}, was not
-because of any design decision. I have since found out that at least one
-part of the compiler needs the notion of depth first binfo searching. I
-am going to try to convert the whole thing; it should just work. The
-term left-most refers to the depth first left-most node. It uses
-@code{MAIN_VARIANT == type} as the condition to get left-most, because
-the things that have @code{BINFO_OFFSET}s of zero are shared and will
-have themselves as their own @code{MAIN_VARIANT}s. The non-shared right
-ones are copies of the left-most one; hence if it is its own
-@code{MAIN_VARIANT}, we know it IS a left-most one, and if it is not, it
-is a non-left-most one.
-
-@code{get_base_distance}'s path and distance matter in its use in:
-
-@itemize @bullet
-@item
-@code{prepare_fresh_vtable} (the code is probably wrong)
-@item
-@code{init_vfields} depends upon distance, probably in a safe way;
-build_offset_ref might use partial paths to do further lookups;
-hack_identifier is probably not properly checking access.
-
-@item
-@code{get_first_matching_virtual} probably should check for
-@code{get_base_distance} returning -2.
-
-@item
-@code{resolve_offset_ref} should be called in a more deterministic
-manner. Right now, it is called in some random contexts, like for
-arguments at @code{build_method_call} time, @code{default_conversion}
-time, @code{convert_arguments} time, @code{build_unary_op} time,
-@code{build_c_cast} time, @code{build_modify_expr} time,
-@code{convert_for_assignment} time, and
-@code{convert_for_initialization} time.
-
-But, there are still more contexts it needs to be called in, one was the
-ever simple:
-
-@example
-if (obj.*pmi != 7)
- @dots{}
-@end example
-
-It seems that the problems were due to the fact that @code{TREE_TYPE} of
-the @code{OFFSET_REF} was not an @code{OFFSET_TYPE}, but rather the type
-of the referent (like @code{INTEGER_TYPE}). This problem was fixed by
-changing @code{default_conversion} to check @code{TREE_CODE (x)},
-instead of only checking @code{TREE_CODE (TREE_TYPE (x))} to see if it
-was @code{OFFSET_TYPE}.
-
-@end itemize
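-
-The @code{obj.*pmi} case can be reproduced with a small test program. The
-following is only a sketch; the names @code{S}, @code{i}, @code{obj} and
-@code{pmi} are invented for illustration and are not taken from the
-compiler or its testsuite.
-
-@example
-#include <cassert>
-
-struct S @{ int i; @};
-
-int main ()
-@{
-  S obj = @{ 7 @};
-  int S::*pmi = &S::i;   // pointer to the data member S::i
-
-  // obj.*pmi has type int, so the operand of != needs the same
-  // conversions as any other int-valued expression.
-  if (obj.*pmi != 7)
-    return 1;
-  assert (obj.*pmi == 7);
-  return 0;
-@}
-@end example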
-
-@node Implementation Specifics, Glossary, Routines, Top
-@section Implementation Specifics
-
-@itemize @bullet
-@item Explicit Initialization
-
-The global list @code{current_member_init_list} contains the list of
-mem-initializers specified in a constructor declaration. For example:
-
-@example
-foo::foo() : a(1), b(2) @{@}
-@end example
-
-@noindent
-will initialize @samp{a} with 1 and @samp{b} with 2.
-@code{expand_member_init} places each initialization (a with 1) on the
-global list. Then, when the fndecl is being processed,
-@code{emit_base_init} runs down the list, initializing them. It used to
-be the case that g++ first ran down @code{current_member_init_list},
-then ran down the list of members initializing the ones that weren't
-explicitly initialized. Things were rewritten to perform the
-initializations in order of declaration in the class. So, for the above
-example, @samp{a} and @samp{b} will be initialized in the order that
-they were declared:
-
-@example
-class foo @{ public: int b; int a; foo (); @};
-@end example
-
-@noindent
-Thus, @samp{b} will be initialized with 2 first, then @samp{a} will be
-initialized with 1, regardless of how they're listed in the mem-initializer.
-
-@item The Explicit Keyword
-
-When @code{explicit} appears on a constructor, @code{grokdeclarator}
-sets the field @code{DECL_NONCONVERTING_P}. That value is used by
-@code{build_method_call} and @code{build_user_type_conversion_1} to decide
-if a particular constructor should be used as a candidate for conversions.
-
-@end itemize
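-
-Both items above can be observed from user code. The following sketch
-makes no claim about how the front-end implements them; the names
-@code{tracer}, @code{wrapper}, @code{take} and @code{order} are invented
-for this example. It records the order in which the members of
-@code{foo} are constructed, and shows an @code{explicit} constructor
-being used only when the conversion is written out.
-
-@example
-#include <cassert>
-#include <string>
-
-static std::string order;   // records the order of member construction
-
-struct tracer
-@{
-  tracer (char tag) @{ order += tag; @}
-@};
-
-// b is declared before a, so b is initialized first even though the
-// mem-initializer list names a first.
-struct foo
-@{
-  tracer b;
-  tracer a;
-  foo () : a ('a'), b ('b') @{@}
-@};
-
-struct wrapper
-@{
-  explicit wrapper (int v) : value (v) @{@}
-  int value;
-@};
-
-static int take (wrapper w) @{ return w.value; @}
-
-int main ()
-@{
-  foo f;
-  (void) f;
-  assert (order == "ba");
-  // take (3) would be rejected: the explicit constructor is not a
-  // candidate for the implicit conversion from int to wrapper.
-  assert (take (wrapper (3)) == 3);
-  return 0;
-@}
-@end example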
-
-@node Glossary, Macros, Implementation Specifics, Top
-@section Glossary
-
-@table @r
-@item binfo
-The main data structure in the compiler used to represent the
-inheritance relationships between classes. The data in the binfo can be
-accessed by the BINFO_ accessor macros.
-
-@item vtable
-@itemx virtual function table
-
-The virtual function table holds information used in virtual function
-dispatching. In the compiler, they are usually referred to as vtables,
-or vtbls. The first index is not used in the normal way; I believe it
-is probably used for the virtual destructor.
-
-@item vfield
-
-vfields can be thought of as the base information needed to build
-vtables. For every vtable that exists for a class, there is a vfield.
-See also vtable and virtual function table pointer. When a type is used
-as a base class to another type, the virtual function table for the
-derived class can be based upon the vtable for the base class, just
-extended to include the additional virtual methods declared in the
-derived class. The virtual function table from a virtual base class is
-never reused in a derived class. @code{is_normal} depends upon this.
-
-@item virtual function table pointer
-
-These are @code{FIELD_DECL}s that are pointer types that point to
-vtables. See also vtable and vfield.
-@end table
-
-@node Macros, Typical Behavior, Glossary, Top
-@section Macros
-
-This section describes some of the macros used on trees. The list
-should be alphabetical. Eventually all macros should be documented
-here.
-
-@table @code
-@item BINFO_BASETYPES
-A vector of additional binfos for the types inherited by this basetype.
-The binfos are fully unshared (except for virtual bases, in which
-case the binfo structure is shared).
-
- If this basetype describes type D as inherited in C,
- and if the basetypes of D are E and F,
- then this vector contains binfos for inheritance of E and F by C.
-
-Has values of:
-
- TREE_VECs
-
-
-@item BINFO_INHERITANCE_CHAIN
-Temporarily used to represent specific inheritances. It usually points
-to the binfo associated with the lesser derived type, but it can be
-reversed by reverse_path. For example:
-
-@example
-     Z  ZbY   least derived
-     |
-     Y  YbX
-     |
-     X  Xb    most derived
-
-TYPE_BINFO (X) == Xb
-BINFO_INHERITANCE_CHAIN (Xb) == YbX
-BINFO_INHERITANCE_CHAIN (Yb) == ZbY
-BINFO_INHERITANCE_CHAIN (Zb) == 0
-@end example
-
-Not sure if the above is really true; get_base_distance has it point
-towards the most derived type, opposite from above.
-
-Set by build_vbase_path, recursive_bounded_basetype_p,
-get_base_distance, lookup_field, lookup_fnfields, and reverse_path.
-
-What things can this be used on:
-
- TREE_VECs that are binfos
-
-
-@item BINFO_OFFSET
-The offset where this basetype appears in its containing type.
-The BINFO_OFFSET slot holds the offset (in bytes) from the base of the
-complete object to the base of the part of the object that is allocated
-on behalf of this `type'. This is always 0 except when there is
-multiple inheritance.
-
-Used on TREE_VEC_ELTs of the binfos BINFO_BASETYPES (...) for example.
-
-
-@item BINFO_VIRTUALS
-A unique list of functions for the virtual function table. See also
-TYPE_BINFO_VIRTUALS.
-
-What things can this be used on:
-
- TREE_VECs that are binfos
-
-
-@item BINFO_VTABLE
-Used to find the VAR_DECL that is the virtual function table associated
-with this binfo. See also TYPE_BINFO_VTABLE. To get the virtual
-function table pointer, see CLASSTYPE_VFIELD.
-
-What things can this be used on:
-
- TREE_VECs that are binfos
-
-Has values of:
-
- VAR_DECLs that are virtual function tables
-
-
-@item BLOCK_SUPERCONTEXT
-In the outermost scope of each function, it points to the FUNCTION_DECL
-node. It aids in better DWARF support of inline functions.
-
-
-@item CLASSTYPE_TAGS
-CLASSTYPE_TAGS is a linked (via TREE_CHAIN) list of member classes of a
-class. TREE_PURPOSE is the name, TREE_VALUE is the type (pushclass scans
-these and calls pushtag on them.)
-
-finish_struct scans these to produce TYPE_DECLs to add to the
-TYPE_FIELDS of the type.
-
-It is expected that the name found in the TREE_PURPOSE slot is unique;
-resolve_scope_to_name is one such place that depends upon this
-uniqueness.
-
-
-@item CLASSTYPE_METHOD_VEC
-The following is true after finish_struct has been called (on the
-class?) but not before. Before finish_struct is called, things are
-different to some extent. Contains a TREE_VEC of methods of the class.
-The TREE_VEC_LENGTH is the number of differently named methods plus one
-for the 0th entry. The 0th entry is always allocated, and reserved for
-ctors and dtors. If there are none, TREE_VEC_ELT(N,0) == NULL_TREE.
-Each entry of the TREE_VEC is a FUNCTION_DECL. For each FUNCTION_DECL,
-there is a DECL_CHAIN slot. If the FUNCTION_DECL is the last one with a
-given name, the DECL_CHAIN slot is NULL_TREE. Otherwise it is the next
-method that has the same name (but a different signature). It would
-seem not to be true that, because the DECL_CHAIN slot is used in this
-way, we cannot call pushdecl to put the method in the global scope (on
-the grounds that this would overwrite the TREE_CHAIN slot), because the
-two use different _CHAINs. finish_struct_methods sets up one version of
-the TREE_CHAIN slots on the FUNCTION_DECLs.
-
-Friends are kept in TREE_LISTs, so that there's no need to use their
-TREE_CHAIN slot for anything.
-
-Has values of:
-
- TREE_VECs
-
-
-@item CLASSTYPE_VFIELD
-Seems to be in the process of being renamed TYPE_VFIELD. Use on types
-to get the main virtual function table pointer. To get the virtual
-function table use BINFO_VTABLE (TYPE_BINFO ()).
-
-Has values of:
-
- FIELD_DECLs that are virtual function table pointers
-
-What things can this be used on:
-
- RECORD_TYPEs
-
-
-@item DECL_CLASS_CONTEXT
-Identifies the context that the _DECL was found in. For virtual function
-tables, it points to the type associated with the virtual function
-table. See also DECL_CONTEXT, DECL_FIELD_CONTEXT and DECL_FCONTEXT.
-
-The difference between this and DECL_CONTEXT is that for virtual
-functions like:
-
-@example
-struct A
-@{
- virtual int f ();
-@};
-
-struct B : A
-@{
- int f ();
-@};
-
-DECL_CONTEXT (A::f) == A
-DECL_CLASS_CONTEXT (A::f) == A
-
-DECL_CONTEXT (B::f) == A
-DECL_CLASS_CONTEXT (B::f) == B
-@end example
-
-Has values of:
-
- RECORD_TYPEs, or UNION_TYPEs
-
-What things can this be used on:
-
- TYPE_DECLs, _DECLs
-
-
-@item DECL_CONTEXT
-Identifies the context that the _DECL was found in. Can be used on
-virtual function tables to find the type associated with the virtual
-function table, but since they are FIELD_DECLs, DECL_FIELD_CONTEXT is a
-better access method. Internally the same as DECL_FIELD_CONTEXT, so
-don't use both. See also DECL_FIELD_CONTEXT, DECL_FCONTEXT and
-DECL_CLASS_CONTEXT.
-
-Has values of:
-
- RECORD_TYPEs
-
-
-What things can this be used on:
-
-@display
-VAR_DECLs that are virtual function tables
-_DECLs
-@end display
-
-
-@item DECL_FIELD_CONTEXT
-Identifies the context that the FIELD_DECL was found in. Internally the
-same as DECL_CONTEXT, so don't use both. See also DECL_CONTEXT,
-DECL_FCONTEXT and DECL_CLASS_CONTEXT.
-
-Has values of:
-
- RECORD_TYPEs
-
-What things can this be used on:
-
-@display
-FIELD_DECLs that are virtual function pointers
-FIELD_DECLs
-@end display
-
-
-@item DECL_NAME
-
-Has values of:
-
-@display
-0 for things that don't have names
-IDENTIFIER_NODEs for TYPE_DECLs
-@end display
-
-@item DECL_IGNORED_P
-A bit that can be set to inform the debug information output routines in
-the back-end that a certain _DECL node should be totally ignored.
-
-Used in cases where it is known that the debugging information will be
-output in another file, or where a sub-type is known not to be needed
-because the enclosing type is not needed.
-
-A compiler constructed virtual destructor in derived classes that do not
-define an explicit destructor that was defined explicitly in a base class
-has this bit set as well. Also used on __FUNCTION__ and
-__PRETTY_FUNCTION__ to mark that they are ``compiler generated.''
-c-decl.c and c-lex.c both want DECL_IGNORED_P set for ``internally
-generated vars,'' and ``user-invisible variable.''
-
-Functions built by the C++ front-end such as default destructors,
-virtual destructors and default constructors want to be marked that
-they are compiler generated, but it is unclear why.
-
-Currently, it is used in an absolute way in the C++ front-end, as an
-optimization, to tell the debug information output routines to not
-generate debugging information that will be output by another separately
-compiled file.
-
-
-@item DECL_VIRTUAL_P
-A flag used on FIELD_DECLs and VAR_DECLs. (Documentation in tree.h is
-wrong.) Used in VAR_DECLs to indicate that the variable is a vtable.
-It is also used in FIELD_DECLs for vtable pointers.
-
-What things can this be used on:
-
- FIELD_DECLs and VAR_DECLs
-
-
-@item DECL_VPARENT
-Used to point to the parent type of the vtable if there is one, else it
-is just the type associated with the vtable. Because of the sharing of
-virtual function tables that goes on, this slot is not very useful, and
-is, in fact, not used in the compiler at all. It can be removed.
-
-What things can this be used on:
-
- VAR_DECLs that are virtual function tables
-
-Has values of:
-
- RECORD_TYPEs maybe UNION_TYPEs
-
-
-@item DECL_FCONTEXT
-Used to find the first baseclass in which this FIELD_DECL is defined.
-See also DECL_CONTEXT, DECL_FIELD_CONTEXT and DECL_CLASS_CONTEXT.
-
-How it is used:
-
- Used when writing out debugging information about vfield and
- vbase decls.
-
-What things can this be used on:
-
- FIELD_DECLs that are virtual function pointers
- FIELD_DECLs
-
-
-@item DECL_REFERENCE_SLOT
-Used to hold the initializer for the reference.
-
-What things can this be used on:
-
- PARM_DECLs and VAR_DECLs that have a reference type
-
-
-@item DECL_VINDEX
-Used for FUNCTION_DECLs in two different ways. Before the structure
-containing the FUNCTION_DECL is laid out, DECL_VINDEX may point to a
-FUNCTION_DECL in a base class which is the FUNCTION_DECL which this
-FUNCTION_DECL will replace as a virtual function. When the class is
-laid out, this pointer is changed to an INTEGER_CST node which is
-suitable to find an index into the virtual function table. See
-get_vtable_entry as to how one can find the right index into the virtual
-function table. The first index, 0, of a virtual function table is not
-used in the normal way, so the first real index is 1.
-
-DECL_VINDEX may be a TREE_LIST, that would seem to be a list of
-overridden FUNCTION_DECLs. add_virtual_function has code to deal with
-this when it uses the variable base_fndecl_list, but it would seem that
-somehow, it is possible for the TREE_LIST to persist until method_call,
-and it should not.
-
-
-What things can this be used on:
-
- FUNCTION_DECLs
-
-
-@item DECL_SOURCE_FILE
-Identifies what source file a particular declaration was found in.
-
-Has values of:
-
- "<built-in>" on TYPE_DECLs to mean the typedef is built in
-
-
-@item DECL_SOURCE_LINE
-Identifies what source line number in the source file the declaration
-was found at.
-
-Has values of:
-
-@display
-0 for an undefined label
-
-0 for TYPE_DECLs that are internally generated
-
-0 for FUNCTION_DECLs for functions generated by the compiler
- (not yet, but should be)
-
-0 for ``magic'' arguments to functions, that the user has no
- control over
-@end display
-
-
-@item TREE_USED
-
-Has values of:
-
- 0 for unused labels
-
-
-@item TREE_ADDRESSABLE
-A flag that is set for any type that has a constructor.
-
-
-@item TREE_COMPLEXITY
-They seem to be a kludgy way to track recursion, popping, and pushing.
-They only appear in cp-decl.c and cp-decl2.c, so they are good
-candidates for proper fixing and removal.
-
-
-@item TREE_HAS_CONSTRUCTOR
-A flag to indicate when a CALL_EXPR represents a call to a constructor.
-If set, we know that the type of the object is the complete type of the
-object, and that the value returned is nonnull. When used in this
-fashion, it is an optimization. Can also be used on SAVE_EXPRs to
-indicate when they are of fixed type and nonnull. Can also be used on
-INDIRECT_EXPRs on CALL_EXPRs that represent a call to a constructor.
-
-
-@item TREE_PRIVATE
-Set for FIELD_DECLs by finish_struct, but not uniformly.
-
-The following routines do something with PRIVATE access:
-build_method_call, alter_access, finish_struct_methods,
-finish_struct, convert_to_aggr, CWriteLanguageDecl, CWriteLanguageType,
-CWriteUseObject, compute_access, lookup_field, dfs_pushdecl,
-GNU_xref_member, dbxout_type_fields, dbxout_type_method_1
-
-
-@item TREE_PROTECTED
-The following routines do something with PROTECTED access:
-build_method_call, alter_access, finish_struct, convert_to_aggr,
-CWriteLanguageDecl, CWriteLanguageType, CWriteUseObject,
-compute_access, lookup_field, GNU_xref_member, dbxout_type_fields,
-dbxout_type_method_1
-
-
-@item TYPE_BINFO
-Used to get the binfo for the type.
-
-Has values of:
-
- TREE_VECs that are binfos
-
-What things can this be used on:
-
- RECORD_TYPEs
-
-
-@item TYPE_BINFO_BASETYPES
-See also BINFO_BASETYPES.
-
-@item TYPE_BINFO_VIRTUALS
-A unique list of functions for the virtual function table. See also
-BINFO_VIRTUALS.
-
-What things can this be used on:
-
- RECORD_TYPEs
-
-
-@item TYPE_BINFO_VTABLE
-Points to the virtual function table associated with the given type.
-See also BINFO_VTABLE.
-
-What things can this be used on:
-
- RECORD_TYPEs
-
-Has values of:
-
- VAR_DECLs that are virtual function tables
-
-
-@item TYPE_NAME
-Names the type.
-
-Has values of:
-
-@display
-0 for things that don't have names.
-should be IDENTIFIER_NODE for RECORD_TYPEs UNION_TYPEs and
- ENUM_TYPEs.
-TYPE_DECL for RECORD_TYPEs, UNION_TYPEs and ENUM_TYPEs, but
- shouldn't be.
-TYPE_DECL for typedefs, unsure why.
-@end display
-
-What things can one use this on:
-
-@display
-TYPE_DECLs
-RECORD_TYPEs
-UNION_TYPEs
-ENUM_TYPEs
-@end display
-
-History:
-
- It currently points to the TYPE_DECL for RECORD_TYPEs,
- UNION_TYPEs and ENUM_TYPEs, but it should be history soon.
-
-
-@item TYPE_METHODS
-Synonym for @code{CLASSTYPE_METHOD_VEC}. Chained together with
-@code{TREE_CHAIN}. @file{dbxout.c} uses this to get at the methods of a
-class.
-
-
-@item TYPE_DECL
-Used to represent typedefs, and used to represent binding layers.
-
-Components:
-
- DECL_NAME is the name of the typedef. For example, foo would
- be found in the DECL_NAME slot when @code{typedef int foo;} is
- seen.
-
- DECL_SOURCE_LINE identifies what source line number in the
- source file the declaration was found at. A value of 0
- indicates that this TYPE_DECL is just an internal binding layer
- marker, and does not correspond to a user supplied typedef.
-
- DECL_SOURCE_FILE
-
-@item TYPE_FIELDS
-A linked list (via @code{TREE_CHAIN}) of member types of a class. The
-list can contain @code{TYPE_DECL}s, but there can also be other things
-in the list apparently. See also @code{CLASSTYPE_TAGS}.
-
-
-@item TYPE_VIRTUAL_P
-A flag used on a @code{FIELD_DECL} or a @code{VAR_DECL}, indicates it is
-a virtual function table or a pointer to one. When used on a
-@code{FUNCTION_DECL}, indicates that it is a virtual function. When
-used on an @code{IDENTIFIER_NODE}, indicates that a function with this
-same name exists and has been declared virtual.
-
-When used on types, it indicates that the type has virtual functions, or
-is derived from one that does.
-
-Not sure if the above about virtual function tables is still true. See
-also info on @code{DECL_VIRTUAL_P}.
-
-What things can this be used on:
-
- FIELD_DECLs, VAR_DECLs, FUNCTION_DECLs, IDENTIFIER_NODEs
-
-
-@item VF_BASETYPE_VALUE
-Get the associated type from the binfo that caused the given vfield to
-exist. This is the least derived class (the most parent class) that
-needed a virtual function table. It is probably the case that all uses
-of this field are misguided, but they need to be examined on a
-case-by-case basis. See history for more information on why the
-previous statement was made.
-
-Set at @code{finish_base_struct} time.
-
-What things can this be used on:
-
- TREE_LISTs that are vfields
-
-History:
-
- This field was used to determine if a virtual function table's
- slot should be filled in with a certain virtual function, by
- checking to see if the type returned by VF_BASETYPE_VALUE was a
- parent of the context in which the old virtual function existed.
- This incorrectly assumes that a given type _could_ not appear as
- a parent twice in a given inheritance lattice. For single
- inheritance, this would in fact work, because a type could not
- possibly appear more than once in an inheritance lattice, but
- with multiple inheritance, a type can appear more than once.
-
-
-@item VF_BINFO_VALUE
-Identifies the binfo that caused this vfield to exist. If this vfield
-is from the first direct base class that has a virtual function table,
-then VF_BINFO_VALUE is NULL_TREE, otherwise it will be the binfo of the
-direct base where the vfield came from. Can use @code{TREE_VIA_VIRTUAL}
-on result to find out if it is a virtual base class. Related to the
-binfo found by
-
-@example
-get_binfo (VF_BASETYPE_VALUE (vfield), t, 0)
-@end example
-
-@noindent
-where @samp{t} is the type that has the given vfield.
-
-@example
-get_binfo (VF_BASETYPE_VALUE (vfield), t, 0)
-@end example
-
-@noindent
-will return the binfo for the given vfield.
-
-May or may not be set at @code{modify_vtable_entries} time. Set at
-@code{finish_base_struct} time.
-
-What things can this be used on:
-
- TREE_LISTs that are vfields
-
-
-@item VF_DERIVED_VALUE
-Identifies the type of the most derived class of the vfield, excluding
-the class this vfield is for.
-
-Set at @code{finish_base_struct} time.
-
-What things can this be used on:
-
- TREE_LISTs that are vfields
-
-
-@item VF_NORMAL_VALUE
-Identifies the type of the most derived class of the vfield, including
-the class this vfield is for.
-
-Set at @code{finish_base_struct} time.
-
-What things can this be used on:
-
- TREE_LISTs that are vfields
-
-
-@item WRITABLE_VTABLES
-This is an option that can be defined when building the compiler, that
-will cause the compiler to output vtables into the data segment so that
-the vtables may be written. This is undefined by default, because
-normally the vtables should be unwritable. People who implement object
-I/O facilities, or who want to change the dynamic type of objects, may
-want to have the vtables writable. Another way of achieving
-this would be to make a copy of the vtable into writable memory, but the
-drawback there is that that method only changes the type for one object.
-
-@end table
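-
-Two remarks in the table above, that @code{BINFO_OFFSET} is non-zero
-only under multiple inheritance and that a type can appear more than
-once in an inheritance lattice (see the history of
-@code{VF_BASETYPE_VALUE}), can be observed from user code. The following
-is only a sketch with invented class names; it does not touch any
-compiler internals, and the second assertion assumes the usual layout in
-which the first non-empty base is placed at the start of the object.
-
-@example
-#include <cassert>
-
-struct A @{ int a; virtual ~A () @{@} @};
-struct B : A @{ int b; @};
-struct C : A @{ int c; @};
-struct D : B, C @{ int d; @};   // A appears twice in D's lattice
-
-int main ()
-@{
-  D d;
-  B *bp = &d;
-  C *cp = &d;
-
-  // The two A subobjects are distinct, so each conversion to A has
-  // to go through a specific path (via B or via C).
-  A *a_via_b = bp;
-  A *a_via_c = cp;
-  assert (a_via_b != a_via_c);
-
-  // Assumption: B is laid out first, so the C part of a D sits at a
-  // non-zero offset from the start of the complete object.
-  assert (static_cast<void *> (cp) != static_cast<void *> (&d));
-  return 0;
-@}
-@end example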
-
-@node Typical Behavior, Coding Conventions, Macros, Top
-@section Typical Behavior
-
-@cindex parse errors
-
-Whenever seemingly normal code fails with errors like
-@code{syntax error at `\@{'}, it's highly likely that grokdeclarator is
-returning a NULL_TREE for whatever reason.
-
-@node Coding Conventions, Templates, Typical Behavior, Top
-@section Coding Conventions
-
-It should never be the case that trees are modified in-place by the
-back-end, @emph{unless} it is guaranteed that the semantics are the same
-no matter how shared the tree structure is. @file{fold-const.c} still
-has some cases where this is not true, but rms hypothesizes that this
-will never be a problem.
-
-@node Templates, Access Control, Coding Conventions, Top
-@section Templates
-
-A template is represented by a @code{TEMPLATE_DECL}. The specific
-fields used are:
-
-@table @code
-@item DECL_TEMPLATE_RESULT
-The generic decl on which instantiations are based. This looks just
-like any other decl.
-
-@item DECL_TEMPLATE_PARMS
-The parameters to this template.
-@end table
-
-The generic decl is parsed as much like any other decl as possible,
-given the parameterization. The template decl is not built up until the
-generic decl has been completed. For template classes, a template decl
-is generated for each member function and static data member, as well.
-
-Template members of template classes are represented by a TEMPLATE_DECL
-for the class' parameters around another TEMPLATE_DECL for the member's
-parameters.
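-
-At the source level, the nesting described above corresponds to the two
-template parameter lists that appear when such a member is defined
-outside its class. A minimal sketch, using invented names (@code{A},
-@code{held} and @code{convert}):
-
-@example
-#include <cassert>
-
-template <class T>
-struct A
-@{
-  T held;
-  template <class U> U convert () const;
-@};
-
-// Outer parameter list for the class, inner one for the member.
-template <class T>
-template <class U>
-U A<T>::convert () const
-@{
-  return static_cast<U> (held);
-@}
-
-int main ()
-@{
-  A<int> a = @{ 3 @};
-  assert (a.convert<long> () == 3L);
-  assert (a.convert<double> () == 3.0);
-  return 0;
-@}
-@end example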
-
-All declarations that are instantiations or specializations of templates
-refer to their template and parameters through DECL_TEMPLATE_INFO.
-
-How should I handle parsing member functions with the proper param
-decls? Set them up again or try to use the same ones? Currently we do
-the former. We can probably do this without any extra machinery in
-store_pending_inline, by deducing the parameters from the decl in
-do_pending_inlines. PRE_PARSED_TEMPLATE_DECL?
-
-If a base is a parm, we can't check anything about it. If a base is not
-a parm, we need to check it for name binding. Do finish_base_struct if
-no bases are parameterized (only if none, including indirect, are
-parms). Nah, don't bother trying to do any of this until instantiation
--- we only need to do name binding in advance.
-
-Always set up method vec and fields, inc. synthesized methods. Really?
-We can't know the types of the copy folks, or whether we need a
-destructor, or can have a default ctor, until we know our bases and
-fields. Otherwise, we can assume and fix ourselves later. Hopefully.
-
-@node Access Control, Error Reporting, Templates, Top
-@section Access Control
-The function compute_access returns one of three values:
-
-@table @code
-@item access_public
-means that the field can be accessed by the current lexical scope.
-
-@item access_protected
-means that the field cannot be accessed by the current lexical scope
-because it is protected.
-
-@item access_private
-means that the field cannot be accessed by the current lexical scope
-because it is private.
-@end table
-
-DECL_ACCESS is used for access declarations; alter_access creates a list
-of types and accesses for a given decl.
-
-Formerly, DECL_@{PUBLIC,PROTECTED,PRIVATE@} corresponded to the return
-codes of compute_access and were used as a cache for compute_access.
-Now they are not used at all.
-
-TREE_PROTECTED and TREE_PRIVATE are used to record the access levels
-granted by the containing class. BEWARE: TREE_PUBLIC means something
-completely unrelated to access control!
-
-@node Error Reporting, Parser, Access Control, Top
-@section Error Reporting
-
-The C++ front-end uses a call-back mechanism to allow functions to print
-out reasonable strings for types and functions without putting extra
-logic in the functions where errors are found. The interface is through
-the @code{cp_error} function (or @code{cp_warning}, etc.). The
-syntax is exactly like that of @code{error}, except that a few more
-conversions are supported:
-
-@itemize @bullet
-@item
-%C indicates a value of `enum tree_code'.
-@item
-%D indicates a *_DECL node.
-@item
-%E indicates a *_EXPR node.
-@item
-%L indicates a value of `enum languages'.
-@item
-%P indicates the name of a parameter (i.e. "this", "1", "2", ...)
-@item
-%T indicates a *_TYPE node.
-@item
-%O indicates the name of an operator (MODIFY_EXPR -> "operator =").
-
-@end itemize
-
-There is some overlap between these; for instance, any of the node
-options can be used for printing an identifier (though only @code{%D}
-tries to decipher function names).
-
-For a more verbose message (@code{class foo} as opposed to just @code{foo},
-including the return type for functions), use @code{%#c}, where @code{c}
-stands for one of the conversion letters above (for example, @code{%#T}).
-To have the line number on the error message indicate the line of the
-DECL, use @code{cp_error_at} and its ilk; to indicate which argument you want,
-use @code{%+D}, or it will default to the first.
-
-@node Parser, Exception Handling, Error Reporting, Top
-@section Parser
-
-Some comments on the parser:
-
-The @code{after_type_declarator} / @code{notype_declarator} hack is
-necessary in order to allow redeclarations of @code{TYPENAME}s, for
-instance
-
-@example
-typedef int foo;
-class A @{
- char *foo;
-@};
-@end example
-
-In the above, the first @code{foo} is parsed as a @code{notype_declarator},
-and the second as an @code{after_type_declarator}.
-
-Ambiguities:
-
-There are currently four reduce/reduce ambiguities in the parser. They are:
-
-1) Between @code{template_parm} and
-@code{named_class_head_sans_basetype}, for the tokens @code{aggr
-identifier}. This situation occurs in code looking like
-
-@example
-template <class T> class A @{ @};
-@end example
-
-It is ambiguous whether @code{class T} should be parsed as the
-declaration of a template type parameter named @code{T} or an unnamed
-constant parameter of type @code{class T}. Section 14.6, paragraph 3 of
-the January '94 working paper states that the first interpretation is
-the correct one. This ambiguity results in two reduce/reduce conflicts.
-
-2) Between @code{primary} and @code{type_id} for code like @samp{int()}
-in places where both can be accepted, such as the argument to
-@code{sizeof}. Section 8.1 of the pre-San Diego working paper specifies
-that these ambiguous constructs will be interpreted as @code{typename}s.
-This ambiguity results in six reduce/reduce conflicts between
-@samp{absdcl} and @samp{functional_cast}.
-
-3) Between @code{functional_cast} and
-@code{complex_direct_notype_declarator}, for various token strings.
-This situation occurs in code looking like
-
-@example
-int (*a);
-@end example
-
-This code is ambiguous; it could be a declaration of the variable
-@samp{a} as a pointer to @samp{int}, or it could be a functional cast of
-@samp{*a} to @samp{int}. Section 6.8 specifies that the former
-interpretation is correct. This ambiguity results in 7 reduce/reduce
-conflicts. Another aspect of this ambiguity is code like 'int (x[2]);',
-which is resolved at the '[' and accounts for 6 reduce/reduce conflicts
-between @samp{direct_notype_declarator} and
-@samp{primary}/@samp{overqualified_id}. Finally, there are 4 r/r
-conflicts between @samp{expr_or_declarator} and @samp{primary} over code
-like 'int (a);', which could probably be resolved but would also
-probably be more trouble than it's worth. In all, this situation
-accounts for 17 conflicts. Ack!
-
-The second case above is responsible for the failure to parse 'LinppFile
-ppfile (String (argv[1]), &outs, argc, argv);' (from Rogue Wave
-Math.h++) as an object declaration, and must be fixed so that it does
-not resolve until later.
-
-4) Indirectly between @code{after_type_declarator} and @code{parm}, for
-type names. This occurs in (as one example) code like
-
-@example
-typedef int foo, bar;
-class A @{
- foo (bar);
-@};
-@end example
-
-What is @code{bar} inside the class definition? We currently interpret
-it as a @code{parm}, as does Cfront, but IBM xlC interprets it as an
-@code{after_type_declarator}. I believe that xlC is correct, in light
-of 7.1p2, which says "The longest sequence of @i{decl-specifiers} that
-could possibly be a type name is taken as the @i{decl-specifier-seq} of
-a @i{declaration}." However, it seems clear that this rule must be
-violated in the case of constructors. This ambiguity accounts for 8
-conflicts.
-
-Unlike the others, this ambiguity is not recognized by the Working Paper.
-
-@node Exception Handling, Free Store, Parser, Top
-@section Exception Handling
-
-Note, exception handling in g++ is still under development.
-
-This section describes the mapping of C++ exceptions in the C++
-front-end, into the back-end exception handling framework.
-
-The basic mechanism of exception handling in the back-end is
-unwind-protect a la elisp. This is a general, robust, and language
-independent representation for exceptions.
-
-C++ exceptions are mapped into the unwind-protect semantics by the C++
-front-end. The mapping is described below.
-
-When -frtti is used, rtti is used to do exception object type checking;
-when it isn't used, the encoded name for the type of the object being
-thrown is used instead. All code that originates exceptions, even code
-that throws exceptions as a side effect, like dynamic casting, and all
-code that catches exceptions must be compiled consistently with either
--frtti or -fno-rtti. It is not possible to mix rtti-based exception
-handling objects with code that doesn't use rtti. The exceptions to this
-are code that doesn't catch or throw exceptions, catch (...), and code
-that just rethrows an exception.
-
-Currently we use the normal mangling used in building function names
-(int's are "i", const char * is "PCc") to build the non-rtti-based type
-descriptors for exception handling. These descriptors are just plain
-NUL-terminated strings, and internally they are passed around as char
-*.
-
-In C++, all cleanups should be protected by exception regions. The
-region starts just after the reason why the cleanup is created has
-ended. For example, with an automatic variable, that has a constructor,
-it would be right after the constructor is run. The region ends just
-before the finalization is expanded. Since the backend may expand the
-cleanup multiple times along different paths, once for normal end of the
-region, once for non-local gotos, once for returns, etc, the backend
-must take special care to protect the finalization expansion, if the
-expansion is for any other reason than normal region end, and it is
-`inline' (it is inside the exception region). The backend can either
-choose to move them out of line, or it can create an exception region
-over the finalization to protect it, and in the handler associated with
-it, it would not run the finalization as it otherwise would have, but
-rather just rethrow to the outer handler, careful to skip the normal
-handler for the original region.
-
-In Ada, they will use the more runtime intensive approach of having
-fewer regions, but at the cost of additional work at run time, to keep a
-list of things that need cleanups. When a variable has finished
-construction, they add the cleanup to the list; when they come to the end
-of the lifetime of the variable, they run the list down. If they take a
-hit before the section finishes normally, they examine the list for
-actions to perform. I hope they add this logic into the back-end, as it
-would be nice to get that alternative approach in C++.
-
-On an rs6000, xlC stores exception objects on the stack, under the try
-block. When it unwinds down into a handler, the frame pointer is
-adjusted back to the normal value for the frame in which the handler
-resides, and the stack pointer is left unchanged from the time at which
-the object was thrown. This is so that there is always someplace for
-the exception object, and nothing can overwrite it, once we start
-throwing. The only bad part, is that the stack remains large.
-
-The below points out some things that work in g++'s exception handling.
-
-All completely constructed temps and local variables are cleaned up in
-all unwound scopes. Completely constructed parts of partially
-constructed objects are cleaned up. This includes partially built
-arrays. Exception specifications are now handled. Thrown objects are
-now cleaned up all the time. We can now tell if we have an active
-exception being thrown or not (__eh_type != 0). We use this to call
-terminate if someone does a throw; without there being an active
-exception object. uncaught_exception () works. Exception handling
-should work right if you optimize. Exception handling should work with
--fpic or -fPIC.
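-
-As a small illustration of the partially constructed object case, here
-is a sketch in standard C++. It is not taken from the g++ test suite;
-the names @code{Part}, @code{Thrower}, @code{Whole} and
-@code{build_whole} are made up for this example. The second member's
-constructor throws, so only the first member has been completely
-constructed, and only it should be destroyed during unwinding.
-
```cpp
#include <stdexcept>

// Counters used to observe which sub-objects get cleaned up.
int live_parts = 0;
int destroyed_parts = 0;

struct Part {
    Part() { ++live_parts; }
    ~Part() { --live_parts; ++destroyed_parts; }
};

// A member whose constructor always fails.
struct Thrower {
    Thrower() { throw std::runtime_error("ctor failed"); }
};

// Members are constructed in declaration order: first, second, third.
// `second` throws, so `third` is never constructed and `first` must be
// destroyed while the exception propagates.
struct Whole {
    Part first;
    Thrower second;
    Part third;
};

// Returns true if the exception from the constructor reached the handler.
bool build_whole() {
    try {
        Whole w;
        (void)w;
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```
-
-After a call to @code{build_whole}, exactly one @code{Part} should have
-been destroyed and none should remain live.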
-
-The below points out some flaws in g++'s exception handling, as it now
-stands.
-
-Only exact type matching or reference matching of throw types works when
--fno-rtti is used. Exception handling only works on SPARC (like Suns;
-both the -mflat and -mno-flat models work), SPARClite, Hitachi SH, i386,
-arm, rs6000, PowerPC, Alpha, mips, VAX, m68k and z8k machines. SPARC v9
-may not
-work. HPPA is mostly done, but throwing between a shared library and
-user code doesn't yet work. Some targets have support for data-driven
-unwinding. Partial support is in for all other machines, but a stack
-unwinder called __unwind_function has to be written, and added to
-libgcc2 for them. The new EH code doesn't rely upon the
-__unwind_function for C++ code, instead it creates per function
-unwinders right inside the function, unfortunately, on many platforms
-the definition of RETURN_ADDR_RTX in the tm.h file for the machine port
-is wrong. See below for details on __unwind_function. RTL_EXPRs for EH
-cond variables for && and || exprs should probably be wrapped in
-UNSAVE_EXPRs, and RTL_EXPRs tweaked so that they can be unsaved.
-
-We only do pointer conversions on exception matching a la 15.3 p2 case
-3: `A handler with type T, const T, T&, or const T& is a match for a
-throw-expression with an object of type E if [3]T is a pointer type and
-E is a pointer type that can be converted to T by a standard pointer
-conversion (_conv.ptr_) not involving conversions to pointers to private
-or protected base classes.' when -frtti is given.
-
-We don't call delete on new expressions that die because the ctor threw
-an exception. See except/18 for a test case.
-
-15.2 para 13: The exception being handled should be rethrown if control
-reaches the end of a handler of the function-try-block of a constructor
-or destructor; right now, it is not.
-
-15.2 para 12: If a return statement appears in a handler of
-function-try-block of a constructor, the program is ill-formed, but this
-isn't diagnosed.
-
-15.2 para 11: If the handlers of a function-try-block contain a jump
-into the body of a constructor or destructor, the program is ill-formed,
-but this isn't diagnosed.
-
-15.2 para 9: Check that the fully constructed base classes and members
-of an object are destroyed before entering the handler of a
-function-try-block of a constructor or destructor for that object.
-
-build_exception_variant should sort the incoming list, so that it
-implements set compares, not exact list equality. Type smashing should
-smash exception specifications using set union.
-
-Thrown objects are usually allocated on the heap, in the usual way. If
-one runs out of heap space, throwing an object will probably never work.
-This could be relaxed some by passing an __in_chrg parameter to track
-who has control over the exception object. Thrown objects are not
-allocated on the heap when they are pointer to object types. We should
-extend it so that all small (<4*sizeof(void*)) objects are stored
-directly, instead of allocated on the heap.
-
-When the backend returns a value, it can create new exception regions
-that need protecting. The new region should rethrow the object in
-context of the last associated cleanup that ran to completion.
-
-The structure of the code that is generated for C++ exception handling
-is shown below:
-
-@example
-Ln: throw value;
- copy value onto heap
- jump throw (Ln, id, address of copy of value on heap)
-
- try @{
-+Lstart: the start of the main EH region
-|... ...
-+Lend: the end of the main EH region
- @} catch (T o) @{
- ...1
- @}
-Lresume:
- nop used to make sure there is something before
- the next region ends, if there is one
-... ...
-
- jump Ldone
-[
-Lmainhandler: handler for the region Lstart-Lend
- cleanup
-] zero or more, depending upon automatic vars with dtors
-+Lpartial:
-| jump Lover
-+Lhere:
- rethrow (Lhere, same id, same obj);
-Lterm: handler for the region Lpartial-Lhere
- call terminate
-Lover:
-[
- [
- call throw_type_match
- if (eq) @{
- ] these lines disappear when there is no catch condition
-+Lsregion2:
-| ...1
-| jump Lresume
-|Lhandler: handler for the region Lsregion2-Leregion2
-| rethrow (Lresume, same id, same obj);
-+Leregion2
- @}
-] there are zero or more of these sections, depending upon how many
- catch clauses there are
------------------------------ expand_end_all_catch --------------------------
- here we have fallen off the end of all catch
- clauses, so we rethrow to outer
- rethrow (Lresume, same id, same obj);
------------------------------ expand_end_all_catch --------------------------
-[
-L1: maybe throw routine
-] depending upon if we have expanded it or not
-Ldone:
- ret
-
-start_all_catch emits labels: Lresume,
-
-@end example
-
-The __unwind_function takes a pointer to the throw handler, and is
-expected to pop the stack frame that was built to call it, as well as
-the frame underneath and then jump to the throw handler. It must
-restore all registers to their proper values as well as all other
-machine state as determined by the context into which we are
-unwinding. The way I normally start is to compile:
-
- void *g;
- foo(void* a) @{ g = a; @}
-
-with -S, and change the thing that alters the PC (return, or ret
-usually) to not alter the PC, making sure to leave all other semantics
-(like adjusting the stack pointer, or frame pointers) in. After that,
-replicate the prologue once more at the end, again, changing the PC
-altering instructions, and finally, at the very end, jump to `g'.
-
-It takes about a week to write this routine. If someone wants to
-volunteer to write it for any architecture, exception support
-for that architecture will be added to g++. Please send in those code
-donations. One other thing that needs to be done is to double check
-that __builtin_return_address (0) works.
-
-@subsection Specific Targets
-
-For the alpha, the __unwind_function will be something resembling:
-
-@example
-void
-__unwind_function(void *ptr)
-@{
- /* First frame */
- asm ("ldq $15, 8($30)"); /* get the saved frame ptr; 15 is fp, 30 is sp */
- asm ("bis $15, $15, $30"); /* reload sp with the fp we found */
-
- /* Second frame */
- asm ("ldq $15, 8($30)"); /* fp */
- asm ("bis $15, $15, $30"); /* reload sp with the fp we found */
-
- /* Return */
- asm ("ret $31, ($16), 1"); /* return to PTR, stored in a0 */
-@}
-@end example
-
-@noindent
-However, there are a few problems preventing it from working. First of
-all, the gcc-internal function @code{__builtin_return_address} needs to
-work given an argument of 0 for the alpha. As it stands as of August
-30th, 1995, the code for @code{BUILT_IN_RETURN_ADDRESS} in @file{expr.c}
-will definitely not work on the alpha. Instead, we need to define
-the macros @code{DYNAMIC_CHAIN_ADDRESS} (maybe),
-@code{RETURN_ADDR_IN_PREVIOUS_FRAME}, and definitely need a new
-definition for @code{RETURN_ADDR_RTX}.
-
-In addition (and more importantly), we need a way to reliably find the
-frame pointer on the alpha. The use of the value 8 above to restore the
-frame pointer (register 15) is incorrect. On many systems, the frame
-pointer is consistently offset to a specific point on the stack. On the
-alpha, however, the frame pointer is pushed last. First the return
-address is stored, then any other registers are saved (e.g., @code{s0}),
-and finally the frame pointer is put in place. So @code{fp} could have
-an offset of 8, but if the calling function saved any registers at all,
-they add to the offset.
-
-The only places the frame size is noted are with the @samp{.frame}
-directive, for use by the debugger and the OSF exception handling model
-(useless to us), and in the initial computation of the new value for
-@code{sp}, the stack pointer. For example, the function may start with:
-
-@example
-lda $30,-32($30)
-.frame $15,32,$26,0
-@end example
-
-@noindent
-The 32 above is exactly the value we need. With this, we can be sure
-that the frame pointer is stored 8 bytes less---in this case, at 24(sp).
-The drawback is that there is no way that I (Brendan) have found to let
-us discover the size of a previous frame @emph{inside} the definition
-of @code{__unwind_function}.
-
-So to accomplish exception handling support on the alpha, we need two
-things: first, a way to figure out where the frame pointer was stored,
-and second, a functional @code{__builtin_return_address} implementation
-for except.c to be able to use it.
-
-Or just support DWARF 2 unwind info.
-
-@subsection New Backend Exception Support
-
-This subsection discusses various aspects of the design of the
-data-driven model being implemented for the exception handling backend.
-
-The goal is to generate enough data during the compilation of user code,
-such that we can dynamically unwind through functions at run time with a
-single routine (@code{__throw}) that lives in libgcc.a, built by the
-compiler, and dispatch into associated exception handlers.
-
-This information is generated by the DWARF 2 debugging backend, and
-includes all of the information __throw needs to unwind an arbitrary
-frame. It specifies where all of the saved registers and the return
-address can be found at any point in the function.
-
-Major disadvantages when enabling exceptions are:
-
-@itemize @bullet
-@item
-Code cannot use caller-saved registers when control can be
-transferred into it from an exception handler. In high performance
-code this should not usually be true, so the effects should be minimal.
-
-@end itemize
-
-@subsection Backend Exception Support
-
-The backend must be extended to fully support exceptions. Right now
-there are a few hooks from the backend into the alpha exception handling
-backend that resides in the C++ frontend; these hooks allow exception
-handling to work in g++. An exception region is a segment of generated
-code that has a handler associated with it. The exception regions are
-represented in the generated code as address ranges, each given by a
-starting PC value and an ending PC value. Some of the limitations
-with this scheme are:
-
-@itemize @bullet
-@item
-The backend replicates insns for such things as loop unrolling and
-function inlining. Right now, there are no hooks into the frontend's
-exception handling backend to handle the replication of insns. When
-replication happens, a new exception region descriptor needs to be
-generated for the new region.
-
-@item
-The backend expects to be able to rearrange code, for things like jump
-optimization. Any rearranging of the code needs to have exception region
-descriptors updated appropriately.
-
-@item
-The backend can eliminate dead code. Any associated exception region
-descriptor that refers to fully contained code that has been eliminated
-should also be removed, although not doing this is harmless in terms of
-semantics.
-
-@end itemize
-
-The above is not meant to be exhaustive, but does include all things I
-have thought of so far. I am sure other limitations exist.
-
-Below are some notes on the migration of the exception handling code
-backend from the C++ frontend to the backend.
-
-NOTEs are to be used to denote the start of an exception region, and the
-end of the region. I presume that the interface used to generate these
-notes in the backend would be two functions, start_exception_region and
-end_exception_region (or something like that). The frontends are
-required to call them in pairs. When marking the end of a region, an
-argument can be passed to indicate the handler for the marked region.
-This can be passed in many ways, currently a tree is used. Another
-possibility would be insns for the handler, or a label that denotes a
-handler. I have a feeling insns might be the best way to pass it.
-Semantics are, if an exception is thrown inside the region, control is
-transferred unconditionally to the handler. If control passes through
-the handler, then the backend is to rethrow the exception, in the
-context of the end of the original region. The handler is protected by
-the conventional mechanisms; it is the frontend's responsibility to
-protect the handler, if special semantics are required.
-
-This is a very low level view, and it would be nice if the backend
-supported a somewhat higher level view in addition to this view. This
-higher level could include source line number, name of the source file,
-name of the language that threw the exception and possibly the name of
-the exception. Kenner may want to rope you into doing more than just
-the basics required by C++. You will have to resolve this. He may want
-you to do support for non-local gotos, first scan for exception handler,
-if none is found, allow the debugger to be entered, without any cleanups
-being done. To do this, the backend would have to know the difference
-between a cleanup-rethrower and a real handler; it would also have to
-have a way to know if a handler `matches' a thrown exception, and this
-is frontend specific.
-
-The stack unwinder is one of the hardest parts to do. It is highly
-machine dependent. The form that Kenner seems to like was a couple of
-macros, that would do the machine dependent grunt work. One preexisting
-function that might be of some use is __builtin_return_address (). One
-macro he seemed to want was __builtin_return_address, and the other
-would do the hard work of fixing up the registers, adjusting the stack
-pointer, frame pointer, arg pointer and so on.
-
-
-@node Free Store, Mangling, Exception Handling, Top
-@section Free Store
-
-@code{operator new []} adds a magic cookie to the beginning of arrays
-for which the number of elements will be needed by @code{operator delete
-[]}. These are arrays of objects with destructors and arrays of objects
-that define @code{operator delete []} with the optional size_t argument.
-This cookie can be examined from a program as follows:
-
-@example
-typedef unsigned long size_t;
-extern "C" int printf (const char *, ...);
-
-size_t nelts (void *p)
-@{
- struct cookie @{
- size_t nelts __attribute__ ((aligned (sizeof (double))));
- @};
-
- cookie *cp = (cookie *)p;
- --cp;
-
- return cp->nelts;
-@}
-
-struct A @{
- ~A() @{ @}
-@};
-
-int main()
-@{
- A *ap = new A[3];
- printf ("%lu\n", nelts (ap));
-@}
-@end example
-
-@section Linkage
-The linkage code in g++ is horribly twisted in order to meet two design goals:
-
-1) Avoid unnecessary emission of inlines and vtables.
-
-2) Support pedantic assemblers like the one in AIX.
-
-To meet the first goal, we defer emission of inlines and vtables until
-the end of the translation unit, where we can decide whether or not they
-are needed, and how to emit them if they are.
-
-@node Mangling, Concept Index, Free Store, Top
-@section Function name mangling for C++ and Java
-
-Both C++ and Java provide overloaded functions and methods,
-which are methods with the same name but different parameter lists.
-Selecting the correct version is done at compile time.
-Though the overloaded functions have the same name in the source code,
-they need to be translated into different assembler-level names,
-since typical assemblers and linkers cannot handle overloading.
-This process of encoding the parameter types with the method name
-into a unique name is called @dfn{name mangling}. The inverse
-process is called @dfn{demangling}.
-
-It is convenient that C++ and Java use compatible mangling schemes,
-since this makes life easier for tools such as gdb, and it eases
-integration between C++ and Java.
-
-Note there is also a standard "Java Native Interface" (JNI) which
-implements a different calling convention, and uses a different
-mangling scheme. The JNI is a rather abstract ABI so Java can call methods
-written in C or C++;
-we are concerned here about a lower-level interface primarily
-intended for methods written in Java, but that can also be used for C++
-(and less easily C).
-
-Note that on systems that follow BSD tradition, a C identifier @code{var}
-would get "mangled" into the assembler name @samp{_var}. On such
-systems, all other mangled names are also prefixed by a @samp{_}
-which is not shown in the following examples.
-
-@subsection Method name mangling
-
-C++ mangles a method by emitting the function name, followed by @code{__},
-followed by encodings of any method qualifiers (such as @code{const}),
-followed by the mangling of the method's class,
-followed by the mangling of the parameters, in order.
-
-For example @code{Foo::bar(int, long) const} is mangled
-as @samp{bar__C3Fooil}.
-
-For a constructor, the method name is left out.
-That is, @code{Foo::Foo(int, long)} is mangled
-as @samp{__3Fooil}.
-
-GNU Java does the same.
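-
-The rule above can be sketched in a few lines of C++. This is only an
-illustration, not the code in the compiler: the helpers
-@code{mangle_simple_name} and @code{mangle_method} are hypothetical
-names, and the caller is assumed to supply the qualifier and parameter
-codes (such as @samp{C}, @samp{i} and @samp{l}) already encoded.
-
```cpp
#include <string>
#include <vector>

// Hypothetical helper: a simple name is its length followed by its
// characters, e.g. "Foo" -> "3Foo".
std::string mangle_simple_name(const std::string& name) {
    return std::to_string(name.size()) + name;
}

// Hypothetical helper: function name, "__", method qualifier codes,
// the mangled class name, then the parameter codes in order.
std::string mangle_method(const std::string& method,
                          const std::string& qualifiers,
                          const std::string& class_name,
                          const std::vector<std::string>& param_codes) {
    std::string out = method + "__" + qualifiers + mangle_simple_name(class_name);
    for (const std::string& code : param_codes) {
        out += code;
    }
    return out;
}
```
-
-With these helpers, @code{mangle_method("bar", "C", "Foo", @{"i", "l"@})}
-gives @samp{bar__C3Fooil}. Passing an empty method name gives the
-constructor form.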
-
-@subsection Primitive types
-
-The C++ types @code{int}, @code{long}, @code{short}, @code{char},
-and @code{long long} are mangled as @samp{i}, @samp{l},
-@samp{s}, @samp{c}, and @samp{x}, respectively.
-The corresponding unsigned types have @samp{U} prefixed
-to the mangling. The type @code{signed char} is mangled @samp{Sc}.
-
-The C++ and Java floating-point types @code{float} and @code{double}
-are mangled as @samp{f} and @samp{d} respectively.
-
-The C++ @code{bool} type and the Java @code{boolean} type are
-mangled as @samp{b}.
-
-The C++ @code{wchar_t} and the Java @code{char} types are
-mangled as @samp{w}.
-
-The Java integral types @code{byte}, @code{short}, @code{int}
-and @code{long} are mangled as @samp{c}, @samp{s}, @samp{i},
-and @samp{x}, respectively.
-
-C++ code that has included @code{javatypes.h} will mangle
-the typedefs @code{jbyte}, @code{jshort}, @code{jint}
-and @code{jlong} as respectively @samp{c}, @samp{s}, @samp{i},
-and @samp{x}. (This has not been implemented yet.)
-
-@subsection Mangling of simple names
-
-A simple class, package, template, or namespace name is
-encoded as the number of characters in the name, followed by
-the actual characters. Thus the class @code{Foo}
-is encoded as @samp{3Foo}.
-
-If any of the characters in the name are not alphanumeric
-(i.e., not one of the standard ASCII letters, digits, or '_'),
-or the initial character is a digit, then the name is
-mangled as a sequence of encoded Unicode letters.
-A Unicode encoding starts with a @samp{U} to indicate
-that Unicode escapes are used, followed by the number of
-bytes used by the Unicode encoding, followed by the bytes
-representing the encoding. ASCII letters and
-non-initial digits are encoded without change. However, all
-other characters (including underscore and initial digits) are
-translated into a sequence starting with an underscore,
-followed by the big-endian 4-hex-digit lower-case encoding of the character.
-
-If a method name contains Unicode-escaped characters, the
-entire mangled method name is followed by a @samp{U}.
-
-For example, the method @code{X\u0319::M\u002B(int)} is encoded as
-@samp{M_002b__U6X_0319iU}.
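-
-A sketch of the name encoding just described, in C++. It is an
-illustration under stated assumptions rather than the compiler's own
-code: the function names are hypothetical, and names are assumed to
-arrive as UTF-16 code units in a @code{std::u16string}.
-
```cpp
#include <cstddef>
#include <cstdio>
#include <string>

bool is_ascii_letter(char16_t c) {
    return (c >= u'a' && c <= u'z') || (c >= u'A' && c <= u'Z');
}

bool is_ascii_digit(char16_t c) {
    return c >= u'0' && c <= u'9';
}

// A name needs Unicode escapes if it starts with a digit or contains
// anything other than ASCII letters, digits, or '_'.
bool needs_escape(const std::u16string& name) {
    if (!name.empty() && is_ascii_digit(name[0])) return true;
    for (char16_t c : name) {
        if (!is_ascii_letter(c) && !is_ascii_digit(c) && c != u'_') return true;
    }
    return false;
}

// ASCII letters and non-initial digits pass through unchanged; every
// other character becomes '_' plus four lower-case hex digits.
std::string escape_name(const std::u16string& name) {
    std::string out;
    for (std::size_t i = 0; i < name.size(); ++i) {
        const char16_t c = name[i];
        if (is_ascii_letter(c) || (is_ascii_digit(c) && i > 0)) {
            out += static_cast<char>(c);
        } else {
            char buf[6];
            std::snprintf(buf, sizeof buf, "_%04x", static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}

// Plain names are length-prefixed; escaped names get a leading 'U'
// and the length of the escaped form.
std::string mangle_name(const std::u16string& name) {
    if (!needs_escape(name)) {
        std::string plain;
        for (char16_t c : name) plain += static_cast<char>(c);
        return std::to_string(plain.size()) + plain;
    }
    const std::string escaped = escape_name(name);
    return "U" + std::to_string(escaped.size()) + escaped;
}
```
-
-For the class name @code{X\u0319} this yields @samp{U6X_0319}, which is
-the class part of the mangled method name shown above.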
-
-
-@subsection Pointer and reference types
-
-A C++ pointer type is mangled as @samp{P} followed by the
-mangling of the type pointed to.
-
-A C++ reference type is mangled as @samp{R} followed by the
-mangling of the type referenced.
-
-A Java object reference type is equivalent
-to a C++ pointer parameter, so we mangle such a parameter type
-as @samp{P} followed by the mangling of the class name.
-
-@subsection Squangled type compression
-
-Squangling (enabled with the @samp{-fsquangle} option) uses the
-@samp{B} code to indicate reuse of a previously seen type within an
-identifier. Types are recognized in a left to right manner and given
-increasing values, which are appended to the code in the standard
-manner, i.e., multiple digit numbers are delimited by @samp{_}
-characters. A type is considered to be any non-primitive type,
-regardless of whether it's a parameter, template parameter, or entire
-template. Certain codes are considered modifiers of a type, and are not
-included as part of the type. These are the @samp{C}, @samp{V},
-@samp{P}, @samp{A}, @samp{R}, @samp{U} and @samp{u} codes, denoting
-constant, volatile, pointer, array, reference, unsigned, and restrict.
-These codes may precede a @samp{B} type in order to make the required
-modifications to the type.
-
-For example:
-@example
-template <class T> class class1 @{ @};
-
-template <class T> class class2 @{ @};
-
-class class3 @{ @};
-
-int f(class2<class1<class3> > a ,int b, const class1<class3>&c, class3 *d) @{ @}
-
- B0 -> class2<class1<class3> >
- B1 -> class1<class3>
- B2 -> class3
-@end example
-This produces the mangled name @samp{f__FGt6class21Zt6class11Z6class3iRCB1PB2}.
-The int parameter is a basic type, and does not receive a B encoding...
-
-@subsection Qualified names
-
-Both C++ and Java allow a class to be lexically nested inside another
-class. C++ also supports namespaces.
-Java also supports packages.
-
-These are all mangled the same way: First the letter @samp{Q}
-indicates that we are emitting a qualified name.
-That is followed by the number of parts in the qualified name.
-If that number is 9 or less, it is emitted with no delimiters.
-Otherwise, an underscore is written before and after the count.
-Then follows each part of the qualified name, as described above.
-
-For example @code{Foo::\u0319::Bar} is encoded as
-@samp{Q33FooU5_03193Bar}.
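-
-The counting rule can be sketched as follows. The function name
-@code{mangle_qualified} is hypothetical, and each part is assumed to
-have been mangled already (as a simple or Unicode-escaped name) before
-it is passed in.
-
```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: 'Q', then the number of parts (bare if it is 9
// or less, otherwise wrapped in underscores), then each mangled part.
std::string mangle_qualified(const std::vector<std::string>& mangled_parts) {
    const std::size_t count = mangled_parts.size();
    std::string out = "Q";
    if (count <= 9) {
        out += std::to_string(count);
    } else {
        out += "_" + std::to_string(count) + "_";
    }
    for (const std::string& part : mangled_parts) {
        out += part;
    }
    return out;
}
```
-
-Given the parts @samp{3Foo}, @samp{U5_0319} and @samp{3Bar}, this
-returns @samp{Q33FooU5_03193Bar}, matching the example above.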
-
-Squangling uses the letter @samp{K} to indicate a
-remembered portion of a qualified name. As qualified names are processed
-for an identifier, the names are numbered and remembered in a
-manner similar to the @samp{B} type compression code.
-Names are recognized left to right, and given increasing values, which are
-appended to the code in the standard manner, i.e., multiple digit numbers
-are delimited by @samp{_} characters.
-
-For example:
-@example
-class Andrew
-@{
- class WasHere
- @{
- class AndHereToo
- @{
- @};
- @};
-@};
-
-f(Andrew&r1, Andrew::WasHere& r2, Andrew::WasHere::AndHereToo& r3) @{ @}
-
- K0 -> Andrew
- K1 -> Andrew::WasHere
- K2 -> Andrew::WasHere::AndHereToo
-@end example
-Function @samp{f()} would be mangled as:
-@samp{f__FR6AndrewRQ2K07WasHereRQ2K110AndHereToo}
-
-There are some occasions when either a @samp{B} or @samp{K} code could
-be chosen; preference is always given to the @samp{B} code. For instance, the example
-in the section on @samp{B} mangling could have used a @samp{K} code
-instead of @samp{B2}.
-
-@subsection Templates
-
-A class template instantiation is encoded as the letter @samp{t},
-followed by the encoding of the template name, followed by
-the number of template parameters, followed by encoding of the template
-parameters. If a template parameter is a type, it is written
-as a @samp{Z} followed by the encoding of the type. If it is a
-template, it is encoded as @samp{z} followed by the parameter
-of the template template parameter and the template name.
-
-A function template specialization (either an instantiation or an
-explicit specialization) is encoded by an @samp{H} followed by the
-encoding of the template parameters, as described above, followed by an
-@samp{_}, the encoding of the argument types to the template function
-(not the specialization), another @samp{_}, and the return type. (Like
-the argument types, the return type is the return type of the function
-template, not the specialization.) Template parameters in the argument
-and return types are encoded by an @samp{X} for type parameters,
-@samp{zX} for template parameters,
-or a @samp{Y} for constant parameters, an index indicating their position
-in the template parameter list declaration, and their template depth.
-
-@subsection Arrays
-
-C++ array types are mangled by emitting @samp{A}, followed by
-the length of the array, followed by an @samp{_}, followed by
-the mangling of the element type. Of course, normally
-array parameter types decay into pointer types, so you
-don't see this.
-
-Java arrays are objects. A Java type @code{T[]} is mangled
-as if it were the C++ type @code{JArray<T>}.
-For example @code{java.lang.String[]} is encoded as
-@samp{Pt6JArray1ZPQ34java4lang6String}.
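-
-The Java array example combines the pointer rule with the class
-template rule for type parameters. A minimal sketch follows; the
-function names are hypothetical, only type parameters (the @samp{Z}
-case) are handled, and all inputs are assumed to be mangled already.
-
```cpp
#include <string>
#include <vector>

// Hypothetical helper: 'P' followed by the mangling of the pointee.
std::string mangle_pointer(const std::string& mangled_pointee) {
    return "P" + mangled_pointee;
}

// Hypothetical helper: 't', the mangled template name, the number of
// template parameters, then 'Z' plus the mangling of each type argument.
std::string mangle_class_template(const std::string& mangled_name,
                                  const std::vector<std::string>& mangled_type_args) {
    std::string out = "t" + mangled_name + std::to_string(mangled_type_args.size());
    for (const std::string& arg : mangled_type_args) {
        out += "Z" + arg;
    }
    return out;
}
```
-
-Wrapping @samp{Q34java4lang6String} in a pointer, then in
-@code{JArray}, then in another pointer reproduces the encoding above.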
-
-@subsection Static fields
-
-Both C++ and Java classes can have static fields.
-These are allocated statically, and are shared among all instances.
-
-The mangling starts with a prefix (@samp{_} in most systems), which is
-followed by the mangling
-of the class name, followed by the "joiner" and finally the field name.
-The joiner (see @code{JOINER} in @code{cp-tree.h}) is a special
-separator character. For historical reasons (and idiosyncrasies
-of assembler syntax) it can be @samp{$} or @samp{.} (or even
-@samp{_} on a few systems). If the joiner is @samp{_} then the prefix
-is @samp{__static_} instead of just @samp{_}.
-
-For example @code{Foo::Bar::var} (or @code{Foo.Bar.var} in Java syntax)
-would be encoded as @samp{_Q23Foo3Bar$var} or @samp{_Q23Foo3Bar.var}
-(or rarely @samp{__static_Q23Foo3Bar_var}).
-
-If the name of a static variable needs Unicode escapes,
-the Unicode indicator @samp{U} comes before the "joiner".
-Thus @code{\u1234Foo::var\u3445} becomes @code{_U8_1234FooU.var_3445}.
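-
-For names that need no Unicode escapes, the prefix and joiner rules can
-be sketched as below. The function name @code{mangle_static_field} is
-hypothetical, the ordinary prefix is assumed to be @samp{_}, and the
-class name is assumed to be mangled already.
-
```cpp
#include <string>

// Hypothetical helper: prefix, mangled class name, joiner, field name.
// When the joiner is '_', the prefix is "__static_" rather than "_".
std::string mangle_static_field(const std::string& mangled_class,
                                const std::string& field,
                                char joiner) {
    const std::string prefix = (joiner == '_') ? "__static_" : "_";
    return prefix + mangled_class + joiner + field;
}
```
-
-Calling it with @samp{Q23Foo3Bar}, @samp{var} and each of the three
-possible joiners reproduces the three encodings listed above.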
-
-@subsection Table of demangling code characters
-
-The following special characters are used in mangling:
-
-@table @samp
-@item A
-Indicates a C++ array type.
-
-@item b
-Encodes the C++ @code{bool} type,
-and the Java @code{boolean} type.
-
-@item B
-Used for squangling. Similar in concept to the 'T' non-squangled code.
-
-@item c
-Encodes the C++ @code{char} type, and the Java @code{byte} type.
-
-@item C
-A modifier to indicate a @code{const} type.
-Also used to indicate a @code{const} member function
-(in which cases it precedes the encoding of the method's class).
-
-@item d
-Encodes the C++ and Java @code{double} types.
-
-@item e
-Indicates extra unknown arguments @code{...}.
-
-@item E
-Indicates the opening parenthesis of an expression.
-
-@item f
-Encodes the C++ and Java @code{float} types.
-
-@item F
-Used to indicate a function type.
-
-@item H
-Used to indicate a template function.
-
-@item i
-Encodes the C++ and Java @code{int} types.
-
-@item I
-Encodes typedef names of the form @code{int@var{n}_t}, where @var{n} is a
-positive decimal number. The @samp{I} is followed by either two
-hexadecimal digits, which encode the value of @var{n}, or by an
-arbitrary number of hexadecimal digits between underscores. For
-example, @samp{I40} encodes the type @code{int64_t}, and @samp{I_200_}
-encodes the type @code{int512_t}.
-
-@item J
-Indicates a complex type.
-
-@item K
-Used by squangling to compress qualified names.
-
-@item l
-Encodes the C++ @code{long} type.
-
-@item n
-Immediate repeated type. Followed by the repeat count.
-
-@item N
-Repeated type. Followed by the repeat count of the repeated type,
-followed by the type index of the repeated type. Due to a bug in
-g++ 2.7.2, this is only generated if the index is 0. Superseded by
-@samp{n} when squangling.
-
-@item O
-Pointer-to-member type.
-
-@item o
-Indicates a vector type.
-
-@item P
-Indicates a pointer type. Followed by the type pointed to.
-
-@item Q
-Used to mangle qualified names, which arise from nested classes.
-Also used for namespaces.
-In Java, used to mangle package-qualified names and inner classes.
-
-@item r
-Encodes the GNU C++ @code{long double} type.
-
-@item R
-Indicates a reference type. Followed by the referenced type.
-
-@item s
-Encodes the C++ and Java @code{short} types.
-
-@item S
-A modifier that indicates that the following integer type is signed.
-Only used with @code{char}.
-
-Also used as a modifier to indicate a static member function.
-
-@item t
-Indicates a template instantiation.
-
-@item T
-A back reference to a previously seen type.
-
-@item U
-A modifier that indicates that the following integer type is unsigned.
-Also used to indicate that the following class or namespace name
-is encoded using Unicode-mangling.
-
-@item u
-The @code{restrict} type qualifier.
-
-@item v
-Encodes the C++ and Java @code{void} types.
-
-@item V
-A modifier for a @code{volatile} type or method.
-
-@item w
-Encodes the C++ @code{wchar_t} type, and the Java @code{char} type.
-
-@item W
-Indicates the closing parenthesis of an expression.
-
-@item x
-Encodes the GNU C++ @code{long long} type, and the Java @code{long} type.
-
-@item X
-Encodes a template type parameter, when part of a function type.
-
-@item Y
-Encodes a template constant parameter, when part of a function type.
-
-@item z
-Used for template template parameters.
-
-@item Z
-Used for template type parameters.
-
-@end table
-
-The letters @samp{G}, @samp{M}, @samp{O}, and @samp{p}
-also appear to be used, for purposes not documented here.
-
-@node Concept Index, , Mangling, Top
-
-@section Concept Index
-
-@printindex cp
-
-@bye