From: Richard Kenner
Date: Mon, 28 Aug 1995 10:15:04 +0000 (-0400)
Subject: entered into RCS
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=da7525f204ecda934ee282bc26c5e4ee9e2332d0;p=gcc.git

entered into RCS

From-SVN: r10287
---

diff --git a/gcc/README.DWARF b/gcc/README.DWARF
new file mode 100644
index 00000000000..ac4719d0493
--- /dev/null
+++ b/gcc/README.DWARF
@@ -0,0 +1,574 @@
+Notes on the GNU Implementation of DWARF Debugging Information
+--------------------------------------------------------------
+Last Updated: Sun Jul 17 08:17:42 PDT 1994 by rfg@segfault.us.com
+------------------------------------------------------------
+
+This file describes special and unique aspects of the GNU implementation
+of the DWARF debugging information language, as provided in the GNU version
+2.x compiler(s).
+
+For general information about the DWARF debugging information language,
+you should obtain the DWARF version 1 specification document (and perhaps
+also the DWARF version 2 draft specification document) developed by the
+UNIX International Programming Languages Special Interest Group. A copy
+of the DWARF version 1 specification (in PostScript form) may be
+obtained either from me or from the main Data General
+FTP server. (See below.) The file you are looking at now only describes
+known deviations from the DWARF version 1 specification, together with
+those things which are allowed by the DWARF version 1 specification but
+which are known to cause interoperability problems (e.g. with SVR4 SDB).
+
+To obtain a copy of the DWARF Version 1 and/or DWARF Version 2 specification
+from Data General's FTP server, use the following procedure:
+
+---------------------------------------------------------------------------
+    ftp to machine: "dg-rtp.dg.com" (128.222.1.2).
+
+    Log in as "ftp".
+    cd to "plsig"
+    get any of the following files you are interested in:
+
+        dwarf.1.0.3.ps
+        dwarf.2.0.0.index.ps
+        dwarf.2.0.0.ps
+---------------------------------------------------------------------------
+
+The generation of DWARF debugging information by the GNU version 2.x C
+compiler has now been tested rather extensively for m88k, i386, i860, and
+Sparc targets. The DWARF output of the GNU C compiler appears to
+interoperate well with the standard SVR4 SDB debugger on these kinds of target
+systems (but of course, there are no guarantees).
+
+DWARF generation for the GNU g++ compiler is still not operable. This is
+due primarily to the many remaining cases where the g++ front end does not
+conform to the conventions used in the GNU C front end for representing
+various kinds of declarations in the TREE data structure. It is not clear
+at this time how these problems will be addressed.
+
+Future plans for the dwarfout.c module of the GNU compiler(s) include the
+addition of full support for GNU FORTRAN. (This should, in theory, be a
+lot simpler to add than adding support for g++... but we'll see.)
+
+Many features of the DWARF version 2 specification have been adapted to
+(and used in) the GNU implementation of DWARF (version 1). In most of
+these cases, a DWARF version 2 approach is used in place of (or in addition
+to) DWARF version 1 stuff simply because it is apparent that DWARF version
+1 is not sufficiently expressive to provide the kinds of information which
+may be necessary to support really robust debugging. In all of these cases
+however, the use of DWARF version 2 features should not interfere in any
+way with the interoperability (of GNU compilers) with generally available
+"classic" (pre version 1) DWARF consumer tools (e.g. SVR4 SDB).
+
+The DWARF generation enhancement for the GNU compiler(s) was initially
+donated to the Free Software Foundation by Network Computing Devices.
+(Thanks NCD!)
+Additional development and maintenance of dwarfout.c has
+been largely supported (i.e. funded) by Intel Corporation. (Thanks Intel!)
+
+If you have questions or comments about the DWARF generation feature, please
+send mail to me . I will be happy to investigate any bugs
+reported and I may even provide fixes (but of course, I can make no promises).
+
+The DWARF debugging information produced by GCC may deviate in a few minor
+(but perhaps significant) respects from the DWARF debugging information
+currently produced by other C compilers. A serious attempt has been made
+however to conform to the published specifications, to existing practice,
+and to generally accepted norms in the GNU implementation of DWARF.
+
+  ** IMPORTANT NOTE **    ** IMPORTANT NOTE **    ** IMPORTANT NOTE **
+
+Under normal circumstances, the DWARF information generated by the GNU
+compilers (in an assembly language file) is essentially impossible for
+a human being to read. This fact can make it very difficult to debug
+certain DWARF-related problems. In order to overcome this difficulty,
+a feature has been added to dwarfout.c (enabled by the -fverbose-asm
+option) which causes additional comments to be placed into the assembly
+language output file, out to the right-hand side of most bits of DWARF
+material. The comments indicate (far more clearly than the obscure
+DWARF hex codes do) what is actually being encoded in DWARF. Thus, the
+-fverbose-asm option can be highly useful for those who must study the
+DWARF output from the GNU compilers in detail.
+
+---------
+
+(Footnote: Within this file, the term `Debugging Information Entry' will
+be abbreviated as `DIE'.)
+
+
+Release Notes (aka known bugs)
+-------------------------------
+
+In one very obscure case involving dynamically sized arrays, the DWARF
+"location information" for such an array may make it appear that the
+array has been totally optimized out of existence, when in fact it
+*must* actually exist.
+(This only happens when you are using *both* -g
+*and* -O.) This is due to aggressive dead store elimination in the
+compiler, and to the fact that the DECL_RTL expressions associated with
+variables are not always updated to correctly reflect the effects of
+GCC's aggressive dead store elimination.
+
+-------------------------------
+
+When attempting to set a breakpoint at the "start" of a function compiled
+with -g1, the debugger currently has no way of knowing exactly where the
+end of the prologue code for the function is. Thus, for most targets,
+all the debugger can do is to set the breakpoint at the AT_low_pc address
+for the function. But if you stop there and then try to look at one or
+more of the formal parameter values, they may not have been "homed" yet,
+so you may get inaccurate answers (or perhaps even addressing errors).
+
+Some people may consider this simply a non-feature, but I consider it a
+bug, and I hope to provide some GNU-specific attributes (on function
+DIEs) which will specify the address of the end of the prologue and the
+address of the beginning of the epilogue in a future release.
+
+-------------------------------
+
+It is believed at this time that old bugs relating to the AT_bit_offset
+values for bit-fields have been fixed.
+
+There may still be some very obscure bugs relating to the DWARF description
+of type `long long' bit-fields for target machines (e.g. 80x86 machines)
+where the alignment of type `long long' data objects is different from
+(and less than) the size of a type `long long' data object.
+
+Please report any problems with the DWARF description of bit-fields as you
+would any other GCC bug. (Procedures for bug reporting are given in the
+GNU C compiler manual.)
+
+--------------------------------
+
+At this time, GCC does not know how to handle the GNU C "nested functions"
+extension. (See the GCC manual for more info on this extension to ANSI C.)
+
+--------------------------------
+
+The GNU compilers now represent inline functions (and inlined instances
+thereof) in exactly the manner described by the current DWARF version 2
+(draft) specification. The version 1 specification for handling inline
+functions (and inlined instances) was known to be brain-damaged (by the
+PLSIG) when the version 1 spec was finalized, but it was simply too late
+in the cycle to get it removed before the version 1 spec was formally
+released to the public (by UI).
+
+--------------------------------
+
+At this time, GCC does not generate the kind of really precise information
+about the exact declared types of entities with signed integral types which
+is required by the current DWARF draft specification.
+
+Specifically, the current DWARF draft specification seems to require that
+the type of a non-unsigned integral bit-field member of a struct or union
+type be represented as either a "signed" type or as a "plain" type,
+depending upon the exact set of keywords that were used in the
+type specification for the given bit-field member. It was felt (by the
+UI/PLSIG) that this distinction between "plain" and "signed" integral types
+could have some significance (in the case of bit-fields) because ANSI C
+does not constrain the signedness of a plain bit-field, whereas it does
+constrain the signedness of an explicitly "signed" bit-field. For this
+reason, the current DWARF specification calls for compilers to produce
+type information (for *all* integral typed entities... not just bit-fields)
+which explicitly indicates the signedness of the relevant type to be
+"signed" or "plain" or "unsigned".
+
+Unfortunately, the GNU DWARF implementation is currently incapable of making
+such distinctions.
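To make the "plain" versus "signed" versus "unsigned" distinction concrete, here is a minimal C sketch. The struct and function names are purely illustrative assumptions and do not come from dwarfout.c. It declares three bit-fields that differ only in their declaration keywords, which is exactly the information the draft specification wants recorded:

```c
#include <stdbool.h>

/* Illustrative struct (not taken from GCC): three 3-bit members that
   differ only in the keywords used to declare them. */
struct flags {
    int          plain_field    : 3; /* "plain": signedness is up to the implementation */
    signed int   signed_field   : 3; /* explicitly signed: holds -4..3 */
    unsigned int unsigned_field : 3; /* explicitly unsigned: holds 0..7 */
};

/* Return a struct in which every member has all three bits set. */
struct flags all_ones(void)
{
    struct flags f;
    f.plain_field = -1;     /* reads back as -1 or 7, depending on the compiler */
    f.signed_field = -1;    /* always reads back as -1 */
    f.unsigned_field = 7u;  /* always reads back as 7 */
    return f;
}

/* Report how this particular compiler treats a plain bit-field. */
bool plain_bitfield_is_signed(void)
{
    return all_ones().plain_field < 0;
}
```

A debugger that is told only "3-bit integer" cannot know whether to display the plain member as -1 or as 7, which is presumably why the specification asks for the distinction to be recorded.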
+
+--------------------------------
+
+
+Known Interoperability Problems
+-------------------------------
+
+Although the GNU implementation of DWARF conforms (for the most part) with
+the current UI/PLSIG DWARF version 1 specification (with many compatible
+version 2 features added in as "vendor specific extensions" just for good
+measure) there are a few known cases where GCC's DWARF output can cause
+some confusion for "classic" (pre version 1) DWARF consumers such as the
+System V Release 4 SDB debugger. These cases are described in this section.
+
+--------------------------------
+
+The DWARF version 1 specification includes the fundamental type codes
+FT_ext_prec_float, FT_complex, FT_dbl_prec_complex, and FT_ext_prec_complex.
+Since GNU C is only a C compiler (and since C doesn't provide any "complex"
+data types) the only one of these fundamental type codes which GCC ever
+generates is FT_ext_prec_float. This fundamental type code is generated
+by GCC for the `long double' data type. Unfortunately, due to an apparent
+bug in the SVR4 SDB debugger, SDB can become very confused wherever any
+attempt is made to print a variable, parameter, or field whose type was
+given in terms of FT_ext_prec_float.
+
+(Actually, SVR4 SDB fails to understand *any* of the four fundamental type
+codes mentioned here. This fact will cause additional problems when
+there is a GNU FORTRAN front-end.)
+
+--------------------------------
+
+In general, it appears that SVR4 SDB is not able to effectively ignore
+fundamental type codes in the "implementation defined" range. This can
+cause problems when a program being debugged uses the `long long' data
+type (or the signed or unsigned varieties thereof) because these types
+are not defined by ANSI C, and thus, GCC must use its own private fundamental
+type codes (from the implementation-defined range) to represent these types.
+
+--------------------------------
+
+
+General GNU DWARF extensions
+----------------------------
+
+In the current DWARF version 1 specification, no mechanism is specified by
+which accurate information about executable code from include files can be
+properly (and fully) described. (The DWARF version 2 specification *does*
+specify such a mechanism, but it is about 10 times more complicated than
+it needs to be so I'm not terribly anxious to try to implement it right
+away.)
+
+In the GNU implementation of DWARF version 1, a fully downward-compatible
+extension has been implemented which permits the GNU compilers to specify
+which executable lines come from which files. This extension places
+additional information (about source file names) in GNU-specific sections
+(which should be totally ignored by all non-GNU DWARF consumers) so that
+this extended information can be provided (to GNU DWARF consumers) in a way
+which is totally transparent (and invisible) to non-GNU DWARF consumers
+(e.g. the SVR4 SDB debugger). The additional information is placed *only*
+in specialized GNU-specific sections, where it should never even be seen
+by non-GNU DWARF consumers.
+
+To understand this GNU DWARF extension, imagine that the sequence of entries
+in the .line section is broken up into several subsections. Each contiguous
+sequence of .line entries which relates to a sequence of lines (or statements)
+from one particular file (either a `base' file or an `include' file) could
+be called a `line entries chunk' (LEC).
+
+For each LEC there is one entry in the .debug_srcinfo section.
+
+Each normal entry in the .debug_srcinfo section consists of two 4-byte
+words of data as follows:
+
+    (1) The starting address (relative to the entire .line section)
+        of the first .line entry in the relevant LEC.
+
+    (2) The starting address (relative to the entire .debug_sfnames
+        section) of a NUL terminated string representing the
+        relevant filename.
+        (This filename may be either a
+        relative or an absolute filename, depending upon how the
+        given source file was located during compilation.)
+
+Obviously, each .debug_srcinfo entry allows you to find the relevant filename,
+and it also points you to the first .line entry that was generated as a result
+of having compiled a given source line from the given source file.
+
+Each subsequent .line entry should also be assumed to have been produced
+as a result of compiling yet more lines from the same file. The end of
+any given LEC is easily found by looking at the first 4-byte pointer in
+the *next* .debug_srcinfo entry. That next .debug_srcinfo entry points
+to a new and different LEC, so the preceding LEC (implicitly) must have
+ended with the last .line section entry which occurs at the 2 1/2 words
+just before the address given in the first pointer of the new .debug_srcinfo
+entry.
+
+The following picture may help to clarify this feature. Let's assume that
+`LE' stands for `.line entry'. Also, assume that `*' stands for a pointer.
+
+
+  .line section     .debug_srcinfo section     .debug_sfnames section
+  ----------------------------------------------------------------
+
+  LE  <---------------------- *
+  LE                          * -----------------> "foobar.c" <---
+  LE                                                              |
+  LE                                                              |
+  LE  <---------------------- *                                   |
+  LE                          * -----------------> "foobar.h" <|  |
+  LE                                                           |  |
+  LE                                                           |  |
+  LE  <---------------------- *                                |  |
+  LE                          * -----------------> "inner.h"   |  |
+  LE                                                           |  |
+  LE  <---------------------- *                                |  |
+  LE                          * -------------------------------   |
+  LE                                                              |
+  LE                                                              |
+  LE                                                              |
+  LE                                                              |
+  LE  <---------------------- *                                   |
+  LE                          * -----------------------------------
+  LE
+  LE
+  LE
+
+In effect, each entry in the .debug_srcinfo section points to *both* a
+filename (in the .debug_sfnames section) and to the start of a block of
+consecutive LEs (in the .line section).
+
+Note that just like in the .line section, there are specialized first and
+last entries in the .debug_srcinfo section for each object file.
+These special first and last entries for the .debug_srcinfo section are very
+different from the normal .debug_srcinfo section entries. They provide
+additional information which may be helpful to a debugger when it is
+interpreting the data in the .debug_srcinfo, .debug_sfnames, and .line
+sections.
+
+The first entry in the .debug_srcinfo section for each compilation unit
+consists of five 4-byte words of data. The contents of these five words
+should be interpreted (by debuggers) as follows:
+
+    (1) The starting address (relative to the entire .line section)
+        of the .line section for this compilation unit.
+
+    (2) The starting address (relative to the entire .debug_sfnames
+        section) of the .debug_sfnames section for this compilation
+        unit.
+
+    (3) The starting address (in the execution virtual address space)
+        of the .text section for this compilation unit.
+
+    (4) The ending address plus one (in the execution virtual address
+        space) of the .text section for this compilation unit.
+
+    (5) The date/time (in seconds since midnight 1/1/70) at which the
+        compilation of this compilation unit occurred. This value
+        should be interpreted as an unsigned quantity because gcc
+        might be configured to generate a default value of 0xffffffff
+        in this field (in cases where it is desired to have object
+        files created at different times from identical source files
+        be byte-for-byte identical). By default, these timestamps
+        are *not* generated by dwarfout.c (so that object files
+        compiled at different times will be byte-for-byte identical).
+        If you wish to enable this "timestamp" feature however, you
+        can simply place a #define for the symbol `DWARF_TIMESTAMPS'
+        in your target configuration file and then rebuild the GNU
+        compiler(s).
+
+Note that the first string placed into the .debug_sfnames section for each
+compilation unit is the name of the directory in which compilation occurred.
+This string ends with a `/' (to help indicate that it is the pathname of a
+directory). Thus, the second word of each specialized initial .debug_srcinfo
+entry for each compilation unit may be used as a pointer to the (string)
+name of the compilation directory, and that string may in turn be used to
+"absolutize" any relative pathnames which may appear later on in the
+.debug_sfnames section entries for the same compilation unit.
+
+The fifth and last word of each specialized starting entry for a compilation
+unit in the .debug_srcinfo section may (depending upon your configuration)
+indicate the date/time of compilation, and this may be used (by a debugger)
+to determine if any of the source files which contributed code to this
+compilation unit are newer than the object code for the compilation unit
+itself. If so, the debugger may wish to print an "out-of-date" warning
+about the compilation unit.
+
+The .debug_srcinfo section associated with each compilation will also have
+a specialized terminating entry. This terminating .debug_srcinfo section
+entry will consist of the following two 4-byte words of data:
+
+    (1) The offset, measured from the start of the .line section to
+        the beginning of the terminating entry for the .line section.
+
+    (2) A word containing the value 0xffffffff.
+
+--------------------------------
+
+In the current DWARF version 1 specification, no mechanism is specified by
+which information about macro definitions and un-definitions may be provided
+to the DWARF consumer.
+
+The DWARF version 2 (draft) specification does specify such a mechanism.
+That specification was based on the GNU ("vendor specific extension")
+which provided some support for macro definitions and un-definitions,
+but the "official" DWARF version 2 (draft) specification mechanism for
+handling macros and the GNU implementation have diverged somewhat.
+I plan to update the GNU implementation to conform to the "official"
+DWARF version 2 (draft) specification as soon as I get time to do that.
+
+Note that in the GNU implementation, additional information about macro
+definitions and un-definitions is *only* provided when the -g3 level of
+debug-info production is selected. (The default level is -g2 and the
+plain old -g option is considered to be identical to -g2.)
+
+GCC records information about macro definitions and undefinitions primarily
+in a section called the .debug_macinfo section. Normal entries in the
+.debug_macinfo section consist of the following three parts:
+
+    (1) A special "type" byte.
+
+    (2) A 3-byte line-number/filename-offset field.
+
+    (3) A NUL terminated string.
+
+The interpretation of the second and third parts is dependent upon the
+value of the leading (type) byte.
+
+The type byte may have one of four values depending upon the type of the
+.debug_macinfo entry which follows. The 1-byte MACINFO type codes presently
+used, and their meanings are as follows:
+
+    MACINFO_start     A base file or an include file starts here.
+    MACINFO_resume    The current base or include file ends here.
+    MACINFO_define    A #define directive occurs here.
+    MACINFO_undef     A #undef directive occurs here.
+
+(Note that the MACINFO_... codes mentioned here are simply symbolic names
+for constants which are defined in the GNU dwarf.h file.)
+
+For MACINFO_define and MACINFO_undef entries, the second (3-byte) field
+contains the number of the source line (relative to the start of the current
+base source file or the current include file) where the #define or #undef
+directive appears. For a MACINFO_define entry, the following string field
+contains the name of the macro which is defined, followed by its definition.
+Note that the definition is always separated from the name of the macro
+by at least one whitespace character.
+For a MACINFO_undef entry, the
+string which follows the 3-byte line number field contains just the name
+of the macro which is being undef'ed.
+
+For a MACINFO_start entry, the 3-byte field following the type byte contains
+the offset, relative to the start of the .debug_sfnames section for the
+current compilation unit, of a string which names the new source file which
+is beginning its inclusion at this point. Following that 3-byte field,
+each MACINFO_start entry always contains a zero length NUL terminated
+string.
+
+For a MACINFO_resume entry, the 3-byte field following the type byte contains
+the line number WITHIN THE INCLUDING FILE at which the inclusion of the
+current file (whose inclusion ends here) was initiated. Following that
+3-byte field, each MACINFO_resume entry always contains a zero length NUL
+terminated string.
+
+Each set of .debug_macinfo entries for each compilation unit is terminated
+by a special .debug_macinfo entry consisting of a 4-byte zero value followed
+by a single NUL byte.
+
+--------------------------------
+
+In the current DWARF draft specification, no provision is made for providing
+a separate level of (limited) debugging information necessary to support
+tracebacks (only) through fully-debugged code (e.g. code in system libraries).
+
+A proposal to define such a level was submitted (by me) to the UI/PLSIG.
+This proposal was rejected by the UI/PLSIG for inclusion into the DWARF
+version 1 specification for two reasons. First, it was felt (by the PLSIG)
+that the issues involved in supporting a "traceback only" subset of DWARF
+were not well understood. Second, and perhaps more importantly, the PLSIG
+is already having enough trouble agreeing on what it means to be "conforming"
+to the DWARF specification, and it was felt that trying to specify multiple
+different *levels* of conformance would only complicate our discussions of
+this already divisive issue.
+Nonetheless, the GNU implementation of DWARF
+provides an abbreviated "traceback only" level of debug-info production for
+use with fully-debugged "system library" code. This level should only be
+used for fully debugged system library code, and even then, it should only
+be used where there is a very strong need to conserve disk space. This
+abbreviated level of debug-info production can be used by specifying the
+-g1 option on the compilation command line.
+
+--------------------------------
+
+As mentioned above, the GNU implementation of DWARF currently uses the DWARF
+version 2 (draft) approach for inline functions (and inlined instances
+thereof). This is used in preference to the version 1 approach because
+(quite simply) the version 1 approach is highly brain-damaged and probably
+unworkable.
+
+--------------------------------
+
+
+GNU DWARF Representation of GNU C Extensions to ANSI C
+------------------------------------------------------
+
+The file dwarfout.c has been designed and implemented so as to provide
+some reasonable DWARF representation for each and every declarative
+construct which is accepted by the GNU C compiler. Since the GNU C
+compiler accepts a superset of ANSI C, this means that there are some
+cases in which the DWARF information produced by GCC must take some
+liberties in improvising DWARF representations for declarations which
+are only valid in (extended) GNU C.
+
+In particular, GNU C provides at least three significant extensions to
+ANSI C when it comes to declarations. These are (1) inline functions,
+(2) dynamic arrays, and (3) incomplete enum types. (See the GCC
+manual for more information on these GNU extensions to ANSI C.) When
+used, these GNU C extensions are represented (in the generated DWARF
+output of GCC) in the most natural and intuitively obvious ways.
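For readers unfamiliar with these three declarative extensions, the following short C sketch shows one instance of each. All names here are made up for illustration and none come from GCC itself. Inline functions and variable-sized arrays were later standardized in C99, while the forward reference to an enum tag is still accepted only as an extension, so this is expected to build with GCC in its default mode but not necessarily with a strictly pedantic compiler.

```c
/* (1) Inline function. */
static inline int twice(int x)
{
    return 2 * x;
}

/* (3) Incomplete enum type: the tag is declared before its enumerators
   are known, so only pointers to it can be used at this point. */
enum color;
int color_code(const enum color *c);

/* The enum is completed later in the same translation unit. */
enum color { RED, GREEN, BLUE };

int color_code(const enum color *c)
{
    return (int) *c;
}

/* (2) Dynamic array: the upper bound is a run-time value.
   The caller must pass n > 0. */
int sum_of_doubles(int n)
{
    int values[n];
    int i, total = 0;

    for (i = 0; i < n; i++)
        values[i] = twice(i);
    for (i = 0; i < n; i++)
        total += values[i];
    return total;
}
```

In a translation unit that never completes `enum color`, the type stays incomplete, which is the case the AT_byte_size/AT_element_list remark in this file is about.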
+
+In the case of inline functions, the DWARF representation is exactly as
+called for in the DWARF version 2 (draft) specification for an identical
+function written in C++; i.e. we "reuse" the representation of inline
+functions which has been defined for C++ to support this GNU C extension.
+
+In the case of dynamic arrays, we use the most obvious representational
+mechanism available; i.e. an array type in which the upper bound of
+some dimension (usually the first and only dimension) is a variable
+rather than a constant. (See the DWARF version 1 specification for more
+details.)
+
+In the case of incomplete enum types, such types are represented simply
+as TAG_enumeration_type DIEs which DO NOT contain either AT_byte_size
+attributes or AT_element_list attributes.
+
+--------------------------------
+
+
+Future Directions
+-----------------
+
+The codes, formats, and other paraphernalia necessary to provide proper
+support for symbolic debugging for the C++ language are still being worked
+on by the UI/PLSIG. The vast majority of the additions to DWARF which will
+be needed to completely support C++ have already been hashed out and agreed
+upon, but a few small issues (e.g. anonymous unions, access declarations)
+are still being discussed. Also, we in the PLSIG are still discussing
+whether or not we need to do anything special for C++ templates. (At this
+time it is not yet clear whether we even need to do anything special for
+these.)
+
+Unfortunately, as mentioned above, there are quite a few problems in the
+g++ front end itself, and these are currently responsible for severely
+restricting the progress which can be made on adding DWARF support
+specifically for the g++ front-end. Furthermore, Richard Stallman has
+expressed the view that C++ friendships might not be important enough to
+describe (in DWARF).
+This view directly conflicts with both the DWARF
+version 1 and version 2 (draft) specifications, so until this small
+misunderstanding is cleared up, DWARF support for g++ is unlikely.
+
+With regard to FORTRAN, the UI/PLSIG has defined what is believed to be a
+complete and sufficient set of codes and rules for adequately representing
+all of FORTRAN 77, and most of Fortran 90 in DWARF. While some support for
+this has been implemented in dwarfout.c, further implementation and testing
+will have to await the arrival of the GNU Fortran front-end (which is
+currently in early alpha test as of this writing).
+
+GNU DWARF support for other languages (i.e. Pascal and Modula) is a moot
+issue until there are GNU front-ends for these other languages.
+
+GNU DWARF support for DWARF version 2 will probably not be attempted until
+such time as the version 2 specification is finalized. (More work needs
+to be done on the version 2 specification to make the new "abbreviations"
+feature of version 2 more easily implementable. Until then, it will be
+a royal pain in the ass to implement version 2 "abbreviations".) For the
+time being, version 2 features will be added (in a version 1 compatible
+manner) when and where these features seem necessary or extremely desirable.
+
+As currently defined, DWARF only describes a (binary) language which can
+be used to communicate symbolic debugging information from a compiler
+through an assembler and a linker, to a debugger. There is no clear
+specification of what processing should be (or must be) done by the
+assembler and/or the linker. Fortunately, the role of the assembler
+is easily inferred (by anyone knowledgeable about assemblers) just by
+looking at examples of assembly-level DWARF code. Sadly though, the
+allowable (or required) processing steps performed by a linker are
+harder to infer and (perhaps) even harder to agree upon.
+There are several forms of very useful `post-processing' steps which
+intelligent linkers *could* (in theory) perform on object files containing
+DWARF, but any and all such link-time transformations are currently both
+disallowed and unspecified.
+
+In particular, possible link-time transformations of DWARF code which could
+provide significant benefits include (but are not limited to):
+
+    Commonization of duplicate DIEs obtained from multiple input
+    (object) files.
+
+    Cross-compilation type checking based upon DWARF type information
+    for objects and functions.
+
+    Other possible `compacting' transformations designed to save disk
+    space and to reduce linker & debugger I/O activity.