From 3804fe2e1b4c92ffb7350cdef691ddab6fe9c840 Mon Sep 17 00:00:00 2001 From: Jason Merrill Date: Fri, 18 May 2001 18:39:38 -0400 Subject: [PATCH] * README.DWARF: Move into dwarfout.c. From-SVN: r42290 --- gcc/ChangeLog | 6 +- gcc/README.DWARF | 574 ----------------------------------------------- gcc/dwarfout.c | 542 ++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 547 insertions(+), 575 deletions(-) delete mode 100644 gcc/README.DWARF diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 6ff92046539..cc19c90c9ba 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,7 @@ +2001-05-18 Jason Merrill + + * README.DWARF: Move into dwarfout.c. + 2001-05-18 Dale Johannesen * config/rs6000/rs6000.c (secondary_reload_class): Fix Darwin @@ -1609,7 +1613,7 @@ Wed May 2 13:09:36 2001 Richard Kenner 2001-04-29 Toomas Rosin - * Makefile.in(stmp-fixinc): quote shell assignment values + * Makefile.in (stmp-fixinc): quote shell assignment values 2001-04-29 Kaveh R. Ghazi diff --git a/gcc/README.DWARF b/gcc/README.DWARF deleted file mode 100644 index 97459508b3c..00000000000 --- a/gcc/README.DWARF +++ /dev/null @@ -1,574 +0,0 @@ -Notes on the GNU Implementation of DWARF Debugging Information --------------------------------------------------------------- -Last Updated: Sun Jul 17 08:17:42 PDT 1994 by rfg@segfault.us.com ------------------------------------------------------------- - -This file describes special and unique aspects of the GNU implementation -of the DWARF debugging information language, as provided in the GNU version -2.x compiler(s). - -For general information about the DWARF debugging information language, -you should obtain the DWARF version 1 specification document (and perhaps -also the DWARF version 2 draft specification document) developed by the -UNIX International Programming Languages Special Interest Group. A copy -of the DWARF version 1 specification (in PostScript form) may be -obtained either from me or from the main Data General -FTP server. 
(See below.) The file you are looking at now only describes -known deviations from the DWARF version 1 specification, together with -those things which are allowed by the DWARF version 1 specification but -which are known to cause interoperability problems (e.g. with SVR4 SDB). - -To obtain a copy of the DWARF Version 1 and/or DWARF Version 2 specification -from Data General's FTP server, use the following procedure: - ---------------------------------------------------------------------------- - ftp to machine: "dg-rtp.dg.com" (128.222.1.2). - - Log in as "ftp". - cd to "plsig" - get any of the following file you are interested in: - - dwarf.1.0.3.ps - dwarf.2.0.0.index.ps - dwarf.2.0.0.ps ---------------------------------------------------------------------------- - -The generation of DWARF debugging information by the GNU version 2.x C -compiler has now been tested rather extensively for m88k, i386, i860, and -Sparc targets. The DWARF output of the GNU C compiler appears to inter- -operate well with the standard SVR4 SDB debugger on these kinds of target -systems (but of course, there are no guarantees). - -DWARF generation for the GNU g++ compiler is still not operable. This is -due primarily to the many remaining cases where the g++ front end does not -conform to the conventions used in the GNU C front end for representing -various kinds of declarations in the TREE data structure. It is not clear -at this time how these problems will be addressed. - -Future plans for the dwarfout.c module of the GNU compiler(s) includes the -addition of full support for GNU FORTRAN. (This should, in theory, be a -lot simpler to add than adding support for g++... but we'll see.) - -Many features of the DWARF version 2 specification have been adapted to -(and used in) the GNU implementation of DWARF (version 1). 
In most of -these cases, a DWARF version 2 approach is used in place of (or in addition -to) DWARF version 1 stuff simply because it is apparent that DWARF version -1 is not sufficiently expressive to provide the kinds of information which -may be necessary to support really robust debugging. In all of these cases -however, the use of DWARF version 2 features should not interfere in any -way with the interoperability (of GNU compilers) with generally available -"classic" (pre version 1) DWARF consumer tools (e.g. SVR4 SDB). - -The DWARF generation enhancement for the GNU compiler(s) was initially -donated to the Free Software Foundation by Network Computing Devices. -(Thanks NCD!) Additional development and maintenance of dwarfout.c has -been largely supported (i.e. funded) by Intel Corporation. (Thanks Intel!) - -If you have questions or comments about the DWARF generation feature, please -send mail to me . I will be happy to investigate any bugs -reported and I may even provide fixes (but of course, I can make no promises). - -The DWARF debugging information produced by GCC may deviate in a few minor -(but perhaps significant) respects from the DWARF debugging information -currently produced by other C compilers. A serious attempt has been made -however to conform to the published specifications, to existing practice, -and to generally accepted norms in the GNU implementation of DWARF. - - ** IMPORTANT NOTE ** ** IMPORTANT NOTE ** ** IMPORTANT NOTE ** - -Under normal circumstances, the DWARF information generated by the GNU -compilers (in an assembly language file) is essentially impossible for -a human being to read. This fact can make it very difficult to debug -certain DWARF-related problems. In order to overcome this difficulty, -a feature has been added to dwarfout.c (enabled by the -fverbose-asm -option) which causes additional comments to be placed into the assembly -language output file, out to the right-hand side of most bits of DWARF -material. 
The comments indicate (far more clearly than the obscure -DWARF hex codes do) what is actually being encoded in DWARF. Thus, the --fverbose-asm option can be highly useful for those who must study the -DWARF output from the GNU compilers in detail. - ---------- - -(Footnote: Within this file, the term `Debugging Information Entry' will -be abbreviated as `DIE'.) - - -Release Notes (aka known bugs) -------------------------------- - -In one very obscure case involving dynamically sized arrays, the DWARF -"location information" for such an array may make it appear that the -array has been totally optimized out of existence, when in fact it -*must* actually exist. (This only happens when you are using *both* -g -*and* -O.) This is due to aggressive dead store elimination in the -compiler, and to the fact that the DECL_RTL expressions associated with -variables are not always updated to correctly reflect the effects of -GCC's aggressive dead store elimination. - -------------------------------- - -When attempting to set a breakpoint at the "start" of a function compiled -with -g1, the debugger currently has no way of knowing exactly where the -end of the prologue code for the function is. Thus, for most targets, -all the debugger can do is to set the breakpoint at the AT_low_pc address -for the function. But if you stop there and then try to look at one or -more of the formal parameter values, they may not have been "homed" yet, -so you may get inaccurate answers (or perhaps even addressing errors). - -Some people may consider this simply a non-feature, but I consider it a -bug, and I hope to provide some GNU-specific attributes (on function -DIEs) which will specify the address of the end of the prologue and the -address of the beginning of the epilogue in a future release. - -------------------------------- - -It is believed at this time that old bugs relating to the AT_bit_offset -values for bit-fields have been fixed.
- -There may still be some very obscure bugs relating to the DWARF description -of type `long long' bit-fields for target machines (e.g. 80x86 machines) -where the alignment of type `long long' data objects is different from -(and less than) the size of a type `long long' data object. - -Please report any problems with the DWARF description of bit-fields as you -would any other GCC bug. (Procedures for bug reporting are given in the -GNU C compiler manual.) - --------------------------------- - -At this time, GCC does not know how to handle the GNU C "nested functions" -extension. (See the GCC manual for more info on this extension to ANSI C.) - --------------------------------- - -The GNU compilers now represent inline functions (and inlined instances -thereof) in exactly the manner described by the current DWARF version 2 -(draft) specification. The version 1 specification for handling inline -functions (and inlined instances) was known to be brain-damaged (by the -PLSIG) when the version 1 spec was finalized, but it was simply too late -in the cycle to get it removed before the version 1 spec was formally -released to the public (by UI). - --------------------------------- - -At this time, GCC does not generate the kind of really precise information -about the exact declared types of entities with signed integral types which -is required by the current DWARF draft specification. - -Specifically, the current DWARF draft specification seems to require that -the type of a non-unsigned integral bit-field member of a struct or union -type be represented as either a "signed" type or as a "plain" type, -depending upon the exact set of keywords that were used in the -type specification for the given bit-field member.
It was felt (by the -UI/PLSIG) that this distinction between "plain" and "signed" integral types -could have some significance (in the case of bit-fields) because ANSI C -does not constrain the signedness of a plain bit-field, whereas it does -constrain the signedness of an explicitly "signed" bit-field. For this -reason, the current DWARF specification calls for compilers to produce -type information (for *all* integral typed entities... not just bit-fields) -which explicitly indicates the signedness of the relevant type to be -"signed" or "plain" or "unsigned". - -Unfortunately, the GNU DWARF implementation is currently incapable of making -such distinctions. - --------------------------------- - - -Known Interoperability Problems -------------------------------- - -Although the GNU implementation of DWARF conforms (for the most part) with -the current UI/PLSIG DWARF version 1 specification (with many compatible -version 2 features added in as "vendor specific extensions" just for good -measure) there are a few known cases where GCC's DWARF output can cause -some confusion for "classic" (pre version 1) DWARF consumers such as the -System V Release 4 SDB debugger. These cases are described in this section. - --------------------------------- - -The DWARF version 1 specification includes the fundamental type codes -FT_ext_prec_float, FT_complex, FT_dbl_prec_complex, and FT_ext_prec_complex. -Since GNU C is only a C compiler (and since C doesn't provide any "complex" -data types) the only one of these fundamental type codes which GCC ever -generates is FT_ext_prec_float. This fundamental type code is generated -by GCC for the `long double' data type. Unfortunately, due to an apparent -bug in the SVR4 SDB debugger, SDB can become very confused wherever any -attempt is made to print a variable, parameter, or field whose type was -given in terms of FT_ext_prec_float. - -(Actually, SVR4 SDB fails to understand *any* of the four fundamental type -codes mentioned here. 
This fact will cause additional problems when -there is a GNU FORTRAN front-end.) - --------------------------------- - -In general, it appears that SVR4 SDB is not able to effectively ignore -fundamental type codes in the "implementation defined" range. This can -cause problems when a program being debugged uses the `long long' data -type (or the signed or unsigned varieties thereof) because these types -are not defined by ANSI C, and thus, GCC must use its own private fundamental -type codes (from the implementation-defined range) to represent these types. - --------------------------------- - - -General GNU DWARF extensions ----------------------------- - -In the current DWARF version 1 specification, no mechanism is specified by -which accurate information about executable code from include files can be -properly (and fully) described. (The DWARF version 2 specification *does* -specify such a mechanism, but it is about 10 times more complicated than -it needs to be so I'm not terribly anxious to try to implement it right -away.) - -In the GNU implementation of DWARF version 1, a fully downward-compatible -extension has been implemented which permits the GNU compilers to specify -which executable lines come from which files. This extension places -additional information (about source file names) in GNU-specific sections -(which should be totally ignored by all non-GNU DWARF consumers) so that -this extended information can be provided (to GNU DWARF consumers) in a way -which is totally transparent (and invisible) to non-GNU DWARF consumers -(e.g. the SVR4 SDB debugger). The additional information is placed *only* -in specialized GNU-specific sections, where it should never even be seen -by non-GNU DWARF consumers. - -To understand this GNU DWARF extension, imagine that the sequence of entries -in the .line section is broken up into several subsections.
Each contiguous -sequence of .line entries which relates to a sequence of lines (or statements) -from one particular file (either a `base' file or an `include' file) could -be called a `line entries chunk' (LEC). - -For each LEC there is one entry in the .debug_srcinfo section. - -Each normal entry in the .debug_srcinfo section consists of two 4-byte -words of data as follows: - - (1) The starting address (relative to the entire .line section) - of the first .line entry in the relevant LEC. - - (2) The starting address (relative to the entire .debug_sfnames - section) of a NUL terminated string representing the - relevant filename. (This filename may be either a - relative or an absolute filename, depending upon how the - given source file was located during compilation.) - -Obviously, each .debug_srcinfo entry allows you to find the relevant filename, -and it also points you to the first .line entry that was generated as a result -of having compiled a given source line from the given source file. - -Each subsequent .line entry should also be assumed to have been produced -as a result of compiling yet more lines from the same file. The end of -any given LEC is easily found by looking at the first 4-byte pointer in -the *next* .debug_srcinfo entry. That next .debug_srcinfo entry points -to a new and different LEC, so the preceding LEC (implicitly) must have -ended with the last .line section entry which occurs at the 2 1/2 words -just before the address given in the first pointer of the new .debug_srcinfo -entry. - -The following picture may help to clarify this feature. Let's assume that -`LE' stands for `.line entry'. Also, assume that `*' stands for a pointer.
-
-
-        .line section      .debug_srcinfo section     .debug_sfnames section
-        ----------------------------------------------------------------
-
-        LE  <---------------------- *
-        LE                          * -----------------> "foobar.c" <---
-        LE                                                              |
-        LE                                                              |
-        LE  <---------------------- *                                   |
-        LE                          * -----------------> "foobar.h" <|  |
-        LE                                                           |  |
-        LE                                                           |  |
-        LE  <---------------------- *                                |  |
-        LE                          * -----------------> "inner.h"   |  |
-        LE                                                           |  |
-        LE  <---------------------- *                                |  |
-        LE                          * -------------------------------   |
-        LE                                                              |
-        LE                                                              |
-        LE                                                              |
-        LE                                                              |
-        LE  <---------------------- *                                   |
-        LE                          * -----------------------------------
-        LE
-        LE
-        LE
-
-In effect, each entry in the .debug_srcinfo section points to *both* a -filename (in the .debug_sfnames section) and to the start of a block of -consecutive LEs (in the .line section). - -Note that just like in the .line section, there are specialized first and -last entries in the .debug_srcinfo section for each object file. These -special first and last entries for the .debug_srcinfo section are very -different from the normal .debug_srcinfo section entries. They provide -additional information which may be helpful to a debugger when it is -interpreting the data in the .debug_srcinfo, .debug_sfnames, and .line -sections. - -The first entry in the .debug_srcinfo section for each compilation unit -consists of five 4-byte words of data. The contents of these five words -should be interpreted (by debuggers) as follows: - - (1) The starting address (relative to the entire .line section) - of the .line section for this compilation unit. - - (2) The starting address (relative to the entire .debug_sfnames - section) of the .debug_sfnames section for this compilation - unit. - - (3) The starting address (in the execution virtual address space) - of the .text section for this compilation unit. - - (4) The ending address plus one (in the execution virtual address - space) of the .text section for this compilation unit.
- - (5) The date/time (in seconds since midnight 1/1/70) at which the - compilation of this compilation unit occurred. This value - should be interpreted as an unsigned quantity because gcc - might be configured to generate a default value of 0xffffffff - in this field (in cases where it is desired to have object - files created at different times from identical source files - be byte-for-byte identical). By default, these timestamps - are *not* generated by dwarfout.c (so that object files - compiled at different times will be byte-for-byte identical). - If you wish to enable this "timestamp" feature however, you - can simply place a #define for the symbol `DWARF_TIMESTAMPS' - in your target configuration file and then rebuild the GNU - compiler(s). - -Note that the first string placed into the .debug_sfnames section for each -compilation unit is the name of the directory in which compilation occurred. -This string ends with a `/' (to help indicate that it is the pathname of a -directory). Thus, the second word of each specialized initial .debug_srcinfo -entry for each compilation unit may be used as a pointer to the (string) -name of the compilation directory, and that string may in turn be used to -"absolutize" any relative pathnames which may appear later on in the -.debug_sfnames section entries for the same compilation unit. - -The fifth and last word of each specialized starting entry for a compilation -unit in the .debug_srcinfo section may (depending upon your configuration) -indicate the date/time of compilation, and this may be used (by a debugger) -to determine if any of the source files which contributed code to this -compilation unit are newer than the object code for the compilation unit -itself. If so, the debugger may wish to print an "out-of-date" warning -about the compilation unit. - -The .debug_srcinfo section associated with each compilation will also have -a specialized terminating entry. 
This terminating .debug_srcinfo section -entry will consist of the following two 4-byte words of data: - - (1) The offset, measured from the start of the .line section to - the beginning of the terminating entry for the .line section. - - (2) A word containing the value 0xffffffff. - --------------------------------- - -In the current DWARF version 1 specification, no mechanism is specified by -which information about macro definitions and un-definitions may be provided -to the DWARF consumer. - -The DWARF version 2 (draft) specification does specify such a mechanism. -That specification was based on the GNU ("vendor specific extension") -which provided some support for macro definitions and un-definitions, -but the "official" DWARF version 2 (draft) specification mechanism for -handling macros and the GNU implementation have diverged somewhat. I -plan to update the GNU implementation to conform to the "official" -DWARF version 2 (draft) specification as soon as I get time to do that. - -Note that in the GNU implementation, additional information about macro -definitions and un-definitions is *only* provided when the -g3 level of -debug-info production is selected. (The default level is -g2 and the -plain old -g option is considered to be identical to -g2.) - -GCC records information about macro definitions and undefinitions primarily -in a section called the .debug_macinfo section. Normal entries in the -.debug_macinfo section consist of the following three parts: - - (1) A special "type" byte. - - (2) A 3-byte line-number/filename-offset field. - - (3) A NUL terminated string. - -The interpretation of the second and third parts is dependent upon the -value of the leading (type) byte. - -The type byte may have one of four values depending upon the type of the -.debug_macinfo entry which follows. The 1-byte MACINFO type codes presently -used, and their meanings are as follows: - - MACINFO_start A base file or an include file starts here. 
- MACINFO_resume The current base or include file ends here. - MACINFO_define A #define directive occurs here. - MACINFO_undef A #undef directive occurs here. - -(Note that the MACINFO_... codes mentioned here are simply symbolic names -for constants which are defined in the GNU dwarf.h file.) - -For MACINFO_define and MACINFO_undef entries, the second (3-byte) field -contains the number of the source line (relative to the start of the current -base source file or the current include file) where the #define or #undef -directive appears. For a MACINFO_define entry, the following string field -contains the name of the macro which is defined, followed by its definition. -Note that the definition is always separated from the name of the macro -by at least one whitespace character. For a MACINFO_undef entry, the -string which follows the 3-byte line number field contains just the name -of the macro which is being undef'ed. - -For a MACINFO_start entry, the 3-byte field following the type byte contains -the offset, relative to the start of the .debug_sfnames section for the -current compilation unit, of a string which names the new source file which -is beginning its inclusion at this point. Following that 3-byte field, -each MACINFO_start entry always contains a zero length NUL terminated -string. - -For a MACINFO_resume entry, the 3-byte field following the type byte contains -the line number WITHIN THE INCLUDING FILE at which the inclusion of the -current file (whose inclusion ends here) was initiated. Following that -3-byte field, each MACINFO_resume entry always contains a zero length NUL -terminated string. - -Each set of .debug_macinfo entries for each compilation unit is terminated -by a special .debug_macinfo entry consisting of a 4-byte zero value followed -by a single NUL byte.
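As a reading aid, the .debug_macinfo layout described above (a type byte, a 3-byte field, and a NUL terminated string, with a 4-byte zero value plus a single NUL byte as the terminator) could be decoded roughly as follows. This is only a sketch under stated assumptions and is not code from dwarfout.c. The names macinfo_entry and macinfo_next are invented for illustration, the numeric values shown for the MACINFO_... codes are placeholders (the real ones are in the GNU dwarf.h file), and the 3-byte field is assumed to be stored most-significant byte first, which the description above does not specify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder values only (an assumption): the real MACINFO_... constants
   are defined in the GNU dwarf.h file and are not given in the text.  */
enum macinfo_type {
  MACINFO_start  = 's',
  MACINFO_resume = 'r',
  MACINFO_define = 'd',
  MACINFO_undef  = 'u'
};

/* One decoded entry.  `macinfo_entry' is a hypothetical name.  */
struct macinfo_entry {
  unsigned char type;   /* leading type byte */
  uint32_t field;       /* 3-byte line number or .debug_sfnames offset */
  const char *str;      /* NUL terminated string, points into the buffer */
};

/* Decode the entry starting at buf[*pos] and advance *pos past it.
   Returns 1 for a normal entry, 0 for the terminating entry (4-byte zero
   value followed by a single NUL byte), and -1 if the data is truncated.
   Assumes the 3-byte field is stored most-significant byte first.  */
static int
macinfo_next (const unsigned char *buf, size_t len, size_t *pos,
              struct macinfo_entry *out)
{
  size_t p;
  const unsigned char *nul;

  assert (buf != NULL && pos != NULL && out != NULL);
  p = *pos;

  /* The smallest entry is 1 type byte + 3 field bytes + 1 NUL byte.  */
  if (p > len || len - p < 5)
    return -1;

  out->type = buf[p];
  out->field = ((uint32_t) buf[p + 1] << 16)
               | ((uint32_t) buf[p + 2] << 8)
               | (uint32_t) buf[p + 3];

  /* The string must be terminated inside the buffer.  */
  nul = memchr (buf + p + 4, 0, len - (p + 4));
  if (nul == NULL)
    return -1;

  out->str = (const char *) (buf + p + 4);
  *pos = (size_t) (nul - buf) + 1;

  if (out->type == 0 && out->field == 0 && out->str[0] == '\0')
    return 0;
  return 1;
}
```

A consumer would call macinfo_next in a loop until it returns 0. For MACINFO_define entries the macro name is whatever precedes the first whitespace character in the string, and the rest is the definition.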
- --------------------------------- - -In the current DWARF draft specification, no provision is made for providing -a separate level of (limited) debugging information necessary to support -tracebacks (only) through fully-debugged code (e.g. code in system libraries). - -A proposal to define such a level was submitted (by me) to the UI/PLSIG. -This proposal was rejected by the UI/PLSIG for inclusion into the DWARF -version 1 specification for two reasons. First, it was felt (by the PLSIG) -that the issues involved in supporting a "traceback only" subset of DWARF -were not well understood. Second, and perhaps more importantly, the PLSIG -is already having enough trouble agreeing on what it means to be "conforming" -to the DWARF specification, and it was felt that trying to specify multiple -different *levels* of conformance would only complicate our discussions of -this already divisive issue. Nonetheless, the GNU implementation of DWARF -provides an abbreviated "traceback only" level of debug-info production for -use with fully-debugged "system library" code. This level should only be -used for fully debugged system library code, and even then, it should only -be used where there is a very strong need to conserve disk space. This -abbreviated level of debug-info production can be used by specifying the --g1 option on the compilation command line. - --------------------------------- - -As mentioned above, the GNU implementation of DWARF currently uses the DWARF -version 2 (draft) approach for inline functions (and inlined instances -thereof). This is used in preference to the version 1 approach because -(quite simply) the version 1 approach is highly brain-damaged and probably -unworkable. 
- --------------------------------- - - -GNU DWARF Representation of GNU C Extensions to ANSI C ------------------------------------------------------- - -The file dwarfout.c has been designed and implemented so as to provide -some reasonable DWARF representation for each and every declarative -construct which is accepted by the GNU C compiler. Since the GNU C -compiler accepts a superset of ANSI C, this means that there are some -cases in which the DWARF information produced by GCC must take some -liberties in improvising DWARF representations for declarations which -are only valid in (extended) GNU C. - -In particular, GNU C provides at least three significant extensions to -ANSI C when it comes to declarations. These are (1) inline functions, -and (2) dynamic arrays, and (3) incomplete enum types. (See the GCC -manual for more information on these GNU extensions to ANSI C.) When -used, these GNU C extensions are represented (in the generated DWARF -output of GCC) in the most natural and intuitively obvious ways. - -In the case of inline functions, the DWARF representation is exactly as -called for in the DWARF version 2 (draft) specification for an identical -function written in C++; i.e. we "reuse" the representation of inline -functions which has been defined for C++ to support this GNU C extension. - -In the case of dynamic arrays, we use the most obvious representational -mechanism available; i.e. an array type in which the upper bound of -some dimension (usually the first and only dimension) is a variable -rather than a constant. (See the DWARF version 1 specification for more -details.) - -In the case of incomplete enum types, such types are represented simply -as TAG_enumeration_type DIEs which DO NOT contain either AT_byte_size -attributes or AT_element_list attributes. 
- --------------------------------- - - -Future Directions ------------------ - -The codes, formats, and other paraphernalia necessary to provide proper -support for symbolic debugging for the C++ language are still being worked -on by the UI/PLSIG. The vast majority of the additions to DWARF which will -be needed to completely support C++ have already been hashed out and agreed -upon, but a few small issues (e.g. anonymous unions, access declarations) -are still being discussed. Also, we in the PLSIG are still discussing -whether or not we need to do anything special for C++ templates. (At this -time it is not yet clear whether we even need to do anything special for -these.) - -Unfortunately, as mentioned above, there are quite a few problems in the -g++ front end itself, and these are currently responsible for severely -restricting the progress which can be made on adding DWARF support -specifically for the g++ front-end. Furthermore, Richard Stallman has -expressed the view that C++ friendships might not be important enough to -describe (in DWARF). This view directly conflicts with both the DWARF -version 1 and version 2 (draft) specifications, so until this small -misunderstanding is cleared up, DWARF support for g++ is unlikely. - -With regard to FORTRAN, the UI/PLSIG has defined what is believed to be a -complete and sufficient set of codes and rules for adequately representing -all of FORTRAN 77, and most of Fortran 90 in DWARF. While some support for -this has been implemented in dwarfout.c, further implementation and testing -will have to await the arrival of the GNU Fortran front-end (which is -currently in early alpha test as of this writing). - -GNU DWARF support for other languages (i.e. Pascal and Modula) is a moot -issue until there are GNU front-ends for these other languages. - -GNU DWARF support for DWARF version 2 will probably not be attempted until -such time as the version 2 specification is finalized. 
(More work needs -to be done on the version 2 specification to make the new "abbreviations" -feature of version 2 more easily implementable. Until then, it will be -a royal pain in the ass to implement version 2 "abbreviations".) For the -time being, version 2 features will be added (in a version 1 compatible -manner) when and where these features seem necessary or extremely desirable. - -As currently defined, DWARF only describes a (binary) language which can -be used to communicate symbolic debugging information from a compiler -through an assembler and a linker, to a debugger. There is no clear -specification of what processing should be (or must be) done by the -assembler and/or the linker. Fortunately, the role of the assembler -is easily inferred (by anyone knowledgeable about assemblers) just by -looking at examples of assembly-level DWARF code. Sadly though, the -allowable (or required) processing steps performed by a linker are -harder to infer and (perhaps) even harder to agree upon. There are -several forms of very useful `post-processing' steps which intelligent -linkers *could* (in theory) perform on object files containing DWARF, -but any and all such link-time transformations are currently both disallowed -and unspecified. - -In particular, possible link-time transformations of DWARF code which could -provide significant benefits include (but are not limited to): - - Commonization of duplicate DIEs obtained from multiple input - (object) files. - - Cross-compilation type checking based upon DWARF type information - for objects and functions. - - Other possible `compacting' transformations designed to save disk - space and to reduce linker & debugger I/O activity. diff --git a/gcc/dwarfout.c b/gcc/dwarfout.c index 7db0def2d69..f319187660f 100644 --- a/gcc/dwarfout.c +++ b/gcc/dwarfout.c @@ -20,6 +20,548 @@ along with GNU CC; see the file COPYING. If not, write to the Free Software Foundation, 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
*/ +/* + + Notes on the GNU Implementation of DWARF Debugging Information + -------------------------------------------------------------- + Last Major Update: Sun Jul 17 08:17:42 PDT 1994 by rfg@segfault.us.com + ------------------------------------------------------------ + + This file describes special and unique aspects of the GNU implementation of + the DWARF Version 1 debugging information language, as provided in the GNU + version 2.x compiler(s). + + For general information about the DWARF debugging information language, + you should obtain the DWARF version 1.1 specification document (and perhaps + also the DWARF version 2 draft specification document) developed by the + (now defunct) UNIX International Programming Languages Special Interest Group. + + To obtain a copy of the DWARF Version 1 and/or DWARF Version 2 + specification, visit the web page for the DWARF Version 2 committee, at + + http://www.eagercon.com/dwarf/dwarf2std.htm + + The generation of DWARF debugging information by the GNU version 2.x C + compiler has now been tested rather extensively for m88k, i386, i860, and + Sparc targets. The DWARF output of the GNU C compiler appears to inter- + operate well with the standard SVR4 SDB debugger on these kinds of target + systems (but of course, there are no guarantees). + + DWARF 1 generation for the GNU g++ compiler is implemented, but limited. + C++ users should definitely use DWARF 2 instead. + + Future plans for the dwarfout.c module of the GNU compiler(s) includes the + addition of full support for GNU FORTRAN. (This should, in theory, be a + lot simpler to add than adding support for g++... but we'll see.) + + Many features of the DWARF version 2 specification have been adapted to + (and used in) the GNU implementation of DWARF (version 1). 
In most of + these cases, a DWARF version 2 approach is used in place of (or in addition + to) DWARF version 1 stuff simply because it is apparent that DWARF version + 1 is not sufficiently expressive to provide the kinds of information which + may be necessary to support really robust debugging. In all of these cases + however, the use of DWARF version 2 features should not interfere in any + way with the interoperability (of GNU compilers) with generally available + "classic" (pre version 1) DWARF consumer tools (e.g. SVR4 SDB). + + The DWARF generation enhancement for the GNU compiler(s) was initially + donated to the Free Software Foundation by Network Computing Devices. + (Thanks NCD!) Additional development and maintenance of dwarfout.c has + been largely supported (i.e. funded) by Intel Corporation. (Thanks Intel!) + + If you have questions or comments about the DWARF generation feature, please + send mail to me . I will be happy to investigate any bugs + reported and I may even provide fixes (but of course, I can make no promises). + + The DWARF debugging information produced by GCC may deviate in a few minor + (but perhaps significant) respects from the DWARF debugging information + currently produced by other C compilers. A serious attempt has been made + however to conform to the published specifications, to existing practice, + and to generally accepted norms in the GNU implementation of DWARF. + + ** IMPORTANT NOTE ** ** IMPORTANT NOTE ** ** IMPORTANT NOTE ** + + Under normal circumstances, the DWARF information generated by the GNU + compilers (in an assembly language file) is essentially impossible for + a human being to read. This fact can make it very difficult to debug + certain DWARF-related problems. 
In order to overcome this difficulty, + a feature has been added to dwarfout.c (enabled by the -dA + option) which causes additional comments to be placed into the assembly + language output file, out to the right-hand side of most bits of DWARF + material. The comments indicate (far more clearly than the obscure + DWARF hex codes do) what is actually being encoded in DWARF. Thus, the + -dA option can be highly useful for those who must study the + DWARF output from the GNU compilers in detail. + + --------- + + (Footnote: Within this file, the term `Debugging Information Entry' will + be abbreviated as `DIE'.) + + + Release Notes (aka known bugs) + ------------------------------- + + In one very obscure case involving dynamically sized arrays, the DWARF + "location information" for such an array may make it appear that the + array has been totally optimized out of existence, when in fact it + *must* actually exist. (This only happens when you are using *both* -g + *and* -O.) This is due to aggressive dead store elimination in the + compiler, and to the fact that the DECL_RTL expressions associated with + variables are not always updated to correctly reflect the effects of + GCC's aggressive dead store elimination. + + ------------------------------- + + When attempting to set a breakpoint at the "start" of a function compiled + with -g1, the debugger currently has no way of knowing exactly where the + end of the prologue code for the function is. Thus, for most targets, + all the debugger can do is to set the breakpoint at the AT_low_pc address + for the function. But if you stop there and then try to look at one or + more of the formal parameter values, they may not have been "homed" yet, + so you may get inaccurate answers (or perhaps even addressing errors).
+ + Some people may consider this simply a non-feature, but I consider it a + bug, and I hope to provide some GNU-specific attributes (on function + DIEs) which will specify the address of the end of the prologue and the + address of the beginning of the epilogue in a future release. + + ------------------------------- + + It is believed at this time that old bugs relating to the AT_bit_offset + values for bit-fields have been fixed. + + There may still be some very obscure bugs relating to the DWARF description + of type `long long' bit-fields for target machines (e.g. 80x86 machines) + where the alignment of type `long long' data objects is different from + (and less than) the size of a type `long long' data object. + + Please report any problems with the DWARF description of bit-fields as you + would any other GCC bug. (Procedures for bug reporting are given in the + GNU C compiler manual.) + + -------------------------------- + + At this time, GCC does not know how to handle the GNU C "nested functions" + extension. (See the GCC manual for more info on this extension to ANSI C.) + + -------------------------------- + + The GNU compilers now represent inline functions (and inlined instances + thereof) in exactly the manner described by the current DWARF version 2 + (draft) specification. The version 1 specification for handling inline + functions (and inlined instances) was known to be brain-damaged (by the + PLSIG) when the version 1 spec was finalized, but it was simply too late + in the cycle to get it removed before the version 1 spec was formally + released to the public (by UI). + + -------------------------------- + + At this time, GCC does not generate the kind of really precise information + about the exact declared types of entities with signed integral types which + is required by the current DWARF draft specification. 
+ + Specifically, the current DWARF draft specification seems to require that + the type of a non-unsigned integral bit-field member of a struct or union + type be represented as either a "signed" type or as a "plain" type, + depending upon the exact set of keywords that were used in the + type specification for the given bit-field member. It was felt (by the + UI/PLSIG) that this distinction between "plain" and "signed" integral types + could have some significance (in the case of bit-fields) because ANSI C + does not constrain the signedness of a plain bit-field, whereas it does + constrain the signedness of an explicitly "signed" bit-field. For this + reason, the current DWARF specification calls for compilers to produce + type information (for *all* integral typed entities... not just bit-fields) + which explicitly indicates the signedness of the relevant type to be + "signed" or "plain" or "unsigned". + + Unfortunately, the GNU DWARF implementation is currently incapable of making + such distinctions. + + -------------------------------- + + + Known Interoperability Problems + ------------------------------- + + Although the GNU implementation of DWARF conforms (for the most part) with + the current UI/PLSIG DWARF version 1 specification (with many compatible + version 2 features added in as "vendor specific extensions" just for good + measure) there are a few known cases where GCC's DWARF output can cause + some confusion for "classic" (pre version 1) DWARF consumers such as the + System V Release 4 SDB debugger. These cases are described in this section. + + -------------------------------- + + The DWARF version 1 specification includes the fundamental type codes + FT_ext_prec_float, FT_complex, FT_dbl_prec_complex, and FT_ext_prec_complex. + Since GNU C is only a C compiler (and since C doesn't provide any "complex" + data types) the only one of these fundamental type codes which GCC ever + generates is FT_ext_prec_float.
This fundamental type code is generated + by GCC for the `long double' data type. Unfortunately, due to an apparent + bug in the SVR4 SDB debugger, SDB can become very confused wherever any + attempt is made to print a variable, parameter, or field whose type was + given in terms of FT_ext_prec_float. + + (Actually, SVR4 SDB fails to understand *any* of the four fundamental type + codes mentioned here. This fact will cause additional problems when + there is a GNU FORTRAN front-end.) + + -------------------------------- + + In general, it appears that SVR4 SDB is not able to effectively ignore + fundamental type codes in the "implementation defined" range. This can + cause problems when a program being debugged uses the `long long' data + type (or the signed or unsigned varieties thereof) because these types + are not defined by ANSI C, and thus, GCC must use its own private fundamental + type codes (from the implementation-defined range) to represent these types. + + -------------------------------- + + + General GNU DWARF extensions + ---------------------------- + + In the current DWARF version 1 specification, no mechanism is specified by + which accurate information about executable code from include files can be + properly (and fully) described. (The DWARF version 2 specification *does* + specify such a mechanism, but it is about 10 times more complicated than + it needs to be so I'm not terribly anxious to try to implement it right + away.) + + In the GNU implementation of DWARF version 1, a fully downward-compatible + extension has been implemented which permits the GNU compilers to specify + which executable lines come from which files.
This extension places + additional information (about source file names) in GNU-specific sections + (which should be totally ignored by all non-GNU DWARF consumers) so that + this extended information can be provided (to GNU DWARF consumers) in a way + which is totally transparent (and invisible) to non-GNU DWARF consumers + (e.g. the SVR4 SDB debugger). The additional information is placed *only* + in specialized GNU-specific sections, where it should never even be seen + by non-GNU DWARF consumers. + + To understand this GNU DWARF extension, imagine that the sequence of entries + in the .lines section is broken up into several subsections. Each contiguous + sequence of .line entries which relates to a sequence of lines (or statements) + from one particular file (either a `base' file or an `include' file) could + be called a `line entries chunk' (LEC). + + For each LEC there is one entry in the .debug_srcinfo section. + + Each normal entry in the .debug_srcinfo section consists of two 4-byte + words of data as follows: + + (1) The starting address (relative to the entire .line section) + of the first .line entry in the relevant LEC. + + (2) The starting address (relative to the entire .debug_sfnames + section) of a NUL terminated string representing the + relevant filename. (This filename may be either a + relative or an absolute filename, depending upon how the + given source file was located during compilation.) + + Obviously, each .debug_srcinfo entry allows you to find the relevant filename, + and it also points you to the first .line entry that was generated as a result + of having compiled a given source line from the given source file. + + Each subsequent .line entry should also be assumed to have been produced + as a result of compiling yet more lines from the same file. The end of + any given LEC is easily found by looking at the first 4-byte pointer in + the *next* .debug_srcinfo entry.
That next .debug_srcinfo entry points + to a new and different LEC, so the preceding LEC (implicitly) must have + ended with the last .line section entry which occurs at the 2 1/2 words + just before the address given in the first pointer of the new .debug_srcinfo + entry. + + The following picture may help to clarify this feature. Let's assume that + `LE' stands for `.line entry'. Also, assume that `*' stands for a pointer. + + + .line section .debug_srcinfo section .debug_sfnames section + ---------------------------------------------------------------- + + LE <---------------------- * + LE * -----------------> "foobar.c" <--- + LE | + LE | + LE <---------------------- * | + LE * -----------------> "foobar.h" <| | + LE | | + LE | | + LE <---------------------- * | | + LE * -----------------> "inner.h" | | + LE | | + LE <---------------------- * | | + LE * ------------------------------- | + LE | + LE | + LE | + LE | + LE <---------------------- * | + LE * ----------------------------------- + LE + LE + LE + + In effect, each entry in the .debug_srcinfo section points to *both* a + filename (in the .debug_sfnames section) and to the start of a block of + consecutive LEs (in the .line section). + + Note that just like in the .line section, there are specialized first and + last entries in the .debug_srcinfo section for each object file. These + special first and last entries for the .debug_srcinfo section are very + different from the normal .debug_srcinfo section entries. They provide + additional information which may be helpful to a debugger when it is + interpreting the data in the .debug_srcinfo, .debug_sfnames, and .line + sections. + + The first entry in the .debug_srcinfo section for each compilation unit + consists of five 4-byte words of data. The contents of these five words + should be interpreted (by debuggers) as follows: + + (1) The starting address (relative to the entire .line section) + of the .line section for this compilation unit.
+ + (2) The starting address (relative to the entire .debug_sfnames + section) of the .debug_sfnames section for this compilation + unit. + + (3) The starting address (in the execution virtual address space) + of the .text section for this compilation unit. + + (4) The ending address plus one (in the execution virtual address + space) of the .text section for this compilation unit. + + (5) The date/time (in seconds since midnight 1/1/70) at which the + compilation of this compilation unit occurred. This value + should be interpreted as an unsigned quantity because gcc + might be configured to generate a default value of 0xffffffff + in this field (in cases where it is desired to have object + files created at different times from identical source files + be byte-for-byte identical). By default, these timestamps + are *not* generated by dwarfout.c (so that object files + compiled at different times will be byte-for-byte identical). + If you wish to enable this "timestamp" feature however, you + can simply place a #define for the symbol `DWARF_TIMESTAMPS' + in your target configuration file and then rebuild the GNU + compiler(s). + + Note that the first string placed into the .debug_sfnames section for each + compilation unit is the name of the directory in which compilation occurred. + This string ends with a `/' (to help indicate that it is the pathname of a + directory). Thus, the second word of each specialized initial .debug_srcinfo + entry for each compilation unit may be used as a pointer to the (string) + name of the compilation directory, and that string may in turn be used to + "absolutize" any relative pathnames which may appear later on in the + .debug_sfnames section entries for the same compilation unit. 
+ + The fifth and last word of each specialized starting entry for a compilation + unit in the .debug_srcinfo section may (depending upon your configuration) + indicate the date/time of compilation, and this may be used (by a debugger) + to determine if any of the source files which contributed code to this + compilation unit are newer than the object code for the compilation unit + itself. If so, the debugger may wish to print an "out-of-date" warning + about the compilation unit. + + The .debug_srcinfo section associated with each compilation will also have + a specialized terminating entry. This terminating .debug_srcinfo section + entry will consist of the following two 4-byte words of data: + + (1) The offset, measured from the start of the .line section to + the beginning of the terminating entry for the .line section. + + (2) A word containing the value 0xffffffff. + + -------------------------------- + + In the current DWARF version 1 specification, no mechanism is specified by + which information about macro definitions and un-definitions may be provided + to the DWARF consumer. + + The DWARF version 2 (draft) specification does specify such a mechanism. + That specification was based on the GNU ("vendor specific extension") + which provided some support for macro definitions and un-definitions, + but the "official" DWARF version 2 (draft) specification mechanism for + handling macros and the GNU implementation have diverged somewhat. I + plan to update the GNU implementation to conform to the "official" + DWARF version 2 (draft) specification as soon as I get time to do that. + + Note that in the GNU implementation, additional information about macro + definitions and un-definitions is *only* provided when the -g3 level of + debug-info production is selected. (The default level is -g2 and the + plain old -g option is considered to be identical to -g2.) 
+ + GCC records information about macro definitions and undefinitions primarily + in a section called the .debug_macinfo section. Normal entries in the + .debug_macinfo section consist of the following three parts: + + (1) A special "type" byte. + + (2) A 3-byte line-number/filename-offset field. + + (3) A NUL terminated string. + + The interpretation of the second and third parts is dependent upon the + value of the leading (type) byte. + + The type byte may have one of four values depending upon the type of the + .debug_macinfo entry which follows. The 1-byte MACINFO type codes presently + used, and their meanings are as follows: + + MACINFO_start A base file or an include file starts here. + MACINFO_resume The current base or include file ends here. + MACINFO_define A #define directive occurs here. + MACINFO_undef A #undef directive occurs here. + + (Note that the MACINFO_... codes mentioned here are simply symbolic names + for constants which are defined in the GNU dwarf.h file.) + + For MACINFO_define and MACINFO_undef entries, the second (3-byte) field + contains the number of the source line (relative to the start of the current + base source file or the current include file) where the #define or #undef + directive appears. For a MACINFO_define entry, the following string field + contains the name of the macro which is defined, followed by its definition. + Note that the definition is always separated from the name of the macro + by at least one whitespace character. For a MACINFO_undef entry, the + string which follows the 3-byte line number field contains just the name + of the macro which is being undef'ed. + + For a MACINFO_start entry, the 3-byte field following the type byte contains + the offset, relative to the start of the .debug_sfnames section for the + current compilation unit, of a string which names the new source file which + is beginning its inclusion at this point.
Following that 3-byte field, + each MACINFO_start entry always contains a zero length NUL terminated + string. + + For a MACINFO_resume entry, the 3-byte field following the type byte contains + the line number WITHIN THE INCLUDING FILE at which the inclusion of the + current file (whose inclusion ends here) was initiated. Following that + 3-byte field, each MACINFO_resume entry always contains a zero length NUL + terminated string. + + Each set of .debug_macinfo entries for each compilation unit is terminated + by a special .debug_macinfo entry consisting of a 4-byte zero value followed + by a single NUL byte. + + -------------------------------- + + In the current DWARF draft specification, no provision is made for providing + a separate level of (limited) debugging information necessary to support + tracebacks (only) through fully-debugged code (e.g. code in system libraries). + + A proposal to define such a level was submitted (by me) to the UI/PLSIG. + This proposal was rejected by the UI/PLSIG for inclusion into the DWARF + version 1 specification for two reasons. First, it was felt (by the PLSIG) + that the issues involved in supporting a "traceback only" subset of DWARF + were not well understood. Second, and perhaps more importantly, the PLSIG + is already having enough trouble agreeing on what it means to be "conforming" + to the DWARF specification, and it was felt that trying to specify multiple + different *levels* of conformance would only complicate our discussions of + this already divisive issue. Nonetheless, the GNU implementation of DWARF + provides an abbreviated "traceback only" level of debug-info production for + use with fully-debugged "system library" code. This level should only be + used for fully debugged system library code, and even then, it should only + be used where there is a very strong need to conserve disk space. 
This + abbreviated level of debug-info production can be used by specifying the + -g1 option on the compilation command line. + + -------------------------------- + + As mentioned above, the GNU implementation of DWARF currently uses the DWARF + version 2 (draft) approach for inline functions (and inlined instances + thereof). This is used in preference to the version 1 approach because + (quite simply) the version 1 approach is highly brain-damaged and probably + unworkable. + + -------------------------------- + + + GNU DWARF Representation of GNU C Extensions to ANSI C + ------------------------------------------------------ + + The file dwarfout.c has been designed and implemented so as to provide + some reasonable DWARF representation for each and every declarative + construct which is accepted by the GNU C compiler. Since the GNU C + compiler accepts a superset of ANSI C, this means that there are some + cases in which the DWARF information produced by GCC must take some + liberties in improvising DWARF representations for declarations which + are only valid in (extended) GNU C. + + In particular, GNU C provides at least three significant extensions to + ANSI C when it comes to declarations. These are (1) inline functions, + and (2) dynamic arrays, and (3) incomplete enum types. (See the GCC + manual for more information on these GNU extensions to ANSI C.) When + used, these GNU C extensions are represented (in the generated DWARF + output of GCC) in the most natural and intuitively obvious ways. + + In the case of inline functions, the DWARF representation is exactly as + called for in the DWARF version 2 (draft) specification for an identical + function written in C++; i.e. we "reuse" the representation of inline + functions which has been defined for C++ to support this GNU C extension. + + In the case of dynamic arrays, we use the most obvious representational + mechanism available; i.e. 
an array type in which the upper bound of + some dimension (usually the first and only dimension) is a variable + rather than a constant. (See the DWARF version 1 specification for more + details.) + + In the case of incomplete enum types, such types are represented simply + as TAG_enumeration_type DIEs which DO NOT contain either AT_byte_size + attributes or AT_element_list attributes. + + -------------------------------- + + + Future Directions + ----------------- + + The codes, formats, and other paraphernalia necessary to provide proper + support for symbolic debugging for the C++ language are still being worked + on by the UI/PLSIG. The vast majority of the additions to DWARF which will + be needed to completely support C++ have already been hashed out and agreed + upon, but a few small issues (e.g. anonymous unions, access declarations) + are still being discussed. Also, we in the PLSIG are still discussing + whether or not we need to do anything special for C++ templates. (At this + time it is not yet clear whether we even need to do anything special for + these.) + + With regard to FORTRAN, the UI/PLSIG has defined what is believed to be a + complete and sufficient set of codes and rules for adequately representing + all of FORTRAN 77, and most of Fortran 90 in DWARF. While some support for + this has been implemented in dwarfout.c, further implementation and testing + is needed. + + GNU DWARF support for other languages (i.e. Pascal and Modula) is a moot + issue until there are GNU front-ends for these other languages. + + As currently defined, DWARF only describes a (binary) language which can + be used to communicate symbolic debugging information from a compiler + through an assembler and a linker, to a debugger. There is no clear + specification of what processing should be (or must be) done by the + assembler and/or the linker. 
Fortunately, the role of the assembler + is easily inferred (by anyone knowledgeable about assemblers) just by + looking at examples of assembly-level DWARF code. Sadly though, the + allowable (or required) processing steps performed by a linker are + harder to infer and (perhaps) even harder to agree upon. There are + several forms of very useful `post-processing' steps which intelligent + linkers *could* (in theory) perform on object files containing DWARF, + but any and all such link-time transformations are currently both disallowed + and unspecified. + + In particular, possible link-time transformations of DWARF code which could + provide significant benefits include (but are not limited to): + + Commonization of duplicate DIEs obtained from multiple input + (object) files. + + Cross-compilation type checking based upon DWARF type information + for objects and functions. + + Other possible `compacting' transformations designed to save disk + space and to reduce linker & debugger I/O activity. + +*/ + #include "config.h" #ifdef DWARF_DEBUGGING_INFO -- 2.30.2
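Reader's note on the .debug_srcinfo layout described in the comment above. The following is only a minimal sketch of how a consumer might walk the entries for one compilation unit; it is not code from dwarfout.c or from any debugger. The names `struct lec', `srcinfo_collect_lecs' and `lec_entry_count' are hypothetical. The sketch assumes the section contents have already been decoded into host byte order as an array of 4-byte words, and it takes the 10-byte size of a .line entry from the "2 1/2 words" remark in the notes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sizes taken from the notes: a five-word header, two-word entries, and
   .line entries of "2 1/2 words" (assumed here to mean 10 bytes).  */
#define SRCINFO_HEADER_WORDS 5
#define SRCINFO_ENTRY_WORDS  2
#define SRCINFO_END_MARKER   0xffffffffu
#define LINE_ENTRY_BYTES     10u

/* Hypothetical record describing one `line entries chunk' (LEC).  */
struct lec
{
  uint32_t line_start;   /* Offset of the first .line entry of the LEC.  */
  uint32_t line_end;     /* Offset just past the last .line entry.  */
  uint32_t name_offset;  /* Offset of the filename in .debug_sfnames.  */
};

/* Walk the .debug_srcinfo words of one compilation unit.  Store up to
   MAX_LECS chunks in OUT and return the number of chunks found, or -1
   if the data is truncated, malformed, or does not fit in OUT.  */
static int
srcinfo_collect_lecs (const uint32_t *words, size_t nwords,
                      struct lec *out, size_t max_lecs)
{
  size_t i = SRCINFO_HEADER_WORDS;  /* Skip the five-word header.  */
  size_t count = 0;

  assert (words != NULL && out != NULL);

  for (;;)
    {
      if (i + SRCINFO_ENTRY_WORDS > nwords)
        return -1;  /* Ran off the end without a terminating entry.  */

      uint32_t line_off = words[i];
      uint32_t second = words[i + 1];

      /* The first word of every entry (the terminator included) marks
         where the previous LEC ends.  */
      if (count > 0)
        {
          if (line_off < out[count - 1].line_start)
            return -1;
          out[count - 1].line_end = line_off;
        }

      if (second == SRCINFO_END_MARKER)
        return (int) count;  /* Terminating entry reached.  */

      if (count == max_lecs)
        return -1;

      out[count].line_start = line_off;
      out[count].line_end = line_off;
      out[count].name_offset = second;
      count++;
      i += SRCINFO_ENTRY_WORDS;
    }
}

/* Number of .line entries in a LEC, under the 10-byte assumption.  */
static uint32_t
lec_entry_count (const struct lec *l)
{
  return (l->line_end - l->line_start) / LINE_ENTRY_BYTES;
}
```

A caller would pass the words for a single compilation unit, then use each `name_offset' to look up the NUL terminated filename in .debug_sfnames, prefixing the compilation directory (the first string for the unit) when the name is relative. Returning -1 instead of guessing keeps a truncated section from being read as a short but valid one.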