1 \input texinfo
2 @setfilename bfdint.info
3
4 @settitle BFD Internals
5 @iftex
6 @title{BFD Internals}
7 @author{Ian Lance Taylor}
8 @author{Cygnus Solutions}
9 @end iftex
10
11 @node Top
12 @top BFD Internals
13 @raisesections
14 @cindex bfd internals
15
16 This document describes some BFD internal information which may be
17 helpful when working on BFD. It is very incomplete.
18
19 This document is not updated regularly, and may be out of date. It was
20 last modified on $Date$.
21
22 The initial version of this document was written by Ian Lance Taylor
23 @email{ian@@cygnus.com}.
24
25 @menu
26 * BFD glossary:: BFD glossary
27 * BFD guidelines:: BFD programming guidelines
28 * BFD generated files:: BFD generated files
29 * BFD multiple compilations:: Files compiled multiple times in BFD
30 * BFD relocation handling:: BFD relocation handling
31 * BFD ELF support:: BFD ELF support
32 * Index:: Index
33 @end menu
34
35 @node BFD glossary
36 @section BFD glossary
37 @cindex glossary for bfd
38 @cindex bfd glossary
39
40 This is a short glossary of some BFD terms.
41
42 @table @asis
43 @item a.out
44 The a.out object file format. The original Unix object file format.
45 Still used on SunOS, though not Solaris. Supports only three sections.
46
47 @item archive
48 A collection of object files produced and manipulated by the @samp{ar}
49 program.
50
51 @item BFD
52 The BFD library itself. Also, each object file, archive, or executable
53 opened by the BFD library has the type @samp{bfd *}, and is sometimes
54 referred to as a bfd.
55
56 @item COFF
57 The Common Object File Format. Used on Unix SVR3. Used by some
58 embedded targets, although ELF is normally better.
59
60 @item DLL
61 A shared library on Windows.
62
63 @item dynamic linker
64 When a program linked against a shared library is run, the dynamic
65 linker will locate the appropriate shared library and arrange to somehow
66 include it in the running image.
67
68 @item dynamic object
69 Another name for an ELF shared library.
70
71 @item ECOFF
72 The Extended Common Object File Format. Used on Alpha Digital Unix
73 (formerly OSF/1), as well as Ultrix and Irix 4. A variant of COFF.
74
75 @item ELF
76 The Executable and Linking Format. The object file format used on most
77 modern Unix systems, including GNU/Linux, Solaris, Irix, and SVR4. Also
78 used on many embedded systems.
79
80 @item executable
81 A program, with instructions and symbols, and perhaps dynamic linking
82 information. Normally produced by a linker.
83
84 @item NLM
85 NetWare Loadable Module. Used to describe the format of an object which
86 may be loaded into NetWare, which is some kind of PC based network server
87 program.
88
89 @item object file
90 A binary file including machine instructions, symbols, and relocation
91 information. Normally produced by an assembler.
92
93 @item object file format
94 The format of an object file. Typically object files and executables
95 for a particular system are in the same format, although executables
96 will not contain any relocation information.
97
98 @item PE
99 The Portable Executable format. This is the object file format used for
100 Windows (specifically, Win32) object files. It is based closely on
101 COFF, but has a few significant differences.
102
103 @item PEI
104 The Portable Executable Image format. This is the object file format
105 used for Windows (specifically, Win32) executables. It is very similar
106 to PE, but includes some additional header information.
107
108 @item relocations
109 Information used by the linker to adjust section contents. Also called
110 relocs.
111
112 @item section
113 Object files and executables are composed of sections. Sections have
114 optional data and optional relocation information.
115
116 @item shared library
117 A library of functions which may be used by many executables without
118 actually being linked into each executable. There are several different
119 implementations of shared libraries, each having slightly different
120 features.
121
122 @item symbol
123 Each object file and executable may have a list of symbols, often
124 referred to as the symbol table. A symbol is basically a name and an
125 address. There may also be some additional information like the type of
126 symbol, although the type of a symbol is normally something simple like
127 function or object, and should not be confused with the more complex C
128 notion of type. Typically every global function and variable in a C
129 program will have an associated symbol.
130
131 @item Win32
132 The current Windows API, implemented by Windows 95 and later and Windows
133 NT 3.51 and later, but not by Windows 3.1.
134
135 @item XCOFF
136 The eXtended Common Object File Format. Used on AIX. A variant of
137 COFF, with a completely different symbol table implementation.
138 @end table
139
140 @node BFD guidelines
141 @section BFD programming guidelines
142 @cindex bfd programming guidelines
143 @cindex programming guidelines for bfd
144 @cindex guidelines, bfd programming
145
146 There is a lot of poorly written and confusing code in BFD. New BFD
147 code should be written to a higher standard. Merely because some BFD
148 code is written in a particular manner does not mean that you should
149 emulate it.
150
151 Here are some general BFD programming guidelines:
152
153 @itemize @bullet
154 @item
155 Follow the GNU coding standards.
156
157 @item
158 Avoid global variables. We ideally want BFD to be fully reentrant, so
159 that it can be used in multiple threads. All uses of global or static
160 variables interfere with that. Initialized constant variables are OK,
161 and they should be explicitly marked with const. Instead of global
162 variables, use data attached to a BFD or to a linker hash table.
163
164 @item
165 All externally visible functions should have names which start with
166 @samp{bfd_}. All such functions should be declared in some header file,
167 typically @file{bfd.h}. See, for example, the various declarations near
168 the end of @file{bfd-in.h}, which mostly declare functions required by
169 specific linker emulations.
170
171 @item
172 All functions which need to be visible from one file to another within
173 BFD, but should not be visible outside of BFD, should start with
174 @samp{_bfd_}. Although external names beginning with @samp{_} are
175 prohibited by the ANSI standard, in practice this usage will always
176 work, and it is required by the GNU coding standards.
177
178 @item
179 Always remember that people can compile using @samp{--enable-targets} to build
180 several, or all, targets at once. It must be possible to link together
181 the files for all targets.
182
183 @item
184 BFD code should compile with few or no warnings using @samp{gcc -Wall}.
185 Some warnings are OK, like the absence of certain function declarations
186 which may or may not be declared in system header files. Warnings about
187 ambiguous expressions and the like should always be fixed.
188 @end itemize
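
The advice about global variables can be illustrated with a minimal
sketch. It assumes a hypothetical @samp{demo_file} structure standing in
for data attached to a BFD; none of these names exist in BFD itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-file state.  In BFD this kind of data would be
   attached to the bfd or to a linker hash table, not kept in a static
   variable.  */
struct demo_file
{
  const char *name;
  unsigned int warning_count;
};

/* Reentrant: all state lives in *FILE, so work on two different files
   does not interfere.  Returns the updated count.  */
static unsigned int
demo_note_warning (struct demo_file *file)
{
  assert (file != NULL);
  return ++file->warning_count;
}
```

Because the counter is reached only through the argument, each caller
sees its own state, which is the property the guideline is after.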
189
190 @node BFD generated files
191 @section BFD generated files
192 @cindex generated files in bfd
193 @cindex bfd generated files
194
195 BFD contains several automatically generated files. This section
196 describes them. Some files are created at configure time, when you
197 configure BFD. Some files are created at make time, when you build
198 BFD. Some files are automatically rebuilt at make time, but only if
199 you configure with the @samp{--enable-maintainer-mode} option. Some
200 files live in the object directory---the directory from which you run
201 configure---and some live in the source directory. All files that live
202 in the source directory are checked into the CVS repository.
203
204 @table @file
205 @item bfd.h
206 @cindex @file{bfd.h}
207 @cindex @file{bfd-in3.h}
208 Lives in the object directory. Created at make time from
209 @file{bfd-in2.h} via @file{bfd-in3.h}. @file{bfd-in3.h} is created at
210 configure time from @file{bfd-in2.h}. There are automatic dependencies
211 to rebuild @file{bfd-in3.h} and hence @file{bfd.h} if @file{bfd-in2.h}
212 changes, so you can normally ignore @file{bfd-in3.h}, and just think
213 about @file{bfd-in2.h} and @file{bfd.h}.
214
215 @file{bfd.h} is built by replacing a few strings in @file{bfd-in2.h}.
216 To see them, search for @samp{@@} in @file{bfd-in2.h}. They mainly
217 control whether BFD is built for a 32 bit target or a 64 bit target.
218
219 @item bfd-in2.h
220 @cindex @file{bfd-in2.h}
221 Lives in the source directory. Created from @file{bfd-in.h} and several
222 other BFD source files. If you configure with the
223 @samp{--enable-maintainer-mode} option, @file{bfd-in2.h} is rebuilt
224 automatically when a source file changes.
225
226 @item elf32-target.h
227 @itemx elf64-target.h
228 @cindex @file{elf32-target.h}
229 @cindex @file{elf64-target.h}
230 Live in the object directory. Created from @file{elfxx-target.h}.
231 These files are versions of @file{elfxx-target.h} customized for either
232 a 32 bit ELF target or a 64 bit ELF target.
233
234 @item libbfd.h
235 @cindex @file{libbfd.h}
236 Lives in the source directory. Created from @file{libbfd-in.h} and
237 several other BFD source files. If you configure with the
238 @samp{--enable-maintainer-mode} option, @file{libbfd.h} is rebuilt
239 automatically when a source file changes.
240
241 @item libcoff.h
242 @cindex @file{libcoff.h}
243 Lives in the source directory. Created from @file{libcoff-in.h} and
244 @file{coffcode.h}. If you configure with the
245 @samp{--enable-maintainer-mode} option, @file{libcoff.h} is rebuilt
246 automatically when a source file changes.
247
248 @item targmatch.h
249 @cindex @file{targmatch.h}
250 Lives in the object directory. Created at make time from
251 @file{config.bfd}. This file is used to map configuration triplets into
252 BFD target vector variable names at run time.
253 @end table
254
255 @node BFD multiple compilations
256 @section Files compiled multiple times in BFD
257 Several files in BFD are compiled multiple times. By this I mean that
258 there are header files which contain function definitions. These header
259 files are included by other files, and thus the functions are compiled
260 once per file which includes them.
261
262 Preprocessor macros are used to control the compilation, so that each
263 time the files are compiled the resulting functions are slightly
264 different. Naturally, if they weren't different, there would be no
265 reason to compile them multiple times.
266
267 This is not a particularly good programming technique, and future BFD
268 work should avoid it, for the following reasons.
269
270 @itemize @bullet
271 @item
272 Since this technique is rarely used, even experienced C programmers find
273 it confusing.
274
275 @item
276 It is difficult to debug programs which use BFD, since there is no way
277 to describe which version of a particular function you are looking at.
278
279 @item
280 Programs which use BFD wind up incorporating two or more slightly
281 different versions of the same function, which wastes space in the
282 executable.
283
284 @item
285 This technique is never required nor is it especially efficient. It is
286 always possible to use statically initialized structures holding
287 function pointers and magic constants instead.
288 @end itemize
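
As a rough sketch of the alternative named in the last item, one
function compiled once can be driven by a statically initialized
structure of function pointers and constants. The names
@samp{fmt_ops}, @samp{get_le32}, @samp{get_le64} and @samp{sum_words}
are invented for this illustration and are not part of BFD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-format description; not a real BFD structure.  */
struct fmt_ops
{
  unsigned int word_bytes;                          /* constant: 4 or 8 */
  uint64_t (*get_word) (const unsigned char *buf);  /* format specific hook */
};

/* Read a 32 bit little-endian word.  */
static uint64_t
get_le32 (const unsigned char *buf)
{
  return (uint64_t) buf[0] | ((uint64_t) buf[1] << 8)
         | ((uint64_t) buf[2] << 16) | ((uint64_t) buf[3] << 24);
}

/* Read a 64 bit little-endian word.  */
static uint64_t
get_le64 (const unsigned char *buf)
{
  return get_le32 (buf) | (get_le32 (buf + 4) << 32);
}

static const struct fmt_ops fmt32_ops = { 4, get_le32 };
static const struct fmt_ops fmt64_ops = { 8, get_le64 };

/* One function, compiled once, serves both word sizes.  */
static uint64_t
sum_words (const struct fmt_ops *ops, const unsigned char *buf, size_t count)
{
  uint64_t total = 0;
  size_t i;

  assert (ops != NULL && buf != NULL);
  for (i = 0; i < count; i++)
    total += ops->get_word (buf + i * ops->word_bytes);
  return total;
}
```

A debugger then shows a single @samp{sum_words}, and the size specific
behaviour is visible as data in the structure that was passed in.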
289
290 The following is a list of the files which are compiled multiple times.
291
292 @table @file
293 @item aout-target.h
294 @cindex @file{aout-target.h}
295 Describes a few functions and the target vector for a.out targets. This
296 is used by individual a.out targets with different definitions of
297 @samp{N_TXTADDR} and similar a.out macros.
298
299 @item aoutf1.h
300 @cindex @file{aoutf1.h}
301 Implements standard SunOS a.out files. In principle it supports 64 bit
302 a.out targets based on the preprocessor macro @samp{ARCH_SIZE}, but
303 since all known a.out targets are 32 bits, this code may or may not
304 work. This file is only included by a few other files, and it is
305 difficult to justify its existence.
306
307 @item aoutx.h
308 @cindex @file{aoutx.h}
309 Implements basic a.out support routines. This file can be compiled for
310 either 32 or 64 bit support. Since all known a.out targets are 32 bits,
311 the 64 bit support may or may not work. I believe the original
312 intention was that this file would only be included by @samp{aout32.c}
313 and @samp{aout64.c}, and that other a.out targets would simply refer to
314 the functions it defined. Unfortunately, some other a.out targets
315 started including it directly, leading to a somewhat confused state of
316 affairs.
317
318 @item coffcode.h
319 @cindex @file{coffcode.h}
320 Implements basic COFF support routines. This file is included by every
321 COFF target. It implements code which handles COFF magic numbers as
322 well as various hook functions called by the generic COFF functions in
323 @file{coffgen.c}. This file is controlled by a number of different
324 macros, and more are added regularly.
325
326 @item coffswap.h
327 @cindex @file{coffswap.h}
328 Implements COFF swapping routines. This file is included by
329 @file{coffcode.h}, and thus by every COFF target. It implements the
330 routines which swap COFF structures between internal and external
331 format. The main control for this file is the external structure
332 definitions in the files in the @file{include/coff} directory. A COFF
333 target file will include one of those files before including
334 @file{coffcode.h} and thus @file{coffswap.h}. There are a few other
335 macros which affect @file{coffswap.h} as well, mostly describing whether
336 certain fields are present in the external structures.
337
338 @item ecoffswap.h
339 @cindex @file{ecoffswap.h}
340 Implements ECOFF swapping routines. This is like @file{coffswap.h}, but
341 for ECOFF. It is included by the ECOFF target files (of which there are
342 only two). The control is the preprocessor macro @samp{ECOFF_32} or
343 @samp{ECOFF_64}.
344
345 @item elfcode.h
346 @cindex @file{elfcode.h}
347 Implements ELF functions that use external structure definitions. This
348 file is included by two other files: @file{elf32.c} and @file{elf64.c}.
349 It is controlled by the @samp{ARCH_SIZE} macro which is defined to be
350 @samp{32} or @samp{64} before including it. The @samp{NAME} macro is
351 used internally to give the functions different names for the two target
352 sizes.
353
354 @item elfcore.h
355 @cindex @file{elfcore.h}
356 Like @file{elfcode.h}, but for functions that are specific to ELF core
357 files. This is included only by @file{elfcode.h}.
358
359 @item elflink.h
360 @cindex @file{elflink.h}
361 Like @file{elfcode.h}, but for functions used by the ELF linker. This
362 is included only by @file{elfcode.h}.
363
364 @item elfxx-target.h
365 @cindex @file{elfxx-target.h}
366 This file is the source for the generated files @file{elf32-target.h}
367 and @file{elf64-target.h}, one of which is included by every ELF target.
368 It defines the ELF target vector.
369
370 @item freebsd.h
371 @cindex @file{freebsd.h}
372 Presumably intended to be included by all FreeBSD targets, but in fact
373 there is only one such target, @samp{i386-freebsd}. This defines a
374 function used to set the right magic number for FreeBSD, as well as
375 various macros, and includes @file{aout-target.h}.
376
377 @item netbsd.h
378 @cindex @file{netbsd.h}
379 Like @file{freebsd.h}, except that there are several files which include
380 it.
381
382 @item nlm-target.h
383 @cindex @file{nlm-target.h}
384 Defines the target vector for a standard NLM target.
385
386 @item nlmcode.h
387 @cindex @file{nlmcode.h}
388 Like @file{elfcode.h}, but for NLM targets. This is only included by
389 @file{nlm32.c} and @file{nlm64.c}, both of which define the macro
390 @samp{ARCH_SIZE} to an appropriate value. There are no 64 bit NLM
391 targets anyhow, so this is sort of useless.
392
393 @item nlmswap.h
394 @cindex @file{nlmswap.h}
395 Like @file{coffswap.h}, but for NLM targets. This is included by each
396 NLM target, but I think it winds up compiling to the exact same code for
397 every target, and as such is fairly useless.
398
399 @item peicode.h
400 @cindex @file{peicode.h}
401 Provides swapping routines and other hooks for PE targets.
402 @file{coffcode.h} will include this rather than @file{coffswap.h} for a
403 PE target. This defines PE specific versions of the COFF swapping
404 routines, and also defines some macros which control @file{coffcode.h}
405 itself.
406 @end table
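
The @samp{ARCH_SIZE} and @samp{NAME} arrangement described for
@file{elfcode.h} can be sketched in miniature. In BFD the shared body
lives in a header which is included from two source files; to stay
self-contained, this sketch stamps out both versions from one macro
instead. The macro definitions and the @samp{demo} names are
assumptions made for illustration, not copies of the BFD ones.

```c
#include <assert.h>

/* Paste a prefix, the size, and a suffix into one identifier, in the
   spirit of the NAME macro.  The extra level of expansion lets SIZE
   itself be a macro argument.  */
#define NAME_1(x, size, y) x##size##_##y
#define NAME(x, size, y) NAME_1 (x, size, y)

/* Stands in for the body of a header that is compiled once per size.  */
#define DEFINE_ADDR_BYTES(size)                             \
  static unsigned int NAME (demo, size, addr_bytes) (void)  \
  {                                                         \
    return (size) / 8;                                      \
  }

DEFINE_ADDR_BYTES (32)   /* defines demo32_addr_bytes */
DEFINE_ADDR_BYTES (64)   /* defines demo64_addr_bytes */

/* Callers pick the version that matches the target size.  */
static unsigned int
demo_addr_bytes (int arch_size)
{
  assert (arch_size == 32 || arch_size == 64);
  return arch_size == 32 ? demo32_addr_bytes () : demo64_addr_bytes ();
}
```

The two generated functions share one source text but have different
names and different compiled code, which is exactly what makes such
files awkward to debug.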
407
408 @node BFD relocation handling
409 @section BFD relocation handling
410 @cindex bfd relocation handling
411 @cindex relocations in bfd
412
413 The handling of relocations is one of the more confusing aspects of BFD.
414 Relocation handling has been implemented in various different ways, all
415 somewhat incompatible, none perfect.
416
417 @menu
418 * BFD relocation concepts:: BFD relocation concepts
419 * BFD relocation functions:: BFD relocation functions
420 * BFD relocation codes:: BFD relocation codes
421 * BFD relocation future:: BFD relocation future
422 @end menu
423
424 @node BFD relocation concepts
425 @subsection BFD relocation concepts
426
427 A relocation is an action which the linker must take when linking. It
428 describes a change to the contents of a section. The change is normally
429 based on the final value of one or more symbols. Relocations are
430 created by the assembler when it creates an object file.
431
432 Most relocations are simple. A typical simple relocation is to set 32
433 bits at a given offset in a section to the value of a symbol. This type
434 of relocation would be generated for code like @code{int *p = &i;} where
435 @samp{p} and @samp{i} are global variables. A relocation for the symbol
436 @samp{i} would be generated such that the linker would initialize the
437 area of memory which holds the value of @samp{p} to the value of the
438 symbol @samp{i}.
439
440 Slightly more complex relocations may include an addend, which is a
441 constant to add to the symbol value before using it. In some cases a
442 relocation will require adding the symbol value to the existing contents
443 of the section in the object file. In others the relocation will simply
444 replace the contents of the section with the symbol value. Some
445 relocations are PC relative, so that the value to be stored in the
446 section is the difference between the value of a symbol and the final
447 address of the section contents.
448
449 In general, relocations can be arbitrarily complex. For
450 example, relocations used in dynamic linking systems often require the
451 linker to allocate space in a different section and use the offset
452 within that section as the value to store. In the IEEE object file
453 format, relocations may involve arbitrary expressions.
454
455 When doing a relocateable link, the linker may or may not have to do
456 anything with a relocation, depending upon the definition of the
457 relocation. Simple relocations generally do not require any special
458 action.
459
460 @node BFD relocation functions
461 @subsection BFD relocation functions
462
463 In BFD, each section has an array of @samp{arelent} structures. Each
464 structure has a pointer to a symbol, an address within the section, an
465 addend, and a pointer to a @samp{reloc_howto_struct} structure. The
466 howto structure has a bunch of fields describing the reloc, including a
467 type field. The type field is specific to the object file format
468 backend; none of the generic code in BFD examines it.
469
470 Originally, the function @samp{bfd_perform_relocation} was supposed to
471 handle all relocations. In theory, many relocations would be simple
472 enough to be described by the fields in the howto structure. For those
473 that weren't, the howto structure included a @samp{special_function}
474 field to use as an escape.
475
476 While this seems plausible, a look at @samp{bfd_perform_relocation}
477 shows that it failed. The function has odd special cases. Some of the
478 fields in the howto structure, such as @samp{pcrel_offset}, were not
479 adequately documented.
480
481 The linker uses @samp{bfd_perform_relocation} to do all relocations when
482 the input and output file have different formats (e.g., when generating
483 S-records). The generic linker code, which is used by all targets which
484 do not define their own special purpose linker, uses
485 @samp{bfd_get_relocated_section_contents}, which for most targets turns
486 into a call to @samp{bfd_generic_get_relocated_section_contents}, which
487 calls @samp{bfd_perform_relocation}. So @samp{bfd_perform_relocation}
488 is still widely used, which makes it difficult to change, since it is
489 difficult to test all possible cases.
490
491 The assembler used @samp{bfd_perform_relocation} for a while. This
492 turned out to be the wrong thing to do, since
493 @samp{bfd_perform_relocation} was written to handle relocations on an
494 existing object file, while the assembler needed to create relocations
495 in a new object file. The assembler was changed to use the new function
496 @samp{bfd_install_relocation} instead, and @samp{bfd_install_relocation}
497 was created as a copy of @samp{bfd_perform_relocation}.
498
499 Unfortunately, the work did not progress any further, so
500 @samp{bfd_install_relocation} remains a simple copy of
501 @samp{bfd_perform_relocation}, with all the odd special cases and
502 confusing code. This again is difficult to change, because again any
503 change can affect any assembler target, and so is difficult to test.
504
505 The new linker, when using the same object file format for all input
506 files and the output file, does not convert relocations into
507 @samp{arelent} structures, so it can not use
508 @samp{bfd_perform_relocation} at all. Instead, users of the new linker
509 are expected to write a @samp{relocate_section} function which will
510 handle relocations in a target specific fashion.
511
512 There are two helper functions for target specific relocation:
513 @samp{_bfd_final_link_relocate} and @samp{_bfd_relocate_contents}.
514 These functions use a howto structure, but they @emph{do not} use the
515 @samp{special_function} field. Since the functions are normally called
516 from target specific code, the @samp{special_function} field adds
517 little; any relocations which require special handling can be handled
518 without calling those functions.
519
520 So, if you want to add a new target, or add a new relocation to an
521 existing target, you need to do the following:
522 @itemize @bullet
523 @item
524 Make sure you clearly understand what the contents of the section should
525 look like after assembly, after a relocateable link, and after a final
526 link. Make sure you clearly understand the operations the linker must
527 perform during a relocateable link and during a final link.
528
529 @item
530 Write a howto structure for the relocation. The howto structure is
531 flexible enough to represent any relocation which should be handled by
532 setting a contiguous bitfield in the destination to the value of a
533 symbol, possibly with an addend, possibly adding the symbol value to the
534 value already present in the destination.
535
536 @item
537 Change the assembler to generate your relocation. The assembler will
538 call @samp{bfd_install_relocation}, so your howto structure has to be
539 able to handle that. You may need to set the @samp{special_function}
540 field to handle assembly correctly. Be careful to ensure that any code
541 you write to handle the assembler will also work correctly when doing a
542 relocateable link. For example, see @samp{bfd_elf_generic_reloc}.
543
544 @item
545 Test the assembler. Consider the cases of relocation against an
546 undefined symbol, a common symbol, a symbol defined in the object file
547 in the same section, and a symbol defined in the object file in a
548 different section. These cases may not all be applicable for your
549 reloc.
550
551 @item
552 If your target uses the new linker, which is recommended, add any
553 required handling to the target specific relocation function. In simple
554 cases this will just involve a call to @samp{_bfd_final_link_relocate}
555 or @samp{_bfd_relocate_contents}, depending upon the definition of the
556 relocation and whether the link is relocateable or not.
557
558 @item
559 Test the linker. Test the case of a final link. If the relocation can
560 overflow, use a linker script to force an overflow and make sure the
561 error is reported correctly. Test a relocateable link, whether the
562 symbol is defined or undefined in the relocateable output. For both the
563 final and relocateable link, test the case when the symbol is a common
564 symbol, when the symbol looked like a common symbol but became a defined
565 symbol, when the symbol is defined in a different object file, and when
566 the symbol is defined in the same object file.
567
568 @item
569 In order for linking to another object file format, such as S-records,
570 to work correctly, @samp{bfd_perform_relocation} has to do the right
571 thing for the relocation. You may need to set the
572 @samp{special_function} field to handle this correctly. Test this by
573 doing a link in which the output object file format is S-records.
574
575 @item
576 Using the linker to generate relocateable output in a different object
577 file format is impossible in the general case, so you generally don't
578 have to worry about that. Linking input files of different object file
579 formats together is quite unusual, but if you're really dedicated you
580 may want to consider testing this case, both when the output object file
581 format is the same as your format, and when it is different.
582 @end itemize
583
584 @node BFD relocation codes
585 @subsection BFD relocation codes
586
587 BFD has another way of describing relocations besides the howto
588 structures described above: the enum @samp{bfd_reloc_code_real_type}.
589
590 Every known relocation type can be described as a value in this
591 enumeration. The enumeration contains many target specific relocations,
592 but where two or more targets have the same relocation, a single code is
593 used. For example, the single value @samp{BFD_RELOC_32} is used for all
594 simple 32 bit relocation types.
595
596 The main purpose of this relocation code is to give the assembler some
597 mechanism to create @samp{arelent} structures. In order for the
598 assembler to create an @samp{arelent} structure, it has to be able to
599 obtain a howto structure. The function @samp{bfd_reloc_type_lookup},
600 which simply calls the target vector entry point
601 @samp{reloc_type_lookup}, takes a relocation code and returns a howto
602 structure.
603
604 The function @samp{bfd_get_reloc_code_name} returns the name of a
605 relocation code. This is mainly used in error messages.
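
A target's lookup routine might be organized roughly like the sketch
below, which maps generic codes to format specific howto entries
through a table. Everything here is hypothetical: the @samp{demo_}
enum, structures, and functions stand in for
@samp{bfd_reloc_code_real_type}, the howto structure, and a
@samp{reloc_type_lookup} entry point, and do not reproduce their real
definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical generic relocation codes.  */
enum demo_reloc_code { DEMO_RELOC_32, DEMO_RELOC_16, DEMO_RELOC_32_PCREL };

/* Hypothetical reduced howto: the format specific type and a name.  */
struct demo_howto
{
  unsigned int type;   /* number used by the object file format */
  const char *name;
};

struct demo_reloc_map
{
  enum demo_reloc_code code;
  struct demo_howto howto;
};

/* This imaginary target has no 16 bit relocation.  */
static const struct demo_reloc_map demo_map[] =
{
  { DEMO_RELOC_32,       { 1, "R_DEMO_32" } },
  { DEMO_RELOC_32_PCREL, { 2, "R_DEMO_PC32" } },
};

/* Return the howto for CODE, or NULL if this target cannot express it.  */
static const struct demo_howto *
demo_reloc_type_lookup (enum demo_reloc_code code)
{
  size_t i;

  for (i = 0; i < sizeof demo_map / sizeof demo_map[0]; i++)
    if (demo_map[i].code == code)
      return &demo_map[i].howto;
  return NULL;
}

/* Convenience wrapper for callers that know CODE is supported.  */
static unsigned int
demo_reloc_type (enum demo_reloc_code code)
{
  const struct demo_howto *howto = demo_reloc_type_lookup (code);

  assert (howto != NULL);
  return howto->type;
}
```

A second target would have its own table, so the same code could map
to a howto with a different type number or different addend handling.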
606
607 Using both howto structures and relocation codes can be somewhat
608 confusing. There are many processor specific relocation codes.
609 However, the relocation is only fully defined by the howto structure.
610 The same relocation code will map to different howto structures in
611 different object file formats. For example, the addend handling may be
612 different.
613
614 Most of the relocation codes are not really general. The assembler can
615 not use them without already understanding what sorts of relocations can
616 be used for a particular target. It might be possible to replace the
617 relocation codes with something simpler.
618
619 @node BFD relocation future
620 @subsection BFD relocation future
621
622 Clearly the current BFD relocation support is in bad shape. A
623 wholesale rewrite would be very difficult, because it would require
624 thorough testing of every BFD target. So some sort of incremental
625 change is required.
626
627 My vague thoughts on this would involve defining a new, clearly defined,
628 howto structure. Some mechanism would be used to determine which type
629 of howto structure was being used by a particular format.
630
631 The new howto structure would clearly define the relocation behaviour in
632 the case of an assembly, a relocateable link, and a final link. At
633 least one special function would be defined as an escape, and it might
634 make sense to define more.
635
636 One or more generic functions similar to @samp{bfd_perform_relocation}
637 would be written to handle the new howto structure.
638
639 This should make it possible to write a generic version of the relocate
640 section functions used by the new linker. The target specific code
641 would provide some mechanism (a function pointer or an initial
642 conversion) to convert target specific relocations into howto
643 structures.
644
645 Ideally it would be possible to use this generic relocate section
646 function for the generic linker as well. That is, it would replace the
647 @samp{bfd_generic_get_relocated_section_contents} function which is
648 currently normally used.
649
650 For the special case of ELF dynamic linking, more consideration needs to
651 be given to writing ELF specific but ELF target generic code to handle
652 special relocation types such as GOT and PLT.
653
654 @node BFD ELF support
655 @section BFD ELF support
656 @cindex elf support in bfd
657 @cindex bfd elf support
658
659 The ELF object file format is defined in two parts: a generic ABI and a
660 processor specific supplement. The ELF support in BFD is split in a
661 similar fashion. The processor specific support is largely kept within
662 a single file. The generic support is provided by several other files.
663 The processor specific support provides a set of function pointers and
664 constants used by the generic support.
665
666 @menu
667 * BFD ELF generic support:: BFD ELF generic support
668 * BFD ELF processor specific support:: BFD ELF processor specific support
669 * BFD ELF future:: BFD ELF future
670 @end menu
671
672 @node BFD ELF generic support
673 @subsection BFD ELF generic support
674
675 In general, functions which do not read external data from the ELF file
676 are found in @file{elf.c}. They operate on the internal forms of the
677 ELF structures, which are defined in @file{include/elf/internal.h}. The
678 internal structures are defined in terms of @samp{bfd_vma}, and so may
679 be used for both 32 bit and 64 bit ELF targets.
680
681 The file @file{elfcode.h} contains functions which operate on the
682 external data. @file{elfcode.h} is compiled twice, once via
683 @file{elf32.c} with @samp{ARCH_SIZE} defined as @samp{32}, and once via
684 @file{elf64.c} with @samp{ARCH_SIZE} defined as @samp{64}.
685 @file{elfcode.h} includes functions to swap the ELF structures in and
686 out of external form, as well as a few more complex functions.
687
688 Linker support is found in @file{elflink.c} and @file{elflink.h}. The
689 latter file is compiled twice, for both 32 and 64 bit support. The
690 linker support is only used if the processor specific file defines
691 @samp{elf_backend_relocate_section}, which is required to relocate the
692 section contents. If that macro is not defined, the generic linker code
693 is used, and relocations are handled via @samp{bfd_perform_relocation}.
694
695 The core file support is in @file{elfcore.h}, which is compiled twice,
696 for both 32 and 64 bit support. The more interesting cases of core file
697 support only work on a native system which has the @file{sys/procfs.h}
698 header file. Without that file, the core file support does little more
699 than read the ELF program segments as BFD sections.
700
701 The BFD internal header file @file{elf-bfd.h} is used for communication
702 among these files and the processor specific files.
703
704 The default entries for the BFD ELF target vector are found mainly in
705 @file{elf.c}. Some functions are found in @file{elfcode.h}.
706
707 The processor specific files may override particular entries in the
708 target vector, but most do not, with one exception: the
709 @samp{bfd_reloc_type_lookup} entry point is always processor specific.
710
711 @node BFD ELF processor specific support
712 @subsection BFD ELF processor specific support
713
714 By convention, the processor specific support for a particular processor
715 will be found in @file{elf@var{nn}-@var{cpu}.c}, where @var{nn} is
716 either 32 or 64, and @var{cpu} is the name of the processor.
717
718 @menu
719 * BFD ELF processor required:: Required processor specific support
720 * BFD ELF processor linker:: Processor specific linker support
721 * BFD ELF processor other:: Other processor specific support options
722 @end menu
723
724 @node BFD ELF processor required
725 @subsubsection Required processor specific support
726
727 When writing a @file{elf@var{nn}-@var{cpu}.c} file, you must do the
728 following:
729 @itemize @bullet
730 @item
731 Define either @samp{TARGET_BIG_SYM} or @samp{TARGET_LITTLE_SYM}, or
732 both, to a unique C name to use for the target vector. This name should
733 appear in the list of target vectors in @file{targets.c}, and will also
734 have to appear in @file{config.bfd} and @file{configure.in}. Define
735 @samp{TARGET_BIG_SYM} for a big-endian processor,
736 @samp{TARGET_LITTLE_SYM} for a little-endian processor, and define both
737 for a bi-endian processor.
738 @item
739 Define either @samp{TARGET_BIG_NAME} or @samp{TARGET_LITTLE_NAME}, or
740 both, to a string used as the name of the target vector. This is the
741 name which a user of the BFD tool would use to specify the object file
742 format. It would normally appear in a linker emulation parameters
743 file.
744 @item
745 Define @samp{ELF_ARCH} to the BFD architecture (an element of the
746 @samp{bfd_architecture} enum, typically @samp{bfd_arch_@var{cpu}}).
747 @item
748 Define @samp{ELF_MACHINE_CODE} to the magic number which should appear
749 in the @samp{e_machine} field of the ELF header. As of this writing,
750 these magic numbers are assigned by SCO; if you want to get a magic
751 number for a particular processor, try sending a note to
752 @email{registry@@sco.com}. In the BFD sources, the magic numbers are
753 found in @file{include/elf/common.h}; they have names beginning with
754 @samp{EM_}.
755 @item
756 Define @samp{ELF_MAXPAGESIZE} to the maximum size of a virtual page in
757 memory. This can normally be found at the start of chapter 5 in the
758 processor specific supplement. For a processor which will only be used
759 in an embedded system, or which has no memory management hardware, this
760 can simply be @samp{1}.
761 @item
762 If the format should use @samp{Rel} rather than @samp{Rela} relocations,
763 define @samp{USE_REL}. This is normally defined in chapter 4 of the
764 processor specific supplement. In the absence of a supplement, it's
765 usually easier to work with @samp{Rela} relocations, although they will
766 require more space in object files (but not in executables, except when
767 using dynamic linking). It is possible, though somewhat awkward, to
768 support both @samp{Rel} and @samp{Rela} relocations for a single target;
769 @file{elf64-mips.c} does it by overriding the relocation reading and
770 writing routines.
771 @item
772 Define howto structures for all the relocation types.
773 @item
774 Define a @samp{bfd_reloc_type_lookup} routine. This must be named
775 @samp{bfd_elf@var{nn}_bfd_reloc_type_lookup}, and may be either a
776 function or a macro. It must translate a BFD relocation code into a
777 howto structure. This is normally a table lookup or a simple switch.
778 @item
779 If using @samp{Rel} relocations, define @samp{elf_info_to_howto_rel}.
780 If using @samp{Rela} relocations, define @samp{elf_info_to_howto}.
781 Either way, this is a macro defined as the name of a function which
782 takes an @samp{arelent} and a @samp{Rel} or @samp{Rela} structure, and
783 sets the @samp{howto} field of the @samp{arelent} based on the
784 @samp{Rel} or @samp{Rela} structure. This normally uses
785 @samp{ELF@var{nn}_R_TYPE} to get the ELF relocation type and uses it as
786 an index into a table of howto structures.
787 @end itemize
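
The last three items can be sketched as follows. This is an
illustration under stated assumptions, not BFD code: the
@samp{demo_} types, the @samp{R_DEMO_} relocation types and the
@samp{DEMO_RELOC_} codes are all hypothetical stand-ins for the real
howto, @samp{arelent} and BFD relocation code definitions. The only
detail taken from the ELF format itself is that a 32 bit
@samp{r_info} field holds the relocation type in its low 8 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for BFD relocation codes.  */
enum demo_reloc_code { DEMO_RELOC_NONE, DEMO_RELOC_32, DEMO_RELOC_32_PCREL };

/* Hypothetical ELF relocation types for an imaginary processor.  */
enum { R_DEMO_NONE = 0, R_DEMO_32 = 1, R_DEMO_PC32 = 2, R_DEMO_max };

/* Greatly simplified stand-ins for a howto and an arelent.  */
struct demo_howto { unsigned type; const char *name; int pc_relative; };
struct demo_arelent { const struct demo_howto *howto; };

/* The howto table, indexed by ELF relocation type.  */
static const struct demo_howto demo_howto_table[R_DEMO_max] = {
  { R_DEMO_NONE, "R_DEMO_NONE", 0 },
  { R_DEMO_32,   "R_DEMO_32",   0 },
  { R_DEMO_PC32, "R_DEMO_PC32", 1 },
};

/* Low 8 bits of a 32 bit r_info hold the relocation type.  */
#define DEMO_ELF32_R_TYPE(info) ((info) & 0xff)

/* Translate a relocation code into a howto: a simple switch.  */
const struct demo_howto *
demo_reloc_type_lookup (enum demo_reloc_code code)
{
  switch (code)
    {
    case DEMO_RELOC_NONE:     return &demo_howto_table[R_DEMO_NONE];
    case DEMO_RELOC_32:       return &demo_howto_table[R_DEMO_32];
    case DEMO_RELOC_32_PCREL: return &demo_howto_table[R_DEMO_PC32];
    default:                  return NULL;
    }
}

/* Set the howto field from the type stored in r_info: a table index.  */
void
demo_info_to_howto (struct demo_arelent *rel, uint32_t r_info)
{
  unsigned type = DEMO_ELF32_R_TYPE (r_info);

  assert (type < R_DEMO_max);
  rel->howto = &demo_howto_table[type];
}
```

Keeping the table indexed by ELF relocation type lets both routines
share it, which is why the lookup routine is usually the only place
that needs to know about the relocation codes.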
788
789 You must also add the magic number for this processor to the
790 @samp{prep_headers} function in @file{elf.c}.
791
792 @node BFD ELF processor linker
793 @subsubsection Processor specific linker support
794
795 The linker will be much more efficient if you define a relocate section
796 function. This will permit BFD to use the ELF specific linker support.
797
798 If you do not define a relocate section function, BFD must use the
799 generic linker support, which requires converting all symbols and
800 relocations into BFD @samp{asymbol} and @samp{arelent} structures. In
801 this case, relocations will be handled by calling
802 @samp{bfd_perform_relocation}, which will use the howto structures you
803 have defined. @xref{BFD relocation handling}.
804
805 In order to support linking into a different object file format, such as
806 S-records, @samp{bfd_perform_relocation} must work correctly with your
807 howto structures, so you can't skip that step. However, if you define
808 the relocate section function, then in the normal case of linking into
809 an ELF file the linker will not need to convert symbols and relocations,
810 and will be much more efficient.
811
812 To use a relocate section function, define the macro
813 @samp{elf_backend_relocate_section} as the name of a function which will
814 take the contents of a section, as well as relocation, symbol, and other
815 information, and modify the section contents according to the relocation
816 information. In simple cases, this is little more than a loop over the
817 relocations which computes the value of each relocation and calls
818 @samp{_bfd_final_link_relocate}. The function must check for a
819 relocatable link, and in that case normally needs to do nothing other
820 than adjust the addend for relocations against a section symbol.
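
The simple case described above, a loop over the relocations, might
look roughly like the sketch below. It is self-contained and does not
use the real BFD interfaces: @samp{sketch_relocate_section},
@samp{struct sketch_rela} and the two @samp{R_SKETCH_} relocation types
are hypothetical, the output is assumed to be little-endian, and the
relocatable link and dynamic linking cases are left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical relocation types: absolute and PC relative 32 bit.  */
enum { R_SKETCH_ABS32 = 1, R_SKETCH_PC32 = 2 };

/* Hypothetical internal form of a Rela style relocation.  */
struct sketch_rela
{
  uint32_t r_offset;   /* offset of the field within the section */
  uint32_t r_sym;      /* index into the symbol value array */
  uint32_t r_type;     /* one of the R_SKETCH_ values */
  int32_t r_addend;    /* explicit addend */
};

/* Store a 32 bit value at P in little-endian byte order.  */
static void
put_le32 (unsigned char *p, uint32_t v)
{
  p[0] = v & 0xff;
  p[1] = (v >> 8) & 0xff;
  p[2] = (v >> 16) & 0xff;
  p[3] = (v >> 24) & 0xff;
}

/* Apply every relocation to CONTENTS.  Returns 1 on success and 0 if a
   relocation is out of range or of an unknown type.  */
int
sketch_relocate_section (unsigned char *contents, size_t size,
                         uint32_t section_vma,
                         const struct sketch_rela *relocs, size_t nrelocs,
                         const uint32_t *symbol_values, size_t nsyms)
{
  size_t i;

  assert (contents != NULL);
  for (i = 0; i < nrelocs; i++)
    {
      const struct sketch_rela *rel = &relocs[i];
      uint32_t value;

      if (rel->r_sym >= nsyms
          || rel->r_offset > size || size - rel->r_offset < 4)
        return 0;

      /* Symbol value plus addend is the starting point for both types.  */
      value = symbol_values[rel->r_sym] + (uint32_t) rel->r_addend;
      switch (rel->r_type)
        {
        case R_SKETCH_ABS32:
          break;
        case R_SKETCH_PC32:
          /* Subtract the address of the field being relocated.  */
          value -= section_vma + rel->r_offset;
          break;
        default:
          return 0;
        }
      put_le32 (contents + rel->r_offset, value);
    }
  return 1;
}
```

A real relocate section function would look up each symbol through the
linker hash table or the local symbol table rather than a flat array,
and would leave the final store to a helper such as
@samp{_bfd_final_link_relocate}.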
821
822 The complex cases generally have to do with dynamic linker support. GOT
823 and PLT relocations must be handled specially, and the linker normally
824 arranges to set up the GOT and PLT sections while handling relocations.
825 When generating a shared library, arbitrary relocations must normally be
826 copied into the shared library, or converted to RELATIVE relocations
827 when possible.
828
829 @node BFD ELF processor other
830 @subsubsection Other processor specific support options
831
832 There are many other macros which may be defined in
833 @file{elf@var{nn}-@var{cpu}.c}. These macros may be found in
834 @file{elfxx-target.h}.
835
836 Macros may be used to override some of the generic ELF target vector
837 functions.
838
839 There are several processor specific hook functions which may be
840 defined as macros. These functions are found as function pointers in the
841 @samp{elf_backend_data} structure defined in @file{elf-bfd.h}. In
842 general, a hook function is set by defining a macro
843 @samp{elf_backend_@var{name}}.
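
The general shape of this mechanism can be sketched as below. The
sketch is an assumption-laden illustration, not the contents of
@file{elfxx-target.h} or @file{elf-bfd.h}: every name beginning with
@samp{sketch_} or @samp{cpu_} is hypothetical, and the structure has
only two hooks and one constant. The point is that a hook the
processor specific file does not define falls back to a default, here a
null pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much reduced stand-in for the backend data structure.  */
struct sketch_backend_data
{
  int (*symbol_processing) (int sym);   /* optional hook */
  int (*add_symbol_hook) (int sym);     /* optional hook */
  unsigned long maxpagesize;            /* processor specific constant */
};

/* What a processor specific file would provide: one hook and one
   constant, each announced by defining a macro.  */
static int
cpu_symbol_processing (int sym)
{
  return sym + 1;
}
#define sketch_backend_symbol_processing cpu_symbol_processing
#define SKETCH_MAXPAGESIZE 0x1000

/* What the shared target header would do: supply a default for every
   hook the processor specific file left undefined.  */
#ifndef sketch_backend_symbol_processing
#define sketch_backend_symbol_processing NULL
#endif
#ifndef sketch_backend_add_symbol_hook
#define sketch_backend_add_symbol_hook NULL
#endif

static const struct sketch_backend_data sketch_bed = {
  sketch_backend_symbol_processing,
  sketch_backend_add_symbol_hook,
  SKETCH_MAXPAGESIZE
};

/* Generic code calls a hook only if the processor supplied one.  */
int
sketch_process_symbol (const struct sketch_backend_data *bed, int sym)
{
  assert (bed != NULL);
  if (bed->symbol_processing != NULL)
    return bed->symbol_processing (sym);
  return sym;
}
```

With this arrangement the generic code never needs a per-processor
conditional; it only tests whether a hook pointer is null.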
844
845 There are a few processor specific constants which may also be defined.
846 These are again found in the @samp{elf_backend_data} structure.
847
848 I will not define the various functions and constants here; see the
849 comments in @file{elf-bfd.h}.
850
851 Normally any odd characteristic of a particular ELF processor is handled
852 via a hook function. For example, the special @samp{SHN_MIPS_SCOMMON}
853 section number found in MIPS ELF is handled via the hooks
854 @samp{section_from_bfd_section}, @samp{symbol_processing},
855 @samp{add_symbol_hook}, and @samp{output_symbol_hook}.
856
857 Dynamic linking support, which involves processor specific relocations
858 requiring special handling, is also implemented via hook functions.
859
860 @node BFD ELF future
861 @subsection BFD ELF future
862
863 The current dynamic linking support has too much code duplication.
864 While each processor has particular differences, much of the dynamic
865 linking support is quite similar for each processor. The GOT and PLT
866 are handled in fairly similar ways, the details of -Bsymbolic linking
867 are generally similar, etc. This code should be reworked to use more
868 generic functions, eliminating the duplication.
869
870 Similarly, the relocation handling has too much duplication. Many of
871 the @samp{reloc_type_lookup} and @samp{info_to_howto} functions are
872 quite similar. The relocate section functions are also often quite
873 similar, both in the standard linker handling and the dynamic linker
874 handling. Many of the COFF processor specific backends share a single
875 relocate section function (@samp{_bfd_coff_generic_relocate_section}),
876 and it should be possible to do something like this for the ELF targets
877 as well.
878
879 The appearance of the processor specific magic number in
880 @samp{prep_headers} in @file{elf.c} is somewhat bogus. It should be
881 possible to add support for a new processor without changing the generic
882 support.
883
884 The processor function hooks and constants are ad hoc and need better
885 documentation.
886
887 @node Index
888 @unnumberedsec Index
889 @printindex cp
890
891 @contents
892 @bye