1 \input texinfo
2 @setfilename gld.info
3 @c $Id$
4 @syncodeindex ky cp
5 @ifinfo
6 This file documents the GNU linker GLD.
7
8 Copyright (C) 1991 Free Software Foundation, Inc.
9
10 Permission is granted to make and distribute verbatim copies of
11 this manual provided the copyright notice and this permission notice
12 are preserved on all copies.
13
14 @ignore
15 Permission is granted to process this file through Tex and print the
16 results, provided the printed document carries copying permission
17 notice identical to this one except for the removal of this paragraph
18 (this paragraph not being relevant to the printed manual).
19
20 @end ignore
21 Permission is granted to copy and distribute modified versions of this
22 manual under the conditions for verbatim copying, provided also that the
23 section entitled ``GNU General Public License'' is included exactly as
24 in the original, and provided that the entire resulting derived work is
25 distributed under the terms of a permission notice identical to this
26 one.
27
28 Permission is granted to copy and distribute translations of this manual
29 into another language, under the above conditions for modified versions,
30 except that the section entitled ``GNU General Public License'' may be
31 included in a translation approved by the author instead of in the
32 original English.
33 @end ifinfo
34 @setchapternewpage odd
35 @settitle GLD, the GNU linker
36 @titlepage
37 @title{gld}
38 @subtitle{The GNU linker}
39 @sp 1
40 @subtitle Second Edition---@code{gld} version 2.0
41 @subtitle April 1991
42 @author {Steve Chamberlain, Roland Pesch}
43 @author {Cygnus Support}
44 @page
45
46 @tex
47 \def\$#1${{#1}} % Kluge: collect RCS revision info without $...$
48 \xdef\manvers{\$Revision$} % For use in headers, footers too
49 {\parskip=0pt
50 \hfill Cygnus Support\par
51 \hfill {\it GLD, the GNU linker}, \manvers\par
52 \hfill \TeX{}info \texinfoversion\par
53 \hfill steve\@cygnus.com, pesch\@cygnus.com\par
54 }
55 \global\parindent=0pt % Steve likes it this way.
56 @end tex
57
58 @vskip 0pt plus 1filll
59 Copyright @copyright{} 1991 Free Software Foundation, Inc.
60
61 Permission is granted to make and distribute verbatim copies of
62 this manual provided the copyright notice and this permission notice
63 are preserved on all copies.
64
65 Permission is granted to copy and distribute modified versions of this
66 manual under the conditions for verbatim copying, provided also that
67 the entire resulting derived work is distributed under the terms of a
68 permission notice identical to this one.
69
70 Permission is granted to copy and distribute translations of this manual
71 into another language, under the above conditions for modified versions.
72 @end titlepage
73 @c FIXME: Talk about importance of *order* of args, cmds to linker!
74
75 @node Top,,,
76 @ifinfo
77 This file documents the GNU linker gld.
78 @end ifinfo
79
80 @node Overview,,,
81 @chapter Overview
82
83 @code{gld} combines a number of object and archive files, relocates
84 their data and ties up symbol references. Often the last step in
85 building a new compiled program to run is a call to @code{gld}.
86
87 @code{gld} accepts Linker Command Language files written in
88 a superset of AT@&T's Link Editor Command Language syntax,
89 to provide explicit and total control over the linking process.
90
91 This version of @code{gld} uses the general purpose @code{bfd} libraries
92 to operate on object files. This allows @code{gld} to read, combine, and
93 write object files in many different formats---for example, COFF or
94 @code{a.out}. Different formats may be linked together to produce any
available kind of object file. @xref{BFD}, for a list of formats
96 supported on various architectures.
97
98 When linking formats with equivalent representations of debugging
99 information (typically variations on one format), @code{gld} maintains
100 all debugging information.
101
102 @node Invocation,,,
103 @chapter Command line options
104
105 @c FIXME: -D, -N, -z, -f from older GNU linker, but not currently in new;
106 @c FIXME...steve is currently thinking about whether to add them. Maybe
107 @c FIXME...remove from document.
108 @example
109 gld [-o @var{output} ] @var{objfiles}@dots{}
110 [ -A@var{architecture} ] [ -b @var{output-format} ] [ -Bstatic ]
111 [ -c @var{commandfile} ] [ -D @var{datasize} ]
112 [ -d | -dc | -dp ] [ -defsym @var{symbol} = @var{expression} ]
113 [ -e @var{entry} ] [ -f @var{fill} ] [ -F ] [ -F @var{format} ]
114 [ -format @var{output-format} ] [ -g ] [ -i ]
115 [ -l@var{ar} ] [ -L@var{searchdir} ] [ -M | -m ]
116 [ -N | -n | -z ] [ -noinhibit-exec ] [ -R @var{filename} ]
117 [ -r | -Ur ] [ -S ] [ -s ]
118 [ SCRIPT @dots{} ENDSCRIPT ] [ SCRIPT @dots{} @@ ]
119 [ -T @var{commandfile} ]
120 [ -Ttext @var{textorg} ] [ -Tdata @var{dataorg} ] [ -Tbss @var{bssorg} ]
121 [ -t ] [ -u @var{sym}] [-v] [ -X ] [ -x ]
122 @end example
123
124 This plethora of command-line options may seem intimidating, but in
125 actual practice few of them are used in any particular context.
126 For instance, a frequent use of @code{gld} is to link standard Unix
127 object files on a standard, supported Unix system. On such a system, to
128 link a file @code{hello.o}:
129 @example
130 $ gld -o output /lib/crt0.o hello.o -lc
131 @end example
132 This tells @code{gld} to produce a file called @code{output} as the
133 result of linking the file @code{/lib/crt0.o} with @code{hello.o} and
134 the library @code{libc.a} which will come from the standard search
135 directories.
136
137 The command-line options to @code{gld} may be specified in any order, and
138 may be repeated at will. For the most part, repeating an option with a
139 different argument will either have no further effect, or override prior
140 occurrences (those further to the left on the command line) of an
141 option.
142
143 The exceptions---which may meaningfully be used more than once---
144 are @code{-L}, @code{-l}, and @code{-u}.
145 @c FIXME: probably some new opts can be repeated meaningfully too.
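
As an illustrative sketch (the directory names and the symbol
@code{_extra} are hypothetical), a single command line might repeat
each of these options, with every occurrence adding to the effect of
the earlier ones rather than overriding them:
@example
$ gld -o output /lib/crt0.o hello.o -L/usr/local/lib -L/opt/lib \
      -lm -lc -u _extra
@end example
Here both directories are added to the search path, both
@code{libm.a} and @code{libc.a} are searched for, and @code{_extra} is
entered as an undefined symbol.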
146
147 The list of object files to be linked together, shown as @var{objfiles},
148 may follow, precede, or be mixed in with command-line options; save that
149 an @var{objfiles} argument may not be placed between an option flag and
150 its argument.
151
152 Option arguments must follow the option letter without intervening
153 whitespace, or be given as separate arguments immediately following the
154 option that requires them.
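
For instance, going by the rule above, the following two commands
should be equivalent; the first attaches the output name directly to
@code{-o}, the second passes it as a separate argument (the file names
are those of the earlier example):
@example
$ gld -ooutput /lib/crt0.o hello.o -lc
$ gld -o output /lib/crt0.o hello.o -lc
@end example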
155
156 @table @code
157 @item @var{objfiles}@dots{}
158 The object files @var{objfiles} to be linked; at least one must be specified.
159
160 @item -A@var{architecture}
161 In the current release of @code{gld}, this option is useful only for the
162 Intel 960 family of architectures. In that context, the
163 @var{architecture} argument is one of the two-letter names identifying
164 members of the 960 family; the option specifies the desired output
165 target, and warns of any incompatible instructions in the input files.
166 It also selects archive libraries supporting the particular
167 architecture; its effect in this regard is similar to that of @code{-l}, save
168 that @code{-A}@var{architecture} triggers a two-level search; first for a
169 library with exactly the name you specify as @var{architecture}, and if
170 that fails, for a library named with the @code{-l} convention---i.e.,
171 @samp{lib@var{architecture}.a}.
172
173 Future releases of @code{gld} may support similar functionality for
174 other architecture families.
175
176 @item -b @var{output-format}
177 Specify the desired output-file binary format. You don't usually need
to specify this. @code{gld} can determine
the format of @emph{input} files by inspection, and---in the most frequent
case, when all input files have the same format---@code{gld} selects the
same format for output files by default.
182
183 You can use this option if you need to link a variety of object formats
184 together, or if you wish to force a different output format even though
185 you have homogeneous input files.
186
187 @var{output-format} is a text string, the name of a particular format
188 supported by the BFD libraries. @xref{BFD}.
189
190 @code{-format @var{output-format}} has the same effect.
191
192 @item -Bstatic
193 This flag is accepted for command-line compatibility with the SunOS linker,
194 but has no effect on @code{gld}.
195
196 @item -c @var{commandfile}
197 Directs @code{gld} to read link commands from the file
198 @var{commandfile}. These commands will override @code{gld}'s
default link format in its entirety; @var{commandfile} must supply
everything needed to describe the target format. @xref{Commands}.
201
202 You may also include a script of link commands directly in the command
203 line by using the @code{SCRIPT} @dots{} @code{ENDSCRIPT} keywords.
204
205 @c FIXME: -D in older GNU linker, not necessarily in new
206 @item -D @var{datasize}
207 Use this option to specify a target size for the @code{data} segment of
208 your linked program. The option is only obeyed if @var{datasize} is
209 larger than the natural size of the program's @code{data} segment.
210
211 @var{datasize} must be an integer specified in hexadecimal.
212
213 @code{ld} will simply increase the size of the @code{data} segment,
214 padding the created gap with zeros (or a fill pattern specified with
215 @samp{-f}, or using the command language), and reduce the size of the
216 @code{bss} segment by the same amount.
217 @c FIXME: double-check this w/Steve. Open questions: order? Does it
218 @c FIXME...matter whether -f before or after -D? What about -c relative
219 @c FIXME...position? fill cmd in default script? Apparently
220 @c FIXME...can have multiple fill patterns; which used here?
221
222
223 @item -d
224 @itemx -dc
225 @itemx -dp
226 These three options are equivalent; multiple forms are supported for
compatibility with other linkers. Any of these options will force
228 @code{ld} to assign space to common symbols even if a relocatable output
229 file is specified (@code{-r}).
230
231 @item -defsym @var{symbol} = @var{expression}
232 Create a global symbol, in the output file, set to the absolute address
233 given by @var{expression}. A limited form of arithmetic is supported
234 for the @var{expression} in this context: you may give a hexadecimal
constant, or use @code{+} and @code{-} to add or subtract hexadecimal
236 constants. If you need more elaborate expressions, consider using the
237 linker command language from a script.
238
239 @item -e @var{entry}
240 Use @var{entry} as the explicit symbol for beginning execution of your
241 program, rather than the default entry point. @xref{Entry Point}, for a
242 discussion of defaults and other ways of specifying the
243 entry point.
244
245 @c FIXME: -f in older GNU linker, not necessarily in new
246 @item -f @var{fill}
247 Sets the default fill pattern for ``holes'' in the output file to
248 the lowest two bytes of the expression specified.
249
250 @item -F
@itemx -F@var{format}
252 Some older linkers required the specification of object-file format,
253 even when all input files were homogeneous, and used this option for
254 that purpose. @code{gld} doesn't usually require this information---it
255 automatically recognizes input-file object format---but it accepts the
256 option flag for compatibility with old scripts.
257
258 @item -format @var{output-format}
259 Synonym for @code{-b} @var{output-format}.
260
261 @item -g
262 Accepted, but ignored; provided for compatibility with other tools.
263
264 @item -i
265 Produce an incremental link (same as option @code{-r}).
266
267 @item -l@var{ar}
268 Add an archive file @var{ar} to the list of files to link. This
269 option may be used any number of times. @code{ld} will search its
270 path-list for occurrences of @code{lib@var{ar}.a} for every @var{ar}
271 specified.
272
273 @c FIXME: -l also has a side effect of using the "c++ demangler" if we happen
274 @c FIXME...to specify -llibg++. Document? pesch@@cygnus.com, 24jan91
275
276 @item -L@var{searchdir}
This option adds path @var{searchdir} to the list of paths that
278 @code{gld} will search for archive libraries. You may use this option
279 any number of times.
280
281 @c Should we make any attempt to list the standard paths searched
282 @c without listing? When hacking on a new system I often want to know
283 @c this, but this may not be the place... it's not constant across
284 @c systems, of course, which is what makes it interesting.
285 @c pesch@@cygnus.com, 24jan91.
286
287 @item -M
288 @itemx -m
289 Print (to the standard output file) a link map---diagnostic information
290 about where symbols are mapped by @code{ld}, and information on global
291 common storage allocation.
292
293 @c FIXME: -N in older GNU linker, not necessarily in new
294 @item -N
Specifies readable and writable @code{text} and @code{data} sections. If
the output format supports Unix-style magic numbers, the output is
marked as @code{OMAGIC}.

@item -n
Sets the text segment to be read-only and, if possible, marks the output
as @code{NMAGIC}.
302
303 @item -noinhibit-exec
304 Normally, the linker will not produce an output file if it encounters
305 errors during the link process. With this flag, you can specify that
you wish the output file retained even after non-fatal errors.
307
308 @item -o @var{output}
309 @var{output} is a name for the program produced by @code{ld}; if this
310 option is not specified, the name @samp{a.out} is used by default.
311
312 @item -R @var{filename}
313 Read symbol names and their addresses from @var{filename}, but do not
314 relocate it or include it in the output. This allows your output file
315 to refer symbolically to absolute locations of memory defined in other
316 programs.
317 @c FIXME: -R accurate? Motivation? Kernel memory, shared mem?
318
319 @item -r
320 @cindex partial link
Generates relocatable output---i.e., generates an output file that can in
322 turn serve as input to @code{gld}. This is often called @dfn{partial
323 linking}. As a side effect, this option also sets the output file's
324 magic number to @code{OMAGIC}; see @samp{-N}. If this option is not
325 specified, an absolute file is produced. When linking C++ programs,
326 this option @emph{will not} resolve references to constructors;
327 @samp{-Ur} is an alternative.
328
329 @item -S
330 Omits debugger symbol information (but not all symbols) from the output file.
331
332 @item -s
333 Omits all symbol information from the output file.
334
@item SCRIPT @dots{} @@
@itemx SCRIPT @dots{} ENDSCRIPT
337 You can, if you wish, include a script of linker commands directly in
338 the command line instead of referring to it via an input file. When the
339 keyword @code{SCRIPT} occurs on the command line, the linker switches to
340 interpreting the command language until the end of the list of commands
341 is reached---flagged with either an at sign @samp{@@} or with the
342 keyword @code{ENDSCRIPT}. Other command-line options will not be
recognized while parsing the script. @xref{Commands}, for a description
344 of the command language.
345
346 @item -Tbss @var{bssorg}
347 @itemx -Tdata @var{dataorg}
348 @itemx -Ttext @var{textorg}
Use @var{bssorg}, @var{dataorg}, or @var{textorg} as the starting address
for---respectively---the @code{bss}, @code{data}, or @code{text} segment
of the output file. Each address must be a hexadecimal integer.
352
353 @item -T @var{commandfile}
354 @itemx -T@var{commandfile}
355 Equivalent to @code{-c @var{commandfile}}; supported for compatibility with
356 other tools.
357
358 @item -t
359 Prints names of input files as @code{ld} processes them.
360
361 @item -u @var{sym}
362 Forces @var{sym} to be entered in the output file as an undefined symbol.
363 This may, for example, trigger linking of additional modules from
364 standard libraries. @code{-u} may be repeated with different option
365 arguments to enter additional undefined symbols. This option is equivalent
366 to the @code{EXTERN} linker command.
367
368 @item -Ur
369 @cindex constructors
370 For anything other than C++ programs, this option is equivalent to
371 @samp{-r}: it generates relocatable output---i.e., an output file that can in
372 turn serve as input to @code{gld}. When linking C++ programs, @samp{-Ur}
373 @emph{will} resolve references to constructors, unlike @samp{-r}.
374
375 @item -v
376 @cindex version
377 @cindex verbose
378 ``Verbose'' switch: display informative messages, including the version
379 numbers for @code{gld} and BFD, information on files opened, and BFD
380 subroutine calls.
381
382 @item -X
383 If @code{-s} or @code{-S} is also specified, delete only local symbols
384 beginning with @samp{L}.
385
386 @item -x
387 If @code{-s} or @code{-S} is also specified, delete all local symbols,
388 not just those beginning with @samp{L}.
389
390 @c FIXME: -z in older GNU linker, not necessarily in new
391 @item -z
392 Specifies a read-only, demand pageable, and shared @code{text} segment.
393 If the output format supports Unix-style magic numbers, @code{-z} also
394 marks the output as @code{ZMAGIC}, the default.
395
396 @c FIXME: why is following here?. Is it useful to say '-z -r' for
397 @c FIXME...instance, or is this just a ref to other ways of setting
398 @c FIXME...magic no?
399 Specifying a relocatable output file (@code{-r}) will also set the magic
400 number to @code{OMAGIC}.
401
402 See description of @samp{-N}.
403
404 @end table
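
As a sketch of how a few of these options combine (the file names
@code{a.o}, @code{b.o}, @code{partial.o}, and @code{prog} are
hypothetical), a partial link with @code{-r} can produce a relocatable
file that a later run of @code{gld} accepts as ordinary input; adding
@code{-M} to the final link prints a link map on the standard output:
@example
$ gld -r -o partial.o a.o b.o
$ gld -o prog /lib/crt0.o partial.o -lc -M
@end example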
405
406 @node Commands,,,
407 @chapter Command Language
408 @c FIXME: is this a good place to talk about LDEMULATION env var?
409 @c FIXME...Apparently some commands "subtly different" depending on
410 @c FIXME...whether this set to eg "link960", "gld960", "gld". What is
411 @c FIXME...full set of possibilities, what is default? Config-dep?
412
413
414 The command language allows explicit control over the link process,
415 allowing complete specification of the mapping between the linker's
416 input files and its output. This includes:
417 @itemize @bullet
418 @item input files
419 @item file formats
420 @item output file format
421 @item addresses of sections
422 @item placement of common blocks
423 @end itemize
424
425 A command file may be supplied to the linker, either explicitly through
426 the @code{-c} option, or implicitly as an ordinary file. If the linker
427 opens a file which it cannot recognize as a supported object or archive
428 format, it tries to interpret the file as a command file.
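
For example, assuming a command file named @code{layout.cmd} (a
hypothetical name), either of the following should hand it to the
linker: the first names it explicitly with @code{-c}, the second
relies on @code{gld} failing to recognize it as an object or archive
file:
@example
$ gld -o output -c layout.cmd /lib/crt0.o hello.o -lc
$ gld -o output layout.cmd /lib/crt0.o hello.o -lc
@end example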
429
430 @node Scripts,,,
431 @section Linker Scripts
432 The @code{gld} command language is a collection of statements; some are
433 simple keywords setting a particular flag, some are used to select and
434 group input files or name output files; and two particular statement
435 types have a fundamental and pervasive impact on the linking process.
436
437 The most fundamental command of the @code{gld} command language is the
438 @code{SECTIONS} command (@pxref{SECTIONS}). Every meaningful command
439 script must have a @code{SECTIONS} command: it specifies a
440 ``picture'' of the output file's layout, in varying degrees of detail.
441 No other command is required in all cases.
442
443 The @code{MEMORY} command complements @code{SECTIONS} by describing the
444 available memory in the target architecture; if it is not present,
445 sufficient memory is assumed to be available in a contiguous block for
446 all output. @xref{MEMORY}.
447
448 @node Expressions,,,
449 @section Expressions
450 Many useful commands involve arithmetic expressions. The syntax for
451 expressions in the command language is identical to that of C
452 expressions, with the following features:
453 @itemize @bullet
@item All expressions are evaluated as integers and
are of ``long'' or ``unsigned long'' type.
456 @item All constants are integers.
457 @item All of the C arithmetic operators are provided.
458 @item Global variables may be referenced, defined and created.
459 @item Built in functions may be called.
460 @end itemize
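
A brief sketch of these features in use follows; the symbol names are
invented for illustration, and the assignments rely on syntax
described in the following sections:
@example
_base  = 0x1000;
_size  = 4K + 0x20;
_limit = _base + _size * 2;
_mask  = ~(_size - 1) & 0xffff;
_pick  = DEFINED(_base) ? _base : 0;
@end example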
461
462 @node Integers,,,
463 @subsection Integers
An octal integer is @samp{0} followed by zero or more of the octal
digits (@samp{01234567}).
@example
_as_octal = 0157255;
@end example

A decimal integer starts with a non-zero digit followed by zero or
more digits (@samp{0123456789}).
474
475 A hexadecimal integer is @samp{0x} or @samp{0X} followed by one or
476 more hexadecimal digits chosen from @samp{0123456789abcdefABCDEF}.
477 @example
478 _as_hex = 0xdead;
479 @end example
480
481 Decimal integers have the usual values. To denote a negative integer, use
482 the prefix operator @samp{-}; @pxref{Operators}.
483 @example
484 _as_decimal = 57005;
485 _as_neg = -57005;
486 @end example
487
488 Additionally the suffixes @code{K} and @code{M} may be used to scale a
489 constant by
490 @tex
491 ${\rm 1024}$ or ${\rm 1024}^2$
492 @end tex
493 @ifinfo
494 1024 or 1024*1024
495 @end ifinfo
496 respectively. For example, the following all refer to the same quantity:@refill
497
498 @example
499 _4k_1 = 4K;
500 _4k_2 = 4096;
501 _4k_3 = 0x1000;
502 @end example
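
The @code{M} suffix works the same way. As a further illustration
(the symbol names are arbitrary), these three assignments also denote
one quantity, since 1M is 1024*1024, or 1048576:
@example
_1m_1 = 1M;
_1m_2 = 1048576;
_1m_3 = 0x100000;
@end example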
503
504 @node Symbols,,,
505 @subsection Symbol Names
506 Unless quoted, symbol names start with a letter, underscore, point or
507 minus sign and may include any letters, underscores, digits, points,
508 and minus signs. Unquoted symbol names must not conflict with any
509 keywords. You can specify a symbol which contains odd characters or has
510 the same name as a keyword, by surrounding the symbol name in double quotes:
511 @example
512 "SECTION" = 9;
513 "with a space" = "also with a space" + 10;
514 @end example
515
516 @subsection The Location Counter
517 The special linker variable @dfn{dot} @samp{.} always contains the
518 current output location counter. Since the @code{.} always refers to
519 a location in an output section, it must always appear in an
520 expression within a @code{SECTIONS} command. The @code{.} symbol
521 may appear anywhere that an ordinary symbol is allowed in an
522 expression, but its assignments have a side effect. Assigning a value
523 to the @code{.} symbol will cause the location counter to be moved.
524 This may be used to create holes in the output section. The location
525 counter may never be moved backwards.
@example
SECTIONS
@{
  output :
  @{
  file1(.text)
  . = . + 1000;
  file2(.text)
  . += 1000;
  file3(.text)
  @} = 0x1234;
@}
@end example
In the previous example, @code{file1} is located at the beginning of
the output section, followed by a 1000-byte gap filled with 0x1234.
Then @code{file2} appears, also followed by a 1000-byte gap before
@code{file3} is loaded.
546
547 @node Operators,,,
548 @subsection Operators
549 The linker recognizes the standard C set of arithmetic operators, with
550 the standard bindings and precedence levels:
551 @c FIXME: distinguish somehow between prefix, infix in operator table!
552 @c FIXME: is it fair to include assignments below? Don't they
553 @c FIXME...require trailing ; when no other exprs do?
554 @ifinfo
555 @example
precedence   associativity   Operators
(highest)
    1            left        !  -  ~
    2            left        *  /  %
    3            left        +  -
    4            left        >>  <<
    5            left        ==  !=  >  <  <=  >=
    6            left        &
    7            left        |
    8            left        &&
    9            left        ||
   10            right       ? :
   11            right       &=  +=  -=  *=  /=
(lowest)
570 @end example
571 @end ifinfo
572 @c FIXME: simplify, debug TeX form of this table!
573 @tex
574
575 \vbox{\offinterlineskip
576 \hrule
577 \halign
578 {\vrule#&\hfil#\hfil&\vrule#&\hfil#\hfil&\vrule#&\hfil#\hfil&\vrule#\cr
579 height2pt&&&&&\cr
580 &Level&& associativity &&Operators&\cr
581 height2pt&&&&&\cr
582 \noalign{\hrule}
583 height2pt&&&&&\cr
584 &highest&&&&&\cr
585 &1&&left&&$ ! - ~$&\cr
586 height2pt&&&&&\cr
587 &2&&left&&* / \%&\cr
588 height2pt&&&&&\cr
589 &3&&left&&+ -&\cr
590 height2pt&&&&&\cr
591 &4&&left&&$>> <<$&\cr
592 height2pt&&&&&\cr
593 &5&&left&&$== != > < <= >=$&\cr
594 height2pt&&&&&\cr
595 &6&&left&&\&&\cr
596 height2pt&&&&&\cr
597 &7&&left&&|&\cr
598 height2pt&&&&&\cr
599 &8&&left&&{\&\&}&\cr
600 height2pt&&&&&\cr
601 &9&&left&&||&\cr
602 height2pt&&&&&\cr
603 &10&&right&&? :&\cr
604 height2pt&&&&&\cr
605 &11&&right&&$${\&= += -= *= /=}&\cr
606 &lowest&&&&&\cr
607 height2pt&&&&&\cr}
608 \hrule}
609 @end tex
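
To illustrate how these precedence levels group an expression (the
symbol names are arbitrary), consider:
@example
_a = 1 + 2 << 3;
_b = 6 & 3 == 3;
@end example
Since @samp{+} binds more tightly than @samp{<<}, the first expression
is read as @samp{(1 + 2) << 3}, giving 24. Since @samp{==} binds more
tightly than @samp{&}, the second is read as @samp{6 & (3 == 3)};
assuming a true comparison yields 1, as in C, the result is 0.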
610
611 @node Evaluation,,,
612 @subsection Evaluation
613
614 The linker uses ``lazy evaluation'' for expressions; it only calculates
615 an expression when absolutely necessary. The linker needs the value of
616 the start address, and the lengths of memory regions, in order to do any
617 linking at all; these values are computed as soon as possible when the
618 linker reads in the command file. However, other values (such as symbol
619 values) are not known or needed until after storage allocation. Such
620 values are evaluated later, when other information (such as the sizes of
621 output sections) is available for use in the symbol assignment
622 expression.
623
624 @node Assignment,,,
625 @subsection Assignment: Defining Symbols
626
627 You may create global symbols, and assign values (addresses) to global
628 symbols, using any of the C assignment operators:
629
630 @table @code
631 @item @var{symbol} = @var{expression} ;
632 @itemx @var{symbol} += @var{expression} ;
633 @itemx @var{symbol} -= @var{expression} ;
634 @itemx @var{symbol} *= @var{expression} ;
635 @itemx @var{symbol} /= @var{expression} ;
636 @end table
637
638 Two things distinguish assignment from other operators in @code{gld}
639 expressions.
640 @itemize @bullet
641 @item Assignment may only be used at the root of an expression;
642 @samp{a=b+3;} is allowed, but @samp{a+b=3;} is an error.
643 @item A trailing semicolon is required at the end of an assignment
644 statement.
645 @end itemize
646
647 Assignment statements may appear:
648 @itemize @bullet
649 @item as commands in their own right in a @code{gld} script; or
650 @item as independent statements within a @code{SECTIONS} command; or
651 @item as part of the contents of a section definition in a
652 @code{SECTIONS} command.
653 @end itemize
654
655 The first two cases are equivalent in effect---both define a symbol with
656 an absolute address; the last case defines a symbol whose address is
657 relative to a particular section (@pxref{SECTIONS}).
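
The three positions can be sketched as follows; the symbol names and
addresses are hypothetical:
@example
_abs_1 = 0x4000;
SECTIONS
@{
  _abs_2 = 0x8000;
  .text :
  @{
    *(.text)
    _etext = . ;
  @}
@}
@end example
Here @code{_abs_1} and @code{_abs_2} are absolute symbols, while
@code{_etext} is defined relative to the @code{.text} output section.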
658
When a linker expression is evaluated and assigned to a variable, it is given
either an absolute or a relocatable type. An absolute expression type
is one in which the symbol contains the value that it will have in the
output file; a relocatable expression type is one in which the value
is expressed as a fixed offset from the base of a section.
664
The type of the expression is controlled by its position in the script
file. A symbol assigned within a section definition is
created relative to the base of the section; a symbol assigned in any
other place is created as an absolute symbol. Since a symbol created
within a section definition is relative to the base of the
section, it will remain relocatable if relocatable output is requested.
A symbol may be created with an absolute value even when assigned
within a section definition by using the absolute assignment
function @code{ABSOLUTE}. For example, to create an absolute symbol
whose address is the last byte of the output section @code{.data}:
675 @example
676 .data :
677 @{
678 *(.data)
679 _edata = ABSOLUTE(.) ;
680 @}
681 @end example
682
683 The linker tries to put off the evaluation of an assignment until
684 all the terms in the source expression are known (@pxref{Evaluation}).
685 For instance the sizes of sections cannot be known until after
686 allocation, so assignments dependent upon these are not performed until
687 after allocation. Some expressions, such as those depending upon the
location counter @dfn{dot}, @samp{.}, must be evaluated during
689 allocation. If the result of an expression is required, but the value is
690 not available, then an error results. For example, attempting to use a
691 script like the following
692 @example
693 SECTIONS @{
694 text 9+this_isnt_constant:
695 @{ @dots{}
696 @}
697 @}
698 @end example
699 will get the error message ``@code{Non constant expression for initial
700 address}''.
701
702 @node Builtins,,,
703 @subsection Built in Functions
704 The command language provides built in functions for use in
705 expressions in link scripts.
706 @itemize @bullet
707 @item @code{ALIGN(@var{exp})}
708 returns the result of the current location counter (@code{.}) aligned to
709 the next @var{exp} boundary. @var{exp} must be an expression whose
710 value is a power of two. This is equivalent to @samp{(. + @var{exp} -1)
711 & ~(@var{exp}-1)}. As an example, to align the output @code{.data}
712 section to the next 0x2000 byte boundary after the preceding section and
713 to set a variable within the section to the next 0x8000 boundary after
714 the input sections:
715 @example
716 .data ALIGN(0x2000) :@{
717 *(.data)
718 variable = ALIGN(0x8000);
719 @}
720 @end example
721
722 @item @code{ADDR(@var{section name})}
723 returns the absolute address of the named section. Your script must
724 previously have defined the location of that section. In the following
725 example the @code{symbol_1} and @code{symbol_2} are assigned identical
726 values:
727 @example
728 .output1:
729 @{
start_of_output_1 = .;
731 ...
732 @}
733 .output:
734 @{
735 symbol_1 = ADDR(.output1);
736 symbol_2 = start_of_output_1;
737 @}
738 @end example
739
740 @item @code{SIZEOF(@var{section name})}
741 returns the size in bytes of the named section, if the section has
742 been allocated. In the following example the @code{symbol_1} and
743 @code{symbol_2} are assigned identical values:
744 @example
745 .output @{
746 .start = . ;
747 ...
748 .end = .;
749 @}
750 symbol_1 = .end - .start;
751 symbol_2 = SIZEOF(.output);
752 @end example
753
754 @item @code{DEFINED(@var{symbol name})}
755 Returns 1 if the symbol is in the linker global symbol table and is
756 defined, otherwise it returns 0. For example, this command-file fragment
757 shows how to set a global symbol @code{begin} to the first location in
758 the @code{.text} section---but only if no symbol called @code{begin}
759 existed:
760 @example
761 .text: @{
762 begin = DEFINED(begin) ? begin : . ;
763 ...
764 @}
765 @end example
766 @end itemize
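
As a worked instance of the @code{ALIGN} formula above, assuming for
illustration that the location counter is 0x2345 and @var{exp} is
0x1000:
@example
(0x2345 + 0x1000 - 1) & ~(0x1000 - 1)
  = 0x3344 & ~0xfff
  = 0x3000
@end example
A location counter that is already aligned is left unchanged by the
same formula: starting from 0x4000, it gives @samp{0x4fff & ~0xfff},
which is 0x4000.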
767
768 @node MEMORY,,,
769 @section MEMORY Command
770 The linker's default configuration permits allocation of all memory.
771 You can override this by using the @code{MEMORY} command. The
772 @code{MEMORY} command describes the location and size of blocks of
773 memory in the target. By using it carefully, you can describe which
774 memory regions may be used by the linker, and which memory regions it
775 must avoid. The linker does not shuffle sections to fit into the
776 available regions, but does move the requested sections into the correct
777 regions and issue errors when the regions become too full.
778
779 Command files may contain at most one use of the @code{MEMORY}
780 command; however, you can define as many blocks of memory within it as
781 you wish. The syntax is:
782
783 @example
784 MEMORY
785 @{
786 @var{name} (@var{attr}): ORIGIN = @var{origin}, LENGTH = @var{len}
787 .
788 .
789 .
790 @}
791 @end example
792 @table @code
793 @item @var{name}
794 is a name used internally by the linker to refer to the region. Any
795 symbol name may be used. The region names are stored in a separate
796 name space, and will not conflict with symbols, filenames or section
797 names. Use distinct names to specify multiple regions.
798 @item (@var{attr})
799 is an optional list of attributes, parsed for compatibility with the
800 AT@&T linker but ignored by both the AT@&T and the GNU linker.
801 Valid attribute lists must be made up of the characters ``@code{RWXL}''.
802 If you omit the attribute list, you may omit the parentheses around it
803 as well.
804 @item @var{origin}
805 is the start address of the region in physical memory. It is expressed as
806 an expression, which must evaluate to a constant before
807 memory allocation is performed. The keyword @code{ORIGIN} may be
808 abbreviated to @code{org} or @code{o}.
809 @item @var{len}
810 is the size in bytes of the region (an expression).
811 The keyword @code{LENGTH} may be abbreviated to @code{len} or @code{l}.
812 @end table
813
814 For example, to specify that memory has two regions available for
815 allocation, one starting at 0 for 256 kilobytes and the other starting at
816 0x40000000 for four megabytes:
817
818 @example
819 MEMORY
820 @{
821 rom : ORIGIN= 0, LENGTH = 256K
822 ram : org= 0x40000000, l = 4M
823 @}
824 @end example
825
826 Once you have defined a region of memory named @var{mem}, you can direct
827 specific output sections there by using a command ending in @samp{>@var{mem}}
828 within the @code{SECTIONS} command. If the combined output
829 sections directed to a region are too big for the region, the linker will
830 issue an error message.
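
As a minimal sketch of this mechanism, the following command file
reuses the @code{rom} and @code{ram} regions from the example above and
directs three output sections into them. The choice of which section
goes to which region is an assumption made for illustration only.

@example
MEMORY
@{
rom : ORIGIN = 0, LENGTH = 256K
ram : ORIGIN = 0x40000000, LENGTH = 4M
@}

SECTIONS
@{
/* illustrative placement: code in rom, data in ram */
.text : @{ *(.text) @} >rom
.data : @{ *(.data) @} >ram
.bss : @{ *(.bss) [COMMON] @} >ram
@}
@end example

With a layout like this, the linker should report an error if the
combined @code{.data} and @code{.bss} contents exceed the four megabytes
declared for @code{ram}.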
831
832 @node SECTIONS,,,
833 @section SECTIONS Command
834 The @code{SECTIONS} command controls how input sections are mapped to
835 output sections: which output section each input section is allocated
836 to, and in what order.
837
838 You may use at most one @code{SECTIONS} command in a command file,
839 but you can have as many statements within it as you wish. Statements
840 within the @code{SECTIONS} command can do one of three things:
841 @itemize @bullet
842 @item
843 define the entry point;
844 @item
845 assign a value to a symbol;
846 @item
847 describe the placement of a named output section, and what input
848 sections make it up.
849 @end itemize
850
851 The first two possibilities---defining the entry point, and defining
852 symbols---can also be done outside the @samp{SECTIONS} command:
853 @pxref{Entry Point}, @pxref{Assignment}. They are permitted here as
854 well for your convenience in reading the script, so that symbols or the
855 entry point can be defined at meaningful points in your output-file
856 layout.
857
858 When no @code{SECTIONS} command is specified, the default action
859 of the linker is to place each input section into an identically named
860 output section in the order that the sections are first encountered in
861 the input files; if all input sections are present in the first file,
862 for example, the order of sections in the output file will match the
863 order in the first input file.
864
865 @node Section Definition,,,
866 @subsection Section Definitions
867 The most frequently used statement in the @code{SECTIONS} command is
868 the @dfn{section definition}, which you can use to specify the
869 properties of an output section: its location, alignment, contents,
870 fill pattern, and target memory region can all be specified. Most of
871 these specifications are optional; the simplest form of a section
872 definition is
873 @example
874 SECTIONS
875 @{
876 .
877 .
878 .
879 @var{secname} : @{
880 @var{contents}
881 @}
882 .
883 .
884 .
885 @}
886 @end example
887 @noindent
888 @var{secname} is the name of the output section, and @var{contents} a
889 specification of what goes there---for example a list of input files or
890 sections of input files. As you might assume, the whitespace shown is
891 optional; you do need the colon @samp{:} and the braces @samp{@{@}},
892 however.
893
894 @var{secname} must meet the constraints of your output format. In
895 formats which only support a limited number of sections, such as
896 @code{a.out}, the name must be one of the names supported by the format
897 (in the case of @code{a.out}, @code{.text}, @code{.data} or @code{.bss}). If
898 the output format supports any number of sections, but with numbers and
899 not names (in the case of IEEE), the name should be supplied as a quoted
900 numeric string. A section name may consist of any sequence of characters,
901 but any name which does not conform to the standard @code{gld} symbol
902 name syntax must be quoted.
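
For instance, a sketch of section definitions for an output format that
identifies sections by number might look like the following. The
section numbers @code{1} and @code{2} are hypothetical; which numbers
are meaningful depends on your output format.

@example
SECTIONS
@{
/* hypothetical section numbers, quoted because they are not symbol names */
"1" : @{ *(.text) @}
"2" : @{ *(.data) @}
@}
@end example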
903
904 @node Section Contents,,,
905 @subsection Section Contents
906 In a section definition, you can specify the contents of an output section by
907 listing particular object files; by listing particular input-file
908 sections; or a combination of the two. You can also place arbitrary
909 data in the section, and define symbols relative to the beginning of the
910 section.
911
912 The @var{contents} of a section definition may include any of the
913 following kinds of statement. You can include as many of these as you
914 like in a single section definition, separated from one another by
915 whitespace.
916
917 @table @code
918 @item @var{filename}( @var{section} )
919 @itemx @var{filename}( @var{section}, @var{section}, @dots{} )
920 @itemx @var{filename}( @var{section} @var{section} @dots{} )
921 You can name one or more sections from your input files, for
922 insertion in the current output section. If you wish to specify a list
923 of input-file sections inside the parentheses, you may separate the
924 section names by either commas or whitespace.
925
926 @item @var{filename}
927 You may simply name a particular input file to be placed in the current
928 output section; @emph{all} sections from that file are placed in
929 the current section definition. Since multiple statements may be
930 present in the contents of a section definition, you can specify a list
931 of particular files by name:
932 @example
933 .data: @{ afile.o bfile.o cfile.o @}
934 @end example
935
936 If the file name has already been mentioned in another section
937 definition, with an explicit section name list, then only those sections
938 which have not yet been allocated are used.
939
940 @item * (@var{section})
941 @itemx * (@var{section}, @var{section}, @dots{})
942 @itemx * (@var{section} @var{section} @dots{})
943 Instead of explicitly naming particular input files in a link control
944 script, you can refer to @emph{all} files from the @code{gld} command
945 line: use @samp{*} instead of a particular filename before the
946 parenthesized input-file section list.
947
948 For example, to copy sections @code{1} through @code{4} from an Oasys file
949 into the @code{.text} section of an @code{a.out} file, and sections @code{13}
950 and @code{14} into the @code{.data} section:
951 @example
952 SECTIONS @{
953 .text :@{
954 *("1" "2" "3" "4")
955 @}
956
957 .data :@{
958 *("13" "14")
959 @}
960 @}
961 @end example
962
963 If you have already explicitly included some files by name, @samp{*}
964 refers to all @emph{remaining} files---those whose places in the output
965 file have not yet been defined.
966
967 @item [ @var{section} ]
968 @itemx [ @var{section}, @var{section}, @dots{} ]
969 @itemx [ @var{section} @var{section} @dots{} ]
970 This is an alternate notation to specify named sections from all
971 unallocated input files; its effect is exactly the same as that of
972 @samp{* (@var{section}@dots{})}
973
974 @item @var{filename}@code{( COMMON )}
975 @itemx [ COMMON ]
976 Specify where in your output file to place uninitialized data
977 with this notation. @code{[COMMON]} by itself refers to all
978 uninitialized data from all input files (so far as it is not yet
979 allocated); @var{filename}@code{(COMMON)} refers to uninitialized data
980 from a particular file. Both are special cases of the general
981 mechanisms for specifying where to place input-file sections:
982 @code{gld} permits you to refer to uninitialized data as if it
983 were in an input-file section named @code{COMMON}, regardless of the
984 input file's format.
985 @end table
986
987 For example, the following command script arranges its output file into
988 three consecutive sections, named @code{.text}, @code{.data}, and
989 @code{.bss}, taking the input for each from the correspondingly named
990 sections of all the input files:
991 @example
992 SECTIONS
993 @{
994 .text: @{ *(.text) @}
995 .data: @{ *(.data) @}
996 .bss: @{ *(.bss) [COMMON] @}
997 @}
998 @end example
999
1000 The following example reads all of the sections from file @code{all.o}
1001 and places them at the start of output section @code{outputa} which
1002 starts at location @code{0x10000}. All of section @code{.input1} from
1003 file @code{foo.o} follows immediately, in the same output section. All
1004 of section @code{.input2} from @code{foo.o} goes into output section
1005 @code{outputb}, followed by section @code{.input1} from @code{foo1.o}.
1006 All of the remaining @code{.input1} and @code{.input2} sections from any
1007 files are written to output section @code{outputc}.
1008
1009 @example
1010 SECTIONS
1011 @{
1012 outputa 0x10000 :
1013 @{
1014 all.o
1015 foo.o (.input1)
1016 @}
1017 outputb :
1018 @{
1019 foo.o (.input2)
1020 foo1.o (.input1)
1021 @}
1022 outputc :
1023 @{
1024 *(.input1)
1025 *(.input2)
1026 @}
1027 @}
1028 @end example
1029
1030 There are still more kinds of statements permitted in the contents of
1031 output section definitions! The foregoing statements permitted you to
1032 arrange, in your output file, data originating from your input files.
1033 You can also place data directly in an output section from the link
1034 command script. Most of these additional statements involve
1035 expressions; @pxref{Expressions}. Although these statements are shown
1036 separately here for ease of presentation, no such segregation is needed
1037 within a section definition in the @code{SECTIONS} command; you can
1038 intermix them freely with any of the statements we've just described.
1039
1040 @table @code
1041 @item CREATE_OBJECT_SYMBOLS
1042 instructs the linker to create a symbol for each input file and place it
1043 into the current section, set to the address of the first byte of
1044 data written from the input file. For instance, with @code{a.out}
1045 files it is conventional to have a symbol for each input file. You can
1046 accomplish this by defining the output @code{.text} section as follows:
1047 @example
1048 SECTIONS @{
1049 .text 0x2020 :
1050 @{
1051 CREATE_OBJECT_SYMBOLS
1052 *(.text)
1053 _etext = ALIGN(0x2000);
1054 @}
1055 .
1056 .
1057 .
1058 @}
1059 @end example
1060
1061 If @code{sample} is a file containing this script, and @code{a.o},
1062 @code{b.o}, @code{c.o}, and @code{d.o} are four input files with
1063 contents like the following---
1064 @example
1065 /* a.c */
1066
1067 afunction() @{ @}
1068 int adata=1;
1069 int abss;
1070 @end example
1071
1072 @noindent
1073 @samp{gld -M sample a.o b.o c.o d.o} would create a map like this,
1074 containing symbols matching the object file names:
1075 @example
1076 00000000 A __DYNAMIC
1077 00004020 B _abss
1078 00004000 D _adata
1079 00002020 T _afunction
1080 00004024 B _bbss
1081 00004008 D _bdata
1082 00002038 T _bfunction
1083 00004028 B _cbss
1084 00004010 D _cdata
1085 00002050 T _cfunction
1086 0000402c B _dbss
1087 00004018 D _ddata
1088 00002068 T _dfunction
1089 00004020 D _edata
1090 00004030 B _end
1091 00004000 T _etext
1092 00002020 t a.o
1093 00002038 t b.o
1094 00002050 t c.o
1095 00002068 t d.o
1096 @end example
1097
1098 @item FORCE_COMMON_ALLOCATION
1099 @c FIXME! I don't know what this does.
1100
1101 @item @var{symbol} = @var{expression} ;
1102 @itemx @var{symbol} @var{f}= @var{expression} ;
1103 @var{symbol} is any symbol name (@pxref{Symbols}). When you assign a
1104 value to a symbol within a particular section definition, the value is
1105 relative to the beginning of the section (@pxref{Assignment}). If you write
1106 @example
1107 SECTIONS
1108 @{
1109 abs = 14 ;
1110 .
1111 .
1112 .
1113 .data: @{ @dots{} rel = 14 ; @dots{} @}
1114 abs2 = 14 + ADDR(.data);
1115 .
1116 .
1117 .
1118 @}
1119 @end example
1120 @c FIXME! Try above example!
1121 @noindent
1122 @code{abs} and @code{rel} do not have the same value; @code{rel} has the
1123 same value as @code{abs2}.
1124
1125 ``@var{f}='' here refers to any of the operators @code{&= += -= *=
1126 /=} which combine arithmetic and assignment.
1127
1128 @item BYTE(@var{expression})
1129 @itemx SHORT(@var{expression})
1130 @itemx LONG(@var{expression})
1131 By including one of these three statements in a section definition, you
1132 can explicitly place one, two, or four bytes (respectively) at the
1133 current address of that section. Multiple-byte quantities are
1134 represented in whatever byte order is appropriate for the output file
1135 format (@pxref{BFD}).
1136
1137 @item FILL(@var{expression})
1138 Specifies the ``fill pattern'' for the current section. Any otherwise
1139 unspecified regions of memory within the section (for example, regions
1140 you skip over by assigning a new value to the location counter @samp{.})
1141 are filled with the two least significant bytes from the
1142 @var{expression} argument. A @code{FILL} statement covers memory
1143 locations @emph{after} the point it occurs in the section definition; by
1144 including more than one @code{FILL} statement, you can have different
1145 fill patterns in different parts of an output section.
1146 @end table
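
The statements in this table can be intermixed with input-section
specifications. The following sketch, assuming an output section named
@code{.data}, places explicit data after the input sections and then
skips ahead so that a fill pattern applies. The symbols
@code{table_start} and @code{table_end} and the constants are
hypothetical, chosen only for illustration.

@example
SECTIONS
@{
.data :
@{
*(.data)
table_start = . ;   /* hypothetical symbol, relative to .data */
BYTE(1)
SHORT(2)
LONG(0x12345678)
FILL(0xFFFF)        /* applies to gaps after this point */
. += 0x10 ;         /* skip 16 bytes, leaving a gap to be filled */
table_end = . ;     /* hypothetical symbol */
@}
@}
@end example

Here the region between @code{table_start} and @code{table_end} holds
seven bytes of explicit data followed by a sixteen-byte gap, which
should be written with the @code{0xFFFF} pattern.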
1147
1148 @node Section Options,,,
1149 @subsection Optional Section Attributes
1150 Here is the full syntax of a section definition, including all the
1151 optional portions:
1152
1153 @example
1154 SECTIONS
1155 @{
1156 .
1157 .
1158 .
1159 @var{secname} @var{start} BLOCK(@var{align}) : @var{contents} =@var{fill} >@var{region}
1160 .
1161 .
1162 .
1163 @}
1164 @end example
1165
1166 @var{secname} and @var{contents} are required. @xref{Section
1167 Definition}, and @pxref{Section Contents} for the details of
1168 @var{contents}. @var{start}, @code{BLOCK(@var{align})},
1169 @code{=@var{fill}}, and @code{>@var{region}} are all optional.
1170
1171 @table @code
1172 @item @var{start}
1173 You can force the output section to be loaded at a specified address by
1174 specifying @var{start} immediately following the section name.
1175 @var{start} can be represented as any expression. The following
1176 example generates section @code{output} at location
1177 @code{0x40000000}:
1178 @example
1179 SECTIONS @{
1180 .
1181 .
1182 .
1183 output 0x40000000: @{
1184 @dots{}
1185 @}
1186 .
1187 .
1188 .
1189 @}
1190 @end example
1191
1192 @item BLOCK(@var{align})
1193 @c FIXME! Fill in BLOCK(align) description
1194
1195 @item =@var{fill}
1196 You may use any expression to specify @var{fill}. Including
1197 @code{=@var{fill}} in a section definition specifies the initial fill
1198 value for that section. Any unallocated holes in the current output
1199 section when written to the output file will be filled with the two
1200 least significant bytes of the value, repeated as necessary. You can
1201 also change the fill value with a @code{FILL} statement in the
1202 @var{contents} of a section definition.
1203
1204 @item >@var{region}
1205 @c FIXME! Fill in >region description
1206
1207 @end table
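
As a sketch that combines several of these optional parts, following the
syntax shown above: the first definition gives a start address and a
fill value, and the second directs its section to a memory region. The
address @code{0x10000}, the fill value @code{0x4E71}, and the region
name @code{ram} are assumptions for illustration; @code{ram} must have
been defined with the @code{MEMORY} command (@pxref{MEMORY}).

@example
SECTIONS
@{
/* start address and fill value are arbitrary illustrative choices */
.text 0x10000 : @{ *(.text) @} =0x4E71
/* assumes a MEMORY region named ram */
.data : @{ *(.data) @} >ram
@}
@end example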
1208
1209 @node Entry Point,,,
1210 @section The Entry Point
1211 The linker command language includes a command specifically for
1212 defining the first executable instruction in an output file (its
1213 @dfn{entry point}). Its argument is a symbol name:
1214 @example
1215 ENTRY(@var{symbol})
1216 @end example
1217
1218 Like symbol assignments, the @code{ENTRY} command may be placed either
1219 as an independent command in the command file, or among the section
1220 definitions within the @code{SECTIONS} command---whatever makes the most
1221 sense for your layout.
1222
1223 @code{ENTRY} is only one of several ways of choosing the entry point.
1224 You may indicate it in any of the following ways (shown in descending
1225 order of priority: methods higher in the list override methods lower down).
1226 @itemize @bullet
1227 @item
1228 the @code{-e} @var{entry} command-line option;
1229 @item
1230 the @code{ENTRY} @var{symbol} command in a linker control script;
1231 @item
1232 the value of the symbol @code{start}, if present;
1233 @item
1234 the value of the symbol @code{_main}, if present;
1235 @item
1236 the address of the first byte of the @code{.text} section, if present;
1237 @item
1238 the address @code{0}.
1239 @end itemize
1240
1241 For example, you can also generate an entry point with an assignment statement:
1242 if no symbol @code{start} is defined within your input files, you can
1243 simply assign it an appropriate value---
1244 @example
1245 start = 0x2020;
1246 @end example
1247
1248 @noindent
1249 The example shows an absolute address, but you can use any expression.
1250 For example, if your input object files use some other symbol-name
1251 convention for the entry point, you can just assign the value of
1252 whatever symbol contains the start address to @code{start}:
1253 @example
1254 start = other_symbol;
1255 @end example
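
The @code{ENTRY} command itself can be sketched in the same way. In
this fragment the symbol name @code{begin_here} is hypothetical and is
assumed to be defined in one of the input files:

@example
/* begin_here is a hypothetical symbol from the input files */
ENTRY(begin_here)
SECTIONS
@{
.text : @{ *(.text) @}
@}
@end example

Because the @code{-e} command-line option has higher priority, giving
@code{-e} on the command line would still override this choice.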
1256
1257 @node Other Commands,,,
1258 @section Other Commands
1259 The command language includes a number of other commands that you can
1260 use for specialized purposes. They are similar in purpose to
1261 command-line options.
1262
1263 @table @code
1264 @item FLOAT
1265 @itemx NOFLOAT
1266 Declare to the linker whether or not floating point support is
1267 available. The default assumption is @code{NOFLOAT}.
1268 @c FIXME: So what? What does it do once it knows FLOAT or NOFLOAT?
1269
1270 @item HLL ( @var{file}, @var{file}, @dots{} )
1271 @itemx HLL ( @var{file} @var{file} @dots{} )
1272
1273 @item INPUT ( @var{file}, @var{file}, @dots{} )
1274 @itemx INPUT ( @var{file} @var{file} @dots{} )
1275
1276 @item MAP ( @var{name} )
1277
1278 @item OUTPUT ( @var{filename} )
1279
1280 @item SEARCH_DIR ( @var{pathname} )
1281
1282 @item STARTUP ( @var{name} )
1283
1284 @item SYSLIB ( @var{file}, @var{file}, @dots{} )
1285 @itemx SYSLIB ( @var{file} @var{file} @dots{} )
1286
1287 @item TARGET ( @var{format} )
1288
1289 @end table
1290
1291 @node BFD,,,
1292 @chapter BFD
1293
1294 The linker accesses object and archive files using the @code{bfd}
1295 libraries. These libraries allow the linker to use the same routines
1296 to operate on object files whatever the object file format.
1297
1298 A different object file format can be supported simply by creating a
1299 new @code{bfd} back end and adding it to the library.
1300
1301 Formats currently supported:
1302 @itemize @bullet
1303 @item
1304 Sun3 68k @code{a.out}
1305 @item
1306 IEEE-695 68k Object Module Format
1307 @item
1308 Oasys 68k Binary Relocatable Object File Format
1309 @item
1310 Sun4 sparc @code{a.out}
1311 @item
1312 88k bcs coff
1313 @item
1314 i960 coff little endian
1315 @item
1316 i960 coff big endian
1317 @item
1318 i960 @code{b.out} little endian
1319 @item
1320 i960 @code{b.out} big endian
1321 @end itemize
1322
1323 As with most implementations, @code{bfd} is a compromise between
1324 several conflicting requirements. The major factor influencing
1325 @code{bfd} design was efficiency: any time used converting between
1326 formats is time which would not have been spent had @code{bfd} not
1327 been involved. This is partly offset by abstraction payback; since
1328 @code{bfd} simplifies applications and back ends, more time and care
1329 may be spent optimizing algorithms for greater speed.
1330
1331 One minor artifact of the @code{bfd} solution which the
1332 user should be aware of is the potential for information loss.
1333 There are two places where useful information can be lost using the
1334 @code{bfd} mechanism: during conversion and during output. @xref{BFD
1335 information loss}.
1336
1337 @node BFD outline,,,
1338 @section How it works: an outline of BFD
1339 When an object file is opened, @code{bfd} subroutines automatically
1340 determine the format of the input object file, and build a descriptor in
1341 memory with pointers to routines that will be used to access elements of
1342 the object file's data structures.
1343
1344 As different information from the object files is required,
1345 @code{bfd} reads from different sections of the file and processes them.
1346 For example, a very common operation for the linker is processing symbol
1347 tables. Each @code{bfd} back end provides a routine for converting
1348 between the object file's representation of symbols and an internal
1349 canonical format. When the linker asks for the symbol table of an object
1350 file, it calls through the memory pointer to the relevant @code{bfd}
1351 back end routine which reads and converts the table into a canonical
1352 form. The linker then operates upon the common form. When the link is
1353 finished and the linker writes the symbol table of the output file,
1354 another @code{bfd} back end routine is called which takes the newly
1355 created symbol table and converts it into the chosen output format.
1356
1357 @node BFD information loss,,,
1358 @section Information Loss
1359 @emph{Information can be lost during output.} The output formats
1360 supported by @code{bfd} do not provide identical facilities, and
1361 information which may be described in one form has nowhere to go in
1362 another format. One example of this is alignment information in
1363 @code{b.out}. There is nowhere in an @code{a.out} format file to store
1364 alignment information on the contained data, so when a file is linked
1365 from @code{b.out} and an @code{a.out} image is produced, alignment
1366 information will not propagate to the output file. (The linker will
1367 still use the alignment information internally, so the link is performed
1368 correctly).
1369
1370 Another example is COFF section names. COFF files may contain an
1371 unlimited number of sections, each one with a textual section name. If
1372 the target of the link is a format which does not have many sections (e.g.,
1373 @code{a.out}) or has sections without names (e.g., the Oasys format), the
1374 link cannot be done simply. You can circumvent this problem by
1375 describing the desired input-to-output section mapping with the command
1376 language.
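
A sketch of such a mapping, for COFF input files linked into an
@code{a.out} output file, might look like the following. The input
section names @code{.init} and @code{.rdata} are hypothetical; use the
names that actually occur in your input files.

@example
SECTIONS
@{
/* .init and .rdata are hypothetical input section names */
.text : @{ *(.text) *(.init) @}
.data : @{ *(.data) *(.rdata) @}
.bss : @{ *(.bss) [COMMON] @}
@}
@end example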
1377
1378 @emph{Information can be lost during canonicalization.} The @code{bfd} internal
1379 canonical form of the external formats is not exhaustive; there are
1380 structures in input formats for which there is no direct representation
1381 internally. This means that the @code{bfd} back ends cannot maintain
1382 all possible data richness through the transformation from external to
1383 internal and back to external formats.
1384
1385 This limitation is only a problem when using the linker to read one
1386 format and write another. Each @code{bfd} back end is responsible for
1387 maintaining as much data as possible, and the internal @code{bfd}
1388 canonical form has structures which are opaque to the @code{bfd} core,
1389 and exported only to the back ends. When a file is read in one format,
1390 the canonical form is generated for @code{bfd} and the linker. At the
1391 same time, the back end saves away any information which may otherwise
1392 be lost. If the data is then written back to the same back end, the back
1393 end routine will be able to use the canonical form provided by the
1394 @code{bfd} core as well as the information it prepared earlier. Since
1395 there is a great deal of commonality between back ends, this mechanism
1396 is very useful. There is no information lost for this reason when
1397 linking big endian COFF to little endian COFF, or from @code{a.out} to
1398 @code{b.out}. When a mixture of formats is linked, the information is
1399 only lost from the files whose format differs from the destination.
1400
1401 @node Mechanism,,,
1402 @section Mechanism
1403 The greatest potential for loss of information occurs when there is the least
1404 overlap between the information provided by the source format, that
1405 stored by the canonical format, and the information needed by the
1406 destination format. A brief description of the canonical form may help
1407 you appreciate what kinds of data you can count on preserving across
1408 conversions.
1409
1410 @table @emph
1411 @item files
1412 Information on target machine architecture, particular implementation,
1413 and format type is stored on a per-file basis. Other information
1414 includes a demand pageable bit and a write protected bit. Note that
1415 information like Unix magic numbers is not stored here---only the magic
1416 numbers' meaning, so a @code{ZMAGIC} file would have both the demand pageable
1417 bit and the write protected text bit set.
1418
1419 The byte order of the target is stored on a per-file basis, so that
1420 both big- and little-endian object files may be linked with one another.
1421
1422 @item sections
1423 Each section in the input file contains the name of the section, the
1424 original address in the object file, various flags, size and alignment
1425 information and pointers into other @code{bfd} data structures.
1426
1427 @item symbols
1428 Each symbol contains a pointer to the object file which originally
1429 defined it, its name, value and various flag bits. When a symbol table
1430 is read in, all symbols are relocated to make them relative to the base
1431 of the section where they were defined, so that each symbol points to
1432 its containing section. Each symbol also has a varying amount of hidden
1433 data to contain private data for the BFD back end. Since the symbol
1434 points to the original file, the private data format for that symbol is
1435 accessible. @code{gld} can operate on a collection of symbols of wildly
1436 different formats without problems.
1437
1438 Normal global and simple local symbols are maintained on output, so an
1439 output file (no matter its format) will retain symbols pointing to
1440 functions and to global, static, and common variables. Some symbol
1441 information is not worth retaining; in @code{a.out}, type information is
1442 stored in the symbol table as long symbol names. This information would
1443 be useless to most COFF debuggers and may be thrown away with
1444 appropriate command line switches. (The GNU debugger @code{gdb} does
1445 support @code{a.out} style debugging information in COFF).
1446
1447 There is one word of type information within the symbol, so if the
1448 format supports symbol type information within symbols (e.g., COFF,
1449 IEEE, Oasys) and the type is simple enough to fit within one word
1450 (nearly everything but aggregates), the information will be preserved.
1451
1452 @item relocation level
1453 @c FIXME: I don't understand "relocation record" from this so I can't
1454 @c FIXME...improve the explanation to make it clear...
1455 Each canonical relocation record contains a pointer to the symbol to
1456 relocate to, the offset of the data to relocate, the section the data
1457 is in and a pointer to a relocation type descriptor. Relocation is
1458 performed effectively by message passing through the relocation type
1459 descriptor and symbol pointer. It allows relocations to be performed
1460 on output data using a relocation method only available in one of the
1461 input formats. For instance, Oasys provides a byte relocation format.
1462 A relocation record requesting this relocation type would point
1463 indirectly to a routine to perform this, so the relocation may be
1464 performed on a byte being written to a COFF file, even though 68k COFF
1465 has no such relocation type.
1466
1467 @item line numbers
1468 Line numbers have to be relocated along with the symbol information.
1469 Each symbol with an associated list of line number records points to the
1470 first record of the list. The head of a line number list consists of a
1471 pointer to the symbol, which allows determination of the address of the
1472 function whose line number is being described. The rest of the list is
1473 made up of pairs: offsets into the section and line numbers. Any format
1474 which can simply derive this information can pass it successfully
1475 between formats (COFF, IEEE and Oasys).
1476 @end table
1477
1478 @contents
1479 @bye
1480
1481