1 \input texinfo
2 @c @tex
3 @c \special{twoside}
4 @c @end tex
5 _if__(_ALL_ARCH__)
6 @setfilename as.info
7 _fi__(_ALL_ARCH__)
8 _if__(_M680X0__ && !_ALL_ARCH__)
9 @setfilename as-m680x0.info
10 _fi__(_M680X0__ && !_ALL_ARCH__)
11 _if__(_AMD29K__ && !_ALL_ARCH__)
12 @setfilename as-29k.info
13 _fi__(_AMD29K__ && !_ALL_ARCH__)
14 @c
15 @c NOTE: this manual is marked up for preprocessing with a collection
16 @c of m4 macros called "pretex.m4". If you see <_if__> and <_fi__>
17 @c scattered around the source, you have the full source before
18 @c preprocessing; if you don't, you have the source configured for some
19 @c particular architecture (and you can of course get the full source,
20 @c with all configurations, from wherever you got this). The full
21 @c source needs to be run through m4 before either tex- or info-
22 @c formatting: for example,
23 @c m4 pretex.m4 none.m4 m680x0.m4 as.texinfo >as-680x0.texinfo
24 @c will produce (assuming your path finds either GNU or SysV m4;
25 @c Berkeley won't do) a file suitable for formatting.
26 @c See the text in "pretex.m4" for a fuller explanation (and the macro
27 @c definitions).
28 @c
29 @synindex ky cp
30 @ifinfo
31 This file documents the GNU Assembler "as".
32
33 Copyright (C) 1991 Free Software Foundation, Inc.
34
35 Permission is granted to make and distribute verbatim copies of
36 this manual provided the copyright notice and this permission notice
37 are preserved on all copies.
38
39 @ignore
40 Permission is granted to process this file through Tex and print the
41 results, provided the printed document carries copying permission
42 notice identical to this one except for the removal of this paragraph
43 (this paragraph not being relevant to the printed manual).
44
45 @end ignore
46 Permission is granted to copy and distribute modified versions of this
47 manual under the conditions for verbatim copying, provided also that the
48 section entitled ``GNU General Public License'' is included exactly as
49 in the original, and provided that the entire resulting derived work is
50 distributed under the terms of a permission notice identical to this
51 one.
52
53 Permission is granted to copy and distribute translations of this manual
54 into another language, under the above conditions for modified versions,
55 except that the section entitled ``GNU General Public License'' may be
56 included in a translation approved by the author instead of in the
57 original English.
58 @end ifinfo
59 @tex
60 @finalout
61 @end tex
62 @smallbook
63 @setchapternewpage odd
64 _if__(_M680X0__)
65 @settitle Using GNU as (680x0)
66 _fi__(_M680X0__)
67 _if__(_AMD29K__)
68 @settitle Using GNU as (AMD 29K)
69 _fi__(_AMD29K__)
70 @titlepage
71 @title{Using GNU as}
72 @subtitle{The GNU Assembler}
73 _if__(_M680X0__)
74 @subtitle{for Motorola 680x0}
75 _fi__(_M680X0__)
76 _if__(_AMD29K__)
77 @subtitle{for the AMD 29K family}
78 _fi__(_AMD29K__)
79 @sp 1
80 @subtitle February 1991
81 @sp 13
82 The Free Software Foundation Inc. thanks The Nice Computer
83 Company of Australia for loaning Dean Elsner to write the
84 first (Vax) version of @code{as} for Project GNU.
85 The proprietors, management and staff of TNCCA thank FSF for
86 distracting the boss while they got some work
87 done.
88 @sp 3
89 @author{Dean Elsner, Jay Fenlason & friends}
90 @author{revised by Roland Pesch for Cygnus Support}
91 @c pesch@cygnus.com
92 @page
93 @tex
94 \def\$#1${{#1}} % Kluge: collect RCS revision info without $...$
95 \xdef\manvers{\$Revision$} % For use in headers, footers too
96 {\parskip=0pt
97 \hfill Cygnus Support\par
98 \hfill \manvers\par
99 \hfill \TeX{}info \texinfoversion\par
100 }
101 %"boxit" macro for figures:
102 %Modified from Knuth's ``boxit'' macro from TeXbook (answer to exercise 21.3)
103 \gdef\boxit#1#2{\vbox{\hrule\hbox{\vrule\kern3pt
104 \vbox{\parindent=0pt\parskip=0pt\hsize=#1\kern3pt\strut\hfil
105 #2\hfil\strut\kern3pt}\kern3pt\vrule}\hrule}}%box with visible outline
106 \gdef\ibox#1#2{\hbox to #1{#2\hfil}\kern8pt}% invisible box
107 @end tex
108
109 @vskip 0pt plus 1filll
110 Copyright @copyright{} 1991 Free Software Foundation, Inc.
111
112 Permission is granted to make and distribute verbatim copies of
113 this manual provided the copyright notice and this permission notice
114 are preserved on all copies.
115
116 Permission is granted to copy and distribute modified versions of this
117 manual under the conditions for verbatim copying, provided also that the
118 section entitled ``GNU General Public License'' is included exactly as
119 in the original, and provided that the entire resulting derived work is
120 distributed under the terms of a permission notice identical to this
121 one.
122
123 Permission is granted to copy and distribute translations of this manual
124 into another language, under the above conditions for modified versions,
125 except that the section entitled ``GNU General Public License'' may be
126 included in a translation approved by the author instead of in the
127 original English.
128 @end titlepage
129 @page
130
131 @node Top, Overview, (dir), (dir)
132
133 @menu
134 * Overview:: Overview
135 * Syntax:: Syntax
136 * Segments:: Segments and Relocation
137 * Symbols:: Symbols
138 * Expressions:: Expressions
139 * Pseudo Ops:: Assembler Directives
140 * Maintenance:: Maintaining the Assembler
141 * Retargeting:: Teaching the Assembler about a New Machine
142 * License:: GNU GENERAL PUBLIC LICENSE
143
144 --- The Detailed Node Listing ---
145
146 Overview
147
148 * Invoking:: Invoking @code{as}
149 * Manual:: Structure of this Manual
150 * GNU Assembler:: as, the GNU Assembler
151 * Command Line:: Command Line
152 * Input Files:: Input Files
153 * Object:: Output (Object) File
154 * Errors:: Error and Warning Messages
155 * Options:: Options
156
157 Input Files
158
159 * Filenames:: Input Filenames and Line-numbers
160
161 Syntax
162
163 * Pre-processing:: Pre-processing
164 * Whitespace:: Whitespace
165 * Comments:: Comments
166 * Symbol Intro:: Symbols
167 * Statements:: Statements
168 * Constants:: Constants
169
170 Constants
171
172 * Characters:: Character Constants
173 * Numbers:: Number Constants
174
175 Character Constants
176
177 * Strings:: Strings
178 * Chars:: Characters
179
180 Segments and Relocation
181
182 * Segs Background:: Background
183 * ld Segments:: ld Segments
184 * as Segments:: as Internal Segments
185 * Sub-Segments:: Sub-Segments
186 * bss:: bss Segment
187
195 Symbols
196
197 * Labels:: Labels
198 * Setting Symbols:: Giving Symbols Other Values
199 * Symbol Names:: Symbol Names
200 * Dot:: The Special Dot Symbol
201 * Symbol Attributes:: Symbol Attributes
202
203 Symbol Names
204
205 * Local Symbols:: Local Symbol Names
206
207 Symbol Attributes
208
209 * Symbol Value:: Value
210 * Symbol Type:: Type
211 * Symbol Desc:: Descriptor
212 * Symbol Other:: Other
213
214 Expressions
215
216 * Empty Exprs:: Empty Expressions
217 * Integer Exprs:: Integer Expressions
218
219 Integer Expressions
220
221 * Arguments:: Arguments
222 * Operators:: Operators
223 * Prefix Ops:: Prefix Operators
224 * Infix Ops:: Infix Operators
225
226 Assembler Directives
227
228 * Abort:: The Abort directive causes as to abort
229 * Align:: Pad the location counter to a power of 2
230 * App-File:: Set the logical file name
231 * Ascii:: Fill memory with bytes of ASCII characters
232 * Asciz:: Fill memory with bytes of ASCII characters followed
233 by a null.
234 * Byte:: Fill memory with 8-bit integers
235 * Comm:: Reserve public space in the BSS segment
236 * Data:: Change to the data segment
237 * Desc:: Set the n_desc of a symbol
238 * Double:: Fill memory with double-precision floating-point numbers
239 * Else:: @code{.else}
240 * End:: @code{.end}
241 * Endif:: @code{.endif}
242 * Equ:: @code{.equ @var{symbol}, @var{expression}}
243 * Extern:: @code{.extern}
244 * Fill:: Fill memory with repeated values
245 * Float:: Fill memory with single-precision floating-point numbers
246 * Global:: Make a symbol visible to the linker
247 * Ident:: @code{.ident}
248 * If:: @code{.if @var{absolute expression}}
249 * Include:: @code{.include "@var{file}"}
250 * Int:: Fill memory with 32-bit integers
251 * Lcomm:: Reserve private space in the BSS segment
252 * Line:: Set the logical line number
253 * Ln:: @code{.ln @var{line-number}}
254 * List:: @code{.list}, @code{.nolist}, @code{.eject}, @code{.lflags}, @code{.title}, @code{.sbttl}
255 * Long:: Fill memory with 32-bit integers
256 * Lsym:: Create a local symbol
257 * Octa:: Fill memory with 128-bit integers
258 * Org:: Change the location counter
259 * Quad:: Fill memory with 64-bit integers
260 * Set:: Set the value of a symbol
261 * Short:: Fill memory with 16-bit integers
262 * Single:: @code{.single @var{flonums}}
263 * Stab:: Store debugging information
264 * Text:: Change to the text segment
265 * Word:: Fill memory with 32-bit integers
266 * Deprecated:: Deprecated Directives
267 * Machine Options:: Options
268 * Machine Syntax:: Syntax
269 * Floating Point:: Floating Point
270 * Machine Directives:: Machine Directives
271 * Opcodes:: Opcodes
272
273 Machine Directives
274
275 * block:: @code{.block @var{size} , @var{fill}}
276 * cputype:: @code{.cputype}
277 * file:: @code{.file}
278 * hword:: @code{.hword @var{expressions}}
279 * line:: @code{.line}
280 * reg:: @code{.reg @var{symbol}, @var{expression}}
281 * sect:: @code{.sect}
282 * use:: @code{.use @var{segment name}}
283 @end menu
284
285 @node Overview, Syntax, Top, Top
286 @chapter Overview
287
288 This manual is a user guide to the GNU assembler @code{as}.
289 _if__(_M680X0__)
290 This version of the manual describes @code{as} configured to generate
291 code for Motorola 680x0 architectures.
292 _fi__(_M680X0__)
293 _if__(_AMD29K__)
294 This version of the manual describes @code{as} configured to generate
295 code for Advanced Micro Devices' 29K architectures.
296 _fi__(_AMD29K__)
297
298 @menu
299 * Invoking:: Invoking @code{as}
300 * Manual:: Structure of this Manual
301 * GNU Assembler:: as, the GNU Assembler
302 * Command Line:: Command Line
303 * Input Files:: Input Files
304 * Object:: Output (Object) File
305 * Errors:: Error and Warning Messages
306 * Options:: Options
307 @end menu
308
309 @node Invoking, Manual, Overview, Overview
310 @section Invoking @code{as}
311
312 Here is a brief summary of how to invoke GNU @code{as}. For details,
313 @pxref{Options}.
314
315 @c We don't use @deffn and friends for the following because they seem
316 @c to be limited to one line for the header.
317 @example
318 as [ -D ] [ -f ] [ -I @var{path} ] [ -k ] [ -L ] [ -o @var{objfile} ] [ -R ] [ -v ] [ -W ]
319 _if__(_M680X0__)
320 [ -l ] [ -mc68000 | -mc68010 | -mc68020 ]
321 _fi__(_M680X0__)
322 _if__(_AMD29K__)
323 @c am29k has no machine-dependent assembler options
324 _fi__(_AMD29K__)
325 [ -- | @var{files} @dots{} ]
326 @end example
327
328 @table @code
329
330 @item -D
331 This option is accepted only for script compatibility with calls to
332 other assemblers; it has no effect on GNU @code{as}.
333
334 @item -f
335 ``fast''---skip preprocessing (assume source is compiler output)
336
337 @item -I @var{path}
338 Add @var{path} to the search list for @code{.include} directives
339
340 @item -k
341 _if__(_AMD29K__)
342 This option is accepted but has no effect on the 29K family.
343 _fi__(_AMD29K__)
344 _if__(!_AMD29K__)
345 Issue warnings when difference tables are altered for long displacements
346 _fi__(!_AMD29K__)
347
348 @item -L
349 Keep (in symbol table) local symbols, starting with @samp{L}
350
351 @item -o @var{objfile}
352 Name the object-file output from @code{as}
353
354 @item -R
355 Fold data segment into text segment
356
357 @item -W
358 Suppress warning messages
359
360 _if__(_M680X0__)
361 @item -l
362 Shorten references to undefined symbols, to one word instead of two
363
364 @item -mc68000 | -mc68010 | -mc68020
365 Specify what processor in the 68000 family is the target (default 68020)
366 _fi__(_M680X0__)
367
368 @item -- | @var{files} @dots{}
369 Source files to assemble, or standard input
370 @end table
371
372 @node Manual, GNU Assembler, Invoking, Overview
373 @section Structure of this Manual
374 This document is intended to describe what you need to know to use GNU
375 @code{as}. We cover the syntax expected in source files, including
376 notation for symbols, constants, and expressions; the directives that
377 @code{as} understands; and of course how to invoke @code{as}.
378
379 _if__(_M680X0__ && !_ALL_ARCH__)
380 We also cover special features in the 68000 configuration of @code{as},
381 including pseudo-operations.
382 _fi__(_M680X0__ && !_ALL_ARCH__)
383 _if__(_AMD29K__ && !_ALL_ARCH__)
384 We also cover special features in the AMD 29K configuration of @code{as},
385 including assembler directives.
386 _fi__(_AMD29K__ && !_ALL_ARCH__)
387
388 _if__(_ALL_ARCH__)
389 This document also describes some of the machine-dependent features of
390 various flavors of the assembler.
391 _fi__(_ALL_ARCH__)
392 _if__(_INTERNALS__)
393 This document also describes how the assembler works internally, and
394 provides some information that may be useful to people attempting to
395 port the assembler to another machine.
396 _fi__(_INTERNALS__)
397
398 On the other hand, this manual is @emph{not} intended as an introduction
399 to programming in assembly language---let alone programming in general!
400 In a similar vein, we make no attempt to introduce the machine
401 architecture; we do @emph{not} describe the instruction set, standard
402 mnemonics, registers or addressing modes that are standard to a
403 particular architecture. You may want to consult the manufacturer's
404 machine architecture manual for this information.
405
406
407 @c I think this is premature---pesch@cygnus.com, 17jan1991
408 @ignore
409 Throughout this document, we assume that you are running @dfn{GNU},
410 the portable operating system from the @dfn{Free Software
411 Foundation, Inc.}. This restricts our attention to certain kinds of
412 computer (in particular, the kinds of computers that GNU can run on);
413 once this assumption is granted examples and definitions need less
414 qualification.
415
416 @code{as} is part of a team of programs that turn a high-level
417 human-readable series of instructions into a low-level
418 computer-readable series of instructions. Different versions of
419 @code{as} are used for different kinds of computer.
420 @end ignore
421
422 @c There used to be a section "Terminology" here, which defined
423 @c "contents", "byte", "word", and "long". Defining "word" to any
424 @c particular size is confusing when the .word directive may generate 16
425 @c bits on one machine and 32 bits on another; in general, for the user
426 @c version of this manual, none of these terms seem essential to define.
427 @c They were used very little even in the former draft of the manual;
428 @c this draft makes an effort to avoid them (except in names of
429 @c directives).
430
431 @node GNU Assembler, Command Line, Manual, Overview
432 @section as, the GNU Assembler
433 @code{as} is primarily intended to assemble the output of the GNU C
434 compiler @code{gcc} for use by the linker @code{ld}. Nevertheless,
435 we've tried to make @code{as} assemble correctly everything that the native
436 assembler would.
437 _if__(_VAX__)
438 Any exceptions are documented explicitly (@pxref{Machine Dependent}).
439 _fi__(_VAX__)
440 This doesn't mean @code{as} always uses the same syntax as another
441 assembler for the same architecture; for example, we know of several
442 incompatible versions of 680x0 assembly language syntax.
443
444 GNU @code{as} is really a family of assemblers. If you use (or have
445 used) GNU @code{as} on another architecture, you should find a fairly
446 similar environment. Each version has much in common with the others,
447 including object file formats, most assembler directives (often called
448 @dfn{pseudo-ops}) and assembler syntax.
449
450 Unlike older assemblers, @code{as} is designed to assemble a source
451 program in one pass of the source file. This has a subtle impact on the
452 @code{.org} directive (@pxref{Org}).
453
454 @node Command Line, Input Files, GNU Assembler, Overview
455 @section Command Line
456
457 After the program name @code{as}, the command line may contain
458 options and file names. Options may be in any order, and may be
459 before, after, or between file names. The order of file names is
460 significant.
461
462 @samp{--} (two hyphens) by itself names the standard input file
463 explicitly, as one of the files for @code{as} to assemble.
464
465 Except for @samp{--}, any command line argument that begins with a
466 hyphen (@samp{-}) is an option. Each option changes the behavior of
467 @code{as}. No option changes the way another option works. An
468 option is a @samp{-} followed by one or more letters; the case of
469 the letter is important. All options are optional.
470
471 Some options expect exactly one file name to follow them. The file
472 name may either immediately follow the option's letter (compatible
473 with older assemblers) or it may be the next command argument (GNU
474 standard). These two command lines are equivalent:
475
476 @example
477 as -o my-object-file.o mumble
478 as -omy-object-file.o mumble
479 @end example
480
481 @node Input Files, Object, Command Line, Overview
482 @section Input Files
483
484 We use the phrase @dfn{source program}, abbreviated @dfn{source}, to
485 describe the program input to one run of @code{as}. The program may
486 be in one or more files; how the source is partitioned into files
487 doesn't change the meaning of the source.
488
489 @c I added "con" prefix to "catenation" just to prove I can overcome my
490 @c APL training... pesch@cygnus.com
491 The source program is a concatenation of the text in all the files, in the
492 order specified.
493
494 Each time you run @code{as} it assembles exactly one source
495 program. The source program is made up of one or more files.
496 (The standard input is also a file.)
497
498 You give @code{as} a command line that has zero or more input file
499 names. The input files are read (from left file name to right). A
500 command line argument (in any position) that has no special meaning
501 is taken to be an input file name.
502
503 If @code{as} is given no file names it attempts to read one input file
504 from @code{as}'s standard input, which is normally your terminal. You
505 may have to type @key{ctl-D} to tell @code{as} there is no more program
506 to assemble.
507
508 Use @samp{--} if you need to explicitly name the standard input file
509 in your command line.
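
For instance, assuming two hypothetical source files @file{first.s} and
@file{last.s}, the following command assembles @file{first.s}, then
whatever arrives on the standard input, then @file{last.s}, as a single
source program:

@example
@c first.s, last.s and prog.o are hypothetical names, for illustration only
as -o prog.o first.s -- last.s
@end example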
510
511 If the source is empty, @code{as} will produce a small, empty object
512 file.
513
514 @menu
515 * Filenames:: Input Filenames and Line-numbers
516 @end menu
517
518 @node Filenames, , Input Files, Input Files
519 @subsection Input Filenames and Line-numbers
520 There are two ways of locating a line in the input file (or files) and both
521 are used in reporting error messages. One way refers to a line
522 number in a physical file; the other refers to a line number in a
523 ``logical'' file.
524
525 @dfn{Physical files} are those files named in the command line given
526 to @code{as}.
527
528 @dfn{Logical files} are simply names declared explicitly by assembler
529 directives; they bear no relation to physical files. Logical file names
530 help error messages reflect the original source file, when @code{as}
531 source is itself synthesized from other files. @xref{App-File}.
532
533 @node Object, Errors, Input Files, Overview
534 @section Output (Object) File
535 Every time you run @code{as} it produces an output file, which is
536 your assembly language program translated into numbers. This file
537 is the object file, named @file{a.out} unless you tell @code{as} to
538 give it another name by using the @code{-o} option. Conventionally,
539 object file names end with @file{.o}. The default name of
540 @file{a.out} is used for historical reasons: older assemblers were
541 capable of assembling self-contained programs directly into a
542 runnable program.
543 @c This may still work, but hasn't been tested.
544
545 The object file is meant for input to the linker @code{ld}. It contains
546 assembled program code, information to help @code{ld} integrate
547 the assembled program into a runnable file, and (optionally) symbolic
548 information for the debugger.
549
550 @comment link above to some info file(s) like the description of a.out.
551 @comment don't forget to describe GNU info as well as Unix lossage.
552
553 @node Errors, Options, Object, Overview
554 @section Error and Warning Messages
555
556 @code{as} may write warnings and error messages to the standard error
557 file (usually your terminal). This should not happen when @code{as} is
558 run automatically by a compiler. Warnings report an assumption made so
559 that @code{as} could keep assembling a flawed program; errors report a
560 grave problem that stops the assembly.
561
562 Warning messages have the format
563 @example
564 file_name:@b{NNN}:Warning Message Text
565 @end example
566 @noindent (where @b{NNN} is a line number). If a logical file name has
567 been given (@pxref{App-File}) it is used for the filename, otherwise the
568 name of the current input file is used. If a logical line number was
569 given
570 _if__(!_AMD29K__)
571 (@pxref{Line})
572 _fi__(!_AMD29K__)
573 _if__(_AMD29K__)
574 (@pxref{Ln})
575 _fi__(_AMD29K__)
576 then it is used to calculate the number printed,
577 otherwise the actual line in the current source file is printed. The
578 message text is intended to be self-explanatory (in the grand Unix
579 tradition). @refill
580
581 Error messages have the format
582 @example
583 file_name:@b{NNN}:FATAL:Error Message Text
584 @end example
585 The file name and line number are derived as for warning
586 messages. The actual message text may be rather less explanatory
587 because many of the errors it reports aren't supposed to happen.
588
589 @group
590 @node Options, , Errors, Overview
591 @section Options
592 @subsection @code{-D}
593 This option has no effect whatsoever, but it is accepted to make it more
594 likely that scripts written for other assemblers will also work with
595 GNU @code{as}.
596 @end group
597
598 @subsection Work Faster: @code{-f}
599 @samp{-f} should only be used when assembling programs written by a
600 (trusted) compiler. @samp{-f} stops the assembler from pre-processing
601 the input file(s) before assembling them.
602 @quotation
603 @emph{Warning:} if the files actually need to be pre-processed (if they
604 contain comments, for example), @code{as} will not work correctly if
605 @samp{-f} is used.
606 @end quotation
607
608 @subsection Add to @code{.include} search path: @code{-I} @var{path}
609 Use this option to add a @var{path} to the list of directories GNU
610 @code{as} will search for files specified in @code{.include} directives
611 (@pxref{Include}). You may use @code{-I} as many times as necessary to
612 include a variety of paths. The current working directory is always
613 searched first; after that, @code{as} searches any @samp{-I} directories
614 in the same order as they were specified (left to right) on the command
615 line.
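
As a sketch, with hypothetical directory names @file{inc} and
@file{../common}, the command below makes @code{as} look for each
@code{.include} file first in the current working directory, then in
@file{inc}, and finally in @file{../common}:

@example
@c inc, ../common and main.s are hypothetical names, for illustration only
as -I inc -I ../common main.s
@end example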
616
617 @subsection Warn if difference tables altered: @code{-k}
618 _if__(_AMD29K__)
619 On the AMD 29K family, this option is allowed, but has no effect. It is
620 permitted for compatibility with GNU @code{as} on other platforms,
621 where it can be used to warn when @code{as} alters the machine code
622 generated for @samp{.word} directives in difference tables. The AMD 29K
623 family does not have the addressing limitations that sometimes lead to this
624 alteration on other platforms.
625 _fi__(_AMD29K__)
626
627 _if__(!_AMD29K__)
628 @code{as} sometimes alters the code emitted for directives of the form
629 @samp{.word @var{sym1}-@var{sym2}}; @pxref{Word}.
630 You can use the @samp{-k} option if you want a warning issued when this
631 is done.
632 _fi__(!_AMD29K__)
633
634 @subsection Include Local Labels: @code{-L}
635 Labels beginning with @samp{L} (upper case only) are called @dfn{local
636 labels}. @xref{Symbol Names}. Such labels are intended for the use
637 of programs (like compilers) that compose assembler programs, not
638 for your notice. By default both @code{as} and @code{ld} discard
639 them, so you do not see them
640 when debugging.
641
642 This option tells @code{as} to retain those @samp{L@dots{}} symbols
643 in the object file. Usually if you do this you also tell the linker
644 @code{ld} to preserve symbols whose names begin with @samp{L}.
645
646 @subsection Name the Object File: @code{-o}
647 There is always one object file output when you run @code{as}. By
648 default it has the name @file{a.out}. You use this option (which
649 takes exactly one filename) to give the object file a different name.
650
651 Whatever the object file is called, @code{as} will overwrite any
652 existing file of the same name.
653
654 @subsection Data Segment into Text Segment: @code{-R}
655 @code{-R} tells @code{as} to write the object file as if all
656 data-segment data lives in the text segment. This is only done at
657 the very last moment: your binary data are the same, but data
658 segment parts are relocated differently. The data segment part of
659 your object file is zero bytes long because all its bytes are
660 appended to the text segment. (@xref{Segments}.)
661
662 When you specify @code{-R} it would be possible to generate shorter
663 address displacements (because we don't have to cross between text and
664 data segment). We don't do this simply for compatibility with older
665 versions of @code{as}. In future, @code{-R} may work this way.
666
667 @subsection Suppress Warnings: @code{-W}
668 @code{as} should never give a warning or error message when
669 assembling compiler output. But programs written by people often
670 cause @code{as} to give a warning that a particular assumption was
671 made. All such warnings are directed to the standard error file.
672 If you use this option, no warnings are issued. This option only
673 affects the warning messages: it does not change any detail of how
674 @code{as} assembles your file. Errors, which stop the assembly, are
675 still reported.
676
677 @node Syntax, Segments, Overview, Top
678 @chapter Syntax
679 This chapter describes the machine-independent syntax allowed in a
680 source file. @code{as} syntax is similar to what many other assemblers
681 use; it is inspired by the BSD 4.2
682 _if__(!_VAX__)
683 assembler. @refill
684 _fi__(!_VAX__)
685 _if__(_VAX__)
686 assembler, except that @code{as} does not assemble Vax bit-fields.
687 _fi__(_VAX__)
688
689 @menu
690 * Pre-processing:: Pre-processing
691 * Whitespace:: Whitespace
692 * Comments:: Comments
693 * Symbol Intro:: Symbols
694 * Statements:: Statements
695 * Constants:: Constants
696 @end menu
697
698 @node Pre-processing, Whitespace, Syntax, Syntax
699 @section Pre-processing
700
701 The pre-processor:
702 @itemize @bullet
703 @item
704 adjusts and removes extra whitespace. It leaves one space or tab before
705 the keywords on a line, and turns any other whitespace on the line into
706 a single space.
707
708 @item
709 removes all comments, replacing them with a single space, or an
710 appropriate number of newlines.
711
712 @item
713 converts character constants into the appropriate numeric values.
714 @end itemize
715
716 Excess whitespace, comments, and character constants
717 cannot be used in the portions of the input text that are not
718 pre-processed.
719
720 If the first line of an input file is @code{#NO_APP} or the @samp{-f}
721 option is given, the input file will not be pre-processed. Within such
722 an input file, parts of the file can be pre-processed by putting a line
723 that says @code{#APP} before the text that should be pre-processed, and
724 putting a line that says @code{#NO_APP} after it. This feature is
725 mainly intended to support @code{asm} statements in compilers whose output
726 normally does not need to be pre-processed.
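
The fragment below is a minimal sketch of such a file; the label
@code{start} and the values are hypothetical. Only the line between
@code{#APP} and the second @code{#NO_APP} is pre-processed, so only
that line may contain comments or extra whitespace:

@example
@c the label `start' and the values are hypothetical, for illustration only
#NO_APP
 .text
start:
 .long 1
#APP
        .long   2       /* comment and extra spaces are fine here */
#NO_APP
 .long 3
@end example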
727
728 @node Whitespace, Comments, Pre-processing, Syntax
729 @section Whitespace
730 @dfn{Whitespace} is one or more blanks or tabs, in any order.
731 Whitespace is used to separate symbols, and to make programs neater
732 for people to read. Unless within character constants
733 (@pxref{Characters}), any whitespace means the same as exactly one
734 space.
735
736 @node Comments, Symbol Intro, Whitespace, Syntax
737 @section Comments
738 There are two ways of rendering comments to @code{as}. In both
739 cases the comment is equivalent to one space.
740
741 Anything from @samp{/*} through the next @samp{*/} is a comment.
742 This means you may not nest these comments.
743
744 @example
745 /*
746 The only way to include a newline ('\n') in a comment
747 is to use this sort of comment.
748 */
749
750 /* This sort of comment does not nest. */
751 @end example
752
753 Anything from the @dfn{line comment} character to the next newline
754 is considered a comment and is ignored. The line comment character is
755 _if__(_VAX__)
756 @samp{#} on the Vax;
757 _fi__(_VAX__)
758 _if__(_M680X0__)
759 @samp{|} on the 680x0;
760 _fi__(_M680X0__)
761 _if__(_AMD29K__)
762 @samp{;} for the AMD 29K family;
763 _fi__(_AMD29K__)
764 @pxref{Machine Dependent}. @refill
765
766 _if__(_ALL_ARCH__)
767 On some machines there are two different line comment characters. One
768 will only begin a comment if it is the first non-whitespace character on
769 a line, while the other will always begin a comment.
770 _fi__(_ALL_ARCH__)
771
772 To be compatible with past assemblers a special interpretation is
773 given to lines that begin with @samp{#}. Following the @samp{#} an
774 absolute expression (@pxref{Expressions}) is expected: this will be
775 the logical line number of the @b{next} line. Then a string
776 (@pxref{Strings}) is allowed: if present it is a new logical file
777 name. The rest of the line, if any, should be whitespace.
778
779 If the first non-whitespace characters on the line are not numeric,
780 the line is ignored. (Just like a comment.)
781 @example
782 # This is an ordinary comment.
783 # 42-6 "new_file_name" # New logical file name
784 # This is logical line # 36.
785 @end example
786 This feature is deprecated, and may disappear from future versions
787 of @code{as}.
788
789 @node Symbol Intro, Statements, Comments, Syntax
790 @section Symbols
791 A @dfn{symbol} is one or more characters chosen from the set of all
792 letters (both upper and lower case), digits and the three characters
793 @samp{_.$}. No symbol may begin with a digit. Case is significant.
794 There is no length limit: all characters are significant. Symbols are
795 delimited by characters not in that set, or by the beginning of a file
796 (since the source program must end with a newline, the end of a file is
797 not a possible symbol delimiter). @xref{Symbols}.
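
To illustrate these rules with some made-up names (chosen only for this
example):

@example
start  _loop.1  another$label      @r{valid symbols}
Count  count                       @r{two distinct symbols: case is significant}
2nd_pass                           @r{not a symbol: it begins with a digit}
@end example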
798
799 @node Statements, Constants, Symbol Intro, Syntax
800 @section Statements
801 A @dfn{statement} ends at a newline character (@samp{\n})
802 _if__(!_AMD29K__)
803 or at a semicolon (@samp{;}). The newline or semicolon
804 _fi__(!_AMD29K__)
805 _if__(_AMD29K__)
806 or an ``at'' sign (@samp{@@}). The newline or at sign
807 _fi__(_AMD29K__)
808 is considered part
809 of the preceding statement. Newlines
810 _if__(!_AMD29K__)
811 and semicolons
812 _fi__(!_AMD29K__)
813 _if__(_AMD29K__)
814 and at signs
815 _fi__(_AMD29K__)
816 within
817 character constants are an exception: they don't end statements.
818 It is an error to end any statement with end-of-file: the last
819 character of any input file should be a newline.@refill
820
821 You may write a statement on more than one line if you put a
822 backslash (@kbd{\}) immediately in front of any newlines within the
823 statement. When @code{as} reads a backslashed newline both
824 characters are ignored. You can even put backslashed newlines in
825 the middle of symbol names without changing the meaning of your
826 source program.
827
828 An empty statement is allowed, and may include whitespace. It is ignored.
829
830 @c "key symbol" is not used elsewhere in the document; seems pedantic to
831 @c @defn{} it in that case, as was done previously... pesch@cygnus.com,
832 @c 13feb91.
833 A statement begins with zero or more labels, optionally followed by a
834 key symbol which determines what kind of statement it is. The key
835 symbol determines the syntax of the rest of the statement. If the
836 symbol begins with a dot @samp{.} then the statement is an assembler
837 directive: typically valid for any computer. If the symbol begins with
838 a letter the statement is an assembly language @dfn{instruction}: it
839 will assemble into a machine language instruction. Different versions
840 of @code{as} for different computers will recognize different
841 instructions. In fact, the same symbol may represent a different
842 instruction in a different computer's assembly language.
843
844 A label is a symbol immediately followed by a colon (@code{:}).
845 Whitespace before a label or after a colon is permitted, but you may not
846 have whitespace between a label's symbol and its colon. @xref{Labels}.
847
848 @example
849 label: .directive followed by something
850 another$label: # This is an empty statement.
851 instruction operand_1, operand_2, @dots{}
852 @end example
853
854 @node Constants, , Statements, Syntax
855 @section Constants
856 A constant is a number, written so that its value is known by
857 inspection, without knowing any context. Like this:
858 @smallexample
859 .byte 74, 0112, 092, 0x4A, 0X4a, 'J, '\J # All the same value.
860 .ascii "Ring the bell\7" # A string constant.
861 .octa 0x123456789abcdef0123456789ABCDEF0 # A bignum.
862 .float 0f-314159265358979323846264338327\
863 95028841971.693993751E-40 # - pi, a flonum.
864 @end smallexample
865
866 @menu
867 * Characters:: Character Constants
868 * Numbers:: Number Constants
869 @end menu
870
871 @node Characters, Numbers, Constants, Constants
872 @subsection Character Constants
873 There are two kinds of character constants. A @dfn{character} stands
874 for one character in one byte and its value may be used in
875 numeric expressions. String constants (properly called string
876 @emph{literals}) are potentially many bytes and their values may not be
877 used in arithmetic expressions.
878
879 @menu
880 * Strings:: Strings
881 * Chars:: Characters
882 @end menu
883
884 @node Strings, Chars, Characters, Characters
885 @subsubsection Strings
886 A @dfn{string} is written between double-quotes. It may contain
887 double-quotes or null characters. The way to get special characters
888 into a string is to @dfn{escape} these characters: precede them with
889 a backslash @samp{\} character. For example @samp{\\} represents
890 one backslash: the first @code{\} is an escape which tells
891 @code{as} to interpret the second character literally as a backslash
892 (which prevents @code{as} from recognizing the second @code{\} as an
893 escape character). The complete list of escapes follows.
894
895 @table @kbd
896 @c @item \a
897 @c Mnemonic for ACKnowledge; for ASCII this is octal code 007.
898 @item \b
899 Mnemonic for backspace; for ASCII this is octal code 010.
900 @c @item \e
901 @c Mnemonic for EOText; for ASCII this is octal code 004.
902 @item \f
903 Mnemonic for FormFeed; for ASCII this is octal code 014.
904 @item \n
905 Mnemonic for newline; for ASCII this is octal code 012.
906 @c @item \p
907 @c Mnemonic for prefix; for ASCII this is octal code 033, usually known as @code{escape}.
908 @item \r
909 Mnemonic for carriage-Return; for ASCII this is octal code 015.
910 @c @item \s
911 @c Mnemonic for space; for ASCII this is octal code 040. Included for compliance with
912 @c other assemblers.
913 @item \t
914 Mnemonic for horizontal Tab; for ASCII this is octal code 011.
915 @c @item \v
916 @c Mnemonic for Vertical tab; for ASCII this is octal code 013.
917 @c @item \x @var{digit} @var{digit} @var{digit}
918 @c A hexadecimal character code. The numeric code is 3 hexadecimal digits.
919 @item \ @var{digit} @var{digit} @var{digit}
920 An octal character code. The numeric code is 3 octal digits.
921 For compatibility with other Unix systems, 8 and 9 are accepted as digits:
922 for example, @code{\008} has the value 010, and @code{\009} the value 011.
923 @item \\
924 Represents one @samp{\} character.
925 @c @item \'
926 @c Represents one @samp{'} (accent acute) character.
927 @c This is needed in single character literals
928 @c (@xref{Characters}.) to represent
929 @c a @samp{'}.
930 @item \"
931 Represents one @samp{"} character. Needed in strings to represent
932 this character, because an unescaped @samp{"} would end the string.
933 @item \ @var{anything-else}
934 Any other character when escaped by @kbd{\} will give a warning, but
935 assemble as if the @samp{\} was not present. The idea is that if
936 you used an escape sequence you clearly didn't want the literal
937 interpretation of the following character. However @code{as} has no
938 other interpretation, so @code{as} knows it is giving you the wrong
939 code and warns you of the fact.
940 @end table
941
942 Which characters are escapable, and what those escapes represent,
943 varies widely among assemblers. The current set is what we think
944 BSD 4.2 @code{as} recognizes, and is a subset of what most C
945 compilers recognize. If you are in doubt, don't use an escape
946 sequence.
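
As an illustration of the escapes listed above, the following sketch
uses only sequences from the table; the particular strings are invented
for this example.
@example
.ascii "one\ttwo\n"             # A tab between the words, then a newline.
.ascii "say \"hi\" with a \\"   # Embedded double-quotes and one backslash.
.ascii "\101\102\103"           # Octal codes 101, 102, 103: ASCII "ABC".
@end example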
947
948 @node Chars, , Strings, Characters
949 @subsubsection Characters
950 A single character may be written as a single quote immediately
951 followed by that character. The same escapes apply to characters as
952 to strings. So if you want to write the character backslash, you
953 must write @kbd{'\\} where the first @code{\} escapes the second
954 @code{\}. As you can see, the quote is an acute accent, not a
955 grave accent. A newline
956 _if__(!_AMD29K__)
957 (or semicolon @samp{;})
958 _fi__(!_AMD29K__)
959 _if__(_AMD29K__)
960 (or at sign @samp{@@})
961 _fi__(_AMD29K__)
962 immediately
963 following an acute accent is taken as a literal character and does
964 not count as the end of a statement. The value of a character
965 constant in a numeric expression is the machine's byte-wide code for
966 that character. @code{as} assumes your character code is ASCII: @kbd{'A}
967 means 65, @kbd{'B} means 66, and so on. @refill
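
For instance, assuming ASCII as described above (the values in the
comments follow from the escape table, not from running any particular
version of @code{as}):
@example
.byte 'A        # 65.
.byte 'B        # 66.
.byte '\\       # One backslash: the first \ escapes the second.
.byte '\n       # Newline, octal 012.
@end example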
968
969 @node Numbers, , Characters, Constants
970 @subsection Number Constants
971 @code{as} distinguishes three kinds of numbers according to how they
972 are stored in the target machine. @emph{Integers} are numbers that
973 would fit into an @code{int} in the C language. @emph{Bignums} are
974 integers, but they are stored in more than 32 bits. @emph{Flonums}
975 are floating point numbers, described below.
976
977 @subsubsection Integers
978 A binary integer is @samp{0b} or @samp{0B} followed by zero or more of
979 the binary digits @samp{01}.
980
981 An octal integer is @samp{0} followed by zero or more of the octal
982 digits (@samp{01234567}).
983
984 A decimal integer starts with a non-zero digit followed by zero or
985 more digits (@samp{0123456789}).
986
987 A hexadecimal integer is @samp{0x} or @samp{0X} followed by one or
988 more hexadecimal digits chosen from @samp{0123456789abcdefABCDEF}.
989
990 Integers have the usual values. To denote a negative integer, use
991 the prefix operator @samp{-} discussed under expressions
992 (@pxref{Prefix Ops}).
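
The four radixes can be compared side by side. In this sketch every
operand except the last has the value ten; @code{.long} is used only as
a convenient directive that accepts integers.
@example
.long 0b1010    # Binary.
.long 012       # Octal.
.long 10        # Decimal.
.long 0xA       # Hexadecimal.
.long -10       # Negative ten: prefix operator - applied to 10.
@end example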
993
994 @subsubsection Bignums
995 A @dfn{bignum} has the same syntax and semantics as an integer
996 except that the number (or its negative) takes more than 32 bits to
997 represent in binary. The distinction is made because in some places
998 integers are permitted while bignums are not.
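
As a rough illustration, the first operand below fits in 32 bits and is
an integer; the second needs 33 bits and is therefore a bignum. The
choice of @code{.octa} for the bignum follows the example given under
Constants.
@example
.long 0x7fffffff        # An integer.
.octa 0x100000000       # A bignum.
@end example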
999
1000 @subsubsection Flonums
1001 A @dfn{flonum} represents a floating point number. The translation is
1002 complex: a decimal floating point number from the text is converted by
1003 @code{as} to a generic binary floating point number of more than
1004 sufficient precision. This generic floating point number is converted
1005 to a particular computer's floating point format (or formats) by a
1006 portion of @code{as} specialized to that computer.
1007
1008 A flonum is written by writing (in order)
1009 @itemize @bullet
1010 @item
1011 The digit @samp{0}.
1012 @item
1013 _if__(_AMD29K__)
1014 One of the letters @samp{DFPRSX} (in upper or lower case), to tell
1015 @code{as} the rest of the number is a flonum.
1016 _fi__(_AMD29K__)
1017 _if__(!_AMD29K__)
1018 A letter, to tell @code{as} the rest of the number is a flonum. @kbd{e}
1019 is recommended. Case is not important. (Any otherwise illegal letter
1020 will work here, but that might be changed. Vax BSD 4.2 assembler seems
1021 to allow any of @samp{defghDEFGH}.)
1022 _fi__(!_AMD29K__)
1023 @item
1024 An optional sign: either @samp{+} or @samp{-}.
1025 @item
1026 An optional @dfn{integer part}: zero or more decimal digits.
1027 @item
1028 An optional @dfn{fraction part}: @samp{.} followed by zero
1029 or more decimal digits.
1030 @item
1031 An optional exponent, consisting of:
1032 @itemize @bullet
1033 @item
1034 _if__(_AMD29K__)
1035 An @samp{E} or @samp{e}.
_fi__(_AMD29K__)
1036 _if__(!_AMD29K__)
1037 A letter; the exact significance varies according to
1038 the computer that executes the program. @code{as}
1039 accepts any letter for now. Case is not important.
1040 _fi__(!_AMD29K__)
1041 @item
1042 Optional sign: either @samp{+} or @samp{-}.
1043 @item
1044 One or more decimal digits.
1045 @end itemize
1046 @end itemize
1047
1048 At least one of @var{integer part} or @var{fraction part} must be
1049 present. The floating point number has the usual base-10 value.
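
A few flonums written according to these rules. This sketch uses
@samp{0f} as the prefix, as in the example under Constants, and
@samp{e} as the exponent letter; other letters may or may not be
accepted by a given configuration.
@example
.float 0f1.5            # Integer part and fraction part.
.float 0f-.5            # Sign and fraction part; no integer part.
.float 0f+6.            # Sign, integer part and an empty fraction part.
.float 0f6.02e23        # With an exponent: 6.02 times 10 to the 23rd.
@end example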
1050
1051 @code{as} does all processing using integers. Flonums are computed
1052 independently of any floating point hardware in the computer running
1053 @code{as}.
1054
1055 @node Segments, Symbols, Syntax, Top
1056 @chapter Segments and Relocation
1057 @menu
1058 * Segs Background:: Background
1059 * ld Segments:: ld Segments
1060 * as Segments:: as Internal Segments
1061 * Sub-Segments:: Sub-Segments
1062 * bss:: bss Segment
1063 @end menu
1064
1065 @node Segs Background, ld Segments, Segments, Segments
1066 @section Background
1067 Roughly, a segment is a range of addresses, with no gaps; all data
1068 ``in'' those addresses is treated the same for some particular purpose.
1069 For example there may be a ``read only'' segment.
1070
1071 The linker @code{ld} reads many object files (partial programs) and
1072 combines their contents to form a runnable program. When @code{as}
1073 emits an object file, the partial program is assumed to start at address
1074 0. @code{ld} will assign the final addresses the partial program
1075 occupies, so that different partial programs don't overlap. This is
1076 actually an over-simplification, but it will suffice to explain how
1077 @code{as} uses segments.
1078
1079 @code{ld} moves blocks of bytes of your program to their run-time
1080 addresses. These blocks slide to their run-time addresses as rigid
1081 units; their length does not change and neither does the order of bytes
1082 within them. Such a rigid unit is called a @emph{segment}. Assigning
1083 run-time addresses to segments is called @dfn{relocation}. It includes
1084 the task of adjusting mentions of object-file addresses so they refer to
1085 the proper run-time addresses.
1086
1087 An object file written by @code{as} has three segments, any of which may
1088 be empty. These are named @dfn{text}, @dfn{data} and @dfn{bss}
1089 segments. Within the object file, the text segment starts at
1090 address @code{0}, the data segment follows, and the bss segment
1091 follows the data segment.
1092
1093 To let @code{ld} know which data will change when the segments are
1094 relocated, and how to change that data, @code{as} also writes to the
1095 object file details of the relocation needed. To perform relocation
1096 @code{ld} must know, each time an address in the object
1097 file is mentioned:
1098 @itemize @bullet
1099 @item
1100 Where in the object file is the beginning of this reference to
1101 an address?
1102 @item
1103 How long (in bytes) is this reference?
1104 @item
1105 Which segment does the address refer to? What is the numeric value of
1106 @display
1107 (@var{address}) @minus{} (@var{start-address of segment})?
1108 @end display
1109 @item
1110 Is the reference to an address ``Program-Counter relative''?
1111 @end itemize
1112
1113 In fact, every address @code{as} ever uses is expressed as
1114 @code{(@var{segment}) + (@var{offset into segment})}. Further, every
1115 expression @code{as} computes is of this segmented nature.
1116 @dfn{Absolute expression} means an expression with segment ``absolute''
1117 (@pxref{ld Segments}). A @dfn{pass1 expression} means an expression
1118 with segment ``pass1'' (@pxref{as Segments}). In this manual we use the
1119 notation @{@var{segname} @var{N}@} to mean ``offset @var{N} into segment
1120 @var{segname}''.
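
To make the notation concrete, here is a small sketch. The symbol
names are invented for this example, and the offsets in the comments
assume that @code{.long} emits four bytes.
@example
.text
start: .long 0          # start is @{text 0@}.
.data
first: .long 1          # first is @{data 0@}.
second: .long 2         # second is @{data 4@}.
@end example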
1121
1122 Apart from text, data and bss segments you need to know about the
1123 @dfn{absolute} segment. When @code{ld} mixes partial programs,
1124 addresses in the absolute segment remain unchanged. That is, address
1125 @code{@{absolute 0@}} is ``relocated'' to run-time address 0 by @code{ld}.
1126 Although two partial programs' data segments will not overlap addresses
1127 after linking, @emph{by definition} their absolute segments will overlap.
1128 Address @code{@{absolute@ 239@}} in one partial program will always be the same
1129 address when the program is running as address @code{@{absolute@ 239@}} in any
1130 other partial program.
1131
1132 The idea of segments is extended to the @dfn{undefined} segment. Any
1133 address whose segment is unknown at assembly time is by definition
1134 rendered @{undefined @var{U}@}---where @var{U} will be filled in later.
1135 Since numbers are always defined, the only way to generate an undefined
1136 address is to mention an undefined symbol. A reference to a named
1137 common block would be such a symbol: its value is unknown at assembly
1138 time so it has segment @emph{undefined}.
1139
1140 By analogy the word @emph{segment} is used to describe groups of segments in
1141 the linked program. @code{ld} puts all partial programs' text
1142 segments in contiguous addresses in the linked program. It is
1143 customary to refer to the @emph{text segment} of a program, meaning all
1144 the addresses of all partial programs' text segments. Likewise for
1145 data and bss segments.
1146
1147 Some segments are manipulated by @code{ld}; others are invented for
1148 use of @code{as} and have no meaning except during assembly.
1149
1156
1157 @node ld Segments, as Segments, Segs Background, Segments
1158 @section ld Segments
1159 @code{ld} deals with just five kinds of segments, summarized below.
1160
1161 @table @strong
1162
1163 @item text segment
1164 @itemx data segment
1165 These segments hold your program. @code{as} and @code{ld} treat them as
1166 separate but equal segments. Anything you can say of one segment is
1167 true of the other. When the program is running, however, it is
1168 customary for the text segment to be unalterable. The
1169 text segment is often shared among processes: it will contain
1170 instructions, constants and the like. The data segment of a running
1171 program is usually alterable: for example, C variables would be stored
1172 in the data segment.
1173
1174 @item bss segment
1175 This segment contains zeroed bytes when your program begins running. It
1176 is used to hold uninitialized variables or common storage. The length of
1177 each partial program's bss segment is important, but because it starts
1178 out containing zeroed bytes there is no need to store explicit zero
1179 bytes in the object file. The bss segment was invented to eliminate
1180 those explicit zeros from object files.
1181
1182 @item absolute segment
1183 Address 0 of this segment is always ``relocated'' to runtime address 0.
1184 This is useful if you want to refer to an address that @code{ld} must
1185 not change when relocating. In this sense we speak of absolute
1186 addresses being ``unrelocatable'': they don't change during relocation.
1187
1188 @item @code{undefined} segment
1189 This ``segment'' is a catch-all for address references to objects not in
1190 the preceding segments.
1191 @c FIXME: ref to some other doc on obj-file formats could go here.
1192
1193 @end table
1194
1195 An idealized example of the 3 relocatable segments follows. Memory
1196 addresses are on the horizontal axis.
1197
1198 @ifinfo
1199 @example
1200                       +-----+----+--+
1201 partial program # 1:  |ttttt|dddd|00|
1202                       +-----+----+--+
1203
1204                        text  data bss
1205                        seg.  seg. seg.
1206
1207                       +---+---+---+
1208 partial program # 2:  |TTT|DDD|000|
1209                       +---+---+---+
1210
1211                       +--+---+-----+--+----+---+-----+~~
1212 linked program:       |  |TTT|ttttt|  |dddd|DDD|00000|
1213                       +--+---+-----+--+----+---+-----+~~
1214
1215 addresses:            0 @dots{}
1216 @end example
1217 @end ifinfo
1218 @tex
1219 \halign{\hfil\rm #\quad&#\cr
1220 \cr
1221 &\ibox{2.5cm}{\tt text}\ibox{2cm}{\tt data}\ibox{1cm}{\tt bss}\cr
1222 Partial program \#1:
1223 &\boxit{2.5cm}{\tt ttttt}\boxit{2cm}{\tt dddd}\boxit{1cm}{\tt 00}\cr
1224 \cr
1225 &\ibox{1cm}{\tt text}\ibox{1.5cm}{\tt data}\ibox{1cm}{\tt bss}\cr
1226 Partial program \#2:
1227 &\boxit{1cm}{\tt TTT}\boxit{1.5cm}{\tt DDDD}\boxit{1cm}{\tt 000}\cr
1228 \cr
1229 &\ibox{.5cm}{}\ibox{1cm}{\tt text}\ibox{2.5cm}{}\ibox{.75cm}{}\ibox{2cm}{\tt data}\ibox{1.5cm}{}\ibox{2cm}{\tt bss}\cr
1230 linked program:
1231 &\boxit{.5cm}{}\boxit{1cm}{\tt TTT}\boxit{2.5cm}{\tt
1232 ttttt}\boxit{.75cm}{}\boxit{2cm}{\tt dddd}\boxit{1.5cm}{\tt
1233 DDDD}\boxit{2cm}{00000}\ \dots\cr
1234 addresses:
1235 &\dots\cr
1236 }
1237 @end tex
1238
1239 @node as Segments, Sub-Segments, ld Segments, Segments
1240 @section as Internal Segments
1241 These segments are invented for the internal use of @code{as}. They
1242 have no meaning at run-time. You don't need to know about these
1243 segments except that they might be mentioned in @code{as}' warning
1244 messages. These segments are invented to permit the value of every
1245 expression in your assembly language program to be a segmented
1246 address.
1247
1248 @table @b
1249 @item absent segment
1250 An expression was expected and none was
1251 found.
1252
1253 @item goof segment
1254 An internal assembler logic error has been
1255 found. This means there is a bug in the assembler.
1256
1257 @item grand segment
1258 A @dfn{grand number} is a bignum or a flonum, but not an integer. If a
1259 number can't be written as a C @code{int} constant, it is a grand
1260 number. @code{as} has to remember that a flonum or a bignum does not
1261 fit into 32 bits, and cannot be an argument (@pxref{Arguments}) in an
1262 expression: this is done by making a flonum or bignum be in segment
1263 grand. This is purely for internal @code{as} convenience; grand
1264 segment behaves similarly to absolute segment.
1265
1266 @item pass1 segment
1267 The expression was impossible to evaluate in the first pass. The
1268 assembler will attempt a second pass (second reading of the source) to
1269 evaluate the expression. Your expression mentioned an undefined symbol
1270 in a way that defies the one-pass (segment + offset in segment) assembly
1271 process. No compiler need emit such an expression.
1272
1273 @quotation
1274 @emph{Warning:} the second pass is currently not implemented. @code{as}
1275 will abort with an error message if one is required.
1276 @end quotation
1277
1278 @item difference segment
1279 As an assist to the C compiler, expressions of the forms
1280 @display
1281 (@var{undefined symbol}) @minus{} (@var{expression})
1282 (@var{something}) @minus{} (@var{undefined symbol})
1283 (@var{undefined symbol}) @minus{} (@var{undefined symbol})
1284 @end display
1285 are permitted, and belong to the difference segment. @code{as}
1286 re-evaluates such expressions after the source file has been read and
1287 the symbol table built. If by that time there are no undefined symbols
1288 in the expression then the expression assumes a new segment. The
1289 intention is to permit statements like
1290 @samp{.word label - base_of_table}
1291 to be assembled in one pass where both @code{label} and
1292 @code{base_of_table} are undefined. This is useful for compiling C and
1293 Algol switch statements, Pascal case statements, FORTRAN computed goto
1294 statements and the like.
1295 @end table
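
A sketch of the kind of jump table the difference segment is meant to
support; the symbol names are hypothetical. When the @code{.word}
statements are read, neither symbol in each difference has been defined
yet.
@example
.word case_0 - base_of_table    # Both symbols undefined so far.
.word case_1 - base_of_table
base_of_table:
case_0: .long 0
case_1: .long 0
@end example
Once the whole file has been read both symbols are known, so each
difference can be re-evaluated.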
1296
1297 @node Sub-Segments, bss, as Segments, Segments
1298 @section Sub-Segments
1299 Assembled bytes fall into two segments: text and data.
1300 Because you may have groups of text or data that you want to end up near
1301 to each other in the object file, @code{as} allows you to use
1302 @dfn{subsegments}. Within each segment, there can be numbered
1303 subsegments with values from 0 to 8192. Objects assembled into the same
1304 subsegment will be grouped with other objects in the same subsegment
1305 when they are all put into the object file. For example, a compiler
1306 might want to store constants in the text segment, but might not want to
1307 have them interspersed with the program being assembled. In this case,
1308 the compiler could issue a @code{.text 0} before each section of code
1309 being output, and a @code{.text 1} before each group of constants being
1310 output.
1311
1312 Subsegments are optional. If you don't use subsegments, everything
1313 will be stored in subsegment number zero.
1314
1315 _if__(!_AMD29K__)
1316 Each subsegment is zero-padded up to a multiple of four bytes.
1317 (Subsegments may be padded a different amount on different flavors
1318 of @code{as}.)
1319 _fi__(!_AMD29K__)
1320 _if__(_AMD29K__)
1321 On the AMD 29K family, no particular padding is added to segment sizes;
1322 GNU as forces no alignment on this platform.
1323 _fi__(_AMD29K__)
1324 Subsegments appear in your object file in numeric order, lowest numbered
1325 to highest. (All this to be compatible with other people's assemblers.)
1326 The object file contains no representation of subsegments; @code{ld} and
1327 other programs that manipulate object files will see no trace of them.
1328 They just see all your text subsegments as a text segment, and all your
1329 data subsegments as a data segment.
1330
1331 To specify which subsegment you want subsequent statements assembled
1332 into, use a @samp{.text @var{expression}} or a @samp{.data
1333 @var{expression}} statement. @var{Expression} should be an absolute
1334 expression. (@xref{Expressions}.) If you just say @samp{.text}
1335 then @samp{.text 0} is assumed. Likewise @samp{.data} means
1336 @samp{.data 0}. Assembly begins in @code{text 0}.
1337 For instance:
1338 @example
1339 .text 0 # The default subsegment is text 0 anyway.
1340 .ascii "This lives in the first text subsegment. *"
1341 .text 1
1342 .ascii "But this lives in the second text subsegment."
1343 .data 0
1344 .ascii "This lives in the data segment,"
1345 .ascii "in the first data subsegment."
1346 .text 0
1347 .ascii "This lives in the first text subsegment,"
1348 .ascii "immediately following the asterisk (*)."
1349 @end example
1350
1351 Each segment has a @dfn{location counter} incremented by one for every
1352 byte assembled into that segment. Because subsegments are merely a
1353 convenience restricted to @code{as} there is no concept of a subsegment
1354 location counter. There is no way to directly manipulate a location
1355 counter---but the @code{.align} directive will change it, and any label
1356 definition will capture its current value. The location counter of the
1357 segment that statements are being assembled into is said to be the
1358 @dfn{active} location counter.
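
The following sketch shows labels capturing the active location counter
as bytes are assembled. The label names are invented, and the offsets
assume assembly starts at the beginning of the text segment.
@example
.text
first: .byte 1          # first is @{text 0@}; the counter advances to 1.
second: .byte 2, 3      # second is @{text 1@}; the counter advances to 3.
third: .byte 4          # third is @{text 3@}.
@end example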
1359
1360 @node bss, , Sub-Segments, Segments
1361 @section bss Segment
1362 The bss segment is used for local common variable storage.
1363 You may allocate address space in the bss segment, but you may
1364 not dictate data to load into it before your program executes. When
1365 your program starts running, all the contents of the bss
1366 segment are zeroed bytes.
1367
1368 Addresses in the bss segment are allocated with special directives;
1369 you may not assemble anything directly into the bss segment. Hence
1370 there are no bss subsegments. @xref{Comm}, and @ref{Lcomm}.
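
A minimal sketch of reserving bss space, assuming the usual operand
order of a symbol followed by a length in bytes (the nodes referenced
above give the authoritative syntax); the symbol names are invented.
@example
.lcomm scratch, 64      # 64 zeroed bytes of bss, local to this file.
.comm shared, 128       # 128 bytes of common storage.
@end example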
1371
1372 @node Symbols, Expressions, Segments, Top
1373 @chapter Symbols
1374 Symbols are a central concept: the programmer uses symbols to name
1375 things, the linker uses symbols to link, and the debugger uses symbols
1376 to debug.
1377
1378 @quotation
1379 @emph{Warning:} @code{as} does not place symbols in the object file in
1380 the same order they were declared. This may break some debuggers.
1381 @end quotation
1382
1383 @menu
1384 * Labels:: Labels
1385 * Setting Symbols:: Giving Symbols Other Values
1386 * Symbol Names:: Symbol Names
1387 * Dot:: The Special Dot Symbol
1388 * Symbol Attributes:: Symbol Attributes
1389 @end menu
1390
1391 @node Labels, Setting Symbols, Symbols, Symbols
1392 @section Labels
1393 A @dfn{label} is written as a symbol immediately followed by a colon
1394 @samp{:}. The symbol then represents the current value of the
1395 active location counter, and is, for example, a suitable instruction
1396 operand. You are warned if you use the same symbol to represent two
1397 different locations: the first definition overrides any other
1398 definitions.
1399
1400 @node Setting Symbols, Symbol Names, Labels, Symbols
1401 @section Giving Symbols Other Values
1402 A symbol can be given an arbitrary value by writing a symbol, followed
1403 by an equals sign @samp{=}, followed by an expression
1404 (@pxref{Expressions}). This is equivalent to using the @code{.set}
1405 directive. @xref{Set}.
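
For example, the two statements below define symbols in the two
equivalent ways; the names and values are made up for illustration.
@example
page_size = 4096        # Symbol, equals sign, expression.
.set page_mask, 4095    # The .set directive.
@end example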
1406
1407 @node Symbol Names, Dot, Setting Symbols, Symbols
1408 @section Symbol Names
1409 Symbol names begin with a letter or with one of @samp{$._}. That
1410 character may be followed by any string of digits, letters,
1411 underscores and dollar signs. Case of letters is significant:
1412 @code{foo} is a different symbol name than @code{Foo}.
1413
1414 _if__(_AMD29K__)
1415 For the AMD 29K family, @samp{?} is also allowed in the
1416 body of a symbol name, though not at its beginning.
1417 _fi__(_AMD29K__)
1418
1419 Each symbol has exactly one name. Each name in an assembly language
1420 program refers to exactly one symbol. You may use that symbol name any
1421 number of times in a program.
1422
1423 @menu
1424 * Local Symbols:: Local Symbol Names
1425 @end menu
1426
1427 @node Local Symbols, , Symbol Names, Symbol Names
1428 @subsection Local Symbol Names
1429
1430 Local symbols help compilers and programmers use names temporarily.
1431 There are ten local symbol names, which are re-used throughout the
1432 program. You may refer to them using the names @samp{0} @samp{1}
1433 @dots{} @samp{9}. To define a local symbol, write a label of the form
1434 @samp{@b{N}:} (where @b{N} represents any digit). To refer to the most
1435 recent previous definition of that symbol write @samp{@b{N}b}, using the
1436 same digit as when you defined the label. To refer to the next
1437 definition of a local label, write @samp{@b{N}f}---where @b{N} gives you
1438 a choice of 10 forward references. The @samp{b} stands for
1439 ``backwards'' and the @samp{f} stands for ``forwards''.
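
A short sketch of backward and forward references; the comments say
which definition each reference reaches.
@example
1: .long 2f     # Forward: the 2: defined two lines below.
.long 1b        # Backward: the 1: on the line above.
2: .long 1b     # Backward: still the first 1:.
1: .long 2b     # A second 1:; 2b is the 2: on the line above.
@end example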
1440
1441 Local symbols are not emitted by the current GNU C compiler.
1442
1443 There is no restriction on how you can use these labels, but
1444 remember that at any point in the assembly you can refer to at most
1445 10 prior local labels and to at most 10 forward local labels.
1446
1447 Local symbol names are only a notation device. They are immediately
1448 transformed into more conventional symbol names before the assembler
1449 uses them. The symbol names stored in the symbol table, appearing in
1450 error messages and optionally emitted to the object file have these
1451 parts:
1452
1453 @table @code
1454 @item L
1455 All local labels begin with @samp{L}. Normally both @code{as} and
1456 @code{ld} forget symbols that start with @samp{L}. These labels are
1457 used for symbols you are never intended to see. If you give the
1458 @samp{-L} option then @code{as} will retain these symbols in the
1459 object file. If you also instruct @code{ld} to retain these symbols,
1460 you may use them in debugging.
1461
1462 @item @var{digit}
1463 If the label is written @samp{0:} then the digit is @samp{0}.
1464 If the label is written @samp{1:} then the digit is @samp{1}.
1465 And so on up through @samp{9:}.
1466
1467 @item @ctrl{A}
1468 This unusual character is included so you don't accidentally invent
1469 a symbol of the same name. The character has ASCII value
1470 @samp{\001}.
1471
1472 @item @emph{ordinal number}
1473 This is a serial number to keep the labels distinct. The first
1474 @samp{0:} gets the number @samp{1}; the 15th @samp{0:} gets the
1475 number @samp{15}; @emph{etc.} Likewise for the other labels @samp{1:}
1476 through @samp{9:}.
1477 @end table
1478
1479 For instance, the first @code{1:} is named @code{L1@ctrl{A}1}, the 44th
1480 @code{3:} is named @code{L3@ctrl{A}44}.
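
Applying these rules to a short sequence of definitions gives the names
shown in the comments (a sketch derived from the description above, not
from inspecting a symbol table):
@example
1:      # Named L1@ctrl{A}1.
2:      # Named L2@ctrl{A}1.
1:      # Named L1@ctrl{A}2.
@end example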
1481
1482 @node Dot, Symbol Attributes, Symbol Names, Symbols
1483 @section The Special Dot Symbol
1484
1485 The special symbol @samp{.} refers to the current address that
1486 @code{as} is assembling into. Thus, the expression @samp{melvin:
1487 .long .} will cause @code{melvin} to contain its own address.
1488 Assigning a value to @code{.} is treated the same as a @code{.org}
1489 directive. Thus, the expression @samp{.=.+4} is the same as saying
1490 _if__(!_AMD29K__)
1491 @samp{.space 4}.
1492 _fi__(!_AMD29K__)
1493 _if__(_AMD29K__)
1494 @samp{.block 4}.
1495 _fi__(_AMD29K__)
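
A common use of @samp{.} is to measure what has just been assembled.
In this sketch the symbol names are invented; @code{msg_len} receives
the value 5, the number of bytes between @code{msg} and the current
address.
@example
msg: .ascii "hello"
msg_len = . - msg       # 5.
@end example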
1496
1497 @node Symbol Attributes, , Dot, Symbols
1498 @section Symbol Attributes
1499 Every symbol has these attributes: Value, Type, Descriptor, and ``Other''.
1500 _if__(_INTERNALS__)
1501 The detailed definitions are in _0__<a.out.h>_1__.
1502 _fi__(_INTERNALS__)
1503
1504 If you use a symbol without defining it, @code{as} assumes zero for
1505 all these attributes, and probably won't warn you. This makes the
1506 symbol an externally defined symbol, which is generally what you
1507 would want.
1508
1509 @menu
1510 * Symbol Value:: Value
1511 * Symbol Type:: Type
1512 * Symbol Desc:: Descriptor
1513 * Symbol Other:: Other
1514 @end menu
1515
1516 @node Symbol Value, Symbol Type, Symbol Attributes, Symbol Attributes
1517 @subsection Value
1518 The value of a symbol is (usually) 32 bits, the size of one GNU C
1519 @code{int}. For a symbol which labels a location in the
1520 text, data, bss or absolute segments the
1521 value is the number of addresses from the start of that segment to
1522 the label. Naturally for text, data and bss
1523 segments the value of a symbol changes as @code{ld} changes segment
1524 base addresses during linking. Absolute symbols' values do
1525 not change during linking: that is why they are called absolute.
1526
1527 The value of an undefined symbol is treated in a special way. If it is
1528 0 then the symbol is not defined in this assembler source program, and
1529 @code{ld} will try to determine its value from other programs it is
1530 linked with. You make this kind of symbol simply by mentioning a symbol
1531 name without defining it. A non-zero value represents a @code{.comm}
1532 common declaration. The value is how much common storage to reserve, in
1533 bytes (addresses). The symbol refers to the first address of the
1534 allocated storage.
1535
1536 @node Symbol Type, Symbol Desc, Symbol Value, Symbol Attributes
1537 @subsection Type
1538 The type attribute of a symbol is 8 bits encoded in a devious way.
1539 We kept this coding standard for compatibility with older operating
1540 systems.
1541
1542 @ifinfo
1543 @example
1544
1545    7     6     5     4     3     2     1     0    bit numbers
1546 +-----+-----+-----+-----+-----+-----+-----+-----+
1547 |                 |                       |     |
1548 |   N_STAB bits   |      N_TYPE bits      |N_EXT|
1549 |                 |                       | bit |
1550 +-----+-----+-----+-----+-----+-----+-----+-----+
1551
1552                     Type byte
1553 @end example
1554 @end ifinfo
1555 @tex
1556 \vskip 1pc
1557 \halign{#\quad&#\cr
1558 \ibox{3cm}{7}\ibox{4cm}{4}\ibox{1.1cm}{0}&bit numbers\cr
1559 \boxit{3cm}{{\tt N\_STAB} bits}\boxit{4cm}{{\tt N\_TYPE}
1560 bits}\boxit{1.1cm}{\tt N\_EXT}\cr
1561 \hfill {\bf Type} byte\hfill\cr
1562 }
1563 @end tex
1564
1565 @subsubsection @code{N_EXT} bit
1566 This bit is set if @code{ld} might need to use the symbol's type bits
1567 and value. If this bit is off, then @code{ld} can ignore the
1568 symbol while linking. It is set in two cases. If the symbol is
1569 undefined, then @code{ld} is expected to find the symbol's value
1570 elsewhere in another program module. Otherwise the symbol has the
1571 value given, but this symbol name and value are revealed to any other
1572 programs linked in the same executable program. This second use of
1573 the @code{N_EXT} bit is most often made by a @code{.globl} statement.
1574
1575 @subsubsection @code{N_TYPE} bits
1576 These establish the symbol's ``type'', which is mainly a relocation
1577 concept. Common values are detailed in the manual describing the
1578 executable file format.
1579
1580 @subsubsection @code{N_STAB} bits
1581 Common values for these bits are described in the manual on the
1582 executable file format.
1583
1584 @node Symbol Desc, Symbol Other, Symbol Type, Symbol Attributes
1585 @subsection Descriptor
1586 This is an arbitrary 16-bit value. You may establish a symbol's
1587 descriptor value by using a @code{.desc} statement (@pxref{Desc}).
1588 A descriptor value means nothing to @code{as}.
1589
1590 @node Symbol Other, , Symbol Desc, Symbol Attributes
1591 @subsection Other
1592 This is an arbitrary 8-bit value. It means nothing to @code{as}.
1593
1594 @node Expressions, Pseudo Ops, Symbols, Top
1595 @chapter Expressions
1596 An @dfn{expression} specifies an address or numeric value.
1597 Whitespace may precede and/or follow an expression.
1598
1599 @menu
1600 * Empty Exprs:: Empty Expressions
1601 * Integer Exprs:: Integer Expressions
1602 @end menu
1603
1604 @node Empty Exprs, Integer Exprs, Expressions, Expressions
1605 @section Empty Expressions
1606 An empty expression has no value: it is just whitespace or null.
1607 Wherever an absolute expression is required, you may omit the
1608 expression and @code{as} will assume a value of (absolute) 0. This
1609 is compatible with other assemblers.
1610
1611 @node Integer Exprs, , Empty Exprs, Expressions
1612 @section Integer Expressions
1613 An @dfn{integer expression} is one or more @emph{arguments} delimited
1614 by @emph{operators}.
1615
1616 @menu
1617 * Arguments:: Arguments
1618 * Operators:: Operators
1619 * Prefix Ops:: Prefix Operators
1620 * Infix Ops:: Infix Operators
1621 @end menu
1622
1623 @node Arguments, Operators, Integer Exprs, Integer Exprs
1624 @subsection Arguments
1625
1626 @dfn{Arguments} are symbols, numbers or subexpressions. In other
1627 contexts arguments are sometimes called ``arithmetic operands''. In
1628 this manual, to avoid confusing them with the ``instruction operands'' of
1629 the machine language, we use the term ``argument'' to refer to parts of
1630 expressions only, reserving the word ``operand'' to refer only to machine
1631 instruction operands.
1632
1633 Symbols are evaluated to yield @{@var{segment} @var{NNN}@} where
1634 @var{segment} is one of text, data, bss, absolute,
1635 or @code{undefined}. @var{NNN} is a signed, 2's complement 32-bit
1636 integer.
1637
1638 Numbers are usually integers.
1639
1640 A number can be a flonum or bignum. In this case, you are warned
1641 that only the low order 32 bits are used, and @code{as} pretends
1642 these 32 bits are an integer. You may write integer-manipulating
1643 instructions that act on exotic constants, compatible with other
1644 assemblers.
1645
1646 Subexpressions are a left parenthesis @samp{(} followed by an integer
1647 expression, followed by a right parenthesis @samp{)}; or a prefix
1648 operator followed by an argument.
1649
1650 @node Operators, Prefix Ops, Arguments, Integer Exprs
1651 @subsection Operators
1652 @dfn{Operators} are arithmetic functions, like @code{+} or @code{%}. Prefix
1653 operators are followed by an argument. Infix operators appear
1654 between their arguments. Operators may be preceded and/or followed by
1655 whitespace.
1656
1657 @node Prefix Ops, Infix Ops, Operators, Integer Exprs
1658 @subsection Prefix Operators
1659 @code{as} has the following @dfn{prefix operators}. They each take
1660 one argument, which must be absolute.
1661 @table @code
1662 @item -
1663 @dfn{Negation}. Two's complement negation.
1664 @item ~
1665 @dfn{Complementation}. Bitwise not.
1666 @end table
1667
1668 @node Infix Ops, , Prefix Ops, Integer Exprs
1669 @subsection Infix Operators
1670
1671 @dfn{Infix operators} take two arguments, one on either side. Operators
1672 have precedence, but operations with equal precedence are performed left
1673 to right. Apart from @code{+} or @code{-}, both arguments must be
1674 absolute, and the result is absolute.
1675
1676 @enumerate
1677
1678 @item
1679 Highest Precedence
1680 @table @code
1681 @item *
1682 @dfn{Multiplication}.
1683 @item /
1684 @dfn{Division}. Truncation is the same as the C operator @samp{/}.
1685 @item %
1686 @dfn{Remainder}.
1687 @item _0__<_1__
1688 @itemx _0__<<_1__
1689 @dfn{Shift Left}. Same as the C operator @samp{_0__<<_1__}.
1690 @item _0__>_1__
1691 @itemx _0__>>_1__
1692 @dfn{Shift Right}. Same as the C operator @samp{_0__>>_1__}.
1693 @end table
1694
1695 @item
1696 Intermediate precedence
1697 @table @code
1698 @item |
1699 @dfn{Bitwise Inclusive Or}.
1700 @item &
1701 @dfn{Bitwise And}.
1702 @item ^
1703 @dfn{Bitwise Exclusive Or}.
1704 @item !
1705 @dfn{Bitwise Or Not}.
1706 @end table
1707
1708 @item
1709 Lowest Precedence
1710 @table @code
1711 @item +
1712 @dfn{Addition}. If either argument is absolute, the result
1713 has the segment of the other argument.
1714 If either argument is pass1 or undefined, the result is pass1.
1715 Otherwise @code{+} is illegal.
1716 @item -
1717 @dfn{Subtraction}. If the right argument is absolute, the
1718 result has the segment of the left argument.
1719 If either argument is pass1 the result is pass1.
1720 If either argument is undefined the result is difference segment.
1721 If both arguments are in the same segment, the result is absolute---provided
1722 that segment is one of text, data or bss.
1723 Otherwise subtraction is illegal.
1724 @end table
1725 @end enumerate
1726
1727 The sense of the rule for addition is that it's only meaningful to add
1728 the @emph{offsets} in an address; you can only have a defined segment in
1729 one of the two arguments.
1730
1731 Similarly, you can't subtract quantities from two different segments.
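
The following sketch illustrates these segment rules. The labels
@code{start} and @code{finish} are hypothetical, and both are assumed to
lie in the text segment:

@example
        .text
start:  .long 0
        .long 0
finish:
        .long finish-start   /* same segment: result is absolute */
        .long start+4        /* absolute added to text: result is text */
        .long (2+3)*4        /* all absolute; parentheses group */
        .long ~0 & 0xff      /* prefix complement, then bitwise and */
@end example

By the same rules, an expression such as @samp{start+finish} would be
rejected, because both arguments carry a segment.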
1732
1733 @node Pseudo Ops, Machine Dependent, Expressions, Top
1734 @chapter Assembler Directives
1735 @menu
1736 * Abort:: The Abort directive causes as to abort
1737 * Align:: Pad the location counter to a power of 2
1738 * App-File:: Set the logical file name
1739 * Ascii:: Fill memory with bytes of ASCII characters
1740 * Asciz:: Fill memory with bytes of ASCII characters followed
1741 by a null.
1742 * Byte:: Fill memory with 8-bit integers
1743 * Comm:: Reserve public space in the BSS segment
1744 * Data:: Change to the data segment
1745 * Desc:: Set the n_desc of a symbol
1746 * Double:: Fill memory with double-precision floating-point numbers
1747 * Else:: @code{.else}
1748 * End:: @code{.end}
1749 * Endif:: @code{.endif}
1750 * Equ:: @code{.equ @var{symbol}, @var{expression}}
1751 * Extern:: @code{.extern}
1752 * Fill:: Fill memory with repeated values
1753 * Float:: Fill memory with single-precision floating-point numbers
1754 * Global:: Make a symbol visible to the linker
1755 * Ident:: @code{.ident}
1756 * If:: @code{.if @var{absolute expression}}
1757 * Include:: @code{.include "@var{file}"}
1758 * Int:: Fill memory with 32-bit integers
1759 * Lcomm:: Reserve private space in the BSS segment
1760 * Line:: Set the logical line number
1761 * Ln:: @code{.ln @var{line-number}}
1762 * List:: @code{.list}, @code{.nolist}, @code{.eject}, @code{.lflags}, @code{.title}, @code{.sbttl}
1763 * Long:: Fill memory with 32-bit integers
1764 * Lsym:: Create a local symbol
1765 * Octa:: Fill memory with 128-bit integers
1766 * Org:: Change the location counter
1767 * Quad:: Fill memory with 64-bit integers
1768 * Set:: Set the value of a symbol
1769 * Short:: Fill memory with 16-bit integers
1770 * Single:: @code{.single @var{flonums}}
1771 * Stab:: Store debugging information
1772 * Text:: Change to the text segment
1773 * Word:: Fill memory with 32-bit integers
1774 * Deprecated:: Deprecated Directives
1775 * Machine Options:: Options
1776 * Machine Syntax:: Syntax
1777 * Floating Point:: Floating Point
1778 * Machine Directives:: Machine Directives
1779 * Opcodes:: Opcodes
1780 @end menu
1781
1782 All assembler directives have names that begin with a period (@samp{.}).
1783 The rest of the name is letters: their case does not matter.
1784
1785 This chapter discusses directives present in all versions of GNU
1786 @code{as}; @pxref{Machine Dependent} for additional directives.
1787
1788 @node Abort, Align, Pseudo Ops, Pseudo Ops
1789 @section @code{.abort}
1790 This directive stops the assembly immediately. It is for
1791 compatibility with other assemblers. The original idea was that the
1792 assembler program would be piped into the assembler. If the sender
1793 of a program quit, it could use this directive tells @code{as} to
1794 quit also. One day @code{.abort} will not be supported.
1795
1796 @node Align, App-File, Abort, Pseudo Ops
1797 @section @code{.align @var{abs-expression} , @var{abs-expression}}
1798 Pad the location counter (in the current subsegment) to a particular
1799 storage boundary. The first expression (which must be absolute) is the
1800 number of low-order zero bits the location counter will have after
1801 advancement. For example @samp{.align 3} will advance the location
1802 counter until it is a multiple of 8. If the location counter is already a
1803 multiple of 8, no change is needed.
1804
1805 The second expression (also absolute) gives the value to be stored in
1806 the padding bytes. It (and the comma) may be omitted. If it is
1807 omitted, the padding bytes are zero.
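
For instance, under the interpretation described above, and assuming the
location counter is not already suitably aligned:

@example
        .byte 1
        .align 3         /* pad with zeros to a multiple of 8 */
        .byte 2
        .align 2, 0xff   /* pad with 0xff bytes to a multiple of 4 */
@end example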
1808
1809 @node App-File, Ascii, Align, Pseudo Ops
1810 @section @code{.app-file @var{string}}
1811 @code{.app-file} tells @code{as} that we are about to start a new
1812 logical file. @var{String} is the new file name. In general, the
1813 filename is recognized whether or not it is surrounded by quotes @samp{"};
1814 but if you wish to specify an empty file name,
1815 you must give the quotes: @code{""}. This statement may go away in
1816 future: it is only recognized to be compatible with old @code{as}
1817 programs.
1818
1819 @node Ascii, Asciz, App-File, Pseudo Ops
1820 @section @code{.ascii "@var{string}"}@dots{}
1821 @code{.ascii} expects zero or more string literals (@pxref{Strings})
1822 separated by commas. It assembles each string (with no automatic
1823 trailing zero byte) into consecutive addresses.
1824
1825 @node Asciz, Byte, Ascii, Pseudo Ops
1826 @section @code{.asciz "@var{string}"}@dots{}
1827 @code{.asciz} is just like @code{.ascii}, but each string is followed by
1828 a zero byte. The ``z'' in @samp{.asciz} stands for ``zero''.
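
A small sketch of the difference between the two directives (the string
contents are arbitrary):

@example
        .ascii "abc", "def"   /* 6 bytes: a b c d e f */
        .asciz "abc"          /* 4 bytes: a b c and a zero byte */
        .ascii "abc\000"      /* the same 4 bytes, written by hand */
@end example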
1829
1830 @node Byte, Comm, Asciz, Pseudo Ops
1831 @section @code{.byte @var{expressions}}
1832
1833 @code{.byte} expects zero or more expressions, separated by commas.
1834 Each expression is assembled into the next byte.
1835
1836 @node Comm, Data, Byte, Pseudo Ops
1837 @section @code{.comm @var{symbol} , @var{length} }
1838 @code{.comm} declares a named common area in the bss segment. Normally
1839 @code{ld} reserves memory addresses for it during linking, so no partial
1840 program defines the location of the symbol. Use @code{.comm} to tell
1841 @code{ld} that it must be at least @var{length} bytes long. @code{ld}
1842 will allocate space for each @code{.comm} symbol that is at least as
1843 long as the longest @code{.comm} request in any of the partial programs
1844 linked. @var{length} is an absolute expression.
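
As an illustration, with a hypothetical symbol name:

@example
        .comm shared_buf, 256   /* ask ld for at least 256 bytes */
@end example

If another partial program linked with this one requested
@samp{.comm shared_buf, 1024}, then by the rule above @code{ld} would be
expected to allocate 1024 bytes for @code{shared_buf}.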
1845
1846 @node Data, Desc, Comm, Pseudo Ops
1847 @section @code{.data @var{subsegment}}
1848 @code{.data} tells @code{as} to assemble the following statements onto the
1849 end of the data subsegment numbered @var{subsegment} (which is an
1850 absolute expression). If @var{subsegment} is omitted, it defaults
1851 to zero.
1852
1853 @node Desc, Double, Data, Pseudo Ops
1854 @section @code{.desc @var{symbol}, @var{abs-expression}}
1855 This directive sets the descriptor of the symbol (@pxref{Symbol Attributes})
1856 to the low 16 bits of an absolute expression.
1857
1858 @node Double, Else, Desc, Pseudo Ops
1859 @section @code{.double @var{flonums}}
1860 @code{.double} expects zero or more flonums, separated by commas. It assembles
1861 floating point numbers.
1862 _if__(_ALL_ARCH__)
1863 The exact kind of floating point numbers emitted depends on how
1864 @code{as} is configured. @xref{Machine Dependent}.
1865 _fi__(_ALL_ARCH__)
1866 _if__(_AMD29K__)
1867 On the AMD 29K family the floating point format used is IEEE.
1868 _fi__(_AMD29K__)
1869
1870 @node Else, End, Double, Pseudo Ops
1871 @section @code{.else}
1872 @code{.else} is part of the @code{as} support for conditional assembly;
1873 @pxref{If}. It marks the beginning of a section of code to be assembled
1874 if the condition for the preceding @code{.if} was false.
1875
1876 @ignore
1877 @node End, Endif, Else, Pseudo Ops
1878 @section @code{.end}
1879 This doesn't do anything---but isn't an s_ignore, so I suspect it's
1880 meant to do something eventually (which is why it isn't documented here
1881 as "for compatibility with blah").
1882 @end ignore
1883
1884 @node Endif, Equ, End, Pseudo Ops
1885 @section @code{.endif}
1886 @code{.endif} is part of the @code{as} support for conditional assembly;
1887 it marks the end of a block of code that is only assembled
1888 conditionally. @xref{If}.
1889
1890 @node Equ, Extern, Endif, Pseudo Ops
1891 @section @code{.equ @var{symbol}, @var{expression}}
1892
1893 This directive sets the value of @var{symbol} to @var{expression}.
1894 It is synonymous with @samp{.set}; @pxref{Set}.
1895
1896 @node Extern, Fill, Equ, Pseudo Ops
1897 @section @code{.extern}
1898 @code{.extern} is accepted in the source program---for compatibility
1899 with other assemblers---but it is ignored. GNU @code{as} treats
1900 all undefined symbols as external.
1901
1902 @node Fill, Float, Extern, Pseudo Ops
1903 @section @code{.fill @var{repeat} , @var{size} , @var{value}}
1904 @var{repeat}, @var{size} and @var{value} are absolute expressions.
1905 This emits @var{repeat} copies of @var{size} bytes. @var{Repeat}
1906 may be zero or more. @var{Size} may be zero or more, but if it is
1907 more than 8, then it is deemed to have the value 8, compatible with
1908 other people's assemblers. The contents of each repetition
1909 are taken from an 8-byte number. The highest order 4 bytes are
1910 zero. The lowest order 4 bytes are @var{value} rendered in the
1911 byte-order of an integer on the computer @code{as} is assembling for.
1912 Each @var{size} bytes in a repetition is taken from the lowest order
1913 @var{size} bytes of this number. Again, this bizarre behavior is
1914 compatible with other people's assemblers.
1915
1916 @var{Size} and @var{value} are optional.
1917 If the second comma and @var{value} are absent, @var{value} is
1918 assumed zero. If the first comma and following tokens are absent,
1919 @var{size} is assumed to be 1.
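
A few simple cases, chosen so that the result does not depend on byte
order:

@example
        .fill 16            /* 16 bytes of zero (size 1, value 0) */
        .fill 4, 1, 0xff    /* 4 bytes, each 0xff */
        .fill 8, 2          /* 8 two-byte units of zero: 16 bytes */
@end example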
1920
1921 @node Float, Global, Fill, Pseudo Ops
1922 @section @code{.float @var{flonums}}
1923 This directive assembles zero or more flonums, separated by commas. It
1924 has the same effect as @code{.single}.
1925 _if__(_ALL_ARCH__)
1926 The exact kind of floating point numbers emitted depends on how
1927 @code{as} is configured.
1928 @xref{Machine Dependent}.
1929 _fi__(_ALL_ARCH__)
1930 _if__(_AMD29K__)
1931 The floating point format used for the AMD 29K family is IEEE.
1932 _fi__(_AMD29K__)
1933
1934 @node Global, Ident, Float, Pseudo Ops
1935 @section @code{.global @var{symbol}}, @code{.globl @var{symbol}}
1936 @code{.global} makes the symbol visible to @code{ld}. If you define
1937 @var{symbol} in your partial program, its value is made available to
1938 other partial programs that are linked with it. Otherwise,
1939 @var{symbol} will take its attributes from a symbol of the same name
1940 from another partial program it is linked with.
1941
1942 This is done by setting the @code{N_EXT} bit of that symbol's type byte
1943 to 1. @xref{Symbol Attributes}.
1944
1945 Both spellings (@samp{.globl} and @samp{.global}) are accepted, for
1946 compatibility with other assemblers.
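
A minimal sketch, using the hypothetical symbols @code{entry} and
@code{helper}:

@example
        .globl entry        /* sets N_EXT for entry */
        .text
entry:  .long 0
        .long helper        /* undefined here; left for ld to resolve */
@end example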
1947
1948 @node Ident, If, Global, Pseudo Ops
1949 @section @code{.ident}
1950 This directive is used by some assemblers to place tags in object files.
1951 GNU @code{as} simply accepts the directive for source-file
1952 compatibility with such assemblers, but does not actually emit anything
1953 for it.
1954
1955 @node If, Include, Ident, Pseudo Ops
1956 @section @code{.if @var{absolute expression}}
1957 @code{.if} marks the beginning of a section of code which is only
1958 considered part of the source program being assembled if the argument
1959 (which must be an @var{absolute expression}) is non-zero. The end of
1960 the conditional section of code must be marked by @code{.endif}
1961 (@pxref{Endif}); optionally, you may include code for the
1962 alternative condition, flagged by @code{.else} (@pxref{Else}).
1963
1964 The following variants of @code{.if} are also supported:
1965 @table @code
1966 @item .ifdef @var{symbol}
1967 Assembles the following section of code if the specified @var{symbol}
1968 has been defined.
1969
1970 @ignore
1971 @item ifeqs
1972 BOGONS??
1973 @end ignore
1974
1975 @item .ifndef @var{symbol}
1976 @itemx .ifnotdef @var{symbol}
1977 Assembles the following section of code if the specified @var{symbol}
1978 has not been defined. Both spelling variants are equivalent.
1979
1980 @ignore
1981 @item ifnes
1982 NO bogons, I presume?
1983 @end ignore
1984 @end table
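
The following sketch shows how the conditional directives fit together.
The symbol @code{DEBUG} is an arbitrary name chosen for illustration:

@example
        .set   DEBUG, 1
        .if    DEBUG
        .asciz "debug build"
        .else
        .asciz "release build"
        .endif

        .ifdef DEBUG
        .byte  1
        .endif
@end example

Here the first @code{.asciz} is assembled because @code{DEBUG} is
non-zero, and the @code{.byte} is assembled because @code{DEBUG} has been
defined.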
1985
1986 @node Include, Int, If, Pseudo Ops
1987 @section @code{.include "@var{file}"}
1988 This directive provides a way to include supporting files at specified
1989 points in your source program. The code from @var{file} is assembled as
1990 if it followed the point of the @code{.include}; when the end of the
1991 included file is reached, assembly of the original file continues. You
1992 can control the search paths used with the @samp{-I} command-line option
1993 (@pxref{Options}). Quotation marks are required around @var{file}.
1994
1995 @node Int, Lcomm, Include, Pseudo Ops
1996 @section @code{.int @var{expressions}}
1997 Expect zero or more @var{expressions}, of any segment, separated by
1998 commas. For each expression, emit a 32-bit number that will, at run
1999 time, be the value of that expression. The byte order of the
2000 expression depends on what kind of computer will run the program.
2001
2002 @node Lcomm, Line, Int, Pseudo Ops
2003 @section @code{.lcomm @var{symbol} , @var{length}}
2004 Reserve @var{length} (an absolute expression) bytes for a local
2005 common denoted by @var{symbol}. The segment and value of @var{symbol} are
2006 those of the new local common. The addresses are allocated in the
2007 bss segment, so at run-time the bytes will start off zeroed.
2008 @var{Symbol} is not declared global (@pxref{Global}), so is normally
2009 not visible to @code{ld}.
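
For example, with a hypothetical symbol @code{scratch}:

@example
        .lcomm scratch, 64  /* 64 zeroed bytes in bss, not global */
        .text
        .long  scratch      /* refers to the reserved area */
@end example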
2010
2011 _if__(!_AMD29K__)
2012 @node Line, Ln, Lcomm, Pseudo Ops
2013 @section @code{.line @var{line-number}}, @code{.ln @var{line-number}}
2014 @code{.line}, and its alternate spelling @code{.ln}, tell
2015 _fi__(!_AMD29K__)
2016 _if__(_AMD29K__)
2017 @node Ln, List, Line, Pseudo Ops
2018 @section @code{.ln @var{line-number}}
2019 Tell
2020 _fi__(_AMD29K__)
2021 @code{as} to change the logical line number. @var{line-number} must be
2022 an absolute expression. The next line will have that logical line
2023 number. So any other statements on the current line (after a statement
2024 separator character
2025 _if__(_AMD29K__)
2026 @samp{@@})
2027 _fi__(_AMD29K__)
2028 _if__(!_AMD29K__)
2029 @code{;})
2030 _fi__(!_AMD29K__)
2031 will be reported as on logical line number
2032 @var{line-number} @minus{} 1.
2033 One day this directive will be unsupported: it is used only
2034 for compatibility with existing assembler programs. @refill
2035
2036 @node List, Long, Ln, Pseudo Ops
2037 @section @code{.list} and related directives
2038 GNU @code{as} ignores the directives @code{.list}, @code{.nolist},
2039 @code{.eject}, @code{.lflags}, @code{.title}, @code{.sbttl}; however,
2040 they're accepted for compatibility with assemblers that use them.
2041
2042 @node Long, Lsym, List, Pseudo Ops
2043 @section @code{.long @var{expressions}}
2044 @code{.long} is the same as @samp{.int}; @pxref{Int}.
2045
2046 @node Lsym, Octa, Long, Pseudo Ops
2047 @section @code{.lsym @var{symbol}, @var{expression}}
2048 @code{.lsym} creates a new symbol named @var{symbol}, but does not put it in
2049 the hash table, ensuring it cannot be referenced by name during the
2050 rest of the assembly. This sets the attributes of the symbol to be
2051 the same as the expression value:
2052 @example
2053 @var{other} = @var{descriptor} = 0
2054 @var{type} = @r{(segment of @var{expression})}
2055 N_EXT = 0
2056 @var{value} = @var{expression}
2057 @end example
2058
2059 @node Octa, Org, Lsym, Pseudo Ops
2060 @section @code{.octa @var{bignums}}
2061 This directive expects zero or more bignums, separated by commas. For each
2062 bignum, it emits a 16-byte integer.
2063
2064 The term ``octa'' comes from contexts in which a ``word'' was two bytes;
2065 hence @emph{octa}-word for 16 bytes.
2066
2067 @node Org, Quad, Octa, Pseudo Ops
2068 @section @code{.org @var{new-lc} , @var{fill}}
2069
2070 @code{.org} will advance the location counter of the current segment to
2071 @var{new-lc}. @var{new-lc} is either an absolute expression or an
2072 expression with the same segment as the current subsegment. That is,
2073 you can't use @code{.org} to cross segments: if @var{new-lc} has the
2074 wrong segment, the @code{.org} directive is ignored. To be compatible
2075 with former assemblers, if the segment of @var{new-lc} is absolute,
2076 @code{as} will issue a warning, then pretend the segment of @var{new-lc}
2077 is the same as the current subsegment.
2078
2079 @code{.org} may only increase the location counter, or leave it
2080 unchanged; you cannot use @code{.org} to move the location counter
2081 backwards.
2082
2083 @c double negative used below "not undefined" because this is a specific
2084 @c reference to "undefined" (as SEG_UNKNOWN is called in this manual)
2085 @c segment. pesch@cygnus.com 18feb91
2086 Because @code{as} tries to assemble programs in one pass, @var{new-lc}
2087 may not be undefined. If you really detest this restriction we eagerly await
2088 a chance to share your improved assembler.
2089
2090 Beware that the origin is relative to the start of the segment, not
2091 to the start of the subsegment. This is compatible with other
2092 people's assemblers.
2093
2094 When the location counter (of the current subsegment) is advanced, the
2095 intervening bytes are filled with @var{fill} which should be an
2096 absolute expression. If the comma and @var{fill} are omitted,
2097 @var{fill} defaults to zero.
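
As a sketch, assuming the hypothetical label @code{base} is in the text
segment and @code{.long} emits 4 bytes:

@example
        .text
base:   .long 1
        .org  base+16, 0xff   /* advance to base+16, filling with 0xff */
        .long 2
@end example

Because @samp{base+16} is an expression in the text segment, it satisfies
the same-segment requirement described above. Under these assumptions the
12 intervening bytes are filled with @code{0xff}.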
2098
2099 @node Quad, Set, Org, Pseudo Ops
2100 @section @code{.quad @var{bignums}}
2101 @code{.quad} expects zero or more bignums, separated by commas. For
2102 each bignum, it emits an 8-byte integer. If the bignum won't fit in 8
2103 bytes, it prints a warning message and takes only the lowest order 8
2104 bytes of the bignum.
2105
2106 The term ``quad'' comes from contexts in which a ``word'' was two bytes;
2107 hence @emph{quad}-word for 8 bytes.
2108
2109 @node Set, Short, Quad, Pseudo Ops
2110 @section @code{.set @var{symbol}, @var{expression}}
2111
2112 This directive sets the value of @var{symbol} to @var{expression}. This
2113 will change @var{symbol}'s value and type to conform to
2114 @var{expression}. If @code{N_EXT} is set, it remains set.
2115 (@xref{Symbol Attributes}.)
2116
2117 You may @code{.set} a symbol many times in the same assembly.
2118 If the expression's segment is unknowable during pass 1, a second
2119 pass over the source program would be necessary. The second pass is
2120 currently not implemented, so @code{as} will abort with an error
2121 message if one is required.
2122
2123 If you @code{.set} a global symbol, the value stored in the object
2124 file is the last value stored into it.
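
A short sketch of setting a symbol more than once (the name @code{count}
is hypothetical):

@example
        .set  count, 4        /* count is 4 from here on */
        .long count
        .set  count, count+1  /* count is now 5 */
        .long count
@end example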
2125
2126 @node Short, Single, Set, Pseudo Ops
2127 @section @code{.short @var{expressions}}
2128 _if__(! (_SPARC__ || _AMD29K__) )
2129 @code{.short} is the same as @samp{.word}. @xref{Word}.
2130 _fi__(! (_SPARC__ || _AMD29K__) )
2131 _if__(_SPARC__ || _AMD29K__)
2132 This expects zero or more @var{expressions}, and emits
2133 a 16 bit number for each.
2134 _fi__(_SPARC__ || _AMD29K__)
2135
2136 @node Single, Space, Short, Pseudo Ops
2137 @section @code{.single @var{flonums}}
2138 This directive assembles zero or more flonums, separated by commas. It
2139 has the same effect as @code{.float}.
2140 _if__(_ALL_ARCH__)
2141 The exact kind of floating point numbers emitted depends on how
2142 @code{as} is configured. @xref{Machine Dependent}.
2143 _fi__(_ALL_ARCH__)
2144 _if__(_AMD29K__)
2145 The floating point format used for the AMD 29K family is IEEE.
2146 _fi__(_AMD29K__)
2147
2148
2149 @node Space, Stab, Single, Pseudo Ops
2150 _if__(!_AMD29K__)
2151 @section @code{.space @var{size} , @var{fill}}
2152 This directive emits @var{size} bytes, each of value @var{fill}. Both
2153 @var{size} and @var{fill} are absolute expressions. If the comma
2154 and @var{fill} are omitted, @var{fill} is assumed to be zero.
2155 _fi__(!_AMD29K__)
2156
2157 _if__(_AMD29K__)
2158 @section @code{.space}
2159 This directive is ignored; it is accepted for compatibility with other
2160 AMD 29K assemblers.
2161
2162 @quotation
2163 @emph{Warning:} In other versions of GNU @code{as}, the directive
2164 @code{.space} has the effect of @code{.block}. @xref{Machine Directives}.
2165 @end quotation
2166 _fi__(_AMD29K__)
2167
2168 @node Stab, Text, Space, Pseudo Ops
2169 @section @code{.stabd, .stabn, .stabs}
2170 There are three directives that begin @samp{.stab}.
2171 All emit symbols (@pxref{Symbols}), for use by symbolic debuggers.
2172 The symbols are not entered in @code{as}' hash table: they
2173 cannot be referenced elsewhere in the source file.
2174 Up to five fields are required:
2175 @table @var
2176 @item string
2177 This is the symbol's name. It may contain any character except @samp{\000},
2178 so is more general than ordinary symbol names. Some debuggers used to
2179 code arbitrarily complex structures into symbol names using this field.
2180 @item type
2181 An absolute expression. The symbol's type is set to the low 8
2182 bits of this expression.
2183 Any bit pattern is permitted, but @code{ld} and debuggers will choke on
2184 silly bit patterns.
2185 @item other
2186 An absolute expression.
2187 The symbol's ``other'' attribute is set to the low 8 bits of this expression.
2188 @item desc
2189 An absolute expression.
2190 The symbol's descriptor is set to the low 16 bits of this expression.
2191 @item value
2192 An absolute expression which becomes the symbol's value.
2193 @end table
2194
2195 If a warning is detected while reading a @code{.stabd}, @code{.stabn},
2196 or @code{.stabs} statement, the symbol has probably already been created
2197 and you will get a half-formed symbol in your object file. This is
2198 compatible with earlier assemblers!
2199
2200 @table @code
2201 @item .stabd @var{type} , @var{other} , @var{desc}
2202
2203 The ``name'' of the symbol generated is not even an empty string.
2204 It is a null pointer, for compatibility. Older assemblers used a
2205 null pointer so they didn't waste space in object files with empty
2206 strings.
2207
2208 The symbol's value is set to the location counter,
2209 relocatably. When your program is linked, the value of this symbol
2210 will be where the location counter was when the @code{.stabd} was
2211 assembled.
2212
2213 @item .stabn @var{type} , @var{other} , @var{desc} , @var{value}
2214
2215 The name of the symbol is set to the empty string @code{""}.
2216
2217 @item .stabs @var{string} , @var{type} , @var{other} , @var{desc} , @var{value}
2218
2219 All five fields are specified.
2220 @end table
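
The three forms might be used as follows. The type numbers are not
defined by this manual; 100 and 68 are assumed here to be the
conventional @code{a.out} stab codes for a source file name and a source
line, and the file name is made up:

@example
        .stabs "main.c", 100, 0, 0, 0   /* all five fields given */
        .stabd 68, 0, 12                /* no name; value is the location counter */
        .stabn 68, 0, 13, 0             /* empty name; explicit value */
@end example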
2221
2222 @node Text, Word, Stab, Pseudo Ops
2223 @section @code{.text @var{subsegment}}
2224 Tells @code{as} to assemble the following statements onto the end of
2225 the text subsegment numbered @var{subsegment}, which is an absolute
2226 expression. If @var{subsegment} is omitted, subsegment number zero
2227 is used.
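
A sketch of how statements accumulate in separate subsegments (the
values are arbitrary):

@example
        .text 0
        .long 1             /* goes into text subsegment 0 */
        .text 1
        .long 3             /* goes into text subsegment 1 */
        .text 0
        .long 2             /* appended to subsegment 0, after the 1 */
@end example

The @code{.data} directive selects data subsegments in the same way.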
2228
2229 @node Word, Deprecated, Text, Pseudo Ops
2230 @section @code{.word @var{expressions}}
2231 This directive expects zero or more @var{expressions}, of any segment,
2232 separated by commas.
2233 _if__(_SPARC__ || _AMD29K__)
2234 For each expression, @code{as} emits a 32-bit number.
2235 _fi__(_SPARC__ || _AMD29K__)
2236 _if__(! (_SPARC__ || _AMD29K__) )
2237 For each expression, @code{as} emits a 16-bit number.
2238 _fi__(! (_SPARC__ || _AMD29K__) )
2239
2240 _if__(_ALL_ARCH__)
2241 The byte order of the expression depends on what kind of computer will
2242 run the program.
2243 _fi__(_ALL_ARCH__)
2244
2245 @c on the 29k the "special treatment to support compilers" doesn't
2246 @c happen---32-bit addressability, period; no long/short jumps.
2247 _if__(!_AMD29K__)
2248 @subsection Special Treatment to support Compilers
2249
2250 In order to assemble compiler output into something that will work,
2251 @code{as} will occasionally do strange things to @samp{.word} directives.
2252 Directives of the form @samp{.word sym1-sym2} are often emitted by
2253 compilers as part of jump tables. Therefore, when @code{as} assembles a
2254 directive of the form @samp{.word sym1-sym2}, and the difference between
2255 @code{sym1} and @code{sym2} does not fit in 16 bits, @code{as} will
2256 create a @dfn{secondary jump table}, immediately before the next label.
2257 This @var{secondary jump table} will be preceded by a short-jump to the
2258 first byte after the secondary table. This short-jump prevents the flow
2259 of control from accidentally falling into the new table. Inside the
2260 table will be a long-jump to @code{sym2}. The original @samp{.word}
2261 will contain @code{sym1} minus the address of the long-jump to
2262 @code{sym2}.
2263
2264 If there were several occurrences of @samp{.word sym1-sym2} before the
2265 secondary jump table, all of them will be adjusted. If there was a
2266 @samp{.word sym3-sym4} that also did not fit in sixteen bits, a
2267 long-jump to @code{sym4} will be included in the secondary jump table,
2268 and the @code{.word} directives will be adjusted to contain @code{sym3}
2269 minus the address of the long-jump to @code{sym4}; and so on, for as many
2270 entries in the original jump table as necessary.
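
As a schematic illustration only: @code{tramp} and @code{after} below are
hypothetical names for locations that @code{as} generates internally, and
@var{short-jump} and @var{long-jump} stand for whatever jump instructions
the target machine provides. A directive @samp{.word sym1-sym2} whose
difference does not fit in 16 bits might end up laid out roughly like
this, with the secondary jump table placed just before @var{next-label}:

@example
        .word   sym1-tramp
        @dots{}
        @var{short-jump} after
tramp:  @var{long-jump} sym2
after:
@var{next-label}:
@end example

@noindent
The @samp{.word} now holds @code{sym1} minus the address of the
long-jump, and the short-jump keeps control from falling into the table.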
2271
2272 _if__(_INTERNALS__)
2273 @emph{This feature may be disabled by compiling @code{as} with the
2274 @samp{-DWORKING_DOT_WORD} option.} This feature is likely to confuse
2275 assembly language programmers.
2276 _fi__(_INTERNALS__)
2277 _fi__(!_AMD29K__)
2278
2279 @node Deprecated, Machine Dependent, Word, Pseudo Ops
2280 @section Deprecated Directives
2281 One day these directives won't work.
2282 They are included for compatibility with older assemblers.
2283 @table @t
2284 @item .abort
2285 @item .app-file
2286 @item .line
2287 @end table
2288
2289 @node Machine Dependent, Machine Dependent, Pseudo Ops, Top
2290 _if__(_ALL_ARCH__)
2291 @chapter Machine Dependent Features
2292 _fi__(_ALL_ARCH__)
2293
2294 _if__(_VAX__ && !_ALL_ARCH__)
2295 @chapter Machine Dependent Features: VAX
2296 _fi__(_VAX__ && !_ALL_ARCH__)
2297 _if__(_ALL_ARCH__)
2298 @section Vax
2299 _fi__(_ALL_ARCH__)
2300 _if__(_VAX__)
2301 @subsection Options
2302
2303 The Vax version of @code{as} accepts any of the following options,
2304 gives a warning message that the option was ignored and proceeds.
2305 These options are for compatibility with scripts designed for other
2306 people's assemblers.
2307
2308 @table @asis
2309 @item @kbd{-D} (Debug)
2310 @itemx @kbd{-S} (Symbol Table)
2311 @itemx @kbd{-T} (Token Trace)
2312 These are obsolete options used to debug old assemblers.
2313
2314 @item @kbd{-d} (Displacement size for JUMPs)
2315 This option expects a number following the @kbd{-d}. Like options
2316 that expect filenames, the number may immediately follow the
2317 @kbd{-d} (old standard) or constitute the whole of the command line
2318 argument that follows @kbd{-d} (GNU standard).
2319
2320 @item @kbd{-V} (Virtualize Interpass Temporary File)
2321 Some other assemblers use a temporary file. This option
2322 commanded them to keep the information in active memory rather
2323 than in a disk file. @code{as} always does this, so this
2324 option is redundant.
2325
2326 @item @kbd{-J} (JUMPify Longer Branches)
2327 Many 32-bit computers permit a variety of branch instructions
2328 to do the same job. Some of these instructions are short (and
2329 fast) but have a limited range; others are long (and slow) but
2330 can branch anywhere in virtual memory. Often there are 3
2331 flavors of branch: short, medium and long. Some other
2332 assemblers would emit short and medium branches, unless told by
2333 this option to emit short and long branches.
2334
2335 @item @kbd{-t} (Temporary File Directory)
2336 Some other assemblers may use a temporary file, and this option
2337 takes a filename naming the directory in which to put the temporary
2338 file. @code{as} does not use a temporary disk file, so this
2339 option makes no difference. @kbd{-t} needs exactly one
2340 filename.
2341 @end table
2342
2343 The Vax version of the assembler accepts two options when
2344 compiled for VMS. They are @kbd{-h}, and @kbd{-+}. The
2345 @kbd{-h} option prevents @code{as} from modifying the
2346 symbol-table entries for symbols that contain lowercase
2347 characters (I think). The @kbd{-+} option causes @code{as} to
2348 print warning messages if the FILENAME part of the object file,
2349 or any symbol name is longer than 31 characters. The @kbd{-+}
2350 option also inserts some code following the @samp{_main}
2351 symbol so that the object file will be compatible with Vax-11
2352 "C".
2353
2354 @subsection Floating Point
2355 Conversion of flonums to floating point is correct, and
2356 compatible with previous assemblers. Rounding is
2357 towards zero if the remainder is exactly half the least significant bit.
2358
2359 @code{D}, @code{F}, @code{G} and @code{H} floating point formats
2360 are understood.
2361
2362 Immediate floating literals (@emph{e.g.} @samp{S`$6.9})
2363 are rendered correctly. Again, rounding is towards zero in the
2364 boundary case.
2365
2366 The @code{.float} directive produces @code{f} format numbers.
2367 The @code{.double} directive produces @code{d} format numbers.
2368
2369 @subsection Machine Directives
2370 The Vax version of the assembler supports four directives for
2371 generating Vax floating point constants. They are described in the
2372 table below.
2373
2374 @table @code
2375 @item .dfloat
2376 This expects zero or more flonums, separated by commas, and
2377 assembles Vax @code{d} format 64-bit floating point constants.
2378
2379 @item .ffloat
2380 This expects zero or more flonums, separated by commas, and
2381 assembles Vax @code{f} format 32-bit floating point constants.
2382
2383 @item .gfloat
2384 This expects zero or more flonums, separated by commas, and
2385 assembles Vax @code{g} format 64-bit floating point constants.
2386
2387 @item .hfloat
2388 This expects zero or more flonums, separated by commas, and
2389 assembles Vax @code{h} format 128-bit floating point constants.
2390
2391 @end table
2392
2393 @subsection Opcodes
2394 All DEC mnemonics are supported. Beware that @code{case@dots{}}
2395 instructions have exactly 3 operands. The dispatch table that
2396 follows the @code{case@dots{}} instruction should be made with
2397 @code{.word} statements. This is compatible with all Unix
2398 assemblers we know of.
2399
2400 @subsection Branch Improvement
2401 Certain pseudo opcodes are permitted. They are for branch
2402 instructions. They expand to the shortest branch instruction that
2403 will reach the target. Generally these mnemonics are made by
2404 substituting @samp{j} for @samp{b} at the start of a DEC mnemonic.
2405 This feature is included both for compatibility and to help
2406 compilers. If you don't need this feature, don't use these
2407 opcodes. Here are the mnemonics, and the code they can expand into.
2408
2409 @table @code
2410 @item jbsb
2411 @samp{Jsb} is already an instruction mnemonic, so we chose @samp{jbsb}.
2412 @table @asis
2413 @item (byte displacement)
2414 @kbd{bsbb @dots{}}
2415 @item (word displacement)
2416 @kbd{bsbw @dots{}}
2417 @item (long displacement)
2418 @kbd{jsb @dots{}}
2419 @end table
2420 @item jbr
2421 @itemx jr
2422 Unconditional branch.
2423 @table @asis
2424 @item (byte displacement)
2425 @kbd{brb @dots{}}
2426 @item (word displacement)
2427 @kbd{brw @dots{}}
2428 @item (long displacement)
2429 @kbd{jmp @dots{}}
2430 @end table
2431 @item j@var{COND}
2432 @var{COND} may be any one of the conditional branches
2433 @code{neq nequ eql eqlu gtr geq lss gtru lequ vc vs gequ cc lssu cs}.
2434 @var{COND} may also be one of the bit tests
2435 @code{bs bc bss bcs bsc bcc bssi bcci lbs lbc}.
2436 @var{NOTCOND} is the opposite condition to @var{COND}.
2437 @table @asis
2438 @item (byte displacement)
2439 @kbd{b@var{COND} @dots{}}
2440 @item (word displacement)
2441 @kbd{b@var{NOTCOND} foo ; brw @dots{} ; foo:}
2442 @item (long displacement)
2443 @kbd{b@var{NOTCOND} foo ; jmp @dots{} ; foo:}
2444 @end table
2445 @item jacb@var{X}
2446 @var{X} may be one of @code{b d f g h l w}.
2447 @table @asis
2448 @item (word displacement)
2449 @kbd{@var{OPCODE} @dots{}}
2450 @item (long displacement)
2451 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: jmp @dots{} ; bar:}
2452 @end table
2453 @item jaob@var{YYY}
2454 @var{YYY} may be one of @code{lss leq}.
2455 @item jsob@var{ZZZ}
2456 @var{ZZZ} may be one of @code{geq gtr}.
2457 @table @asis
2458 @item (byte displacement)
2459 @kbd{@var{OPCODE} @dots{}}
2460 @item (word displacement)
2461 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: brw @var{destination} ; bar:}
2462 @item (long displacement)
2463 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: jmp @var{destination} ; bar: }
2464 @end table
2465 @item aobleq
2466 @itemx aoblss
2467 @itemx sobgeq
2468 @itemx sobgtr
2469 @table @asis
2470 @item (byte displacement)
2471 @kbd{@var{OPCODE} @dots{}}
2472 @item (word displacement)
2473 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: brw @var{destination} ; bar:}
2474 @item (long displacement)
2475 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: jmp @var{destination} ; bar:}
2476 @end table
2477 @end table
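
As a sketch of the @code{j@var{COND}} case, take @var{COND} to be
@code{neq}, whose opposite condition is @code{eql}. The labels
@code{target} and @code{skip} are hypothetical; @code{skip} stands for a
location that @code{as} would generate internally. Depending on how far
away @code{target} is, @samp{jneq target} could be assembled as any one
of the following three fragments (byte, word and long displacement, in
that order):

@example
        bneq    target

        beql    skip
        brw     target
skip:

        beql    skip
        jmp     target
skip:
@end example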
2478
2479 @subsection Operands
2480 The immediate character is @samp{$} for Unix compatibility, not
2481 @samp{#} as DEC writes it.
2482
2483 The indirect character is @samp{*} for Unix compatibility, not
2484 @samp{@@} as DEC writes it.
2485
2486 The displacement sizing character is @samp{`} (an accent grave) for
2487 Unix compatibility, not @samp{^} as DEC writes it. The letter
2488 preceding @samp{`} may have either case. @samp{G} is not
2489 understood, but all other letters (@code{b i l s w}) are understood.
2490
2491 Register names understood are @code{r0 r1 r2 @dots{} r15 ap fp sp
2492 pc}. Any case of letters will do.
2493
2494 For instance
2495 @example
2496 tstb *w`$4(r5)
2497 @end example
2498
2499 Any expression is permitted in an operand. Operands are comma
2500 separated.
2501
2502 @c There is some bug to do with recognizing expressions
2503 @c in operands, but I forget what it is. It is
2504 @c a syntax clash because () is used as an address mode
2505 @c and to encapsulate sub-expressions.
2506 @subsection Not Supported
2507 Vax bit fields cannot be assembled with @code{as}. Someone
2508 can add the required code if they really need it.
2509 _fi__(_VAX__)
2510
2511 _if__(_AMD29K__ && !_ALL_ARCH__)
2512 @chapter Machine Dependent Features: AMD 29K
2513 _fi__(_AMD29K__ && !_ALL_ARCH__)
2514 _if__(_AMD29K__)
2515 @node Machine Options, Machine Syntax, Machine Dependent, Machine Dependent
2516 @section Options
2517 GNU @code{as} has no additional command-line options for the AMD
2518 29K family.
2519
2520 @node Machine Syntax, Floating Point, Machine Options, Machine Dependent
2521 @section Syntax
2522 @subsection Special Characters
2523 @samp{;} is the line comment character.
2524
2525 @samp{@@} can be used instead of a newline to separate statements.
2526
2527 The character @samp{?} is permitted in identifiers (but may not begin
2528 an identifier).
2529
2530 @subsection Register Names
2531 General-purpose registers are represented by predefined symbols of the
2532 form @samp{GR@var{nnn}} (for global registers) or @samp{LR@var{nnn}}
2533 (for local registers), where @var{nnn} represents a number between
2534 @code{0} and @code{127}, written with no leading zeros. The leading
2535 letters may be in either upper or lower case; for example, @samp{gr13}
2536 and @samp{LR7} are both valid register names.
2537
2538 You may also refer to general-purpose registers by specifying the
2539 register number as the result of an expression (prefixed with @samp{%%}
2540 to flag the expression as a register number):
2541 @example
2542 %%@var{expression}
2543 @end example
2544 @noindent---where @var{expression} must be an absolute expression
2545 evaluating to a number between @code{0} and @code{255}. The range
2546 [0, 127] refers to global registers, and the range [128, 255] to local
2547 registers.
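
For example, given the ranges just described, each expression on the
left would be expected to name the register on the right. This is a
sketch: the second line assumes that local register @var{n} corresponds
to register number 128+@var{n}.

@example
%%13            gr13
%%135           lr7
@end example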
2548
2549 In addition, GNU @code{as} understands the following protected
2550 special-purpose register names for the AMD 29K family:
2551
2552 @example
2553 vab chd pc0
2554 ops chc pc1
2555 cps rbp pc2
2556 cfg tmc mmu
2557 cha tmr lru
2558 @end example
2559
2560 These unprotected special-purpose register names are also recognized:
2561 @example
2562 ipc alu fpe
2563 ipa bp inte
2564 ipb fc fps
2565 q cr exop
2566 @end example
2567
2568 @node Floating Point, Machine Directives, Machine Syntax, Machine Dependent
2569 @section Floating Point
2570 The AMD 29K family uses IEEE floating-point numbers.
2571
2572 @node Machine Directives, Opcodes, Floating Point, Machine Dependent
2573 @section Machine Directives
2574
2575 @menu
2576 * block:: @code{.block @var{size} , @var{fill}}
2577 * cputype:: @code{.cputype}
2578 * file:: @code{.file}
2579 * hword:: @code{.hword @var{expressions}}
2580 * line:: @code{.line}
2581 * reg:: @code{.reg @var{symbol}, @var{expression}}
2582 * sect:: @code{.sect}
2583 * use:: @code{.use @var{segment name}}
2584 @end menu
2585
2586 @node block, cputype, Machine Directives, Machine Directives
2587 @subsection @code{.block @var{size} , @var{fill}}
2588 This directive emits @var{size} bytes, each of value @var{fill}. Both
2589 @var{size} and @var{fill} are absolute expressions. If the comma
2590 and @var{fill} are omitted, @var{fill} is assumed to be zero.
2591
2592 In other versions of GNU @code{as}, this directive is called
2593 @samp{.space}.
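
For example (the size and fill value here are arbitrary):

@example
        .block  16, 0xff
        .block  16
@end example

@noindent
The first directive emits 16 bytes, each of value @code{0xff}; the
second omits the comma and @var{fill}, and so emits 16 zero bytes.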
2594
2595 @node cputype, file, block, Machine Directives
2596 @subsection @code{.cputype}
2597 This directive is ignored; it is accepted for compatibility with other
2598 AMD 29K assemblers.
2599
2600 @node file, hword, cputype, Machine Directives
2601 @subsection @code{.file}
2602 This directive is ignored; it is accepted for compatibility with other
2603 AMD 29K assemblers.
2604
2605 @quotation
2606 @emph{Warning:} in other versions of GNU @code{as}, @code{.file} is
2607 used for the directive called @code{.app-file} in the AMD 29K support.
2608 @end quotation
2609
2610 @node hword, line, file, Machine Directives
2611 @subsection @code{.hword @var{expressions}}
2612 This expects zero or more @var{expressions}, and emits
2613 a 16 bit number for each. (Synonym for @samp{.short}.)
2614
2615 @node line, reg, hword, Machine Directives
2616 @subsection @code{.line}
2617 This directive is ignored; it is accepted for compatibility with other
2618 AMD 29K assemblers.
2619
2620 @node reg, sect, line, Machine Directives
2621 @subsection @code{.reg @var{symbol}, @var{expression}}
2622 @code{.reg} has the same effect as @code{.lsym}; @pxref{Lsym}.
2623
2624 @node sect, use, reg, Machine Directives
2625 @subsection @code{.sect}
2626 This directive is ignored; it is accepted for compatibility with other
2627 AMD 29K assemblers.
2628
2629 @node use, , sect, Machine Directives
2630 @subsection @code{.use @var{segment name}}
2631 Establishes the segment and subsegment for the following code;
2632 @var{segment name} may be one of @code{.text}, @code{.data},
2633 @code{.data1}, or @code{.lit}. With one of the first three @var{segment
2634 name} options, @samp{.use} is equivalent to the machine directive
2635 @var{segment name}; the remaining case, @samp{.use .lit}, is the same as
2636 @samp{.data 200}.
2637
2638
2639 @node Opcodes, Opcodes, Machine Directives, Machine Dependent
2640 @section Opcodes
2641 GNU @code{as} implements all the standard AMD 29K opcodes. No
2642 additional pseudo-instructions are needed on this family.
2643
2644 For information on the 29K machine instruction set, see @cite{Am29000
2645 User's Manual}, Advanced Micro Devices, Inc.
2646
2647
2648 _fi__(_AMD29K__)
2649 _if__(_M680X0__ && !_ALL_ARCH__)
2650 @chapter Machine Dependent Features: Motorola 680x0
2651 _fi__(_M680X0__ && !_ALL_ARCH__)
2652 _if__(_M680X0__)
2653 @section Options
2654 The 680x0 version of @code{as} has two machine dependent options.
2655 One shortens undefined references from 32 to 16 bits, while the
2656 other is used to tell @code{as} what kind of machine it is
2657 assembling for.
2658
2659 You can use the @kbd{-l} option to shorten the size of references to
2660 undefined symbols. If the @kbd{-l} option is not given, references to
2661 undefined symbols will be a full long (32 bits) wide. (Since @code{as}
2662 cannot know where these symbols will end up, @code{as} can only allocate
2663 space for the linker to fill in later. Since @code{as} doesn't know how
2664 far away these symbols will be, it allocates as much space as it can.)
2665 If this option is given, the references will only be one word wide (16
2666 bits). This may be useful if you want the object file to be as small as
2667 possible, and you know that the relevant symbols will be less than 17
2668 bits away.
2669
2670 The 680x0 version of @code{as} is most frequently used to assemble
2671 programs for the Motorola MC68020 microprocessor. Occasionally it is
2672 used to assemble programs for the mostly similar, but slightly different
2673 MC68000 or MC68010 microprocessors. You can give @code{as} the options
2674 @samp{-m68000}, @samp{-mc68000}, @samp{-m68010}, @samp{-mc68010},
2675 @samp{-m68020}, and @samp{-mc68020} to tell it what processor is the
2676 target.
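
For example, assuming a hypothetical source file named @file{prog.s},
the following command line asks @code{as} to assemble for the MC68000
and to use 16-bit references to undefined symbols:

@example
as -m68000 -l prog.s
@end example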
2677
2678 @section Syntax
2679
2680 The 680x0 version of @code{as} uses syntax similar to the Sun assembler.
2681 Size modifiers are appended directly to the end of the opcode without an
2682 intervening period. For example, write @samp{movl} rather than
2683 @samp{move.l}.
2684
2685 _if__(_INTERNALS__)
2686 If @code{as} is compiled with SUN_ASM_SYNTAX defined, it will also allow
2687 Sun-style local labels of the form @samp{1$} through @samp{9$}.
2688 _fi__(_INTERNALS__)
2689
2690 In the following table @dfn{apc} stands for any of the address
2691 registers (@samp{a0} through @samp{a7}), nothing, (@samp{}), the
2692 Program Counter (@samp{pc}), or the zero-address relative to the
2693 program counter (@samp{zpc}).
2694
2695 The following addressing modes are understood:
2696 @table @dfn
2697 @item Immediate
2698 @samp{#@var{digits}}
2699
2700 @item Data Register
2701 @samp{d0} through @samp{d7}
2702
2703 @item Address Register
2704 @samp{a0} through @samp{a7}
2705
2706 @item Address Register Indirect
2707 @samp{a0@@} through @samp{a7@@}
2708
2709 @item Address Register Postincrement
2710 @samp{a0@@+} through @samp{a7@@+}
2711
2712 @item Address Register Predecrement
2713 @samp{a0@@-} through @samp{a7@@-}
2714
2715 @item Indirect Plus Offset
2716 @samp{@var{apc}@@(@var{digits})}
2717
2718 @item Index
2719 @samp{@var{apc}@@(@var{digits},@var{register}:@var{size}:@var{scale})}
2720 or @samp{@var{apc}@@(@var{register}:@var{size}:@var{scale})}
2721
2722 @item Postindex
2723 @samp{@var{apc}@@(@var{digits})@@(@var{digits},@var{register}:@var{size}:@var{scale})}
2724 or @samp{@var{apc}@@(@var{digits})@@(@var{register}:@var{size}:@var{scale})}
2725
2726 @item Preindex
2727 @samp{@var{apc}@@(@var{digits},@var{register}:@var{size}:@var{scale})@@(@var{digits})}
2728 or @samp{@var{apc}@@(@var{register}:@var{size}:@var{scale})@@(@var{digits})}
2729
2730 @item Memory Indirect
2731 @samp{@var{apc}@@(@var{digits})@@(@var{digits})}
2732
2733 @item Absolute
2734 @samp{@var{symbol}}, or @samp{@var{digits}}
2735 @ignore
2736 @c pesch@cygnus.com: gnu, rich concur the following needs careful
2737 @c research before documenting.
2738 , or either of the above followed
2739 by @samp{:b}, @samp{:w}, or @samp{:l}.
2740 @end ignore
2741 @end table
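
As an illustration, here are a few operands written in these modes. The
registers, offsets and scale factor are arbitrary choices, and the
opcode @samp{movl} is used only to give each operand a context. The
lines show, in order: immediate, address register indirect,
postincrement, predecrement, indirect plus offset, and index.

@example
        movl    #4,d0
        movl    a0@@,d0
        movl    a0@@+,d0
        movl    d0,a7@@-
        movl    a6@@(8),d0
        movl    a0@@(4,d1:l:2),d0
@end example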
2742
2743 @section Floating Point
2744 The floating point code is not too well tested, and may have
2745 subtle bugs in it.
2746
2747 Packed decimal (P) format floating literals are not supported.
2748 Feel free to add the code!
2749
2750 The floating point formats generated by directives are these.
2751 @table @code
2752 @item .float
2753 @code{Single} precision floating point constants.
2754 @item .double
2755 @code{Double} precision floating point constants.
2756 @end table
2757
2758 There is no directive to produce regions of memory holding
2759 extended precision numbers; however, they can be used as
2760 immediate operands to floating-point instructions. Adding a
2761 directive to create extended precision numbers would not be
2762 hard, but it has not yet seemed necessary.
2763
2764 @section Machine Directives
2765 In order to be compatible with the Sun assembler the 680x0 assembler
2766 understands the following directives.
2767 @table @code
2768 @item .data1
2769 This directive is identical to a @code{.data 1} directive.
2770 @item .data2
2771 This directive is identical to a @code{.data 2} directive.
2772 @item .even
2773 This directive is identical to a @code{.align 1} directive.
2774 @c Is this true? does it work???
2775 @item .skip
2776 This directive is identical to a @code{.space} directive.
2777 @end table
2778
2779 @section Opcodes
2780 @c pesch@cygnus.com: I don't see any point in the following
2781 @c paragraph. Bugs are bugs; how does saying this
2782 @c help anyone?
2783 @ignore
2784 Danger: Several bugs have been found in the opcode table (and
2785 fixed). More bugs may exist. Be careful when using obscure
2786 instructions.
2787 @end ignore
2788
2789 @subsection Branch Improvement
2790
2791 Certain pseudo opcodes are permitted for branch instructions.
2792 They expand to the shortest branch instruction that will reach the
2793 target. Generally these mnemonics are made by substituting @samp{j} for
2794 @samp{b} at the start of a Motorola mnemonic.
2795
2796 The following table summarizes the pseudo-operations. A @code{*} flags
2797 cases that are more fully described after the table:
2798
2799 @example
2800 Displacement
2801 +---------------------------------------------------------
2802 | 68020 68000/10
2803 Pseudo-Op |BYTE WORD LONG LONG non-PC relative
2804 +---------------------------------------------------------
2805 jbsr |bsrs bsr bsrl jsr jsr
2806 jra |bras bra bral jmp jmp
2807 * jXX |bXXs bXX bXXl bNXs;jmpl bNXs;jmp
2808 * dbXX |dbXX dbXX dbXX; bra; jmpl
2809 * fjXX |fbXXw fbXXw fbXXl fbNXw;jmp
2810
2811 XX: condition
2812 NX: negative of condition XX
2813
2814 @end example
2815 @center{@code{*}---see full description below}
2816
2817 @table @code
2818 @item jbsr
2819 @itemx jra
2820 These are the simplest jump pseudo-operations; they always map to one
2821 particular machine instruction, depending on the displacement to the
2822 branch target.
2823
2824 @item j@var{XX}
2825 Here, @samp{j@var{XX}} stands for an entire family of pseudo-operations,
2826 where @var{XX} is a conditional branch or condition-code test. The full
2827 list of pseudo-ops in this family is:
2828 @example
2829 jhi jls jcc jcs jne jeq jvc
2830 jvs jpl jmi jge jlt jgt jle
2831 @end example
2832
2833 For the cases of non-PC relative displacements and long displacements on
2834 the 68000 or 68010, @code{as} will issue a longer code fragment in terms of
2835 @var{NX}, the opposite condition to @var{XX}:
2836 @example
2837 j@var{XX} foo
2838 @end example
2839 gives
2840 @example
2841 b@var{NX}s oof
2842 jmp foo
2843 oof:
2844 @end example
2845
2846 @item db@var{XX}
2847 The full family of pseudo-operations covered here is
2848 @example
2849 dbhi dbls dbcc dbcs dbne dbeq dbvc
2850 dbvs dbpl dbmi dbge dblt dbgt dble
2851 dbf dbra dbt
2852 @end example
2853
2854 Other than for word and byte displacements, when the source reads
2855 @samp{db@var{XX} foo}, @code{as} will emit
2856 @example
2857 db@var{XX} oo1
2858 bra oo2
2859 oo1:jmpl foo
2860 oo2:
2861 @end example
2862
2863 @item fj@var{XX}
2864 This family includes
2865 @example
2866 fjne fjeq fjge fjlt fjgt fjle fjf
2867 fjt fjgl fjgle fjnge fjngl fjngle fjngt
2868 fjnle fjnlt fjoge fjogl fjogt fjole fjolt
2869 fjor fjseq fjsf fjsne fjst fjueq fjuge
2870 fjugt fjule fjult fjun
2871 @end example
2872
2873 For branch targets that are not PC relative, @code{as} emits
2874 @example
2875 fb@var{NX} oof
2876 jmp foo
2877 oof:
2878 @end example
2879 when it encounters @samp{fj@var{XX} foo}.
2880
2881 @end table
2882
2883 @subsection Special Characters
2884 The immediate character is @samp{#} for Sun compatibility. The
2885 line-comment character is @samp{|}. If a @samp{#} appears at the
2886 beginning of a line, it is treated as a comment unless it looks like
2887 @samp{# line file}, in which case it is treated normally.
2888 _fi__(_M680X0__)
2889
2890 @c pesch@cygnus.com: conditionalize, rather than ignore, when filled in.
2891 @ignore
2892 @section 32x32
2893 @section Options
2894 The 32x32 version of @code{as} accepts a @kbd{-m32032} option to
2895 specify that it is compiling for a 32032 processor, or a
2896 @kbd{-m32532} to specify that it is compiling for a 32532 processor.
2897 The default (if neither is specified) is chosen when the assembler
2898 is compiled.
2899
2900 @subsection Syntax
2901 I don't know anything about the 32x32 syntax assembled by
2902 @code{as}. Someone who understands the processor (I've never seen
2903 one) and the possible syntaxes should write this section.
2904
2905 @subsection Floating Point
2906 The 32x32 uses IEEE floating point numbers, but @code{as} will only
2907 create single or double precision values. I don't know if the 32x32
2908 understands extended precision numbers.
2909
2910 @subsection Machine Directives
2911 The 32x32 has no machine dependent directives.
2912 @end ignore
2913
2914 @c pesch@cygnus.com: stop ignoring this when "syntax" section filled in
2915 @ignore
2916 _if__(_SPARC__ && !_ALL_ARCH__)
2917 @chapter Machine Dependent Features: SPARC
2918 _fi__(_SPARC__ && !_ALL_ARCH__)
2919 @section Sparc
2920 @subsection Options
2921 The sparc has no machine dependent options.
2922
2923 @subsection syntax
2924 I don't know anything about Sparc syntax. Someone who does
2925 will have to write this section.
2926
2927 @subsection Floating Point
2928 The Sparc uses IEEE floating-point numbers.
2929
2930 @subsection Machine Directives
2931 The Sparc version of @code{as} supports the following additional
2932 machine directives:
2933
2934 @table @code
2935 @item .common
2936 This must be followed by a symbol name, a positive number, and
2937 @code{"bss"}. This behaves somewhat like @code{.comm}, but the
2938 syntax is different.
2939
2940 @item .global
2941 This is functionally identical to @code{.globl}.
2942
2943 @item .half
2944 This is functionally identical to @code{.short}.
2945
2946 @item .proc
2947 This directive is ignored. Any text following it on the same
2948 line is also ignored.
2949
2950 @item .reserve
2951 This must be followed by a symbol name, a positive number, and
2952 @code{"bss"}. This behaves somewhat like @code{.lcomm}, but the
2953 syntax is different.
2954
2955 @item .seg
2956 This must be followed by @code{"text"}, @code{"data"}, or
2957 @code{"data1"}. It behaves like @code{.text}, @code{.data}, or
2958 @code{.data 1}.
2959
2960 @item .skip
2961 This is functionally identical to the .space directive.
2962
2963 @item .word
2964 On the Sparc, the .word directive produces 32 bit values,
2965 instead of the 16 bit values it produces on every other machine.
2966
2967 @end table
2968 @end ignore
2969
2970 _if__(_I80386__ && !_ALL_ARCH__)
2971 @chapter Machine Dependent Features: Intel 80386
2972 _fi__(_I80386__ && !_ALL_ARCH__)
2973 _if__(_I80386__)
2974 @section Intel 80386
2975 @subsection Options
2976 The 80386 has no machine dependent options.
2977
2978 @subsection AT&T Syntax versus Intel Syntax
2979 In order to maintain compatibility with the output of @code{GCC},
2980 @code{as} supports AT&T System V/386 assembler syntax. This is quite
2981 different from Intel syntax. We mention these differences because
2982 almost all 80386 documents use only Intel syntax. Notable differences
2983 between the two syntaxes are:
2984 @itemize @bullet
2985 @item
2986 AT&T immediate operands are preceded by @samp{$}; Intel immediate
2987 operands are undelimited (Intel @samp{push 4} is AT&T @samp{pushl $4}).
2988 AT&T register operands are preceded by @samp{%}; Intel register operands
2989 are undelimited. AT&T absolute (as opposed to PC relative) jump/call
2990 operands are prefixed by @samp{*}; they are undelimited in Intel syntax.
2991
2992 @item
2993 AT&T and Intel syntax use the opposite order for source and destination
2994 operands. Intel @samp{add eax, 4} is @samp{addl $4, %eax}. The
2995 @samp{source, dest} convention is maintained for compatibility with
2996 previous Unix assemblers.
2997
2998 @item
2999 In AT&T syntax the size of memory operands is determined from the last
3000 character of the opcode name. Opcode suffixes of @samp{b}, @samp{w},
3001 and @samp{l} specify byte (8-bit), word (16-bit), and long (32-bit)
3002 memory references. Intel syntax accomplishes this by prefixing memory
3003 operands (@emph{not} the opcodes themselves) with @samp{byte ptr},
3004 @samp{word ptr}, and @samp{dword ptr}. Thus, Intel @samp{mov al, byte
3005 ptr @var{foo}} is @samp{movb @var{foo}, %al} in AT&T syntax.
3006
3007 @item
3008 Immediate form long jumps and calls are
3009 @samp{lcall/ljmp $@var{segment}, $@var{offset}} in AT&T syntax; the
3010 Intel syntax is
3011 @samp{call/jmp far @var{segment}:@var{offset}}. Also, the far return
3012 instruction
3013 is @samp{lret $@var{stack-adjust}} in AT&T syntax; Intel syntax is
3014 @samp{ret far @var{stack-adjust}}.
3015
3016 @item
3017 The AT&T assembler does not provide support for multiple segment
3018 programs. Unix style systems expect all programs to be single segments.
3019 @end itemize
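
The examples given in the list above can be set side by side. Here
@code{foo} is a placeholder for any memory operand:

@example
Intel syntax                    AT&T syntax

push 4                          pushl $4
add eax, 4                      addl $4, %eax
mov al, byte ptr foo            movb foo, %al
@end example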
3020
3021 @subsection Opcode Naming
3022 Opcode names are suffixed with one character modifiers which specify the
3023 size of operands. The letters @samp{b}, @samp{w}, and @samp{l} specify
3024 byte, word, and long operands. If no suffix is specified by an
3025 instruction and it contains no memory operands then @code{as} tries to
3026 fill in the missing suffix based on the destination register operand
3027 (the last one by convention). Thus, @samp{mov %ax, %bx} is equivalent
3028 to @samp{movw %ax, %bx}; also, @samp{mov $1, %bx} is equivalent to
3029 @samp{movw $1, %bx}. Note that this is incompatible with the AT&T Unix
3030 assembler which assumes that a missing opcode suffix implies long
3031 operand size. (This incompatibility does not affect compiler output
3032 since compilers always explicitly specify the opcode suffix.)
3033
3034 Almost all opcodes have the same names in AT&T and Intel format. There
3035 are a few exceptions. The sign extend and zero extend instructions need
3036 two sizes to specify them. They need a size to sign/zero extend
3037 @emph{from} and a size to zero extend @emph{to}. This is accomplished
3038 by using two opcode suffixes in AT&T syntax. Base names for sign extend
3039 and zero extend are @samp{movs@dots{}} and @samp{movz@dots{}} in AT&T
3040 syntax (@samp{movsx} and @samp{movzx} in Intel syntax). The opcode
3041 suffixes are tacked on to this base name, the @emph{from} suffix before
3042 the @emph{to} suffix. Thus, @samp{movsbl %al, %edx} is AT&T syntax for
3043 ``move sign extend @emph{from} %al @emph{to} %edx.'' Possible suffixes,
3044 thus, are @samp{bl} (from byte to long), @samp{bw} (from byte to word),
3045 and @samp{wl} (from word to long).
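
For example, combining the two base names with the suffixes just listed
gives instructions such as the following. The first is the example from
the text; the register choices in the other two are arbitrary.

@example
        movsbl  %al, %edx
        movsbw  %al, %dx
        movzwl  %ax, %ecx
@end example

@noindent
These sign-extend a byte to a long, sign-extend a byte to a word, and
zero-extend a word to a long, respectively.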
3046
3047 The Intel syntax conversion instructions
3048 @itemize @bullet
3049 @item
3050 @samp{cbw} --- sign-extend byte in @samp{%al} to word in @samp{%ax},
3051 @item
3052 @samp{cwde} --- sign-extend word in @samp{%ax} to long in @samp{%eax},
3053 @item
3054 @samp{cwd} --- sign-extend word in @samp{%ax} to long in @samp{%dx:%ax},
3055 @item
3056 @samp{cdq} --- sign-extend dword in @samp{%eax} to quad in @samp{%edx:%eax},
3057 @end itemize
3058 are called @samp{cbtw}, @samp{cwtl}, @samp{cwtd}, and @samp{cltd} in
3059 AT&T naming. @code{as} accepts either naming for these instructions.
3060
3061 Far call/jump instructions are @samp{lcall} and @samp{ljmp} in
3062 AT&T syntax, but are @samp{call far} and @samp{jmp far} in Intel
3063 convention.
3064
3065 @subsection Register Naming
3066 Register operands are always prefixed with @samp{%}. The 80386 registers
3067 consist of
3068 @itemize @bullet
3069 @item
3070 the 8 32-bit registers @samp{%eax} (the accumulator), @samp{%ebx},
3071 @samp{%ecx}, @samp{%edx}, @samp{%edi}, @samp{%esi}, @samp{%ebp} (the
3072 frame pointer), and @samp{%esp} (the stack pointer).
3073
3074 @item
3075 the 8 16-bit low-ends of these: @samp{%ax}, @samp{%bx}, @samp{%cx},
3076 @samp{%dx}, @samp{%di}, @samp{%si}, @samp{%bp}, and @samp{%sp}.
3077
3078 @item
3079 the 8 8-bit registers: @samp{%ah}, @samp{%al}, @samp{%bh},
3080 @samp{%bl}, @samp{%ch}, @samp{%cl}, @samp{%dh}, and @samp{%dl} (These
3081 are the high-bytes and low-bytes of @samp{%ax}, @samp{%bx},
3082 @samp{%cx}, and @samp{%dx})
3083
3084 @item
3085 the 6 segment registers @samp{%cs} (code segment), @samp{%ds}
3086 (data segment), @samp{%ss} (stack segment), @samp{%es}, @samp{%fs},
3087 and @samp{%gs}.
3088
3089 @item
3090 the 3 processor control registers @samp{%cr0}, @samp{%cr2}, and
3091 @samp{%cr3}.
3092
3093 @item
3094 the 6 debug registers @samp{%db0}, @samp{%db1}, @samp{%db2},
3095 @samp{%db3}, @samp{%db6}, and @samp{%db7}.
3096
3097 @item
3098 the 2 test registers @samp{%tr6} and @samp{%tr7}.
3099
3100 @item
3101 the 8 registers of the floating point register stack: @samp{%st} or equivalently
3102 @samp{%st(0)}, @samp{%st(1)}, @samp{%st(2)}, @samp{%st(3)},
3103 @samp{%st(4)}, @samp{%st(5)}, @samp{%st(6)}, and @samp{%st(7)}.
3104 @end itemize
3105
3106 @subsection Opcode Prefixes
3107 Opcode prefixes are used to modify the following opcode. They are used
3108 to repeat string instructions, to provide segment overrides, to perform
3109 bus lock operations, and to give operand and address size (16-bit
3110 operands are specified in an instruction by prefixing what would
3111 normally be 32-bit operands with an ``operand size'' opcode prefix).
3112 Opcode prefixes are usually given as single-line instructions with no
3113 operands, and must directly precede the instruction they act upon. For
3114 example, the @samp{scas} (scan string) instruction is repeated with:
3115 @example
3116 repne
3117 scas
3118 @end example
3119
3120 Here is a list of opcode prefixes:
3121 @itemize @bullet
3122 @item
3123 Segment override prefixes @samp{cs}, @samp{ds}, @samp{ss}, @samp{es},
3124 @samp{fs}, @samp{gs}. These are automatically added when you use
3125 the @var{segment}:@var{memory-operand} form for memory references.
3126
3127 @item
3128 Operand/Address size prefixes @samp{data16} and @samp{addr16}
3129 change 32-bit operands/addresses into 16-bit operands/addresses. Note
3130 that 16-bit addressing modes (i.e. 8086 and 80286 addressing modes)
3131 are not supported (yet).
3132
3133 @item
3134 The bus lock prefix @samp{lock} inhibits interrupts during
3135 execution of the instruction it precedes. (This is only valid with
3136 certain instructions; see an 80386 manual for details.)
3137
3138 @item
3139 The wait for coprocessor prefix @samp{wait} waits for the
3140 coprocessor to complete the current instruction. This should never be
3141 needed for the 80386/80387 combination.
3142
3143 @item
3144 The @samp{rep}, @samp{repe}, and @samp{repne} prefixes are added
3145 to string instructions to make them repeat @samp{%ecx} times.
3146 @end itemize
3147
3148 @subsection Memory References
3149 An Intel syntax indirect memory reference of the form
3150 @example
3151 @var{segment}:[@var{base} + @var{index}*@var{scale} + @var{disp}]
3152 @end example
3153 is translated into the AT&T syntax
3154 @example
3155 @var{segment}:@var{disp}(@var{base}, @var{index}, @var{scale})
3156 @end example
3157 where @var{base} and @var{index} are the optional 32-bit base and
3158 index registers, @var{disp} is the optional displacement, and
3159 @var{scale}, taking the values 1, 2, 4, and 8, multiplies @var{index}
3160 to calculate the address of the operand. If no @var{scale} is
3161 specified, @var{scale} is taken to be 1. @var{segment} specifies the
3162 optional segment register for the memory operand, and may override the
3163 default segment register (see an 80386 manual for segment register
3164 defaults). Note that segment overrides in AT&T syntax @emph{must}
3165 be preceded by a @samp{%}. If you specify a segment override which
3166 coincides with the default segment register, @code{as} will @emph{not}
3167 output any segment register override prefixes to assemble the given
3168 instruction. Thus, segment overrides can be specified to emphasize which
3169 segment register is used for a given memory operand.
3170
3171 Here are some examples of Intel and AT&T style memory references:
3172 @table @asis
3173
3174 @item AT&T: @samp{-4(%ebp)}, Intel: @samp{[ebp - 4]}
3175 @var{base} is @samp{%ebp}; @var{disp} is @samp{-4}. @var{segment} is
3176 missing, and the default segment is used (@samp{%ss} for addressing with
3177 @samp{%ebp} as the base register). @var{index}, @var{scale} are both missing.
3178
3179 @item AT&T: @samp{foo(,%eax,4)}, Intel: @samp{[foo + eax*4]}
3180 @var{index} is @samp{%eax} (scaled by a @var{scale} 4); @var{disp} is
3181 @samp{foo}. All other fields are missing. The segment register here
3182 defaults to @samp{%ds}.
3183
3184 @item AT&T: @samp{foo(,1)}, Intel: @samp{[foo]}
3185 This uses the value pointed to by @samp{foo} as a memory operand.
3186 Note that @var{base} and @var{index} are both missing, but there is only
3187 @emph{one} @samp{,}. This is a syntactic exception.
3188
3189 @item AT&T: @samp{%gs:foo}, Intel: @samp{gs:foo}
3190 This selects the contents of the variable @samp{foo} with segment
3191 register @var{segment} being @samp{%gs}.
3192
3193 @end table
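
The way the four fields combine can be sketched in C. This is only an
illustration of the address arithmetic; the helper name
@code{effective_offset} and its convention of passing 0 for a missing
field are assumptions, and segment translation is not modeled.

```c
#include <stdint.h>

/* Hypothetical helper: computes the offset named by
   disp(base, index, scale).  Pass 0 for a missing base or index;
   a scale of 0 stands for "not specified" and is treated as 1. */
uint32_t effective_offset(uint32_t base, uint32_t index,
                          unsigned scale, int32_t disp)
{
    if (scale == 0)
        scale = 1;
    /* 32-bit unsigned arithmetic wraps, as address arithmetic does. */
    return base + index * scale + (uint32_t)disp;
}
```

With @samp{%ebp} holding 0x1000, @samp{-4(%ebp)} would correspond to
@code{effective_offset(0x1000, 0, 0, -4)}, that is, offset 0xffc.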
3194
3195 Absolute (as opposed to PC relative) call and jump operands must be
3196 prefixed with @samp{*}. If no @samp{*} is specified, @code{as} will
3197 always choose PC relative addressing for jump/call labels.
3198
3199 Any instruction that has a memory operand @emph{must} specify its size (byte,
3200 word, or long) with an opcode suffix (@samp{b}, @samp{w}, or @samp{l},
3201 respectively).
3202
3203 @subsection Handling of Jump Instructions
3204 Jump instructions are always optimized to use the smallest possible
3205 displacements. This is accomplished by using byte (8-bit) displacement
3206 jumps whenever the target is sufficiently close. If a byte displacement
3207 is insufficient, a long (32-bit) displacement is used. We do not support
3208 word (16-bit) displacement jumps (i.e. prefixing the jump instruction
3209 with the @samp{addr16} opcode prefix), since the 80386 insists upon masking
3210 @samp{%eip} to 16 bits after the word displacement is added.
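
The size decision described above can be sketched as follows. This is
not the relaxation code in @code{as}; the helper name
@code{jump_disp_bytes} is hypothetical, and it only shows the range test
on an already known displacement.

```c
#include <stdint.h>

/* Hypothetical helper: returns the number of displacement bytes for a
   jump whose target lies disp bytes from the end of the instruction. */
int jump_disp_bytes(int32_t disp)
{
    if (disp >= -128 && disp <= 127)
        return 1;   /* byte (8-bit) displacement */
    return 4;       /* long (32-bit) displacement; word is never used */
}
```

In the assembler itself the displacement is not known up front, since
the size chosen for one jump can move the targets of others.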
3211
3212 Note that the @samp{jcxz}, @samp{jecxz}, @samp{loop}, @samp{loopz},
3213 @samp{loope}, @samp{loopnz} and @samp{loopne} instructions only come in
3214 byte displacements, so using these instructions (@code{GCC} does not
3215 use them) with an out-of-range target may cause the assembler to
3216 print an error message (and generate incorrect code). The AT&T 80386
3217 assembler tries to get around this problem by expanding @samp{jcxz foo} to
3218 @example
3219 jcxz cx_zero
3220 jmp cx_nonzero
3221 cx_zero: jmp foo
3222 cx_nonzero:
3223 @end example
3224
3225 @subsection Floating Point
3226 All 80387 floating point types except packed BCD are supported.
3227 (BCD support may be added without much difficulty). These data
3228 types are 16-, 32-, and 64-bit integers, and single (32-bit),
3229 double (64-bit), and extended (80-bit) precision floating point.
3230 Each supported type has an opcode suffix and a constructor
3231 associated with it. Opcode suffixes specify operands' data
3232 types. Constructors build these data types into memory.
3233
3234 @itemize @bullet
3235 @item
3236 Floating point constructors are @samp{.float} or @samp{.single},
3237 @samp{.double}, and @samp{.tfloat} for 32-, 64-, and 80-bit formats.
3238 These correspond to opcode suffixes @samp{s}, @samp{l}, and @samp{t}.
3239 @samp{t} stands for temporary real; the 80387 only supports
3240 this format via the @samp{fldt} (load temporary real to stack top) and
3241 @samp{fstpt} (store temporary real and pop stack) instructions.
3242
3243 @item
3244 Integer constructors are @samp{.word}, @samp{.long} or @samp{.int}, and
3245 @samp{.quad} for the 16-, 32-, and 64-bit integer formats. The corresponding
3246 opcode suffixes are @samp{s} (single), @samp{l} (long), and @samp{q}
3247 (quad). As with the temporary real format the 64-bit @samp{q} format is
3248 only present in the @samp{fildq} (load quad integer to stack top) and
3249 @samp{fistpq} (store quad integer and pop stack) instructions.
3250 @end itemize
3251
3252 Register to register operations do not require opcode suffixes,
3253 so that @samp{fst %st, %st(1)} is equivalent to @samp{fstl %st, %st(1)}.
3254
3255 Since the 80387 automatically synchronizes with the 80386, @samp{fwait}
3256 instructions are almost never needed (this is not the case for the
3257 80286/80287 and 8086/8087 combinations). Therefore, @code{as} suppresses
3258 the @samp{fwait} instruction whenever it is implicitly selected by one
3259 of the @samp{fn@dots{}} instructions. For example, @samp{fsave} and
3260 @samp{fnsave} are treated identically. In general, all the @samp{fn@dots{}}
3261 instructions are made equivalent to @samp{f@dots{}} instructions. If
3262 @samp{fwait} is desired it must be explicitly coded.
3263
3264 @subsection Notes
3265 There is some trickery concerning the @samp{mul} and @samp{imul}
3266 instructions that deserves mention. The 16-, 32-, and 64-bit expanding
3267 multiplies (base opcode @samp{0xf6}; extension 4 for @samp{mul} and 5
3268 for @samp{imul}) can be output only in the one operand form. Thus,
3269 @samp{imul %ebx, %eax} does @emph{not} select the expanding multiply;
3270 the expanding multiply would clobber the @samp{%edx} register, and this
3271 would confuse @code{GCC} output. Use @samp{imul %ebx} to get the
3272 64-bit product in @samp{%edx:%eax}.
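
To make the @samp{%edx:%eax} result concrete, here is a small C model of
the one-operand form. It is a sketch of the arithmetic only; the struct
and function names are made up for this example.

```c
#include <stdint.h>

/* Illustrative model of the one-operand form imul src:
   %eax * src gives a signed 64-bit product in %edx:%eax. */
struct edx_eax {
    uint32_t edx;   /* high 32 bits of the product */
    uint32_t eax;   /* low 32 bits of the product */
};

struct edx_eax imul_expanding(int32_t eax, int32_t src)
{
    uint64_t product = (uint64_t)((int64_t)eax * (int64_t)src);
    struct edx_eax result;

    result.edx = (uint32_t)(product >> 32);
    result.eax = (uint32_t)product;
    return result;
}
```

The high half landing in @samp{%edx} is the clobbering that the
two-operand form avoids.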
3273
3274 We have added a two operand form of @samp{imul} when the first operand
3275 is an immediate mode expression and the second operand is a register.
3276 This is just a shorthand, so that multiplying @samp{%eax} by 69, for
3277 example, can be done with @samp{imul $69, %eax} rather than @samp{imul
3278 $69, %eax, %eax}.
3279 _fi__(_I80386__)
3280
3281
3282 @c pesch@cygnus.com: we ignore the following chapters, since internals are
3283 @c changing rapidly. These may need to be moved to another
3284 @c book anyhow, if we adopt the model of user/modifier
3285 @c books.
3286 @ignore
3287 @node Maintenance, Retargeting, Machine Dependent, Top
3288 @chapter Maintaining the Assembler
3289 [[this chapter is still being built]]
3290
3291 @section Design
3292 We had these goals, in descending priority:
3293 @table @b
3294 @item Accuracy.
3295 For every program composed by a compiler, @code{as} should emit
3296 ``correct'' code. This leaves some latitude in choosing addressing
3297 modes, order of @code{relocation_info} structures in the object
3298 file, @emph{etc}.
3299
3300 @item Speed, for usual case.
3301 By far the most common use of @code{as} will be assembling compiler
3302 emissions.
3303
3304 @item Upward compatibility for existing assembler code.
3305 Well @dots{} we don't support Vax bit fields but everything else
3306 seems to be upward compatible.
3307
3308 @item Readability.
3309 The code should be maintainable with few surprises. (JF: ha!)
3310
3311 @end table
3312
3313 We assumed that disk I/O was slow and expensive while memory was
3314 fast and access to memory was cheap. We expect the in-memory data
3315 structures to be less than 10 times the size of the emitted object
3316 file. (Contrast this with the C compiler where in-memory structures
3317 might be 100 times object file size!)
3318 This suggests:
3319 @itemize @bullet
3320 @item
3321 Try to read the source file from disk only one time. For other
3322 reasons, we keep large chunks of the source file in memory during
3323 assembly so this is not a problem. Also the assembly algorithm
3324 should only scan the source text once if the compiler composed the
3325 text according to a few simple rules.
3326 @item
3327 Emit the object code bytes only once. Don't store values and then
3328 backpatch later.
3329 @item
3330 Build the object file in memory and do direct writes to disk of
3331 large buffers.
3332 @end itemize
3333
3334 RMS suggested a one-pass algorithm which seems to work well. By not
3335 parsing text during a second pass considerable time is saved on
3336 large programs (@emph{e.g.} the sort of C program @code{yacc} would
3337 emit).
3338
3339 It happened that the data structures needed to emit relocation
3340 information to the object file were neatly subsumed into the data
3341 structures that do backpatching of addresses after pass 1.
3342
3343 Many of the functions began life as re-usable modules, loosely
3344 connected. RMS changed this to gain speed. For example, input
3345 parsing routines which used to work on pre-sanitized strings now
3346 must parse raw data. Hence they have to import knowledge of the
3347 assembler's comment conventions @emph{etc}.
3348
3349 @section Deprecated Feature(?)s
3350 We have stopped supporting some features:
3351 @itemize @bullet
3352 @item
3353 @code{.org} statements must have @b{defined} expressions.
3354 @item
3355 Vax Bit fields (@kbd{:} operator) are entirely unsupported.
3356 @end itemize
3357
3358 It might be a good idea to not support these features in a future release:
3359 @itemize @bullet
3360 @item
3361 @kbd{#} should begin a comment, even in column 1.
3362 @item
3363 Why support the logical line & file concept any more?
3364 @item
3365 Subsegments are a good candidate for flushing.
3366 Depends on which compilers need them I guess.
3367 @end itemize
3368
3369 @section Bugs, Ideas, Further Work
3370 Clearly the major improvement is DON'T USE A TEXT-READING
3371 ASSEMBLER for the back end of a compiler. It is much faster to
3372 interpret binary gobbledygook from a compiler's tables than to
3373 ask the compiler to write out human-readable code just so the
3374 assembler can parse it back to binary.
3375
3376 Assuming you use @code{as} for human written programs: here are
3377 some ideas:
3378 @itemize @bullet
3379 @item
3380 Document (here) @code{APP}.
3381 @item
3382 Take advantage of knowing no spaces except after opcode
3383 to speed up @code{as}. (Modify @code{app.c} to flush useless spaces:
3384 only keep spaces/tabs at the beginning of a line or between 2
3385 symbols.)
3386 @item
3387 Put pointers in this documentation to @file{a.out} documentation.
3388 @item
3389 Split the assembler into parts so it can gobble direct binary
3390 from @emph{e.g.} @code{cc}. It is silly for @code{cc} to compose text
3391 just so @code{as} can parse it back to binary.
3392 @item
3393 Rewrite hash functions: I want a more modular, faster library.
3394 @item
3395 Clean up LOTS of code.
3396 @item
3397 Include all the non-@file{.c} files in the maintenance chapter.
3398 @item
3399 Document flonums.
3400 @item
3401 Implement flonum short literals.
3402 @item
3403 Change all talk of expression operands to expression quantities,
3404 or perhaps to expression arguments.
3405 @item
3406 Implement pass 2.
3407 @item
3408 Whenever a @code{.text} or @code{.data} statement is seen, we close
3409 off the current frag with an imaginary @code{.fill 0}. This is
3410 because we only have one obstack for frags, and we can't grow new
3411 frags for a new subsegment, then go back to the old subsegment and
3412 append bytes to the old frag. All this nonsense goes away if we
3413 give each subsegment its own obstack. It makes code simpler in
3414 about 10 places, but nobody has bothered to do it because C compiler
3415 output rarely changes subsegments (compared to ending frags with
3416 relaxable addresses, which is common).
3417 @end itemize
3418
3419 @section Sources
3420 @c The following files in the @file{as} directory
3421 @c are symbolic links to other files, of
3422 @c the same name, in a different directory.
3423 @c @itemize @bullet
3424 @c @item
3425 @c @file{atof_generic.c}
3426 @c @item
3427 @c @file{atof_vax.c}
3428 @c @item
3429 @c @file{flonum_const.c}
3430 @c @item
3431 @c @file{flonum_copy.c}
3432 @c @item
3433 @c @file{flonum_get.c}
3434 @c @item
3435 @c @file{flonum_multip.c}
3436 @c @item
3437 @c @file{flonum_normal.c}
3438 @c @item
3439 @c @file{flonum_print.c}
3440 @c @end itemize
3441
3442 Here is a list of the source files in the @file{as} directory.
3443
3444 @table @file
3445 @item app.c
3446 This contains the pre-processing phase, which deletes comments,
3447 handles whitespace, etc. This was recently re-written, since app
3448 used to be a separate program, but RMS wanted it to be inline.
3449
3450 @item append.c
3451 This is a subroutine to append a string to another string returning a
3452 pointer just after the last @code{char} appended. (JF: All these
3453 little routines should probably all be put in one file.)
3454
3455 @item as.c
3456 Here you will find the main program of the assembler @code{as}.
3457
3458 @item expr.c
3459 This is a branch office of @file{read.c}. This understands
3460 expressions, arguments. Inside @code{as}, arguments are called
3461 (expression) @emph{operands}. This is confusing, because we also talk
3462 (elsewhere) about instruction @emph{operands}. Also, expression
3463 operands are called @emph{quantities} explicitly to avoid confusion
3464 with instruction operands. What a mess.
3465
3466 @item frags.c
3467 This implements the @b{frag} concept. Without frags, finding the
3468 right size for branch instructions would be a lot harder.
3469
3470 @item hash.c
3471 This contains the symbol table, opcode table @emph{etc.} hashing
3472 functions.
3473
3474 @item hex_value.c
3475 This is a table of values of digits, for use in atoi() type
3476 functions. Could probably be flushed by using calls to strtol(), or
3477 something similar.
3478
3479 @item input-file.c
3480 This contains operating system dependent source file reading
3481 routines. Since error messages often say where we are in reading
3482 the source file, they live here too. Since @code{as} is intended to
3483 run under GNU and Unix only, this might be worth flushing. Anyway,
3484 almost all C compilers support stdio.
3485
3486 @item input-scrub.c
3487 This deals with calling the pre-processor (if needed) and feeding the
3488 chunks back to the rest of the assembler the right way.
3489
3490 @item messages.c
3491 This contains operating system independent parts of fatal and
3492 warning message reporting. See @file{append.c} above.
3493
3494 @item output-file.c
3495 This contains operating system dependent functions that write an
3496 object file for @code{as}. See @file{input-file.c} above.
3497
3498 @item read.c
3499 This implements all the directives of @code{as}. This also deals
3500 with passing input lines to the machine dependent part of the
3501 assembler.
3502
3503 @item strstr.c
3504 This is a C library function that isn't in most C libraries yet.
3505 See @file{append.c} above.
3506
3507 @item subsegs.c
3508 This implements subsegments.
3509
3510 @item symbols.c
3511 This implements symbols.
3512
3513 @item write.c
3514 This contains the code to perform relaxation, and to write out
3515 the object file. It is mostly operating system independent, but
3516 different OSes have different object file formats in any case.
3517
3518 @item xmalloc.c
3519 This implements @code{malloc()} or bust. See @file{append.c} above.
3520
3521 @item xrealloc.c
3522 This implements @code{realloc()} or bust. See @file{append.c} above.
3523
3524 @item atof-generic.c
3525 The following files were taken from a machine-independent subroutine
3526 library for manipulating floating point numbers and very large
3527 integers.
3528
3529 @file{atof-generic.c} turns a string into a flonum internal format
3530 floating-point number.
3531
3532 @item flonum-const.c
3533 This contains some potentially useful floating point numbers in
3534 flonum format.
3535
3536 @item flonum-copy.c
3537 This copies a flonum.
3538
3539 @item flonum-multip.c
3540 This multiplies two flonums together.
3541
3542 @item bignum-copy.c
3543 This copies a bignum.
3544
3545 @end table
3546
3547 Here is a table of all the machine-specific files (this includes
3548 both source and header files). Typically, there is a
3549 @var{machine}.c file, a @var{machine}-opcode.h file, and an
3550 atof-@var{machine}.c file. The @var{machine}-opcode.h file should
3551 be identical to the one used by GDB (which uses it for disassembly.)
3552
3553 @table @file
3554
3555 @item atof-ieee.c
3556 This contains code to turn a flonum into an IEEE literal constant.
3557 This is used by the 680x0, 32x32, sparc, and i386 versions of @code{as}.
3558
3559 @item i386-opcode.h
3560 This is the opcode-table for the i386 version of the assembler.
3561
3562 @item i386.c
3563 This contains all the code for the i386 version of the assembler.
3564
3565 @item i386.h
3566 This defines constants and macros used by the i386 version of the assembler.
3567
3568 @item m-generic.h
3569 Generic 68020 header file. To be linked to m68k.h on a
3570 non-sun3, non-hpux system.
3571
3572 @item m-sun2.h
3573 68010 header file for Sun2 workstations. Not well tested. To be linked
3574 to m68k.h on a sun2. (See also @samp{-DSUN_ASM_SYNTAX} in the
3575 @file{Makefile}.)
3576
3577 @item m-sun3.h
3578 68020 header file for Sun3 workstations. To be linked to m68k.h before
3579 compiling on a Sun3 system. (See also @samp{-DSUN_ASM_SYNTAX} in the
3580 @file{Makefile}.)
3581
3582 @item m-hpux.h
3583 68020 header file for an HPUX (system 5?) box. Which box, which
3584 version of HPUX, etc? I don't know.
3585
3586 @item m68k.h
3587 A hard or symbolic link to one of @file{m-generic.h},
3588 @file{m-hpux.h} or @file{m-sun3.h} depending on which kind of
3589 680x0 you are assembling for. (See also @samp{-DSUN_ASM_SYNTAX} in the
3590 @file{Makefile}.)
3591
3592 @item m68k-opcode.h
3593 Opcode table for 68020. This is now a link to the opcode table
3594 in the @code{GDB} source directory.
3595
3596 @item m68k.c
3597 All the mc680x0 code, in one huge, slow-to-compile file.
3598
3599 @item ns32k.c
3600 This contains the code for the ns32032/ns32532 version of the
3601 assembler.
3602
3603 @item ns32k-opcode.h
3604 This contains the opcode table for the ns32032/ns32532 version
3605 of the assembler.
3606
3607 @item vax-inst.h
3608 Vax specific file for describing Vax operands and other Vax-ish things.
3609
3610 @item vax-opcode.h
3611 Vax opcode table.
3612
3613 @item vax.c
3614 Vax specific parts of @code{as}. Also includes the former files
3615 @file{vax-ins-parse.c}, @file{vax-reg-parse.c} and @file{vip-op.c}.
3616
3617 @item atof-vax.c
3618 Turns a flonum into a Vax constant.
3619
3620 @item vms.c
3621 This file contains the special code needed to put out a VMS
3622 style object file for the Vax.
3623
3624 @end table
3625
3626 Here is a list of the header files in the source directory.
3627 (Warning: This section may not be very accurate. I didn't
3628 write the header files; I just report them.) Also note that I
3629 think many of these header files could be cleaned up or
3630 eliminated.
3631
3632 @table @file
3633
3634 @item a.out.h
3635 This describes the structures used to create the binary header data
3636 inside the object file. Perhaps we should use the one in
3637 @file{/usr/include}?
3638
3639 @item as.h
3640 This defines all the globally useful things, and pulls in _0__<stdio.h>_1__
3641 and _0__<assert.h>_1__.
3642
3643 @item bignum.h
3644 This defines macros useful for dealing with bignums.
3645
3646 @item expr.h
3647 Structure and macros for dealing with expression().
3648
3649 @item flonum.h
3650 This defines the structure for dealing with floating point
3651 numbers. It #includes @file{bignum.h}.
3652
3653 @item frags.h
3654 This contains a macro for appending a byte to the current frag.
3655
3656 @item hash.h
3657 Structures and function definitions for the hashing functions.
3658
3659 @item input-file.h
3660 Function headers for the input-file.c functions.
3661
3662 @item md.h
3663 Structures and function headers for things defined in the
3664 machine dependent part of the assembler.
3665
3666 @item obstack.h
3667 This is the GNU systemwide include file for manipulating obstacks.
3668 Since nobody is running under real GNU yet, we include this file.
3669
3670 @item read.h
3671 Macros and function headers for reading in source files.
3672
3673 @item struct-symbol.h
3674 Structure definition and macros for dealing with the gas
3675 internal form of a symbol.
3676
3677 @item subsegs.h
3678 Structure definition for dealing with the numbered subsegments
3679 of the text and data segments.
3680
3681 @item symbols.h
3682 Macros and function headers for dealing with symbols.
3683
3684 @item write.h
3685 Structure for doing segment fixups.
3686 @end table
3687
3688 @comment ~subsection Test Directory
3689 @comment (Note: The test directory seems to have disappeared somewhere
3690 @comment along the line. If you want it, you'll probably have to find a
3691 @comment REALLY OLD dump tape~dots{})
3692 @comment
3693 @comment The ~file{test/} directory is used for regression testing.
3694 @comment After you modify ~@code{as}, you can get a quick go/nogo
3695 @comment confidence test by running the new ~@code{as} over the source
3696 @comment files in this directory. You use a shell script ~file{test/do}.
3697 @comment
3698 @comment The tests in this suite are evolving. They are not comprehensive.
3699 @comment They have, however, caught hundreds of bugs early in the debugging
3700 @comment cycle of ~@code{as}. Most test statements in this suite were naturally
3701 @comment selected: they were used to demonstrate actual ~@code{as} bugs rather
3702 @comment than being written ~i{a priori}.
3703 @comment
3704 @comment Another testing suggestion: over 30 bugs have been found simply by
3705 @comment running examples from this manual through ~@code{as}.
3706 @comment Some examples in this manual are selected
3707 @comment to distinguish boundary conditions; they are good for testing ~@code{as}.
3708 @comment
3709 @comment ~subsubsection Regression Testing
3710 @comment Each regression test involves assembling a file and comparing the
3711 @comment actual output of ~@code{as} to ``known good'' output files. Both
3712 @comment the object file and the error/warning message file (stderr) are
3713 @comment inspected. Optionally ~@code{as}' exit status may be checked.
3714 @comment Discrepancies are reported. Each discrepancy means either that
3715 @comment you broke some part of ~@code{as} or that the ``known good'' files
3716 @comment are now out of date and should be changed to reflect the new
3717 @comment definition of ``good''.
3718 @comment
3719 @comment Each regression test lives in its own directory, in a tree
3720 @comment rooted in the directory ~file{test/}. Each such directory
3721 @comment has a name ending in ~file{.ret}, where `ret' stands for
3722 @comment REgression Test. The ~file{.ret} ending allows ~code{find
3723 @comment (1)} to find all regression tests in the tree, without
3724 @comment needing to list them explicitly.
3725 @comment
3726 @comment Any ~file{.ret} directory must contain a file called
3727 @comment ~file{input} which is the source file to assemble. During
3728 @comment testing an object file ~file{output} is created, as well as
3729 @comment a file ~file{stdouterr} which contains the output to both
3730 @comment stdout and stderr. If there is a file ~file{output.good} in
3731 @comment the directory, and if ~file{output} contains exactly the
3732 @comment same data as ~file{output.good}, the file ~file{output} is
3733 @comment deleted. Likewise ~file{stdouterr} is removed if it exactly
3734 @comment matches a file ~file{stdouterr.good}. If file
3735 @comment ~file{status.good} is present, containing a decimal number
3736 @comment before a newline, the exit status of ~@code{as} is compared
3737 @comment to this number. If the status numbers are not equal, a file
3738 @comment ~file{status} is written to the directory, containing the
3739 @comment actual status as a decimal number followed by newline.
3740 @comment
3741 @comment Should any of the ~file{*.good} files fail to match their corresponding
3742 @comment actual files, this is noted by a 1-line message on the screen during
3743 @comment the regression test, and you can use ~@code{find (1)} to find any
3744 @comment files named ~file{status}, ~file{output} or ~file{stdouterr}.
3745 @comment
3746 @node Retargeting, License, Maintenance, Top
3747 @chapter Teaching the Assembler about a New Machine
3748
3749 This chapter describes the steps required in order to make the
3750 assembler work with another machine's assembly language. This
3751 chapter is not complete, and only describes the steps in the
3752 broadest terms. You should look at the source for the
3753 currently supported machine in order to discover some of the
3754 details that aren't mentioned here.
3755
3756 You should create a new file called @file{@var{machine}.c}, and
3757 add the appropriate lines to the file @file{Makefile} so that
3758 you can compile your new version of the assembler. This should
3759 be straightforward; simply add lines similar to the ones there
3760 for the four current versions of the assembler.
3761
3762 If you want to be compatible with GDB (and the current
3763 machine-dependent versions of the assembler), you should create
3764 a file called @file{@var{machine}-opcode.h} which should
3765 contain all the information about the names of the machine
3766 instructions, their opcodes, and what addressing modes they
3767 support. If you do this right, the assembler and GDB can share
3768 this file, and you'll only have to write it once. Note that
3769 while you're writing @code{as}, you may want to use an
3770 independent program (if you have access to one), to make sure
3771 that @code{as} is emitting the correct bytes. Since @code{as}
3772 and @code{GDB} share the opcode table, an incorrect opcode
3773 table entry may make invalid bytes look OK when you disassemble
3774 them with @code{GDB}.
3775
3776 @section Functions You will Have to Write
3777
3778 Your file @file{@var{machine}.c} should contain definitions for
3779 the following functions and variables. It will need to include
3780 some header files in order to use some of the structures
3781 defined in the machine-independent part of the assembler. The
3782 needed header files are mentioned in the descriptions of the
3783 functions that will need them.
3784
3785 @table @code
3786
3787 @item long omagic;
3788 This long integer holds the value to place at the beginning of
3789 the @file{a.out} file. It is usually @samp{OMAGIC}, except on
3790 machines that store additional information in the magic-number.
3791
3792 @item char comment_chars[];
3793 This character array holds the values of the characters that
3794 start a comment anywhere in a line. Comments are stripped off
3795 automatically by the machine independent part of the
3796 assembler. Note that the @samp{/*} will always start a
3797 comment, and that only @samp{*/} will end a comment started by
3798 @samp{/*}.
3799
3800 @item char line_comment_chars[];
3801 This character array holds the values of the chars that start a
3802 comment only if they are the first (non-whitespace) character
3803 on a line. If the character @samp{#} does not appear in this
3804 list, you may get unexpected results. (Various
3805 machine-independent parts of the assembler treat the comments
3806 @samp{#APP} and @samp{#NO_APP} specially, and assume that lines
3807 that start with @samp{#} are comments.)
3808
3809 @item char EXP_CHARS[];
3810 This character array holds the letters that can separate the
3811 mantissa and the exponent of a floating point number. Typical
3812 values are @samp{e} and @samp{E}.
3813
3814 @item char FLT_CHARS[];
3815 This character array holds the letters that--when they appear
3816 immediately after a leading zero--indicate that a number is a
3817 floating-point number. (Sort of how 0x indicates that a
3818 hexadecimal number follows.)
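
A possible set of definitions for the four character arrays might look
like the sketch below. The character choices are for a hypothetical
target, and the two helper functions are not part of @code{as}; they
only illustrate how such arrays could be consulted.

```c
#include <ctype.h>
#include <string.h>

/* Example values for a hypothetical target; every choice here is an
   assumption made for illustration. */
char comment_chars[] = ";";        /* start a comment anywhere in a line */
char line_comment_chars[] = "#";   /* start a comment only at line start */
char EXP_CHARS[] = "eE";           /* separate mantissa and exponent */
char FLT_CHARS[] = "fFdD";         /* 0f1.5, 0d1.5 and so on are floats */

/* Hypothetical helper: does this line begin with a line-comment
   character, ignoring leading whitespace? */
int is_line_comment(const char *line)
{
    while (isspace((unsigned char)*line))
        line++;
    return *line != '\0' && strchr(line_comment_chars, *line) != NULL;
}

/* Hypothetical helper: does text start a floating-point number
   written as a leading zero followed by one of FLT_CHARS? */
int has_float_prefix(const char *text)
{
    return text[0] == '0' && text[1] != '\0'
        && strchr(FLT_CHARS, text[1]) != NULL;
}
```

Note that @samp{#} is kept in @code{line_comment_chars}, in line with
the warning above.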
3819
3820 @item pseudo_typeS md_pseudo_table[];
3821 (@code{pseudo_typeS} is defined in @file{md.h}.)
3822 This array contains a list of the machine-dependent directives
3823 the assembler must support. It contains the name of each
3824 pseudo op (without the leading @samp{.}), a pointer to a
3825 function to be called when that directive is encountered, and
3826 an integer argument to be passed to that function.
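
The shape of such a table can be sketched as follows. The struct below
is only a stand-in for the real @code{pseudo_typeS} in @file{md.h}: its
field names, the handler signature, the null-name terminator, and the
two directives are all assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the real pseudo_typeS from md.h; the field names and
   the handler signature are assumptions made for this sketch. */
typedef struct {
    const char *name;          /* directive name without the leading '.' */
    void (*handler)(int);      /* called when the directive is seen */
    int arg;                   /* passed to the handler */
} pseudo_typeS;

int last_size;                 /* records what the demo handler saw */

/* Hypothetical handler: a real one would parse and emit data. */
static void demo_cons(int nbytes) { last_size = nbytes; }

/* Hypothetical directives; the null-name terminator is also an assumption. */
pseudo_typeS md_pseudo_table[] = {
    { "dword", demo_cons, 4 },
    { "half",  demo_cons, 2 },
    { NULL,    NULL,      0 }
};

/* Look up a directive (name given without the '.') and run its handler.
   Returns 1 if found, 0 otherwise. */
int run_pseudo_op(const char *name)
{
    const pseudo_typeS *p;

    for (p = md_pseudo_table; p->name != NULL; p++) {
        if (strcmp(p->name, name) == 0) {
            p->handler(p->arg);
            return 1;
        }
    }
    return 0;
}
```

Sharing one handler between several directives and distinguishing them
by the integer argument keeps the machine-dependent file small.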
3827
3828 @item void md_begin(void)
3829 This function is called as part of the assembler's
3830 initialization. It should do any initialization required by
3831 any of your other routines.
3832
3833 @item int md_parse_option(char **optionPTR, int *argcPTR, char ***argvPTR)
3834 This routine is called once for each option on the command line
3835 that the machine-independent part of @code{as} does not
3836 understand. This function should return non-zero if the option
3837 pointed to by @var{optionPTR} is a valid option. If it is not
3838 a valid option, this routine should return zero. The variables
3839 @var{argcPTR} and @var{argvPTR} are provided in case the option
3840 requires a filename or something similar as an argument. If
3841 the option is multi-character, @var{optionPTR} should be
advanced past the end of the option; otherwise every letter in
3843 the option will be treated as a separate single-character
3844 option.
3845
3846 @item void md_assemble(char *string)
3847 This routine is called for every machine-dependent
3848 non-directive line in the source file. It does all the real
3849 work involved in reading the opcode, parsing the operands,
etc. @var{string} is a pointer to a null-terminated string
that comprises the input line, with all excess whitespace and
3852 comments removed.
3853
3854 @item void md_number_to_chars(char *outputPTR,long value,int nbytes)
3855 This routine is called to turn a C long int, short int, or char
3856 into the series of bytes that represents that number on the
3857 target machine. @var{outputPTR} points to an array where the
3858 result should be stored; @var{value} is the value to store; and
@var{nbytes} is the number of bytes in @var{value} that should be
3860 stored.
3861
3862 @item void md_number_to_imm(char *outputPTR,long value,int nbytes)
3863 This routine is called to turn a C long int, short int, or char
3864 into the series of bytes that represent an immediate value on
3865 the target machine. It is identical to the function @code{md_number_to_chars},
3866 except on NS32K machines.@refill
3867
3868 @item void md_number_to_disp(char *outputPTR,long value,int nbytes)
3869 This routine is called to turn a C long int, short int, or char
into the series of bytes that represent a displacement value on
3871 the target machine. It is identical to the function @code{md_number_to_chars},
3872 except on NS32K machines.@refill
3873
3874 @item void md_number_to_field(char *outputPTR,long value,int nbytes)
3875 This routine is identical to @code{md_number_to_chars},
3876 except on NS32K machines.
3877
3878 @item void md_ri_to_chars(struct relocation_info *riPTR,ri)
3879 (@code{struct relocation_info} is defined in @file{a.out.h})
3880 This routine emits the relocation info in @var{ri}
3881 in the appropriate bit-pattern for the target machine.
3882 The result should be stored in the location pointed
3883 to by @var{riPTR}. This routine may be a no-op unless you are
3884 attempting to do cross-assembly.
3885
3886 @item char *md_atof(char type,char *outputPTR,int *sizePTR)
3887 This routine turns a series of digits into the appropriate
3888 internal representation for a floating-point number.
3889 @var{type} is a character from @var{FLT_CHARS[]} that describes
3890 what kind of floating point number is wanted; @var{outputPTR}
3891 is a pointer to an array that the result should be stored in;
3892 and @var{sizePTR} is a pointer to an integer where the size (in
3893 bytes) of the result should be stored. This routine should
3894 return an error message, or an empty string (not (char *)0) for
3895 success.
3896
3897 @item int md_short_jump_size;
3898 This variable holds the (maximum) size in bytes of a short (16
3899 bit or so) jump created by @code{md_create_short_jump()}. This
3900 variable is used as part of the broken-word feature, and isn't
3901 needed if the assembler is compiled with
3902 @samp{-DWORKING_DOT_WORD}.
3903
3904 @item int md_long_jump_size;
3905 This variable holds the (maximum) size in bytes of a long (32
3906 bit or so) jump created by @code{md_create_long_jump()}. This
3907 variable is used as part of the broken-word feature, and isn't
3908 needed if the assembler is compiled with
3909 @samp{-DWORKING_DOT_WORD}.
3910
3911 @item void md_create_short_jump(char *resultPTR,long from_addr,
3912 @code{long to_addr,fragS *frag,symbolS *to_symbol)}
3913 This function emits a jump from @var{from_addr} to @var{to_addr} in
3914 the array of bytes pointed to by @var{resultPTR}. If this creates a
3915 type of jump that must be relocated, this function should call
3916 @code{fix_new()} with @var{frag} and @var{to_symbol}. The jump
3917 emitted by this function may be smaller than @var{md_short_jump_size},
3918 but it must never create a larger one.
3919 (If it creates a smaller jump, the extra bytes of memory will not be
3920 used.) This function is used as part of the broken-word feature,
3921 and isn't needed if the assembler is compiled with
3922 @samp{-DWORKING_DOT_WORD}.@refill
3923
3924 @item void md_create_long_jump(char *ptr,long from_addr,
3925 @code{long to_addr,fragS *frag,symbolS *to_symbol)}
3926 This function is similar to the previous function,
3927 @code{md_create_short_jump()}, except that it creates a long
3928 jump instead of a short one. This function is used as part of
3929 the broken-word feature, and isn't needed if the assembler is
3930 compiled with @samp{-DWORKING_DOT_WORD}.
3931
3932 @item int md_estimate_size_before_relax(fragS *fragPTR,int segment_type)
3933 This function does the initial setting up for relaxation. This
3934 includes forcing references to still-undefined symbols to the
3935 appropriate addressing modes.
3936
3937 @item relax_typeS md_relax_table[];
(@var{relax_typeS} is defined in @file{md.h})
3939 This array describes the various machine dependent states a
3940 frag may be in before relaxation. You will need one group of
3941 entries for each type of addressing mode you intend to relax.
3942
3943 @item void md_convert_frag(fragS *fragPTR)
3944 (@var{fragS} is defined in @file{as.h})
3945 This routine does the required cleanup after relaxation.
3946 Relaxation has changed the type of the frag to a type that can
3947 reach its destination. This function should adjust the opcode
3948 of the frag to use the appropriate addressing mode.
3949 @var{fragPTR} points to the frag to clean up.
3950
3951 @item void md_end(void)
3952 This function is called just before the assembler exits. It
3953 need not free up memory unless the operating system doesn't do
3954 it automatically on exit. (In which case you'll also have to
3955 track down all the other places where the assembler allocates
3956 space but never frees it.)
3957
3958 @end table
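
As a rough illustration of the byte-ordering job described for
@code{md_number_to_chars} above, the following C sketch stores the low
@var{nbytes} bytes of a value in either big-endian or little-endian
order. The names @code{number_to_chars_bigendian} and
@code{number_to_chars_littleendian} are hypothetical and are not part of
@code{as}; a real port would presumably supply a single
@code{md_number_to_chars} using whichever order its target requires.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: store the low NBYTES bytes of VALUE at OUTPUTPTR,
   most significant byte first (big-endian target).  */
void number_to_chars_bigendian(char *outputPTR, long value, int nbytes)
{
    unsigned char *out = (unsigned char *) outputPTR;
    unsigned long v = (unsigned long) value;

    assert(outputPTR != NULL && nbytes >= 0);
    for (int i = nbytes - 1; i >= 0; i--) {
        out[i] = (unsigned char) (v & 0xff);   /* lowest byte goes last */
        v >>= 8;
    }
}

/* Hypothetical helper: the same, least significant byte first
   (little-endian target).  */
void number_to_chars_littleendian(char *outputPTR, long value, int nbytes)
{
    unsigned char *out = (unsigned char *) outputPTR;
    unsigned long v = (unsigned long) value;

    assert(outputPTR != NULL && nbytes >= 0);
    for (int i = 0; i < nbytes; i++) {
        out[i] = (unsigned char) (v & 0xff);   /* lowest byte goes first */
        v >>= 8;
    }
}
```

Working on an unsigned copy of the value keeps the shifts well defined
for negative inputs.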
3959
@section External Variables You Will Need to Use
3961
3962 You will need to refer to or change the following external variables
3963 from within the machine-dependent part of the assembler.
3964
3965 @table @code
3966 @item extern char flagseen[];
3967 This array holds non-zero values in locations corresponding to
3968 the options that were on the command line. Thus, if the
3969 assembler was called with @samp{-W}, @var{flagseen['W']} would
3970 be non-zero.
3971
3972 @item extern fragS *frag_now;
3973 This pointer points to the current frag--the frag that bytes
3974 are currently being added to. If nothing else, you will need
3975 to pass it as an argument to various machine-independent
3976 functions. It is maintained automatically by the
3977 frag-manipulating functions; you should never have to change it
3978 yourself.
3979
3980 @item extern LITTLENUM_TYPE generic_bignum[];
(@var{LITTLENUM_TYPE} is defined in @file{bignum.h}.)
3982 This is where @dfn{bignums}--numbers larger than 32 bits--are
3983 returned when they are encountered in an expression. You will
3984 need to use this if you need to implement directives (or
3985 anything else) that must deal with these large numbers.
@code{Bignums} are of @code{segT} @code{SEG_BIG} (defined in
@file{as.h}), and have a positive @code{X_add_number}. The
3988 @code{X_add_number} of a @code{bignum} is the number of
3989 @code{LITTLENUMS} in @var{generic_bignum} that the number takes
3990 up.
3991
3992 @item extern FLONUM_TYPE generic_floating_point_number;
(@var{FLONUM_TYPE} is defined in @file{flonum.h}.)
This is where @dfn{flonums}--floating-point numbers within
3995 expressions--are returned. @code{Flonums} are of @code{segT}
3996 @code{SEG_BIG}, and have a negative @code{X_add_number}.
3997 @code{Flonums} are returned in a generic format. You will have
3998 to write a routine to turn this generic format into the
3999 appropriate floating-point format for your machine.
4000
4001 @item extern int need_pass_2;
4002 If this variable is non-zero, the assembler has encountered an
4003 expression that cannot be assembled in a single pass. Since
4004 the second pass isn't implemented, this flag means that the
4005 assembler is punting, and is only looking for additional syntax
4006 errors. (Or something like that.)
4007
4008 @item extern segT now_seg;
4009 This variable holds the value of the segment the assembler is
4010 currently assembling into.
4011
4012 @end table
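
Since bignums and flonums share the segment type @code{SEG_BIG}, the
sign of @code{X_add_number} is what tells them apart. The following C
sketch spells out that rule. The enumeration and both function names are
hypothetical, and the treatment of zero as ``unspecified'' is an
assumption, since the text above only covers positive and negative
values.

```c
#include <assert.h>

/* Hypothetical classification of a SEG_BIG expression.  */
enum big_kind { BIG_BIGNUM, BIG_FLONUM, BIG_UNSPECIFIED };

/* Decide what a SEG_BIG expression holds from its X_add_number:
   positive means a bignum, negative means a flonum.  Zero is not
   covered by the description, so it is reported separately.  */
enum big_kind classify_seg_big(long x_add_number)
{
    if (x_add_number > 0)
        return BIG_BIGNUM;
    if (x_add_number < 0)
        return BIG_FLONUM;
    return BIG_UNSPECIFIED;
}

/* For a bignum, X_add_number is the number of littlenums the value
   occupies in generic_bignum.  The caller must pass a bignum.  */
long bignum_littlenum_count(long x_add_number)
{
    assert(classify_seg_big(x_add_number) == BIG_BIGNUM);
    return x_add_number;
}
```

A directive handler would typically make this check before deciding
whether to read @var{generic_bignum} or
@var{generic_floating_point_number}.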
4013
@section External Functions You Will Need
4015
4016 You will find the following external functions useful (or
4017 indispensable) when you're writing the machine-dependent part
4018 of the assembler.
4019
4020 @table @code
4021
4022 @item char *frag_more(int bytes)
4023 This function allocates @var{bytes} more bytes in the current
frag (or starts a new frag, if it can't expand the current frag
any more) for you to store some object-file bytes in. It
4026 returns a pointer to the bytes, ready for you to store data in.
4027
4028 @item void fix_new(fragS *frag, int where, short size, symbolS *add_symbol, symbolS *sub_symbol, long offset, int pcrel)
4029 This function stores a relocation fixup to be acted on later.
4030 @var{frag} points to the frag the relocation belongs in;
4031 @var{where} is the location within the frag where the relocation begins;
4032 @var{size} is the size of the relocation, and is usually 1 (a single byte),
4033 2 (sixteen bits), or 4 (a longword).
4034 The value @var{add_symbol} @minus{} @var{sub_symbol} + @var{offset}, is added to the byte(s)
4035 at _0__@var{frag->literal[where]}_1__. If @var{pcrel} is non-zero, the address of the
4036 location is subtracted from the result. A relocation entry is also added
4037 to the @file{a.out} file. @var{add_symbol}, @var{sub_symbol}, and/or
4038 @var{offset} may be NULL.@refill
4039
4040 @item char *frag_var(relax_stateT type, int max_chars, int var,
4041 @code{relax_substateT subtype, symbolS *symbol, char *opcode)}
4042 This function creates a machine-dependent frag of type @var{type}
4043 (usually @code{rs_machine_dependent}).
4044 @var{max_chars} is the maximum size in bytes that the frag may grow by;
4045 @var{var} is the current size of the variable end of the frag;
4046 @var{subtype} is the sub-type of the frag. The sub-type is used to index into
@var{md_relax_table[]} during relaxation.
@var{symbol} is the symbol whose value should be used when relaxing this frag.
@var{opcode} points to a byte whose value may have to be modified if the
addressing mode used by this frag changes. It typically points into the
@var{fr_literal[]} of the previous frag, and is used to point to a location
that @code{md_convert_frag()} may have to change.@refill
4053
4054 @item void frag_wane(fragS *fragPTR)
4055 This function is useful from within @code{md_convert_frag}. It
changes a frag to type @code{rs_fill}, and sets the variable-sized
4057 piece of the frag to zero. The frag will never change in size
4058 again.
4059
4060 @item segT expression(expressionS *retval)
4061 (@var{segT} is defined in @file{as.h}; @var{expressionS} is defined in @file{expr.h})
4062 This function parses the string pointed to by the external char
4063 pointer @var{input_line_pointer}, and returns the segment-type
4064 of the expression. It also stores the results in the
4065 @var{expressionS} pointed to by @var{retval}.
4066 @var{input_line_pointer} is advanced to point past the end of
4067 the expression. (@var{input_line_pointer} is used by other
4068 parts of the assembler. If you modify it, be sure to restore
4069 it to its original value.)
4070
4071 @item as_warn(char *message,@dots{})
4072 If warning messages are disabled, this function does nothing.
4073 Otherwise, it prints out the current file name, and the current
4074 line number, then uses @code{fprintf} to print the
4075 @var{message} and any arguments it was passed.
4076
4077 @item as_bad(char *message,@dots{})
4078 This function should be called when @code{as} encounters
4079 conditions that are bad enough that @code{as} should not
4080 produce an object file, but should continue reading input and
4081 printing warning and bad error messages.
4082
4083 @item as_fatal(char *message,@dots{})
4084 This function prints out the current file name and line number,
4085 prints the word @samp{FATAL:}, then uses @code{fprintf} to
4086 print the @var{message} and any arguments it was passed. Then
4087 the assembler exits. This function should only be used for
4088 serious, unrecoverable errors.
4089
4090 @item void float_const(int float_type)
4091 This function reads floating-point constants from the current
4092 input line, and calls @code{md_atof} to assemble them. It is
4093 useful as the function to call for the directives
4094 @samp{.single}, @samp{.double}, @samp{.float}, etc.
4095 @var{float_type} must be a character from @var{FLT_CHARS}.
4096
4097 @item void demand_empty_rest_of_line(void);
4098 This function can be used by machine-dependent directives to
4099 make sure the rest of the input line is empty. It prints a
4100 warning message if there are additional characters on the line.
4101
4102 @item long int get_absolute_expression(void)
4103 This function can be used by machine-dependent directives to
4104 read an absolute number from the current input line. It
4105 returns the result. If it isn't given an absolute expression,
4106 it prints a warning message and returns zero.
4107
4108 @end table
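
The arithmetic described for @code{fix_new} above can be sketched in C as
follows. This is only an illustration of the formula, not the code
@code{as} uses: the function @code{fixup_value} is hypothetical, symbols
are modelled as pointers to their already-known values, and a NULL
symbol is assumed to contribute zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: compute add_symbol - sub_symbol + offset, and
   subtract LOCATION (the address being fixed up) when PCREL is
   non-zero.  A NULL symbol pointer is assumed to contribute nothing.  */
long fixup_value(const long *add_value, const long *sub_value,
                 long offset, int pcrel, long location)
{
    long value = offset;

    assert(location >= 0);          /* addresses are non-negative here */
    if (add_value != NULL)
        value += *add_value;
    if (sub_value != NULL)
        value -= *sub_value;
    if (pcrel)
        value -= location;          /* make the result PC-relative */
    return value;
}
```

For example, with an @var{add_symbol} at 0x2000, no @var{sub_symbol}, a
zero @var{offset}, and a PC-relative fixup located at 0x1ff0, the value
to be added works out to 0x10.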
4109
4110
4111 @section The concept of Frags
4112
This assembler works to optimize the size of certain addressing
modes (for example, those of branch instructions). This means the size of many
pieces of object code cannot be determined until after assembly
is finished. (It also means that the addresses of symbols cannot be
determined until assembly is finished.) To make this possible,
@code{as} stores the output bytes as @dfn{frags}.
4119
Here is the definition of a frag (from @file{as.h}):
4121 @example
4122 struct frag
4123 @{
4124 long int fr_fix;
4125 long int fr_var;
4126 relax_stateT fr_type;
4127 relax_substateT fr_substate;
4128 unsigned long fr_address;
4129 long int fr_offset;
4130 struct symbol *fr_symbol;
4131 char *fr_opcode;
4132 struct frag *fr_next;
4133 char fr_literal[];
4134 @}
4135 @end example
4136
4137 @table @var
4138 @item fr_fix
4139 is the size of the fixed-size piece of the frag.
4140
4141 @item fr_var
4142 is the maximum (?) size of the variable-sized piece of the frag.
4143
4144 @item fr_type
is the type of the frag.
Current types are @code{rs_fill}, @code{rs_align}, @code{rs_org}, and
@code{rs_machine_dependent}.
4151
4152 @item fr_substate
This stores the type of machine-dependent frag this is (what
kind of addressing mode is being used, and what size is being
tried, will fit, etc.).
4156
4157 @item fr_address
4158 @var{fr_address} is only valid after relaxation is finished.
4159 Before relaxation, the only way to store an address is (pointer
4160 to frag containing the address) plus (offset into the frag).
4161
4162 @item fr_offset
This contains a number, whose meaning depends on the type of
the frag.
For machine-dependent frags, this contains the offset from
@var{fr_symbol} that the frag wants to go to. Thus, for branch
instructions it is usually zero (unless the instruction was
@samp{jba foo+12} or something like that).
4169
4170 @item fr_symbol
For machine-dependent frags, this points to the symbol the frag
4172 needs to reach.
4173
4174 @item fr_opcode
4175 This points to the location in the frag (or in a previous frag)
4176 of the opcode for the instruction that caused this to be a frag.
4177 @var{fr_opcode} is needed if the actual opcode must be changed
4178 in order to use a different form of the addressing mode.
4179 (For example, if a conditional branch only comes in size tiny,
4180 a large-size branch could be implemented by reversing the sense
4181 of the test, and turning it into a tiny branch over a large jump.
4182 This would require changing the opcode.)
4183
@item fr_literal
@var{fr_literal} is a variable-size array that contains the
actual object bytes. A frag consists of a fixed-size piece of
object data (which may be zero bytes long), followed by a
piece of object data whose size may not have been determined
yet. Other information includes the type of the frag (which
controls how it is relaxed).
4190
4191 @item fr_next
4192 This is the next frag in the singly-linked list. This is
4193 usually only needed by the machine-independent part of
4194 @code{as}.
4195
4196 @end table
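
To make the relationship between @var{fr_next} and @var{fr_address}
concrete, here is a minimal C sketch that walks a chain of frags and
assigns each one an address. It uses a cut-down, hypothetical
@code{struct mini_frag} rather than the real structure from @file{as.h},
and it assumes that once relaxation has settled, each frag occupies
@var{fr_fix} plus @var{fr_var} bytes; the real assembler may compute
sizes differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct frag.  */
struct mini_frag {
    long fr_fix;                 /* size of the fixed-size piece */
    long fr_var;                 /* assumed final size of the variable piece */
    unsigned long fr_address;    /* filled in by the walk below */
    struct mini_frag *fr_next;   /* next frag in the singly-linked list */
};

/* Walk the chain starting at HEAD, giving each frag the address that
   follows the previous frag.  Returns the total number of bytes.  */
unsigned long assign_frag_addresses(struct mini_frag *head,
                                    unsigned long start)
{
    unsigned long address = start;

    for (struct mini_frag *f = head; f != NULL; f = f->fr_next) {
        assert(f->fr_fix >= 0 && f->fr_var >= 0);
        f->fr_address = address;
        address += (unsigned long) f->fr_fix + (unsigned long) f->fr_var;
    }
    return address - start;
}
```

This also shows why an address cannot be trusted before relaxation: any
change to the size of an earlier frag shifts every frag after it.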
4197 @end ignore
4198
4199 @node License, , Retargeting, Top
4200 @unnumbered GNU GENERAL PUBLIC LICENSE
4201 @center Version 1, February 1989
4202
4203 @display
4204 Copyright @copyright{} 1989 Free Software Foundation, Inc.
4205 675 Mass Ave, Cambridge, MA 02139, USA
4206
4207 Everyone is permitted to copy and distribute verbatim copies
4208 of this license document, but changing it is not allowed.
4209 @end display
4210
4211 @unnumberedsec Preamble
4212
4213 The license agreements of most software companies try to keep users
4214 at the mercy of those companies. By contrast, our General Public
4215 License is intended to guarantee your freedom to share and change free
4216 software---to make sure the software is free for all its users. The
4217 General Public License applies to the Free Software Foundation's
4218 software and to any other program whose authors commit to using it.
4219 You can use it for your programs, too.
4220
4221 When we speak of free software, we are referring to freedom, not
4222 price. Specifically, the General Public License is designed to make
4223 sure that you have the freedom to give away or sell copies of free
4224 software, that you receive source code or can get it if you want it,
4225 that you can change the software or use pieces of it in new free
4226 programs; and that you know you can do these things.
4227
4228 To protect your rights, we need to make restrictions that forbid
4229 anyone to deny you these rights or to ask you to surrender the rights.
4230 These restrictions translate to certain responsibilities for you if you
4231 distribute copies of the software, or if you modify it.
4232
4233 For example, if you distribute copies of a such a program, whether
4234 gratis or for a fee, you must give the recipients all the rights that
4235 you have. You must make sure that they, too, receive or can get the
4236 source code. And you must tell them their rights.
4237
4238 We protect your rights with two steps: (1) copyright the software, and
4239 (2) offer you this license which gives you legal permission to copy,
4240 distribute and/or modify the software.
4241
4242 Also, for each author's protection and ours, we want to make certain
4243 that everyone understands that there is no warranty for this free
4244 software. If the software is modified by someone else and passed on, we
4245 want its recipients to know that what they have is not the original, so
4246 that any problems introduced by others will not reflect on the original
4247 authors' reputations.
4248
4249 The precise terms and conditions for copying, distribution and
4250 modification follow.
4251
4252 @iftex
4253 @unnumberedsec TERMS AND CONDITIONS
4254 @end iftex
4255 @ifinfo
4256 @center TERMS AND CONDITIONS
4257 @end ifinfo
4258
4259 @enumerate
4260 @item
4261 This License Agreement applies to any program or other work which
4262 contains a notice placed by the copyright holder saying it may be
4263 distributed under the terms of this General Public License. The
4264 ``Program'', below, refers to any such program or work, and a ``work based
4265 on the Program'' means either the Program or any work containing the
4266 Program or a portion of it, either verbatim or with modifications. Each
4267 licensee is addressed as ``you''.
4268
4269 @item
4270 You may copy and distribute verbatim copies of the Program's source
4271 code as you receive it, in any medium, provided that you conspicuously and
4272 appropriately publish on each copy an appropriate copyright notice and
4273 disclaimer of warranty; keep intact all the notices that refer to this
4274 General Public License and to the absence of any warranty; and give any
4275 other recipients of the Program a copy of this General Public License
4276 along with the Program. You may charge a fee for the physical act of
4277 transferring a copy.
4278
4279 @item
4280 You may modify your copy or copies of the Program or any portion of
4281 it, and copy and distribute such modifications under the terms of Paragraph
4282 1 above, provided that you also do the following:
4283
4284 @itemize @bullet
4285 @item
4286 cause the modified files to carry prominent notices stating that
4287 you changed the files and the date of any change; and
4288
4289 @item
4290 cause the whole of any work that you distribute or publish, that
4291 in whole or in part contains the Program or any part thereof, either
4292 with or without modifications, to be licensed at no charge to all
4293 third parties under the terms of this General Public License (except
4294 that you may choose to grant warranty protection to some or all
4295 third parties, at your option).
4296
4297 @item
4298 If the modified program normally reads commands interactively when
4299 run, you must cause it, when started running for such interactive use
4300 in the simplest and most usual way, to print or display an
4301 announcement including an appropriate copyright notice and a notice
4302 that there is no warranty (or else, saying that you provide a
4303 warranty) and that users may redistribute the program under these
4304 conditions, and telling the user how to view a copy of this General
4305 Public License.
4306
4307 @item
4308 You may charge a fee for the physical act of transferring a
4309 copy, and you may at your option offer warranty protection in
4310 exchange for a fee.
4311 @end itemize
4312
4313 Mere aggregation of another independent work with the Program (or its
4314 derivative) on a volume of a storage or distribution medium does not bring
4315 the other work under the scope of these terms.
4316
4317 @item
4318 You may copy and distribute the Program (or a portion or derivative of
4319 it, under Paragraph 2) in object code or executable form under the terms of
4320 Paragraphs 1 and 2 above provided that you also do one of the following:
4321
4322 @itemize @bullet
4323 @item
4324 accompany it with the complete corresponding machine-readable
4325 source code, which must be distributed under the terms of
4326 Paragraphs 1 and 2 above; or,
4327
4328 @item
4329 accompany it with a written offer, valid for at least three
4330 years, to give any third party free (except for a nominal charge
4331 for the cost of distribution) a complete machine-readable copy of the
4332 corresponding source code, to be distributed under the terms of
4333 Paragraphs 1 and 2 above; or,
4334
4335 @item
4336 accompany it with the information you received as to where the
4337 corresponding source code may be obtained. (This alternative is
4338 allowed only for noncommercial distribution and only if you
4339 received the program in object code or executable form alone.)
4340 @end itemize
4341
4342 Source code for a work means the preferred form of the work for making
4343 modifications to it. For an executable file, complete source code means
4344 all the source code for all modules it contains; but, as a special
4345 exception, it need not include source code for modules which are standard
4346 libraries that accompany the operating system on which the executable
4347 file runs, or for standard header files or definitions files that
4348 accompany that operating system.
4349
4350 @item
4351 You may not copy, modify, sublicense, distribute or transfer the
4352 Program except as expressly provided under this General Public License.
4353 Any attempt otherwise to copy, modify, sublicense, distribute or transfer
4354 the Program is void, and will automatically terminate your rights to use
4355 the Program under this License. However, parties who have received
4356 copies, or rights to use copies, from you under this General Public
4357 License will not have their licenses terminated so long as such parties
4358 remain in full compliance.
4359
4360 @item
4361 By copying, distributing or modifying the Program (or any work based
4362 on the Program) you indicate your acceptance of this license to do so,
4363 and all its terms and conditions.
4364
4365 @item
4366 Each time you redistribute the Program (or any work based on the
4367 Program), the recipient automatically receives a license from the original
4368 licensor to copy, distribute or modify the Program subject to these
4369 terms and conditions. You may not impose any further restrictions on the
4370 recipients' exercise of the rights granted herein.
4371
4372 @item
4373 The Free Software Foundation may publish revised and/or new versions
4374 of the General Public License from time to time. Such new versions will
4375 be similar in spirit to the present version, but may differ in detail to
4376 address new problems or concerns.
4377
4378 Each version is given a distinguishing version number. If the Program
4379 specifies a version number of the license which applies to it and ``any
4380 later version'', you have the option of following the terms and conditions
4381 either of that version or of any later version published by the Free
4382 Software Foundation. If the Program does not specify a version number of
4383 the license, you may choose any version ever published by the Free Software
4384 Foundation.
4385
4386 @item
4387 If you wish to incorporate parts of the Program into other free
4388 programs whose distribution conditions are different, write to the author
4389 to ask for permission. For software which is copyrighted by the Free
4390 Software Foundation, write to the Free Software Foundation; we sometimes
4391 make exceptions for this. Our decision will be guided by the two goals
4392 of preserving the free status of all derivatives of our free software and
4393 of promoting the sharing and reuse of software generally.
4394
4395 @iftex
4396 @heading NO WARRANTY
4397 @end iftex
4398 @ifinfo
4399 @center NO WARRANTY
4400 @end ifinfo
4401
4402 @item
4403 BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
4404 FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
4405 OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
4406 PROVIDE THE PROGRAM ``AS IS'' WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
4407 OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
4408 MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
4409 TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
4410 PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
4411 REPAIR OR CORRECTION.
4412
4413 @item
4414 IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL
4415 ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
4416 REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
4417 INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES
4418 ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT
4419 LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES
4420 SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE
4421 WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN
4422 ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
4423 @end enumerate
4424
4425 @iftex
4426 @heading END OF TERMS AND CONDITIONS
4427 @end iftex
4428 @ifinfo
4429 @center END OF TERMS AND CONDITIONS
4430 @end ifinfo
4431
4432 @page
4433 @unnumberedsec How to Apply These Terms to Your New Programs
4434
4435 If you develop a new program, and you want it to be of the greatest
4436 possible use to humanity, the best way to achieve this is to make it
4437 free software which everyone can redistribute and change under these
4438 terms.
4439
4440 To do so, attach the following notices to the program. It is safest to
4441 attach them to the start of each source file to most effectively convey
4442 the exclusion of warranty; and each file should have at least the
4443 ``copyright'' line and a pointer to where the full notice is found.
4444
4445 @smallexample
4446 @var{one line to give the program's name and a brief idea of what it does.}
4447 Copyright (C) 19@var{yy} @var{name of author}
4448
4449 This program is free software; you can redistribute it and/or modify
4450 it under the terms of the GNU General Public License as published by
4451 the Free Software Foundation; either version 1, or (at your option)
4452 any later version.
4453
4454 This program is distributed in the hope that it will be useful,
4455 but WITHOUT ANY WARRANTY; without even the implied warranty of
4456 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
4457 GNU General Public License for more details.
4458
4459 You should have received a copy of the GNU General Public License
4460 along with this program; if not, write to the Free Software
4461 Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
4462 @end smallexample
4463
4464 Also add information on how to contact you by electronic and paper mail.
4465
4466 If the program is interactive, make it output a short notice like this
4467 when it starts in an interactive mode:
4468
4469 @smallexample
4470 Gnomovision version 69, Copyright (C) 19@var{yy} @var{name of author}
4471 Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
4472 This is free software, and you are welcome to redistribute it
4473 under certain conditions; type `show c' for details.
4474 @end smallexample
4475
4476 The hypothetical commands `show w' and `show c' should show the
4477 appropriate parts of the General Public License. Of course, the
4478 commands you use may be called something other than `show w' and `show
4479 c'; they could even be mouse-clicks or menu items---whatever suits your
4480 program.
4481
4482 You should also get your employer (if you work as a programmer) or your
4483 school, if any, to sign a ``copyright disclaimer'' for the program, if
4484 necessary. Here is a sample; alter the names:
4485
4486 @smallexample
4487 Yoyodyne, Inc., hereby disclaims all copyright interest in the
4488 program `Gnomovision' (a program to direct compilers to make passes
4489 at assemblers) written by James Hacker.
4490
4491 @var{signature of Ty Coon}, 1 April 1989
4492 Ty Coon, President of Vice
4493 @end smallexample
4494
4495 That's all there is to it!
4496
4497
4498 @summarycontents
4499 @contents
4500 @bye