gas/doc/as.texinfo

   1 \input texinfo
   2 @c @tex
   3 @c \special{twoside}
   4 @c @end tex
   5 @setfilename as
   6 @synindex ky cp
   7 @ifinfo
   8 This file documents the GNU Assembler "as".
   9
  10 Copyright (C) 1991 Free Software Foundation, Inc.
  11
  12 Permission is granted to make and distribute verbatim copies of
  13 this manual provided the copyright notice and this permission notice
  14 are preserved on all copies.
  15
  16 @ignore
  17 Permission is granted to process this file through Tex and print the
  18 results, provided the printed document carries copying permission
  19 notice identical to this one except for the removal of this paragraph
  20 (this paragraph not being relevant to the printed manual).
  21
  22 @end ignore
  23 Permission is granted to copy and distribute modified versions of this
  24 manual under the conditions for verbatim copying, provided also that the
  25 section entitled ``GNU General Public License'' is included exactly as
  26 in the original, and provided that the entire resulting derived work is
  27 distributed under the terms of a permission notice identical to this
  28 one.
  29
  30 Permission is granted to copy and distribute translations of this manual
  31 into another language, under the above conditions for modified versions,
  32 except that the section entitled ``GNU General Public License'' may be
  33 included in a translation approved by the author instead of in the
  34 original English.
  35 @end ifinfo
  36 @tex
  37 @finalout
  38 @end tex
  39
  40 @setchapternewpage odd
  41 @c if m680x0
  42 @c @settitle Using GNU as (680x0)
  43 @c fi m680x0
  44 @c if am29k
  45 @settitle Using GNU as (AMD 29K)
  46 @c fi am29k
  47 @titlepage
  48 @title{Using GNU as}
  49 @subtitle{The GNU Assembler}
  50 @c if m680x0
  51 @c @subtitle{for Motorola 680x0}
  52 @c fi m680x0
  53 @c if am29k
  54 @subtitle{for the AMD 29K family}
  55 @c fi am29k
  56 @sp 1
  57 @subtitle February 1991
  58 @sp 13
  59 The Free Software Foundation Inc.  thanks The Nice Computer
  60 Company of Australia for loaning Dean Elsner to write the
  61 first (Vax) version of @code{as} for Project GNU.
  62 The proprietors, management and staff of TNCCA thank FSF for
  63 distracting the boss while they got some work
  64 done.
  65 @sp 3
  66 @author{Dean Elsner, Jay Fenlason & friends}
  67 @author{revised by Roland Pesch for Cygnus Support}
  68 @c pesch@cygnus.com
  69 @page
  70 @tex
  71 \def\$#1${{#1}}  % Kluge: collect RCS revision info without $...$
  72 \xdef\manvers{\$Revision$}  % For use in headers, footers too
  73 {\parskip=0pt
  74 \hfill Cygnus Support\par
  75 \hfill \manvers\par
  76 \hfill \TeX{}info \texinfoversion\par
  77 }
  78 %"boxit" macro for figures:
  79 %Modified from Knuth's ``boxit'' macro from TeXbook (answer to exercise 21.3)
  80 \gdef\boxit#1#2{\vbox{\hrule\hbox{\vrule\kern3pt
  81      \vbox{\parindent=0pt\parskip=0pt\hsize=#1\kern3pt\strut\hfil
  82 #2\hfil\strut\kern3pt}\kern3pt\vrule}\hrule}}%box with visible outline
  83 \gdef\ibox#1#2{\hbox to #1{#2\hfil}\kern8pt}% invisible box
  84 @end tex
  85
  86 @vskip 0pt plus 1filll
  87 Copyright @copyright{} 1991 Free Software Foundation, Inc.
  88
  89 Permission is granted to make and distribute verbatim copies of
  90 this manual provided the copyright notice and this permission notice
  91 are preserved on all copies.
  92
  93 Permission is granted to copy and distribute modified versions of this
  94 manual under the conditions for verbatim copying, provided also that the
  95 section entitled ``GNU General Public License'' is included exactly as
  96 in the original, and provided that the entire resulting derived work is
  97 distributed under the terms of a permission notice identical to this
  98 one.
  99
 100 Permission is granted to copy and distribute translations of this manual
 101 into another language, under the above conditions for modified versions,
 102 except that the section entitled ``GNU General Public License'' may be
 103 included in a translation approved by the author instead of in the
 104 original English.
 105 @end titlepage
 106 @page
 107
 108 @node Top, Overview, (dir), (dir)
 109
 110 @menu
 111 * Overview::                    Overview
 112 * Syntax::                      Syntax
 113 * Segments::                    Segments and Relocation
 114 * Symbols::                     Symbols
 115 * Expressions::                 Expressions
 116 * Pseudo Ops::                  Assembler Directives
 117 * Maintenance::                 Maintaining the Assembler
 118 * Retargeting::                 Teaching the Assembler about a New Machine
 119 * License::                     GNU GENERAL PUBLIC LICENSE
 120
 121  --- The Detailed Node Listing ---
 122
 123 Overview
 124
 125 * Invoking::                    Invoking @code{as}
 126 * Manual::                      Structure of this Manual
 127 * GNU Assembler::               as, the GNU Assembler
 128 * Command Line::                Command Line
 129 * Input Files::                 Input Files
 130 * Object::                      Output (Object) File
 131 * Errors::                      Error and Warning Messages
 132 * Options::                     Options
 133
 134 Input Files
 135
 136 * Filenames::                   Input Filenames and Line-numbers
 137
 138 Syntax
 139
 140 * Pre-processing::              Pre-processing
 141 * Whitespace::                  Whitespace
 142 * Comments::                    Comments
 143 * Symbol Intro::                Symbols
 144 * Statements::                  Statements
 145 * Constants::                   Constants
 146
 147 Constants
 148
 149 * Characters::                  Character Constants
 150 * Numbers::                     Number Constants
 151
 152 Character Constants
 153
 154 * Strings::                     Strings
 155 * Chars::                       Characters
 156
 157 Segments and Relocation
 158
 159 * Segs Background::             Background
 160 * ld Segments::                 ld Segments
 161 * as Segments::                 as Internal Segments
 162 * Sub-Segments::                Sub-Segments
 163 * bss::                         bss Segment
 164
 172 Symbols
 173
 174 * Labels::                      Labels
 175 * Setting Symbols::             Giving Symbols Other Values
 176 * Symbol Names::                Symbol Names
 177 * Dot::                         The Special Dot Symbol
 178 * Symbol Attributes::           Symbol Attributes
 179
 180 Symbol Names
 181
 182 * Local Symbols::               Local Symbol Names
 183
 184 Symbol Attributes
 185
 186 * Symbol Value::                Value
 187 * Symbol Type::                 Type
 188 * Symbol Desc::                 Descriptor
 189 * Symbol Other::                Other
 190
 191 Expressions
 192
 193 * Empty Exprs::                 Empty Expressions
 194 * Integer Exprs::               Integer Expressions
 195
 196 Integer Expressions
 197
 198 * Arguments::                   Arguments
 199 * Operators::                   Operators
 200 * Prefix Ops::                  Prefix Operators
 201 * Infix Ops::                   Infix Operators
 202
 203 Assembler Directives
 204
 205 * Abort::                       The Abort directive causes as to abort
 206 * Align::                       Pad the location counter to a power of 2
 207 * App-File::                    Set the logical file name
 208 * Ascii::                       Fill memory with bytes of ASCII characters
 209 * Asciz::                       Fill memory with bytes of ASCII characters followed
 210                 by a null.
 211 * Byte::                        Fill memory with 8-bit integers
 212 * Comm::                        Reserve public space in the BSS segment
 213 * Data::                        Change to the data segment
 214 * Desc::                        Set the n_desc of a symbol
 215 * Double::                      Fill memory with double-precision floating-point numbers
 216 * Else::                        @code{.else}
 217 * End::                         @code{.end}
 218 * Endif::                       @code{.endif}
 219 * Equ::                         @code{.equ @var{symbol}, @var{expression}}
 220 * Extern::                      @code{.extern}
 221 * Fill::                        Fill memory with repeated values
 222 * Float::                       Fill memory with single-precision floating-point numbers
 223 * Global::                      Make a symbol visible to the linker
 224 * Ident::                       @code{.ident}
 225 * If::                          @code{.if @var{absolute expression}}
 226 * Include::                     @code{.include "@var{file}"}
 227 * Int::                         Fill memory with 32-bit integers
 228 * Lcomm::                       Reserve private space in the BSS segment
 229 * Line::                        Set the logical line number
 230 * Ln::                          @code{.ln @var{line-number}}
 231 * List::                        @code{.list}, @code{.nolist}, @code{.eject}, @code{.lflags}, @code{.title}, @code{.sbttl}
 232 * Long::                        Fill memory with 32-bit integers
 233 * Lsym::                        Create a local symbol
 234 * Octa::                        Fill memory with 128-bit integers
 235 * Org::                         Change the location counter
 236 * Quad::                        Fill memory with 64-bit integers
 237 * Set::                         Set the value of a symbol
 238 * Short::                       Fill memory with 16-bit integers
 239 * Single::                      @code{.single @var{flonums}}
 240 * Stab::                        Store debugging information
 241 * Text::                        Change to the text segment
 242 @c if am29k or sparc
 243 * Word::                        Fill memory with 32-bit integers
 244 @c else (not am29k or sparc)
 245 * Deprecated::                  Deprecated Directives
 246 * Machine Options::             Options
 247 * Machine Syntax::              Syntax
 248 * Floating Point::              Floating Point
 249 * Machine Directives::          Machine Directives
 250 * Opcodes::                     Opcodes
 251
 252 Machine Directives
 253
 254 * block::                       @code{.block @var{size} , @var{fill}}
 255 * cputype::                     @code{.cputype}
 256 * file::                        @code{.file}
 257 * hword::                       @code{.hword @var{expressions}}
 258 * line::                        @code{.line}
 259 * reg::                         @code{.reg @var{symbol}, @var{expression}}
 260 * sect::                        @code{.sect}
 261 * use::                         @code{.use @var{segment name}}
 262 @end menu
 263
 264 @node Overview, Syntax, Top, Top
 265 @chapter Overview
 266
 267 This manual is a user guide to the GNU assembler @code{as}.
 268 @c pesch@cygnus.com:
 269 @c                   The following should be conditional on machine config
 270 @c if 680x0
 271 @c This version of the manual describes @code{as} configured to generate
 272 @c code for Motorola 680x0 architectures.
 273 @c fi 680x0
 274 @c if am29k
 275 This version of the manual describes @code{as} configured to generate
 276 code for Advanced Micro Devices' 29K architectures.
 277 @c fi am29k
 278
 279 @menu
 280 * Invoking::                    Invoking @code{as}
 281 * Manual::                      Structure of this Manual
 282 * GNU Assembler::               as, the GNU Assembler
 283 * Command Line::                Command Line
 284 * Input Files::                 Input Files
 285 * Object::                      Output (Object) File
 286 * Errors::                      Error and Warning Messages
 287 * Options::                     Options
 288 @end menu
 289
 290 @node Invoking, Manual, Overview, Overview
 291 @section Invoking @code{as}
 292
 293 Here is a brief summary of how to invoke GNU @code{as}.  For details,
 294 @pxref{Options}.
 295
 296 @c We don't use @deffn and friends for the following because they seem
 297 @c to be limited to one line for the header.
 298 @example
  299   as [ -D ] [ -f ] [ -I @var{path} ] [ -k ] [ -L ] [ -o @var{objfile} ] [ -R ] [ -v ] [ -W ]
 300 @c if 680x0
 301 @c    [ -l ] [ -mc68000 | -mc68010 | -mc68020 ]
 302 @c fi 680x0
 303 @c if am29k
 304 @c@c am29k has no machine-dependent assembler options
 305 @c fi am29k
 306    [ -- | @var{files} @dots{} ]
 307 @end example
 308
 309 @table @code
 310
 311 @item -D
 312 This option is accepted only for script compatibility with calls to
 313 other assemblers; it has no effect on GNU @code{as}.
 314
 315 @item -f
 316 ``fast''---skip preprocessing (assume source is compiler output)
 317
 318 @item -I @var{path}
 319 Add @var{path} to the search list for @code{.include} directives
 320
 321 @item -k
 322 @c if am29k
 323 This option is accepted but has no effect on the 29K family.
 324 @c fi am29k
 325 @c if not am29k
 326 @c Issue warnings when difference tables altered for long displacements
 327 @c fi not am29k
 328
 329 @item -L
 330 Keep (in symbol table) local symbols, starting with @samp{L}
 331
 332 @item -o @var{objfile}
 333 Name the object-file output from @code{as}
 334
 335 @item -R
 336 Fold data segment into text segment
 337
 338 @item -W
 339 Suppress warning messages
 340
 341 @c if 680x0
 342 @c @item -l
 343 @c Shorten references to undefined symbols, to one word instead of two
 344 @c
 345 @c @item -mc68000 | -mc68010 | -mc68020
 346 @c Specify what processor in the 68000 family is the target (default 68020)
 347 @c fi 680x0
 348
 349 @item -- | @var{files} @dots{}
 350 Source files to assemble, or standard input
 351 @end table
 352
 353 @node Manual, GNU Assembler, Invoking, Overview
 354 @section Structure of this Manual
 355 This document is intended to describe what you need to know to use GNU
 356 @code{as}.  We cover the syntax expected in source files, including
 357 notation for symbols, constants, and expressions; the directives that
 358 @code{as} understands; and of course how to invoke @code{as}.
 359
 360 @c if 680x0
 361 @c We also cover special features in the 68000 configuration of @code{as},
 362 @c including pseudo-operations.
 363 @c fi 680x0
 364 @c if am29k
 365 We also cover special features in the AMD 29K configuration of @code{as},
 366 including assembler directives.
 367 @c fi am29k
 368
 369 @ignore
 370   This document also describes some of the
 371 machine-dependent features of various flavors of the assembler.
 372 This document also describes how the assembler works internally, and
 373 provides some information that may be useful to people attempting to
 374 port the assembler to another machine.
 375 @end ignore
 376
 377 On the other hand, this manual is @emph{not} intended as an introduction
 378 to programming in assembly language---let alone programming in general!
 379 In a similar vein, we make no attempt to introduce the machine
 380 architecture; we do @emph{not} describe the instruction set, standard
 381 mnemonics, registers or addressing modes that are standard to a
 382 particular architecture.  You may want to consult the manufacturer's
 383 machine architecture manual for this information.
 384
 385
 386 @c I think this is premature---pesch@cygnus.com, 17jan1991
 387 @ignore
 388 Throughout this document, we assume that you are running @dfn{GNU},
 389 the portable operating system from the @dfn{Free Software
 390 Foundation, Inc.}.  This restricts our attention to certain kinds of
 391 computer (in particular, the kinds of computers that GNU can run on);
 392 once this assumption is granted examples and definitions need less
 393 qualification.
 394
 395 @code{as} is part of a team of programs that turn a high-level
 396 human-readable series of instructions into a low-level
 397 computer-readable series of instructions.  Different versions of
 398 @code{as} are used for different kinds of computer.  In particular,
 399 at the moment, @code{as} only works for the DEC Vax, the Motorola
 400 680x0, the Intel 80386, the Sparc, and the National Semiconductor
 401 32032/32532.
 402 @end ignore
 403
 404 @c There used to be a section "Terminology" here, which defined
 405 @c "contents", "byte", "word", and "long".  Defining "word" to any
 406 @c particular size is confusing when the .word directive may generate 16
 407 @c bits on one machine and 32 bits on another; in general, for the user
 408 @c version of this manual, none of these terms seem essential to define.
 409 @c They were used very little even in the former draft of the manual;
 410 @c this draft makes an effort to avoid them (except in names of
 411 @c directives).
 412
 413 @node GNU Assembler, Command Line, Manual, Overview
 414 @section as, the GNU Assembler
 415 @code{as} is primarily intended to assemble the output of the GNU C
 416 compiler @code{gcc} for use by the linker @code{ld}.  Nevertheless,
 417 we've tried to make @code{as} assemble correctly everything that the native
 418 assembler would.
 419 @c if not am29k
 420 @ignore
 421 Any exceptions are documented explicitly (@pxref{Machine Dependent}).
 422 @end ignore
 423 @c fi not am29k
 424 This doesn't mean @code{as} always uses the same syntax as another
 425 assembler for the same architecture; for example, we know of several
 426 incompatible versions of 680x0 assembly language syntax.
 427
 428 GNU @code{as} is really a family of assemblers.  If you use (or have
 429 used) GNU @code{as} on another architecture, you should find a fairly
 430 similar environment.  Each version has much in common with the others,
 431 including object file formats, most assembler directives (often called
  432 @dfn{pseudo-ops}) and assembler syntax.
 433
 434 Unlike older assemblers, @code{as} is designed to assemble a source
 435 program in one pass of the source file.  This has a subtle impact on the
  436 @code{.org} directive (@pxref{Org}).
 437
 438 @node Command Line, Input Files, GNU Assembler, Overview
 439 @section Command Line
 440
 441 After the program name @code{as}, the command line may contain
 442 options and file names.  Options may be in any order, and may be
 443 before, after, or between file names.  The order of file names is
 444 significant.
 445
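As a sketch of these rules, assume two hypothetical source files
@file{a.s} and @file{b.s}.  The first two command lines below are
equivalent, because only the position of the option differs; the third
assembles a different source program, because the order of the file
names has changed:

@example
as -o prog.o a.s b.s
as a.s -o prog.o b.s
as -o prog.o b.s a.s
@end example
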
  446 @samp{--} (two hyphens) by itself names the standard input file
 447 explicitly, as one of the files for @code{as} to assemble.
 448
 449 Except for @samp{--} any command line argument that begins with a
 450 hyphen (@samp{-}) is an option.  Each option changes the behavior of
 451 @code{as}.  No option changes the way another option works.  An
 452 option is a @samp{-} followed by one or more letters; the case of
 453 the letter is important.   All options are optional.
 454
 455 Some options expect exactly one file name to follow them.  The file
 456 name may either immediately follow the option's letter (compatible
 457 with older assemblers) or it may be the next command argument (GNU
 458 standard).  These two command lines are equivalent:
 459
 460 @example
 461 as -o my-object-file.o mumble
 462 as -omy-object-file.o mumble
 463 @end example
 464
 465 @node Input Files, Object, Command Line, Overview
 466 @section Input Files
 467
 468 We use the phrase @dfn{source program}, abbreviated @dfn{source}, to
 469 describe the program input to one run of @code{as}.  The program may
 470 be in one or more files; how the source is partitioned into files
 471 doesn't change the meaning of the source.
 472
 473 @c I added "con" prefix to "catenation" just to prove I can overcome my
 474 @c APL training...   pesch@cygnus.com
 475 The source program is a concatenation of the text in all the files, in the
 476 order specified.
 477
 478 Each time you run @code{as} it assembles exactly one source
 479 program.  The source program is made up of one or more files.
 480 (The standard input is also a file.)
 481
 482 You give @code{as} a command line that has zero or more input file
 483 names.  The input files are read (from left file name to right).  A
 484 command line argument (in any position) that has no special meaning
 485 is taken to be an input file name.
 486
 487 If @code{as} is given no file names it attempts to read one input file
 488 from @code{as}'s standard input, which is normally your terminal.  You
 489 may have to type @key{ctl-D} to tell @code{as} there is no more program
 490 to assemble.
 491
 492 Use @samp{--} if you need to explicitly name the standard input file
 493 in your command line.
 494
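For illustration, the following sketch assembles one source program
held in three pieces.  The file names @file{head.s}, @file{body.s} and
@file{tail.s} are hypothetical, and the shell redirection @samp{<} is
assumed to supply the standard input:

@example
as -o prog.o head.s -- tail.s < body.s
@end example

@noindent
Here @code{as} reads @file{head.s} first, then the standard input
(@file{body.s}), then @file{tail.s}, and treats the concatenation as a
single source program.
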
 495 If the source is empty, @code{as} will produce a small, empty object
 496 file.
 497
 498 @menu
 499 * Filenames::                   Input Filenames and Line-numbers
 500 @end menu
 501
 502 @node Filenames,  , Input Files, Input Files
 503 @subsection Input Filenames and Line-numbers
 504 There are two ways of locating a line in the input file (or files) and both
 505 are used in reporting error messages.  One way refers to a line
 506 number in a physical file; the other refers to a line number in a
 507 ``logical'' file.
 508
 509 @dfn{Physical files} are those files named in the command line given
 510 to @code{as}.
 511
 512 @dfn{Logical files} are simply names declared explicitly by assembler
 513 directives; they bear no relation to physical files.  Logical file names
 514 help error messages reflect the original source file, when @code{as}
 515 source is itself synthesized from other files.  @xref{App-File}.
 516
 517 @node Object, Errors, Input Files, Overview
 518 @section Output (Object) File
 519 Every time you run @code{as} it produces an output file, which is
 520 your assembly language program translated into numbers.  This file
 521 is the object file, named @code{a.out} unless you tell @code{as} to
 522 give it another name by using the @code{-o} option.  Conventionally,
 523 object file names end with @file{.o}.  The default name of
 524 @file{a.out} is used for historical reasons:  older assemblers were
 525 capable of assembling self-contained programs directly into a
 526 runnable program.
 527 @c This may still work, but hasn't been tested.
 528
 529 The object file is meant for input to the linker @code{ld}.  It contains
 530 assembled program code, information to help @code{ld} integrate
 531 the assembled program into a runnable file, and (optionally) symbolic
 532 information for the debugger.
 533
 534 @comment link above to some info file(s) like the description of a.out.
 535 @comment don't forget to describe GNU info as well as Unix lossage.
 536
 537 @node Errors, Options, Object, Overview
 538 @section Error and Warning Messages
 539
 540 @code{as} may write warnings and error messages to the standard error
 541 file (usually your terminal).  This should not happen when @code{as} is
 542 run automatically by a compiler.  Warnings report an assumption made so
 543 that @code{as} could keep assembling a flawed program; errors report a
 544 grave problem that stops the assembly.
 545
 546 Warning messages have the format
 547 @example
 548 file_name:@b{NNN}:Warning Message Text
 549 @end example
 550 @noindent(where @b{NNN} is a line number).  If a logical file name has
 551 been given (@pxref{App-File}) it is used for the filename, otherwise the
 552 name of the current input file is used.  If a logical line number was
 553 given
 554 @c if not am29k
 555 @c (@pxref{Line})
 556 @c fi not am29k
 557 @c if am29k
 558 (@pxref{Ln})
 559 @c fi am29k
 560 then it is used to calculate the number printed,
 561 otherwise the actual line in the current source file is printed.  The
 562 message text is intended to be self explanatory (in the grand Unix
 563 tradition). @refill
 564
 565 Error messages have the format
 566 @example
 567 file_name:@b{NNN}:FATAL:Error Message Text
 568 @end example
 569 The file name and line number are derived as for warning
  570 messages.  The actual message text may be rather less explanatory
  571 because many of these errors aren't supposed to happen.
 572
 573 @group
 574 @node Options,  , Errors, Overview
 575 @section Options
 576 @subsection @code{-D}
 577 This option has no effect whatsoever, but it is accepted to make it more
 578 likely that scripts written for other assemblers will also work with
 579 GNU @code{as}.
 580 @end group
 581
 582 @subsection Work Faster: @code{-f}
 583 @samp{-f} should only be used when assembling programs written by a
 584 (trusted) compiler.  @samp{-f} stops the assembler from pre-processing
 585 the input file(s) before assembling them.
 586 @quotation
 587 @emph{Warning:} if the files actually need to be pre-processed (if they
 588 contain comments, for example), @code{as} will not work correctly if
 589 @samp{-f} is used.
 590 @end quotation
 591
 592 @subsection Add to @code{.include} search path: @code{-I} @var{path}
 593 Use this option to add a @var{path} to the list of directories GNU
 594 @code{as} will search for files specified in @code{.include} directives
 595 (@pxref{Include}).  You may use @code{-I} as many times as necessary to
 596 include a variety of paths.  The current working directory is always
 597 searched first; after that, @code{as} searches any @samp{-I} directories
 598 in the same order as they were specified (left to right) on the command
 599 line.
 600
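For instance, assuming two hypothetical directories @file{include} and
@file{../common}, and a hypothetical source file @file{prog.s}:

@example
as -I include -I ../common -o prog.o prog.s
@end example

@noindent
a directive such as @code{.include "defs.s"} in @file{prog.s} is looked
up first in the current working directory, then in @file{include}, and
finally in @file{../common}.
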
 601 @subsection Warn if difference tables altered: @code{-k}
 602 @c if am29k
 603 On the AMD 29K family, this option is allowed, but has no effect.  It is
 604 permitted for compatibility with GNU @code{as} on other platforms,
 605 where it can be used to warn when @code{as} alters the machine code
 606 generated for @samp{.word} directives in difference tables.  The AMD 29K
 607 family does not have the addressing limitations that sometimes lead to this
 608 alteration on other platforms.
 609 @c fi am29k
 610
 611 @c if not am29k
 612 @ignore
 613 @code{as} sometimes alters the code emitted for directives of the form
 614 @samp{.word @var{sym1}-@var{sym2}}; @pxref{Word}.
 615 You can use the @samp{-k} option if you want a warning issued when this
 616 is done.
 617 @end ignore
 618 @c fi not am29k
 619
 620 @subsection Include Local Labels: @code{-L}
 621 Labels beginning with @samp{L} (upper case only) are called @dfn{local
  622 labels}. @xref{Symbol Names}.  Normally you don't see such labels when
  623 debugging, because they are intended for the use of programs (like
  624 compilers) that compose assembler programs, not for your notice.
  625 By default both @code{as} and @code{ld} discard such labels, so they
  626 do not appear when you debug.
 627
 628 This option tells @code{as} to retain those @samp{L@dots{}} symbols
 629 in the object file.  Usually if you do this you also tell the linker
 630 @code{ld} to preserve symbols whose names begin with @samp{L}.
 631
 632 @subsection Name the Object File: @code{-o}
 633 There is always one object file output when you run @code{as}.  By
 634 default it has the name @file{a.out}.  You use this option (which
 635 takes exactly one filename) to give the object file a different name.
 636
 637 Whatever the object file is called, @code{as} will overwrite any
 638 existing file of the same name.
 639
 640 @subsection Fold Data Segment into Text Segment: @code{-R}
 641 @code{-R} tells @code{as} to write the object file as if all
 642 data-segment data lives in the text segment.  This is only done at
 643 the very last moment:  your binary data are the same, but data
 644 segment parts are relocated differently.  The data segment part of
  645 your object file is zero bytes long because all its bytes are
 646 appended to the text segment.  (@xref{Segments}.)
 647
  648 When you specify @code{-R} it would be possible to generate shorter
  649 address displacements (because we don't have to cross between text and
  650 data segment).  We refrain from doing this simply for compatibility with
  651 older versions of @code{as}.  In future, @code{-R} may work this way.
 652
 653 @subsection Suppress Warnings: @code{-W}
 654 @code{as} should never give a warning or error message when
 655 assembling compiler output.  But programs written by people often
 656 cause @code{as} to give a warning that a particular assumption was
 657 made.  All such warnings are directed to the standard error file.
 658 If you use this option, no warnings are issued.  This option only
 659 affects the warning messages: it does not change any particular of how
 660 @code{as} assembles your file.  Errors, which stop the assembly, are
 661 still reported.
 662
 663 @node Syntax, Segments, Overview, Top
 664 @chapter Syntax
 665 This chapter describes the machine-independent syntax allowed in a
 666 source file.  @code{as} syntax is similar to what many other assemblers
  667 use; it is inspired by the BSD 4.2
 668 @c if not vax
 669 assembler. @refill
 670 @c fi not vax
 671 @c if vax
 672 @c assembler, except that @code{as} does not
 673 @c assemble Vax bit-fields.
 674 @c fi vax
 675
 676 @menu
 677 * Pre-processing::              Pre-processing
 678 * Whitespace::                  Whitespace
 679 * Comments::                    Comments
 680 * Symbol Intro::                Symbols
 681 * Statements::                  Statements
 682 * Constants::                   Constants
 683 @end menu
 684
 685 @node Pre-processing, Whitespace, Syntax, Syntax
 686 @section Pre-processing
 687
 688 The pre-processor:
 689 @itemize @bullet
 690 @item
 691 adjusts and removes extra whitespace.  It leaves one space or tab before
 692 the keywords on a line, and turns any other whitespace on the line into
 693 a single space.
 694
 695 @item
 696 removes all comments, replacing them with a single space, or an
 697 appropriate number of newlines.
 698
 699 @item
 700 converts character constants into the appropriate numeric values.
 701 @end itemize
 702
 703 Excess whitespace, comments, and character constants
 704 cannot be used in the portions of the input text that are not
 705 pre-processed.
 706
 707 If the first line of an input file is @code{#NO_APP} or the @samp{-f}
 708 option is given, the input file will not be pre-processed.  Within such
 709 an input file, parts of the file can be pre-processed by putting a line
 710 that says @code{#APP} before the text that should be pre-processed, and
 711 putting a line that says @code{#NO_APP} after them.  This feature is
  712 mainly intended to support @code{asm} statements in compilers whose output
 713 normally does not need to be pre-processed.
 714
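A minimal sketch of such a file follows; the choice of directives is
only illustrative.  Outside the @code{#APP} region the text is written
without comments or excess whitespace, since it is not pre-processed:

@example
#NO_APP
 .text
#APP
        /* comments and   extra whitespace are permitted here */
        .data
#NO_APP
 .text
@end example
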
 715 @node Whitespace, Comments, Pre-processing, Syntax
 716 @section Whitespace
 717 @dfn{Whitespace} is one or more blanks or tabs, in any order.
 718 Whitespace is used to separate symbols, and to make programs neater
 719 for people to read.  Unless within character constants
 720 (@pxref{Characters}), any whitespace means the same as exactly one
 721 space.
 722
 723 @node Comments, Symbol Intro, Whitespace, Syntax
 724 @section Comments
 725 There are two ways of rendering comments to @code{as}.  In both
 726 cases the comment is equivalent to one space.
 727
 728 Anything from @samp{/*} through the next @samp{*/} is a comment.
 729 This means you may not nest these comments.
 730
 731 @example
 732 /*
 733   The only way to include a newline ('\n') in a comment
 734   is to use this sort of comment.
 735 */
 736
 737 /* This sort of comment does not nest. */
 738 @end example
 739
 740 Anything from the @dfn{line comment} character to the next newline
 741 is considered a comment and is ignored.  The line comment character is
 742 @c if vax
 743 @c @samp{#} on the Vax. @xref{Machine Dependent}. @refill
 744 @c @fi vax
 745 @c if 680x0
 746 @c @samp{|} on the 680x0. @xref{Machine Dependent}.  @refill
 747 @c fi 680x0
 748 @c if am29k
 749 @samp{;} for the AMD 29K family. @xref{Machine Dependent}.  @refill
 750 @c fi am29k
 751 @ignore
 752 @if all-arch
 753 On some machines there are two different line comment characters.  One
 754 will only begin a comment if it is the first non-whitespace character on
 755 a line, while the other will always begin a comment.
 756 @fi all-arch
 757 @end ignore
 758
 759 To be compatible with past assemblers a special interpretation is
 760 given to lines that begin with @samp{#}.  Following the @samp{#} an
 761 absolute expression (@pxref{Expressions}) is expected:  this will be
 762 the logical line number of the @b{next} line.  Then a string
 763 (@pxref{Strings}) is allowed: if present it is a new logical file
 764 name.  The rest of the line, if any, should be whitespace.
 765
 766 If the first non-whitespace characters on the line are not numeric,
 767 the line is ignored.  (Just like a comment.)
 768 @example
 769                           # This is an ordinary comment.
 770 # 42-6 "new_file_name"    # New logical file name
 771                           # This is logical line # 36.
 772 @end example
 773 This feature is deprecated, and may disappear from future versions
 774 of @code{as}.
 775
 776 @node Symbol Intro, Statements, Comments, Syntax
 777 @section Symbols
 778 A @dfn{symbol} is one or more characters chosen from the set of all
 779 letters (both upper and lower case), digits and the three characters
 780 @samp{_.$}.  No symbol may begin with a digit.  Case is significant.
 781 There is no length limit: all characters are significant.  Symbols are
 782 delimited by characters not in that set, or by the beginning of a file
 783 (since the source program must end with a newline, the end of a file is
 784 not a possible symbol delimiter).  @xref{Symbols}.
 785
 786 @node Statements, Constants, Symbol Intro, Syntax
 787 @section Statements
 788 A @dfn{statement} ends at a newline character (@samp{\n})
 789 @c @if m680x0 (or is this if !am29k?)
 790 @c or at a semicolon (@samp{;}).  The newline or semicolon
 791 @c fi m680x0 (or !am29k)
 792 @c if am29k
 793 or an ``at'' sign (@samp{@@}).  The newline or at sign
 794 @c fi am29k
 795 is considered part
 796 of the preceding statement.  Newlines
 797 @c if m680x0 (or !am29k)
 798 @c and semicolons
 799 @c fi m680x0 (or !am29k)
 800 @c if am29k
 801 and at signs
 802 @c fi am29k
 803 within
 804 character constants are an exception:  they don't end statements.
 805 It is an error to end any statement with end-of-file:  the last
 806 character of any input file should be a newline.@refill
 807
 808 You may write a statement on more than one line if you put a
 809 backslash (@kbd{\}) immediately in front of any newlines within the
 810 statement.  When @code{as} reads a backslashed newline both
 811 characters are ignored.  You can even put backslashed newlines in
 812 the middle of symbol names without changing the meaning of your
 813 source program.
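
For instance, assuming the generic @code{.byte} and @code{.long}
directives and a hypothetical symbol @code{my_symbol}, the following
sketch shows two statements, each split across two lines:

@example
.byte 1, 2, \
      3, 4
.long my_\
symbol
@end example

The second statement should read as @samp{.long my_symbol}, since the
backslash and the newline are both discarded.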
 814
 815 An empty statement is allowed, and may include whitespace.  It is ignored.
 816
 817 @c "key symbol" is not used elsewhere in the document; seems pedantic to
 818 @c @defn{} it in that case, as was done previously...  pesch@cygnus.com,
 819 @c 13feb91.
 820 A statement begins with zero or more labels, optionally followed by a
 821 key symbol which determines what kind of statement it is.  The key
 822 symbol determines the syntax of the rest of the statement.  If the
 823 symbol begins with a dot @samp{.} then the statement is an assembler
 824 directive: typically valid for any computer.  If the symbol begins with
 825 a letter the statement is an assembly language @dfn{instruction}: it
 826 will assemble into a machine language instruction.  Different versions
 827 of @code{as} for different computers will recognize different
 828 instructions.  In fact, the same symbol may represent a different
 829 instruction in a different computer's assembly language.
 830
 831 A label is a symbol immediately followed by a colon (@code{:}).
 832 Whitespace before a label or after a colon is permitted, but you may not
 833 have whitespace between a label's symbol and its colon. @xref{Labels}.
 834
 835 @example
 836 label:     .directive    followed by something
 837 another$label:           # This is an empty statement.
 838            instruction   operand_1, operand_2, @dots{}
 839 @end example
 840
 841 @node Constants,  , Statements, Syntax
 842 @section Constants
 843 A constant is a number, written so that its value is known by
 844 inspection, without knowing any context.  Like this:
 845 @example
 846 .byte  74, 0112, 092, 0x4A, 0X4a, 'J, '\J # All the same value.
 847 .ascii "Ring the bell\7"                  # A string constant.
 848 .octa  0x123456789abcdef0123456789ABCDEF0 # A bignum.
 849 .float 0f-314159265358979323846264338327\
 850 95028841971.693993751E-40                 # - pi, a flonum.
 851 @end example
 852
 853 @menu
 854 * Characters::                  Character Constants
 855 * Numbers::                     Number Constants
 856 @end menu
 857
 858 @node Characters, Numbers, Constants, Constants
 859 @subsection Character Constants
 860 There are two kinds of character constants.  A @dfn{character} stands
 861 for one character in one byte and its value may be used in
 862 numeric expressions.  String constants (properly called string
 863 @emph{literals}) are potentially many bytes and their values may not be
 864 used in arithmetic expressions.
 865
 866 @menu
 867 * Strings::                     Strings
 868 * Chars::                       Characters
 869 @end menu
 870
 871 @node Strings, Chars, Characters, Characters
 872 @subsubsection Strings
 873 A @dfn{string} is written between double-quotes.  It may contain
 874 double-quotes or null characters.  The way to get special characters
 875 into a string is to @dfn{escape} these characters: precede them with
 876 a backslash @samp{\} character.  For example @samp{\\} represents
 877 one backslash:  the first @code{\} is an escape which tells
 878 @code{as} to interpret the second character literally as a backslash
 879 (which prevents @code{as} from recognizing the second @code{\} as an
 880 escape character).  The complete list of escapes follows.
 881
 882 @table @kbd
 883 @c      @item \a
 884 @c      Mnemonic for ACKnowledge; for ASCII this is octal code 007.
 885 @item \b
 886 Mnemonic for backspace; for ASCII this is octal code 010.
 887 @c      @item \e
 888 @c      Mnemonic for EOText; for ASCII this is octal code 004.
 889 @item \f
 890 Mnemonic for FormFeed; for ASCII this is octal code 014.
 891 @item \n
 892 Mnemonic for newline; for ASCII this is octal code 012.
 893 @c      @item \p
 894 @c      Mnemonic for prefix; for ASCII this is octal code 033, usually known as @code{escape}.
 895 @item \r
 896 Mnemonic for carriage-Return; for ASCII this is octal code 015.
 897 @c      @item \s
 898 @c      Mnemonic for space; for ASCII this is octal code 040.  Included for compliance with
 899 @c      other assemblers.
 900 @item \t
 901 Mnemonic for horizontal Tab; for ASCII this is octal code 011.
 902 @c      @item \v
 903 @c      Mnemonic for Vertical tab; for ASCII this is octal code 013.
 904 @c      @item \x @var{digit} @var{digit} @var{digit}
 905 @c      A hexadecimal character code.  The numeric code is 3 hexadecimal digits.
 906 @item \ @var{digit} @var{digit} @var{digit}
 907 An octal character code.  The numeric code is 3 octal digits.
 908 For compatibility with other Unix systems, 8 and 9 are accepted as digits:
 909 for example, @code{\008} has the value 010, and @code{\009} the value 011.
 910 @item \\
 911 Represents one @samp{\} character.
 912 @c      @item \'
 913 @c      Represents one @samp{'} (accent acute) character.
 914 @c      This is needed in single character literals
 915 @c      (@xref{Characters}.) to represent
 916 @c      a @samp{'}.
 917 @item \"
 918 Represents one @samp{"} character.  Needed in strings to represent
 919 this character, because an unescaped @samp{"} would end the string.
 920 @item \ @var{anything-else}
 921 Any other character when escaped by @kbd{\} will give a warning, but
 922 assemble as if the @samp{\} was not present.  The idea is that if
 923 you used an escape sequence you clearly didn't want the literal
 924 interpretation of the following character.  However @code{as} has no
 925 other interpretation, so @code{as} knows it is giving you the wrong
 926 code and warns you of the fact.
 927 @end table
 928
 929 Which characters are escapable, and what those escapes represent,
 930 varies widely among assemblers.  The current set is what we think
 931 BSD 4.2 @code{as} recognizes, and is a subset of what most C
 932 compilers recognize.  If you are in doubt, don't use an escape
 933 sequence.
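
A few of the escapes listed above in use; this is only an illustrative
sketch, and the string contents are made up:

@example
.ascii "one\ttwo\n"          /* a tab between the words, then a newline */
.ascii "say \"hi\""          /* escaped double-quotes inside the string */
.ascii "back\\slash"         /* a single backslash */
.ascii "\101"                /* octal 101, the ASCII code for A */
@end example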
 934
 935 @node Chars,  , Strings, Characters
 936 @subsubsection Characters
 937 A single character may be written as a single quote immediately
 938 followed by that character.  The same escapes apply to characters as
 939 to strings.  So if you want to write the character backslash, you
 940 must write @kbd{'\\} where the first @code{\} escapes the second
 941 @code{\}.  As you can see, the quote is an acute accent, not a
 942 grave accent.  A newline
 943 @c if 680x0 (or !am29k)
 944 @c (or semicolon @samp{;})
 945 @c fi 680x0 (or !am29k)
 946 @c if am29k
 947 (or at sign @samp{@@})
 948 @c fi am29k
 949 immediately
 950 following an acute accent is taken as a literal character and does
 951 not count as the end of a statement.  The value of a character
 952 constant in a numeric expression is the machine's byte-wide code for
 953 that character.  @code{as} assumes your character code is ASCII: @kbd{'A}
 954 means 65, @kbd{'B} means 66, and so on. @refill
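
As a small sketch (the choice of characters is arbitrary), each operand
below is a character constant used as a byte value, assuming ASCII:

@example
.byte 'A, 'B         /* 65 and 66 */
.byte '\\            /* 92, one backslash */
.byte '\n            /* 10, a newline (octal 012) */
@end example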
 955
 956 @node Numbers,  , Characters, Constants
 957 @subsection Number Constants
 958 @code{as} distinguishes three kinds of numbers according to how they
 959 are stored in the target machine.  @emph{Integers} are numbers that
 960 would fit into an @code{int} in the C language.  @emph{Bignums} are
 961 integers, but they are stored in more than 32 bits.  @emph{Flonums}
 962 are floating point numbers, described below.
 963
 964 @subsubsection Integers
 965 A binary integer is @samp{0b} or @samp{0B} followed by zero or more of
 966 the binary digits @samp{01}.
 967
 968 An octal integer is @samp{0} followed by zero or more of the octal
 969 digits (@samp{01234567}).
 970
 971 A decimal integer starts with a non-zero digit followed by zero or
 972 more digits (@samp{0123456789}).
 973
 974 A hexadecimal integer is @samp{0x} or @samp{0X} followed by one or
 975 more hexadecimal digits chosen from @samp{0123456789abcdefABCDEF}.
 976
 977 Integers have the usual values.  To denote a negative integer, use
 978 the prefix operator @samp{-} discussed under expressions
 979 (@pxref{Prefix Ops}).
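
For example, under the rules above each operand of the first statement
below should denote the same value, 74; the second statement shows a
negated integer.  The choice of value is arbitrary.

@example
.byte 0b1001010, 0112, 74, 0x4a, 0X4A
.word -74
@end example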
 980
 981 @subsubsection Bignums
 982 A @dfn{bignum} has the same syntax and semantics as an integer
 983 except that the number (or its negative) takes more than 32 bits to
 984 represent in binary.  The distinction is made because in some places
 985 integers are permitted while bignums are not.
 986
 987 @subsubsection Flonums
 988 A @dfn{flonum} represents a floating point number.  The translation is
 989 complex: a decimal floating point number from the text is converted by
 990 @code{as} to a generic binary floating point number of more than
 991 sufficient precision.  This generic floating point number is converted
 992 to a particular computer's floating point format (or formats) by a
 993 portion of @code{as} specialized to that computer.
 994
 995 A flonum is written by writing (in order)
 996 @itemize @bullet
 997 @item
 998 The digit @samp{0}.
 999 @item
1000 @c if am29k
1001 One of the letters @samp{DFPRSX} (in upper or lower case), to tell
1002 @code{as} the rest of the number is a flonum.
1003 @c fi am29k
1004 @c if not am29k
1005 @ignore
1006 A letter, to tell @code{as} the rest of the number is a flonum.  @kbd{e}
1007 is recommended.  Case is not important.  (Any otherwise illegal letter
1008 will work here, but that might be changed.  Vax BSD 4.2 assembler seems
1009 to allow any of @samp{defghDEFGH}.)
1010 @end ignore
1011 @c fi not am29k
1012 @item
1013 An optional sign: either @samp{+} or @samp{-}.
1014 @item
1015 An optional @dfn{integer part}: zero or more decimal digits.
1016 @item
1017 An optional @dfn{fraction part}: @samp{.} followed by zero
1018 or more decimal digits.
1019 @item
1020 An optional exponent, consisting of:
1021 @itemize @bullet
1022 @item
1023 @c if am29k
1024 An @samp{E} or @samp{e}.
1025 @c if not am29k
1026 @ignore
1027 A letter; the exact significance varies according to
1028 the computer that executes the program.  @code{as}
1029 accepts any letter for now.  Case is not important.
1030 @end ignore
1031 @c fi not am29k
1032 @item
1033 Optional sign: either @samp{+} or @samp{-}.
1034 @item
1035 One or more decimal digits.
1036 @end itemize
1037 @end itemize
1038
1039 At least one of @var{integer part} or @var{fraction part} must be
1040 present.  The floating point number has the usual base-10 value.
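
The following sketch shows a few flonums that fit the syntax just
described, using the letter @samp{f} and the @code{.float} directive;
the values themselves are arbitrary:

@example
.float 0f3.14159        /* integer part and fraction part */
.float 0f-.5            /* sign and fraction part only */
.float 0f12e3           /* integer part and exponent: 12000 */
.float 0f+6.02E+23      /* signs on both the number and the exponent */
@end example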
1041
1042 @code{as} does all processing using integers.  Flonums are computed
1043 independently of any floating point hardware in the computer running
1044 @code{as}.
1045
1046 @node Segments, Symbols, Syntax, Top
1047 @chapter Segments and Relocation
1048 @menu
1049 * Segs Background::             Background
1050 * ld Segments::                 ld Segments
1051 * as Segments::                 as Internal Segments
1052 * Sub-Segments::                Sub-Segments
1053 * bss::                         bss Segment
1054 @end menu
1055
1056 @node Segs Background, ld Segments, Segments, Segments
1057 @section Background
1058 Roughly, a segment is a range of addresses, with no gaps; all data
1059 ``in'' those addresses is treated the same for some particular purpose.
1060 For example there may be a ``read only'' segment.
1061
1062 The linker @code{ld} reads many object files (partial programs) and
1063 combines their contents to form a runnable program.  When @code{as}
1064 emits an object file, the partial program is assumed to start at address
1065 0.  @code{ld} will assign the final addresses the partial program
1066 occupies, so that different partial programs don't overlap.  This is
1067 actually an over-simplification, but it will suffice to explain how
1068 @code{as} uses segments.
1069
1070 @code{ld} moves blocks of bytes of your program to their run-time
1071 addresses.  These blocks slide to their run-time addresses as rigid
1072 units; their length does not change and neither does the order of bytes
1073 within them.  Such a rigid unit is called a @emph{segment}.  Assigning
1074 run-time addresses to segments is called @dfn{relocation}.  It includes
1075 the task of adjusting mentions of object-file addresses so they refer to
1076 the proper run-time addresses.
1077
1078 An object file written by @code{as} has three segments, any of which may
1079 be empty.  These are named @dfn{text}, @dfn{data} and @dfn{bss}
1080 segments.  Within the object file, the text segment starts at
1081 address @code{0}, the data segment follows, and the bss segment
1082 follows the data segment.
1083
1084 To let @code{ld} know which data will change when the segments are
1085 relocated, and how to change that data, @code{as} also writes to the
1086 object file details of the relocation needed.  To perform relocation
1087 @code{ld} must know, each time an address in the object
1088 file is mentioned:
1089 @itemize @bullet
1090 @item
1091 Where in the object file is the beginning of this reference to
1092 an address?
1093 @item
1094 How long (in bytes) is this reference?
1095 @item
1096 Which segment does the address refer to?  What is the numeric value of
1097 @display
1098 (@var{address}) @minus{} (@var{start-address of segment})?
1099 @end display
1100 @item
1101 Is the reference to an address ``Program-Counter relative''?
1102 @end itemize
1103
1104 In fact, every address @code{as} ever uses is expressed as
1105 @code{(@var{segment}) + (@var{offset into segment})}.  Further, every
1106 expression @code{as} computes is of this segmented nature.
1107 @dfn{Absolute expression} means an expression with segment ``absolute''
1108 (@pxref{ld Segments}).  A @dfn{pass1 expression} means an expression
1109 with segment ``pass1'' (@pxref{as Segments}).  In this manual we use the
1110 notation @{@var{segname} @var{N}@} to mean ``offset @var{N} into segment
1111 @var{segname}''.
1112
1113 Apart from text, data and bss segments you need to know about the
1114 @dfn{absolute} segment.  When @code{ld} mixes partial programs,
1115 addresses in the absolute segment remain unchanged.  That is, address
1116 @code{@{absolute 0@}} is ``relocated'' to run-time address 0 by @code{ld}.
1117 Although two partial programs' data segments will not overlap addresses
1118 after linking, @emph{by definition} their absolute segments will overlap.
1119 Address @code{@{absolute@ 239@}} in one partial program will always be the same
1120 address when the program is running as address @code{@{absolute@ 239@}} in any
1121 other partial program.
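
As a worked illustration with made-up numbers, suppose @code{ld} places
one partial program's text segment at run-time address 0x2000 and its
data segment at run-time address 0x5000:

@example
@{text 0x10@}     becomes run-time address  0x2000 + 0x10 = 0x2010
@{data 0x4@}      becomes run-time address  0x5000 + 0x4  = 0x5004
@{absolute 239@}  remains run-time address  239
@end example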
1122
1123 The idea of segments is extended to the @dfn{undefined} segment.  Any
1124 address whose segment is unknown at assembly time is by definition
1125 rendered @{undefined @var{U}@}---where @var{U} will be filled in later.
1126 Since numbers are always defined, the only way to generate an undefined
1127 address is to mention an undefined symbol.  A reference to a named
1128 common block would be such a symbol: its value is unknown at assembly
1129 time so it has segment @emph{undefined}.
1130
1131 By analogy the word @emph{segment} is used to describe groups of segments in
1132 the linked program.  @code{ld} puts all partial programs' text
1133 segments in contiguous addresses in the linked program.  It is
1134 customary to refer to the @emph{text segment} of a program, meaning all
1135 the addresses of all partial programs' text segments.  Likewise for
1136 data and bss segments.
1137
1138 Some segments are manipulated by @code{ld}; others are invented for
1139 use of @code{as} and have no meaning except during assembly.
1140
1148 @node ld Segments, as Segments, Segs Background, Segments
1149 @section ld Segments
1150 @code{ld} deals with just five kinds of segments, summarized below.
1151
1152 @table @strong
1153
1154 @item text segment
1155 @itemx data segment
1156 These segments hold your program.  @code{as} and @code{ld} treat them as
1157 separate but equal segments.  Anything you can say of one segment is
1158 true of the other.  When the program is running, however, it is
1159 customary for the text segment to be unalterable.  The
1160 text segment is often shared among processes: it will contain
1161 instructions, constants and the like.  The data segment of a running
1162 program is usually alterable: for example, C variables would be stored
1163 in the data segment.
1164
1165 @item bss segment
1166 This segment contains zeroed bytes when your program begins running.  It
1167 is used to hold uninitialized variables or common storage.  The length of
1168 each partial program's bss segment is important, but because it starts
1169 out containing zeroed bytes there is no need to store explicit zero
1170 bytes in the object file.  The bss segment was invented to eliminate
1171 those explicit zeros from object files.
1172
1173 @item absolute segment
1174 Address 0 of this segment is always ``relocated'' to run-time address 0.
1175 This is useful if you want to refer to an address that @code{ld} must
1176 not change when relocating.  In this sense we speak of absolute
1177 addresses being ``unrelocatable'': they don't change during relocation.
1178
1179 @item @code{undefined} segment
1180 This ``segment'' is a catch-all for address references to objects not in
1181 the preceding segments.
1182 @c FIXME: ref to some other doc on obj-file formats could go here.
1183
1184 @end table
1185
1186 An idealized example of the 3 relocatable segments follows.  Memory
1187 addresses are on the horizontal axis.
1188
1189 @ifinfo
1190 @example
1191                       +-----+----+--+
1192 partial program # 1:  |ttttt|dddd|00|
1193                       +-----+----+--+
1194
1195                       text   data bss
1196                       seg.   seg. seg.
1197
1198                       +---+---+---+
1199 partial program # 2:  |TTT|DDD|000|
1200                       +---+---+---+
1201
1202                       +--+---+-----+--+----+---+-----+~~
1203 linked program:       |  |TTT|ttttt|  |dddd|DDD|00000|
1204                       +--+---+-----+--+----+---+-----+~~
1205
1206     addresses:        0 @dots{}
1207 @end example
1208 @end ifinfo
1209 @tex
1210 \halign{\hfil\rm #\quad&#\cr
1211 \cr
1212    &\ibox{2.5cm}{\tt text}\ibox{2cm}{\tt data}\ibox{1cm}{\tt bss}\cr
1213 Partial program \#1:
1214 &\boxit{2.5cm}{\tt ttttt}\boxit{2cm}{\tt dddd}\boxit{1cm}{\tt 00}\cr
1215 \cr
1216    &\ibox{1cm}{\tt text}\ibox{1.5cm}{\tt data}\ibox{1cm}{\tt bss}\cr
1217 Partial program \#2:
1218 &\boxit{1cm}{\tt TTT}\boxit{1.5cm}{\tt DDD}\boxit{1cm}{\tt 000}\cr
1219 \cr
1220    &\ibox{.5cm}{}\ibox{1cm}{\tt text}\ibox{2.5cm}{}\ibox{.75cm}{}\ibox{2cm}{\tt data}\ibox{1.5cm}{}\ibox{2cm}{\tt bss}\cr
1221 linked program:
1222 &\boxit{.5cm}{}\boxit{1cm}{\tt TTT}\boxit{2.5cm}{\tt
1223 ttttt}\boxit{.75cm}{}\boxit{2cm}{\tt dddd}\boxit{1.5cm}{\tt
1224 DDD}\boxit{2cm}{\tt 00000}\ \dots\cr
1225 addresses:
1226 &\dots\cr
1227 }
1228 @end tex
1229
1230 @node as Segments, Sub-Segments, ld Segments, Segments
1231 @section as Internal Segments
1232 These segments are invented for the internal use of @code{as}.  They
1233 have no meaning at run-time.  You don't need to know about these
1234 segments except that they might be mentioned in @code{as}' warning
1235 messages.  These segments are invented to permit the value of every
1236 expression in your assembly language program to be a segmented
1237 address.
1238
1239 @table @b
1240 @item absent segment
1241 An expression was expected and none was
1242 found.
1243
1244 @item goof segment
1245 An internal assembler logic error has been
1246 found.  This means there is a bug in the assembler.
1247
1248 @item grand segment
1249 A @dfn{grand number} is a bignum or a flonum, but not an integer.  If a
1250 number can't be written as a C @code{int} constant, it is a grand
1251 number.  @code{as} has to remember that a flonum or a bignum does not
1252 fit into 32 bits, and cannot be an argument (@pxref{Arguments}) in an
1253 expression: this is done by making a flonum or bignum be in segment
1254 grand.  This is purely for internal @code{as} convenience; grand
1255 segment behaves similarly to absolute segment.
1256
1257 @item pass1 segment
1258 The expression was impossible to evaluate in the first pass.  The
1259 assembler will attempt a second pass (second reading of the source) to
1260 evaluate the expression.  Your expression mentioned an undefined symbol
1261 in a way that defies the one-pass (segment + offset in segment) assembly
1262 process.  No compiler need emit such an expression.
1263
1264 @quotation
1265 @emph{Warning:} the second pass is currently not implemented.  @code{as}
1266 will abort with an error message if one is required.
1267 @end quotation
1268
1269 @item difference segment
1270 As an assist to the C compiler, expressions of the forms
1271 @display
1272    (@var{undefined symbol}) @minus{} (@var{expression})
1273    (@var{something}) @minus{} (@var{undefined symbol})
1274    (@var{undefined symbol}) @minus{} (@var{undefined symbol})
1275 @end display
1276 are permitted, and belong to the difference segment.  @code{as}
1277 re-evaluates such expressions after the source file has been read and
1278 the symbol table built.  If by that time there are no undefined symbols
1279 in the expression then the expression assumes a new segment.  The
1280 intention is to permit statements like
1281 @samp{.word label - base_of_table}
1282 to be assembled in one pass where both @code{label} and
1283 @code{base_of_table} are undefined.  This is useful for compiling C and
1284 Algol switch statements, Pascal case statements, FORTRAN computed goto
1285 statements and the like.
1286 @end table
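
A minimal sketch of such a table follows.  The symbol names are
hypothetical, and the @code{.word 0} statements merely stand in for
real code.  Both symbols in each difference are still undefined when
@code{as} first reads the expression:

@example
        .word   case_0 - base_of_table
        .word   case_1 - base_of_table
base_of_table:
case_0: .word   0
case_1: .word   0
@end example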
1287
1288 @node Sub-Segments, bss, as Segments, Segments
1289 @section Sub-Segments
1290 Assembled bytes fall into two segments: text and data.
1291 Because you may have groups of text or data that you want to end up near
1292 to each other in the object file, @code{as} allows you to use
1293 @dfn{subsegments}.  Within each segment, there can be numbered
1294 subsegments with values from 0 to 8192.  Objects assembled into the same
1295 subsegment will be grouped with other objects in the same subsegment
1296 when they are all put into the object file.  For example, a compiler
1297 might want to store constants in the text segment, but might not want to
1298 have them interspersed with the program being assembled.  In this case,
1299 the compiler could issue a @code{.text 0} before each section of code
1300 being output, and a @code{.text 1} before each group of constants being
1301 output.
1302
1303 Subsegments are optional.  If you don't use subsegments, everything
1304 will be stored in subsegment number zero.
1305
1306 @c @if not am29k
1307 @c Each subsegment is zero-padded up to a multiple of four bytes.
1308 @c (Subsegments may be padded a different amount on different flavors
1309 @c of @code{as}.)
1310 @c fi not am29k
1311 @c if am29k
1312 On the AMD 29K family, no particular padding is added to segment sizes;
1313 GNU as forces no alignment on this platform.
1314 @c fi am29k
1315 Subsegments appear in your object file in numeric order, lowest numbered
1316 to highest.  (All this to be compatible with other people's assemblers.)
1317 The object file contains no representation of subsegments; @code{ld} and
1318 other programs that manipulate object files will see no trace of them.
1319 They just see all your text subsegments as a text segment, and all your
1320 data subsegments as a data segment.
1321
1322 To specify which subsegment you want subsequent statements assembled
1323 into, use a @samp{.text @var{expression}} or a @samp{.data
1324 @var{expression}} statement.  @var{Expression} should be an absolute
1325 expression.  (@xref{Expressions}.)  If you just say @samp{.text}
1326 then @samp{.text 0} is assumed.  Likewise @samp{.data} means
1327 @samp{.data 0}.  Assembly begins in @code{text 0}.
1328 For instance:
1329 @example
1330 .text 0     # The default subsegment is text 0 anyway.
1331 .ascii "This lives in the first text subsegment. *"
1332 .text 1
1333 .ascii "But this lives in the second text subsegment."
1334 .data 0
1335 .ascii "This lives in the data segment,"
1336 .ascii "in the first data subsegment."
1337 .text 0
1338 .ascii "This lives in the first text subsegment,"
1339 .ascii "immediately following the asterisk (*)."
1340 @end example
1341
1342 Each segment has a @dfn{location counter} incremented by one for every
1343 byte assembled into that segment.  Because subsegments are merely a
1344 convenience restricted to @code{as} there is no concept of a subsegment
1345 location counter.  There is no way to directly manipulate a location
1346 counter---but the @code{.align} directive will change it, and any label
1347 definition will capture its current value.  The location counter of the
1348 segment that statements are being assembled into is said to be the
1349 @dfn{active} location counter.
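
For instance, assuming assembly starts at offset 0 of the text segment,
the labels in this sketch (whose names are arbitrary) capture the
following location counter values:

@example
        .text
start:  .byte 1, 2, 3, 4    /* start is @{text 0@} */
next:   .byte 5             /* next is @{text 4@} */
last:                       /* last is @{text 5@} */
@end example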
1350
1351 @node bss,  , Sub-Segments, Segments
1352 @section bss Segment
1353 The bss segment is used for local common variable storage.
1354 You may allocate address space in the bss segment, but you may
1355 not dictate data to load into it before your program executes.  When
1356 your program starts running, all the contents of the bss
1357 segment are zeroed bytes.
1358
1359 Addresses in the bss segment are allocated with special directives;
1360 you may not assemble anything directly into the bss segment.  Hence
1361 there are no bss subsegments.  @xref{Comm}, and @ref{Lcomm}.
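
A brief sketch, assuming the usual @samp{@var{symbol}, @var{length}}
operands of these directives (the names and sizes are invented):

@example
.lcomm scratch, 256     /* reserve 256 zeroed bytes in the bss segment */
.comm  shared, 4        /* declare a 4-byte common symbol */
@end example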
1362
1363 @node Symbols, Expressions, Segments, Top
1364 @chapter Symbols
1365 Symbols are a central concept: the programmer uses symbols to name
1366 things, the linker uses symbols to link, and the debugger uses symbols
1367 to debug.
1368
1369 @quotation
1370 @emph{Warning:} @code{as} does not place symbols in the object file in
1371 the same order they were declared.  This may break some debuggers.
1372 @end quotation
1373
1374 @menu
1375 * Labels::                      Labels
1376 * Setting Symbols::             Giving Symbols Other Values
1377 * Symbol Names::                Symbol Names
1378 * Dot::                         The Special Dot Symbol
1379 * Symbol Attributes::           Symbol Attributes
1380 @end menu
1381
1382 @node Labels, Setting Symbols, Symbols, Symbols
1383 @section Labels
1384 A @dfn{label} is written as a symbol immediately followed by a colon
1385 @samp{:}.  The symbol then represents the current value of the
1386 active location counter, and is, for example, a suitable instruction
1387 operand.  You are warned if you use the same symbol to represent two
1388 different locations: the first definition overrides any other
1389 definitions.
1390
1391 @node Setting Symbols, Symbol Names, Labels, Symbols
1392 @section Giving Symbols Other Values
1393 A symbol can be given an arbitrary value by writing a symbol, followed
1394 by an equals sign @samp{=}, followed by an expression
1395 (@pxref{Expressions}).  This is equivalent to using the @code{.set}
1396 directive.  @xref{Set}.
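
For example, the two statements in this sketch should be equivalent
ways to give the hypothetical symbol @code{table_size} the value 16:

@example
table_size = 16
.set table_size, 16
@end example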
1397
1398 @node Symbol Names, Dot, Setting Symbols, Symbols
1399 @section Symbol Names
1400 Symbol names begin with a letter or with one of @samp{$._}.  That
1401 character may be followed by any string of digits, letters,
1402 underscores and dollar signs.  Case of letters is significant:
1403 @code{foo} is a different symbol name than @code{Foo}.
1404
1405 @c if am29k
1406 For the AMD 29K family, @samp{?} is also allowed in the
1407 body of a symbol name, though not at its beginning.
1408 @c fi am29k
1409
1410 Each symbol has exactly one name. Each name in an assembly language
1411 program refers to exactly one symbol. You may use that symbol name any
1412 number of times in a program.
1413
1414 @menu
1415 * Local Symbols::               Local Symbol Names
1416 @end menu
1417
1418 @node Local Symbols,  , Symbol Names, Symbol Names
1419 @subsection Local Symbol Names
1420
1421 Local symbols help compilers and programmers use names temporarily.
1422 There are ten local symbol names, which are re-used throughout the
1423 program.  You may refer to them using the names @samp{0} @samp{1}
1424 @dots{} @samp{9}.  To define a local symbol, write a label of the form
1425 @samp{@b{N}:} (where @b{N} represents any digit).  To refer to the most
1426 recent previous definition of that symbol write @samp{@b{N}b}, using the
1427 same digit as when you defined the label.  To refer to the next
1428 definition of a local label, write @samp{@b{N}f}---where @b{N} gives you
1429 a choice of 10 forward references.  The @samp{b} stands for
1430 ``backwards'' and the @samp{f} stands for ``forwards''.
1431
1432 Local symbols are not emitted by the current GNU C compiler.
1433
1434 There is no restriction on how you can use these labels, but
1435 remember that at any point in the assembly you can refer to at most
1436 10 prior local labels and to at most 10 forward local labels.
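
A short sketch of local labels in use, with @code{.long} standing in
for whatever statements would really reference them:

@example
1:      .long   2f      /* forward to the "2:" on the next line */
2:      .long   1b      /* backward to the "1:" on the previous line */
1:      .long   2b      /* backward to the "2:" on the previous line */
        .long   1b      /* backward to the second "1:", not the first */
@end example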
1437
1438 Local symbol names are only a notational device.  They are immediately
1439 transformed into more conventional symbol names before the assembler
1440 uses them.  The symbol names stored in the symbol table, appearing in
1441 error messages and optionally emitted to the object file have these
1442 parts:
1443
1444 @table @code
1445 @item L
1446 All local labels begin with @samp{L}. Normally both @code{as} and
1447 @code{ld} forget symbols that start with @samp{L}. These labels are
1448 used for symbols you are never intended to see.  If you give the
1449 @samp{-L} option then @code{as} will retain these symbols in the
1450 object file. If you also instruct @code{ld} to retain these symbols,
1451 you may use them in debugging.
1452
1453 @item @var{digit}
1454 If the label is written @samp{0:} then the digit is @samp{0}.
1455 If the label is written @samp{1:} then the digit is @samp{1}.
1456 And so on up through @samp{9:}.
1457
1458 @item @ctrl{A}
1459 This unusual character is included so you don't accidentally invent
1460 a symbol of the same name.  The character has ASCII value
1461 @samp{\001}.
1462
1463 @item @emph{ordinal number}
This is a serial number to keep the labels distinct.  The first
@samp{0:} gets the number @samp{1}; the 15th @samp{0:} gets the
number @samp{15}; @emph{etc.}  Likewise for the other labels @samp{1:}
through @samp{9:}.
1468 @end table
1469
1470 For instance, the first @code{1:} is named @code{L1@ctrl{A}1}, the 44th
1471 @code{3:} is named @code{L3@ctrl{A}44}.
1472
1473 @node Dot, Symbol Attributes, Symbol Names, Symbols
1474 @section The Special Dot Symbol
1475
1476 The special symbol @samp{.} refers to the current address that
1477 @code{as} is assembling into.  Thus, the expression @samp{melvin:
1478 .long .} will cause @code{melvin} to contain its own address.
1479 Assigning a value to @code{.} is treated the same as a @code{.org}
1480 directive.  Thus, the expression @samp{.=.+4} is the same as saying
1481 @c if not am29k
1482 @c @samp{.space 4}.
1483 @c fi not am29k
1484 @c if am29k
1485 @samp{.block 4}.
1486 @c fi am29k
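
A short sketch of both uses (the label name is arbitrary):

@example
melvin: .long .         /* melvin contains its own address */
        .=.+4           /* advance the location counter by 4 */
@end example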
1487
1488 @node Symbol Attributes,  , Dot, Symbols
1489 @section Symbol Attributes
1490 Every symbol has these attributes: Value, Type, Descriptor, and ``Other''.
1491 @c if internals
1492 @c The detailed definitions are in <a.out.h>.
1493 @c fi internals
1494
1495 If you use a symbol without defining it, @code{as} assumes zero for
1496 all these attributes, and probably won't warn you.  This makes the
1497 symbol an externally defined symbol, which is generally what you
1498 would want.
1499
1500 @menu
1501 * Symbol Value::                Value
1502 * Symbol Type::                 Type
1503 * Symbol Desc::                 Descriptor
1504 * Symbol Other::                Other
1505 @end menu
1506
1507 @node Symbol Value, Symbol Type, Symbol Attributes, Symbol Attributes
1508 @subsection Value
1509 The value of a symbol is (usually) 32 bits, the size of one GNU C
1510 @code{int}.  For a symbol which labels a location in the
1511 text, data, bss or absolute segments the
1512 value is the number of addresses from the start of that segment to
1513 the label.  Naturally for text, data and bss
1514 segments the value of a symbol changes as @code{ld} changes segment
base addresses during linking.  Absolute symbols' values do
not change during linking: that is why they are called absolute.
1517
1518 The value of an undefined symbol is treated in a special way.  If it is
1519 0 then the symbol is not defined in this assembler source program, and
1520 @code{ld} will try to determine its value from other programs it is
1521 linked with.  You make this kind of symbol simply by mentioning a symbol
1522 name without defining it.  A non-zero value represents a @code{.comm}
1523 common declaration.  The value is how much common storage to reserve, in
1524 bytes (addresses).  The symbol refers to the first address of the
1525 allocated storage.
1526
1527 @node Symbol Type, Symbol Desc, Symbol Value, Symbol Attributes
1528 @subsection Type
1529 The type attribute of a symbol is 8 bits encoded in a devious way.
1530 We kept this coding standard for compatibility with older operating
1531 systems.
1532
1533 @ifinfo
1534 @example
1535
1536         7     6     5     4     3     2     1     0     bit numbers
1537      +-----+-----+-----+-----+-----+-----+-----+-----+
1538      |                 |                       |     |
1539      |   N_STAB bits   |      N_TYPE bits      |N_EXT|
1540      |                 |                       | bit |
1541      +-----+-----+-----+-----+-----+-----+-----+-----+
1542
1543                      Type byte
1544 @end example
1545 @end ifinfo
1546 @tex
1547 \vskip 1pc
1548 \halign{#\quad&#\cr
1549 \ibox{3cm}{7}\ibox{4cm}{4}\ibox{1.1cm}{0}&bit numbers\cr
1550 \boxit{3cm}{{\tt N\_STAB} bits}\boxit{4cm}{{\tt N\_TYPE}
1551 bits}\boxit{1.1cm}{\tt N\_EXT}\cr
1552 \hfill {\bf Type} byte\hfill\cr
1553 }
1554 @end tex
1555
1556 @subsubsection @code{N_EXT} bit
1557 This bit is set if @code{ld} might need to use the symbol's type bits
1558 and value.  If this bit is off, then @code{ld} can ignore the
1559 symbol while linking.  It is set in two cases.  If the symbol is
1560 undefined, then @code{ld} is expected to find the symbol's value
1561 elsewhere in another program module.  Otherwise the symbol has the
1562 value given, but this symbol name and value are revealed to any other
1563 programs linked in the same executable program.  This second use of
1564 the @code{N_EXT} bit is most often made by a @code{.globl} statement.
1565
1566 @subsubsection @code{N_TYPE} bits
1567 These establish the symbol's ``type'', which is mainly a relocation
1568 concept.  Common values are detailed in the manual describing the
1569 executable file format.
1570
1571 @subsubsection @code{N_STAB} bits
1572 Common values for these bits are described in the manual on the
1573 executable file format.
1574
1575 @node Symbol Desc, Symbol Other, Symbol Type, Symbol Attributes
1576 @subsection Descriptor
1577 This is an arbitrary 16-bit value.  You may establish a symbol's
1578 descriptor value by using a @code{.desc} statement (@pxref{Desc}).
1579 A descriptor value means nothing to @code{as}.
1580
1581 @node Symbol Other,  , Symbol Desc, Symbol Attributes
1582 @subsection Other
1583 This is an arbitrary 8-bit value.  It means nothing to @code{as}.
1584
1585 @node Expressions, Pseudo Ops, Symbols, Top
1586 @chapter Expressions
1587 An @dfn{expression} specifies an address or numeric value.
1588 Whitespace may precede and/or follow an expression.
1589
1590 @menu
1591 * Empty Exprs::                 Empty Expressions
1592 * Integer Exprs::               Integer Expressions
1593 @end menu
1594
1595 @node Empty Exprs, Integer Exprs, Expressions, Expressions
1596 @section Empty Expressions
1597 An empty expression has no value: it is just whitespace or null.
1598 Wherever an absolute expression is required, you may omit the
1599 expression and @code{as} will assume a value of (absolute) 0.  This
1600 is compatible with other assemblers.
1601
1602 @node Integer Exprs,  , Empty Exprs, Expressions
1603 @section Integer Expressions
1604 An @dfn{integer expression} is one or more @emph{arguments} delimited
1605 by @emph{operators}.
1606
1607 @menu
1608 * Arguments::                   Arguments
1609 * Operators::                   Operators
1610 * Prefix Ops::                  Prefix Operators
1611 * Infix Ops::                   Infix Operators
1612 @end menu
1613
1614 @node Arguments, Operators, Integer Exprs, Integer Exprs
1615 @subsection Arguments
1616
1617 @dfn{Arguments} are symbols, numbers or subexpressions.  In other
1618 contexts arguments are sometimes called ``arithmetic operands''.  In
1619 this manual, to avoid confusing them with the ``instruction operands'' of
1620 the machine language, we use the term ``argument'' to refer to parts of
1621 expressions only, reserving the word ``operand'' to refer only to machine
1622 instruction operands.
1623
1624 Symbols are evaluated to yield @{@var{segment} @var{NNN}@} where
1625 @var{segment} is one of text, data, bss, absolute,
1626 or @code{undefined}.  @var{NNN} is a signed, 2's complement 32 bit
1627 integer.
1628
1629 Numbers are usually integers.
1630
1631 A number can be a flonum or bignum.  In this case, you are warned
1632 that only the low order 32 bits are used, and @code{as} pretends
1633 these 32 bits are an integer.  You may write integer-manipulating
1634 instructions that act on exotic constants, compatible with other
1635 assemblers.
1636
1637 Subexpressions are a left parenthesis @samp{(} followed by an integer
1638 expression, followed by a right parenthesis @samp{)}; or a prefix
1639 operator followed by an argument.
1640
1641 @node Operators, Prefix Ops, Arguments, Integer Exprs
1642 @subsection Operators
1643 @dfn{Operators} are arithmetic functions, like @code{+} or @code{%}.  Prefix
1644 operators are followed by an argument.  Infix operators appear
1645 between their arguments.  Operators may be preceded and/or followed by
1646 whitespace.
1647
1648 @node Prefix Ops, Infix Ops, Operators, Integer Exprs
1649 @subsection Prefix Operators
1650 @code{as} has the following @dfn{prefix operators}.  They each take
1651 one argument, which must be absolute.
1652 @table @code
1653 @item -
1654 @dfn{Negation}.  Two's complement negation.
1655 @item ~
1656 @dfn{Complementation}.  Bitwise not.
1657 @end table
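
For example, assuming that @code{.long} emits 32 bits on the target,
both of the following statements assemble the same all-ones value:

@example
        .long -1        /* negation of 1 */
        .long ~0        /* complement of 0 */
@end example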
1658
1659 @node Infix Ops,  , Prefix Ops, Integer Exprs
1660 @subsection Infix Operators
1661
1662 @dfn{Infix operators} take two arguments, one on either side.  Operators
1663 have precedence, but operations with equal precedence are performed left
1664 to right.  Apart from @code{+} or @code{-}, both arguments must be
1665 absolute, and the result is absolute.
1666
1667 @enumerate
1668
1669 @item
1670 Highest Precedence
1671 @table @code
1672 @item *
1673 @dfn{Multiplication}.
1674 @item /
@dfn{Division}.  Truncation is the same as the C operator @samp{/}.
@item %
@dfn{Remainder}.
@item <
@itemx <<
@dfn{Shift Left}.  Same as the C operator @samp{<<}.
@item >
@itemx >>
@dfn{Shift Right}.  Same as the C operator @samp{>>}.
1684 @end table
1685
1686 @item
1687 Intermediate precedence
1688 @table @code
1689 @item |
1690 @dfn{Bitwise Inclusive Or}.
1691 @item &
1692 @dfn{Bitwise And}.
1693 @item ^
1694 @dfn{Bitwise Exclusive Or}.
1695 @item !
1696 @dfn{Bitwise Or Not}.
1697 @end table
1698
1699 @item
1700 Lowest Precedence
1701 @table @code
1702 @item +
1703 @dfn{Addition}.  If either argument is absolute, the result
1704 has the segment of the other argument.
1705 If either argument is pass1 or undefined, the result is pass1.
1706 Otherwise @code{+} is illegal.
1707 @item -
1708 @dfn{Subtraction}.  If the right argument is absolute, the
1709 result has the segment of the left argument.
1710 If either argument is pass1 the result is pass1.
1711 If either argument is undefined the result is difference segment.
1712 If both arguments are in the same segment, the result is absolute---provided
1713 that segment is one of text, data or bss.
1714 Otherwise subtraction is illegal.
1715 @end table
1716 @end enumerate
1717
1718 The sense of the rule for addition is that it's only meaningful to add
1719 the @emph{offsets} in an address; you can only have a defined segment in
1720 one of the two arguments.
1721
1722 Similarly, you can't subtract quantities from two different segments.
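
The following sketch illustrates these rules.  The label names are
arbitrary, and the comments describe what the rules above imply rather
than the output of any particular assembler version:

@example
        .text
first:  .long 2 + 3 * 4         /* 14: `*' binds tighter than `+' */
        .long 1 << 4 | 3        /* 19: `<<' binds tighter than `|' */
second: .long second - first    /* absolute: both are in text */
        .long first + 8         /* text: absolute added to text */
@end example

By the same rules, an expression such as @samp{first + second} would be
rejected, since both arguments have a defined segment.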
1723
1724 @node Pseudo Ops, Machine Dependent, Expressions, Top
1725 @chapter Assembler Directives
1726 @menu
1727 * Abort::                       The Abort directive causes as to abort
1728 * Align::                       Pad the location counter to a power of 2
1729 * App-File::                    Set the logical file name
1730 * Ascii::                       Fill memory with bytes of ASCII characters
1731 * Asciz::                       Fill memory with bytes of ASCII characters followed
1732                 by a null.
1733 * Byte::                        Fill memory with 8-bit integers
1734 * Comm::                        Reserve public space in the BSS segment
1735 * Data::                        Change to the data segment
1736 * Desc::                        Set the n_desc of a symbol
1737 * Double::                      Fill memory with double-precision floating-point numbers
1738 * Else::                        @code{.else}
1739 * End::                         @code{.end}
1740 * Endif::                       @code{.endif}
1741 * Equ::                         @code{.equ @var{symbol}, @var{expression}}
1742 * Extern::                      @code{.extern}
1743 * Fill::                        Fill memory with repeated values
1744 * Float::                       Fill memory with single-precision floating-point numbers
1745 * Global::                      Make a symbol visible to the linker
1746 * Ident::                       @code{.ident}
1747 * If::                          @code{.if @var{absolute expression}}
1748 * Include::                     @code{.include "@var{file}"}
1749 * Int::                         Fill memory with 32-bit integers
1750 * Lcomm::                       Reserve private space in the BSS segment
1751 * Line::                        Set the logical line number
1752 * Ln::                          @code{.ln @var{line-number}}
1753 * List::                        @code{.list}, @code{.nolist}, @code{.eject}, @code{.lflags}, @code{.title}, @code{.sbttl}
1754 * Long::                        Fill memory with 32-bit integers
1755 * Lsym::                        Create a local symbol
1756 * Octa::                        Fill memory with 128-bit integers
1757 * Org::                         Change the location counter
1758 * Quad::                        Fill memory with 64-bit integers
1759 * Set::                         Set the value of a symbol
1760 * Short::                       Fill memory with 16-bit integers
1761 * Single::                      @code{.single @var{flonums}}
1762 * Stab::                        Store debugging information
1763 * Text::                        Change to the text segment
1764 @c if am29k or sparc
1765 * Word::                        Fill memory with 32-bit integers
1766 @c else (not am29k or sparc)
1767 * Deprecated::                  Deprecated Directives
1768 * Machine Options::             Options
1769 * Machine Syntax::              Syntax
1770 * Floating Point::              Floating Point
1771 * Machine Directives::          Machine Directives
1772 * Opcodes::                     Opcodes
1773 @end menu
1774
1775 All assembler directives have names that begin with a period (@samp{.}).
1776 The rest of the name is letters: their case does not matter.
1777
1778 This chapter discusses directives present in all versions of GNU
1779 @code{as}; @pxref{Machine Dependent} for additional directives.
1780
1781 @node Abort, Align, Pseudo Ops, Pseudo Ops
1782 @section @code{.abort}
1783 This directive stops the assembly immediately.  It is for
1784 compatibility with other assemblers.  The original idea was that the
assembler program would be piped into the assembler.  If the sender
of a program quit, it could use this directive to tell @code{as} to
quit also.  One day @code{.abort} will not be supported.
1788
1789 @node Align, App-File, Abort, Pseudo Ops
1790 @section @code{.align @var{absolute-expression} , @var{absolute-expression}}
1791 Pad the location counter (in the current subsegment) to a particular
1792 storage boundary.  The first expression is the number of low-order zero
bits the location counter will have after advancement.  For example,
@samp{.align 3} will advance the location counter until it is a multiple
of 8.  If the location counter is already a multiple of 8, no change is
needed.
1797
1798 The second expression gives the value to be stored in the padding
1799 bytes.  It (and the comma) may be omitted.  If it is omitted, the
1800 padding bytes are zero.
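
For example (a sketch; the byte values are arbitrary):

@example
        .byte 1
        .align 2        /* pad with zero bytes to a multiple of 4 */
        .byte 2
        .align 3, 0xff  /* pad with 0xff bytes to a multiple of 8 */
@end example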
1801
1802 @node App-File, Ascii, Align, Pseudo Ops
1803 @section @code{.app-file @var{string}}
@code{.app-file} tells @code{as} that we are about to start a new
logical file.  @var{string} is the new file name.  In general, the
file name is recognized whether or not it is surrounded by quotes @samp{"};
but if you wish to specify an empty file name, you must give the
quotes: @code{""}.  This statement may go away in the future: it is
only recognized to be compatible with old @code{as} programs.
1811
1812 @node Ascii, Asciz, App-File, Pseudo Ops
1813 @section @code{.ascii "@var{string}"}@dots{}
1814 @code{.ascii} expects zero or more string literals (@pxref{Strings})
1815 separated by commas.  It assembles each string (with no automatic
1816 trailing zero byte) into consecutive addresses.
1817
1818 @node Asciz, Byte, Ascii, Pseudo Ops
1819 @section @code{.asciz "@var{string}"}@dots{}
1820 @code{.asciz} is just like @code{.ascii}, but each string is followed by
1821 a zero byte.  The ``z'' in @samp{.asciz} stands for ``zero''.
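
The difference between the two directives can be sketched as follows
(the strings are arbitrary):

@example
        .ascii "abc"            /* 3 bytes, no trailing zero */
        .asciz "abc"            /* 4 bytes: the string and a zero byte */
        .asciz "ab", "cd"       /* 6 bytes: each string gets a zero byte */
@end example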
1822
1823 @node Byte, Comm, Asciz, Pseudo Ops
1824 @section @code{.byte @var{expressions}}
1825
1826 @code{.byte} expects zero or more expressions, separated by commas.
1827 Each expression is assembled into the next byte.
1828
1829 @node Comm, Data, Byte, Pseudo Ops
1830 @section @code{.comm @var{symbol} , @var{length} }
1831 @code{.comm} declares a named common area in the bss segment.  Normally
1832 @code{ld} reserves memory addresses for it during linking, so no partial
1833 program defines the location of the symbol.  Use @code{.comm} to tell
1834 @code{ld} that it must be at least @var{length} bytes long.  @code{ld}
1835 will allocate space for each @code{.comm} symbol that is at least as
1836 long as the longest @code{.comm} request in any of the partial programs
1837 linked.  @var{length} is an absolute expression.
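
For example, with a hypothetical symbol named @code{buffer}:

@example
        .comm buffer, 512       /* request at least 512 bytes */
@end example

If another partial program requested 1024 bytes for @code{buffer},
@code{ld} would allocate at least 1024 bytes for it.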
1838
1839 @node Data, Desc, Comm, Pseudo Ops
1840 @section @code{.data @var{subsegment}}
1841 @code{.data} tells @code{as} to assemble the following statements onto the
1842 end of the data subsegment numbered @var{subsegment} (which is an
1843 absolute expression).  If @var{subsegment} is omitted, it defaults
1844 to zero.
1845
1846 @node Desc, Double, Data, Pseudo Ops
1847 @section @code{.desc @var{symbol}, @var{absolute-expression}}
1848 This directive sets the descriptor of the symbol (@pxref{Symbol Attributes})
1849 to the low 16 bits of @var{absolute-expression}.
1850
1851 @node Double, Else, Desc, Pseudo Ops
1852 @section @code{.double @var{flonums}}
1853 @code{.double} expects zero or more flonums, separated by commas.  It assembles
1854 floating point numbers.
1855 @c if all-arch
1856 @c The exact kind of floating point numbers
1857 @c emitted depends on how @code{as} is configured.  @xref{Machine
1858 @c Dependent}.
1859 @c fi all-arch
1860 @c if am29k
1861 On the AMD 29K family the floating point format used is IEEE.
1862 @c fi am29k
1863
1864 @node Else, End, Double, Pseudo Ops
1865 @section @code{.else}
1866 @code{.else} is part of the @code{as} support for conditional assembly;
1867 @pxref{If}.  It marks the beginning of a section of code to be assembled
1868 if the condition for the preceding @code{.if} was false.
1869
1870 @ignore
1871 @node End, Endif, Else, Pseudo Ops
1872 @section @code{.end}
1873 This doesn't do anything---but isn't an s_ignore, so I suspect it's
1874 meant to do something eventually (which is why it isn't documented here
1875 as "for compatibility with blah").
1876 @end ignore
1877
1878 @node Endif, Equ, End, Pseudo Ops
1879 @section @code{.endif}
1880 @code{.endif} is part of the @code{as} support for conditional assembly;
1881 it marks the end of a block of code that is only assembled
1882 conditionally.  @xref{If}.
1883
1884 @node Equ, Extern, Endif, Pseudo Ops
1885 @section @code{.equ @var{symbol}, @var{expression}}
1886
1887 This directive sets the value of @var{symbol} to @var{expression}.
1888 It is synonymous with @samp{.set}; @pxref{Set}.
1889
1890 @node Extern, Fill, Equ, Pseudo Ops
1891 @section @code{.extern}
1892 @code{.extern} is accepted in the source program---for compatibility
1893 with other assemblers---but it is ignored.  GNU @code{as} treats
1894 all undefined symbols as external.
1895
1896 @node Fill, Float, Extern, Pseudo Ops
1897 @section @code{.fill @var{repeat} , @var{size} , @var{value}}
@var{repeat}, @var{size} and @var{value} are absolute expressions.
This emits @var{repeat} copies of @var{size} bytes.  @var{Repeat}
may be zero or more.  @var{Size} may be zero or more, but if it is
more than 8, then it is deemed to have the value 8, compatible with
other people's assemblers.  The contents of each repetition are
taken from an 8-byte number.  The highest order 4 bytes are
zero.  The lowest order 4 bytes are @var{value} rendered in the
byte-order of an integer on the computer @code{as} is assembling for.
Each repetition consists of the lowest order @var{size} bytes of
this number.  Again, this bizarre behavior is
compatible with other people's assemblers.
1908 compatible with other people's assemblers.
1909
1910 @var{Size} and @var{value} are optional.
1911 If the second comma and @var{value} are absent, @var{value} is
1912 assumed zero.  If the first comma and following tokens are absent,
1913 @var{size} is assumed to be 1.
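
A few illustrative uses, following the defaults just described:

@example
        .fill 16                /* 16 zero bytes: size 1, value 0 */
        .fill 4, 2              /* 4 two-byte zeros, 8 bytes in all */
        .fill 3, 1, 0x2a        /* 3 bytes, each with the value 0x2a */
@end example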
1914
1915 @node Float, Global, Fill, Pseudo Ops
1916 @section @code{.float @var{flonums}}
1917 This directive assembles zero or more flonums, separated by commas.  It
1918 has the same effect as @code{.single}.
1919 @c if all-arch
1920 @c The exact kind of floating point numbers emitted depends on how
1921 @c @code{as} is configured.
1922 @c @xref{Machine Dependent}.
1923 @c fi all-arch
1924 @c if am29k
1925 The floating point format used for the AMD 29K family is IEEE.
1926 @c fi am29k
1927
1928 @node Global, Ident, Float, Pseudo Ops
1929 @section @code{.global @var{symbol}}, @code{.globl @var{symbol}}
1930 @code{.global} makes the symbol visible to @code{ld}.  If you define
1931 @var{symbol} in your partial program, its value is made available to
1932 other partial programs that are linked with it.  Otherwise,
1933 @var{symbol} will take its attributes from a symbol of the same name
1934 from another partial program it is linked with.
1935
1936 This is done by setting the @code{N_EXT} bit of that symbol's type byte
1937 to 1. @xref{Symbol Attributes}.
1938
1939 Both spellings (@samp{.globl} and @samp{.global}) are accepted, for
1940 compatibility with other assemblers.
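
For example, the following sketch defines a symbol (here given the
arbitrary name @code{counter}) and makes it visible to other partial
programs:

@example
        .globl counter
        .data
counter: .long 0
@end example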
1941
1942 @node Ident, If, Global, Pseudo Ops
1943 @section @code{.ident}
1944 This directive is used by some assemblers to place tags in object files.
1945 GNU @code{as} simply accepts the directive for source-file
1946 compatibility with such assemblers, but does not actually emit anything
1947 for it.
1948
1949 @node If, Include, Ident, Pseudo Ops
1950 @section @code{.if @var{absolute expression}}
1951 @code{.if} marks the beginning of a section of code which is only
1952 considered part of the source program being assembled if the argument
1953 (which must be an @var{absolute expression}) is non-zero.  The end of
1954 the conditional section of code must be marked by @code{.endif}
1955 (@pxref{Endif}); optionally, you may include code for the
alternative condition, flagged by @code{.else} (@pxref{Else}).
1957
1958 The following variants of @code{.if} are also supported:
1959 @table @code
@item .ifdef @var{symbol}
1961 Assembles the following section of code if the specified @var{symbol}
1962 has been defined.
1963
1964 @ignore
1965 @item ifeqs
1966 BOGONS??
1967 @end ignore
1968
@item .ifndef @var{symbol}
@itemx .ifnotdef @var{symbol}
1971 Assembles the following section of code if the specified @var{symbol}
1972 has not been defined.  Both spelling variants are equivalent.
1973
1974 @ignore
1975 @item ifnes
1976 NO bogons, I presume?
1977 @end ignore
1978 @end table
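
A minimal sketch of conditional assembly follows.  The symbol name
@code{DEBUG} is only an example; it is given an absolute value with
@code{.set} before it is tested:

@example
        .set DEBUG, 1
        .if DEBUG
        .long 1                 /* assembled, since DEBUG is non-zero */
        .else
        .long 0                 /* skipped */
        .endif
@end example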
1979
1980 @node Include, Int, If, Pseudo Ops
1981 @section @code{.include "@var{file}"}
1982 This directive provides a way to include supporting files at specified
1983 points in your source program.  The code from @var{file} is assembled as
1984 if it followed the point of the @code{.include}; when the end of the
1985 included file is reached, assembly of the original file continues.  You
1986 can control the search paths used with the @samp{-I} command-line option
1987 (@pxref{Options}).  Quotation marks are required around @var{file}.
1988
1989 @node Int, Lcomm, Include, Pseudo Ops
1990 @section @code{.int @var{expressions}}
1991 Expect zero or more @var{expressions}, of any segment, separated by
1992 commas.  For each expression, emit a 32-bit number that will, at run
1993 time, be the value of that expression.  The byte order of the
1994 expression depends on what kind of computer will run the program.
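
For example (the label name is arbitrary):

@example
table:  .int 1, 2, 3            /* three 32-bit integers */
        .int table              /* the run-time address of table */
@end example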
1995
1996 @node Lcomm, Line, Int, Pseudo Ops
1997 @section @code{.lcomm @var{symbol} , @var{length}}
1998 Reserve @var{length} (an absolute expression) bytes for a local
1999 common denoted by @var{symbol}.  The segment and value of @var{symbol} are
2000 those of the new local common.  The addresses are allocated in the
2001 bss segment, so at run-time the bytes will start off zeroed.
2002 @var{Symbol} is not declared global (@pxref{Global}), so is normally
2003 not visible to @code{ld}.
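
For example, with a hypothetical symbol named @code{scratch}:

@example
        .lcomm scratch, 64      /* 64 bytes in bss, zeroed at run time */
@end example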
2004
2005 @c if not am29k
2006 @ignore
2007 @node Line, Ln, Lcomm, Pseudo Ops
2008 @section @code{.line @var{line-number}}, @code{.ln @var{line-number}}
2009 @code{.line}, and its alternate spelling @code{.ln}, tell
2010 @end ignore
2011 @c fi not am29k
2012 @c if am29k
2013 @node Ln, List, Line, Pseudo Ops
2014 @section @code{.ln @var{line-number}}
2015 Tell
2016 @c fi am29k
2017 @code{as} to change the logical line number.  @var{line-number} must be
2018 an absolute expression.  The next line will have that logical line
2019 number.  So any other statements on the current line (after a statement
2020 separator character
2021 @c if am29k
2022 @samp{@@})
2023 @c fi am29k
2024 @c if not am29k
2025 @c @code{;})
2026 @c fi not am29k
will be reported as on logical line number
@var{line-number} @minus{} 1.
2029 One day this directive will be unsupported: it is used only
2030 for compatibility with existing assembler programs. @refill
2031
2032 @node List, Long, Ln, Pseudo Ops
2033 @section @code{.list}, @code{.nolist}, @code{.eject}, @code{.lflags}, @code{.title}, @code{.sbttl}
2034 GNU @code{as} ignores these directives; however, they're
2035 accepted for compatibility with assemblers that use them.
2036
2037 @node Long, Lsym, List, Pseudo Ops
2038 @section @code{.long @var{expressions}}
2039 @code{.long} is the same as @samp{.int}, @pxref{Int}.
2040
2041 @node Lsym, Octa, Long, Pseudo Ops
2042 @section @code{.lsym @var{symbol}, @var{expression}}
2043 @code{.lsym} creates a new symbol named @var{symbol}, but does not put it in
2044 the hash table, ensuring it cannot be referenced by name during the
2045 rest of the assembly.  This sets the attributes of the symbol to be
2046 the same as the expression value:
2047 @example
2048 @var{other} = @var{descriptor} = 0
2049 @var{type} = @r{(segment of @var{expression})}
2050 N_EXT = 0
2051 @var{value} = @var{expression}
2052 @end example
2053
2054 @node Octa, Org, Lsym, Pseudo Ops
2055 @section @code{.octa @var{bignums}}
2056 This directive expects zero or more bignums, separated by commas.  For each
2057 bignum, it emits a 16-byte integer.
2058
The term ``octa'' comes from contexts in which a ``word'' was two bytes;
hence @emph{octa}-word for 16 bytes.
2061
2062 @node Org, Quad, Octa, Pseudo Ops
2063 @section @code{.org @var{new-lc} , @var{fill}}
2064
2065 @code{.org} will advance the location counter of the current segment to
2066 @var{new-lc}.  @var{new-lc} is either an absolute expression or an
2067 expression with the same segment as the current subsegment.  That is,
2068 you can't use @code{.org} to cross segments: if @var{new-lc} has the
2069 wrong segment, the @code{.org} directive is ignored.  To be compatible
2070 with former assemblers, if the segment of @var{new-lc} is absolute,
2071 @code{as} will issue a warning, then pretend the segment of @var{new-lc}
2072 is the same as the current subsegment.
2073
2074 @code{.org} may only increase the location counter, or leave it
2075 unchanged; you cannot use @code{.org} to move the location counter
2076 backwards.
2077
2078 @c double negative used below "not undefined" because this is a specific
2079 @c reference to "undefined" (as SEG_UNKNOWN is called in this manual)
2080 @c segment. pesch@cygnus.com 18feb91
Because @code{as} tries to assemble programs in one pass, @var{new-lc}
may not be undefined.  If you really detest this restriction, we eagerly
await a chance to share your improved assembler.
2084
2085 Beware that the origin is relative to the start of the segment, not
2086 to the start of the subsegment.  This is compatible with other
2087 people's assemblers.
2088
2089 When the location counter (of the current subsegment) is advanced, the
2090 intervening bytes are filled with @var{fill} which should be an
2091 absolute expression.  If the comma and @var{fill} are omitted,
2092 @var{fill} defaults to zero.
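
In the following sketch the label @code{base} is arbitrary.  The new
location is written relative to a label in the current segment, since an
absolute @var{new-lc} draws the warning described above:

@example
        .text
base:   .long 0
        .org base + 0x100       /* advance to 0x100 bytes past base; pad with zeros */
        .org base + 0x200, 0xff /* advance again; pad with 0xff bytes */
@end example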
2093
2094 @node Quad, Set, Org, Pseudo Ops
2095 @section @code{.quad @var{bignums}}
2096 @code{.quad} expects zero or more bignums, separated by commas.  For
each bignum, it emits an 8-byte integer.  If the bignum won't fit in 8
bytes, it prints a warning message and just takes the lowest order 8
bytes of the bignum.
2100
2101 The term ``quad'' comes from contexts in which a ``word'' was two bytes;
2102 hence @emph{quad}-word for 8 bytes.
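
For example (the constants are arbitrary):

@example
        .quad 0x123456789abcdef0        /* one 8-byte integer */
        .quad 1, 2                      /* two 8-byte integers */
@end example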
2103
2104 @node Set, Short, Quad, Pseudo Ops
2105 @section @code{.set @var{symbol}, @var{expression}}
2106
2107 This directive sets the value of @var{symbol} to @var{expression}.  This
2108 will change @var{symbol}'s value and type to conform to
2109 @var{expression}.  If @code{N_EXT} is set, it remains set.
2110 (@xref{Symbol Attributes}.)
2111
2112 You may @code{.set} a symbol many times in the same assembly.
2113 If the expression's segment is unknowable during pass 1, a second
2114 pass over the source program will be forced.  The second pass is
2115 currently not implemented.  @code{as} will abort with an error
2116 message if one is required.
2117
2118 If you @code{.set} a global symbol, the value stored in the object
2119 file is the last value stored into it.
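
As a sketch of setting a symbol more than once (the symbol name
@code{count} is arbitrary, and each expression here is absolute, so no
second pass is needed):

@example
        .set count, 10
        .long count             /* uses the value 10 */
        .set count, count + 1
        .long count             /* uses the value 11 */
@end example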
2120
2121 @node Short, Single, Set, Pseudo Ops
2122 @section @code{.short @var{expressions}}
2123 @c if not (sparc or amd29k)
2124 @c @code{.short} is the same as @samp{.word}.  @xref{Word}.
2125 @c fi not (sparc or amd29k)
2126 @c if (sparc or amd29k)
2127 This expects zero or more @var{expressions}, and emits
2128 a 16 bit number for each.
2129 @c fi (sparc or amd29k)
2130
2131 @node Single, Space, Short, Pseudo Ops
2132 @section @code{.single @var{flonums}}
2133 This directive assembles zero or more flonums, separated by commas.  It
2134 has the same effect as @code{.float}.
2135 @c if all-arch
2136 @c The exact kind of floating point numbers emitted depends on how
2137 @c @code{as} is configured.  @xref{Machine Dependent}.
2138 @c fi all-arch
2139 @c if am29k
2140 The floating point format used for the AMD 29K family is IEEE.
2141 @c fi am29k
2142
2143
@node Space, Stab, Single, Pseudo Ops
2145 @c if not am29k
2146 @ignore
2147 @section @code{.space @var{size} , @var{fill}}
2148 This directive emits @var{size} bytes, each of value @var{fill}.  Both
2149 @var{size} and @var{fill} are absolute expressions.  If the comma
2150 and @var{fill} are omitted, @var{fill} is assumed to be zero.
2151 @end ignore
2152 @c fi not am29k
2153
2154 @c if am29k
2155 @section @code{.space}
2156 This directive is ignored; it is accepted for compatibility with other
2157 AMD 29K assemblers.
2158
2159 @quotation
@emph{Warning:} In other versions of GNU @code{as}, the directive
@code{.space} has the effect of @code{.block}.  @xref{Machine Directives}.
2162 @end quotation
2163 @c fi am29k
2164
2165 @node Stab, Text, Space, Pseudo Ops
2166 @section @code{.stabd, .stabn, .stabs}
2167 There are three directives that begin @samp{.stab}.
2168 All emit symbols (@pxref{Symbols}), for use by symbolic debuggers.
2169 The symbols are not entered in @code{as}' hash table: they
2170 cannot be referenced elsewhere in the source file.
2171 Up to five fields are required:
2172 @table @var
2173 @item string
2174 This is the symbol's name.  It may contain any character except @samp{\000},
2175 so is more general than ordinary symbol names.  Some debuggers used to
2176 code arbitrarily complex structures into symbol names using this field.
2177 @item type
2178 An absolute expression.  The symbol's type is set to the low 8
2179 bits of this expression.
2180 Any bit pattern is permitted, but @code{ld} and debuggers will choke on
2181 silly bit patterns.
2182 @item other
2183 An absolute expression.
2184 The symbol's ``other'' attribute is set to the low 8 bits of this expression.
2185 @item desc
2186 An absolute expression.
2187 The symbol's descriptor is set to the low 16 bits of this expression.
2188 @item value
2189 An absolute expression which becomes the symbol's value.
2190 @end table
2191
2192 If a warning is detected while reading a @code{.stabd}, @code{.stabn},
2193 or @code{.stabs} statement, the symbol has probably already been created
2194 and you will get a half-formed symbol in your object file.  This is
2195 compatible with earlier assemblers!
2196
2197 @table @code
2198 @item .stabd @var{type} , @var{other} , @var{desc}
2199
2200 The ``name'' of the symbol generated is not even an empty string.
2201 It is a null pointer, for compatibility.  Older assemblers used a
2202 null pointer so they didn't waste space in object files with empty
2203 strings.
2204
2205 The symbol's value is set to the location counter,
2206 relocatably.  When your program is linked, the value of this symbol
2207 will be where the location counter was when the @code{.stabd} was
2208 assembled.
2209
2210 @item .stabn @var{type} , @var{other} , @var{desc} , @var{value}
2211
2212 The name of the symbol is set to the empty string @code{""}.
2213
2214 @item .stabs @var{string} ,  @var{type} , @var{other} , @var{desc} , @var{value}
2215
2216 All five fields are specified.
2217 @end table
2218
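The three forms can be sketched as follows.  The numeric @var{type} and
@var{desc} values, and the string @code{"main.c"}, are placeholders
chosen for illustration; they are not codes taken from any particular
debugger:
@example
        .stabs  "main.c", 100, 0, 0, 0
        .stabn  68, 0, 12, 0
        .stabd  68, 0, 13
@end example
@noindent
The @code{.stabs} line supplies all five fields, @code{.stabn} omits
@var{string}, and @code{.stabd} omits both @var{string} and @var{value}.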
2219 @node Text, Word, Stab, Pseudo Ops
2220 @section @code{.text @var{subsegment}}
2221 Tells @code{as} to assemble the following statements onto the end of
2222 the text subsegment numbered @var{subsegment}, which is an absolute
2223 expression.  If @var{subsegment} is omitted, subsegment number zero
2224 is used.
2225
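A minimal sketch of switching between text subsegments (the data values
are arbitrary):
@example
        .text
        .word   1
        .text   1
        .word   2
        .text
        .word   3
@end example
@noindent
Here the first and third @code{.word} directives are assembled onto the
end of text subsegment 0, and the second onto the end of text
subsegment 1.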
2226 @node Word, Deprecated, Text, Pseudo Ops
2227 @section @code{.word @var{expressions}}
2228 This directive expects zero or more @var{expressions}, of any segment,
2229 separated by commas.
2230 @c if sparc or amd29k
2231 For each expression, @code{as} emits a 32-bit number.
2232 @c fi sparc or amd29k
2233 @c if not (sparc or amd29k)
2234 @c For each expression, @code{as} emits a 16-bit number.
2235 @c fi not (sparc or amd29k)
2236 @ignore
2237 @c if all-arch
2238 The byte order
2239 of the expression depends on what kind of computer will run the
2240 program.
2241 @c fi all-arch
2242 @end ignore
2243
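For example, on the AMD 29K configuration documented here, this sketch
(with arbitrary values) emits two 32-bit numbers:
@example
        .word   1, 0x12345678
@end example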
2244 @ignore
2245 @c on the 29k this doesn't happen---32-bit addressability, period; no
2246 @c long/short jumps.
2247 @c if not am29k
2248 @subsection Special Treatment to support Compilers
2249
2250 In order to assemble compiler output into something that will work,
@code{as} will occasionally do strange things to @samp{.word} directives.
2252 Directives of the form @samp{.word sym1-sym2} are often emitted by
2253 compilers as part of jump tables.  Therefore, when @code{as} assembles a
2254 directive of the form @samp{.word sym1-sym2}, and the difference between
2255 @code{sym1} and @code{sym2} does not fit in 16 bits, @code{as} will
2256 create a @dfn{secondary jump table}, immediately before the next label.
This secondary jump table will be preceded by a short-jump to the
2258 first byte after the secondary table.  This short-jump prevents the flow
2259 of control from accidentally falling into the new table.  Inside the
2260 table will be a long-jump to @code{sym2}.  The original @samp{.word}
2261 will contain @code{sym1} minus the address of the long-jump to
2262 @code{sym2}.
2263
2264 If there were several occurrences of @samp{.word sym1-sym2} before the
2265 secondary jump table, all of them will be adjusted.  If there was a
2266 @samp{.word sym3-sym4}, that also did not fit in sixteen bits, a
2267 long-jump to @code{sym4} will be included in the secondary jump table,
2268 and the @code{.word} directives will be adjusted to contain @code{sym3}
2269 minus the address of the long-jump to @code{sym4}; and so on, for as many
2270 entries in the original jump table as necessary.
2271 @end ignore
2272 @ignore
2273 @c if internals
2274 @emph{This feature may be disabled by compiling @code{as} with the
2275 @samp{-DWORKING_DOT_WORD} option.} This feature is likely to confuse
2276 assembly language programmers.
2277 @c fi internals
2278 @end ignore
2279
2280
2281 @node Deprecated, Machine Dependent, Word, Pseudo Ops
2282 @section Deprecated Directives
2283 One day these directives won't work.
2284 They are included for compatibility with older assemblers.
2285 @table @t
2286 @item .abort
2287 @item .app-file
2288 @item .line
2289 @end table
2290
2291 @node Machine Dependent, Machine Dependent, Pseudo Ops, Top
2292 @c if all-arch
2293 @c chapter Machine Dependent Features
2294 @c fi all-arch
2295 @c if 680x0
2296 @c chapter Machine Dependent Features: Motorola 680x0
2297 @c fi 680x0
2298 @c if amd29k
2299 @chapter Machine Dependent Features: AMD 29K
2300 @c fi amd29k
2301 @c pesch@cygnus.com: This version of the manual is specifically hacked
2302 @c                   for gas on a particular machine.
2303 @c                   We should have a config method of
2304 @c                   automating this; in the meantime, use ignore
2305 @c                   for the other architectures (or for their stubs)
2306 @ignore
2307 @c if all-arch
2308 @section Vax
2309 @c fi all-arch
2310 @subsection Options
2311
2312 The Vax version of @code{as} accepts any of the following options,
2313 gives a warning message that the option was ignored and proceeds.
2314 These options are for compatibility with scripts designed for other
2315 people's assemblers.
2316
2317 @table @asis
2318 @item @kbd{-D} (Debug)
2319 @itemx @kbd{-S} (Symbol Table)
2320 @itemx @kbd{-T} (Token Trace)
2321 These are obsolete options used to debug old assemblers.
2322
2323 @item @kbd{-d} (Displacement size for JUMPs)
2324 This option expects a number following the @kbd{-d}.  Like options
2325 that expect filenames, the number may immediately follow the
2326 @kbd{-d} (old standard) or constitute the whole of the command line
2327 argument that follows @kbd{-d} (GNU standard).
2328
2329 @item @kbd{-V} (Virtualize Interpass Temporary File)
2330 Some other assemblers use a temporary file.  This option
2331 commanded them to keep the information in active memory rather
2332 than in a disk file.  @code{as} always does this, so this
2333 option is redundant.
2334
2335 @item @kbd{-J} (JUMPify Longer Branches)
2336 Many 32-bit computers permit a variety of branch instructions
2337 to do the same job.  Some of these instructions are short (and
2338 fast) but have a limited range; others are long (and slow) but
2339 can branch anywhere in virtual memory.  Often there are 3
2340 flavors of branch: short, medium and long.  Some other
2341 assemblers would emit short and medium branches, unless told by
2342 this option to emit short and long branches.
2343
2344 @item @kbd{-t} (Temporary File Directory)
2345 Some other assemblers may use a temporary file, and this option
2346 takes a filename being the directory to site the temporary
2347 file.  @code{as} does not use a temporary disk file, so this
2348 option makes no difference.  @kbd{-t} needs exactly one
2349 filename.
2350 @end table
2351
2352 The Vax version of the assembler accepts two options when
2353 compiled for VMS.  They are @kbd{-h}, and @kbd{-+}.  The
2354 @kbd{-h} option prevents @code{as} from modifying the
2355 symbol-table entries for symbols that contain lowercase
2356 characters (I think).  The @kbd{-+} option causes @code{as} to
print warning messages if the FILENAME part of the object file,
or any symbol name, is longer than 31 characters.  The @kbd{-+}
option also inserts some code following the @samp{_main}
2360 symbol so that the object file will be compatible with Vax-11
2361 "C".
2362
2363 @subsection Floating Point
2364 Conversion of flonums to floating point is correct, and
2365 compatible with previous assemblers.  Rounding is
2366 towards zero if the remainder is exactly half the least significant bit.
2367
2368 @code{D}, @code{F}, @code{G} and @code{H} floating point formats
2369 are understood.
2370
2371 Immediate floating literals (@emph{e.g.} @samp{S`$6.9})
2372 are rendered correctly.  Again, rounding is towards zero in the
2373 boundary case.
2374
2375 The @code{.float} directive produces @code{f} format numbers.
2376 The @code{.double} directive produces @code{d} format numbers.
2377
2378 @subsection Machine Directives
2379 The Vax version of the assembler supports four directives for
2380 generating Vax floating point constants.  They are described in the
2381 table below.
2382
2383 @table @code
2384 @item .dfloat
2385 This expects zero or more flonums, separated by commas, and
2386 assembles Vax @code{d} format 64-bit floating point constants.
2387
2388 @item .ffloat
2389 This expects zero or more flonums, separated by commas, and
2390 assembles Vax @code{f} format 32-bit floating point constants.
2391
2392 @item .gfloat
2393 This expects zero or more flonums, separated by commas, and
2394 assembles Vax @code{g} format 64-bit floating point constants.
2395
2396 @item .hfloat
2397 This expects zero or more flonums, separated by commas, and
2398 assembles Vax @code{h} format 128-bit floating point constants.
2399
2400 @end table
2401
2402 @subsection Opcodes
2403 All DEC mnemonics are supported.  Beware that @code{case@dots{}}
2404 instructions have exactly 3 operands.  The dispatch table that
2405 follows the @code{case@dots{}} instruction should be made with
2406 @code{.word} statements.  This is compatible with all unix
2407 assemblers we know of.
2408
2409 @subsection Branch Improvement
2410 Certain pseudo opcodes are permitted.  They are for branch
2411 instructions.  They expand to the shortest branch instruction that
2412 will reach the target.  Generally these mnemonics are made by
2413 substituting @samp{j} for @samp{b} at the start of a DEC mnemonic.
2414 This feature is included both for compatibility and to help
2415 compilers.  If you don't need this feature, don't use these
2416 opcodes.  Here are the mnemonics, and the code they can expand into.
2417
2418 @table @code
2419 @item jbsb
2420 @samp{Jsb} is already an instruction mnemonic, so we chose @samp{jbsb}.
2421 @table @asis
2422 @item (byte displacement)
2423 @kbd{bsbb @dots{}}
2424 @item (word displacement)
2425 @kbd{bsbw @dots{}}
2426 @item (long displacement)
2427 @kbd{jsb @dots{}}
2428 @end table
2429 @item jbr
2430 @itemx jr
2431 Unconditional branch.
2432 @table @asis
2433 @item (byte displacement)
2434 @kbd{brb @dots{}}
2435 @item (word displacement)
2436 @kbd{brw @dots{}}
2437 @item (long displacement)
2438 @kbd{jmp @dots{}}
2439 @end table
2440 @item j@var{COND}
2441 @var{COND} may be any one of the conditional branches
2442 @code{neq nequ eql eqlu gtr geq lss gtru lequ vc vs gequ cc lssu cs}.
2443 @var{COND} may also be one of the bit tests
2444 @code{bs bc bss bcs bsc bcc bssi bcci lbs lbc}.
2445 @var{NOTCOND} is the opposite condition to @var{COND}.
2446 @table @asis
2447 @item (byte displacement)
2448 @kbd{b@var{COND} @dots{}}
2449 @item (word displacement)
@kbd{b@var{NOTCOND} foo ; brw @dots{} ; foo:}
@item (long displacement)
@kbd{b@var{NOTCOND} foo ; jmp @dots{} ; foo:}
2453 @end table
2454 @item jacb@var{X}
2455 @var{X} may be one of @code{b d f g h l w}.
2456 @table @asis
2457 @item (word displacement)
2458 @kbd{@var{OPCODE} @dots{}}
2459 @item (long displacement)
2460 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: jmp @dots{} ; bar:}
2461 @end table
2462 @item jaob@var{YYY}
2463 @var{YYY} may be one of @code{lss leq}.
2464 @item jsob@var{ZZZ}
2465 @var{ZZZ} may be one of @code{geq gtr}.
2466 @table @asis
2467 @item (byte displacement)
2468 @kbd{@var{OPCODE} @dots{}}
2469 @item (word displacement)
2470 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: brw @var{destination} ; bar:}
2471 @item (long displacement)
2472 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: jmp @var{destination} ; bar: }
2473 @end table
2474 @item aobleq
2475 @itemx aoblss
2476 @itemx sobgeq
2477 @itemx sobgtr
2478 @table @asis
2479 @item (byte displacement)
2480 @kbd{@var{OPCODE} @dots{}}
2481 @item (word displacement)
2482 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: brw @var{destination} ; bar:}
2483 @item (long displacement)
2484 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: jmp @var{destination} ; bar:}
2485 @end table
2486 @end table
2487
@subsection Operands
2489 The immediate character is @samp{$} for Unix compatibility, not
2490 @samp{#} as DEC writes it.
2491
2492 The indirect character is @samp{*} for Unix compatibility, not
2493 @samp{@@} as DEC writes it.
2494
2495 The displacement sizing character is @samp{`} (an accent grave) for
2496 Unix compatibility, not @samp{^} as DEC writes it.  The letter
2497 preceding @samp{`} may have either case.  @samp{G} is not
2498 understood, but all other letters (@code{b i l s w}) are understood.
2499
2500 Register names understood are @code{r0 r1 r2 @dots{} r15 ap fp sp
2501 pc}.  Any case of letters will do.
2502
2503 For instance
2504 @example
2505 tstb *w`$4(r5)
2506 @end example
2507
2508 Any expression is permitted in an operand.  Operands are comma
2509 separated.
2510
2511 @c There is some bug to do with recognizing expressions
2512 @c in operands, but I forget what it is.  It is
2513 @c a syntax clash because () is used as an address mode
2514 @c and to encapsulate sub-expressions.
2515 @subsection Not Supported
Vax bit fields cannot be assembled with @code{as}.  Someone
2517 can add the required code if they really need it.
2518 @end ignore
2519
2520 @c if am29k
2521 @node Machine Options, Machine Syntax, Machine Dependent, Machine Dependent
2522 @section Options
2523 GNU @code{as} has no additional command-line options for the AMD
2524 29K family.
2525
2526 @node Machine Syntax, Floating Point, Machine Options, Machine Dependent
2527 @section Syntax
2528 @subsection Special Characters
2529 @samp{;} is the line comment character.
2530
2531 @samp{@@} can be used instead of a newline to separate statements.
2532
2533 The character @samp{?} is permitted in identifiers (but may not begin
2534 an identifier).
2535
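The following sketch illustrates all three conventions.  The label
@code{loop?1} is hypothetical, and the two instructions are only
placeholders:
@example
loop?1: add gr96, gr96, gr97 @@ sub gr98, gr98, gr97   ; two statements
@end example
@noindent
The @samp{@@} separates the two statements, and everything from the
@samp{;} to the end of the line is a comment.
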
2536 @subsection Register Names
2537 General-purpose registers are represented by predefined symbols of the
2538 form @samp{GR@var{nnn}} (for global registers) or @samp{LR@var{nnn}}
2539 (for local registers), where @var{nnn} represents a number between
2540 @code{0} and @code{127}, written with no leading zeros.  The leading
2541 letters may be in either upper or lower case; for example, @samp{gr13}
2542 and @samp{LR7} are both valid register names.
2543
2544 You may also refer to general-purpose registers by specifying the
2545 register number as the result of an expression (prefixed with @samp{%%}
2546 to flag the expression as a register number):
2547 @example
2548 %%@var{expression}
2549 @end example
2550 @noindent---where @var{expression} must be an absolute expression
2551 evaluating to a number between @code{0} and @code{255}.  The range
2552 [0, 127] refers to global registers, and the range [128, 255] to local
2553 registers.
2554
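As an illustration, assuming the three-register form of the 29K
@code{add} instruction, these two statements should name the same
operands:
@example
        add     gr96, lr2, lr3
        add     gr96, lr2, %%(128+3)
@end example
@noindent
since register number 131 falls in the local range [128, 255] and so
denotes @samp{lr3}.
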
2555 In addition, GNU @code{as} understands the following protected
2556 special-purpose register names for the AMD 29K family:
2557
2558 @example
2559   vab    chd    pc0
2560   ops    chc    pc1
2561   cps    rbp    pc2
2562   cfg    tmc    mmu
2563   cha    tmr    lru
2564 @end example
2565
2566 These unprotected special-purpose register names are also recognized:
2567 @example
2568   ipc    alu    fpe
2569   ipa    bp     inte
2570   ipb    fc     fps
2571   q      cr     exop
2572 @end example
2573
2574 @node Floating Point, Machine Directives, Machine Syntax, Machine Dependent
2575 @section Floating Point
2576 The AMD 29K family uses IEEE floating-point numbers.
2577
2578 @node Machine Directives, Opcodes, Floating Point, Machine Dependent
2579 @section Machine Directives
2580
2581 @menu
2582 * block::                       @code{.block @var{size} , @var{fill}}
2583 * cputype::                     @code{.cputype}
2584 * file::                        @code{.file}
2585 * hword::                       @code{.hword @var{expressions}}
2586 * line::                        @code{.line}
2587 * reg::                         @code{.reg @var{symbol}, @var{expression}}
2588 * sect::                        @code{.sect}
2589 * use::                         @code{.use @var{segment name}}
2590 @end menu
2591
2592 @node block, cputype, Machine Directives, Machine Directives
2593 @subsection @code{.block @var{size} , @var{fill}}
2594 This directive emits @var{size} bytes, each of value @var{fill}.  Both
2595 @var{size} and @var{fill} are absolute expressions.  If the comma
2596 and @var{fill} are omitted, @var{fill} is assumed to be zero.
2597
2598 In other versions of GNU @code{as}, this directive is called
2599 @samp{.space}.
2600
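A brief sketch of both forms (the sizes and fill value are arbitrary):
@example
        .block  16
        .block  4, 0xff
@end example
@noindent
The first directive emits sixteen zero bytes; the second emits four
bytes, each of value @code{0xff}.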
2601 @node cputype, file, block, Machine Directives
2602 @subsection @code{.cputype}
2603 This directive is ignored; it is accepted for compatibility with other
2604 AMD 29K assemblers.
2605
2606 @node file, hword, cputype, Machine Directives
2607 @subsection @code{.file}
2608 This directive is ignored; it is accepted for compatibility with other
2609 AMD 29K assemblers.
2610
2611 @quotation
2612 @emph{Warning:} in other versions of GNU @code{as}, @code{.file} is
2613 used for the directive called @code{.app-file} in the AMD 29K support.
2614 @end quotation
2615
2616 @node hword, line, file, Machine Directives
2617 @subsection @code{.hword @var{expressions}}
This expects zero or more @var{expressions}, and emits
a 16-bit number for each.  (Synonym for @samp{.short}.)
2620
2621 @node line, reg, hword, Machine Directives
2622 @subsection @code{.line}
2623 This directive is ignored; it is accepted for compatibility with other
2624 AMD 29K assemblers.
2625
2626 @node reg, sect, line, Machine Directives
2627 @subsection @code{.reg @var{symbol}, @var{expression}}
2628 @code{.reg} has the same effect as @code{.lsym}; @pxref{Lsym}.
2629
2630 @node sect, use, reg, Machine Directives
2631 @subsection @code{.sect}
2632 This directive is ignored; it is accepted for compatibility with other
2633 AMD 29K assemblers.
2634
2635 @node use,  , sect, Machine Directives
2636 @subsection @code{.use @var{segment name}}
2637 Establishes the segment and subsegment for the following code;
2638 @var{segment name} may be one of @code{.text}, @code{.data},
2639 @code{.data1}, or @code{.lit}.  With one of the first three @var{segment
2640 name} options, @samp{.use} is equivalent to the machine directive
2641 @var{segment name}; the remaining case, @samp{.use .lit}, is the same as
2642 @samp{.data 200}.
2643
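For example:
@example
        .use    .text
        .use    .lit
@end example
@noindent
The first line is equivalent to @code{.text}; the second is the same as
@code{.data 200}.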
2644
2645 @node Opcodes, Opcodes, Machine Directives, Machine Dependent
2646 @section Opcodes
2647 GNU @code{as} implements all the standard AMD 29K opcodes.  No
2648 additional pseudo-instructions are needed on this family.
2649
2650 For information on the 29K machine instruction set, see @cite{Am29000
2651 User's Manual}, Advanced Micro Devices, Inc.
2652
2653
2654 @c fi am29k
2655 @ignore
2656 @c if 680x0
2657 @section Options
2658 The 680x0 version of @code{as} has two machine dependent options.
2659 One shortens undefined references from 32 to 16 bits, while the
2660 other is used to tell @code{as} what kind of machine it is
2661 assembling for.
2662
2663 You can use the @kbd{-l} option to shorten the size of references to
2664 undefined symbols.  If the @kbd{-l} option is not given, references to
2665 undefined symbols will be a full long (32 bits) wide.  (Since @code{as}
2666 cannot know where these symbols will end up, @code{as} can only allocate
2667 space for the linker to fill in later.  Since @code{as} doesn't know how
2668 far away these symbols will be, it allocates as much space as it can.)
2669 If this option is given, the references will only be one word wide (16
2670 bits).  This may be useful if you want the object file to be as small as
2671 possible, and you know that the relevant symbols will be less than 17
2672 bits away.
2673
2674 The 680x0 version of @code{as} is most frequently used to assemble
2675 programs for the Motorola MC68020 microprocessor.  Occasionally it is
2676 used to assemble programs for the mostly similar, but slightly different
2677 MC68000 or MC68010 microprocessors.  You can give @code{as} the options
2678 @samp{-m68000}, @samp{-mc68000}, @samp{-m68010}, @samp{-mc68010},
2679 @samp{-m68020}, and @samp{-mc68020} to tell it what processor is the
2680 target.
2681
2682 @section Syntax
2683
2684 The 680x0 version of @code{as} uses syntax similar to the Sun assembler.
2685 Size modifiers are appended directly to the end of the opcode without an
2686 intervening period.  For example, write @samp{movl} rather than
2687 @samp{move.l}.
2688
2689 @c pesch@cygnus.com: Vintage Release c1.37 isn't compiled with
2690 @c SUN_ASM_SYNTAX.
2691 @c ignore
2692 If @code{as} is compiled with SUN_ASM_SYNTAX defined, it will also allow
Sun-style local labels of the form @samp{1$} through @samp{9$}.
2694 @c end ignore
2695
2696 In the following table @dfn{apc} stands for any of the address
registers (@samp{a0} through @samp{a7}), nothing (@samp{}), the
2698 Program Counter (@samp{pc}), or the zero-address relative to the
2699 program counter (@samp{zpc}).
2700
2701 The following addressing modes are understood:
2702 @table @dfn
2703 @item Immediate
2704 @samp{#@var{digits}}
2705
2706 @item Data Register
2707 @samp{d0} through @samp{d7}
2708
2709 @item Address Register
2710 @samp{a0} through @samp{a7}
2711
2712 @item Address Register Indirect
2713 @samp{a0@@} through @samp{a7@@}
2714
2715 @item Address Register Postincrement
2716 @samp{a0@@+} through @samp{a7@@+}
2717
2718 @item Address Register Predecrement
2719 @samp{a0@@-} through @samp{a7@@-}
2720
2721 @item Indirect Plus Offset
2722 @samp{@var{apc}@@(@var{digits})}
2723
2724 @item Index
2725 @samp{@var{apc}@@(@var{digits},@var{register}:@var{size}:@var{scale})}
2726 or @samp{@var{apc}@@(@var{register}:@var{size}:@var{scale})}
2727
2728 @item Postindex
2729 @samp{@var{apc}@@(@var{digits})@@(@var{digits},@var{register}:@var{size}:@var{scale})}
2730 or @samp{@var{apc}@@(@var{digits})@@(@var{register}:@var{size}:@var{scale})}
2731
2732 @item Preindex
2733 @samp{@var{apc}@@(@var{digits},@var{register}:@var{size}:@var{scale})@@(@var{digits})}
2734 or @samp{@var{apc}@@(@var{register}:@var{size}:@var{scale})@@(@var{digits})}
2735
2736 @item Memory Indirect
2737 @samp{@var{apc}@@(@var{digits})@@(@var{digits})}
2738
2739 @item Absolute
2740 @samp{@var{symbol}}, or @samp{@var{digits}}
2741 @c ignore
2742 @c pesch@cygnus.com: gnu, rich concur the following needs careful
2743 @c                             research before documenting.
2744                                            , or either of the above followed
2745 by @samp{:b}, @samp{:w}, or @samp{:l}.
2746 @c end ignore
2747 @end table
2748
2749 @section Floating Point
2750 The floating point code is not too well tested, and may have
2751 subtle bugs in it.
2752
2753 Packed decimal (P) format floating literals are not supported.
2754 Feel free to add the code!
2755
2756 The floating point formats generated by directives are these.
2757 @table @code
2758 @item .float
2759 @code{Single} precision floating point constants.
2760 @item .double
2761 @code{Double} precision floating point constants.
2762 @end table
2763
There is no directive to produce regions of memory holding
extended precision numbers; however, they can be used as
2766 immediate operands to floating-point instructions.  Adding a
2767 directive to create extended precision numbers would not be
2768 hard, but it has not yet seemed necessary.
2769
2770 @section Machine Directives
2771 In order to be compatible with the Sun assembler the 680x0 assembler
2772 understands the following directives.
2773 @table @code
2774 @item .data1
2775 This directive is identical to a @code{.data 1} directive.
2776 @item .data2
2777 This directive is identical to a @code{.data 2} directive.
2778 @item .even
2779 This directive is identical to a @code{.align 1} directive.
2780 @c Is this true?  does it work???
2781 @item .skip
2782 This directive is identical to a @code{.space} directive.
2783 @end table
2784
2785 @section Opcodes
2786 @c pesch@cygnus.com: I don't see any point in the following
2787 @c                   paragraph.  Bugs are bugs; how does saying this
2788 @c                   help anyone?
2789 @c ignore
2790 Danger:  Several bugs have been found in the opcode table (and
2791 fixed).  More bugs may exist.  Be careful when using obscure
2792 instructions.
2793 @c end ignore
2794
2795 @subsection Branch Improvement
2796
2797 Certain pseudo opcodes are permitted for branch instructions.
2798 They expand to the shortest branch instruction that will reach the
2799 target.  Generally these mnemonics are made by substituting @samp{j} for
2800 @samp{b} at the start of a Motorola mnemonic.
2801
2802 The following table summarizes the pseudo-operations.  A @code{*} flags
2803 cases that are more fully described after the table:
2804
2805 @example
2806           Displacement
2807           +---------------------------------------------------------
2808           |                68020   68000/10
2809 Pseudo-Op |BYTE    WORD    LONG    LONG      non-PC relative
2810           +---------------------------------------------------------
2811      jbsr |bsrs    bsr     bsrl    jsr       jsr
2812       jra |bras    bra     bral    jmp       jmp
2813 *     jXX |bXXs    bXX     bXXl    bNXs;jmpl bNXs;jmp
2814 *    dbXX |dbXX    dbXX        dbXX; bra; jmpl
2815 *    fjXX |fbXXw   fbXXw   fbXXl             fbNXw;jmp
2816
2817 XX: condition
2818 NX: negative of condition XX
2819
2820 @end example
2821 @center{@code{*}---see full description below}
2822
2823 @table @code
2824 @item jbsr
2825 @itemx jra
2826 These are the simplest jump pseudo-operations; they always map to one
2827 particular machine instruction, depending on the displacement to the
2828 branch target.
2829
2830 @item j@var{XX}
2831 Here, @samp{j@var{XX}} stands for an entire family of pseudo-operations,
2832 where @var{XX} is a conditional branch or condition-code test.  The full
2833 list of pseudo-ops in this family is:
2834 @example
2835  jhi   jls   jcc   jcs   jne   jeq   jvc
2836  jvs   jpl   jmi   jge   jlt   jgt   jle
2837 @end example
2838
2839 For the cases of non-PC relative displacements and long displacements on
2840 the 68000 or 68010, @code{as} will issue a longer code fragment in terms of
2841 @var{NX}, the opposite condition to @var{XX}:
2842 @example
2843     j@var{XX} foo
2844 @end example
2845 gives
2846 @example
2847      b@var{NX}s oof
2848      jmp foo
2849  oof:
2850 @end example
2851
2852 @item db@var{XX}
2853 The full family of pseudo-operations covered here is
2854 @example
2855  dbhi   dbls   dbcc   dbcs   dbne   dbeq   dbvc
2856  dbvs   dbpl   dbmi   dbge   dblt   dbgt   dble
2857  dbf    dbra   dbt
2858 @end example
2859
2860 Other than for word and byte displacements, when the source reads
2861 @samp{db@var{XX} foo}, @code{as} will emit
2862 @example
2863      db@var{XX} oo1
2864      bra oo2
2865  oo1:jmpl foo
2866  oo2:
2867 @end example
2868
2869 @item fj@var{XX}
2870 This family includes
2871 @example
2872  fjne   fjeq   fjge   fjlt   fjgt   fjle   fjf
2873  fjt    fjgl   fjgle  fjnge  fjngl  fjngle fjngt
2874  fjnle  fjnlt  fjoge  fjogl  fjogt  fjole  fjolt
2875  fjor   fjseq  fjsf   fjsne  fjst   fjueq  fjuge
2876  fjugt  fjule  fjult  fjun
2877 @end example
2878
2879 For branch targets that are not PC relative, @code{as} emits
2880 @example
2881      fb@var{NX} oof
2882      jmp foo
2883  oof:
2884 @end example
2885 when it encounters @samp{fj@var{XX} foo}.
2886
2887 @end table
2888
2889 @subsection Special Characters
2890 The immediate character is @samp{#} for Sun compatibility.  The
2891 line-comment character is @samp{|}.  If a @samp{#} appears at the
2892 beginning of a line, it is treated as a comment unless it looks like
2893 @samp{# line file}, in which case it is treated normally.
2894 @c fi 680x0
2895 @end ignore
2896
2897 @c pesch@cygnus.com: see remarks at ignore for vax.
2898 @ignore
2899 @section 32x32
2900 @section Options
The 32x32 version of @code{as} accepts a @kbd{-m32032} option to
specify that it is compiling for a 32032 processor, or a
@kbd{-m32532} to specify that it is compiling for a 32532 processor.
2904 The default (if neither is specified) is chosen when the assembler
2905 is compiled.
2906
2907 @subsection Syntax
2908 I don't know anything about the 32x32 syntax assembled by
@code{as}.  Someone who understands the processor (I've never seen
2910 one) and the possible syntaxes should write this section.
2911
2912 @subsection Floating Point
2913 The 32x32 uses IEEE floating point numbers, but @code{as} will only
2914 create single or double precision values.  I don't know if the 32x32
2915 understands extended precision numbers.
2916
2917 @subsection Machine Directives
2918 The 32x32 has no machine dependent directives.
2919
2920 @section Sparc
2921 @subsection Options
2922 The sparc has no machine dependent options.
2923
@subsection Syntax
2925 I don't know anything about Sparc syntax.  Someone who does
2926 will have to write this section.
2927
2928 @subsection Floating Point
The Sparc uses IEEE floating-point numbers.
2930
2931 @subsection Machine Directives
2932 The Sparc version of @code{as} supports the following additional
2933 machine directives:
2934
2935 @table @code
2936 @item .common
2937 This must be followed by a symbol name, a positive number, and
2938 @code{"bss"}.  This behaves somewhat like @code{.comm}, but the
2939 syntax is different.
2940
2941 @item .global
2942 This is functionally identical to @code{.globl}.
2943
2944 @item .half
2945 This is functionally identical to @code{.short}.
2946
2947 @item .proc
2948 This directive is ignored.  Any text following it on the same
2949 line is also ignored.
2950
2951 @item .reserve
2952 This must be followed by a symbol name, a positive number, and
2953 @code{"bss"}.  This behaves somewhat like @code{.lcomm}, but the
2954 syntax is different.
2955
2956 @item .seg
2957 This must be followed by @code{"text"}, @code{"data"}, or
2958 @code{"data1"}.  It behaves like @code{.text}, @code{.data}, or
2959 @code{.data 1}.
2960
2961 @item .skip
This is functionally identical to the @code{.space} directive.
2963
2964 @item .word
On the Sparc, the @code{.word} directive produces 32-bit values,
instead of the 16-bit values it produces on every other machine.
2967
2968 @end table
2969
2970 @section Intel 80386
2971 @subsection Options
2972 The 80386 has no machine dependent options.
2973
2974 @subsection AT&T Syntax versus Intel Syntax
2975 In order to maintain compatibility with the output of @code{GCC},
2976 @code{as} supports AT&T System V/386 assembler syntax.  This is quite
2977 different from Intel syntax.  We mention these differences because
almost all 80386 documents use only Intel syntax.  Notable differences
2979 between the two syntaxes are:
2980 @itemize @bullet
2981 @item
2982 AT&T immediate operands are preceded by @samp{$}; Intel immediate
2983 operands are undelimited (Intel @samp{push 4} is AT&T @samp{pushl $4}).
2984 AT&T register operands are preceded by @samp{%}; Intel register operands
2985 are undelimited.  AT&T absolute (as opposed to PC relative) jump/call
2986 operands are prefixed by @samp{*}; they are undelimited in Intel syntax.
2987
2988 @item
2989 AT&T and Intel syntax use the opposite order for source and destination
2990 operands.  Intel @samp{add eax, 4} is @samp{addl $4, %eax}.  The
2991 @samp{source, dest} convention is maintained for compatibility with
2992 previous Unix assemblers.
2993
2994 @item
2995 In AT&T syntax the size of memory operands is determined from the last
2996 character of the opcode name.  Opcode suffixes of @samp{b}, @samp{w},
2997 and @samp{l} specify byte (8-bit), word (16-bit), and long (32-bit)
memory references.  Intel syntax accomplishes this by prefixing memory
2999 operands (@emph{not} the opcodes themselves) with @samp{byte ptr},
3000 @samp{word ptr}, and @samp{dword ptr}.  Thus, Intel @samp{mov al, byte
3001 ptr @var{foo}} is @samp{movb @var{foo}, %al} in AT&T syntax.
3002
3003 @item
3004 Immediate form long jumps and calls are
3005 @samp{lcall/ljmp $@var{segment}, $@var{offset}} in AT&T syntax; the
3006 Intel syntax is
3007 @samp{call/jmp far @var{segment}:@var{offset}}.  Also, the far return
3008 instruction
3009 is @samp{lret $@var{stack-adjust}} in AT&T syntax; Intel syntax is
3010 @samp{ret far @var{stack-adjust}}.
3011
3012 @item
3013 The AT&T assembler does not provide support for multiple segment
3014 programs.  Unix style systems expect all programs to be single segments.
3015 @end itemize
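
The differences listed above can be seen side by side in the following
sketch.  Each line is AT&T syntax; the Intel form given in the list is
repeated in the comment.  @code{foo} is a placeholder symbol, the stack
adjustment of 8 is arbitrary, and @samp{#} is assumed to begin a comment.

@example
        pushl   $4              # Intel: push 4
        addl    $4, %eax        # Intel: add eax, 4
        movb    foo, %al        # Intel: mov al, byte ptr foo
        lret    $8              # Intel: ret far 8
@end example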
3016
3017 @subsection Opcode Naming
3018 Opcode names are suffixed with one character modifiers which specify the
3019 size of operands.  The letters @samp{b}, @samp{w}, and @samp{l} specify
3020 byte, word, and long operands.  If no suffix is specified by an
3021 instruction and it contains no memory operands then @code{as} tries to
3022 fill in the missing suffix based on the destination register operand
3023 (the last one by convention).  Thus, @samp{mov %ax, %bx} is equivalent
3024 to @samp{movw %ax, %bx}; also, @samp{mov $1, %bx} is equivalent to
3025 @samp{movw $1, %bx}.  Note that this is incompatible with the AT&T Unix
3026 assembler which assumes that a missing opcode suffix implies long
3027 operand size.  (This incompatibility does not affect compiler output
3028 since compilers always explicitly specify the opcode suffix.)
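
For instance, under the rule above the following lines would be
assembled as shown in the comments.  This is only a sketch;
@code{counter} is a hypothetical data symbol, and @samp{#} is assumed to
begin a comment.

@example
        mov     %ax, %bx        # assembled as movw %ax, %bx
        mov     $1, %bx         # assembled as movw $1, %bx
        mov     %al, %cl        # 8-bit destination: movb %al, %cl
        movl    $1, counter     # memory operand: write the suffix yourself
@end example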
3029
3030 Almost all opcodes have the same names in AT&T and Intel format.  There
3031 are a few exceptions.  The sign extend and zero extend instructions need
3032 two sizes to specify them.  They need a size to sign/zero extend
@emph{from} and a size to sign/zero extend @emph{to}.  This is accomplished
3034 by using two opcode suffixes in AT&T syntax.  Base names for sign extend
3035 and zero extend are @samp{movs@dots{}} and @samp{movz@dots{}} in AT&T
3036 syntax (@samp{movsx} and @samp{movzx} in Intel syntax).  The opcode
3037 suffixes are tacked on to this base name, the @emph{from} suffix before
3038 the @emph{to} suffix.  Thus, @samp{movsbl %al, %edx} is AT&T syntax for
3039 ``move sign extend @emph{from} %al @emph{to} %edx.''  Possible suffixes,
3040 thus, are @samp{bl} (from byte to long), @samp{bw} (from byte to word),
3041 and @samp{wl} (from word to long).
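
A short sketch of the three suffix pairs in use follows; the register
choices are arbitrary, and @samp{#} is assumed to begin a comment.

@example
        movsbl  %al, %edx       # sign extend byte %al to long %edx
        movsbw  %al, %dx        # sign extend byte %al to word %dx
        movzwl  %ax, %ecx       # zero extend word %ax to long %ecx
@end example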
3042
3043 The Intel syntax conversion instructions
3044 @itemize @bullet
3045 @item
3046 @samp{cbw} --- sign-extend byte in @samp{%al} to word in @samp{%ax},
3047 @item
3048 @samp{cwde} --- sign-extend word in @samp{%ax} to long in @samp{%eax},
3049 @item
3050 @samp{cwd} --- sign-extend word in @samp{%ax} to long in @samp{%dx:%ax},
3051 @item
3052 @samp{cdq} --- sign-extend dword in @samp{%eax} to quad in @samp{%edx:%eax},
3053 @end itemize
3054 are called @samp{cbtw}, @samp{cwtl}, @samp{cwtd}, and @samp{cltd} in
3055 AT&T naming.  @code{as} accepts either naming for these instructions.
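
A typical use of these conversions is to prepare the dividend for a
signed divide.  The following is a sketch with arbitrary constants, and
@samp{#} is assumed to begin a comment.

@example
        movl    $-7, %eax
        cltd                    # Intel cdq: sign extend %eax into %edx:%eax
        movl    $2, %ecx
        idivl   %ecx            # divide %edx:%eax by %ecx;
                                # quotient in %eax, remainder in %edx
@end example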
3056
3057 Far call/jump instructions are @samp{lcall} and @samp{ljmp} in
AT&T syntax, but are @samp{call far} and @samp{jmp far} in Intel
3059 convention.
3060
3061 @subsection Register Naming
Register operands are always prefixed with @samp{%}.  The 80386 registers
3063 consist of
3064 @itemize @bullet
3065 @item
3066 the 8 32-bit registers @samp{%eax} (the accumulator), @samp{%ebx},
3067 @samp{%ecx}, @samp{%edx}, @samp{%edi}, @samp{%esi}, @samp{%ebp} (the
3068 frame pointer), and @samp{%esp} (the stack pointer).
3069
3070 @item
3071 the 8 16-bit low-ends of these: @samp{%ax}, @samp{%bx}, @samp{%cx},
3072 @samp{%dx}, @samp{%di}, @samp{%si}, @samp{%bp}, and @samp{%sp}.
3073
3074 @item
3075 the 8 8-bit registers: @samp{%ah}, @samp{%al}, @samp{%bh},
3076 @samp{%bl}, @samp{%ch}, @samp{%cl}, @samp{%dh}, and @samp{%dl} (These
3077 are the high-bytes and low-bytes of @samp{%ax}, @samp{%bx},
3078 @samp{%cx}, and @samp{%dx})
3079
3080 @item
3081 the 6 segment registers @samp{%cs} (code segment), @samp{%ds}
3082 (data segment), @samp{%ss} (stack segment), @samp{%es}, @samp{%fs},
3083 and @samp{%gs}.
3084
3085 @item
3086 the 3 processor control registers @samp{%cr0}, @samp{%cr2}, and
3087 @samp{%cr3}.
3088
3089 @item
3090 the 6 debug registers @samp{%db0}, @samp{%db1}, @samp{%db2},
3091 @samp{%db3}, @samp{%db6}, and @samp{%db7}.
3092
3093 @item
3094 the 2 test registers @samp{%tr6} and @samp{%tr7}.
3095
3096 @item
the 8 floating point stack registers @samp{%st} (or equivalently
@samp{%st(0)}), @samp{%st(1)}, @samp{%st(2)}, @samp{%st(3)},
3099 @samp{%st(4)}, @samp{%st(5)}, @samp{%st(6)}, and @samp{%st(7)}.
3100 @end itemize
3101
3102 @subsection Opcode Prefixes
3103 Opcode prefixes are used to modify the following opcode.  They are used
3104 to repeat string instructions, to provide segment overrides, to perform
3105 bus lock operations, and to give operand and address size (16-bit
3106 operands are specified in an instruction by prefixing what would
normally be 32-bit operands with an ``operand size'' opcode prefix).
3108 Opcode prefixes are usually given as single-line instructions with no
3109 operands, and must directly precede the instruction they act upon.  For
3110 example, the @samp{scas} (scan string) instruction is repeated with:
3111 @example
3112         repne
3113         scas
3114 @end example
3115
3116 Here is a list of opcode prefixes:
3117 @itemize @bullet
3118 @item
3119 Segment override prefixes @samp{cs}, @samp{ds}, @samp{ss}, @samp{es},
@samp{fs}, @samp{gs}.  These are automatically added by using the
@var{segment}:@var{memory-operand} form for memory references.
3122
3123 @item
3124 Operand/Address size prefixes @samp{data16} and @samp{addr16}
3125 change 32-bit operands/addresses into 16-bit operands/addresses.  Note
3126 that 16-bit addressing modes (i.e. 8086 and 80286 addressing modes)
3127 are not supported (yet).
3128
3129 @item
3130 The bus lock prefix @samp{lock} inhibits interrupts during
3131 execution of the instruction it precedes.  (This is only valid with
certain instructions; see an 80386 manual for details.)
3133
3134 @item
3135 The wait for coprocessor prefix @samp{wait} waits for the
3136 coprocessor to complete the current instruction.  This should never be
3137 needed for the 80386/80387 combination.
3138
3139 @item
3140 The @samp{rep}, @samp{repe}, and @samp{repne} prefixes are added
3141 to string instructions to make them repeat @samp{%ecx} times.
3142 @end itemize
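
The following sketch shows a few of these prefixes in context.  The
string instruction name @samp{movsl} and the register choices are
assumptions made for the example, and @samp{#} is assumed to begin a
comment.

@example
        rep
        movsl                       # copy %ecx longs from (%esi) to (%edi)

        lock
        incl    (%ebx)              # locked increment of the long at (%ebx)

        movl    %fs:8(%ebx), %eax   # segment override prefix added by as
@end example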
3143
3144 @subsection Memory References
3145 An Intel syntax indirect memory reference of the form
3146 @example
3147 @var{segment}:[@var{base} + @var{index}*@var{scale} + @var{disp}]
3148 @end example
3149 is translated into the AT&T syntax
3150 @example
3151 @var{segment}:@var{disp}(@var{base}, @var{index}, @var{scale})
3152 @end example
3153 where @var{base} and @var{index} are the optional 32-bit base and
3154 index registers, @var{disp} is the optional displacement, and
3155 @var{scale}, taking the values 1, 2, 4, and 8, multiplies @var{index}
3156 to calculate the address of the operand.  If no @var{scale} is
3157 specified, @var{scale} is taken to be 1.  @var{segment} specifies the
3158 optional segment register for the memory operand, and may override the
default segment register (see an 80386 manual for segment register
defaults).  Note that segment overrides in AT&T syntax @emph{must}
be preceded by a @samp{%}.  If you specify a segment override which
3162 coincides with the default segment register, @code{as} will @emph{not}
3163 output any segment register override prefixes to assemble the given
3164 instruction.  Thus, segment overrides can be specified to emphasize which
3165 segment register is used for a given memory operand.
3166
3167 Here are some examples of Intel and AT&T style memory references:
3168 @table @asis
3169
3170 @item AT&T: @samp{-4(%ebp)}, Intel:  @samp{[ebp - 4]}
3171 @var{base} is @samp{%ebp}; @var{disp} is @samp{-4}. @var{segment} is
3172 missing, and the default segment is used (@samp{%ss} for addressing with
3173 @samp{%ebp} as the base register).  @var{index}, @var{scale} are both missing.
3174
3175 @item AT&T: @samp{foo(,%eax,4)}, Intel: @samp{[foo + eax*4]}
3176 @var{index} is @samp{%eax} (scaled by a @var{scale} 4); @var{disp} is
3177 @samp{foo}.  All other fields are missing.  The segment register here
3178 defaults to @samp{%ds}.
3179
@item AT&T: @samp{foo(,1)}, Intel: @samp{[foo]}
3181 This uses the value pointed to by @samp{foo} as a memory operand.
3182 Note that @var{base} and @var{index} are both missing, but there is only
3183 @emph{one} @samp{,}.  This is a syntactic exception.
3184
@item AT&T: @samp{%gs:foo}, Intel: @samp{gs:foo}
3186 This selects the contents of the variable @samp{foo} with segment
3187 register @var{segment} being @samp{%gs}.
3188
3189 @end table
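
Putting all of the fields together, a complete memory reference might
look like the first line of the following sketch.  The registers and the
displacement 12 are arbitrary, and @samp{#} is assumed to begin a
comment.

@example
        movl    %gs:12(%ebx,%esi,4), %eax   # address %ebx + %esi*4 + 12,
                                            # taken relative to %gs
        movl    (%ebx), %eax                # base register only
        movl    8(%ebx), %eax               # base plus displacement
@end example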
3190
3191 Absolute (as opposed to PC relative) call and jump operands must be
3192 prefixed with @samp{*}.  If no @samp{*} is specified, @code{as} will
3193 always choose PC relative addressing for jump/call labels.
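
The distinction can be sketched as follows.  @code{done} and
@code{handlers} are hypothetical labels, and @samp{#} is assumed to
begin a comment.

@example
        jmp     done                # PC relative jump to the label done
        jmp     *%eax               # absolute: jump to the address in %eax
        call    *handlers(,%ebx,4)  # absolute: call through a table entry
done:
@end example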
3194
3195 Any instruction that has a memory operand @emph{must} specify its size (byte,
3196 word, or long) with an opcode suffix (@samp{b}, @samp{w}, or @samp{l},
3197 respectively).
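
For example, nothing in the operand @samp{(%ebx)} says how much memory
is meant, so the suffix has to carry that information.  This is a
sketch, and @samp{#} is assumed to begin a comment.

@example
        incb    (%ebx)          # increment the byte at (%ebx)
        incw    (%ebx)          # increment the 16-bit word at (%ebx)
        incl    (%ebx)          # increment the 32-bit long at (%ebx)
@end example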
3198
3199 @subsection Handling of Jump Instructions
3200 Jump instructions are always optimized to use the smallest possible
3201 displacements.  This is accomplished by using byte (8-bit) displacement
3202 jumps whenever the target is sufficiently close.  If a byte displacement
is insufficient, a long (32-bit) displacement is used.  We do not support
3204 word (16-bit) displacement jumps (i.e. prefixing the jump instruction
3205 with the @samp{addr16} opcode prefix), since the 80386 insists upon masking
3206 @samp{%eip} to 16 bits after the word displacement is added.
3207
3208 Note that the @samp{jcxz}, @samp{jecxz}, @samp{loop}, @samp{loopz},
3209 @samp{loope}, @samp{loopnz} and @samp{loopne} instructions only come in
3210 byte displacements, so that it is possible that use of these
3211 instructions (@code{GCC} does not use them) will cause the assembler to
3212 print an error message (and generate incorrect code).  The AT&T 80386
3213 assembler tries to get around this problem by expanding @samp{jcxz foo} to
3214 @example
3215          jcxz cx_zero
3216          jmp cx_nonzero
3217 cx_zero: jmp foo
3218 cx_nonzero:
3219 @end example
3220
3221 @subsection Floating Point
3222 All 80387 floating point types except packed BCD are supported.
3223 (BCD support may be added without much difficulty).  These data
3224 types are 16-, 32-, and 64- bit integers, and single (32-bit),
3225 double (64-bit), and extended (80-bit) precision floating point.
3226 Each supported type has an opcode suffix and a constructor
associated with it.  Opcode suffixes specify operands' data
3228 types.  Constructors build these data types into memory.
3229
3230 @itemize @bullet
3231 @item
3232 Floating point constructors are @samp{.float} or @samp{.single},
3233 @samp{.double}, and @samp{.tfloat} for 32-, 64-, and 80-bit formats.
3234 These correspond to opcode suffixes @samp{s}, @samp{l}, and @samp{t}.
@samp{t} stands for temporary real; note that the 80387 only supports
3236 this format via the @samp{fldt} (load temporary real to stack top) and
3237 @samp{fstpt} (store temporary real and pop stack) instructions.
3238
3239 @item
3240 Integer constructors are @samp{.word}, @samp{.long} or @samp{.int}, and
3241 @samp{.quad} for the 16-, 32-, and 64-bit integer formats.  The corresponding
3242 opcode suffixes are @samp{s} (single), @samp{l} (long), and @samp{q}
3243 (quad).  As with the temporary real format the 64-bit @samp{q} format is
3244 only present in the @samp{fildq} (load quad integer to stack top) and
3245 @samp{fistpq} (store quad integer and pop stack) instructions.
3246 @end itemize
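
The pairing of constructors and opcode suffixes can be sketched as
follows.  The labels and values are invented for the example, the
@samp{0f} flonum prefix is an assumption about how constants are
spelled, the instruction names are the ones given in this section, and
@samp{#} is assumed to begin a comment.

@example
        .data
half_s: .single 0f0.5           # 32-bit real, suffix s
half_l: .double 0f0.5           # 64-bit real, suffix l
half_t: .tfloat 0f0.5           # 80-bit temporary real, suffix t
count:  .long   10              # 32-bit integer, suffix l
big:    .quad   0               # 64-bit integer, suffix q

        .text
        flds    half_s          # load single precision real
        fldl    half_l          # load double precision real
        fldt    half_t          # load temporary real
        fildl   count           # load 32-bit integer
        fistpq  big             # store stack top as 64-bit integer and pop
@end example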
3247
3248 Register to register operations do not require opcode suffixes,
3249 so that @samp{fst %st, %st(1)} is equivalent to @samp{fstl %st, %st(1)}.
3250
Since the 80387 automatically synchronizes with the 80386, @samp{fwait}
3252 instructions are almost never needed (this is not the case for the
3253 80286/80287 and 8086/8087 combinations).  Therefore, @code{as} suppresses
3254 the @samp{fwait} instruction whenever it is implicitly selected by one
3255 of the @samp{fn@dots{}} instructions.  For example, @samp{fsave} and
3256 @samp{fnsave} are treated identically.  In general, all the @samp{fn@dots{}}
3257 instructions are made equivalent to @samp{f@dots{}} instructions.  If
3258 @samp{fwait} is desired it must be explicitly coded.
3259
3260 @subsection Notes
3261 There is some trickery concerning the @samp{mul} and @samp{imul}
3262 instructions that deserves mention.  The 16-, 32-, and 64-bit expanding
3263 multiplies (base opcode @samp{0xf6}; extension 4 for @samp{mul} and 5
3264 for @samp{imul}) can be output only in the one operand form.  Thus,
3265 @samp{imul %ebx, %eax} does @emph{not} select the expanding multiply;
3266 the expanding multiply would clobber the @samp{%edx} register, and this
3267 would confuse @code{GCC} output.  Use @samp{imul %ebx} to get the
3268 64-bit product in @samp{%edx:%eax}.
3269
3270 We have added a two operand form of @samp{imul} when the first operand
3271 is an immediate mode expression and the second operand is a register.
This is just a shorthand, so that multiplying @samp{%eax} by 69, for
3273 example, can be done with @samp{imul $69, %eax} rather than @samp{imul
3274 $69, %eax, %eax}.
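
The forms discussed in these notes can be compared in the following
sketch.  The explicit @samp{l} suffixes are added for clarity, and
@samp{#} is assumed to begin a comment.

@example
        imull   %ebx            # expanding: %edx:%eax = %eax * %ebx
        imull   %ebx, %eax      # non-expanding: %eax = %eax * %ebx,
                                # %edx is left alone
        imull   $69, %eax       # shorthand for imull $69, %eax, %eax
@end example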
3275 @end ignore
3276 @c pesch@cygnus.com: we also ignore the following chapters, but for
3277 @c                   a different reason---internals are changing
3278 @c                   rapidly.  These may need to be moved to another
3279 @c                   book anyhow, if we adopt the model of user/modifier
3280 @c                   books.
3281 @ignore
3282 @node Maintenance, Retargeting, Machine Dependent, Top
3283 @chapter Maintaining the Assembler
3284 [[this chapter is still being built]]
3285
3286 @section Design
3287 We had these goals, in descending priority:
3288 @table @b
3289 @item Accuracy.
3290 For every program composed by a compiler, @code{as} should emit
3291 ``correct'' code.  This leaves some latitude in choosing addressing
3292 modes, order of @code{relocation_info} structures in the object
3293 file, @emph{etc}.
3294
3295 @item Speed, for usual case.
3296 By far the most common use of @code{as} will be assembling compiler
3297 emissions.
3298
3299 @item Upward compatibility for existing assembler code.
3300 Well @dots{} we don't support Vax bit fields but everything else
3301 seems to be upward compatible.
3302
3303 @item Readability.
3304 The code should be maintainable with few surprises.  (JF: ha!)
3305
3306 @end table
3307
3308 We assumed that disk I/O was slow and expensive while memory was
3309 fast and access to memory was cheap.  We expect the in-memory data
3310 structures to be less than 10 times the size of the emitted object
3311 file.  (Contrast this with the C compiler where in-memory structures
3312 might be 100 times object file size!)
3313 This suggests:
3314 @itemize @bullet
3315 @item
3316 Try to read the source file from disk only one time.  For other
3317 reasons, we keep large chunks of the source file in memory during
3318 assembly so this is not a problem.  Also the assembly algorithm
3319 should only scan the source text once if the compiler composed the
3320 text according to a few simple rules.
3321 @item
3322 Emit the object code bytes only once.  Don't store values and then
3323 backpatch later.
3324 @item
3325 Build the object file in memory and do direct writes to disk of
3326 large buffers.
3327 @end itemize
3328
3329 RMS suggested a one-pass algorithm which seems to work well.  By not
3330 parsing text during a second pass considerable time is saved on
3331 large programs (@emph{e.g.} the sort of C program @code{yacc} would
3332 emit).
3333
3334 It happened that the data structures needed to emit relocation
3335 information to the object file were neatly subsumed into the data
3336 structures that do backpatching of addresses after pass 1.
3337
3338 Many of the functions began life as re-usable modules, loosely
3339 connected.  RMS changed this to gain speed.  For example, input
3340 parsing routines which used to work on pre-sanitized strings now
3341 must parse raw data.  Hence they have to import knowledge of the
assembler's comment conventions @emph{etc}.
3343
3344 @section Deprecated Feature(?)s
3345 We have stopped supporting some features:
3346 @itemize @bullet
3347 @item
3348 @code{.org} statements must have @b{defined} expressions.
3349 @item
3350 Vax Bit fields (@kbd{:} operator) are entirely unsupported.
3351 @end itemize
3352
3353 It might be a good idea to not support these features in a future release:
3354 @itemize @bullet
3355 @item
3356 @kbd{#} should begin a comment, even in column 1.
3357 @item
3358 Why support the logical line & file concept any more?
3359 @item
3360 Subsegments are a good candidate for flushing.
3361 Depends on which compilers need them I guess.
3362 @end itemize
3363
3364 @section Bugs, Ideas, Further Work
3365 Clearly the major improvement is DON'T USE A TEXT-READING
3366 ASSEMBLER for the back end of a compiler.  It is much faster to
3367 interpret binary gobbledygook from a compiler's tables than to
3368 ask the compiler to write out human-readable code just so the
3369 assembler can parse it back to binary.
3370
3371 Assuming you use @code{as} for human written programs: here are
3372 some ideas:
3373 @itemize @bullet
3374 @item
3375 Document (here) @code{APP}.
3376 @item
3377 Take advantage of knowing no spaces except after opcode
3378 to speed up @code{as}.  (Modify @code{app.c} to flush useless spaces:
3379 only keep space/tabs at begin of line or between 2
3380 symbols.)
3381 @item
3382 Put pointers in this documentation to @file{a.out} documentation.
3383 @item
3384 Split the assembler into parts so it can gobble direct binary
from @emph{e.g.} @code{cc}.  It is silly for @code{cc} to compose text
3386 just so @code{as} can parse it back to binary.
3387 @item
3388 Rewrite hash functions: I want a more modular, faster library.
3389 @item
3390 Clean up LOTS of code.
3391 @item
3392 Include all the non-@file{.c} files in the maintenance chapter.
3393 @item
3394 Document flonums.
3395 @item
3396 Implement flonum short literals.
3397 @item
3398 Change all talk of expression operands to expression quantities,
3399 or perhaps to expression arguments.
3400 @item
3401 Implement pass 2.
3402 @item
Whenever a @code{.text} or @code{.data} statement is seen, we close
off the current frag with an imaginary @code{.fill 0}.  This is
3405 because we only have one obstack for frags, and we can't grow new
3406 frags for a new subsegment, then go back to the old subsegment and
3407 append bytes to the old frag.  All this nonsense goes away if we
3408 give each subsegment its own obstack.  It makes code simpler in
3409 about 10 places, but nobody has bothered to do it because C compiler
3410 output rarely changes subsegments (compared to ending frags with
3411 relaxable addresses, which is common).
3412 @end itemize
3413
3414 @section Sources
3415 @c The following files in the @file{as} directory
3416 @c are symbolic links to other files, of
3417 @c the same name, in a different directory.
3418 @c @itemize @bullet
3419 @c @item
3420 @c @file{atof_generic.c}
3421 @c @item
3422 @c @file{atof_vax.c}
3423 @c @item
3424 @c @file{flonum_const.c}
3425 @c @item
3426 @c @file{flonum_copy.c}
3427 @c @item
3428 @c @file{flonum_get.c}
3429 @c @item
3430 @c @file{flonum_multip.c}
3431 @c @item
3432 @c @file{flonum_normal.c}
3433 @c @item
3434 @c @file{flonum_print.c}
3435 @c @end itemize
3436
3437 Here is a list of the source files in the @file{as} directory.
3438
3439 @table @file
3440 @item app.c
3441 This contains the pre-processing phase, which deletes comments,
3442 handles whitespace, etc.  This was recently re-written, since app
3443 used to be a separate program, but RMS wanted it to be inline.
3444
3445 @item append.c
3446 This is a subroutine to append a string to another string returning a
3447 pointer just after the last @code{char} appended.  (JF:  All these
3448 little routines should probably all be put in one file.)
3449
3450 @item as.c
3451 Here you will find the main program of the assembler @code{as}.
3452
3453 @item expr.c
3454 This is a branch office of @file{read.c}.  This understands
3455 expressions, arguments.  Inside @code{as}, arguments are called
3456 (expression) @emph{operands}.  This is confusing, because we also talk
3457 (elsewhere) about instruction @emph{operands}.  Also, expression
3458 operands are called @emph{quantities} explicitly to avoid confusion
3459 with instruction operands.  What a mess.
3460
3461 @item frags.c
3462 This implements the @b{frag} concept.  Without frags, finding the
3463 right size for branch instructions would be a lot harder.
3464
3465 @item hash.c
3466 This contains the symbol table, opcode table @emph{etc.} hashing
3467 functions.
3468
3469 @item hex_value.c
3470 This is a table of values of digits, for use in atoi() type
3471 functions.  Could probably be flushed by using calls to strtol(), or
3472 something similar.
3473
3474 @item input-file.c
This contains operating system dependent source file reading
3476 routines.  Since error messages often say where we are in reading
3477 the source file, they live here too.  Since @code{as} is intended to
3478 run under GNU and Unix only, this might be worth flushing.  Anyway,
3479 almost all C compilers support stdio.
3480
3481 @item input-scrub.c
3482 This deals with calling the pre-processor (if needed) and feeding the
3483 chunks back to the rest of the assembler the right way.
3484
3485 @item messages.c
3486 This contains operating system independent parts of fatal and
3487 warning message reporting.  See @file{append.c} above.
3488
3489 @item output-file.c
3490 This contains operating system dependent functions that write an
3491 object file for @code{as}.  See @file{input-file.c} above.
3492
3493 @item read.c
3494 This implements all the directives of @code{as}.  This also deals
3495 with passing input lines to the machine dependent part of the
3496 assembler.
3497
3498 @item strstr.c
3499 This is a C library function that isn't in most C libraries yet.
3500 See @file{append.c} above.
3501
3502 @item subsegs.c
3503 This implements subsegments.
3504
3505 @item symbols.c
3506 This implements symbols.
3507
3508 @item write.c
3509 This contains the code to perform relaxation, and to write out
3510 the object file.  It is mostly operating system independent, but
3511 different OSes have different object file formats in any case.
3512
3513 @item xmalloc.c
3514 This implements @code{malloc()} or bust.  See @file{append.c} above.
3515
3516 @item xrealloc.c
3517 This implements @code{realloc()} or bust.  See @file{append.c} above.
3518
3519 @item atof-generic.c
3520 The following files were taken from a machine-independent subroutine
3521 library for manipulating floating point numbers and very large
3522 integers.
3523
3524 @file{atof-generic.c} turns a string into a flonum internal format
3525 floating-point number.
3526
3527 @item flonum-const.c
3528 This contains some potentially useful floating point numbers in
3529 flonum format.
3530
3531 @item flonum-copy.c
3532 This copies a flonum.
3533
3534 @item flonum-multip.c
3535 This multiplies two flonums together.
3536
3537 @item bignum-copy.c
3538 This copies a bignum.
3539
3540 @end table
3541
3542 Here is a table of all the machine-specific files (this includes
3543 both source and header files).  Typically, there is a
3544 @var{machine}.c file, a @var{machine}-opcode.h file, and an
3545 atof-@var{machine}.c file.  The @var{machine}-opcode.h file should
3546 be identical to the one used by GDB (which uses it for disassembly.)
3547
3548 @table @file
3549
3550 @item atof-ieee.c
This contains code to turn a flonum into an IEEE literal constant.
This is used by the 680x0, 32x32, sparc, and i386 versions of @code{as}.
3553
3554 @item i386-opcode.h
3555 This is the opcode-table for the i386 version of the assembler.
3556
3557 @item i386.c
3558 This contains all the code for the i386 version of the assembler.
3559
3560 @item i386.h
3561 This defines constants and macros used by the i386 version of the assembler.
3562
3563 @item m-generic.h
Generic 68020 header file.  To be linked to m68k.h on a
3565 non-sun3, non-hpux system.
3566
3567 @item m-sun2.h
3568 68010 header file for Sun2 workstations.  Not well tested.  To be linked
3569 to m68k.h on a sun2.  (See also @samp{-DSUN_ASM_SYNTAX} in the
3570 @file{Makefile}.)
3571
3572 @item m-sun3.h
3573 68020 header file for Sun3 workstations.  To be linked to m68k.h before
3574 compiling on a Sun3 system.  (See also @samp{-DSUN_ASM_SYNTAX} in the
3575 @file{Makefile}.)
3576
3577 @item m-hpux.h
3578 68020 header file for a HPUX (system 5?) box.  Which box, which
3579 version of HPUX, etc?  I don't know.
3580
3581 @item m68k.h
3582 A hard- or symbolic- link to one of @file{m-generic.h},
3583 @file{m-hpux.h} or @file{m-sun3.h} depending on which kind of
3584 680x0 you are assembling for.   (See also @samp{-DSUN_ASM_SYNTAX} in the
3585 @file{Makefile}.)
3586
3587 @item m68k-opcode.h
3588 Opcode table for 68020.  This is now a link to the opcode table
3589 in the @code{GDB} source directory.
3590
3591 @item m68k.c
3592 All the mc680x0 code, in one huge, slow-to-compile file.
3593
3594 @item ns32k.c
3595 This contains the code for the ns32032/ns32532 version of the
3596 assembler.
3597
3598 @item ns32k-opcode.h
3599 This contains the opcode table for the ns32032/ns32532 version
3600 of the assembler.
3601
3602 @item vax-inst.h
3603 Vax specific file for describing Vax operands and other Vax-ish things.
3604
3605 @item vax-opcode.h
3606 Vax opcode table.
3607
3608 @item vax.c
3609 Vax specific parts of @code{as}.  Also includes the former files
3610 @file{vax-ins-parse.c}, @file{vax-reg-parse.c} and @file{vip-op.c}.
3611
3612 @item atof-vax.c
3613 Turns a flonum into a Vax constant.
3614
3615 @item vms.c
3616 This file contains the special code needed to put out a VMS
3617 style object file for the Vax.
3618
3619 @end table
3620
3621 Here is a list of the header files in the source directory.
3622 (Warning:  This section may not be very accurate.  I didn't
3623 write the header files; I just report them.)  Also note that I
3624 think many of these header files could be cleaned up or
3625 eliminated.
3626
3627 @table @file
3628
3629 @item a.out.h
3630 This describes the structures used to create the binary header data
3631 inside the object file.  Perhaps we should use the one in
3632 @file{/usr/include}?
3633
3634 @item as.h
3635 This defines all the globally useful things, and pulls in <stdio.h>
3636 and <assert.h>.
3637
3638 @item bignum.h
3639 This defines macros useful for dealing with bignums.
3640
3641 @item expr.h
Structure and macros for dealing with @code{expression()}.
3643
3644 @item flonum.h
3645 This defines the structure for dealing with floating point
3646 numbers.  It #includes @file{bignum.h}.
3647
3648 @item frags.h
This contains a macro for appending a byte to the current frag.
3650
3651 @item hash.h
3652 Structures and function definitions for the hashing functions.
3653
3654 @item input-file.h
3655 Function headers for the input-file.c functions.
3656
3657 @item md.h
Structures and function headers for things defined in the
3659 machine dependent part of the assembler.
3660
3661 @item obstack.h
3662 This is the GNU systemwide include file for manipulating obstacks.
3663 Since nobody is running under real GNU yet, we include this file.
3664
3665 @item read.h
3666 Macros and function headers for reading in source files.
3667
3668 @item struct-symbol.h
3669 Structure definition and macros for dealing with the gas
3670 internal form of a symbol.
3671
3672 @item subsegs.h
Structure definition for dealing with the numbered subsegments
3674 of the text and data segments.
3675
3676 @item symbols.h
3677 Macros and function headers for dealing with symbols.
3678
3679 @item write.h
3680 Structure for doing segment fixups.
3681 @end table
3682
3683 @comment ~subsection Test Directory
3684 @comment (Note:  The test directory seems to have disappeared somewhere
3685 @comment along the line.  If you want it, you'll probably have to find a
3686 @comment REALLY OLD dump tape~dots{})
3687 @comment
3688 @comment The ~file{test/} directory is used for regression testing.
3689 @comment After you modify ~@code{as}, you can get a quick go/nogo
3690 @comment confidence test by running the new ~@code{as} over the source
3691 @comment files in this directory.  You use a shell script ~file{test/do}.
3692 @comment
3693 @comment The tests in this suite are evolving.  They are not comprehensive.
3694 @comment They have, however, caught hundreds of bugs early in the debugging
3695 @comment cycle of ~@code{as}.  Most test statements in this suite were naturally
3696 @comment selected: they were used to demonstrate actual ~@code{as} bugs rather
@comment than being written ~i{a priori}.
3698 @comment
3699 @comment Another testing suggestion: over 30 bugs have been found simply by
3700 @comment running examples from this manual through ~@code{as}.
3701 @comment Some examples in this manual are selected
3702 @comment to distinguish boundary conditions; they are good for testing ~@code{as}.
3703 @comment
3704 @comment ~subsubsection Regression Testing
3705 @comment Each regression test involves assembling a file and comparing the
3706 @comment actual output of ~@code{as} to ``known good'' output files.  Both
3707 @comment the object file and the error/warning message file (stderr) are
3708 @comment inspected.  Optionally ~@code{as}' exit status may be checked.
@comment Discrepancies are reported.  Each discrepancy means either that
3710 @comment you broke some part of ~@code{as} or that the ``known good'' files
3711 @comment are now out of date and should be changed to reflect the new
3712 @comment definition of ``good''.
3713 @comment
3714 @comment Each regression test lives in its own directory, in a tree
3715 @comment rooted in the directory ~file{test/}.  Each such directory
3716 @comment has a name ending in ~file{.ret}, where `ret' stands for
3717 @comment REgression Test.  The ~file{.ret} ending allows ~code{find
3718 @comment (1)} to find all regression tests in the tree, without
3719 @comment needing to list them explicitly.
3720 @comment
3721 @comment Any ~file{.ret} directory must contain a file called
3722 @comment ~file{input} which is the source file to assemble.  During
3723 @comment testing an object file ~file{output} is created, as well as
3724 @comment a file ~file{stdouterr} which contains the output to both
3725 @comment stdout and stderr.  If there is a file ~file{output.good} in
3726 @comment the directory, and if ~file{output} contains exactly the
3727 @comment same data as ~file{output.good}, the file ~file{output} is
3728 @comment deleted.  Likewise ~file{stdouterr} is removed if it exactly
3729 @comment matches a file ~file{stdouterr.good}.  If file
3730 @comment ~file{status.good} is present, containing a decimal number
3731 @comment before a newline, the exit status of ~@code{as} is compared
3732 @comment to this number.  If the status numbers are not equal, a file
3733 @comment ~file{status} is written to the directory, containing the
3734 @comment actual status as a decimal number followed by newline.
3735 @comment
3736 @comment Should any of the ~file{*.good} files fail to match their corresponding
3737 @comment actual files, this is noted by a 1-line message on the screen during
3738 @comment the regression test, and you can use ~@code{find (1)} to find any
3739 @comment files named ~file{status}, ~file{output} or ~file{stdouterr}.
3740 @comment
3741 @node Retargeting, License, Maintenance, Top
3742 @chapter Teaching the Assembler about a New Machine
3743
3744 This chapter describes the steps required in order to make the
3745 assembler work with another machine's assembly language.  This
3746 chapter is not complete, and only describes the steps in the
3747 broadest terms.  You should look at the source for the
3748 currently supported machine in order to discover some of the
3749 details that aren't mentioned here.
3750
3751 You should create a new file called @file{@var{machine}.c}, and
3752 add the appropriate lines to the file @file{Makefile} so that
3753 you can compile your new version of the assembler.  This should
3754 be straightforward; simply add lines similar to the ones there
3755 for the four current versions of the assembler.
3756
3757 If you want to be compatible with GDB (and the current
3758 machine-dependent versions of the assembler), you should create
3759 a file called @file{@var{machine}-opcode.h} which should
3760 contain all the information about the names of the machine
3761 instructions, their opcodes, and what addressing modes they
3762 support.  If you do this right, the assembler and GDB can share
3763 this file, and you'll only have to write it once.  Note that
3764 while you're writing @code{as}, you may want to use an
3765 independent program (if you have access to one), to make sure
3766 that @code{as} is emitting the correct bytes.  Since @code{as}
3767 and @code{GDB} share the opcode table, an incorrect opcode
3768 table entry may make invalid bytes look OK when you disassemble
3769 them with @code{GDB}.
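
As a rough illustration of the kind of table such a file might hold, here
is a minimal sketch.  The structure layout, the field names, and the three
instructions are all hypothetical; they are not taken from any real
@file{@var{machine}-opcode.h}, and the real files differ from machine to
machine.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical opcode-table entry, invented for this sketch. */
struct example_opcode {
    const char *name;       /* mnemonic as written in the source */
    unsigned char opcode;   /* first byte of the encoded instruction */
    const char *operands;   /* one letter per operand: r = register,
                               i = immediate */
};

static const struct example_opcode example_opcodes[] = {
    { "nop",  0x00, ""   },
    { "movi", 0x10, "ri" },
    { "add",  0x20, "rr" },
};

#define EXAMPLE_NUM_OPCODES \
    (sizeof example_opcodes / sizeof example_opcodes[0])

/* Return the entry for NAME, or NULL if the mnemonic is unknown.
   An assembler would search by name like this; a disassembler could
   search the same table by opcode byte instead. */
static const struct example_opcode *example_lookup(const char *name)
{
    size_t i;

    for (i = 0; i < EXAMPLE_NUM_OPCODES; i++)
        if (strcmp(example_opcodes[i].name, name) == 0)
            return &example_opcodes[i];
    return NULL;
}
```

Because both programs would read the same entries, a wrong @code{opcode}
value in such a table would be wrong for both of them, which is why an
independent check of the emitted bytes is worthwhile.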
3770
3771 @section Functions You will Have to Write
3772
3773 Your file @file{@var{machine}.c} should contain definitions for
3774 the following functions and variables.  It will need to include
3775 some header files in order to use some of the structures
3776 defined in the machine-independent part of the assembler.  The
3777 needed header files are mentioned in the descriptions of the
3778 functions that will need them.
3779
3780 @table @code
3781
3782 @item long omagic;
3783 This long integer holds the value to place at the beginning of
3784 the @file{a.out} file.  It is usually @samp{OMAGIC}, except on
3785 machines that store additional information in the magic-number.
3786
3787 @item char comment_chars[];
3788 This character array holds the values of the characters that
3789 start a comment anywhere in a line.  Comments are stripped off
3790 automatically by the machine independent part of the
3791 assembler.  Note that @samp{/*} will always start a
3792 comment, and that only @samp{*/} will end a comment started by
3793 @samp{/*}.
3794
3795 @item char line_comment_chars[];
3796 This character array holds the values of the chars that start a
3797 comment only if they are the first (non-whitespace) character
3798 on a line.  If the character @samp{#} does not appear in this
3799 list, you may get unexpected results.  (Various
3800 machine-independent parts of the assembler treat the comments
3801 @samp{#APP} and @samp{#NO_APP} specially, and assume that lines
3802 that start with @samp{#} are comments.)
3803
3804 @item char EXP_CHARS[];
3805 This character array holds the letters that can separate the
3806 mantissa and the exponent of a floating point number.  Typical
3807 values are @samp{e} and @samp{E}.
3808
3809 @item char FLT_CHARS[];
3810 This character array holds the letters that--when they appear
3811 immediately after a leading zero--indicate that a number is a
3812 floating-point number.  (This is similar to the way @samp{0x} indicates that a
3813 hexadecimal number follows.)
3814
3815 @item pseudo_typeS md_pseudo_table[];
3816 (@var{pseudo_typeS} is defined in @file{md.h})
3817 This array contains a list of the machine-dependent directives
3818 the assembler must support.  It contains the name of each
3819 pseudo-op (without the leading @samp{.}), a pointer to a
3820 function to be called when that directive is encountered, and
3821 an integer argument to be passed to that function.
3822
3823 @item void md_begin(void)
3824 This function is called as part of the assembler's
3825 initialization.  It should do any initialization required by
3826 any of your other routines.
3827
3828 @item int md_parse_option(char **optionPTR, int *argcPTR, char ***argvPTR)
3829 This routine is called once for each option on the command line
3830 that the machine-independent part of @code{as} does not
3831 understand.  This function should return non-zero if the option
3832 pointed to by @var{optionPTR} is a valid option.  If it is not
3833 a valid option, this routine should return zero.  The variables
3834 @var{argcPTR} and @var{argvPTR} are provided in case the option
3835 requires a filename or something similar as an argument.  If
3836 the option is multi-character, @var{optionPTR} should be
3837 advanced past the end of the option, otherwise every letter in
3838 the option will be treated as a separate single-character
3839 option.
3840
3841 @item void md_assemble(char *string)
3842 This routine is called for every machine-dependent
3843 non-directive line in the source file.  It does all the real
3844 work involved in reading the opcode, parsing the operands,
3845 etc.  @var{string} is a pointer to a null-terminated string
3846 that comprises the input line, with all excess whitespace and
3847 comments removed.
3848
3849 @item void md_number_to_chars(char *outputPTR,long value,int nbytes)
3850 This routine is called to turn a C long int, short int, or char
3851 into the series of bytes that represents that number on the
3852 target machine.  @var{outputPTR} points to an array where the
3853 result should be stored; @var{value} is the value to store; and
3854 @var{nbytes} is the number of bytes in @var{value} that should be
3855 stored.
3856
3857 @item void md_number_to_imm(char *outputPTR,long value,int nbytes)
3858 This routine is called to turn a C long int, short int, or char
3859 into the series of bytes that represent an immediate value on
3860 the target machine.  It is identical to the function @code{md_number_to_chars},
3861 except on NS32K machines.@refill
3862
3863 @item void md_number_to_disp(char *outputPTR,long value,int nbytes)
3864 This routine is called to turn a C long int, short int, or char
3865 into the series of bytes that represent a displacement value on
3866 the target machine.  It is identical to the function @code{md_number_to_chars},
3867 except on NS32K machines.@refill
3868
3869 @item void md_number_to_field(char *outputPTR,long value,int nbytes)
3870 This routine is identical to @code{md_number_to_chars},
3871 except on NS32K machines.
3872
3873 @item void md_ri_to_chars(struct relocation_info *riPTR,ri)
3874 (@code{struct relocation_info} is defined in @file{a.out.h})
3875 This routine emits the relocation info in @var{ri}
3876 in the appropriate bit-pattern for the target machine.
3877 The result should be stored in the location pointed
3878 to by @var{riPTR}.  This routine may be a no-op unless you are
3879 attempting to do cross-assembly.
3880
3881 @item char *md_atof(char type,char *outputPTR,int *sizePTR)
3882 This routine turns a series of digits into the appropriate
3883 internal representation for a floating-point number.
3884 @var{type} is a character from @var{FLT_CHARS[]} that describes
3885 what kind of floating point number is wanted; @var{outputPTR}
3886 is a pointer to an array that the result should be stored in;
3887 and @var{sizePTR} is a pointer to an integer where the size (in
3888 bytes) of the result should be stored.  This routine should
3889 return an error message, or an empty string (not (char *)0) for
3890 success.
3891
3892 @item int md_short_jump_size;
3893 This variable holds the (maximum) size in bytes of a short (16
3894 bit or so) jump created by @code{md_create_short_jump()}.  This
3895 variable is used as part of the broken-word feature, and isn't
3896 needed if the assembler is compiled with
3897 @samp{-DWORKING_DOT_WORD}.
3898
3899 @item int md_long_jump_size;
3900 This variable holds the (maximum) size in bytes of a long (32
3901 bit or so) jump created by @code{md_create_long_jump()}.  This
3902 variable is used as part of the broken-word feature, and isn't
3903 needed if the assembler is compiled with
3904 @samp{-DWORKING_DOT_WORD}.
3905
3906 @item void md_create_short_jump(char *resultPTR,long from_addr,
3907 @code{long to_addr,fragS *frag,symbolS *to_symbol)}
3908 This function emits a jump from @var{from_addr} to @var{to_addr} in
3909 the array of bytes pointed to by @var{resultPTR}.  If this creates a
3910 type of jump that must be relocated, this function should call
3911 @code{fix_new()} with @var{frag} and @var{to_symbol}.  The jump
3912 emitted by this function may be smaller than @var{md_short_jump_size},
3913 but it must never create a larger one.
3914 (If it creates a smaller jump, the extra bytes of memory will not be
3915 used.)  This function is used as part of the broken-word feature,
3916 and isn't needed if the assembler is compiled with
3917 @samp{-DWORKING_DOT_WORD}.@refill
3918
3919 @item void md_create_long_jump(char *ptr,long from_addr,
3920 @code{long to_addr,fragS *frag,symbolS *to_symbol)}
3921 This function is similar to the previous function,
3922 @code{md_create_short_jump()}, except that it creates a long
3923 jump instead of a short one.  This function is used as part of
3924 the broken-word feature, and isn't needed if the assembler is
3925 compiled with @samp{-DWORKING_DOT_WORD}.
3926
3927 @item int md_estimate_size_before_relax(fragS *fragPTR,int segment_type)
3928 This function does the initial setting up for relaxation.  This
3929 includes forcing references to still-undefined symbols to the
3930 appropriate addressing modes.
3931
3932 @item relax_typeS md_relax_table[];
3933 (@var{relax_typeS} is defined in @file{md.h})
3934 This array describes the various machine dependent states a
3935 frag may be in before relaxation.  You will need one group of
3936 entries for each type of addressing mode you intend to relax.
3937
3938 @item void md_convert_frag(fragS *fragPTR)
3939 (@var{fragS} is defined in @file{as.h})
3940 This routine does the required cleanup after relaxation.
3941 Relaxation has changed the type of the frag to a type that can
3942 reach its destination.  This function should adjust the opcode
3943 of the frag to use the appropriate addressing mode.
3944 @var{fragPTR} points to the frag to clean up.
3945
3946 @item void md_end(void)
3947 This function is called just before the assembler exits.  It
3948 need not free up memory unless the operating system doesn't do
3949 it automatically on exit.  (In which case you'll also have to
3950 track down all the other places where the assembler allocates
3951 space but never frees it.)
3952
3953 @end table
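
To make a few of these definitions concrete, here is a minimal sketch for
an imaginary big-endian target.  The particular comment and
floating-point characters are assumptions chosen for illustration only;
the byte order is likewise an assumption, and a little-endian machine
would fill the array in the opposite direction.

```c
/* Hypothetical choices for an imaginary target. */
char comment_chars[] = ";";        /* starts a comment anywhere in a line */
char line_comment_chars[] = "#";   /* starts a comment at start of line */
char EXP_CHARS[] = "eE";           /* separates mantissa and exponent */
char FLT_CHARS[] = "fFdD";         /* 0f1.5, 0d1.5 and so on */

/* Store the low NBYTES bytes of VALUE at OUTPUTPTR, most significant
   byte first, as a big-endian target would expect. */
void md_number_to_chars(char *outputPTR, long value, int nbytes)
{
    unsigned long bits = (unsigned long) value;
    int i;

    for (i = nbytes - 1; i >= 0; i--) {
        outputPTR[i] = (char) (bits & 0xff);
        bits >>= 8;
    }
}
```

Working on an unsigned copy of @var{value} keeps the shifts well defined
for negative numbers, so the same loop serves for signed and unsigned
quantities.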
3954
3955 @section External Variables You will Need to Use
3956
3957 You will need to refer to or change the following external variables
3958 from within the machine-dependent part of the assembler.
3959
3960 @table @code
3961 @item extern char flagseen[];
3962 This array holds non-zero values in locations corresponding to
3963 the options that were on the command line.  Thus, if the
3964 assembler was called with @samp{-W}, @var{flagseen['W']} would
3965 be non-zero.
3966
3967 @item extern fragS *frag_now;
3968 This pointer points to the current frag--the frag that bytes
3969 are currently being added to.  If nothing else, you will need
3970 to pass it as an argument to various machine-independent
3971 functions.  It is maintained automatically by the
3972 frag-manipulating functions; you should never have to change it
3973 yourself.
3974
3975 @item extern LITTLENUM_TYPE generic_bignum[];
3976 (@var{LITTLENUM_TYPE} is defined in @file{bignum.h}.)
3977 This is where @dfn{bignums}--numbers larger than 32 bits--are
3978 returned when they are encountered in an expression. You will
3979 need to use this if you need to implement directives (or
3980 anything else) that must deal with these large numbers.
3981 @code{Bignums} are of @code{segT} @code{SEG_BIG} (defined in
3982 @file{as.h}), and have a positive @code{X_add_number}.  The
3983 @code{X_add_number} of a @code{bignum} is the number of
3984 @code{LITTLENUMS} in @var{generic_bignum} that the number takes
3985 up.
3986
3987 @item extern FLONUM_TYPE generic_floating_point_number;
3988 (@var{FLONUM_TYPE} is defined in @file{flonum.h}.)
3989 This is where @dfn{flonums}--floating-point numbers within
3990 expressions--are returned.  @code{Flonums} are of @code{segT}
3991 @code{SEG_BIG}, and have a negative @code{X_add_number}.
3992 @code{Flonums} are returned in a generic format.  You will have
3993 to write a routine to turn this generic format into the
3994 appropriate floating-point format for your machine.
3995
3996 @item extern int need_pass_2;
3997 If this variable is non-zero, the assembler has encountered an
3998 expression that cannot be assembled in a single pass.  Since
3999 the second pass isn't implemented, this flag means that the
4000 assembler is punting, and is only looking for additional syntax
4001 errors.  (Or something like that.)
4002
4003 @item extern segT now_seg;
4004 This variable holds the value of the segment the assembler is
4005 currently assembling into.
4006
4007 @end table
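
As an illustration of how a directive might consume a bignum, the
following sketch copies littlenums out of @var{generic_bignum} as bytes.
It rests on two assumptions that you should check against
@file{bignum.h}: that a littlenum holds 16 bits, and that
@code{generic_bignum[0]} holds the least significant littlenum.  The
typedef, the array size, and the function name are stand-ins invented for
this example.

```c
#include <stddef.h>

/* Stand-ins for the real declarations; a 16-bit littlenum is an
   assumption made for this sketch. */
typedef unsigned short LITTLENUM_TYPE;
LITTLENUM_TYPE generic_bignum[8];

/* Copy COUNT littlenums from generic_bignum into OUT, least
   significant byte first.  COUNT plays the role of the bignum's
   X_add_number.  Returns the number of bytes written. */
size_t example_bignum_to_bytes(int count, unsigned char *out)
{
    size_t n = 0;
    int i;

    for (i = 0; i < count; i++) {
        out[n++] = (unsigned char) (generic_bignum[i] & 0xff);
        out[n++] = (unsigned char) ((generic_bignum[i] >> 8) & 0xff);
    }
    return n;
}
```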
4008
4009 @section External Functions You will Need
4010
4011 You will find the following external functions useful (or
4012 indispensable) when you're writing the machine-dependent part
4013 of the assembler.
4014
4015 @table @code
4016
4017 @item char *frag_more(int bytes)
4018 This function allocates @var{bytes} more bytes in the current
4019 frag (or starts a new frag, if it can't expand the current frag
4020 any more) for you to store some object-file bytes in.  It
4021 returns a pointer to the bytes, ready for you to store data in.
4022
4023 @item void fix_new(fragS *frag, int where, short size, symbolS *add_symbol, symbolS *sub_symbol, long offset, int pcrel)
4024 This function stores a relocation fixup to be acted on later.
4025 @var{frag} points to the frag the relocation belongs in;
4026 @var{where} is the location within the frag where the relocation begins;
4027 @var{size} is the size of the relocation, and is usually 1 (a single byte),
4028   2 (sixteen bits), or 4 (a longword).
4029 The value @var{add_symbol} @minus{} @var{sub_symbol} + @var{offset} is added to the byte(s)
4030 at @var{frag->fr_literal[where]}.  If @var{pcrel} is non-zero, the address of the
4031 location is subtracted from the result.  A relocation entry is also added
4032 to the @file{a.out} file.  @var{add_symbol} and @var{sub_symbol} may be NULL,
4033 and @var{offset} may be zero.@refill
4034
4035 @item char *frag_var(relax_stateT type, int max_chars, int var,
4036 @code{relax_substateT subtype, symbolS *symbol, char *opcode)}
4037 This function creates a machine-dependent frag of type @var{type}
4038 (usually @code{rs_machine_dependent}).
4039 @var{max_chars} is the maximum size in bytes that the frag may grow by;
4040 @var{var} is the current size of the variable end of the frag;
4041 @var{subtype} is the sub-type of the frag.  The sub-type is used to index into
4042 @var{md_relax_table[]} during @code{relaxation}.
4043 @var{symbol} is the symbol whose value should be used when relaxing this frag.
4044 @var{opcode} points to a byte whose value may have to be modified if the
4045 addressing mode used by this frag changes.  It typically points into the
4046 @var{fr_literal[]} of the previous frag, and is used to point to a location
4047 that @code{md_convert_frag()} may have to change.@refill
4048
4049 @item void frag_wane(fragS *fragPTR)
4050 This function is useful from within @code{md_convert_frag}.  It
4051 changes a frag to type rs_fill, and sets the variable-sized
4052 piece of the frag to zero.  The frag will never change in size
4053 again.
4054
4055 @item segT expression(expressionS *retval)
4056 (@var{segT} is defined in @file{as.h}; @var{expressionS} is defined in @file{expr.h})
4057 This function parses the string pointed to by the external char
4058 pointer @var{input_line_pointer}, and returns the segment-type
4059 of the expression.  It also stores the results in the
4060 @var{expressionS} pointed to by @var{retval}.
4061 @var{input_line_pointer} is advanced to point past the end of
4062 the expression.  (@var{input_line_pointer} is used by other
4063 parts of the assembler.  If you modify it, be sure to restore
4064 it to its original value.)
4065
4066 @item as_warn(char *message,@dots{})
4067 If warning messages are disabled, this function does nothing.
4068 Otherwise, it prints out the current file name, and the current
4069 line number, then uses @code{fprintf} to print the
4070 @var{message} and any arguments it was passed.
4071
4072 @item as_bad(char *message,@dots{})
4073 This function should be called when @code{as} encounters
4074 conditions that are bad enough that @code{as} should not
4075 produce an object file, but should continue reading input and
4076 printing warning and bad error messages.
4077
4078 @item as_fatal(char *message,@dots{})
4079 This function prints out the current file name and line number,
4080 prints the word @samp{FATAL:}, then uses @code{fprintf} to
4081 print the @var{message} and any arguments it was passed.  Then
4082 the assembler exits.  This function should only be used for
4083 serious, unrecoverable errors.
4084
4085 @item void float_const(int float_type)
4086 This function reads floating-point constants from the current
4087 input line, and calls @code{md_atof} to assemble them.  It is
4088 useful as the function to call for the directives
4089 @samp{.single}, @samp{.double}, @samp{.float}, etc.
4090 @var{float_type} must be a character from @var{FLT_CHARS}.
4091
4092 @item void demand_empty_rest_of_line(void)
4093 This function can be used by machine-dependent directives to
4094 make sure the rest of the input line is empty.  It prints a
4095 warning message if there are additional characters on the line.
4096
4097 @item long int get_absolute_expression(void)
4098 This function can be used by machine-dependent directives to
4099 read an absolute number from the current input line.  It
4100 returns the result.  If it isn't given an absolute expression,
4101 it prints a warning message and returns zero.
4102
4103 @end table
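
The following sketch shows how @code{md_assemble} might use
@code{frag_more} to emit bytes.  It is only an outline under stated
assumptions: the two mnemonics and their encodings are invented, the
@code{frag_more} shown here is a stub that hands out bytes from a fixed
buffer (with no overflow check) so that the example stands alone, and an
error counter takes the place of a call to @code{as_bad}.

```c
#include <string.h>

/* Stub standing in for the real frag_more(): it hands out bytes from
   a fixed buffer and does not check for overflow. */
static char example_output[64];
static int example_used;

char *frag_more(int bytes)
{
    char *p = example_output + example_used;

    example_used += bytes;
    return p;
}

/* Counts rejected lines; real code would call as_bad() instead. */
static int example_errors;

/* Assemble one line for an imaginary machine with two instructions.
   The encodings are hypothetical. */
void md_assemble(char *string)
{
    char *p;

    if (strcmp(string, "nop") == 0) {
        p = frag_more(1);       /* one-byte instruction */
        p[0] = 0x00;
    } else if (strcmp(string, "halt") == 0) {
        p = frag_more(2);       /* two-byte instruction */
        p[0] = 0x7f;
        p[1] = 0x01;
    } else {
        example_errors++;       /* unknown mnemonic */
    }
}
```

An instruction whose operand refers to a symbol would follow the same
pattern, then call @code{fix_new} with @var{frag_now} and the offset of
the operand bytes so that the value can be filled in later.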
4104
4105
4106 @section The Concept of Frags
4107
4108 This assembler works to optimize the size of certain addressing
4109 modes (e.g. branch instructions).  This means the size of many
4110 pieces of object code cannot be determined until after assembly
4111 is finished.  (This means that the addresses of symbols cannot be
4112 determined until assembly is finished.)  In order to do this,
4113 @code{as} stores the output bytes as @dfn{frags}.
4114
4115 Here is the definition of a frag (from @file{as.h}):
4116 @example
4117 struct frag
4118 @{
4119         long int fr_fix;
4120         long int fr_var;
4121         relax_stateT fr_type;
4122         relax_substateT fr_substate;
4123         unsigned long fr_address;
4124         long int fr_offset;
4125         struct symbol *fr_symbol;
4126         char *fr_opcode;
4127         struct frag *fr_next;
4128         char fr_literal[];
4129 @};
4130 @end example
4131
4132 @table @var
4133 @item fr_fix
4134 is the size of the fixed-size piece of the frag.
4135
4136 @item fr_var
4137 is the maximum (?) size of the variable-sized piece of the frag.
4138
4139 @item fr_type
4140 is the type of the frag.
4141 Current types are
4142 @code{rs_fill},
4143 @code{rs_align},
4144 @code{rs_org}, and
4145 @code{rs_machine_dependent}.
4146
4147 @item fr_substate
4148 This stores the type of machine-dependent frag this is (what
4149 kind of addressing mode is being used, and what size is being
4150 tried/will fit/etc.).
4151
4152 @item fr_address
4153 @var{fr_address} is only valid after relaxation is finished.
4154 Before relaxation, the only way to store an address is (pointer
4155 to frag containing the address) plus (offset into the frag).
4156
4157 @item fr_offset
4158 This contains a number, whose meaning depends on the type of
4159 the frag.
4160 For machine-dependent frags, this contains the offset from
4161 @var{fr_symbol} that the frag wants to go to.  Thus, for branch
4162 instructions it is usually zero (unless the instruction was
4163 @samp{jba foo+12} or something like that).
4164
4165 @item fr_symbol
4166 For machine-dependent frags, this points to the symbol the frag
4167 needs to reach.
4168
4169 @item fr_opcode
4170 This points to the location in the frag (or in a previous frag)
4171 of the opcode for the instruction that caused this to be a frag.
4172 @var{fr_opcode} is needed if the actual opcode must be changed
4173 in order to use a different form of the addressing mode.
4174 (For example, if a conditional branch only comes in size tiny,
4175 a large-size branch could be implemented by reversing the sense
4176 of the test, and turning it into a tiny branch over a large jump.
4177 This would require changing the opcode.)
4178
@item fr_literal
4179 @var{fr_literal} is a variable-size array that contains the
4180 actual object bytes.  A frag consists of a fixed size piece of
4181 object data (which may be zero bytes long), followed by a
4182 piece of object data whose size may not have been determined
4183 yet.  Other information includes the type of the frag (which
4184 controls how it is relaxed).
4185
4186 @item fr_next
4187 This is the next frag in the singly-linked list.  This is
4188 usually only needed by the machine-independent part of
4189 @code{as}.
4190
4191 @end table
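
The relationship between frags and addresses can be sketched as follows.
This is a simplified illustration, not the code @code{as} uses: the
structure keeps only the fields the example needs, the function names are
invented, and it assumes that once relaxation has settled each frag
occupies exactly @var{fr_fix} plus @var{fr_var} bytes.

```c
#include <stddef.h>

/* Cut-down frag holding only the fields this sketch needs. */
struct example_frag {
    long fr_fix;                    /* size of the fixed part */
    long fr_var;                    /* size of the variable part */
    unsigned long fr_address;       /* valid only after relaxation */
    struct example_frag *fr_next;   /* next frag in the list */
};

/* Walk the list once all sizes are final and give each frag its
   address.  Returns the total size of the chain. */
unsigned long example_assign_addresses(struct example_frag *first)
{
    unsigned long address = 0;
    struct example_frag *f;

    for (f = first; f != NULL; f = f->fr_next) {
        f->fr_address = address;
        address += (unsigned long) (f->fr_fix + f->fr_var);
    }
    return address;
}

/* Turn a (frag, offset) pair into an absolute address.  Only
   meaningful after example_assign_addresses() has run. */
unsigned long example_resolve(const struct example_frag *frag, long offset)
{
    return frag->fr_address + (unsigned long) offset;
}
```

Until the walk has been done, a location can only be named by the pair
(frag, offset), because any earlier frag may still grow or shrink.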
4192 @end ignore
4193
4194 @node License,  , Retargeting, Top
4195 @unnumbered GNU GENERAL PUBLIC LICENSE
4196 @center Version 1, February 1989
4197
4198 @display
4199 Copyright @copyright{} 1989 Free Software Foundation, Inc.
4200 675 Mass Ave, Cambridge, MA 02139, USA
4201
4202 Everyone is permitted to copy and distribute verbatim copies
4203 of this license document, but changing it is not allowed.
4204 @end display
4205
4206 @unnumberedsec Preamble
4207
4208   The license agreements of most software companies try to keep users
4209 at the mercy of those companies.  By contrast, our General Public
4210 License is intended to guarantee your freedom to share and change free
4211 software---to make sure the software is free for all its users.  The
4212 General Public License applies to the Free Software Foundation's
4213 software and to any other program whose authors commit to using it.
4214 You can use it for your programs, too.
4215
4216   When we speak of free software, we are referring to freedom, not
4217 price.  Specifically, the General Public License is designed to make
4218 sure that you have the freedom to give away or sell copies of free
4219 software, that you receive source code or can get it if you want it,
4220 that you can change the software or use pieces of it in new free
4221 programs; and that you know you can do these things.
4222
4223   To protect your rights, we need to make restrictions that forbid
4224 anyone to deny you these rights or to ask you to surrender the rights.
4225 These restrictions translate to certain responsibilities for you if you
4226 distribute copies of the software, or if you modify it.
4227
4228   For example, if you distribute copies of a such a program, whether
4229 gratis or for a fee, you must give the recipients all the rights that
4230 you have.  You must make sure that they, too, receive or can get the
4231 source code.  And you must tell them their rights.
4232
4233   We protect your rights with two steps: (1) copyright the software, and
4234 (2) offer you this license which gives you legal permission to copy,
4235 distribute and/or modify the software.
4236
4237   Also, for each author's protection and ours, we want to make certain
4238 that everyone understands that there is no warranty for this free
4239 software.  If the software is modified by someone else and passed on, we
4240 want its recipients to know that what they have is not the original, so
4241 that any problems introduced by others will not reflect on the original
4242 authors' reputations.
4243
4244   The precise terms and conditions for copying, distribution and
4245 modification follow.
4246
4247 @iftex
4248 @unnumberedsec TERMS AND CONDITIONS
4249 @end iftex
4250 @ifinfo
4251 @center TERMS AND CONDITIONS
4252 @end ifinfo
4253
4254 @enumerate
4255 @item
4256 This License Agreement applies to any program or other work which
4257 contains a notice placed by the copyright holder saying it may be
4258 distributed under the terms of this General Public License.  The
4259 ``Program'', below, refers to any such program or work, and a ``work based
4260 on the Program'' means either the Program or any work containing the
4261 Program or a portion of it, either verbatim or with modifications.  Each
4262 licensee is addressed as ``you''.
4263
4264 @item
4265 You may copy and distribute verbatim copies of the Program's source
4266 code as you receive it, in any medium, provided that you conspicuously and
4267 appropriately publish on each copy an appropriate copyright notice and
4268 disclaimer of warranty; keep intact all the notices that refer to this
4269 General Public License and to the absence of any warranty; and give any
4270 other recipients of the Program a copy of this General Public License
4271 along with the Program.  You may charge a fee for the physical act of
4272 transferring a copy.
4273
4274 @item
4275 You may modify your copy or copies of the Program or any portion of
4276 it, and copy and distribute such modifications under the terms of Paragraph
4277 1 above, provided that you also do the following:
4278
4279 @itemize @bullet
4280 @item
4281 cause the modified files to carry prominent notices stating that
4282 you changed the files and the date of any change; and
4283
4284 @item
4285 cause the whole of any work that you distribute or publish, that
4286 in whole or in part contains the Program or any part thereof, either
4287 with or without modifications, to be licensed at no charge to all
4288 third parties under the terms of this General Public License (except
4289 that you may choose to grant warranty protection to some or all
4290 third parties, at your option).
4291
4292 @item
4293 If the modified program normally reads commands interactively when
4294 run, you must cause it, when started running for such interactive use
4295 in the simplest and most usual way, to print or display an
4296 announcement including an appropriate copyright notice and a notice
4297 that there is no warranty (or else, saying that you provide a
4298 warranty) and that users may redistribute the program under these
4299 conditions, and telling the user how to view a copy of this General
4300 Public License.
4301
4302 @item
4303 You may charge a fee for the physical act of transferring a
4304 copy, and you may at your option offer warranty protection in
4305 exchange for a fee.
4306 @end itemize
4307
4308 Mere aggregation of another independent work with the Program (or its
4309 derivative) on a volume of a storage or distribution medium does not bring
4310 the other work under the scope of these terms.
4311
4312 @item
4313 You may copy and distribute the Program (or a portion or derivative of
4314 it, under Paragraph 2) in object code or executable form under the terms of
4315 Paragraphs 1 and 2 above provided that you also do one of the following:
4316
4317 @itemize @bullet
4318 @item
4319 accompany it with the complete corresponding machine-readable
4320 source code, which must be distributed under the terms of
4321 Paragraphs 1 and 2 above; or,
4322
4323 @item
4324 accompany it with a written offer, valid for at least three
4325 years, to give any third party free (except for a nominal charge
4326 for the cost of distribution) a complete machine-readable copy of the
4327 corresponding source code, to be distributed under the terms of
4328 Paragraphs 1 and 2 above; or,
4329
4330 @item
4331 accompany it with the information you received as to where the
4332 corresponding source code may be obtained.  (This alternative is
4333 allowed only for noncommercial distribution and only if you
4334 received the program in object code or executable form alone.)
4335 @end itemize
4336
4337 Source code for a work means the preferred form of the work for making
4338 modifications to it.  For an executable file, complete source code means
4339 all the source code for all modules it contains; but, as a special
4340 exception, it need not include source code for modules which are standard
4341 libraries that accompany the operating system on which the executable
4342 file runs, or for standard header files or definitions files that
4343 accompany that operating system.
4344
4345 @item
4346 You may not copy, modify, sublicense, distribute or transfer the
4347 Program except as expressly provided under this General Public License.
4348 Any attempt otherwise to copy, modify, sublicense, distribute or transfer
4349 the Program is void, and will automatically terminate your rights to use
4350 the Program under this License.  However, parties who have received
4351 copies, or rights to use copies, from you under this General Public
4352 License will not have their licenses terminated so long as such parties
4353 remain in full compliance.
4354
4355 @item
4356 By copying, distributing or modifying the Program (or any work based
4357 on the Program) you indicate your acceptance of this license to do so,
4358 and all its terms and conditions.
4359
4360 @item
4361 Each time you redistribute the Program (or any work based on the
4362 Program), the recipient automatically receives a license from the original
4363 licensor to copy, distribute or modify the Program subject to these
4364 terms and conditions.  You may not impose any further restrictions on the
4365 recipients' exercise of the rights granted herein.
4366
4367 @item
4368 The Free Software Foundation may publish revised and/or new versions
4369 of the General Public License from time to time.  Such new versions will
4370 be similar in spirit to the present version, but may differ in detail to
4371 address new problems or concerns.
4372
4373 Each version is given a distinguishing version number.  If the Program
4374 specifies a version number of the license which applies to it and ``any
4375 later version'', you have the option of following the terms and conditions
4376 either of that version or of any later version published by the Free
4377 Software Foundation.  If the Program does not specify a version number of
4378 the license, you may choose any version ever published by the Free Software
4379 Foundation.
4380
4381 @item
4382 If you wish to incorporate parts of the Program into other free
4383 programs whose distribution conditions are different, write to the author
4384 to ask for permission.  For software which is copyrighted by the Free
4385 Software Foundation, write to the Free Software Foundation; we sometimes
4386 make exceptions for this.  Our decision will be guided by the two goals
4387 of preserving the free status of all derivatives of our free software and
4388 of promoting the sharing and reuse of software generally.
4389
4390 @iftex
4391 @heading NO WARRANTY
4392 @end iftex
4393 @ifinfo
4394 @center NO WARRANTY
4395 @end ifinfo
4396
4397 @item
4398 BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
4399 FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW.  EXCEPT WHEN
4400 OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
4401 PROVIDE THE PROGRAM ``AS IS'' WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
4402 OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
4403 MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE RISK AS
4404 TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.  SHOULD THE
4405 PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
4406 REPAIR OR CORRECTION.
4407
4408 @item
4409 IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL
4410 ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
4411 REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
4412 INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES
4413 ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT
4414 LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES
4415 SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE
4416 WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN
4417 ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
4418 @end enumerate
4419
4420 @iftex
4421 @heading END OF TERMS AND CONDITIONS
4422 @end iftex
4423 @ifinfo
4424 @center END OF TERMS AND CONDITIONS
4425 @end ifinfo
4426
4427 @page
4428 @unnumberedsec Appendix: How to Apply These Terms to Your New Programs
4429
4430   If you develop a new program, and you want it to be of the greatest
4431 possible use to humanity, the best way to achieve this is to make it
4432 free software which everyone can redistribute and change under these
4433 terms.
4434
4435   To do so, attach the following notices to the program.  It is safest to
4436 attach them to the start of each source file to most effectively convey
4437 the exclusion of warranty; and each file should have at least the
4438 ``copyright'' line and a pointer to where the full notice is found.
4439
4440 @smallexample
4441 @var{one line to give the program's name and a brief idea of what it does.}
4442 Copyright (C) 19@var{yy}  @var{name of author}
4443
4444 This program is free software; you can redistribute it and/or modify
4445 it under the terms of the GNU General Public License as published by
4446 the Free Software Foundation; either version 1, or (at your option)
4447 any later version.
4448
4449 This program is distributed in the hope that it will be useful,
4450 but WITHOUT ANY WARRANTY; without even the implied warranty of
4451 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
4452 GNU General Public License for more details.
4453
4454 You should have received a copy of the GNU General Public License
4455 along with this program; if not, write to the Free Software
4456 Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
4457 @end smallexample
4458
4459 Also add information on how to contact you by electronic and paper mail.
4460
4461 If the program is interactive, make it output a short notice like this
4462 when it starts in an interactive mode:
4463
4464 @smallexample
4465 Gnomovision version 69, Copyright (C) 19@var{yy} @var{name of author}
4466 Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
4467 This is free software, and you are welcome to redistribute it
4468 under certain conditions; type `show c' for details.
4469 @end smallexample
4470
4471 The hypothetical commands `show w' and `show c' should show the
4472 appropriate parts of the General Public License.  Of course, the
4473 commands you use may be called something other than `show w' and `show
4474 c'; they could even be mouse-clicks or menu items---whatever suits your
4475 program.
4476
4477 You should also get your employer (if you work as a programmer) or your
4478 school, if any, to sign a ``copyright disclaimer'' for the program, if
4479 necessary.  Here is a sample; alter the names:
4480
4481 @example
4482 Yoyodyne, Inc., hereby disclaims all copyright interest in the
4483 program `Gnomovision' (a program to direct compilers to make passes
4484 at assemblers) written by James Hacker.
4485
4486 @var{signature of Ty Coon}, 1 April 1989
4487 Ty Coon, President of Vice
4488 @end example
4489
4490 That's all there is to it!
4491
4492
4493 @summarycontents
4494 @contents
4495 @bye