gas/doc/as.texinfo

   1 \input texinfo
   2 @c @tex
   3 @c \special{twoside}
   4 @c @end tex
   5 @setfilename as
   6 @synindex ky cp
   7 @ifinfo
   8 This file documents the GNU Assembler "as".
   9
  10 Copyright (C) 1991 Free Software Foundation, Inc.
  11
  12 Permission is granted to make and distribute verbatim copies of
  13 this manual provided the copyright notice and this permission notice
  14 are preserved on all copies.
  15
  16 @ignore
  17 Permission is granted to process this file through Tex and print the
  18 results, provided the printed document carries copying permission
  19 notice identical to this one except for the removal of this paragraph
  20 (this paragraph not being relevant to the printed manual).
  21
  22 @end ignore
  23 Permission is granted to copy and distribute modified versions of this
  24 manual under the conditions for verbatim copying, provided also that the
  25 section entitled ``GNU General Public License'' is included exactly as
  26 in the original, and provided that the entire resulting derived work is
  27 distributed under the terms of a permission notice identical to this
  28 one.
  29
  30 Permission is granted to copy and distribute translations of this manual
  31 into another language, under the above conditions for modified versions,
  32 except that the section entitled ``GNU General Public License'' may be
  33 included in a translation approved by the author instead of in the
  34 original English.
  35 @end ifinfo
  36
  37 @setchapternewpage odd
  38 @c if m680x0
  39 @c @settitle Using GNU as (680x0)
  40 @c fi m680x0
  41 @c if am29k
  42 @settitle Using GNU as (AMD 29K)
  43 @c fi am29k
  44 @titlepage
  45 @finalout
  46 @title{Using GNU as}
  47 @subtitle{The GNU Assembler}
  48 @c if m680x0
  49 @c @subtitle{for Motorola 680x0}
  50 @c fi m680x0
  51 @c if am29k
  52 @subtitle{for the AMD 29K family}
  53 @c fi am29k
  54 @sp 1
  55 @subtitle February 1991
  56 @sp 13
  57 The Free Software Foundation Inc.  thanks The Nice Computer
  58 Company of Australia for loaning Dean Elsner to write the
  59 first (Vax) version of @code{as} for Project GNU.
  60 The proprietors, management and staff of TNCCA thank FSF for
  61 distracting the boss while they got some work
  62 done.
  63 @sp 3
  64 @author{Dean Elsner, Jay Fenlason & friends}
  65 @author{revised by Roland Pesch for Cygnus Support}
  66 @c pesch@cygnus.com
  67 @page
  68 @tex
  69 \def\$#1${{#1}}  % Kluge: collect RCS revision info without $...$
  70 \xdef\manvers{\$Revision$}  % For use in headers, footers too
  71 {\parskip=0pt
  72 \hfill Cygnus Support\par
  73 \hfill \manvers\par
  74 \hfill \TeX{}info \texinfoversion\par
  75 }
  76 %"boxit" macro for figures:
  77 %Modified from Knuth's ``boxit'' macro from TeXbook (answer to exercise 21.3)
  78 \gdef\boxit#1#2{\vbox{\hrule\hbox{\vrule\kern3pt
  79      \vbox{\parindent=0pt\parskip=0pt\hsize=#1\kern3pt\strut\hfil
  80 #2\hfil\strut\kern3pt}\kern3pt\vrule}\hrule}}%box with visible outline
  81 \gdef\ibox#1#2{\hbox to #1{#2\hfil}\kern8pt}% invisible box
  82 @end tex
  83
  84 @vskip 0pt plus 1filll
  85 Copyright @copyright{} 1991 Free Software Foundation, Inc.
  86
  87 Permission is granted to make and distribute verbatim copies of
  88 this manual provided the copyright notice and this permission notice
  89 are preserved on all copies.
  90
  91 Permission is granted to copy and distribute modified versions of this
  92 manual under the conditions for verbatim copying, provided also that the
  93 section entitled ``GNU General Public License'' is included exactly as
  94 in the original, and provided that the entire resulting derived work is
  95 distributed under the terms of a permission notice identical to this
  96 one.
  97
  98 Permission is granted to copy and distribute translations of this manual
  99 into another language, under the above conditions for modified versions,
 100 except that the section entitled ``GNU General Public License'' may be
 101 included in a translation approved by the author instead of in the
 102 original English.
 103 @end titlepage
 104 @page
 105
 106 @node Top, Overview, (dir), (dir)
 107
 108 @menu
 109 * Overview::                    Overview
 110 * Syntax::                      Syntax
 111 * Segments::                    Segments and Relocation
 112 * Symbols::                     Symbols
 113 * Expressions::                 Expressions
 114 * Pseudo Ops::                  Assembler Directives
 115 * Maintenance::                 Maintaining the Assembler
 116 * Retargeting::                 Teaching the Assembler about a New Machine
 117 * License::                     GNU GENERAL PUBLIC LICENSE
 118
 119  --- The Detailed Node Listing ---
 120
 121 Overview
 122
 123 * Invoking::                    Invoking @code{as}
 124 * Manual::                      Structure of this Manual
 125 * GNU Assembler::               as, the GNU Assembler
 126 * Command Line::                Command Line
 127 * Input Files::                 Input Files
 128 * Object::                      Output (Object) File
 129 * Errors::                      Error and Warning Messages
 130 * Options::                     Options
 131
 132 Input Files
 133
 134 * Filenames::                   Input Filenames and Line-numbers
 135
 136 Syntax
 137
 138 * Pre-processing::              Pre-processing
 139 * Whitespace::                  Whitespace
 140 * Comments::                    Comments
 141 * Symbol Intro::                Symbols
 142 * Statements::                  Statements
 143 * Constants::                   Constants
 144
 145 Constants
 146
 147 * Characters::                  Character Constants
 148 * Numbers::                     Number Constants
 149
 150 Character Constants
 151
 152 * Strings::                     Strings
 153 * Chars::                       Characters
 154
 155 Segments and Relocation
 156
 157 * Segs Background::             Background
 158 * ld Segments::                 ld Segments
 159 * as Segments::                 as Internal Segments
 160 * Sub-Segments::                Sub-Segments
 161 * bss::                         bss Segment
 162
 170 Symbols
 171
 172 * Labels::                      Labels
 173 * Setting Symbols::             Giving Symbols Other Values
 174 * Symbol Names::                Symbol Names
 175 * Dot::                         The Special Dot Symbol
 176 * Symbol Attributes::           Symbol Attributes
 177
 178 Symbol Names
 179
 180 * Local Symbols::               Local Symbol Names
 181
 182 Symbol Attributes
 183
 184 * Symbol Value::                Value
 185 * Symbol Type::                 Type
 186 * Symbol Desc::                 Descriptor
 187 * Symbol Other::                Other
 188
 189 Expressions
 190
 191 * Empty Exprs::                 Empty Expressions
 192 * Integer Exprs::               Integer Expressions
 193
 194 Integer Expressions
 195
 196 * Arguments::                   Arguments
 197 * Operators::                   Operators
 198 * Prefix Ops::                  Prefix Operators
 199 * Infix Ops::                   Infix Operators
 200
 201 Assembler Directives
 202
 203 * Abort::                       The Abort directive causes as to abort
 204 * Align::                       Pad the location counter to a power of 2
 205 * App-File::                    Set the logical file name
 206 * Ascii::                       Fill memory with bytes of ASCII characters
 207 * Asciz::                       Fill memory with bytes of ASCII characters followed
 208                 by a null.
 209 * Byte::                        Fill memory with 8-bit integers
 210 * Comm::                        Reserve public space in the BSS segment
 211 * Data::                        Change to the data segment
 212 * Desc::                        Set the n_desc of a symbol
 213 * Double::                      Fill memory with double-precision floating-point numbers
 214 * Else::                        @code{.else}
 215 * End::                         @code{.end}
 216 * Endif::                       @code{.endif}
 217 * Equ::                         @code{.equ @var{symbol}, @var{expression}}
 218 * Extern::                      @code{.extern}
 219 * Fill::                        Fill memory with repeated values
 220 * Float::                       Fill memory with single-precision floating-point numbers
 221 * Global::                      Make a symbol visible to the linker
 222 * Ident::                       @code{.ident}
 223 * If::                          @code{.if @var{absolute expression}}
 224 * Include::                     @code{.include "@var{file}"}
 225 * Int::                         Fill memory with 32-bit integers
 226 * Lcomm::                       Reserve private space in the BSS segment
 227 * Line::                        Set the logical line number
 228 * Ln::                          @code{.ln @var{line-number}}
 229 * List::                        @code{.list}, @code{.nolist}, @code{.eject}, @code{.lflags}, @code{.title}, @code{.sbttl}
 230 * Long::                        Fill memory with 32-bit integers
 231 * Lsym::                        Create a local symbol
 232 * Octa::                        Fill memory with 128-bit integers
 233 * Org::                         Change the location counter
 234 * Quad::                        Fill memory with 64-bit integers
 235 * Set::                         Set the value of a symbol
 236 * Short::                       Fill memory with 16-bit integers
 237 * Single::                      @code{.single @var{flonums}}
 238 * Stab::                        Store debugging information
 239 * Text::                        Change to the text segment
 240 @c if am29k or sparc
 241 * Word::                        Fill memory with 32-bit integers
 242 @c else (not am29k or sparc)
 243 * Deprecated::                  Deprecated Directives
 244 * Machine Options::             Options
 245 * Machine Syntax::              Syntax
 246 * Floating Point::              Floating Point
 247 * Machine Directives::          Machine Directives
 248 * Opcodes::                     Opcodes
 249
 250 Machine Directives
 251
 252 * block::                       @code{.block @var{size} , @var{fill}}
 253 * cputype::                     @code{.cputype}
 254 * file::                        @code{.file}
 255 * hword::                       @code{.hword @var{expressions}}
 256 * line::                        @code{.line}
 257 * reg::                         @code{.reg @var{symbol}, @var{expression}}
 258 * sect::                        @code{.sect}
 259 * use::                         @code{.use @var{segment name}}
 260 @end menu
 261
 262 @node Overview, Syntax, Top, Top
 263 @chapter Overview
 264
 265 This manual is a user guide to the GNU assembler @code{as}.
 266 @c pesch@cygnus.com:
 267 @c                   The following should be conditional on machine config
 268 @c if 680x0
 269 @c This version of the manual describes @code{as} configured to generate
 270 @c code for Motorola 680x0 architectures.
 271 @c fi 680x0
 272 @c if am29k
 273 This version of the manual describes @code{as} configured to generate
 274 code for Advanced Micro Devices' 29K architectures.
 275 @c fi am29k
 276
 277 @menu
 278 * Invoking::                    Invoking @code{as}
 279 * Manual::                      Structure of this Manual
 280 * GNU Assembler::               as, the GNU Assembler
 281 * Command Line::                Command Line
 282 * Input Files::                 Input Files
 283 * Object::                      Output (Object) File
 284 * Errors::                      Error and Warning Messages
 285 * Options::                     Options
 286 @end menu
 287
 288 @node Invoking, Manual, Overview, Overview
 289 @section Invoking @code{as}
 290
 291 Here is a brief summary of how to invoke GNU @code{as}.  For details,
 292 @pxref{Options}.
 293
 294 @c We don't use @deffn and friends for the following because they seem
 295 @c to be limited to one line for the header.
 296 @example
 297   as [ -D ] [ -f ] [ -I @var{path} ] [ -k ] [ -L ] [ -o @var{objfile} ] [ -R ] [ -v ] [ -W ]
 298 @c if 680x0
 299 @c    [ -l ] [ -mc68000 | -mc68010 | -mc68020 ]
 300 @c fi 680x0
 301 @c if am29k
 302 @c@c am29k has no machine-dependent assembler options
 303 @c fi am29k
 304    [ -- | @var{files} @dots{} ]
 305 @end example
 306
 307 @table @code
 308
 309 @item -D
 310 This option is accepted only for script compatibility with calls to
 311 other assemblers; it has no effect on GNU @code{as}.
 312
 313 @item -f
 314 ``fast''---skip preprocessing (assume source is compiler output)
 315
 316 @item -I @var{path}
 317 Add @var{path} to the search list for @code{.include} directives
 318
 319 @item -k
 320 @c if am29k
 321 This option is accepted but has no effect on the 29K family.
 322 @c fi am29k
 323 @c if not am29k
 324 @c Issue warnings when difference tables altered for long displacements
 325 @c fi not am29k
 326
 327 @item -L
 328 Keep (in symbol table) local symbols, starting with @samp{L}
 329
 330 @item -o @var{objfile}
 331 Name the object-file output from @code{as}
 332
 333 @item -R
 334 Fold data segment into text segment
 335
 336 @item -W
 337 Suppress warning messages
 338
 339 @c if 680x0
 340 @c @item -l
 341 @c Shorten references to undefined symbols, to one word instead of two
 342 @c
 343 @c @item -mc68000 | -mc68010 | -mc68020
 344 @c Specify what processor in the 68000 family is the target (default 68020)
 345 @c fi 680x0
 346
 347 @item -- | @var{files} @dots{}
 348 Source files to assemble, or standard input
 349 @end table
 350
 351 @node Manual, GNU Assembler, Invoking, Overview
 352 @section Structure of this Manual
 353 This document is intended to describe what you need to know to use GNU
 354 @code{as}.  We cover the syntax expected in source files, including
 355 notation for symbols, constants, and expressions; the directives that
 356 @code{as} understands; and of course how to invoke @code{as}.
 357
 358 @c if 680x0
 359 @c We also cover special features in the 68000 configuration of @code{as},
 360 @c including pseudo-operations.
 361 @c fi 680x0
 362 @c if am29k
 363 We also cover special features in the AMD 29K configuration of @code{as},
 364 including assembler directives.
 365 @c fi am29k
 366
 367 @ignore
 368   This document also describes some of the
 369 machine-dependent features of various flavors of the assembler.
 370 This document also describes how the assembler works internally, and
 371 provides some information that may be useful to people attempting to
 372 port the assembler to another machine.
 373 @end ignore
 374
 375 On the other hand, this manual is @emph{not} intended as an introduction
 376 to programming in assembly language---let alone programming in general!
 377 In a similar vein, we make no attempt to introduce the machine
 378 architecture; we do @emph{not} describe the instruction set, standard
 379 mnemonics, registers or addressing modes that are standard to a
 380 particular architecture.  You may want to consult the manufacturer's
 381 machine architecture manual for this information.
 382
 383
 384 @c I think this is premature---pesch@cygnus.com, 17jan1991
 385 @ignore
 386 Throughout this document, we assume that you are running @dfn{GNU},
 387 the portable operating system from the @dfn{Free Software
 388 Foundation, Inc.}.  This restricts our attention to certain kinds of
 389 computer (in particular, the kinds of computers that GNU can run on);
 390 once this assumption is granted examples and definitions need less
 391 qualification.
 392
 393 @code{as} is part of a team of programs that turn a high-level
 394 human-readable series of instructions into a low-level
 395 computer-readable series of instructions.  Different versions of
 396 @code{as} are used for different kinds of computer.  In particular,
 397 at the moment, @code{as} only works for the DEC Vax, the Motorola
 398 680x0, the Intel 80386, the Sparc, and the National Semiconductor
 399 32032/32532.
 400 @end ignore
 401
 402 @c There used to be a section "Terminology" here, which defined
 403 @c "contents", "byte", "word", and "long".  Defining "word" to any
 404 @c particular size is confusing when the .word directive may generate 16
 405 @c bits on one machine and 32 bits on another; in general, for the user
 406 @c version of this manual, none of these terms seem essential to define.
 407 @c They were used very little even in the former draft of the manual;
 408 @c this draft makes an effort to avoid them (except in names of
 409 @c directives).
 410
 411 @node GNU Assembler, Command Line, Manual, Overview
 412 @section as, the GNU Assembler
 413 @code{as} is primarily intended to assemble the output of the GNU C
 414 compiler @code{gcc} for use by the linker @code{ld}.  Nevertheless,
 415 we've tried to make @code{as} assemble correctly everything that the native
 416 assembler would.
 417 @c if not am29k
 418 @ignore
 419 Any exceptions are documented explicitly (@pxref{Machine Dependent}).
 420 @end ignore
 421 @c fi not am29k
 422 This doesn't mean @code{as} always uses the same syntax as another
 423 assembler for the same architecture; for example, we know of several
 424 incompatible versions of 680x0 assembly language syntax.
 425
 426 GNU @code{as} is really a family of assemblers.  If you use (or have
 427 used) GNU @code{as} on another architecture, you should find a fairly
 428 similar environment.  Each version has much in common with the others,
 429 including object file formats, most assembler directives (often called
 430 @dfn{pseudo-ops}) and assembler syntax.
 431
 432 Unlike older assemblers, @code{as} is designed to assemble a source
 433 program in one pass of the source file.  This has a subtle impact on the
 434 @code{.org} directive (@pxref{Org}).
 435
 436 @node Command Line, Input Files, GNU Assembler, Overview
 437 @section Command Line
 438 @example
 439 as [ options @dots{} ] [ file1 @dots{} ]
 440 @end example
 441
 442 After the program name @code{as}, the command line may contain
 443 options and file names.  Options may be in any order, and may be
 444 before, after, or between file names.  The order of file names is
 445 significant.
 446
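For instance, assuming two hypothetical source files @file{first.s} and
@file{second.s}, the following command lines are equivalent, because
only the placement of the options differs:

@example
as -L -o prog.o first.s second.s
as first.s -o prog.o second.s -L
@end example

Exchanging @file{first.s} and @file{second.s}, on the other hand, gives
@code{as} a different source program to assemble.
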
 447 @samp{--} (two hyphens) by itself names the standard input file
 448 explicitly, as one of the files for @code{as} to assemble.
 449
 450 Except for @samp{--} any command line argument that begins with a
 451 hyphen (@samp{-}) is an option.  Each option changes the behavior of
 452 @code{as}.  No option changes the way another option works.  An
 453 option is a @samp{-} followed by one or more letters; the case of
 454 each letter is significant.  All options are optional.
 455
 456 Some options expect exactly one file name to follow them.  The file
 457 name may either immediately follow the option's letter (compatible
 458 with older assemblers) or it may be the next command argument (GNU
 459 standard).  These two command lines are equivalent:
 460
 461 @example
 462 as -o my-object-file.o mumble
 463 as -omy-object-file.o mumble
 464 @end example
 465
 466 @node Input Files, Object, Command Line, Overview
 467 @section Input Files
 468
 469 We use the phrase @dfn{source program}, abbreviated @dfn{source}, to
 470 describe the program input to one run of @code{as}.  The program may
 471 be in one or more files; how the source is partitioned into files
 472 doesn't change the meaning of the source.
 473
 474 @c I added "con" prefix to "catenation" just to prove I can overcome my
 475 @c APL training...   pesch@cygnus.com
 476 The source program is a concatenation of the text in all the files, in the
 477 order specified.
 478
 479 Each time you run @code{as} it assembles exactly one source
 480 program.  The source program is made up of one or more files.
 481 (The standard input is also a file.)
 482
 483 You give @code{as} a command line that has zero or more input file
 484 names.  The input files are read (from left file name to right).  A
 485 command line argument (in any position) that has no special meaning
 486 is taken to be an input file name.
 487
 488 If @code{as} is given no file names it attempts to read one input file
 489 from @code{as}'s standard input, which is normally your terminal.  You
 490 may have to type @key{ctl-D} to tell @code{as} there is no more program
 491 to assemble.
 492
 493 Use @samp{--} if you need to explicitly name the standard input file
 494 in your command line.
 495
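As a sketch, assuming hypothetical files @file{head.s} and
@file{tail.s}, this command assembles @file{head.s}, then whatever
arrives on the standard input, then @file{tail.s}, as one source
program:

@example
as -o prog.o head.s -- tail.s
@end example
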
 496 If the source is empty, @code{as} will produce a small, empty object
 497 file.
 498
 499 @menu
 500 * Filenames::                   Input Filenames and Line-numbers
 501 @end menu
 502
 503 @node Filenames,  , Input Files, Input Files
 504 @subsection Input Filenames and Line-numbers
 505 There are two ways of locating a line in the input file (or files) and both
 506 are used in reporting error messages.  One way refers to a line
 507 number in a physical file; the other refers to a line number in a
 508 ``logical'' file.
 509
 510 @dfn{Physical files} are those files named in the command line given
 511 to @code{as}.
 512
 513 @dfn{Logical files} are simply names declared explicitly by assembler
 514 directives; they bear no relation to physical files.  Logical file names
 515 help error messages reflect the original source file, when @code{as}
 516 source is itself synthesized from other files.  @xref{App-File}.
 517
 518 @node Object, Errors, Input Files, Overview
 519 @section Output (Object) File
 520 Every time you run @code{as} it produces an output file, which is
 521 your assembly language program translated into numbers.  This file
 522 is the object file, named @file{a.out} unless you tell @code{as} to
 523 give it another name by using the @code{-o} option.  Conventionally,
 524 object file names end with @file{.o}.  The default name of
 525 @file{a.out} is used for historical reasons:  older assemblers were
 526 capable of assembling self-contained programs directly into a
 527 runnable program.
 528 @c This may still work, but hasn't been tested.
 529
 530 The object file is meant for input to the linker @code{ld}.  It contains
 531 assembled program code, information to help @code{ld} integrate
 532 the assembled program into a runnable file, and (optionally) symbolic
 533 information for the debugger.
 534
 535 @comment link above to some info file(s) like the description of a.out.
 536 @comment don't forget to describe GNU info as well as Unix lossage.
 537
 538 @node Errors, Options, Object, Overview
 539 @section Error and Warning Messages
 540
 541 @code{as} may write warnings and error messages to the standard error
 542 file (usually your terminal).  This should not happen when @code{as} is
 543 run automatically by a compiler.  Warnings report an assumption made so
 544 that @code{as} could keep assembling a flawed program; errors report a
 545 grave problem that stops the assembly.
 546
 547 Warning messages have the format
 548 @example
 549 file_name:@b{NNN}:Warning Message Text
 550 @end example
 551 @noindent(where @b{NNN} is a line number).  If a logical file name has
 552 been given (@pxref{App-File}) it is used for the filename, otherwise the
 553 name of the current input file is used.  If a logical line number was
 554 given (@pxref{Line}) then it is used to calculate the number printed,
 555 otherwise the actual line in the current source file is printed.  The
 556 message text is intended to be self-explanatory (in the grand Unix
 557 tradition).
 558
 559 Error messages have the format
 560 @example
 561 file_name:@b{NNN}:FATAL:Error Message Text
 562 @end example
 563 The file name and line number are derived as for warning
 564 messages.  The actual message text may be rather less explanatory
 565 because many of these errors aren't supposed to happen.
 566
 567 @node Options,  , Errors, Overview
 568 @section Options
 569 @subsection @code{-D}
 570 This option has no effect whatsoever, but it is accepted to make it more
 571 likely that scripts written for other assemblers will also work with
 572 GNU @code{as}.
 573
 574 @subsection Work Faster: @code{-f}
 575 @samp{-f} should only be used when assembling programs written by a
 576 (trusted) compiler.  @samp{-f} stops the assembler from pre-processing
 577 the input file(s) before assembling them.
 578 @quotation
 579 @emph{Warning:} if the files actually need to be pre-processed (if they
 580 contain comments, for example), @code{as} will not work correctly if
 581 @samp{-f} is used.
 582 @end quotation
 583
 584 @subsection Add to @code{.include} search path: @code{-I} @var{path}
 585 Use this option to add a @var{path} to the list of directories GNU
 586 @code{as} will search for files specified in @code{.include} directives
 587 (@pxref{Include}).  You may use @code{-I} as many times as necessary to
 588 include a variety of paths.  The current working directory is always
 589 searched first; after that, @code{as} searches any @samp{-I} directories
 590 in the same order as they were specified (left to right) on the command
 591 line.
 592
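For example, assuming hypothetical directories @file{../include} and
@file{/usr/local/asm}, the command

@example
as -I ../include -I /usr/local/asm -o prog.o prog.s
@end example

@noindent
makes @code{as} look for each file named in a @code{.include} directive
first in the current working directory, then in @file{../include}, and
finally in @file{/usr/local/asm}.
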
 593 @subsection Warn if difference tables altered: @code{-k}
 594 @c if am29k
 595 On the AMD 29K family, this option is allowed, but has no effect.  It is
 596 permitted for compatibility with GNU @code{as} on other platforms,
 597 where it can be used to warn when @code{as} alters the machine code
 598 generated for @samp{.word} directives in difference tables.  The AMD 29K
 599 family does not have the addressing limitations that sometimes lead to this
 600 alteration on other platforms.
 601 @c fi am29k
 602
 603 @c if not am29k
 604 @ignore
 605 @code{as} sometimes alters the code emitted for directives of the form
 606 @samp{.word @var{sym1}-@var{sym2}}; @pxref{Word}.
 607 You can use the @samp{-k} option if you want a warning issued when this
 608 is done.
 609 @end ignore
 610 @c fi not am29k
 611
 612 @subsection Include Local Labels: @code{-L}
 613 Labels beginning with @samp{L} (upper case only) are called @dfn{local
 614 labels}. @xref{Symbol Names}.  Such labels are intended for the use
 615 of programs (like compilers) that compose assembler programs, not
 616 for your notice.  Both @code{as} and @code{ld} normally discard
 617 them, so you do not usually see them when
 618 debugging.
 619
 620 This option tells @code{as} to retain those @samp{L@dots{}} symbols
 621 in the object file.  Usually if you do this you also tell the linker
 622 @code{ld} to preserve symbols whose names begin with @samp{L}.
 623
 624 @subsection Name the Object File: @code{-o}
 625 There is always one object file output when you run @code{as}.  By
 626 default it has the name @file{a.out}.  You use this option (which
 627 takes exactly one filename) to give the object file a different name.
 628
 629 Whatever the object file is called, @code{as} will overwrite any
 630 existing file of the same name.
 631
 632 @subsection Fold Data Segment into Text Segment: @code{-R}
 633 @code{-R} tells @code{as} to write the object file as if all
 634 data-segment data lives in the text segment.  This is only done at
 635 the very last moment:  your binary data are the same, but data
 636 segment parts are relocated differently.  The data segment part of
 637 your object file is zero bytes long because all its bytes are
 638 appended to the text segment.  (@xref{Segments}.)
 639
 640 When you specify @code{-R} it would be possible to generate shorter
 641 address displacements (because we don't have to cross between text and
 642 data segment).  We don't do this simply for compatibility with older
 643 versions of @code{as}.  In future, @code{-R} may work this way.
 644
 645 @subsection Suppress Warnings: @code{-W}
 646 @code{as} should never give a warning or error message when
 647 assembling compiler output.  But programs written by people often
 648 cause @code{as} to give a warning that a particular assumption was
 649 made.  All such warnings are directed to the standard error file.
 650 If you use this option, no warnings are issued.  This option only
 651 affects the warning messages: it does not change any particular of how
 652 @code{as} assembles your file.  Errors, which stop the assembly, are
 653 still reported.
 654
 655 @node Syntax, Segments, Overview, Top
 656 @chapter Syntax
 657 This chapter describes the machine-independent syntax allowed in a
 658 source file.  @code{as} syntax is similar to what many other assemblers
 659 use; it is inspired by the BSD 4.2
 660 @c if not vax
 661 assembler. @refill
 662 @c fi not vax
 663 @c if vax
 664 @c assembler, except that @code{as} does not
 665 @c assemble Vax bit-fields.
 666 @c fi vax
 667
 668 @menu
 669 * Pre-processing::              Pre-processing
 670 * Whitespace::                  Whitespace
 671 * Comments::                    Comments
 672 * Symbol Intro::                Symbols
 673 * Statements::                  Statements
 674 * Constants::                   Constants
 675 @end menu
 676
 677 @node Pre-processing, Whitespace, Syntax, Syntax
 678 @section Pre-processing
 679
 680 The pre-processor:
 681 @itemize @bullet
 682 @item
 683 adjusts and removes extra whitespace.  It leaves one space or tab before
 684 the keywords on a line, and turns any other whitespace on the line into
 685 a single space.
 686
 687 @item
 688 removes all comments, replacing them with a single space, or an
 689 appropriate number of newlines.
 690
 691 @item
 692 converts character constants into the appropriate numeric values.
 693 @end itemize
 694
 695 Excess whitespace, comments, and character constants
 696 cannot be used in the portions of the input text that are not
 697 pre-processed.
 698
 699 If the first line of an input file is @code{#NO_APP} or the @samp{-f}
 700 option is given, the input file will not be pre-processed.  Within such
 701 an input file, parts of the file can be pre-processed by putting a line
 702 that says @code{#APP} before the text that should be pre-processed, and
 703 putting a line that says @code{#NO_APP} after them.  This feature is
 704 mainly intended to support @code{asm} statements in compilers whose output
 705 normally does not need to be pre-processed.
 706
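As a minimal sketch (the directives and values are only illustrative),
an input file that starts with @code{#NO_APP} might switch
pre-processing on for a single hand-written statement like this:

@example
#NO_APP
 .text
 .long 1
#APP
 .long    2      /* extra whitespace and comments are allowed here */
#NO_APP
 .long 3
@end example

Only the statement between @code{#APP} and the second @code{#NO_APP} is
pre-processed; the other lines must already be free of comments, excess
whitespace, and character constants.
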
 707 @node Whitespace, Comments, Pre-processing, Syntax
 708 @section Whitespace
 709 @dfn{Whitespace} is one or more blanks or tabs, in any order.
 710 Whitespace is used to separate symbols, and to make programs neater
 711 for people to read.  Unless within character constants
 712 (@pxref{Characters}), any whitespace means the same as exactly one
 713 space.
 714
 715 @node Comments, Symbol Intro, Whitespace, Syntax
 716 @section Comments
 717 There are two ways of rendering comments to @code{as}.  In both
 718 cases the comment is equivalent to one space.
 719
 720 Anything from @samp{/*} through the next @samp{*/} is a comment.
 721 This means you may not nest these comments.
 722
 723 @example
 724 /*
 725   The only way to include a newline ('\n') in a comment
 726   is to use this sort of comment.
 727 */
 728
 729 /* This sort of comment does not nest. */
 730 @end example
 731
 732 Anything from the @dfn{line comment} character to the next newline
 733 is considered a comment and is ignored.  The line comment character is
 734 @c if vax
 735 @c @samp{#} on the Vax. @xref{Machine Dependent}. @refill
 736 @c fi vax
 737 @c if 680x0
 738 @c @samp{|} on the 680x0. @xref{Machine Dependent}.  @refill
 739 @c fi 680x0
 740 @c if am29k
 741 @samp{;} for the AMD 29K family. @xref{Machine Dependent}.  @refill
 742 @c fi am29k
 743 @ignore
 744 @if all-arch
 745 On some machines there are two different line comment characters.  One
 746 will only begin a comment if it is the first non-whitespace character on
 747 a line, while the other will always begin a comment.
 748 @fi all-arch
 749 @end ignore
 750
 751 To be compatible with past assemblers a special interpretation is
 752 given to lines that begin with @samp{#}.  Following the @samp{#} an
 753 absolute expression (@pxref{Expressions}) is expected:  this will be
 754 the logical line number of the @b{next} line.  Then a string
 755 (@pxref{Strings}) is allowed: if present it is a new logical file
 756 name.  The rest of the line, if any, should be whitespace.
 757
 758 If the first non-whitespace characters on the line are not numeric,
 759 the line is ignored.  (Just like a comment.)
 760 @example
 761                           # This is an ordinary comment.
 762 # 42-6 "new_file_name"    # New logical file name
 763                           # This is logical line # 36.
 764 @end example
 765 This feature is deprecated, and may disappear from future versions
 766 of @code{as}.
 767
 768 @node Symbol Intro, Statements, Comments, Syntax
 769 @section Symbols
 770 A @dfn{symbol} is one or more characters chosen from the set of all
 771 letters (both upper and lower case), digits and the three characters
 772 @samp{_.$}.  No symbol may begin with a digit.  Case is significant.
 773 There is no length limit: all characters are significant.  Symbols are
 774 delimited by characters not in that set, or by the beginning of a file
 775 (since the source program must end with a newline, the end of a file is
 776 not a possible symbol delimiter).  @xref{Symbols}.
 777
 778 @node Statements, Constants, Symbol Intro, Syntax
 779 @section Statements
 780 A @dfn{statement} ends at a newline character (@samp{\n})
 781 @c @if m680x0 (or is this if !am29k?)
 782 @c or at a semicolon (@samp{;}).  The newline or semicolon
 783 @c fi m680x0 (or !am29k)
 784 @c if am29k
 785 or an ``at'' sign (@samp{@@}).  The newline or at sign
 786 @c fi am29k
 787 is considered part
 788 of the preceding statement.  Newlines
 789 @c if m680x0 (or !am29k)
 790 @c and semicolons
 791 @c fi m680x0 (or !am29k)
 792 @c if am29k
 793 and at signs
 794 @c fi am29k
 795 within
 796 character constants are an exception:  they don't end statements.
 797 It is an error to end any statement with end-of-file:  the last
 798 character of any input file should be a newline.@refill
 799
 800 You may write a statement on more than one line if you put a
 801 backslash (@kbd{\}) immediately in front of any newlines within the
 802 statement.  When @code{as} reads a backslashed newline both
 803 characters are ignored.  You can even put backslashed newlines in
 804 the middle of symbol names without changing the meaning of your
 805 source program.
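
As a small sketch of the rule above (it assumes only the @code{.byte}
directive used elsewhere in this chapter), the following two statements
should assemble identically, because each backslashed newline is
discarded before the statement is parsed:
@example
.byte 1, 2, \
      3, 4
.byte 1, 2, 3, 4
@end example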
 806
 807 An empty statement is allowed, and may include whitespace.  It is ignored.
 808
 809 @c "key symbol" is not used elsewhere in the document; seems pedantic to
 810 @c @defn{} it in that case, as was done previously...  pesch@cygnus.com,
 811 @c 13feb91.
 812 A statement begins with zero or more labels, optionally followed by a
 813 key symbol which determines what kind of statement it is.  The key
 814 symbol determines the syntax of the rest of the statement.  If the
 815 symbol begins with a dot @samp{.} then the statement is an assembler
 816 directive: typically valid for any computer.  If the symbol begins with
 817 a letter the statement is an assembly language @dfn{instruction}: it
 818 will assemble into a machine language instruction.  Different versions
 819 of @code{as} for different computers will recognize different
 820 instructions.  In fact, the same symbol may represent a different
 821 instruction in a different computer's assembly language.
 822
 823 A label is a symbol immediately followed by a colon (@code{:}).
 824 Whitespace before a label or after a colon is permitted, but you may not
 825 have whitespace between a label's symbol and its colon. @xref{Labels}.
 826
 827 @example
 828 label:     .directive    followed by something
 829 another$label:           # This is an empty statement.
 830            instruction   operand_1, operand_2, @dots{}
 831 @end example
 832
 833 @node Constants,  , Statements, Syntax
 834 @section Constants
 835 A constant is a number, written so that its value is known by
 836 inspection, without knowing any context.  Like this:
 837 @example
 838 .byte  74, 0112, 092, 0x4A, 0X4a, 'J, '\J # All the same value.
 839 .ascii "Ring the bell\7"                  # A string constant.
 840 .octa  0x123456789abcdef0123456789ABCDEF0 # A bignum.
 841 .float 0f-314159265358979323846264338327\
 842 95028841971.693993751E-40                 # - pi, a flonum.
 843 @end example
 844
 845 @menu
 846 * Characters::                  Character Constants
 847 * Numbers::                     Number Constants
 848 @end menu
 849
 850 @node Characters, Numbers, Constants, Constants
 851 @subsection Character Constants
 852 There are two kinds of character constants.  A @dfn{character} stands
 853 for one character in one byte and its value may be used in
 854 numeric expressions.  String constants (properly called string
 855 @emph{literals}) are potentially many bytes and their values may not be
 856 used in arithmetic expressions.
 857
 858 @menu
 859 * Strings::                     Strings
 860 * Chars::                       Characters
 861 @end menu
 862
 863 @node Strings, Chars, Characters, Characters
 864 @subsubsection Strings
 865 A @dfn{string} is written between double-quotes.  It may contain
 866 double-quotes or null characters.  The way to get special characters
 867 into a string is to @dfn{escape} these characters: precede them with
 868 a backslash @samp{\} character.  For example @samp{\\} represents
 869 one backslash:  the first @code{\} is an escape which tells
 870 @code{as} to interpret the second character literally as a backslash
 871 (which prevents @code{as} from recognizing the second @code{\} as an
 872 escape character).  The complete list of escapes follows.
 873
 874 @table @kbd
 875 @c      @item \a
 876 @c      Mnemonic for ACKnowledge; for ASCII this is octal code 007.
 877 @item \b
 878 Mnemonic for backspace; for ASCII this is octal code 010.
 879 @c      @item \e
 880 @c      Mnemonic for EOText; for ASCII this is octal code 004.
 881 @item \f
 882 Mnemonic for FormFeed; for ASCII this is octal code 014.
 883 @item \n
 884 Mnemonic for newline; for ASCII this is octal code 012.
 885 @c      @item \p
 886 @c      Mnemonic for prefix; for ASCII this is octal code 033, usually known as @code{escape}.
 887 @item \r
 888 Mnemonic for carriage-Return; for ASCII this is octal code 015.
 889 @c      @item \s
 890 @c      Mnemonic for space; for ASCII this is octal code 040.  Included for compliance with
 891 @c      other assemblers.
 892 @item \t
 893 Mnemonic for horizontal Tab; for ASCII this is octal code 011.
 894 @c      @item \v
 895 @c      Mnemonic for Vertical tab; for ASCII this is octal code 013.
 896 @c      @item \x @var{digit} @var{digit} @var{digit}
 897 @c      A hexadecimal character code.  The numeric code is 3 hexadecimal digits.
 898 @item \ @var{digit} @var{digit} @var{digit}
 899 An octal character code.  The numeric code is 3 octal digits.
 900 For compatibility with other Unix systems, 8 and 9 are accepted as digits:
 901 for example, @code{\008} has the value 010, and @code{\009} the value 011.
 902 @item \\
 903 Represents one @samp{\} character.
 904 @c      @item \'
 905 @c      Represents one @samp{'} (accent acute) character.
 906 @c      This is needed in single character literals
 907 @c      (@xref{Characters}.) to represent
 908 @c      a @samp{'}.
 909 @item \"
 910 Represents one @samp{"} character.  Needed in strings to represent
 911 this character, because an unescaped @samp{"} would end the string.
 912 @item \ @var{anything-else}
 913 Any other character when escaped by @kbd{\} will give a warning, but
 914 assemble as if the @samp{\} were not present.  The idea is that if
 915 you used an escape sequence you clearly didn't want the literal
 916 interpretation of the following character.  However @code{as} has no
 917 other interpretation, so @code{as} knows it is giving you the wrong
 918 code and warns you of the fact.
 919 @end table
 920
 921 Which characters are escapable, and what those escapes represent,
 922 varies widely among assemblers.  The current set is what we think
 923 BSD 4.2 @code{as} recognizes, and is a subset of what most C
 924 compilers recognize.  If you are in doubt, don't use an escape
 925 sequence.
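
The strings below are an illustrative sketch of the escapes in the
table above; the text of each string is made up for the example.
@example
.ascii "a tab\tthen a newline\n"
.ascii "octal \101 is the letter A"
.ascii "a \"quoted\" word"
.ascii "one backslash: \\"
@end example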
 926
 927 @node Chars,  , Strings, Characters
 928 @subsubsection Characters
 929 A single character may be written as a single quote immediately
 930 followed by that character.  The same escapes apply to characters as
 931 to strings.  So if you want to write the character backslash, you
 932 must write @kbd{'\\} where the first @code{\} escapes the second
 933 @code{\}.  As you can see, the quote is an acute accent, not a
 934 grave accent.  A newline
 935 @c if 680x0 (or !am29k)
 936 @c (or semicolon @samp{;})
 937 @c fi 680x0 (or !am29k)
 938 @c if am29k
 939 (or at sign @samp{@@})
 940 @c fi am29k
 941 immediately
 942 following an acute accent is taken as a literal character and does
 943 not count as the end of a statement.  The value of a character
 944 constant in a numeric expression is the machine's byte-wide code for
 945 that character.  @code{as} assumes your character code is ASCII: @kbd{'A}
 946 means 65, @kbd{'B} means 66, and so on. @refill
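
For illustration, and assuming ASCII as described above, each of the
following statements should assemble a single byte:
@example
.byte 'A        /* 65 */
.byte 'J        /* 74 */
.byte '\\       /* the backslash character, octal 134 */
.byte '\n       /* newline, octal 012 */
@end example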
 947
 948 @node Numbers,  , Characters, Constants
 949 @subsection Number Constants
 950 @code{as} distinguishes three kinds of numbers according to how they
 951 are stored in the target machine.  @emph{Integers} are numbers that
 952 would fit into an @code{int} in the C language.  @emph{Bignums} are
 953 integers, but they are stored in more than 32 bits.  @emph{Flonums}
 954 are floating point numbers, described below.
 955
 956 @subsubsection Integers
 957 A binary integer is @samp{0b} or @samp{0B} followed by zero or more of
 958 the binary digits @samp{01}.
 959
 960 An octal integer is @samp{0} followed by zero or more of the octal
 961 digits (@samp{01234567}).
 962
 963 A decimal integer starts with a non-zero digit followed by zero or
 964 more digits (@samp{0123456789}).
 965
 966 A hexadecimal integer is @samp{0x} or @samp{0X} followed by one or
 967 more hexadecimal digits chosen from @samp{0123456789abcdefABCDEF}.
 968
 969 Integers have the usual values.  To denote a negative integer, use
 970 the prefix operator @samp{-} discussed under expressions
 971 (@pxref{Prefix Ops}).
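
As a sketch of the four notations, the first four statements below
should all assemble the value 74; the last applies the prefix operator
@samp{-} to the same integer.
@example
.byte 0b1001010     /* binary */
.byte 0112          /* octal */
.byte 74            /* decimal */
.byte 0x4a          /* hexadecimal */
.byte -74           /* negated with the prefix operator */
@end example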
 972
 973 @subsubsection Bignums
 974 A @dfn{bignum} has the same syntax and semantics as an integer
 975 except that the number (or its negative) takes more than 32 bits to
 976 represent in binary.  The distinction is made because in some places
 977 integers are permitted while bignums are not.
 978
 979 @subsubsection Flonums
 980 A @dfn{flonum} represents a floating point number.  The translation is
 981 complex: a decimal floating point number from the text is converted by
 982 @code{as} to a generic binary floating point number of more than
 983 sufficient precision.  This generic floating point number is converted
 984 to a particular computer's floating point format (or formats) by a
 985 portion of @code{as} specialized to that computer.
 986
 987 A flonum is written by writing (in order)
 988 @itemize @bullet
 989 @item
 990 The digit @samp{0}.
 991 @item
 992 @c if am29k
 993 One of the letters @samp{DFPRSX} (in upper or lower case), to tell
 994 @code{as} the rest of the number is a flonum.
 995 @c fi am29k
 996 @c if not am29k
 997 @ignore
 998 A letter, to tell @code{as} the rest of the number is a flonum.  @kbd{e}
 999 is recommended.  Case is not important.  (Any otherwise illegal letter
1000 will work here, but that might be changed.  Vax BSD 4.2 assembler seems
1001 to allow any of @samp{defghDEFGH}.)
1002 @end ignore
1003 @c fi not am29k
1004 @item
1005 An optional sign: either @samp{+} or @samp{-}.
1006 @item
1007 An optional @dfn{integer part}: zero or more decimal digits.
1008 @item
1009 An optional @dfn{fraction part}: @samp{.} followed by zero
1010 or more decimal digits.
1011 @item
1012 An optional exponent, consisting of:
1013 @itemize @bullet
1014 @item
1015 @c if am29k
1016 An @samp{E} or @samp{e}.
1017 @c if not am29k
1018 @ignore
1019 A letter; the exact significance varies according to
1020 the computer that executes the program.  @code{as}
1021 accepts any letter for now.  Case is not important.
1022 @end ignore
1023 @c fi not am29k
1024 @item
1025 Optional sign: either @samp{+} or @samp{-}.
1026 @item
1027 One or more decimal digits.
1028 @end itemize
1029 @end itemize
1030
1031 At least one of @var{integer part} or @var{fraction part} must be
1032 present.  The floating point number has the usual base-10 value.
1033
1034 @code{as} does all processing using integers.  Flonums are computed
1035 independently of any floating point hardware in the computer running
1036 @code{as}.
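
The following sketch shows a few flonums built from the parts listed
above.  It assumes the @code{.float} directive from the example under
Constants and the letter @samp{f}, one of those accepted for the AMD 29K
family; the values themselves are arbitrary.
@example
.float 0f1.5          /* integer part and fraction part */
.float 0f-.25         /* sign and fraction part only */
.float 0F6.02e+23     /* upper-case letter, signed exponent */
@end example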
1037
1038 @node Segments, Symbols, Syntax, Top
1039 @chapter Segments and Relocation
1040 @menu
1041 * Segs Background::             Background
1042 * ld Segments::                 ld Segments
1043 * as Segments::                 as Internal Segments
1044 * Sub-Segments::                Sub-Segments
1045 * bss::                         bss Segment
1046 @end menu
1047
1048 @node Segs Background, ld Segments, Segments, Segments
1049 @section Background
1050 Roughly, a segment is a range of addresses, with no gaps; all data
1051 ``in'' those addresses is treated the same for some particular purpose.
1052 For example there may be a ``read only'' segment.
1053
1054 The linker @code{ld} reads many object files (partial programs) and
1055 combines their contents to form a runnable program.  When @code{as}
1056 emits an object file, the partial program is assumed to start at address
1057 0.  @code{ld} will assign the final addresses the partial program
1058 occupies, so that different partial programs don't overlap.  This is
1059 actually an over-simplification, but it will suffice to explain how
1060 @code{as} uses segments.
1061
1062 @code{ld} moves blocks of bytes of your program to their run-time
1063 addresses.  These blocks slide to their run-time addresses as rigid
1064 units; their length does not change and neither does the order of bytes
1065 within them.  Such a rigid unit is called a @emph{segment}.  Assigning
1066 run-time addresses to segments is called @dfn{relocation}.  It includes
1067 the task of adjusting mentions of object-file addresses so they refer to
1068 the proper run-time addresses.
1069
1070 An object file written by @code{as} has three segments, any of which may
1071 be empty.  These are named @dfn{text}, @dfn{data} and @dfn{bss}
1072 segments.  Within the object file, the text segment starts at
1073 address @code{0}, the data segment follows, and the bss segment
1074 follows the data segment.
1075
1076 To let @code{ld} know which data will change when the segments are
1077 relocated, and how to change that data, @code{as} also writes to the
1078 object file details of the relocation needed.  To perform relocation
1079 @code{ld} must know, each time an address in the object
1080 file is mentioned:
1081 @itemize @bullet
1082 @item
1083 Where in the object file is the beginning of this reference to
1084 an address?
1085 @item
1086 How long (in bytes) is this reference?
1087 @item
1088 Which segment does the address refer to?  What is the numeric value of
1089 @display
1090 (@var{address}) @minus{} (@var{start-address of segment})?
1091 @end display
1092 @item
1093 Is the reference to an address ``Program-Counter relative''?
1094 @end itemize
1095
1096 In fact, every address @code{as} ever uses is expressed as
1097 @code{(@var{segment}) + (@var{offset into segment})}.  Further, every
1098 expression @code{as} computes is of this segmented nature.
1099 @dfn{Absolute expression} means an expression with segment ``absolute''
1100 (@pxref{ld Segments}).  A @dfn{pass1 expression} means an expression
1101 with segment ``pass1'' (@pxref{as Segments}).  In this manual we use the
1102 notation @{@var{segname} @var{N}@} to mean ``offset @var{N} into segment
1103 @var{segname}''.
1104
1105 Apart from text, data and bss segments you need to know about the
1106 @dfn{absolute} segment.  When @code{ld} mixes partial programs,
1107 addresses in the absolute segment remain unchanged.  That is, address
1108 @code{@{absolute 0@}} is ``relocated'' to run-time address 0 by @code{ld}.
1109 Although two partial programs' data segments will not overlap addresses
1110 after linking, @emph{by definition} their absolute segments will overlap.
1111 Address @code{@{absolute@ 239@}} in one partial program will always be the same
1112 address when the program is running as address @code{@{absolute@ 239@}} in any
1113 other partial program.
1114
1115 The idea of segments is extended to the @dfn{undefined} segment.  Any
1116 address whose segment is unknown at assembly time is by definition
1117 rendered @{undefined @var{U}@}---where @var{U} will be filled in later.
1118 Since numbers are always defined, the only way to generate an undefined
1119 address is to mention an undefined symbol.  A reference to a named
1120 common block would be such a symbol: its value is unknown at assembly
1121 time so it has segment @emph{undefined}.
1122
1123 By analogy the word @emph{segment} is used to describe groups of segments in
1124 the linked program.  @code{ld} puts all partial programs' text
1125 segments in contiguous addresses in the linked program.  It is
1126 customary to refer to the @emph{text segment} of a program, meaning all
1127 the addresses of all partial programs' text segments.  Likewise for
1128 data and bss segments.
1129
1130 Some segments are manipulated by @code{ld}; others are invented for
1131 use of @code{as} and have no meaning except during assembly.
1132
1133 @menu
1134 * ld Segments::                 ld Segments
1135 * as Segments::                 as Internal Segments
1136 * Sub-Segments::                Sub-Segments
1137 * bss::                         bss Segment
1138 @end menu
1139
1140 @node ld Segments, as Segments, Segs Background, Segments
1141 @section ld Segments
1142 @code{ld} deals with just five kinds of segments, summarized below.
1143
1144 @table @strong
1145
1146 @item text segment
1147 @itemx data segment
1148 These segments hold your program.  @code{as} and @code{ld} treat them as
1149 separate but equal segments.  Anything you can say of one segment is
1150 true of the other.  When the program is running, however, it is
1151 customary for the text segment to be unalterable.  The
1152 text segment is often shared among processes: it will contain
1153 instructions, constants and the like.  The data segment of a running
1154 program is usually alterable: for example, C variables would be stored
1155 in the data segment.
1156
1157 @item bss segment
1158 This segment contains zeroed bytes when your program begins running.  It
1159 is used to hold uninitialized variables or common storage.  The length of
1160 each partial program's bss segment is important, but because it starts
1161 out containing zeroed bytes there is no need to store explicit zero
1162 bytes in the object file.  The bss segment was invented to eliminate
1163 those explicit zeros from object files.
1164
1165 @item absolute segment
1166 Address 0 of this segment is always ``relocated'' to runtime address 0.
1167 This is useful if you want to refer to an address that @code{ld} must
1168 not change when relocating.  In this sense we speak of absolute
1169 addresses being ``unrelocatable'': they don't change during relocation.
1170
1171 @item @code{undefined} segment
1172 This ``segment'' is a catch-all for address references to objects not in
1173 the preceding segments.
1174 @c FIXME: ref to some other doc on obj-file formats could go here.
1175
1176 @end table
1177
1178 An idealized example of the 3 relocatable segments follows.  Memory
1179 addresses are on the horizontal axis.
1180
1181 @ifinfo
1182 @example
1183                       +-----+----+--+
1184 partial program # 1:  |ttttt|dddd|00|
1185                       +-----+----+--+
1186
1187                       text   data bss
1188                       seg.   seg. seg.
1189
1190                       +---+---+---+
1191 partial program # 2:  |TTT|DDD|000|
1192                       +---+---+---+
1193
1194                       +--+---+-----+--+----+---+-----+~~
1195 linked program:       |  |TTT|ttttt|  |dddd|DDD|00000|
1196                       +--+---+-----+--+----+---+-----+~~
1197
1198     addresses:        0 @dots{}
1199 @end example
1200 @end ifinfo
1201 @tex
1202 \halign{\hfil\rm #\quad&#\cr
1203 \cr
1204    &\ibox{2.5cm}{\tt text}\ibox{2cm}{\tt data}\ibox{1cm}{\tt bss}\cr
1205 Partial program \#1:
1206 &\boxit{2.5cm}{\tt ttttt}\boxit{2cm}{\tt dddd}\boxit{1cm}{\tt 00}\cr
1207 \cr
1208    &\ibox{1cm}{\tt text}\ibox{1.5cm}{\tt data}\ibox{1cm}{\tt bss}\cr
1209 Partial program \#2:
1210 &\boxit{1cm}{\tt TTT}\boxit{1.5cm}{\tt DDD}\boxit{1cm}{\tt 000}\cr
1211 \cr
1212    &\ibox{.5cm}{}\ibox{1cm}{\tt text}\ibox{2.5cm}{}\ibox{.75cm}{}\ibox{2cm}{\tt data}\ibox{1.5cm}{}\ibox{2cm}{\tt bss}\cr
1213 linked program:
1214 &\boxit{.5cm}{}\boxit{1cm}{\tt TTT}\boxit{2.5cm}{\tt
1215 ttttt}\boxit{.75cm}{}\boxit{2cm}{\tt dddd}\boxit{1.5cm}{\tt
1216 DDD}\boxit{2cm}{\tt 00000}\ \dots\cr
1217 addresses:
1218 &\dots\cr
1219 }
1220 @end tex
1221
1222 @node as Segments, Sub-Segments, ld Segments, Segments
1223 @section as Internal Segments
1224 These segments are invented for the internal use of @code{as}.  They
1225 have no meaning at run-time.  You don't need to know about these
1226 segments except that they might be mentioned in @code{as}' warning
1227 messages.  These segments are invented to permit the value of every
1228 expression in your assembly language program to be a segmented
1229 address.
1230
1231 @table @b
1232 @item absent segment
1233 An expression was expected and none was
1234 found.
1235
1236 @item goof segment
1237 An internal assembler logic error has been
1238 found.  This means there is a bug in the assembler.
1239
1240 @item grand segment
1241 A @dfn{grand number} is a bignum or a flonum, but not an integer.  If a
1242 number can't be written as a C @code{int} constant, it is a grand
1243 number.  @code{as} has to remember that a flonum or a bignum does not
1244 fit into 32 bits, and cannot be an argument (@pxref{Arguments}) in an
1245 expression: this is done by making a flonum or bignum be in segment
1246 grand.  This is purely for internal @code{as} convenience; grand
1247 segment behaves similarly to absolute segment.
1248
1249 @item pass1 segment
1250 The expression was impossible to evaluate in the first pass.  The
1251 assembler will attempt a second pass (second reading of the source) to
1252 evaluate the expression.  Your expression mentioned an undefined symbol
1253 in a way that defies the one-pass (segment + offset in segment) assembly
1254 process.  No compiler need emit such an expression.
1255
1256 @quotation
1257 @emph{Warning:} the second pass is currently not implemented.  @code{as}
1258 will abort with an error message if one is required.
1259 @end quotation
1260
1261 @item difference segment
1262 As an assist to the C compiler, expressions of the forms
1263 @display
1264    (@var{undefined symbol}) @minus{} (@var{expression})
1265    (@var{something}) @minus{} (@var{undefined symbol})
1266    (@var{undefined symbol}) @minus{} (@var{undefined symbol})
1267 @end display
1268 are permitted, and belong to the difference segment.  @code{as}
1269 re-evaluates such expressions after the source file has been read and
1270 the symbol table built.  If by that time there are no undefined symbols
1271 in the expression then the expression assumes a new segment.  The
1272 intention is to permit statements like
1273 @samp{.word label - base_of_table}
1274 to be assembled in one pass where both @code{label} and
1275 @code{base_of_table} are undefined.  This is useful for compiling C and
1276 Algol switch statements, Pascal case statements, FORTRAN computed goto
1277 statements and the like.
1278 @end table
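
As a sketch of the difference segment in use, a compiler might emit a
dispatch table like the one below.  The label names are hypothetical,
and the @code{.word} statements after the table merely stand in for the
code of each case.  When the first two statements are read, neither
@code{case_0}, @code{case_1} nor @code{base_of_table} has been defined
yet.
@example
        .word   case_0 - base_of_table
        .word   case_1 - base_of_table
base_of_table:
case_0: .word   0
case_1: .word   1
@end example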
1279
1280 @node Sub-Segments, bss, as Segments, Segments
1281 @section Sub-Segments
1282 Assembled bytes fall into two segments: text and data.
1283 Because you may have groups of text or data that you want to end up near
1284 to each other in the object file, @code{as} allows you to use
1285 @dfn{subsegments}.  Within each segment, there can be numbered
1286 subsegments with values from 0 to 8192.  Objects assembled into the same
1287 subsegment will be grouped with other objects in the same subsegment
1288 when they are all put into the object file.  For example, a compiler
1289 might want to store constants in the text segment, but might not want to
1290 have them interspersed with the program being assembled.  In this case,
1291 the compiler could issue a @code{.text 0} before each section of code
1292 being output, and a @code{.text 1} before each group of constants being
1293 output.
1294
1295 Subsegments are optional.  If you don't use subsegments, everything
1296 will be stored in subsegment number zero.
1297
1298 @c @if not am29k
1299 @c Each subsegment is zero-padded up to a multiple of four bytes.
1300 @c (Subsegments may be padded a different amount on different flavors
1301 @c of @code{as}.)
1302 @c fi not am29k
1303 @c if am29k
1304 On the AMD 29K family, no particular padding is added to segment sizes;
1305 GNU as forces no alignment on this platform.
1306 @c fi am29k
1307 Subsegments appear in your object file in numeric order, lowest numbered
1308 to highest.  (All this to be compatible with other people's assemblers.)
1309 The object file contains no representation of subsegments; @code{ld} and
1310 other programs that manipulate object files will see no trace of them.
1311 They just see all your text subsegments as a text segment, and all your
1312 data subsegments as a data segment.
1313
1314 To specify which subsegment you want subsequent statements assembled
1315 into, use a @samp{.text @var{expression}} or a @samp{.data
1316 @var{expression}} statement.  @var{Expression} should be an absolute
1317 expression.  (@xref{Expressions}.)  If you just say @samp{.text}
1318 then @samp{.text 0} is assumed.  Likewise @samp{.data} means
1319 @samp{.data 0}.  Assembly begins in @code{text 0}.
1320 For instance:
1321 @example
1322 .text 0     # The default subsegment is text 0 anyway.
1323 .ascii "This lives in the first text subsegment. *"
1324 .text 1
1325 .ascii "But this lives in the second text subsegment."
1326 .data 0
1327 .ascii "This lives in the data segment,"
1328 .ascii "in the first data subsegment."
1329 .text 0
1330 .ascii "This lives in the first text subsegment,"
1331 .ascii "immediately following the asterisk (*)."
1332 @end example
1333
1334 Each segment has a @dfn{location counter} incremented by one for every
1335 byte assembled into that segment.  Because subsegments are merely a
1336 convenience restricted to @code{as} there is no concept of a subsegment
1337 location counter.  There is no way to directly manipulate a location
1338 counter---but the @code{.align} directive will change it, and any label
1339 definition will capture its current value.  The location counter of the
1340 segment that statements are being assembled into is said to be the
1341 @dfn{active} location counter.
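
The sketch below shows how labels capture the active location counter.
It assumes that nothing else has been assembled, that only subsegment 0
is used, and that no padding is added; the label names are made up.
@example
        .text
first:  .byte 1, 2, 3   /* first is @{text 0@} */
second: .byte 4         /* second is @{text 3@} */
        .data
third:  .byte 5         /* third is @{data 0@} */
        .text
fourth: .byte 6         /* fourth is @{text 4@} */
@end example
Switching to the data segment leaves the text location counter
untouched, so @code{fourth} continues where @code{second} left off.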
1342
1343 @node bss,  , Sub-Segments, Segments
1344 @section bss Segment
1345 The bss segment is used for local common variable storage.
1346 You may allocate address space in the bss segment, but you may
1347 not dictate data to load into it before your program executes.  When
1348 your program starts running, all the contents of the bss
1349 segment are zeroed bytes.
1350
1351 Addresses in the bss segment are allocated with special directives;
1352 you may not assemble anything directly into the bss segment.  Hence
1353 there are no bss subsegments. @xref{Comm}; @pxref{Lcomm}.
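
As a sketch, and assuming the @samp{@var{symbol}, @var{length}} operand
form of the two directives referenced above, space might be reserved
like this (the symbol names and sizes are hypothetical):
@example
.lcomm  scratch, 256    /* 256 bytes of bss, zeroed at start-up */
.comm   shared, 64      /* 64 bytes of common storage */
@end example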
1354
1355 @node Symbols, Expressions, Segments, Top
1356 @chapter Symbols
1357 Symbols are a central concept: the programmer uses symbols to name
1358 things, the linker uses symbols to link, and the debugger uses symbols
1359 to debug.
1360
1361 @quotation
1362 @emph{Warning:} @code{as} does not place symbols in the object file in
1363 the same order they were declared.  This may break some debuggers.
1364 @end quotation
1365
1366 @menu
1367 * Labels::                      Labels
1368 * Setting Symbols::             Giving Symbols Other Values
1369 * Symbol Names::                Symbol Names
1370 * Dot::                         The Special Dot Symbol
1371 * Symbol Attributes::           Symbol Attributes
1372 @end menu
1373
1374 @node Labels, Setting Symbols, Symbols, Symbols
1375 @section Labels
1376 A @dfn{label} is written as a symbol immediately followed by a colon
1377 @samp{:}.  The symbol then represents the current value of the
1378 active location counter, and is, for example, a suitable instruction
1379 operand.  You are warned if you use the same symbol to represent two
1380 different locations: the first definition overrides any other
1381 definitions.
1382
1383 @node Setting Symbols, Symbol Names, Labels, Symbols
1384 @section Giving Symbols Other Values
1385 A symbol can be given an arbitrary value by writing a symbol, followed
1386 by an equals sign @samp{=}, followed by an expression
1387 (@pxref{Expressions}).  This is equivalent to using the @code{.set}
1388 directive.  @xref{Set}.
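
For example, assuming the @samp{@var{symbol}, @var{expression}} operand
form of @code{.set}, the two statements below should have the same
effect; @code{table_size} is a hypothetical symbol name.
@example
table_size = 16
.set table_size, 16
@end example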
1389
1390 @node Symbol Names, Dot, Setting Symbols, Symbols
1391 @section Symbol Names
1392 Symbol names begin with a letter or with one of @samp{$._}.  That
1393 character may be followed by any string of digits, letters,
1394 underscores and dollar signs.  Case of letters is significant:
1395 @code{foo} is a different symbol name than @code{Foo}.
1396
1397 @c if am29k
1398 For the AMD 29K family, @samp{?} is also allowed in the
1399 body of a symbol name, though not at its beginning.
1400 @c fi am29k
1401
1402 Each symbol has exactly one name. Each name in an assembly language
1403 program refers to exactly one symbol. You may use that symbol name any
1404 number of times in a program.
1405
1406 @menu
1407 * Local Symbols::               Local Symbol Names
1408 @end menu
1409
1410 @node Local Symbols,  , Symbol Names, Symbol Names
1411 @subsection Local Symbol Names
1412
1413 Local symbols help compilers and programmers use names temporarily.
1414 There are ten local symbol names, which are re-used throughout the
1415 program.  You may refer to them using the names @samp{0} @samp{1}
1416 @dots{} @samp{9}.  To define a local symbol, write a label of the form
1417 @samp{@b{N}:} (where @b{N} represents any digit).  To refer to the most
1418 recent previous definition of that symbol write @samp{@b{N}b}, using the
1419 same digit as when you defined the label.  To refer to the next
1420 definition of a local label, write @samp{@b{N}f}---where @b{N} gives you
1421 a choice of 10 forward references.  The @samp{b} stands for
1422 ``backwards'' and the @samp{f} stands for ``forwards''.
1423
1424 Local symbols are not emitted by the current GNU C compiler.
1425
1426 There is no restriction on how you can use these labels, but
1427 remember that at any point in the assembly you can refer to at most
1428 10 prior local labels and to at most 10 forward local labels.
1429
1430 Local symbol names are only a notation device.  They are immediately
1431 transformed into more conventional symbol names before the assembler
1432 uses them.  The symbol names stored in the symbol table, appearing in
1433 error messages and optionally emitted to the object file have these
1434 parts:
1435
1436 @table @code
1437 @item L
1438 All local labels begin with @samp{L}. Normally both @code{as} and
1439 @code{ld} forget symbols that start with @samp{L}. These labels are
1440 used for symbols you are never intended to see.  If you give the
1441 @samp{-L} option then @code{as} will retain these symbols in the
1442 object file. If you also instruct @code{ld} to retain these symbols,
1443 you may use them in debugging.
1444
1445 @item @var{digit}
1446 If the label is written @samp{0:} then the digit is @samp{0}.
1447 If the label is written @samp{1:} then the digit is @samp{1}.
1448 And so on up through @samp{9:}.
1449
1450 @item @ctrl{A}
1451 This unusual character is included so you don't accidentally invent
1452 a symbol of the same name.  The character has ASCII value
1453 @samp{\001}.
1454
1455 @item @emph{ordinal number}
This is a serial number to keep the labels distinct.  The first
@samp{0:} gets the number @samp{1}; the 15th @samp{0:} gets the
number @samp{15}; @emph{etc.}  Likewise for the other labels @samp{1:}
through @samp{9:}.
1460 @end table
1461
1462 For instance, the first @code{1:} is named @code{L1@ctrl{A}1}, the 44th
1463 @code{3:} is named @code{L3@ctrl{A}44}.
1464
1465 @node Dot, Symbol Attributes, Symbol Names, Symbols
1466 @section The Special Dot Symbol
1467
1468 The special symbol @samp{.} refers to the current address that
1469 @code{as} is assembling into.  Thus, the expression @samp{melvin:
1470 .long .} will cause @code{melvin} to contain its own address.
1471 Assigning a value to @code{.} is treated the same as a @code{.org}
1472 directive.  Thus, the expression @samp{.=.+4} is the same as saying
1473 @c if not am29k
1474 @c @samp{.space 4}.
1475 @c fi not am29k
1476 @c if am29k
1477 @samp{.block 4}.
1478 @c fi am29k
1479
1480 @node Symbol Attributes,  , Dot, Symbols
1481 @section Symbol Attributes
1482 Every symbol has these attributes: Value, Type, Descriptor, and ``Other''.
1483 @c if internals
1484 @c The detailed definitions are in <a.out.h>.
1485 @c fi internals
1486
1487 If you use a symbol without defining it, @code{as} assumes zero for
1488 all these attributes, and probably won't warn you.  This makes the
1489 symbol an externally defined symbol, which is generally what you
1490 would want.
1491
1492 @menu
1493 * Symbol Value::                Value
1494 * Symbol Type::                 Type
1495 * Symbol Desc::                 Descriptor
1496 * Symbol Other::                Other
1497 @end menu
1498
1499 @node Symbol Value, Symbol Type, Symbol Attributes, Symbol Attributes
1500 @subsection Value
1501 The value of a symbol is (usually) 32 bits, the size of one GNU C
1502 @code{int}.  For a symbol which labels a location in the
1503 text, data, bss or absolute segments the
1504 value is the number of addresses from the start of that segment to
1505 the label.  Naturally for text, data and bss
1506 segments the value of a symbol changes as @code{ld} changes segment
base addresses during linking.  Absolute symbols' values do
1508 not change during linking: that is why they are called absolute.
1509
1510 The value of an undefined symbol is treated in a special way.  If it is
1511 0 then the symbol is not defined in this assembler source program, and
1512 @code{ld} will try to determine its value from other programs it is
1513 linked with.  You make this kind of symbol simply by mentioning a symbol
1514 name without defining it.  A non-zero value represents a @code{.comm}
1515 common declaration.  The value is how much common storage to reserve, in
1516 bytes (addresses).  The symbol refers to the first address of the
1517 allocated storage.
1518
1519 @node Symbol Type, Symbol Desc, Symbol Value, Symbol Attributes
1520 @subsection Type
1521 The type attribute of a symbol is 8 bits encoded in a devious way.
1522 We kept this coding standard for compatibility with older operating
1523 systems.
1524
1525 @ifinfo
1526 @example
1527
1528         7     6     5     4     3     2     1     0     bit numbers
1529      +-----+-----+-----+-----+-----+-----+-----+-----+
1530      |                 |                       |     |
1531      |   N_STAB bits   |      N_TYPE bits      |N_EXT|
1532      |                 |                       | bit |
1533      +-----+-----+-----+-----+-----+-----+-----+-----+
1534
1535                      Type byte
1536 @end example
1537 @end ifinfo
1538 @tex
1539 \vskip 1pc
1540 \halign{#\quad&#\cr
1541 \ibox{3cm}{7}\ibox{4cm}{4}\ibox{1cm}{0}&bit numbers\cr
1542 \boxit{3cm}{{\tt N\_STAB} bits}\boxit{4cm}{{\tt N\_TYPE}
1543 bits}\boxit{1cm}{\tt N\_EXT}\cr
1544 \hfill {\bf Type} byte\hfill\cr
1545 }
1546 @end tex
1547
1548 @subsubsection @code{N_EXT} bit
1549 This bit is set if @code{ld} might need to use the symbol's type bits
1550 and value.  If this bit is off, then @code{ld} can ignore the
1551 symbol while linking.  It is set in two cases.  If the symbol is
1552 undefined, then @code{ld} is expected to find the symbol's value
1553 elsewhere in another program module.  Otherwise the symbol has the
1554 value given, but this symbol name and value are revealed to any other
1555 programs linked in the same executable program.  This second use of
1556 the @code{N_EXT} bit is most often made by a @code{.globl} statement.
1557
1558 @subsubsection @code{N_TYPE} bits
1559 These establish the symbol's ``type'', which is mainly a relocation
1560 concept.  Common values are detailed in the manual describing the
1561 executable file format.
1562
1563 @subsubsection @code{N_STAB} bits
1564 Common values for these bits are described in the manual on the
1565 executable file format.
1566
1567 @node Symbol Desc, Symbol Other, Symbol Type, Symbol Attributes
1568 @subsection Descriptor
1569 This is an arbitrary 16-bit value.  You may establish a symbol's
1570 descriptor value by using a @code{.desc} statement (@pxref{Desc}).
1571 A descriptor value means nothing to @code{as}.
1572
1573 @node Symbol Other,  , Symbol Desc, Symbol Attributes
1574 @subsection Other
1575 This is an arbitrary 8-bit value.  It means nothing to @code{as}.
1576
1577 @node Expressions, Pseudo Ops, Symbols, Top
1578 @chapter Expressions
1579 An @dfn{expression} specifies an address or numeric value.
1580 Whitespace may precede and/or follow an expression.
1581
1582 @menu
1583 * Empty Exprs::                 Empty Expressions
1584 * Integer Exprs::               Integer Expressions
1585 @end menu
1586
1587 @node Empty Exprs, Integer Exprs, Expressions, Expressions
1588 @section Empty Expressions
1589 An empty expression has no value: it is just whitespace or null.
1590 Wherever an absolute expression is required, you may omit the
1591 expression and @code{as} will assume a value of (absolute) 0.  This
1592 is compatible with other assemblers.
1593
1594 @node Integer Exprs,  , Empty Exprs, Expressions
1595 @section Integer Expressions
1596 An @dfn{integer expression} is one or more @emph{arguments} delimited
1597 by @emph{operators}.
1598
1599 @menu
1600 * Arguments::                   Arguments
1601 * Operators::                   Operators
1602 * Prefix Ops::                  Prefix Operators
1603 * Infix Ops::                   Infix Operators
1604 @end menu
1605
1606 @node Arguments, Operators, Integer Exprs, Integer Exprs
1607 @subsection Arguments
1608
1609 @dfn{Arguments} are symbols, numbers or subexpressions.  In other
1610 contexts arguments are sometimes called ``arithmetic operands''.  In
1611 this manual, to avoid confusing them with the ``instruction operands'' of
1612 the machine language, we use the term ``argument'' to refer to parts of
1613 expressions only, reserving the word ``operand'' to refer only to machine
1614 instruction operands.
1615
1616 Symbols are evaluated to yield @{@var{segment} @var{NNN}@} where
1617 @var{segment} is one of text, data, bss, absolute,
1618 or @code{undefined}.  @var{NNN} is a signed, 2's complement 32 bit
1619 integer.
1620
1621 Numbers are usually integers.
1622
1623 A number can be a flonum or bignum.  In this case, you are warned
1624 that only the low order 32 bits are used, and @code{as} pretends
1625 these 32 bits are an integer.  You may write integer-manipulating
1626 instructions that act on exotic constants, compatible with other
1627 assemblers.
1628
1629 Subexpressions are a left parenthesis @samp{(} followed by an integer
1630 expression, followed by a right parenthesis @samp{)}; or a prefix
1631 operator followed by an argument.
1632
1633 @node Operators, Prefix Ops, Arguments, Integer Exprs
1634 @subsection Operators
1635 @dfn{Operators} are arithmetic functions, like @code{+} or @code{%}.  Prefix
1636 operators are followed by an argument.  Infix operators appear
1637 between their arguments.  Operators may be preceded and/or followed by
1638 whitespace.
1639
1640 @node Prefix Ops, Infix Ops, Operators, Integer Exprs
1641 @subsection Prefix Operators
1642 @code{as} has the following @dfn{prefix operators}.  They each take
1643 one argument, which must be absolute.
1644 @table @code
1645 @item -
1646 @dfn{Negation}.  Two's complement negation.
1647 @item ~
1648 @dfn{Complementation}.  Bitwise not.
1649 @end table
1650
1651 @node Infix Ops,  , Prefix Ops, Integer Exprs
1652 @subsection Infix Operators
1653
1654 @dfn{Infix operators} take two arguments, one on either side.  Operators
1655 have precedence, but operations with equal precedence are performed left
1656 to right.  Apart from @code{+} or @code{-}, both arguments must be
1657 absolute, and the result is absolute.
1658
1659 @enumerate
1660
1661 @item
1662 Highest Precedence
1663 @table @code
1664 @item *
1665 @dfn{Multiplication}.
1666 @item /
@dfn{Division}.  Truncation is the same as the C operator @samp{/}.
1668 @item %
1669 @dfn{Remainder}.
1670 @item <
1671 @itemx <<
@dfn{Shift Left}.  Same as the C operator @samp{<<}.
1673 @item >
1674 @itemx >>
@dfn{Shift Right}.  Same as the C operator @samp{>>}.
1676 @end table
1677
1678 @item
1679 Intermediate precedence
1680 @table @code
1681 @item |
1682 @dfn{Bitwise Inclusive Or}.
1683 @item &
1684 @dfn{Bitwise And}.
1685 @item ^
1686 @dfn{Bitwise Exclusive Or}.
1687 @item !
1688 @dfn{Bitwise Or Not}.
1689 @end table
1690
1691 @item
1692 Lowest Precedence
1693 @table @code
1694 @item +
1695 @dfn{Addition}.  If either argument is absolute, the result
1696 has the segment of the other argument.
1697 If either argument is pass1 or undefined, the result is pass1.
1698 Otherwise @code{+} is illegal.
1699 @item -
1700 @dfn{Subtraction}.  If the right argument is absolute, the
1701 result has the segment of the left argument.
1702 If either argument is pass1 the result is pass1.
1703 If either argument is undefined the result is difference segment.
1704 If both arguments are in the same segment, the result is absolute---provided
1705 that segment is one of text, data or bss.
1706 Otherwise subtraction is illegal.
1707 @end table
1708 @end enumerate
1709
1710 The sense of the rule for addition is that it's only meaningful to add
1711 the @emph{offsets} in an address; you can only have a defined segment in
1712 one of the two arguments.
1713
1714 Similarly, you can't subtract quantities from two different segments.
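
The following sketch illustrates these rules; the labels @code{start}
and @code{end} are hypothetical, and the values in the comments are
what the precedence table above implies when read as written.

@example
start:
        .long 2 + 3 * 4        /* 14: `*' binds tighter than `+' */
        .long 1 << 4 + 1       /* 17: (1 << 4) + 1, unlike C */
        .long 8 - 4 - 2        /* 2: equal precedence, left to right */
end:
        .long end - start      /* 12: same segment, so absolute */
@end example

Note that the shift operators share the highest precedence level here,
so an expression copied from C may need extra parentheses.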
1715
1716 @node Pseudo Ops, Machine Dependent, Expressions, Top
1717 @chapter Assembler Directives
1718 @menu
1719 * Abort::                       The Abort directive causes as to abort
1720 * Align::                       Pad the location counter to a power of 2
1721 * App-File::                    Set the logical file name
1722 * Ascii::                       Fill memory with bytes of ASCII characters
1723 * Asciz::                       Fill memory with bytes of ASCII characters followed
1724                 by a null.
1725 * Byte::                        Fill memory with 8-bit integers
1726 * Comm::                        Reserve public space in the BSS segment
1727 * Data::                        Change to the data segment
1728 * Desc::                        Set the n_desc of a symbol
1729 * Double::                      Fill memory with double-precision floating-point numbers
1730 * Else::                        @code{.else}
1731 * End::                         @code{.end}
1732 * Endif::                       @code{.endif}
1733 * Equ::                         @code{.equ @var{symbol}, @var{expression}}
1734 * Extern::                      @code{.extern}
1735 * Fill::                        Fill memory with repeated values
1736 * Float::                       Fill memory with single-precision floating-point numbers
1737 * Global::                      Make a symbol visible to the linker
1738 * Ident::                       @code{.ident}
1739 * If::                          @code{.if @var{absolute expression}}
1740 * Include::                     @code{.include "@var{file}"}
1741 * Int::                         Fill memory with 32-bit integers
1742 * Lcomm::                       Reserve private space in the BSS segment
1743 * Line::                        Set the logical line number
1744 * Ln::                          @code{.ln @var{line-number}}
1745 * List::                        @code{.list}, @code{.nolist}, @code{.eject}, @code{.lflags}, @code{.title}, @code{.sbttl}
1746 * Long::                        Fill memory with 32-bit integers
1747 * Lsym::                        Create a local symbol
1748 * Octa::                        Fill memory with 128-bit integers
1749 * Org::                         Change the location counter
1750 * Quad::                        Fill memory with 64-bit integers
1751 * Set::                         Set the value of a symbol
1752 * Short::                       Fill memory with 16-bit integers
1753 * Single::                      @code{.single @var{flonums}}
1754 * Stab::                        Store debugging information
1755 * Text::                        Change to the text segment
1756 @c if am29k or sparc
1757 * Word::                        Fill memory with 32-bit integers
1758 @c else (not am29k or sparc)
1759 * Deprecated::                  Deprecated Directives
1760 * Machine Options::             Options
1761 * Machine Syntax::              Syntax
1762 * Floating Point::              Floating Point
1763 * Machine Directives::          Machine Directives
1764 * Opcodes::                     Opcodes
1765 @end menu
1766
1767 All assembler directives have names that begin with a period (@samp{.}).
1768 The rest of the name is letters: their case does not matter.
1769
1770 This chapter discusses directives present in all versions of GNU
1771 @code{as}; @pxref{Machine Dependent} for additional directives.
1772
1773 @node Abort, Align, Pseudo Ops, Pseudo Ops
1774 @section @code{.abort}
1775 This directive stops the assembly immediately.  It is for
1776 compatibility with other assemblers.  The original idea was that the
1777 assembler program would be piped into the assembler.  If the sender
of a program quit, it could use this directive to tell @code{as} to
1779 quit also.  One day @code{.abort} will not be supported.
1780
1781 @node Align, App-File, Abort, Pseudo Ops
1782 @section @code{.align @var{absolute-expression} , @var{absolute-expression}}
1783 Pad the location counter (in the current subsegment) to a particular
1784 storage boundary.  The first expression is the number of low-order zero
1785 bits the location counter will have after advancement.  For example
@samp{.align 3} will advance the location counter until it is a multiple of
1787 8.  If the location counter is already a multiple of 8, no change is
1788 needed.
1789
1790 The second expression gives the value to be stored in the padding
1791 bytes.  It (and the comma) may be omitted.  If it is omitted, the
1792 padding bytes are zero.
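
A short sketch of both forms, with arbitrary data values chosen only
for illustration:

@example
        .byte 1
        .align 3            /* zero padding up to a multiple of 8 */
        .byte 2
        .align 2, 0xff      /* 0xff padding up to a multiple of 4 */
@end example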
1793
1794 @node App-File, Ascii, Align, Pseudo Ops
1795 @section @code{.app-file @var{string}}
1796 @code{.app-file} tells @code{as} that we are about to start a new
1797 logical file.  @var{String} is the new file name.  In general, the
1798 filename is recognized whether or not it is surrounded by quotes @samp{"};
but if you wish to specify an empty file name,
1800 you must give the quotes--@code{""}.  This statement may go away in
1801 future: it is only recognized to be compatible with old @code{as}
1802 programs.
1803
1804 @node Ascii, Asciz, App-File, Pseudo Ops
1805 @section @code{.ascii "@var{string}"}@dots{}
1806 @code{.ascii} expects zero or more string literals (@pxref{Strings})
1807 separated by commas.  It assembles each string (with no automatic
1808 trailing zero byte) into consecutive addresses.
1809
1810 @node Asciz, Byte, Ascii, Pseudo Ops
1811 @section @code{.asciz "@var{string}"}@dots{}
1812 @code{.asciz} is just like @code{.ascii}, but each string is followed by
1813 a zero byte.  The ``z'' in @samp{.asciz} stands for ``zero''.
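
For illustration, assuming nothing beyond the two directives described
above:

@example
        .ascii "abc"           /* 3 bytes: a, b, c */
        .asciz "abc"           /* 4 bytes: a, b, c, 0 */
        .ascii "ab", "cd"      /* 4 bytes: a, b, c, d */
@end example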
1814
1815 @node Byte, Comm, Asciz, Pseudo Ops
1816 @section @code{.byte @var{expressions}}
1817
1818 @code{.byte} expects zero or more expressions, separated by commas.
1819 Each expression is assembled into the next byte.
1820
1821 @node Comm, Data, Byte, Pseudo Ops
1822 @section @code{.comm @var{symbol} , @var{length} }
1823 @code{.comm} declares a named common area in the bss segment.  Normally
1824 @code{ld} reserves memory addresses for it during linking, so no partial
1825 program defines the location of the symbol.  Use @code{.comm} to tell
1826 @code{ld} that it must be at least @var{length} bytes long.  @code{ld}
1827 will allocate space for each @code{.comm} symbol that is at least as
1828 long as the longest @code{.comm} request in any of the partial programs
1829 linked.  @var{length} is an absolute expression.
1830
1831 @node Data, Desc, Comm, Pseudo Ops
1832 @section @code{.data @var{subsegment}}
1833 @code{.data} tells @code{as} to assemble the following statements onto the
1834 end of the data subsegment numbered @var{subsegment} (which is an
1835 absolute expression).  If @var{subsegment} is omitted, it defaults
1836 to zero.
1837
1838 @node Desc, Double, Data, Pseudo Ops
1839 @section @code{.desc @var{symbol}, @var{absolute-expression}}
1840 This directive sets the descriptor of the symbol (@pxref{Symbol Attributes})
1841 to the low 16 bits of @var{absolute-expression}.
1842
1843 @node Double, Else, Desc, Pseudo Ops
1844 @section @code{.double @var{flonums}}
1845 @code{.double} expects zero or more flonums, separated by commas.  It assembles
1846 floating point numbers.
1847 @c if all-arch
1848 @c The exact kind of floating point numbers
1849 @c emitted depends on how @code{as} is configured.  @xref{Machine
1850 @c Dependent}.
1851 @c fi all-arch
1852 @c if am29k
1853 On the AMD 29K family the floating point format used is IEEE.
1854 @c fi am29k
1855
1856 @node Else, End, Double, Pseudo Ops
1857 @section @code{.else}
1858 @code{.else} is part of the @code{as} support for conditional assembly;
1859 @pxref{If}.  It marks the beginning of a section of code to be assembled
1860 if the condition for the preceding @code{.if} was false.
1861
1862 @ignore
1863 @node End, Endif, Else, Pseudo Ops
1864 @section @code{.end}
1865 This doesn't do anything---but isn't an s_ignore, so I suspect it's
1866 meant to do something eventually (which is why it isn't documented here
1867 as "for compatibility with blah").
1868 @end ignore
1869
1870 @node Endif, Equ, End, Pseudo Ops
1871 @section @code{.endif}
1872 @code{.endif} is part of the @code{as} support for conditional assembly;
1873 it marks the end of a block of code that is only assembled
1874 conditionally.  @xref{If}.
1875
1876 @node Equ, Extern, Endif, Pseudo Ops
1877 @section @code{.equ @var{symbol}, @var{expression}}
1878
1879 This directive sets the value of @var{symbol} to @var{expression}.
1880 It is synonymous with @samp{.set}; @pxref{Set}.
1881
1882 @node Extern, Fill, Equ, Pseudo Ops
1883 @section @code{.extern}
1884 @code{.extern} is accepted in the source program---for compatibility
1885 with other assemblers---but it is ignored.  GNU @code{as} treats
1886 all undefined symbols as external.
1887
1888 @node Fill, Float, Extern, Pseudo Ops
1889 @section @code{.fill @var{repeat} , @var{size} , @var{value}}
@var{repeat}, @var{size} and @var{value} are absolute expressions.
1891 This emits @var{repeat} copies of @var{size} bytes.  @var{Repeat}
1892 may be zero or more.  @var{Size} may be zero or more, but if it is
1893 more than 8, then it is deemed to have the value 8, compatible with
other people's assemblers.  The contents of each repetition
are taken from an 8-byte number.  The highest order 4 bytes are
zero.  The lowest order 4 bytes are @var{value} rendered in the
byte-order of an integer on the computer @code{as} is assembling for.
The @var{size} bytes of each repetition are taken from the lowest order
@var{size} bytes of this number.  Again, this bizarre behavior is
1900 compatible with other people's assemblers.
1901
1902 @var{Size} and @var{value} are optional.
1903 If the second comma and @var{value} are absent, @var{value} is
1904 assumed zero.  If the first comma and following tokens are absent,
1905 @var{size} is assumed to be 1.
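
A sketch of the three forms follows; the operand values are arbitrary.
The order of the bytes within each 4-byte unit in the last line depends
on the target's integer byte order, as noted above.

@example
        .fill 4             /* 4 bytes of zero: size 1, value 0 */
        .fill 3, 2          /* 3 units of 2 bytes, each zero */
        .fill 2, 4, 0x41    /* 2 units of 4 bytes, each holding 0x41 */
@end example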
1906
1907 @node Float, Global, Fill, Pseudo Ops
1908 @section @code{.float @var{flonums}}
1909 This directive assembles zero or more flonums, separated by commas.  It
1910 has the same effect as @code{.single}.
1911 @c if all-arch
1912 @c The exact kind of floating point numbers emitted depends on how
1913 @c @code{as} is configured.
1914 @c @xref{Machine Dependent}.
1915 @c fi all-arch
1916 @c if am29k
1917 The floating point format used for the AMD 29K family is IEEE.
1918 @c fi am29k
1919
1920 @node Global, Ident, Float, Pseudo Ops
1921 @section @code{.global @var{symbol}}, @code{.globl @var{symbol}}
1922 @code{.global} makes the symbol visible to @code{ld}.  If you define
1923 @var{symbol} in your partial program, its value is made available to
1924 other partial programs that are linked with it.  Otherwise,
1925 @var{symbol} will take its attributes from a symbol of the same name
1926 from another partial program it is linked with.
1927
1928 This is done by setting the @code{N_EXT} bit of that symbol's type byte
1929 to 1. @xref{Symbol Attributes}.
1930
1931 Both spellings (@samp{.globl} and @samp{.global}) are accepted, for
1932 compatibility with other assemblers.
1933
1934 @node Ident, If, Global, Pseudo Ops
1935 @section @code{.ident}
1936 This directive is used by some assemblers to place tags in object files.
1937 GNU @code{as} simply accepts the directive for source-file
1938 compatibility with such assemblers, but does not actually emit anything
1939 for it.
1940
1941 @node If, Include, Ident, Pseudo Ops
1942 @section @code{.if @var{absolute expression}}
1943 @code{.if} marks the beginning of a section of code which is only
1944 considered part of the source program being assembled if the argument
1945 (which must be an @var{absolute expression}) is non-zero.  The end of
1946 the conditional section of code must be marked by @code{.endif}
1947 (@pxref{Endif}); optionally, you may include code for the
alternative condition, flagged by @code{.else} (@pxref{Else}).
1949
1950 The following variants of @code{.if} are also supported:
1951 @table @code
1952 @item ifdef @var{symbol}
1953 Assembles the following section of code if the specified @var{symbol}
1954 has been defined.
1955
1956 @ignore
1957 @item ifeqs
1958 BOGONS??
1959 @end ignore
1960
1961 @item ifndef @var{symbol}
1962 @itemx ifnotdef @var{symbol}
1963 Assembles the following section of code if the specified @var{symbol}
1964 has not been defined.  Both spelling variants are equivalent.
1965
1966 @ignore
1967 @item ifnes
1968 NO bogons, I presume?
1969 @end ignore
1970 @end table
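
As a minimal sketch of conditional assembly, assuming a hypothetical
absolute symbol @code{DEBUG} that is given a value with @code{.set}:

@example
        .set DEBUG, 1
        .if DEBUG
        .long 1             /* assembled, because DEBUG is non-zero */
        .else
        .long 0             /* skipped */
        .endif
@end example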
1971
1972 @node Include, Int, If, Pseudo Ops
1973 @section @code{.include "@var{file}"}
1974 This directive provides a way to include supporting files at specified
1975 points in your source program.  The code from @var{file} is assembled as
1976 if it followed the point of the @code{.include}; when the end of the
1977 included file is reached, assembly of the original file continues.  You
1978 can control the search paths used with the @samp{-I} command-line option
1979 (@pxref{Options}).  Quotation marks are required around @var{file}.
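
For example, assuming a hypothetical file @file{defs.s} that can be
found on the search path:

@example
        .include "defs.s"
@end example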
1980
1981 @node Int, Lcomm, Include, Pseudo Ops
1982 @section @code{.int @var{expressions}}
1983 Expect zero or more @var{expressions}, of any segment, separated by
1984 commas.  For each expression, emit a 32-bit number that will, at run
1985 time, be the value of that expression.  The byte order of the
1986 expression depends on what kind of computer will run the program.
1987
1988 @node Lcomm, Line, Int, Pseudo Ops
1989 @section @code{.lcomm @var{symbol} , @var{length}}
1990 Reserve @var{length} (an absolute expression) bytes for a local
1991 common denoted by @var{symbol}.  The segment and value of @var{symbol} are
1992 those of the new local common.  The addresses are allocated in the
1993 bss segment, so at run-time the bytes will start off zeroed.
1994 @var{Symbol} is not declared global (@pxref{Global}), so is normally
1995 not visible to @code{ld}.
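
The sketch below contrasts @code{.lcomm} with @code{.comm}
(@pxref{Comm}); the symbol names and lengths are hypothetical.

@example
        .comm  shared_buf, 256    /* ld reserves at least 256 bytes */
        .lcomm scratch, 64        /* 64 bss bytes, local to this file */
@end example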
1996
1997 @c if not am29k
1998 @ignore
1999 @node Line, Ln, Lcomm, Pseudo Ops
2000 @section @code{.line @var{line-number}}, @code{.ln @var{line-number}}
2001 @code{.line}, and its alternate spelling @code{.ln}, tell
2002 @end ignore
2003 @c fi not am29k
2004 @c if am29k
2005 @node Ln, List, Line, Pseudo Ops
2006 @section @code{.ln @var{line-number}}
2007 Tell
2008 @c fi am29k
2009 @code{as} to change the logical line number.  @var{line-number} must be
2010 an absolute expression.  The next line will have that logical line
2011 number.  So any other statements on the current line (after a statement
2012 separator character
2013 @c if am29k
2014 @samp{@@})
2015 @c fi am29k
2016 @c if not am29k
2017 @c @code{;})
2018 @c fi not am29k
2019 will be reported as on logical line number
@var{line-number} @minus{} 1.
2021 One day this directive will be unsupported: it is used only
2022 for compatibility with existing assembler programs. @refill
2023
2024 @node List, Long, Ln, Pseudo Ops
2025 @section @code{.list}, @code{.nolist}, @code{.eject}, @code{.lflags}, @code{.title}, @code{.sbttl}
2026 GNU @code{as} ignores these directives; however, they're
2027 accepted for compatibility with assemblers that use them.
2028
2029 @node Long, Lsym, List, Pseudo Ops
2030 @section @code{.long @var{expressions}}
2031 @code{.long} is the same as @samp{.int}, @pxref{Int}.
2032
2033 @node Lsym, Octa, Long, Pseudo Ops
2034 @section @code{.lsym @var{symbol}, @var{expression}}
2035 @code{.lsym} creates a new symbol named @var{symbol}, but does not put it in
2036 the hash table, ensuring it cannot be referenced by name during the
2037 rest of the assembly.  This sets the attributes of the symbol to be
2038 the same as the expression value:
2039 @example
2040 @var{other} = @var{descriptor} = 0
2041 @var{type} = @r{(segment of @var{expression})}
2042 N_EXT = 0
2043 @var{value} = @var{expression}
2044 @end example
2045
2046 @node Octa, Org, Lsym, Pseudo Ops
2047 @section @code{.octa @var{bignums}}
2048 This directive expects zero or more bignums, separated by commas.  For each
2049 bignum, it emits a 16-byte integer.
2050
The term ``octa'' comes from contexts in which a ``word'' was two bytes;
hence @emph{octa}-word for 16 bytes.
2053
2054 @node Org, Quad, Octa, Pseudo Ops
2055 @section @code{.org @var{new-lc} , @var{fill}}
2056
2057 @code{.org} will advance the location counter of the current segment to
2058 @var{new-lc}.  @var{new-lc} is either an absolute expression or an
2059 expression with the same segment as the current subsegment.  That is,
2060 you can't use @code{.org} to cross segments: if @var{new-lc} has the
2061 wrong segment, the @code{.org} directive is ignored.  To be compatible
2062 with former assemblers, if the segment of @var{new-lc} is absolute,
2063 @code{as} will issue a warning, then pretend the segment of @var{new-lc}
2064 is the same as the current subsegment.
2065
2066 @code{.org} may only increase the location counter, or leave it
2067 unchanged; you cannot use @code{.org} to move the location counter
2068 backwards.
2069
2070 @c double negative used below "not undefined" because this is a specific
2071 @c reference to "undefined" (as SEG_UNKNOWN is called in this manual)
2072 @c segment. pesch@cygnus.com 18feb91
Because @code{as} tries to assemble programs in one pass, @var{new-lc}
2074 may not be undefined.  If you really detest this restriction we eagerly await
2075 a chance to share your improved assembler.
2076
2077 Beware that the origin is relative to the start of the segment, not
2078 to the start of the subsegment.  This is compatible with other
2079 people's assemblers.
2080
2081 When the location counter (of the current subsegment) is advanced, the
2082 intervening bytes are filled with @var{fill} which should be an
2083 absolute expression.  If the comma and @var{fill} are omitted,
2084 @var{fill} defaults to zero.
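
A minimal sketch, assuming a hypothetical label @code{base} defined in
the current segment so that @var{new-lc} has the matching segment:

@example
base:
        .byte 1
        .org base + 16, 0xff    /* advance to base+16, filling with 0xff */
        .byte 2
@end example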
2085
2086 @node Quad, Set, Org, Pseudo Ops
2087 @section @code{.quad @var{bignums}}
2088 @code{.quad} expects zero or more bignums, separated by commas.  For
each bignum, it emits an 8-byte integer.  If the bignum won't fit in 8
2090 bytes, it prints a warning message; and just takes the lowest order 8
2091 bytes of the bignum.
2092
2093 The term ``quad'' comes from contexts in which a ``word'' was two bytes;
2094 hence @emph{quad}-word for 8 bytes.
2095
2096 @node Set, Short, Quad, Pseudo Ops
2097 @section @code{.set @var{symbol}, @var{expression}}
2098
2099 This directive sets the value of @var{symbol} to @var{expression}.  This
2100 will change @var{symbol}'s value and type to conform to
2101 @var{expression}.  If @code{N_EXT} is set, it remains set.
2102 (@xref{Symbol Attributes}.)
2103
2104 You may @code{.set} a symbol many times in the same assembly.
2105 If the expression's segment is unknowable during pass 1, a second
2106 pass over the source program will be forced.  The second pass is
2107 currently not implemented.  @code{as} will abort with an error
2108 message if one is required.
2109
2110 If you @code{.set} a global symbol, the value stored in the object
2111 file is the last value stored into it.
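
The following sketch sets a hypothetical symbol @code{count} more than
once; each @code{.long} picks up the value in effect at that point.

@example
        .set count, 1
        .long count             /* emits 1 */
        .set count, count + 1
        .long count             /* emits 2 */
@end example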
2112
2113 @node Short, Single, Set, Pseudo Ops
2114 @section @code{.short @var{expressions}}
2115 @c if not (sparc or amd29k)
2116 @c @code{.short} is the same as @samp{.word}.  @xref{Word}.
2117 @c fi not (sparc or amd29k)
2118 @c if (sparc or amd29k)
2119 This expects zero or more @var{expressions}, and emits
2120 a 16 bit number for each.
2121 @c fi (sparc or amd29k)
2122
2123 @node Single, Space, Short, Pseudo Ops
2124 @section @code{.single @var{flonums}}
2125 This directive assembles zero or more flonums, separated by commas.  It
2126 has the same effect as @code{.float}.
2127 @c if all-arch
2128 @c The exact kind of floating point numbers emitted depends on how
2129 @c @code{as} is configured.  @xref{Machine Dependent}.
2130 @c fi all-arch
2131 @c if am29k
2132 The floating point format used for the AMD 29K family is IEEE.
2133 @c fi am29k
2134
2135
@node Space, Stab, Single, Pseudo Ops
2137 @c if not am29k
2138 @ignore
2139 @section @code{.space @var{size} , @var{fill}}
2140 This directive emits @var{size} bytes, each of value @var{fill}.  Both
2141 @var{size} and @var{fill} are absolute expressions.  If the comma
2142 and @var{fill} are omitted, @var{fill} is assumed to be zero.
2143 @end ignore
2144 @c fi not am29k
2145
2146 @c if am29k
2147 @section @code{.space}
2148 This directive is ignored; it is accepted for compatibility with other
2149 AMD 29K assemblers.
2150
2151 @quotation
2152 @emph{Warning:} In other versions of GNU @code{as}, the directive
@code{.space} has the effect of @code{.block}.  @xref{Machine Directives}.
2154 @end quotation
2155 @c fi am29k
2156
2157 @node Stab, Text, Space, Pseudo Ops
2158 @section @code{.stabd, .stabn, .stabs}
2159 There are three directives that begin @samp{.stab}.
2160 All emit symbols (@pxref{Symbols}), for use by symbolic debuggers.
2161 The symbols are not entered in @code{as}' hash table: they
2162 cannot be referenced elsewhere in the source file.
2163 Up to five fields are required:
2164 @table @var
2165 @item string
2166 This is the symbol's name.  It may contain any character except @samp{\000},
2167 so is more general than ordinary symbol names.  Some debuggers used to
2168 code arbitrarily complex structures into symbol names using this field.
2169 @item type
2170 An absolute expression.  The symbol's type is set to the low 8
2171 bits of this expression.
2172 Any bit pattern is permitted, but @code{ld} and debuggers will choke on
2173 silly bit patterns.
2174 @item other
2175 An absolute expression.
2176 The symbol's ``other'' attribute is set to the low 8 bits of this expression.
2177 @item desc
2178 An absolute expression.
2179 The symbol's descriptor is set to the low 16 bits of this expression.
2180 @item value
2181 An absolute expression which becomes the symbol's value.
2182 @end table
2183
2184 If a warning is detected while reading a @code{.stabd}, @code{.stabn},
2185 or @code{.stabs} statement, the symbol has probably already been created
2186 and you will get a half-formed symbol in your object file.  This is
2187 compatible with earlier assemblers!
2188
2189 @table @code
2190 @item .stabd @var{type} , @var{other} , @var{desc}
2191
2192 The ``name'' of the symbol generated is not even an empty string.
2193 It is a null pointer, for compatibility.  Older assemblers used a
2194 null pointer so they didn't waste space in object files with empty
2195 strings.
2196
2197 The symbol's value is set to the location counter,
2198 relocatably.  When your program is linked, the value of this symbol
2199 will be where the location counter was when the @code{.stabd} was
2200 assembled.
2201
2202 @item .stabn @var{type} , @var{other} , @var{desc} , @var{value}
2203
2204 The name of the symbol is set to the empty string @code{""}.
2205
2206 @item .stabs @var{string} ,  @var{type} , @var{other} , @var{desc} , @var{value}
2207
2208 All five fields are specified.
2209 @end table
2210
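The three forms might be written as follows.  This is a sketch only: the
string and the numeric field values are arbitrary placeholders, not
types that any particular debugger is known to expect.
@example
        .stabs  "example",0x80,0,0,0    ; string, type, other, desc, value
        .stabn  0x44,0,12,0             ; name is the empty string
        .stabd  0x44,0,12               ; value is the location counter
@end example
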
2211 @node Text, Word, Stab, Pseudo Ops
2212 @section @code{.text @var{subsegment}}
2213 Tells @code{as} to assemble the following statements onto the end of
2214 the text subsegment numbered @var{subsegment}, which is an absolute
2215 expression.  If @var{subsegment} is omitted, subsegment number zero
2216 is used.
2217
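For example (the subsegment numbers are chosen arbitrarily):
@example
        .text   1       ; following statements go at the end of text subsegment 1
        .text   0       ; now at the end of text subsegment 0
        .text           ; no number given, so subsegment 0 again
@end example
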
2218 @node Word, Deprecated, Text, Pseudo Ops
2219 @section @code{.word @var{expressions}}
2220 This directive expects zero or more @var{expressions}, of any segment,
2221 separated by commas.
2222 @c if sparc or amd29k
2223 For each expression, @code{as} emits a 32-bit number.
2224 @c fi sparc or amd29k
2225 @c if not (sparc or amd29k)
2226 @c For each expression, @code{as} emits a 16-bit number.
2227 @c fi not (sparc or amd29k)
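For example, in this configuration the following statement (with
arbitrary values) would emit two 32-bit numbers:
@example
        .word   0x12345678, 1
@end example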
2228 @ignore
2229 @c if all-arch
2230 The byte order
2231 of the expression depends on what kind of computer will run the
2232 program.
2233 @c fi all-arch
2234 @end ignore
2235
2236 @ignore
2237 @c on the 29k this doesn't happen---32-bit addressability, period; no
2238 @c long/short jumps.
2239 @c if not am29k
2240 @subsection Special Treatment to support Compilers
2241
2242 In order to assemble compiler output into something that will work,
@code{as} will occasionally do strange things to @samp{.word} directives.
2244 Directives of the form @samp{.word sym1-sym2} are often emitted by
2245 compilers as part of jump tables.  Therefore, when @code{as} assembles a
2246 directive of the form @samp{.word sym1-sym2}, and the difference between
2247 @code{sym1} and @code{sym2} does not fit in 16 bits, @code{as} will
2248 create a @dfn{secondary jump table}, immediately before the next label.
2249 This @var{secondary jump table} will be preceded by a short-jump to the
2250 first byte after the secondary table.  This short-jump prevents the flow
2251 of control from accidentally falling into the new table.  Inside the
2252 table will be a long-jump to @code{sym2}.  The original @samp{.word}
2253 will contain @code{sym1} minus the address of the long-jump to
2254 @code{sym2}.
2255
2256 If there were several occurrences of @samp{.word sym1-sym2} before the
2257 secondary jump table, all of them will be adjusted.  If there was a
2258 @samp{.word sym3-sym4}, that also did not fit in sixteen bits, a
2259 long-jump to @code{sym4} will be included in the secondary jump table,
2260 and the @code{.word} directives will be adjusted to contain @code{sym3}
2261 minus the address of the long-jump to @code{sym4}; and so on, for as many
2262 entries in the original jump table as necessary.
2263 @end ignore
2264 @ignore
2265 @c if internals
2266 @emph{This feature may be disabled by compiling @code{as} with the
2267 @samp{-DWORKING_DOT_WORD} option.} This feature is likely to confuse
2268 assembly language programmers.
2269 @c fi internals
2270 @end ignore
2271
2272
2273 @node Deprecated, Machine Dependent, Word, Pseudo Ops
2274 @section Deprecated Directives
2275 One day these directives won't work.
2276 They are included for compatibility with older assemblers.
2277 @table @t
2278 @item .abort
2279 @item .app-file
2280 @item .line
2281 @end table
2282
2283 @node Machine Dependent, Machine Dependent, Pseudo Ops, Top
2284 @c if all-arch
2285 @c chapter Machine Dependent Features
2286 @c fi all-arch
2287 @c if 680x0
2288 @c chapter Machine Dependent Features: Motorola 680x0
2289 @c fi 680x0
2290 @c if amd29k
2291 @chapter Machine Dependent Features: AMD 29K
2292 @c fi amd29k
2293 @c pesch@cygnus.com: This version of the manual is specifically hacked
2294 @c                   for gas on a particular machine.
2295 @c                   We should have a config method of
2296 @c                   automating this; in the meantime, use ignore
2297 @c                   for the other architectures (or for their stubs)
2298 @ignore
2299 @c if all-arch
2300 @section Vax
2301 @c fi all-arch
2302 @subsection Options
2303
2304 The Vax version of @code{as} accepts any of the following options,
2305 gives a warning message that the option was ignored and proceeds.
2306 These options are for compatibility with scripts designed for other
2307 people's assemblers.
2308
2309 @table @asis
2310 @item @kbd{-D} (Debug)
2311 @itemx @kbd{-S} (Symbol Table)
2312 @itemx @kbd{-T} (Token Trace)
2313 These are obsolete options used to debug old assemblers.
2314
2315 @item @kbd{-d} (Displacement size for JUMPs)
2316 This option expects a number following the @kbd{-d}.  Like options
2317 that expect filenames, the number may immediately follow the
2318 @kbd{-d} (old standard) or constitute the whole of the command line
2319 argument that follows @kbd{-d} (GNU standard).
2320
2321 @item @kbd{-V} (Virtualize Interpass Temporary File)
2322 Some other assemblers use a temporary file.  This option
2323 commanded them to keep the information in active memory rather
2324 than in a disk file.  @code{as} always does this, so this
2325 option is redundant.
2326
2327 @item @kbd{-J} (JUMPify Longer Branches)
2328 Many 32-bit computers permit a variety of branch instructions
2329 to do the same job.  Some of these instructions are short (and
2330 fast) but have a limited range; others are long (and slow) but
2331 can branch anywhere in virtual memory.  Often there are 3
2332 flavors of branch: short, medium and long.  Some other
2333 assemblers would emit short and medium branches, unless told by
2334 this option to emit short and long branches.
2335
2336 @item @kbd{-t} (Temporary File Directory)
2337 Some other assemblers may use a temporary file, and this option
2338 takes a filename being the directory to site the temporary
2339 file.  @code{as} does not use a temporary disk file, so this
2340 option makes no difference.  @kbd{-t} needs exactly one
2341 filename.
2342 @end table
2343
2344 The Vax version of the assembler accepts two options when
2345 compiled for VMS.  They are @kbd{-h}, and @kbd{-+}.  The
2346 @kbd{-h} option prevents @code{as} from modifying the
2347 symbol-table entries for symbols that contain lowercase
2348 characters (I think).  The @kbd{-+} option causes @code{as} to
2349 print warning messages if the FILENAME part of the object file,
2350 or any symbol name is larger than 31 characters.  The @kbd{-+}
option also inserts some code following the @samp{_main}
2352 symbol so that the object file will be compatible with Vax-11
2353 "C".
2354
2355 @subsection Floating Point
2356 Conversion of flonums to floating point is correct, and
2357 compatible with previous assemblers.  Rounding is
2358 towards zero if the remainder is exactly half the least significant bit.
2359
2360 @code{D}, @code{F}, @code{G} and @code{H} floating point formats
2361 are understood.
2362
2363 Immediate floating literals (@emph{e.g.} @samp{S`$6.9})
2364 are rendered correctly.  Again, rounding is towards zero in the
2365 boundary case.
2366
2367 The @code{.float} directive produces @code{f} format numbers.
2368 The @code{.double} directive produces @code{d} format numbers.
2369
2370 @subsection Machine Directives
2371 The Vax version of the assembler supports four directives for
2372 generating Vax floating point constants.  They are described in the
2373 table below.
2374
2375 @table @code
2376 @item .dfloat
2377 This expects zero or more flonums, separated by commas, and
2378 assembles Vax @code{d} format 64-bit floating point constants.
2379
2380 @item .ffloat
2381 This expects zero or more flonums, separated by commas, and
2382 assembles Vax @code{f} format 32-bit floating point constants.
2383
2384 @item .gfloat
2385 This expects zero or more flonums, separated by commas, and
2386 assembles Vax @code{g} format 64-bit floating point constants.
2387
2388 @item .hfloat
2389 This expects zero or more flonums, separated by commas, and
2390 assembles Vax @code{h} format 128-bit floating point constants.
2391
2392 @end table
2393
2394 @subsection Opcodes
2395 All DEC mnemonics are supported.  Beware that @code{case@dots{}}
2396 instructions have exactly 3 operands.  The dispatch table that
2397 follows the @code{case@dots{}} instruction should be made with
@code{.word} statements.  This is compatible with all Unix
2399 assemblers we know of.
2400
2401 @subsection Branch Improvement
2402 Certain pseudo opcodes are permitted.  They are for branch
2403 instructions.  They expand to the shortest branch instruction that
2404 will reach the target.  Generally these mnemonics are made by
2405 substituting @samp{j} for @samp{b} at the start of a DEC mnemonic.
2406 This feature is included both for compatibility and to help
2407 compilers.  If you don't need this feature, don't use these
2408 opcodes.  Here are the mnemonics, and the code they can expand into.
2409
2410 @table @code
2411 @item jbsb
2412 @samp{Jsb} is already an instruction mnemonic, so we chose @samp{jbsb}.
2413 @table @asis
2414 @item (byte displacement)
2415 @kbd{bsbb @dots{}}
2416 @item (word displacement)
2417 @kbd{bsbw @dots{}}
2418 @item (long displacement)
2419 @kbd{jsb @dots{}}
2420 @end table
2421 @item jbr
2422 @itemx jr
2423 Unconditional branch.
2424 @table @asis
2425 @item (byte displacement)
2426 @kbd{brb @dots{}}
2427 @item (word displacement)
2428 @kbd{brw @dots{}}
2429 @item (long displacement)
2430 @kbd{jmp @dots{}}
2431 @end table
2432 @item j@var{COND}
2433 @var{COND} may be any one of the conditional branches
2434 @code{neq nequ eql eqlu gtr geq lss gtru lequ vc vs gequ cc lssu cs}.
2435 @var{COND} may also be one of the bit tests
2436 @code{bs bc bss bcs bsc bcc bssi bcci lbs lbc}.
2437 @var{NOTCOND} is the opposite condition to @var{COND}.
2438 @table @asis
2439 @item (byte displacement)
2440 @kbd{b@var{COND} @dots{}}
2441 @item (word displacement)
@kbd{b@var{NOTCOND} foo ; brw @dots{} ; foo:}
@item (long displacement)
@kbd{b@var{NOTCOND} foo ; jmp @dots{} ; foo:}
2445 @end table
2446 @item jacb@var{X}
2447 @var{X} may be one of @code{b d f g h l w}.
2448 @table @asis
2449 @item (word displacement)
2450 @kbd{@var{OPCODE} @dots{}}
2451 @item (long displacement)
2452 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: jmp @dots{} ; bar:}
2453 @end table
2454 @item jaob@var{YYY}
2455 @var{YYY} may be one of @code{lss leq}.
2456 @item jsob@var{ZZZ}
2457 @var{ZZZ} may be one of @code{geq gtr}.
2458 @table @asis
2459 @item (byte displacement)
2460 @kbd{@var{OPCODE} @dots{}}
2461 @item (word displacement)
2462 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: brw @var{destination} ; bar:}
2463 @item (long displacement)
2464 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: jmp @var{destination} ; bar: }
2465 @end table
2466 @item aobleq
2467 @itemx aoblss
2468 @itemx sobgeq
2469 @itemx sobgtr
2470 @table @asis
2471 @item (byte displacement)
2472 @kbd{@var{OPCODE} @dots{}}
2473 @item (word displacement)
2474 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: brw @var{destination} ; bar:}
2475 @item (long displacement)
2476 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: jmp @var{destination} ; bar:}
2477 @end table
2478 @end table
2479
@subsection Operands
2481 The immediate character is @samp{$} for Unix compatibility, not
2482 @samp{#} as DEC writes it.
2483
2484 The indirect character is @samp{*} for Unix compatibility, not
2485 @samp{@@} as DEC writes it.
2486
2487 The displacement sizing character is @samp{`} (an accent grave) for
2488 Unix compatibility, not @samp{^} as DEC writes it.  The letter
2489 preceding @samp{`} may have either case.  @samp{G} is not
2490 understood, but all other letters (@code{b i l s w}) are understood.
2491
2492 Register names understood are @code{r0 r1 r2 @dots{} r15 ap fp sp
2493 pc}.  Any case of letters will do.
2494
2495 For instance
2496 @example
2497 tstb *w`$4(r5)
2498 @end example
2499
2500 Any expression is permitted in an operand.  Operands are comma
2501 separated.
2502
2503 @c There is some bug to do with recognizing expressions
2504 @c in operands, but I forget what it is.  It is
2505 @c a syntax clash because () is used as an address mode
2506 @c and to encapsulate sub-expressions.
2507 @subsection Not Supported
Vax bit fields cannot be assembled with @code{as}.  Someone
2509 can add the required code if they really need it.
2510 @end ignore
2511
2512 @c if am29k
2513 @node Machine Options, Machine Syntax, Machine Dependent, Machine Dependent
2514 @section Options
2515 GNU @code{as} has no additional command-line options for the AMD
2516 29K family.
2517
2518 @node Machine Syntax, Floating Point, Machine Options, Machine Dependent
2519 @section Syntax
2520 @subsection Special Characters
2521 @samp{;} is the line comment character.
2522
2523 @samp{@@} can be used instead of a newline to separate statements.
2524
2525 The character @samp{?} is permitted in identifiers (but may not begin
2526 an identifier).
2527
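The following line is a sketch showing all three characters at once; the
label and the register choices are purely illustrative.
@example
again?: add gr96,gr96,1 @@ sub gr97,gr97,1     ; two statements, then a comment
@end example
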
2528 @subsection Register Names
2529 General-purpose registers are represented by predefined symbols of the
2530 form @samp{GR@var{nnn}} (for global registers) or @samp{LR@var{nnn}}
2531 (for local registers), where @var{nnn} represents a number between
2532 @code{0} and @code{127}, written with no leading zeros.  The leading
2533 letters may be in either upper or lower case; for example, @samp{gr13}
2534 and @samp{LR7} are both valid register names.
2535
2536 You may also refer to general-purpose registers by specifying the
2537 register number as the result of an expression (prefixed with @samp{%%}
2538 to flag the expression as a register number):
2539 @example
2540 %%@var{expression}
2541 @end example
2542 @noindent---where @var{expression} must be an absolute expression
2543 evaluating to a number between @code{0} and @code{255}.  The range
2544 [0, 127] refers to global registers, and the range [128, 255] to local
2545 registers.
2546
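As a sketch of the numbering (the instruction itself is only
illustrative), these two statements name the same registers in two ways:
@example
        add     gr96,lr2,gr97
        add     %%96,%%130,%%97         ; 130 = 128 + 2, that is, lr2
@end example
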
2547 In addition, GNU @code{as} understands the following protected
2548 special-purpose register names for the AMD 29K family:
2549
2550 @example
2551   vab    chd    pc0
2552   ops    chc    pc1
2553   cps    rbp    pc2
2554   cfg    tmc    mmu
2555   cha    tmr    lru
2556 @end example
2557
2558 These unprotected special-purpose register names are also recognized:
2559 @example
2560   ipc    alu    fpe
2561   ipa    bp     inte
2562   ipb    fc     fps
2563   q      cr     exop
2564 @end example
2565
2566 @node Floating Point, Machine Directives, Machine Syntax, Machine Dependent
2567 @section Floating Point
2568 The AMD 29K family uses IEEE floating-point numbers.
2569
2570 @node Machine Directives, Opcodes, Floating Point, Machine Dependent
2571 @section Machine Directives
2572
2573 @menu
2574 * block::                       @code{.block @var{size} , @var{fill}}
2575 * cputype::                     @code{.cputype}
2576 * file::                        @code{.file}
2577 * hword::                       @code{.hword @var{expressions}}
2578 * line::                        @code{.line}
2579 * reg::                         @code{.reg @var{symbol}, @var{expression}}
2580 * sect::                        @code{.sect}
2581 * use::                         @code{.use @var{segment name}}
2582 @end menu
2583
2584 @node block, cputype, Machine Directives, Machine Directives
2585 @subsection @code{.block @var{size} , @var{fill}}
2586 This directive emits @var{size} bytes, each of value @var{fill}.  Both
2587 @var{size} and @var{fill} are absolute expressions.  If the comma
2588 and @var{fill} are omitted, @var{fill} is assumed to be zero.
2589
2590 In other versions of GNU @code{as}, this directive is called
2591 @samp{.space}.
2592
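For example (sizes and fill values chosen arbitrarily):
@example
        .block  16, 0xff        ; sixteen bytes, each of value 0xff
        .block  8               ; eight bytes, each of value zero
@end example
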
2593 @node cputype, file, block, Machine Directives
2594 @subsection @code{.cputype}
2595 This directive is ignored; it is accepted for compatibility with other
2596 AMD 29K assemblers.
2597
2598 @node file, hword, cputype, Machine Directives
2599 @subsection @code{.file}
2600 This directive is ignored; it is accepted for compatibility with other
2601 AMD 29K assemblers.
2602
2603 @quotation
@emph{Warning:} In other versions of GNU @code{as}, @code{.file} is
2605 used for the directive called @code{.app-file} in the AMD 29K support.
2606 @end quotation
2607
2608 @node hword, line, file, Machine Directives
2609 @subsection @code{.hword @var{expressions}}
2610 This expects zero or more @var{expressions}, and emits
a 16-bit number for each.  (Synonym for @samp{.short}.)
2612
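For instance, with arbitrary values, this statement would emit two
16-bit numbers:
@example
        .hword  1, 0xffff
@end example
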
2613 @node line, reg, hword, Machine Directives
2614 @subsection @code{.line}
2615 This directive is ignored; it is accepted for compatibility with other
2616 AMD 29K assemblers.
2617
2618 @node reg, sect, line, Machine Directives
2619 @subsection @code{.reg @var{symbol}, @var{expression}}
2620 @code{.reg} has the same effect as @code{.lsym}; @pxref{Lsym}.
2621
2622 @node sect, use, reg, Machine Directives
2623 @subsection @code{.sect}
2624 This directive is ignored; it is accepted for compatibility with other
2625 AMD 29K assemblers.
2626
2627 @node use,  , sect, Machine Directives
2628 @subsection @code{.use @var{segment name}}
2629 Establishes the segment and subsegment for the following code;
2630 @var{segment name} may be one of @code{.text}, @code{.data},
2631 @code{.data1}, or @code{.lit}.  With one of the first three @var{segment
2632 name} options, @samp{.use} is equivalent to the machine directive
2633 @var{segment name}; the remaining case, @samp{.use .lit}, is the same as
2634 @samp{.data 200}.
2635
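A short sketch of the equivalences just described:
@example
        .use    .text           ; equivalent to the directive .text
        .use    .lit            ; same as .data 200
@end example
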
2636
2637 @node Opcodes, Opcodes, Machine Directives, Machine Dependent
2638 @section Opcodes
2639 GNU @code{as} implements all the standard AMD 29K opcodes.  No
2640 additional pseudo-instructions are needed on this family.
2641
2642 For information on the 29K machine instruction set, see @cite{Am29000
2643 User's Manual}, Advanced Micro Devices, Inc.
2644
2645
2646 @c fi am29k
2647 @ignore
2648 @c if 680x0
2649 @section Options
2650 The 680x0 version of @code{as} has two machine dependent options.
2651 One shortens undefined references from 32 to 16 bits, while the
2652 other is used to tell @code{as} what kind of machine it is
2653 assembling for.
2654
2655 You can use the @kbd{-l} option to shorten the size of references to
2656 undefined symbols.  If the @kbd{-l} option is not given, references to
2657 undefined symbols will be a full long (32 bits) wide.  (Since @code{as}
2658 cannot know where these symbols will end up, @code{as} can only allocate
2659 space for the linker to fill in later.  Since @code{as} doesn't know how
2660 far away these symbols will be, it allocates as much space as it can.)
2661 If this option is given, the references will only be one word wide (16
2662 bits).  This may be useful if you want the object file to be as small as
2663 possible, and you know that the relevant symbols will be less than 17
2664 bits away.
2665
2666 The 680x0 version of @code{as} is most frequently used to assemble
2667 programs for the Motorola MC68020 microprocessor.  Occasionally it is
2668 used to assemble programs for the mostly similar, but slightly different
2669 MC68000 or MC68010 microprocessors.  You can give @code{as} the options
2670 @samp{-m68000}, @samp{-mc68000}, @samp{-m68010}, @samp{-mc68010},
2671 @samp{-m68020}, and @samp{-mc68020} to tell it what processor is the
2672 target.
2673
2674 @section Syntax
2675
2676 The 680x0 version of @code{as} uses syntax similar to the Sun assembler.
2677 Size modifiers are appended directly to the end of the opcode without an
2678 intervening period.  For example, write @samp{movl} rather than
2679 @samp{move.l}.
2680
2681 @c pesch@cygnus.com: Vintage Release c1.37 isn't compiled with
2682 @c SUN_ASM_SYNTAX.
2683 @c ignore
2684 If @code{as} is compiled with SUN_ASM_SYNTAX defined, it will also allow
Sun-style local labels of the form @samp{1$} through @samp{9$}.
2686 @c end ignore
2687
2688 In the following table @dfn{apc} stands for any of the address
2689 registers (@samp{a0} through @samp{a7}), nothing, (@samp{}), the
2690 Program Counter (@samp{pc}), or the zero-address relative to the
2691 program counter (@samp{zpc}).
2692
2693 The following addressing modes are understood:
2694 @table @dfn
2695 @item Immediate
2696 @samp{#@var{digits}}
2697
2698 @item Data Register
2699 @samp{d0} through @samp{d7}
2700
2701 @item Address Register
2702 @samp{a0} through @samp{a7}
2703
2704 @item Address Register Indirect
2705 @samp{a0@@} through @samp{a7@@}
2706
2707 @item Address Register Postincrement
2708 @samp{a0@@+} through @samp{a7@@+}
2709
2710 @item Address Register Predecrement
2711 @samp{a0@@-} through @samp{a7@@-}
2712
2713 @item Indirect Plus Offset
2714 @samp{@var{apc}@@(@var{digits})}
2715
2716 @item Index
2717 @samp{@var{apc}@@(@var{digits},@var{register}:@var{size}:@var{scale})}
2718 or @samp{@var{apc}@@(@var{register}:@var{size}:@var{scale})}
2719
2720 @item Postindex
2721 @samp{@var{apc}@@(@var{digits})@@(@var{digits},@var{register}:@var{size}:@var{scale})}
2722 or @samp{@var{apc}@@(@var{digits})@@(@var{register}:@var{size}:@var{scale})}
2723
2724 @item Preindex
2725 @samp{@var{apc}@@(@var{digits},@var{register}:@var{size}:@var{scale})@@(@var{digits})}
2726 or @samp{@var{apc}@@(@var{register}:@var{size}:@var{scale})@@(@var{digits})}
2727
2728 @item Memory Indirect
2729 @samp{@var{apc}@@(@var{digits})@@(@var{digits})}
2730
2731 @item Absolute
2732 @samp{@var{symbol}}, or @samp{@var{digits}}
2733 @c ignore
2734 @c pesch@cygnus.com: gnu, rich concur the following needs careful
2735 @c                             research before documenting.
2736                                            , or either of the above followed
2737 by @samp{:b}, @samp{:w}, or @samp{:l}.
2738 @c end ignore
2739 @end table
2740
2741 @section Floating Point
2742 The floating point code is not too well tested, and may have
2743 subtle bugs in it.
2744
2745 Packed decimal (P) format floating literals are not supported.
2746 Feel free to add the code!
2747
2748 The floating point formats generated by directives are these.
2749 @table @code
2750 @item .float
2751 @code{Single} precision floating point constants.
2752 @item .double
2753 @code{Double} precision floating point constants.
2754 @end table
2755
2756 There is no directive to produce regions of memory holding
2757 extended precision numbers, however they can be used as
2758 immediate operands to floating-point instructions.  Adding a
2759 directive to create extended precision numbers would not be
2760 hard, but it has not yet seemed necessary.
2761
2762 @section Machine Directives
2763 In order to be compatible with the Sun assembler the 680x0 assembler
2764 understands the following directives.
2765 @table @code
2766 @item .data1
2767 This directive is identical to a @code{.data 1} directive.
2768 @item .data2
2769 This directive is identical to a @code{.data 2} directive.
2770 @item .even
2771 This directive is identical to a @code{.align 1} directive.
2772 @c Is this true?  does it work???
2773 @item .skip
2774 This directive is identical to a @code{.space} directive.
2775 @end table
2776
2777 @section Opcodes
2778 @c pesch@cygnus.com: I don't see any point in the following
2779 @c                   paragraph.  Bugs are bugs; how does saying this
2780 @c                   help anyone?
2781 @c ignore
2782 Danger:  Several bugs have been found in the opcode table (and
2783 fixed).  More bugs may exist.  Be careful when using obscure
2784 instructions.
2785 @c end ignore
2786
2787 @subsection Branch Improvement
2788
2789 Certain pseudo opcodes are permitted for branch instructions.
2790 They expand to the shortest branch instruction that will reach the
2791 target.  Generally these mnemonics are made by substituting @samp{j} for
2792 @samp{b} at the start of a Motorola mnemonic.
2793
2794 The following table summarizes the pseudo-operations.  A @code{*} flags
2795 cases that are more fully described after the table:
2796
2797 @example
2798           Displacement
2799           +---------------------------------------------------------
2800           |                68020   68000/10
2801 Pseudo-Op |BYTE    WORD    LONG    LONG      non-PC relative
2802           +---------------------------------------------------------
2803      jbsr |bsrs    bsr     bsrl    jsr       jsr
2804       jra |bras    bra     bral    jmp       jmp
2805 *     jXX |bXXs    bXX     bXXl    bNXs;jmpl bNXs;jmp
2806 *    dbXX |dbXX    dbXX        dbXX; bra; jmpl
2807 *    fjXX |fbXXw   fbXXw   fbXXl             fbNXw;jmp
2808
2809 XX: condition
2810 NX: negative of condition XX
2811
2812 @end example
2813 @center{@code{*}---see full description below}
2814
2815 @table @code
2816 @item jbsr
2817 @itemx jra
2818 These are the simplest jump pseudo-operations; they always map to one
2819 particular machine instruction, depending on the displacement to the
2820 branch target.
2821
2822 @item j@var{XX}
2823 Here, @samp{j@var{XX}} stands for an entire family of pseudo-operations,
2824 where @var{XX} is a conditional branch or condition-code test.  The full
2825 list of pseudo-ops in this family is:
2826 @example
2827  jhi   jls   jcc   jcs   jne   jeq   jvc
2828  jvs   jpl   jmi   jge   jlt   jgt   jle
2829 @end example
2830
2831 For the cases of non-PC relative displacements and long displacements on
2832 the 68000 or 68010, @code{as} will issue a longer code fragment in terms of
2833 @var{NX}, the opposite condition to @var{XX}:
2834 @example
2835     j@var{XX} foo
2836 @end example
2837 gives
2838 @example
2839      b@var{NX}s oof
2840      jmp foo
2841  oof:
2842 @end example
2843
2844 @item db@var{XX}
2845 The full family of pseudo-operations covered here is
2846 @example
2847  dbhi   dbls   dbcc   dbcs   dbne   dbeq   dbvc
2848  dbvs   dbpl   dbmi   dbge   dblt   dbgt   dble
2849  dbf    dbra   dbt
2850 @end example
2851
2852 Other than for word and byte displacements, when the source reads
2853 @samp{db@var{XX} foo}, @code{as} will emit
2854 @example
2855      db@var{XX} oo1
2856      bra oo2
2857  oo1:jmpl foo
2858  oo2:
2859 @end example
2860
2861 @item fj@var{XX}
2862 This family includes
2863 @example
2864  fjne   fjeq   fjge   fjlt   fjgt   fjle   fjf
2865  fjt    fjgl   fjgle  fjnge  fjngl  fjngle fjngt
2866  fjnle  fjnlt  fjoge  fjogl  fjogt  fjole  fjolt
2867  fjor   fjseq  fjsf   fjsne  fjst   fjueq  fjuge
2868  fjugt  fjule  fjult  fjun
2869 @end example
2870
2871 For branch targets that are not PC relative, @code{as} emits
2872 @example
2873      fb@var{NX} oof
2874      jmp foo
2875  oof:
2876 @end example
2877 when it encounters @samp{fj@var{XX} foo}.
2878
2879 @end table
2880
2881 @subsection Special Characters
2882 The immediate character is @samp{#} for Sun compatibility.  The
2883 line-comment character is @samp{|}.  If a @samp{#} appears at the
2884 beginning of a line, it is treated as a comment unless it looks like
2885 @samp{# line file}, in which case it is treated normally.
2886 @c fi 680x0
2887 @end ignore
2888
2889 @c pesch@cygnus.com: see remarks at ignore for vax.
2890 @ignore
2891 @section 32x32
2892 @section Options
2893 The 32x32 version of @code{as} accepts a @kbd{-m32032} option to
specify that it is assembling for a 32032 processor, or a
@kbd{-m32532} option to specify that it is assembling for a 32532 processor.
2896 The default (if neither is specified) is chosen when the assembler
2897 is compiled.
2898
2899 @subsection Syntax
2900 I don't know anything about the 32x32 syntax assembled by
@code{as}.  Someone who understands the processor (I've never seen
2902 one) and the possible syntaxes should write this section.
2903
2904 @subsection Floating Point
2905 The 32x32 uses IEEE floating point numbers, but @code{as} will only
2906 create single or double precision values.  I don't know if the 32x32
2907 understands extended precision numbers.
2908
2909 @subsection Machine Directives
2910 The 32x32 has no machine dependent directives.
2911
2912 @section Sparc
2913 @subsection Options
2914 The sparc has no machine dependent options.
2915
@subsection Syntax
2917 I don't know anything about Sparc syntax.  Someone who does
2918 will have to write this section.
2919
2920 @subsection Floating Point
The Sparc uses IEEE floating-point numbers.
2922
2923 @subsection Machine Directives
2924 The Sparc version of @code{as} supports the following additional
2925 machine directives:
2926
2927 @table @code
2928 @item .common
2929 This must be followed by a symbol name, a positive number, and
2930 @code{"bss"}.  This behaves somewhat like @code{.comm}, but the
2931 syntax is different.
2932
2933 @item .global
2934 This is functionally identical to @code{.globl}.
2935
2936 @item .half
2937 This is functionally identical to @code{.short}.
2938
2939 @item .proc
2940 This directive is ignored.  Any text following it on the same
2941 line is also ignored.
2942
2943 @item .reserve
2944 This must be followed by a symbol name, a positive number, and
2945 @code{"bss"}.  This behaves somewhat like @code{.lcomm}, but the
2946 syntax is different.
2947
2948 @item .seg
2949 This must be followed by @code{"text"}, @code{"data"}, or
2950 @code{"data1"}.  It behaves like @code{.text}, @code{.data}, or
2951 @code{.data 1}.
2952
2953 @item .skip
This is functionally identical to the @code{.space} directive.
2955
2956 @item .word
On the Sparc, the @code{.word} directive produces 32-bit values,
instead of the 16-bit values it produces on most other machines.
2959
2960 @end table
2961
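A few of these directives might appear as follows.  This is a sketch
with arbitrary operands; it assumes the usual Sparc comment character
@samp{!}, which this section does not itself document.
@example
        .seg    "data1"         ! behaves like .data 1
        .half   1, 2            ! same as .short 1, 2
        .skip   8               ! same as .space 8
@end example
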
2962 @section Intel 80386
2963 @subsection Options
2964 The 80386 has no machine dependent options.
2965
2966 @subsection AT&T Syntax versus Intel Syntax
2967 In order to maintain compatibility with the output of @code{GCC},
2968 @code{as} supports AT&T System V/386 assembler syntax.  This is quite
2969 different from Intel syntax.  We mention these differences because
almost all 80386 documents use only Intel syntax.  Notable differences
2971 between the two syntaxes are:
2972 @itemize @bullet
2973 @item
2974 AT&T immediate operands are preceded by @samp{$}; Intel immediate
2975 operands are undelimited (Intel @samp{push 4} is AT&T @samp{pushl $4}).
2976 AT&T register operands are preceded by @samp{%}; Intel register operands
2977 are undelimited.  AT&T absolute (as opposed to PC relative) jump/call
2978 operands are prefixed by @samp{*}; they are undelimited in Intel syntax.
2979
2980 @item
2981 AT&T and Intel syntax use the opposite order for source and destination
operands.  Intel @samp{add eax, 4} is AT&T @samp{addl $4, %eax}.  The
2983 @samp{source, dest} convention is maintained for compatibility with
2984 previous Unix assemblers.
2985
2986 @item
2987 In AT&T syntax the size of memory operands is determined from the last
2988 character of the opcode name.  Opcode suffixes of @samp{b}, @samp{w},
2989 and @samp{l} specify byte (8-bit), word (16-bit), and long (32-bit)
memory references.  Intel syntax accomplishes this by prefixing memory
2991 operands (@emph{not} the opcodes themselves) with @samp{byte ptr},
2992 @samp{word ptr}, and @samp{dword ptr}.  Thus, Intel @samp{mov al, byte
2993 ptr @var{foo}} is @samp{movb @var{foo}, %al} in AT&T syntax.
2994
2995 @item
2996 Immediate form long jumps and calls are
2997 @samp{lcall/ljmp $@var{segment}, $@var{offset}} in AT&T syntax; the
2998 Intel syntax is
2999 @samp{call/jmp far @var{segment}:@var{offset}}.  Also, the far return
3000 instruction
3001 is @samp{lret $@var{stack-adjust}} in AT&T syntax; Intel syntax is
3002 @samp{ret far @var{stack-adjust}}.
3003
3004 @item
3005 The AT&T assembler does not provide support for multiple segment
3006 programs.  Unix style systems expect all programs to be single segments.
3007 @end itemize
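
The differences listed above can be seen side by side in the following
sketch.  Each line is AT&T syntax, with the corresponding Intel form
shown in the comment.  The operand values are arbitrary, @samp{foo} is a
hypothetical symbol, and the use of @samp{#} as the comment character is
an assumption made for illustration.
@example
        pushl   $4              # Intel: push 4
        addl    $4, %eax        # Intel: add eax, 4
        movb    foo, %al        # Intel: mov al, byte ptr foo
        jmp     *%eax           # Intel: jmp eax
        lret    $8              # Intel: ret far 8
@end example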
3008
3009 @subsection Opcode Naming
3010 Opcode names are suffixed with one character modifiers which specify the
3011 size of operands.  The letters @samp{b}, @samp{w}, and @samp{l} specify
3012 byte, word, and long operands.  If no suffix is specified by an
3013 instruction and it contains no memory operands then @code{as} tries to
3014 fill in the missing suffix based on the destination register operand
3015 (the last one by convention).  Thus, @samp{mov %ax, %bx} is equivalent
3016 to @samp{movw %ax, %bx}; also, @samp{mov $1, %bx} is equivalent to
3017 @samp{movw $1, %bx}.  Note that this is incompatible with the AT&T Unix
3018 assembler which assumes that a missing opcode suffix implies long
3019 operand size.  (This incompatibility does not affect compiler output
3020 since compilers always explicitly specify the opcode suffix.)
3021
3022 Almost all opcodes have the same names in AT&T and Intel format.  There
3023 are a few exceptions.  The sign extend and zero extend instructions need
3024 two sizes to specify them.  They need a size to sign/zero extend
@emph{from} and a size to sign/zero extend @emph{to}.  This is accomplished
3026 by using two opcode suffixes in AT&T syntax.  Base names for sign extend
3027 and zero extend are @samp{movs@dots{}} and @samp{movz@dots{}} in AT&T
3028 syntax (@samp{movsx} and @samp{movzx} in Intel syntax).  The opcode
3029 suffixes are tacked on to this base name, the @emph{from} suffix before
3030 the @emph{to} suffix.  Thus, @samp{movsbl %al, %edx} is AT&T syntax for
3031 ``move sign extend @emph{from} %al @emph{to} %edx.''  Possible suffixes,
3032 thus, are @samp{bl} (from byte to long), @samp{bw} (from byte to word),
3033 and @samp{wl} (from word to long).
3034
3035 The Intel syntax conversion instructions
3036 @itemize @bullet
3037 @item
3038 @samp{cbw} --- sign-extend byte in @samp{%al} to word in @samp{%ax},
3039 @item
3040 @samp{cwde} --- sign-extend word in @samp{%ax} to long in @samp{%eax},
3041 @item
3042 @samp{cwd} --- sign-extend word in @samp{%ax} to long in @samp{%dx:%ax},
3043 @item
3044 @samp{cdq} --- sign-extend dword in @samp{%eax} to quad in @samp{%edx:%eax},
3045 @end itemize
3046 are called @samp{cbtw}, @samp{cwtl}, @samp{cwtd}, and @samp{cltd} in
3047 AT&T naming.  @code{as} accepts either naming for these instructions.
3048
3049 Far call/jump instructions are @samp{lcall} and @samp{ljmp} in
AT&T syntax, but are @samp{call far} and @samp{jmp far} in Intel
3051 convention.
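
As an illustration, the naming rules of this section suggest the
following equivalences.  This is only a sketch: the registers are chosen
arbitrarily, and @samp{#} is assumed to start a comment.
@example
        mov     %ax, %bx        # suffix taken from %bx: movw %ax, %bx
        mov     $1, %bx         # same as movw $1, %bx
        movsbl  %al, %edx       # sign extend byte %al into long %edx
        movzbw  %al, %dx        # zero extend byte %al into word %dx
        movzwl  %ax, %ecx       # zero extend word %ax into long %ecx
        cbtw                    # Intel cbw
        cwtl                    # Intel cwde
        cwtd                    # Intel cwd
        cltd                    # Intel cdq
@end example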
3052
3053 @subsection Register Naming
Register operands are always prefixed with @samp{%}.  The 80386 registers
3055 consist of
3056 @itemize @bullet
3057 @item
3058 the 8 32-bit registers @samp{%eax} (the accumulator), @samp{%ebx},
3059 @samp{%ecx}, @samp{%edx}, @samp{%edi}, @samp{%esi}, @samp{%ebp} (the
3060 frame pointer), and @samp{%esp} (the stack pointer).
3061
3062 @item
3063 the 8 16-bit low-ends of these: @samp{%ax}, @samp{%bx}, @samp{%cx},
3064 @samp{%dx}, @samp{%di}, @samp{%si}, @samp{%bp}, and @samp{%sp}.
3065
3066 @item
3067 the 8 8-bit registers: @samp{%ah}, @samp{%al}, @samp{%bh},
@samp{%bl}, @samp{%ch}, @samp{%cl}, @samp{%dh}, and @samp{%dl}.  (These
are the high-bytes and low-bytes of @samp{%ax}, @samp{%bx},
@samp{%cx}, and @samp{%dx}.)
3071
3072 @item
3073 the 6 segment registers @samp{%cs} (code segment), @samp{%ds}
3074 (data segment), @samp{%ss} (stack segment), @samp{%es}, @samp{%fs},
3075 and @samp{%gs}.
3076
3077 @item
3078 the 3 processor control registers @samp{%cr0}, @samp{%cr2}, and
3079 @samp{%cr3}.
3080
3081 @item
3082 the 6 debug registers @samp{%db0}, @samp{%db1}, @samp{%db2},
3083 @samp{%db3}, @samp{%db6}, and @samp{%db7}.
3084
3085 @item
3086 the 2 test registers @samp{%tr6} and @samp{%tr7}.
3087
3088 @item
the 8 registers of the floating point register stack: @samp{%st} or equivalently
3090 @samp{%st(0)}, @samp{%st(1)}, @samp{%st(2)}, @samp{%st(3)},
3091 @samp{%st(4)}, @samp{%st(5)}, @samp{%st(6)}, and @samp{%st(7)}.
3092 @end itemize
3093
3094 @subsection Opcode Prefixes
3095 Opcode prefixes are used to modify the following opcode.  They are used
3096 to repeat string instructions, to provide segment overrides, to perform
3097 bus lock operations, and to give operand and address size (16-bit
3098 operands are specified in an instruction by prefixing what would
normally be 32-bit operands with an ``operand size'' opcode prefix).
3100 Opcode prefixes are usually given as single-line instructions with no
3101 operands, and must directly precede the instruction they act upon.  For
3102 example, the @samp{scas} (scan string) instruction is repeated with:
3103 @example
3104         repne
3105         scas
3106 @end example
3107
3108 Here is a list of opcode prefixes:
3109 @itemize @bullet
3110 @item
3111 Segment override prefixes @samp{cs}, @samp{ds}, @samp{ss}, @samp{es},
@samp{fs}, @samp{gs}.  These are automatically added when you use
the @var{segment}:@var{memory-operand} form for memory references.
3114
3115 @item
3116 Operand/Address size prefixes @samp{data16} and @samp{addr16}
3117 change 32-bit operands/addresses into 16-bit operands/addresses.  Note
3118 that 16-bit addressing modes (i.e. 8086 and 80286 addressing modes)
3119 are not supported (yet).
3120
3121 @item
3122 The bus lock prefix @samp{lock} inhibits interrupts during
3123 execution of the instruction it precedes.  (This is only valid with
certain instructions; see an 80386 manual for details.)
3125
3126 @item
3127 The wait for coprocessor prefix @samp{wait} waits for the
3128 coprocessor to complete the current instruction.  This should never be
3129 needed for the 80386/80387 combination.
3130
3131 @item
3132 The @samp{rep}, @samp{repe}, and @samp{repne} prefixes are added
3133 to string instructions to make them repeat @samp{%ecx} times.
3134 @end itemize
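
A few of these prefixes are sketched below, each on its own line
directly before the instruction it modifies, as described above.  The
symbols @samp{foo} and @samp{counter} are hypothetical, and @samp{#} is
assumed to start a comment.
@example
        movl    %es:foo, %eax   # the es override prefix is added by as
        lock
        incl    counter         # locked increment of a memory operand
        rep
        stosl                   # repeat the string store %ecx times
@end example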
3135
3136 @subsection Memory References
3137 An Intel syntax indirect memory reference of the form
3138 @example
3139 @var{segment}:[@var{base} + @var{index}*@var{scale} + @var{disp}]
3140 @end example
3141 is translated into the AT&T syntax
3142 @example
3143 @var{segment}:@var{disp}(@var{base}, @var{index}, @var{scale})
3144 @end example
3145 where @var{base} and @var{index} are the optional 32-bit base and
3146 index registers, @var{disp} is the optional displacement, and
3147 @var{scale}, taking the values 1, 2, 4, and 8, multiplies @var{index}
3148 to calculate the address of the operand.  If no @var{scale} is
3149 specified, @var{scale} is taken to be 1.  @var{segment} specifies the
3150 optional segment register for the memory operand, and may override the
default segment register (see an 80386 manual for segment register
defaults).  Note that segment overrides in AT&T syntax @emph{must}
be preceded by a @samp{%}.  If you specify a segment override which
3154 coincides with the default segment register, @code{as} will @emph{not}
3155 output any segment register override prefixes to assemble the given
3156 instruction.  Thus, segment overrides can be specified to emphasize which
3157 segment register is used for a given memory operand.
3158
3159 Here are some examples of Intel and AT&T style memory references:
3160 @table @asis
3161
3162 @item AT&T: @samp{-4(%ebp)}, Intel:  @samp{[ebp - 4]}
3163 @var{base} is @samp{%ebp}; @var{disp} is @samp{-4}. @var{segment} is
3164 missing, and the default segment is used (@samp{%ss} for addressing with
@samp{%ebp} as the base register).  @var{index} and @var{scale} are both missing.
3166
3167 @item AT&T: @samp{foo(,%eax,4)}, Intel: @samp{[foo + eax*4]}
3168 @var{index} is @samp{%eax} (scaled by a @var{scale} 4); @var{disp} is
3169 @samp{foo}.  All other fields are missing.  The segment register here
3170 defaults to @samp{%ds}.
3171
@item AT&T: @samp{foo(,1)}, Intel: @samp{[foo]}
3173 This uses the value pointed to by @samp{foo} as a memory operand.
3174 Note that @var{base} and @var{index} are both missing, but there is only
3175 @emph{one} @samp{,}.  This is a syntactic exception.
3176
@item AT&T: @samp{%gs:foo}, Intel: @samp{gs:foo}
3178 This selects the contents of the variable @samp{foo} with segment
3179 register @var{segment} being @samp{%gs}.
3180
3181 @end table
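
In every case the address used is @var{base} + @var{index}*@var{scale}
+ @var{disp}.  For instance, if @samp{%ebx} holds 1000 and @samp{%esi}
holds 3, then @samp{8(%ebx,%esi,2)} refers to address 1000 + 3*2 + 8 =
1014.  The following sketch places such references in complete
instructions; the choice of @samp{movl}, the destination registers, and
the use of @samp{#} for comments are assumptions made for illustration.
@example
        movl    -4(%ebp), %eax          # Intel: mov eax, [ebp - 4]
        movl    foo(,%eax,4), %ebx      # Intel: mov ebx, [foo + eax*4]
        movl    8(%ebx,%esi,2), %ecx    # Intel: mov ecx, [ebx + esi*2 + 8]
        movl    %gs:foo, %edx           # Intel: mov edx, gs:foo
@end example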
3182
3183 Absolute (as opposed to PC relative) call and jump operands must be
3184 prefixed with @samp{*}.  If no @samp{*} is specified, @code{as} will
3185 always choose PC relative addressing for jump/call labels.
3186
3187 Any instruction that has a memory operand @emph{must} specify its size (byte,
3188 word, or long) with an opcode suffix (@samp{b}, @samp{w}, or @samp{l},
3189 respectively).
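
The two rules above might look like this in practice.  The labels
@samp{subr}, @samp{flag}, @samp{count}, and @samp{total} are
hypothetical, and @samp{#} is assumed to start a comment.
@example
        call    subr            # no *: PC relative call to the label subr
        call    *%eax           # *: absolute call to the address in %eax
        movb    $1, flag        # memory operand is a byte
        movw    $1, count       # memory operand is a word
        movl    $1, total       # memory operand is a long
@end example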
3190
3191 @subsection Handling of Jump Instructions
3192 Jump instructions are always optimized to use the smallest possible
3193 displacements.  This is accomplished by using byte (8-bit) displacement
3194 jumps whenever the target is sufficiently close.  If a byte displacement
3195 is insufficient a long (32-bit) displacement is used.  We do not support
3196 word (16-bit) displacement jumps (i.e. prefixing the jump instruction
3197 with the @samp{addr16} opcode prefix), since the 80386 insists upon masking
3198 @samp{%eip} to 16 bits after the word displacement is added.
3199
3200 Note that the @samp{jcxz}, @samp{jecxz}, @samp{loop}, @samp{loopz},
3201 @samp{loope}, @samp{loopnz} and @samp{loopne} instructions only come in
3202 byte displacements, so that it is possible that use of these
3203 instructions (@code{GCC} does not use them) will cause the assembler to
3204 print an error message (and generate incorrect code).  The AT&T 80386
3205 assembler tries to get around this problem by expanding @samp{jcxz foo} to
3206 @example
3207          jcxz cx_zero
3208          jmp cx_nonzero
3209 cx_zero: jmp foo
3210 cx_nonzero:
3211 @end example
3212
3213 @subsection Floating Point
3214 All 80387 floating point types except packed BCD are supported.
3215 (BCD support may be added without much difficulty).  These data
3216 types are 16-, 32-, and 64- bit integers, and single (32-bit),
3217 double (64-bit), and extended (80-bit) precision floating point.
3218 Each supported type has an opcode suffix and a constructor
associated with it.  Opcode suffixes specify operands' data
3220 types.  Constructors build these data types into memory.
3221
3222 @itemize @bullet
3223 @item
3224 Floating point constructors are @samp{.float} or @samp{.single},
3225 @samp{.double}, and @samp{.tfloat} for 32-, 64-, and 80-bit formats.
3226 These correspond to opcode suffixes @samp{s}, @samp{l}, and @samp{t}.
@samp{t} stands for temporary real; the 80387 only supports
3228 this format via the @samp{fldt} (load temporary real to stack top) and
3229 @samp{fstpt} (store temporary real and pop stack) instructions.
3230
3231 @item
3232 Integer constructors are @samp{.word}, @samp{.long} or @samp{.int}, and
3233 @samp{.quad} for the 16-, 32-, and 64-bit integer formats.  The corresponding
3234 opcode suffixes are @samp{s} (single), @samp{l} (long), and @samp{q}
3235 (quad).  As with the temporary real format the 64-bit @samp{q} format is
3236 only present in the @samp{fildq} (load quad integer to stack top) and
3237 @samp{fistpq} (store quad integer and pop stack) instructions.
3238 @end itemize
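
The pairing of constructors and opcode suffixes can be sketched as
follows.  The labels and constant values are invented for the example,
and @samp{#} is assumed to start a comment.
@example
        .data
half:   .single 0.5             # 32-bit single
third:  .double 0.333           # 64-bit double
big:    .tfloat 1.0e100         # 80-bit temporary real
count:  .long   42              # 32-bit integer
total:  .quad   0               # 64-bit integer

        .text
        flds    half            # load single
        fldl    third           # load double
        fldt    big             # load temporary real
        fildl   count           # load long integer
        fistpq  total           # store quad integer and pop
        fstpt   big             # store temporary real and pop
@end example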
3239
3240 Register to register operations do not require opcode suffixes,
3241 so that @samp{fst %st, %st(1)} is equivalent to @samp{fstl %st, %st(1)}.
3242
Since the 80387 automatically synchronizes with the 80386, @samp{fwait}
3244 instructions are almost never needed (this is not the case for the
3245 80286/80287 and 8086/8087 combinations).  Therefore, @code{as} suppresses
3246 the @samp{fwait} instruction whenever it is implicitly selected by one
3247 of the @samp{fn@dots{}} instructions.  For example, @samp{fsave} and
3248 @samp{fnsave} are treated identically.  In general, all the @samp{fn@dots{}}
3249 instructions are made equivalent to @samp{f@dots{}} instructions.  If
3250 @samp{fwait} is desired it must be explicitly coded.
3251
3252 @subsection Notes
3253 There is some trickery concerning the @samp{mul} and @samp{imul}
3254 instructions that deserves mention.  The 16-, 32-, and 64-bit expanding
3255 multiplies (base opcode @samp{0xf6}; extension 4 for @samp{mul} and 5
3256 for @samp{imul}) can be output only in the one operand form.  Thus,
3257 @samp{imul %ebx, %eax} does @emph{not} select the expanding multiply;
3258 the expanding multiply would clobber the @samp{%edx} register, and this
3259 would confuse @code{GCC} output.  Use @samp{imul %ebx} to get the
3260 64-bit product in @samp{%edx:%eax}.
3261
3262 We have added a two operand form of @samp{imul} when the first operand
3263 is an immediate mode expression and the second operand is a register.
This is just a shorthand, so that multiplying @samp{%eax} by 69, for
3265 example, can be done with @samp{imul $69, %eax} rather than @samp{imul
3266 $69, %eax, %eax}.
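
To summarize the forms discussed in these notes, here is a brief sketch.
The registers are those used in the text, and @samp{#} is assumed to
start a comment.
@example
        imul    %ebx            # expanding: %edx:%eax = %eax * %ebx
        imul    %ebx, %eax      # not expanding: %edx is left alone
        imul    $69, %eax       # shorthand for imul $69, %eax, %eax
@end example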
3267 @end ignore
3268 @c pesch@cygnus.com: we also ignore the following chapters, but for
3269 @c                   a different reason---internals are changing
3270 @c                   rapidly.  These may need to be moved to another
3271 @c                   book anyhow, if we adopt the model of user/modifier
3272 @c                   books.
3273 @ignore
3274 @node Maintenance, Retargeting, Machine Dependent, Top
3275 @chapter Maintaining the Assembler
3276 [[this chapter is still being built]]
3277
3278 @section Design
3279 We had these goals, in descending priority:
3280 @table @b
3281 @item Accuracy.
3282 For every program composed by a compiler, @code{as} should emit
3283 ``correct'' code.  This leaves some latitude in choosing addressing
3284 modes, order of @code{relocation_info} structures in the object
3285 file, @emph{etc}.
3286
3287 @item Speed, for usual case.
3288 By far the most common use of @code{as} will be assembling compiler
3289 emissions.
3290
3291 @item Upward compatibility for existing assembler code.
3292 Well @dots{} we don't support Vax bit fields but everything else
3293 seems to be upward compatible.
3294
3295 @item Readability.
3296 The code should be maintainable with few surprises.  (JF: ha!)
3297
3298 @end table
3299
3300 We assumed that disk I/O was slow and expensive while memory was
3301 fast and access to memory was cheap.  We expect the in-memory data
3302 structures to be less than 10 times the size of the emitted object
3303 file.  (Contrast this with the C compiler where in-memory structures
3304 might be 100 times object file size!)
3305 This suggests:
3306 @itemize @bullet
3307 @item
3308 Try to read the source file from disk only one time.  For other
3309 reasons, we keep large chunks of the source file in memory during
3310 assembly so this is not a problem.  Also the assembly algorithm
3311 should only scan the source text once if the compiler composed the
3312 text according to a few simple rules.
3313 @item
3314 Emit the object code bytes only once.  Don't store values and then
3315 backpatch later.
3316 @item
3317 Build the object file in memory and do direct writes to disk of
3318 large buffers.
3319 @end itemize
3320
3321 RMS suggested a one-pass algorithm which seems to work well.  By not
3322 parsing text during a second pass considerable time is saved on
3323 large programs (@emph{e.g.} the sort of C program @code{yacc} would
3324 emit).
3325
3326 It happened that the data structures needed to emit relocation
3327 information to the object file were neatly subsumed into the data
3328 structures that do backpatching of addresses after pass 1.
3329
3330 Many of the functions began life as re-usable modules, loosely
3331 connected.  RMS changed this to gain speed.  For example, input
3332 parsing routines which used to work on pre-sanitized strings now
3333 must parse raw data.  Hence they have to import knowledge of the
assembler's comment conventions @emph{etc}.
3335
3336 @section Deprecated Feature(?)s
3337 We have stopped supporting some features:
3338 @itemize @bullet
3339 @item
3340 @code{.org} statements must have @b{defined} expressions.
3341 @item
3342 Vax Bit fields (@kbd{:} operator) are entirely unsupported.
3343 @end itemize
3344
3345 It might be a good idea to not support these features in a future release:
3346 @itemize @bullet
3347 @item
3348 @kbd{#} should begin a comment, even in column 1.
3349 @item
3350 Why support the logical line & file concept any more?
3351 @item
3352 Subsegments are a good candidate for flushing.
3353 Depends on which compilers need them I guess.
3354 @end itemize
3355
3356 @section Bugs, Ideas, Further Work
3357 Clearly the major improvement is DON'T USE A TEXT-READING
3358 ASSEMBLER for the back end of a compiler.  It is much faster to
3359 interpret binary gobbledygook from a compiler's tables than to
3360 ask the compiler to write out human-readable code just so the
3361 assembler can parse it back to binary.
3362
3363 Assuming you use @code{as} for human written programs: here are
3364 some ideas:
3365 @itemize @bullet
3366 @item
3367 Document (here) @code{APP}.
3368 @item
3369 Take advantage of knowing no spaces except after opcode
3370 to speed up @code{as}.  (Modify @code{app.c} to flush useless spaces:
3371 only keep space/tabs at begin of line or between 2
3372 symbols.)
3373 @item
3374 Put pointers in this documentation to @file{a.out} documentation.
3375 @item
3376 Split the assembler into parts so it can gobble direct binary
from @emph{e.g.} @code{cc}.  It is silly for @code{cc} to compose text
3378 just so @code{as} can parse it back to binary.
3379 @item
3380 Rewrite hash functions: I want a more modular, faster library.
3381 @item
3382 Clean up LOTS of code.
3383 @item
3384 Include all the non-@file{.c} files in the maintenance chapter.
3385 @item
3386 Document flonums.
3387 @item
3388 Implement flonum short literals.
3389 @item
3390 Change all talk of expression operands to expression quantities,
3391 or perhaps to expression arguments.
3392 @item
3393 Implement pass 2.
3394 @item
3395 Whenever a @code{.text} or @code{.data} statement is seen, we close
off the current frag with an imaginary @code{.fill 0}.  This is
3397 because we only have one obstack for frags, and we can't grow new
3398 frags for a new subsegment, then go back to the old subsegment and
3399 append bytes to the old frag.  All this nonsense goes away if we
3400 give each subsegment its own obstack.  It makes code simpler in
3401 about 10 places, but nobody has bothered to do it because C compiler
3402 output rarely changes subsegments (compared to ending frags with
3403 relaxable addresses, which is common).
3404 @end itemize
3405
3406 @section Sources
3407 @c The following files in the @file{as} directory
3408 @c are symbolic links to other files, of
3409 @c the same name, in a different directory.
3410 @c @itemize @bullet
3411 @c @item
3412 @c @file{atof_generic.c}
3413 @c @item
3414 @c @file{atof_vax.c}
3415 @c @item
3416 @c @file{flonum_const.c}
3417 @c @item
3418 @c @file{flonum_copy.c}
3419 @c @item
3420 @c @file{flonum_get.c}
3421 @c @item
3422 @c @file{flonum_multip.c}
3423 @c @item
3424 @c @file{flonum_normal.c}
3425 @c @item
3426 @c @file{flonum_print.c}
3427 @c @end itemize
3428
3429 Here is a list of the source files in the @file{as} directory.
3430
3431 @table @file
3432 @item app.c
3433 This contains the pre-processing phase, which deletes comments,
3434 handles whitespace, etc.  This was recently re-written, since app
3435 used to be a separate program, but RMS wanted it to be inline.
3436
3437 @item append.c
3438 This is a subroutine to append a string to another string returning a
3439 pointer just after the last @code{char} appended.  (JF:  All these
3440 little routines should probably all be put in one file.)
3441
3442 @item as.c
3443 Here you will find the main program of the assembler @code{as}.
3444
3445 @item expr.c
3446 This is a branch office of @file{read.c}.  This understands
3447 expressions, arguments.  Inside @code{as}, arguments are called
3448 (expression) @emph{operands}.  This is confusing, because we also talk
3449 (elsewhere) about instruction @emph{operands}.  Also, expression
3450 operands are called @emph{quantities} explicitly to avoid confusion
3451 with instruction operands.  What a mess.
3452
3453 @item frags.c
3454 This implements the @b{frag} concept.  Without frags, finding the
3455 right size for branch instructions would be a lot harder.
3456
3457 @item hash.c
3458 This contains the symbol table, opcode table @emph{etc.} hashing
3459 functions.
3460
3461 @item hex_value.c
3462 This is a table of values of digits, for use in atoi() type
3463 functions.  Could probably be flushed by using calls to strtol(), or
3464 something similar.
3465
3466 @item input-file.c
This contains operating system dependent source file reading
3468 routines.  Since error messages often say where we are in reading
3469 the source file, they live here too.  Since @code{as} is intended to
3470 run under GNU and Unix only, this might be worth flushing.  Anyway,
3471 almost all C compilers support stdio.
3472
3473 @item input-scrub.c
3474 This deals with calling the pre-processor (if needed) and feeding the
3475 chunks back to the rest of the assembler the right way.
3476
3477 @item messages.c
3478 This contains operating system independent parts of fatal and
3479 warning message reporting.  See @file{append.c} above.
3480
3481 @item output-file.c
3482 This contains operating system dependent functions that write an
3483 object file for @code{as}.  See @file{input-file.c} above.
3484
3485 @item read.c
3486 This implements all the directives of @code{as}.  This also deals
3487 with passing input lines to the machine dependent part of the
3488 assembler.
3489
3490 @item strstr.c
3491 This is a C library function that isn't in most C libraries yet.
3492 See @file{append.c} above.
3493
3494 @item subsegs.c
3495 This implements subsegments.
3496
3497 @item symbols.c
3498 This implements symbols.
3499
3500 @item write.c
3501 This contains the code to perform relaxation, and to write out
3502 the object file.  It is mostly operating system independent, but
3503 different OSes have different object file formats in any case.
3504
3505 @item xmalloc.c
3506 This implements @code{malloc()} or bust.  See @file{append.c} above.
3507
3508 @item xrealloc.c
3509 This implements @code{realloc()} or bust.  See @file{append.c} above.
3510
3511 @item atof-generic.c
3512 The following files were taken from a machine-independent subroutine
3513 library for manipulating floating point numbers and very large
3514 integers.
3515
3516 @file{atof-generic.c} turns a string into a flonum internal format
3517 floating-point number.
3518
3519 @item flonum-const.c
3520 This contains some potentially useful floating point numbers in
3521 flonum format.
3522
3523 @item flonum-copy.c
3524 This copies a flonum.
3525
3526 @item flonum-multip.c
3527 This multiplies two flonums together.
3528
3529 @item bignum-copy.c
3530 This copies a bignum.
3531
3532 @end table
3533
3534 Here is a table of all the machine-specific files (this includes
3535 both source and header files).  Typically, there is a
3536 @var{machine}.c file, a @var{machine}-opcode.h file, and an
3537 atof-@var{machine}.c file.  The @var{machine}-opcode.h file should
3538 be identical to the one used by GDB (which uses it for disassembly.)
3539
3540 @table @file
3541
3542 @item atof-ieee.c
This contains code to turn a flonum into an IEEE literal constant.
This is used by the 680x0, 32x32, sparc, and i386 versions of @code{as}.
3545
3546 @item i386-opcode.h
3547 This is the opcode-table for the i386 version of the assembler.
3548
3549 @item i386.c
3550 This contains all the code for the i386 version of the assembler.
3551
3552 @item i386.h
3553 This defines constants and macros used by the i386 version of the assembler.
3554
3555 @item m-generic.h
Generic 68020 header file.  To be linked to m68k.h on a
3557 non-sun3, non-hpux system.
3558
3559 @item m-sun2.h
3560 68010 header file for Sun2 workstations.  Not well tested.  To be linked
3561 to m68k.h on a sun2.  (See also @samp{-DSUN_ASM_SYNTAX} in the
3562 @file{Makefile}.)
3563
3564 @item m-sun3.h
3565 68020 header file for Sun3 workstations.  To be linked to m68k.h before
3566 compiling on a Sun3 system.  (See also @samp{-DSUN_ASM_SYNTAX} in the
3567 @file{Makefile}.)
3568
3569 @item m-hpux.h
3570 68020 header file for a HPUX (system 5?) box.  Which box, which
3571 version of HPUX, etc?  I don't know.
3572
3573 @item m68k.h
3574 A hard- or symbolic- link to one of @file{m-generic.h},
3575 @file{m-hpux.h} or @file{m-sun3.h} depending on which kind of
3576 680x0 you are assembling for.   (See also @samp{-DSUN_ASM_SYNTAX} in the
3577 @file{Makefile}.)
3578
3579 @item m68k-opcode.h
3580 Opcode table for 68020.  This is now a link to the opcode table
3581 in the @code{GDB} source directory.
3582
3583 @item m68k.c
3584 All the mc680x0 code, in one huge, slow-to-compile file.
3585
3586 @item ns32k.c
3587 This contains the code for the ns32032/ns32532 version of the
3588 assembler.
3589
3590 @item ns32k-opcode.h
3591 This contains the opcode table for the ns32032/ns32532 version
3592 of the assembler.
3593
3594 @item vax-inst.h
3595 Vax specific file for describing Vax operands and other Vax-ish things.
3596
3597 @item vax-opcode.h
3598 Vax opcode table.
3599
3600 @item vax.c
3601 Vax specific parts of @code{as}.  Also includes the former files
3602 @file{vax-ins-parse.c}, @file{vax-reg-parse.c} and @file{vip-op.c}.
3603
3604 @item atof-vax.c
3605 Turns a flonum into a Vax constant.
3606
3607 @item vms.c
3608 This file contains the special code needed to put out a VMS
3609 style object file for the Vax.
3610
3611 @end table
3612
3613 Here is a list of the header files in the source directory.
3614 (Warning:  This section may not be very accurate.  I didn't
3615 write the header files; I just report them.)  Also note that I
3616 think many of these header files could be cleaned up or
3617 eliminated.
3618
3619 @table @file
3620
3621 @item a.out.h
3622 This describes the structures used to create the binary header data
3623 inside the object file.  Perhaps we should use the one in
3624 @file{/usr/include}?
3625
3626 @item as.h
3627 This defines all the globally useful things, and pulls in <stdio.h>
3628 and <assert.h>.
3629
3630 @item bignum.h
3631 This defines macros useful for dealing with bignums.
3632
3633 @item expr.h
Structure and macros for dealing with expression().
3635
3636 @item flonum.h
3637 This defines the structure for dealing with floating point
3638 numbers.  It #includes @file{bignum.h}.
3639
3640 @item frags.h
This contains a macro for appending a byte to the current frag.
3642
3643 @item hash.h
3644 Structures and function definitions for the hashing functions.
3645
3646 @item input-file.h
3647 Function headers for the input-file.c functions.
3648
3649 @item md.h
Structures and function headers for things defined in the
3651 machine dependent part of the assembler.
3652
3653 @item obstack.h
3654 This is the GNU systemwide include file for manipulating obstacks.
3655 Since nobody is running under real GNU yet, we include this file.
3656
3657 @item read.h
3658 Macros and function headers for reading in source files.
3659
3660 @item struct-symbol.h
3661 Structure definition and macros for dealing with the gas
3662 internal form of a symbol.
3663
3664 @item subsegs.h
Structure definition for dealing with the numbered subsegments
3666 of the text and data segments.
3667
3668 @item symbols.h
3669 Macros and function headers for dealing with symbols.
3670
3671 @item write.h
3672 Structure for doing segment fixups.
3673 @end table
3674
3675 @comment ~subsection Test Directory
3676 @comment (Note:  The test directory seems to have disappeared somewhere
3677 @comment along the line.  If you want it, you'll probably have to find a
3678 @comment REALLY OLD dump tape~dots{})
3679 @comment
3680 @comment The ~file{test/} directory is used for regression testing.
3681 @comment After you modify ~@code{as}, you can get a quick go/nogo
3682 @comment confidence test by running the new ~@code{as} over the source
3683 @comment files in this directory.  You use a shell script ~file{test/do}.
3684 @comment
3685 @comment The tests in this suite are evolving.  They are not comprehensive.
3686 @comment They have, however, caught hundreds of bugs early in the debugging
3687 @comment cycle of ~@code{as}.  Most test statements in this suite were naturally
3688 @comment selected: they were used to demonstrate actual ~@code{as} bugs rather
3689 @comment than being written ~i{a prioi}.
3690 @comment
3691 @comment Another testing suggestion: over 30 bugs have been found simply by
3692 @comment running examples from this manual through ~@code{as}.
3693 @comment Some examples in this manual are selected
3694 @comment to distinguish boundary conditions; they are good for testing ~@code{as}.
3695 @comment
3696 @comment ~subsubsection Regression Testing
3697 @comment Each regression test involves assembling a file and comparing the
3698 @comment actual output of ~@code{as} to ``known good'' output files.  Both
3699 @comment the object file and the error/warning message file (stderr) are
3700 @comment inspected.  Optionally ~@code{as}' exit status may be checked.
3701 @comment Discrepencies are reported.  Each discrepency means either that
3702 @comment you broke some part of ~@code{as} or that the ``known good'' files
3703 @comment are now out of date and should be changed to reflect the new
3704 @comment definition of ``good''.
3705 @comment
3706 @comment Each regression test lives in its own directory, in a tree
3707 @comment rooted in the directory ~file{test/}.  Each such directory
3708 @comment has a name ending in ~file{.ret}, where `ret' stands for
3709 @comment REgression Test.  The ~file{.ret} ending allows ~code{find
3710 @comment (1)} to find all regression tests in the tree, without
3711 @comment needing to list them explicitly.
3712 @comment
3713 @comment Any ~file{.ret} directory must contain a file called
3714 @comment ~file{input} which is the source file to assemble.  During
3715 @comment testing an object file ~file{output} is created, as well as
3716 @comment a file ~file{stdouterr} which contains the output to both
3717 @comment stdout and stderr.  If there is a file ~file{output.good} in
3718 @comment the directory, and if ~file{output} contains exactly the
3719 @comment same data as ~file{output.good}, the file ~file{output} is
3720 @comment deleted.  Likewise ~file{stdouterr} is removed if it exactly
3721 @comment matches a file ~file{stdouterr.good}.  If file
3722 @comment ~file{status.good} is present, containing a decimal number
3723 @comment before a newline, the exit status of ~@code{as} is compared
3724 @comment to this number.  If the status numbers are not equal, a file
3725 @comment ~file{status} is written to the directory, containing the
3726 @comment actual status as a decimal number followed by newline.
3727 @comment
3728 @comment Should any of the ~file{*.good} files fail to match their corresponding
3729 @comment actual files, this is noted by a 1-line message on the screen during
3730 @comment the regression test, and you can use ~@code{find (1)} to find any
3731 @comment files named ~file{status}, ~file{output} or ~file{stdouterr}.
3732 @comment
3733 @node Retargeting, License, Maintenance, Top
3734 @chapter Teaching the Assembler about a New Machine
3735
3736 This chapter describes the steps required in order to make the
3737 assembler work with another machine's assembly language.  This
3738 chapter is not complete, and only describes the steps in the
3739 broadest terms.  You should look at the source for the
3740 currently supported machine in order to discover some of the
3741 details that aren't mentioned here.
3742
3743 You should create a new file called @file{@var{machine}.c}, and
3744 add the appropriate lines to the file @file{Makefile} so that
3745 you can compile your new version of the assembler.  This should
3746 be straightforward; simply add lines similar to the ones there
3747 for the four current versions of the assembler.
3748
3749 If you want to be compatible with GDB (and the current
3750 machine-dependent versions of the assembler), you should create
3751 a file called @file{@var{machine}-opcode.h} which should
3752 contain all the information about the names of the machine
3753 instructions, their opcodes, and what addressing modes they
3754 support.  If you do this right, the assembler and GDB can share
3755 this file, and you'll only have to write it once.  Note that
3756 while you're writing @code{as}, you may want to use an
3757 independent program (if you have access to one) to make sure
3758 that @code{as} is emitting the correct bytes.  Since @code{as}
3759 and @code{GDB} share the opcode table, an incorrect opcode
3760 table entry may make invalid bytes look OK when you disassemble
3761 them with @code{GDB}.
3762
3763 @section Functions You Will Have to Write
3764
3765 Your file @file{@var{machine}.c} should contain definitions for
3766 the following functions and variables.  It will need to include
3767 some header files in order to use some of the structures
3768 defined in the machine-independent part of the assembler.  The
3769 needed header files are mentioned in the descriptions of the
3770 functions that will need them.
3771
3772 @table @code
3773
3774 @item long omagic;
3775 This long integer holds the value to place at the beginning of
3776 the @file{a.out} file.  It is usually @samp{OMAGIC}, except on
3777 machines that store additional information in the magic-number.
3778
3779 @item char comment_chars[];
3780 This character array holds the values of the characters that
3781 start a comment anywhere in a line.  Comments are stripped off
3782 automatically by the machine independent part of the
3783 assembler.  Note that @samp{/*} will always start a
3784 comment, and that only @samp{*/} will end a comment started by
3785 @samp{/*}.
3786
3787 @item char line_comment_chars[];
3788 This character array holds the values of the chars that start a
3789 comment only if they are the first (non-whitespace) character
3790 on a line.  If the character @samp{#} does not appear in this
3791 list, you may get unexpected results.  (Various
3792 machine-independent parts of the assembler treat the comments
3793 @samp{#APP} and @samp{#NO_APP} specially, and assume that lines
3794 that start with @samp{#} are comments.)
3795
3796 @item char EXP_CHARS[];
3797 This character array holds the letters that can separate the
3798 mantissa and the exponent of a floating point number.  Typical
3799 values are @samp{e} and @samp{E}.
3800
3801 @item char FLT_CHARS[];
3802 This character array holds the letters that--when they appear
3803 immediately after a leading zero--indicate that a number is a
3804 floating-point number.  (Sort of how 0x indicates that a
3805 hexadecimal number follows.)
3806
3807 @item pseudo_typeS md_pseudo_table[];
3808 (@var{pseudo_typeS} is defined in @file{md.h})
3809 This array contains a list of the machine-dependent directives
3810 the assembler must support.  It contains the name of each
3811 pseudo op (without the leading @samp{.}), a pointer to a
3812 function to be called when that directive is encountered, and
3813 an integer argument to be passed to that function.
3814
3815 @item void md_begin(void)
3816 This function is called as part of the assembler's
3817 initialization.  It should do any initialization required by
3818 any of your other routines.
3819
3820 @item int md_parse_option(char **optionPTR, int *argcPTR, char ***argvPTR)
3821 This routine is called once for each option on the command line
3822 that the machine-independent part of @code{as} does not
3823 understand.  This function should return non-zero if the option
3824 pointed to by @var{optionPTR} is a valid option.  If it is not
3825 a valid option, this routine should return zero.  The variables
3826 @var{argcPTR} and @var{argvPTR} are provided in case the option
3827 requires a filename or something similar as an argument.  If
3828 the option is multi-character, @var{optionPTR} should be
3829 advanced past the end of the option, otherwise every letter in
3830 the option will be treated as a separate single-character
3831 option.
3832
3833 @item void md_assemble(char *string)
3834 This routine is called for every machine-dependent
3835 non-directive line in the source file.  It does all the real
3836 work involved in reading the opcode, parsing the operands,
3837 etc.  @var{string} is a pointer to a null-terminated string
3838 that comprises the input line, with all excess whitespace and
3839 comments removed.
3840
3841 @item void md_number_to_chars(char *outputPTR,long value,int nbytes)
3842 This routine is called to turn a C long int, short int, or char
3843 into the series of bytes that represents that number on the
3844 target machine.  @var{outputPTR} points to an array where the
3845 result should be stored; @var{value} is the value to store; and
3846 @var{nbytes} is the number of bytes in @var{value} that should be
3847 stored.
3848
3849 @item void md_number_to_imm(char *outputPTR,long value,int nbytes)
3850 This routine is called to turn a C long int, short int, or char
3851 into the series of bytes that represent an immediate value on
3852 the target machine.  It is identical to the function @code{md_number_to_chars},
3853 except on NS32K machines.@refill
3854
3855 @item void md_number_to_disp(char *outputPTR,long value,int nbytes)
3856 This routine is called to turn a C long int, short int, or char
3857 into the series of bytes that represent a displacement value on
3858 the target machine.  It is identical to the function @code{md_number_to_chars},
3859 except on NS32K machines.@refill
3860
3861 @item void md_number_to_field(char *outputPTR,long value,int nbytes)
3862 This routine is identical to @code{md_number_to_chars},
3863 except on NS32K machines.
3864
3865 @item void md_ri_to_chars(struct relocation_info *riPTR,ri)
3866 (@code{struct relocation_info} is defined in @file{a.out.h})
3867 This routine emits the relocation info in @var{ri}
3868 in the appropriate bit-pattern for the target machine.
3869 The result should be stored in the location pointed
3870 to by @var{riPTR}.  This routine may be a no-op unless you are
3871 attempting to do cross-assembly.
3872
3873 @item char *md_atof(char type,char *outputPTR,int *sizePTR)
3874 This routine turns a series of digits into the appropriate
3875 internal representation for a floating-point number.
3876 @var{type} is a character from @var{FLT_CHARS[]} that describes
3877 what kind of floating point number is wanted; @var{outputPTR}
3878 is a pointer to an array that the result should be stored in;
3879 and @var{sizePTR} is a pointer to an integer where the size (in
3880 bytes) of the result should be stored.  This routine should
3881 return an error message, or an empty string (not (char *)0) for
3882 success.
3883
3884 @item int md_short_jump_size;
3885 This variable holds the (maximum) size in bytes of a short (16
3886 bit or so) jump created by @code{md_create_short_jump()}.  This
3887 variable is used as part of the broken-word feature, and isn't
3888 needed if the assembler is compiled with
3889 @samp{-DWORKING_DOT_WORD}.
3890
3891 @item int md_long_jump_size;
3892 This variable holds the (maximum) size in bytes of a long (32
3893 bit or so) jump created by @code{md_create_long_jump()}.  This
3894 variable is used as part of the broken-word feature, and isn't
3895 needed if the assembler is compiled with
3896 @samp{-DWORKING_DOT_WORD}.
3897
3898 @item void md_create_short_jump(char *resultPTR,long from_addr,
3899 @code{long to_addr,fragS *frag,symbolS *to_symbol)}
3900 This function emits a jump from @var{from_addr} to @var{to_addr} in
3901 the array of bytes pointed to by @var{resultPTR}.  If this creates a
3902 type of jump that must be relocated, this function should call
3903 @code{fix_new()} with @var{frag} and @var{to_symbol}.  The jump
3904 emitted by this function may be smaller than @var{md_short_jump_size},
3905 but it must never create a larger one.
3906 (If it creates a smaller jump, the extra bytes of memory will not be
3907 used.)  This function is used as part of the broken-word feature,
3908 and isn't needed if the assembler is compiled with
3909 @samp{-DWORKING_DOT_WORD}.@refill
3910
3911 @item void md_create_long_jump(char *ptr,long from_addr,
3912 @code{long to_addr,fragS *frag,symbolS *to_symbol)}
3913 This function is similar to the previous function,
3914 @code{md_create_short_jump()}, except that it creates a long
3915 jump instead of a short one.  This function is used as part of
3916 the broken-word feature, and isn't needed if the assembler is
3917 compiled with @samp{-DWORKING_DOT_WORD}.
3918
3919 @item int md_estimate_size_before_relax(fragS *fragPTR,int segment_type)
3920 This function does the initial setting up for relaxation.  This
3921 includes forcing references to still-undefined symbols to the
3922 appropriate addressing modes.
3923
3924 @item relax_typeS md_relax_table[];
3925 (@var{relax_typeS} is defined in @file{md.h})
3926 This array describes the various machine dependent states a
3927 frag may be in before relaxation.  You will need one group of
3928 entries for each type of addressing mode you intend to relax.
3929
3930 @item void md_convert_frag(fragS *fragPTR)
3931 (@var{fragS} is defined in @file{as.h})
3932 This routine does the required cleanup after relaxation.
3933 Relaxation has changed the type of the frag to a type that can
3934 reach its destination.  This function should adjust the opcode
3935 of the frag to use the appropriate addressing mode.
3936 @var{fragPTR} points to the frag to clean up.
3937
3938 @item void md_end(void)
3939 This function is called just before the assembler exits.  It
3940 need not free up memory unless the operating system doesn't do
3941 it automatically on exit.  (In which case you'll also have to
3942 track down all the other places where the assembler allocates
3943 space but never frees it.)
3944
3945 @end table
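
The byte-conversion routines in the table above are easiest to picture
with a concrete case.  What follows is only a sketch of how
@code{md_number_to_chars} might behave on a hypothetical big-endian
target.  The name @code{sketch_number_to_chars} and the choice of byte
order are assumptions made for illustration; they are not taken from
the @code{as} sources.

```c
#include <assert.h>

/* Hypothetical stand-in for md_number_to_chars on a big-endian target.
   Stores the low-order NBYTES bytes of VALUE at OUTPUT_PTR, most
   significant byte first.  */
void
sketch_number_to_chars (char *output_ptr, long value, int nbytes)
{
  unsigned long bits = (unsigned long) value;
  int i;

  assert (output_ptr != 0 && nbytes >= 0);

  /* Fill from the last byte backwards, so that the least significant
     byte lands at the highest address.  */
  for (i = nbytes - 1; i >= 0; i--)
    {
      output_ptr[i] = (char) (bits & 0xff);
      bits >>= 8;
    }
}
```

A little-endian target would fill the array in the opposite direction.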
3946
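The description of @code{md_pseudo_table} above (a name, a function
pointer, and an integer argument per directive) can be sketched as
follows.  The structure layout, the field names, the
@code{NULL}-terminated table, and every identifier beginning with
@code{sketch_} are assumptions made for illustration; the real
@code{pseudo_typeS} is defined in @file{md.h} and may differ.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative layout only; the real pseudo_typeS lives in md.h.  */
typedef struct
{
  const char *name;         /* directive name, without the leading '.' */
  void (*handler) (int);    /* called when the directive is seen */
  int arg;                  /* passed to the handler */
} sketch_pseudo;

/* Records the argument of the most recent handler call.  */
static int sketch_last_arg = -1;

static void
sketch_handle_float (int float_type)
{
  sketch_last_arg = float_type;
}

/* A NULL name marks the end of the table (an assumption of this
   sketch).  */
static const sketch_pseudo sketch_pseudo_table[] = {
  { "single", sketch_handle_float, 'f' },
  { "double", sketch_handle_float, 'd' },
  { NULL, NULL, 0 }
};

/* Look up DIRECTIVE and call its handler with the stored argument.
   Returns 1 if the directive was found, 0 otherwise.  */
int
sketch_dispatch (const char *directive)
{
  const sketch_pseudo *p;

  for (p = sketch_pseudo_table; p->name != NULL; p++)
    if (strcmp (p->name, directive) == 0)
      {
        p->handler (p->arg);
        return 1;
      }
  return 0;
}
```

Sharing one handler between several directives and telling them apart
by the integer argument keeps the table short.
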
3947 @section External Variables You Will Need to Use
3948
3949 You will need to refer to or change the following external variables
3950 from within the machine-dependent part of the assembler.
3951
3952 @table @code
3953 @item extern char flagseen[];
3954 This array holds non-zero values in locations corresponding to
3955 the options that were on the command line.  Thus, if the
3956 assembler was called with @samp{-W}, @var{flagseen['W']} would
3957 be non-zero.
3958
3959 @item extern fragS *frag_now;
3960 This pointer points to the current frag--the frag that bytes
3961 are currently being added to.  If nothing else, you will need
3962 to pass it as an argument to various machine-independent
3963 functions.  It is maintained automatically by the
3964 frag-manipulating functions; you should never have to change it
3965 yourself.
3966
3967 @item extern LITTLENUM_TYPE generic_bignum[];
3968 (@var{LITTLENUM_TYPE} is defined in @file{bignum.h}.)
3969 This is where @dfn{bignums}--numbers larger than 32 bits--are
3970 returned when they are encountered in an expression. You will
3971 need to use this if you need to implement directives (or
3972 anything else) that must deal with these large numbers.
3973 @code{Bignums} are of @code{segT} @code{SEG_BIG} (defined in
3974 @file{as.h}), and have a positive @code{X_add_number}.  The
3975 @code{X_add_number} of a @code{bignum} is the number of
3976 @code{LITTLENUMS} in @var{generic_bignum} that the number takes
3977 up.
3978
3979 @item extern FLONUM_TYPE generic_floating_point_number;
3980 (@var{FLONUM_TYPE} is defined in @file{flonum.h}.)
3981 This is where @dfn{flonums}--floating-point numbers within
3982 expressions--are returned.  @code{Flonums} are of @code{segT}
3983 @code{SEG_BIG}, and have a negative @code{X_add_number}.
3984 @code{Flonums} are returned in a generic format.  You will have
3985 to write a routine to turn this generic format into the
3986 appropriate floating-point format for your machine.
3987
3988 @item extern int need_pass_2;
3989 If this variable is non-zero, the assembler has encountered an
3990 expression that cannot be assembled in a single pass.  Since
3991 the second pass isn't implemented, this flag means that the
3992 assembler is punting, and is only looking for additional syntax
3993 errors.  (Or something like that.)
3994
3995 @item extern segT now_seg;
3996 This variable holds the value of the segment the assembler is
3997 currently assembling into.
3998
3999 @end table
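
The bignum and flonum entries above share the segment @code{SEG_BIG}
and are told apart by the sign of @code{X_add_number}.  The sketch
below illustrates that test, and shows one way a short bignum could be
folded back into a single integer.  It assumes that a littlenum is a
16-bit unsigned quantity and that the least significant littlenum is
stored first; both points, and all the @code{sketch_} names, are
assumptions to check against @file{bignum.h}.

```c
#include <assert.h>

/* Assumptions of this sketch: a littlenum holds 16 bits, and the
   least significant littlenum comes first in the array.  */
typedef unsigned short sketch_littlenum;
#define SKETCH_LITTLENUM_BITS 16

enum sketch_big_kind { SKETCH_BIGNUM, SKETCH_FLONUM };

/* For a SEG_BIG expression, a positive X_add_number means a bignum
   and a negative one means a flonum.  Zero is not expected.  */
enum sketch_big_kind
sketch_classify_big (long x_add_number)
{
  assert (x_add_number != 0);
  return x_add_number > 0 ? SKETCH_BIGNUM : SKETCH_FLONUM;
}

/* Fold the first COUNT littlenums of WORDS into one unsigned value.
   COUNT must be between 1 and 4 so that the result fits in 64 bits.  */
unsigned long long
sketch_bignum_value (const sketch_littlenum *words, long count)
{
  unsigned long long result = 0;
  long i;

  assert (words != 0 && count >= 1 && count <= 4);

  /* Start with the most significant littlenum and shift downwards.  */
  for (i = count - 1; i >= 0; i--)
    result = (result << SKETCH_LITTLENUM_BITS) | (words[i] & 0xffffu);
  return result;
}
```

Here @code{count} plays the role of a bignum's positive
@code{X_add_number}.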
4000
4001 @section External Functions You Will Need
4002
4003 You will find the following external functions useful (or
4004 indispensable) when you're writing the machine-dependent part
4005 of the assembler.
4006
4007 @table @code
4008
4009 @item char *frag_more(int bytes)
4010 This function allocates @var{bytes} more bytes in the current
4011 frag (or starts a new frag, if it can't expand the current frag
4012 any more) for you to store some object-file bytes in.  It
4013 returns a pointer to the bytes, ready for you to store data in.
4014
4015 @item void fix_new(fragS *frag, int where, short size, symbolS *add_symbol, symbolS *sub_symbol, long offset, int pcrel)
4016 This function stores a relocation fixup to be acted on later.
4017 @var{frag} points to the frag the relocation belongs in;
4018 @var{where} is the location within the frag where the relocation begins;
4019 @var{size} is the size of the relocation, and is usually 1 (a single byte),
4020   2 (sixteen bits), or 4 (a longword).
4021 The value @var{add_symbol} @minus{} @var{sub_symbol} + @var{offset} is added to the byte(s)
4022 at @var{frag->fr_literal[where]}.  If @var{pcrel} is non-zero, the address of the
4023 location is subtracted from the result.  A relocation entry is also added
4024 to the @file{a.out} file.  @var{add_symbol}, @var{sub_symbol}, and/or
4025 @var{offset} may be NULL.@refill
4026
4027 @item char *frag_var(relax_stateT type, int max_chars, int var,
4028 @code{relax_substateT subtype, symbolS *symbol, char *opcode)}
4029 This function creates a machine-dependent frag of type @var{type}
4030 (usually @code{rs_machine_dependent}).
4031 @var{max_chars} is the maximum size in bytes that the frag may grow by;
4032 @var{var} is the current size of the variable end of the frag;
4033 @var{subtype} is the sub-type of the frag.  The sub-type is used to index into
4034 @var{md_relax_table[]} during relaxation.
4035 @var{symbol} is the symbol whose value should be used when relaxing this frag.
4036 @var{opcode} points to a byte whose value may have to be modified if the
4037 addressing mode used by this frag changes.  It typically points into the
4038 @var{fr_literal[]} of the previous frag, and is used to point to a location
4039 that @code{md_convert_frag()} may have to change.@refill
4040
4041 @item void frag_wane(fragS *fragPTR)
4042 This function is useful from within @code{md_convert_frag}.  It
4043 changes a frag to type @code{rs_fill}, and sets the variable-sized
4044 piece of the frag to zero.  The frag will never change in size
4045 again.
4046
4047 @item segT expression(expressionS *retval)
4048 (@var{segT} is defined in @file{as.h}; @var{expressionS} is defined in @file{expr.h})
4049 This function parses the string pointed to by the external char
4050 pointer @var{input_line_pointer}, and returns the segment-type
4051 of the expression.  It also stores the results in the
4052 @var{expressionS} pointed to by @var{retval}.
4053 @var{input_line_pointer} is advanced to point past the end of
4054 the expression.  (@var{input_line_pointer} is used by other
4055 parts of the assembler.  If you modify it, be sure to restore
4056 it to its original value.)
4057
4058 @item as_warn(char *message,@dots{})
4059 If warning messages are disabled, this function does nothing.
4060 Otherwise, it prints out the current file name, and the current
4061 line number, then uses @code{fprintf} to print the
4062 @var{message} and any arguments it was passed.
4063
4064 @item as_bad(char *message,@dots{})
4065 This function should be called when @code{as} encounters
4066 conditions that are bad enough that @code{as} should not
4067 produce an object file, but should continue reading input and
4068 printing warning and bad error messages.
4069
4070 @item as_fatal(char *message,@dots{})
4071 This function prints out the current file name and line number,
4072 prints the word @samp{FATAL:}, then uses @code{fprintf} to
4073 print the @var{message} and any arguments it was passed.  Then
4074 the assembler exits.  This function should only be used for
4075 serious, unrecoverable errors.
4076
4077 @item void float_const(int float_type)
4078 This function reads floating-point constants from the current
4079 input line, and calls @code{md_atof} to assemble them.  It is
4080 useful as the function to call for the directives
4081 @samp{.single}, @samp{.double}, @samp{.float}, etc.
4082 @var{float_type} must be a character from @var{FLT_CHARS}.
4083
4084 @item void demand_empty_rest_of_line(void);
4085 This function can be used by machine-dependent directives to
4086 make sure the rest of the input line is empty.  It prints a
4087 warning message if there are additional characters on the line.
4088
4089 @item long int get_absolute_expression(void)
4090 This function can be used by machine-dependent directives to
4091 read an absolute number from the current input line.  It
4092 returns the result.  If it isn't given an absolute expression,
4093 it prints a warning message and returns zero.
4094
4095 @end table
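
The arithmetic described for @code{fix_new} in the table above can be
written out as a small function.  This is only a sketch: it takes the
symbol values as plain numbers (using zero for an absent symbol),
whereas the assembler itself records the fixup and resolves it later.
The name @code{sketch_fixup_value} and its parameters are hypothetical.

```c
/* Compute the value a fixup would add to the bytes it covers.
   ADD_VALUE and SUB_VALUE are the values of the two symbols (zero
   when a symbol is absent), OFFSET is the constant term, and
   LOCATION_ADDRESS is the address of the bytes being fixed up.  */
long
sketch_fixup_value (long add_value, long sub_value, long offset,
                    int pcrel, long location_address)
{
  long value = add_value - sub_value + offset;

  /* A pc-relative fixup is measured from the location itself.  */
  if (pcrel)
    value -= location_address;
  return value;
}
```

For example, a symbol at 0x1000 with an offset of 12 and no
pc-relative adjustment gives 0x100c.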
4096
4097
4098 @section The Concept of Frags
4099
4100 This assembler works to optimize the size of certain addressing
4101 modes (e.g., branch instructions).  This means the size of many
4102 pieces of object code cannot be determined until after assembly
4103 is finished.  (It also means that the addresses of symbols cannot be
4104 determined until assembly is finished.)  In order to do this,
4105 @code{as} stores the output bytes as @dfn{frags}.
4106
4107 Here is the definition of a frag (from @file{as.h}):
4108 @example
4109 struct frag
4110 @{
4111         long int fr_fix;
4112         long int fr_var;
4113         relax_stateT fr_type;
4114         relax_substateT fr_substate;
4115         unsigned long fr_address;
4116         long int fr_offset;
4117         struct symbol *fr_symbol;
4118         char *fr_opcode;
4119         struct frag *fr_next;
4120         char fr_literal[];
4121 @}
4122 @end example
4123
4124 @table @var
4125 @item fr_fix
4126 is the size of the fixed-size piece of the frag.
4127
4128 @item fr_var
4129 is the maximum (?) size of the variable-sized piece of the frag.
4130
4131 @item fr_type
4132 is the type of the frag.
4133 Current types are
4134 @code{rs_fill},
4135 @code{rs_align},
4136 @code{rs_org}, and
4137 @code{rs_machine_dependent}.
4138
4139 @item fr_substate
4140 This stores the type of machine-dependent frag this is (what
4141 kind of addressing mode is being used, and what size is being
4142 tried/will fit/etc.).
4143
4144 @item fr_address
4145 @var{fr_address} is only valid after relaxation is finished.
4146 Before relaxation, the only way to store an address is (pointer
4147 to frag containing the address) plus (offset into the frag).
4148
4149 @item fr_offset
4150 This contains a number, whose meaning depends on the type of
4151 the frag.
4152 For machine-dependent frags, this contains the offset from
4153 @var{fr_symbol} that the frag wants to go to.  Thus, for branch
4154 instructions it is usually zero (unless the instruction was
4155 @samp{jba foo+12} or something like that).
4156
4157 @item fr_symbol
4158 For machine-dependent frags, this points to the symbol the frag
4159 needs to reach.
4160
4161 @item fr_opcode
4162 This points to the location in the frag (or in a previous frag)
4163 of the opcode for the instruction that caused this to be a frag.
4164 @var{fr_opcode} is needed if the actual opcode must be changed
4165 in order to use a different form of the addressing mode.
4166 (For example, if a conditional branch only comes in size tiny,
4167 a large-size branch could be implemented by reversing the sense
4168 of the test, and turning it into a tiny branch over a large jump.
4169 This would require changing the opcode.)
4170
4171 @item fr_literal
4172 This is a variable-size array that contains the actual object
4173 bytes.  A frag consists of a fixed-size piece of object data
4174 (which may be zero bytes long), followed by a piece of object data
4175 whose size may not have been determined yet.  Other information
4176 includes the type of the frag (which controls how it is relaxed).
4177
4178 @item fr_next
4179 This is the next frag in the singly-linked list.  This is
4180 usually only needed by the machine-independent part of
4181 @code{as}.
4182
4183 @end table
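
As noted under @var{fr_address}, an address is kept as a frag plus an
offset until relaxation is finished.  The sketch below shows how such a
pair could be turned into a number once every frag has its final size.
It uses a cut-down, hypothetical @code{struct sketch_frag} and assumes
that each frag ends up occupying @code{fr_fix + fr_var} bytes; the real
assembler's bookkeeping is more involved.

```c
#include <stddef.h>

/* Cut-down, hypothetical frag holding only the fields this sketch
   needs.  */
struct sketch_frag
{
  long fr_fix;                  /* size of the fixed part */
  long fr_var;                  /* size of the variable part, assumed final */
  unsigned long fr_address;     /* valid only after sketch_assign_addresses */
  struct sketch_frag *fr_next;  /* next frag in the singly-linked list */
};

/* Walk the list, giving each frag the address just past its
   predecessor.  Returns the total number of bytes covered.  */
unsigned long
sketch_assign_addresses (struct sketch_frag *first, unsigned long start)
{
  unsigned long address = start;
  struct sketch_frag *f;

  for (f = first; f != NULL; f = f->fr_next)
    {
      f->fr_address = address;
      address += (unsigned long) (f->fr_fix + f->fr_var);
    }
  return address - start;
}

/* Turn a (frag, offset) pair into a number.  Only meaningful once
   fr_address has been assigned.  */
unsigned long
sketch_resolve (const struct sketch_frag *frag, long offset)
{
  return frag->fr_address + (unsigned long) offset;
}
```

Because the list is singly linked, the addresses are assigned in one
forward pass.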
4184 @end ignore
4185
4186 @node License,  , Retargeting, Top
4187 @unnumbered GNU GENERAL PUBLIC LICENSE
4188 @center Version 1, February 1989
4189
4190 @display
4191 Copyright @copyright{} 1989 Free Software Foundation, Inc.
4192 675 Mass Ave, Cambridge, MA 02139, USA
4193
4194 Everyone is permitted to copy and distribute verbatim copies
4195 of this license document, but changing it is not allowed.
4196 @end display
4197
4198 @unnumberedsec Preamble
4199
4200   The license agreements of most software companies try to keep users
4201 at the mercy of those companies.  By contrast, our General Public
4202 License is intended to guarantee your freedom to share and change free
4203 software---to make sure the software is free for all its users.  The
4204 General Public License applies to the Free Software Foundation's
4205 software and to any other program whose authors commit to using it.
4206 You can use it for your programs, too.
4207
4208   When we speak of free software, we are referring to freedom, not
4209 price.  Specifically, the General Public License is designed to make
4210 sure that you have the freedom to give away or sell copies of free
4211 software, that you receive source code or can get it if you want it,
4212 that you can change the software or use pieces of it in new free
4213 programs; and that you know you can do these things.
4214
4215   To protect your rights, we need to make restrictions that forbid
4216 anyone to deny you these rights or to ask you to surrender the rights.
4217 These restrictions translate to certain responsibilities for you if you
4218 distribute copies of the software, or if you modify it.
4219
4220   For example, if you distribute copies of a such a program, whether
4221 gratis or for a fee, you must give the recipients all the rights that
4222 you have.  You must make sure that they, too, receive or can get the
4223 source code.  And you must tell them their rights.
4224
4225   We protect your rights with two steps: (1) copyright the software, and
4226 (2) offer you this license which gives you legal permission to copy,
4227 distribute and/or modify the software.
4228
4229   Also, for each author's protection and ours, we want to make certain
4230 that everyone understands that there is no warranty for this free
4231 software.  If the software is modified by someone else and passed on, we
4232 want its recipients to know that what they have is not the original, so
4233 that any problems introduced by others will not reflect on the original
4234 authors' reputations.
4235
4236   The precise terms and conditions for copying, distribution and
4237 modification follow.
4238
4239 @iftex
4240 @unnumberedsec TERMS AND CONDITIONS
4241 @end iftex
4242 @ifinfo
4243 @center TERMS AND CONDITIONS
4244 @end ifinfo
4245
4246 @enumerate
4247 @item
4248 This License Agreement applies to any program or other work which
4249 contains a notice placed by the copyright holder saying it may be
4250 distributed under the terms of this General Public License.  The
4251 ``Program'', below, refers to any such program or work, and a ``work based
4252 on the Program'' means either the Program or any work containing the
4253 Program or a portion of it, either verbatim or with modifications.  Each
4254 licensee is addressed as ``you''.
4255
4256 @item
4257 You may copy and distribute verbatim copies of the Program's source
4258 code as you receive it, in any medium, provided that you conspicuously and
4259 appropriately publish on each copy an appropriate copyright notice and
4260 disclaimer of warranty; keep intact all the notices that refer to this
4261 General Public License and to the absence of any warranty; and give any
4262 other recipients of the Program a copy of this General Public License
4263 along with the Program.  You may charge a fee for the physical act of
4264 transferring a copy.
4265
4266 @item
4267 You may modify your copy or copies of the Program or any portion of
4268 it, and copy and distribute such modifications under the terms of Paragraph
4269 1 above, provided that you also do the following:
4270
4271 @itemize @bullet
4272 @item
4273 cause the modified files to carry prominent notices stating that
4274 you changed the files and the date of any change; and
4275
4276 @item
4277 cause the whole of any work that you distribute or publish, that
4278 in whole or in part contains the Program or any part thereof, either
4279 with or without modifications, to be licensed at no charge to all
4280 third parties under the terms of this General Public License (except
4281 that you may choose to grant warranty protection to some or all
4282 third parties, at your option).
4283
4284 @item
4285 If the modified program normally reads commands interactively when
4286 run, you must cause it, when started running for such interactive use
4287 in the simplest and most usual way, to print or display an
4288 announcement including an appropriate copyright notice and a notice
4289 that there is no warranty (or else, saying that you provide a
4290 warranty) and that users may redistribute the program under these
4291 conditions, and telling the user how to view a copy of this General
4292 Public License.
4293
4294 @item
4295 You may charge a fee for the physical act of transferring a
4296 copy, and you may at your option offer warranty protection in
4297 exchange for a fee.
4298 @end itemize
4299
4300 Mere aggregation of another independent work with the Program (or its
4301 derivative) on a volume of a storage or distribution medium does not bring
4302 the other work under the scope of these terms.
4303
4304 @item
4305 You may copy and distribute the Program (or a portion or derivative of
4306 it, under Paragraph 2) in object code or executable form under the terms of
4307 Paragraphs 1 and 2 above provided that you also do one of the following:
4308
4309 @itemize @bullet
4310 @item
4311 accompany it with the complete corresponding machine-readable
4312 source code, which must be distributed under the terms of
4313 Paragraphs 1 and 2 above; or,
4314
4315 @item
4316 accompany it with a written offer, valid for at least three
4317 years, to give any third party free (except for a nominal charge
4318 for the cost of distribution) a complete machine-readable copy of the
4319 corresponding source code, to be distributed under the terms of
4320 Paragraphs 1 and 2 above; or,
4321
4322 @item
4323 accompany it with the information you received as to where the
4324 corresponding source code may be obtained.  (This alternative is
4325 allowed only for noncommercial distribution and only if you
4326 received the program in object code or executable form alone.)
4327 @end itemize
4328
4329 Source code for a work means the preferred form of the work for making
4330 modifications to it.  For an executable file, complete source code means
4331 all the source code for all modules it contains; but, as a special
4332 exception, it need not include source code for modules which are standard
4333 libraries that accompany the operating system on which the executable
4334 file runs, or for standard header files or definitions files that
4335 accompany that operating system.
4336
4337 @item
4338 You may not copy, modify, sublicense, distribute or transfer the
4339 Program except as expressly provided under this General Public License.
4340 Any attempt otherwise to copy, modify, sublicense, distribute or transfer
4341 the Program is void, and will automatically terminate your rights to use
4342 the Program under this License.  However, parties who have received
4343 copies, or rights to use copies, from you under this General Public
4344 License will not have their licenses terminated so long as such parties
4345 remain in full compliance.
4346
4347 @item
4348 By copying, distributing or modifying the Program (or any work based
4349 on the Program) you indicate your acceptance of this license to do so,
4350 and all its terms and conditions.
4351
4352 @item
4353 Each time you redistribute the Program (or any work based on the
4354 Program), the recipient automatically receives a license from the original
4355 licensor to copy, distribute or modify the Program subject to these
4356 terms and conditions.  You may not impose any further restrictions on the
4357 recipients' exercise of the rights granted herein.
4358
4359 @item
4360 The Free Software Foundation may publish revised and/or new versions
4361 of the General Public License from time to time.  Such new versions will
4362 be similar in spirit to the present version, but may differ in detail to
4363 address new problems or concerns.
4364
4365 Each version is given a distinguishing version number.  If the Program
4366 specifies a version number of the license which applies to it and ``any
4367 later version'', you have the option of following the terms and conditions
4368 either of that version or of any later version published by the Free
4369 Software Foundation.  If the Program does not specify a version number of
4370 the license, you may choose any version ever published by the Free Software
4371 Foundation.
4372
4373 @item
4374 If you wish to incorporate parts of the Program into other free
4375 programs whose distribution conditions are different, write to the author
4376 to ask for permission.  For software which is copyrighted by the Free
4377 Software Foundation, write to the Free Software Foundation; we sometimes
4378 make exceptions for this.  Our decision will be guided by the two goals
4379 of preserving the free status of all derivatives of our free software and
4380 of promoting the sharing and reuse of software generally.
4381
4382 @iftex
4383 @heading NO WARRANTY
4384 @end iftex
4385 @ifinfo
4386 @center NO WARRANTY
4387 @end ifinfo
4388
4389 @item
4390 BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
4391 FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW.  EXCEPT WHEN
4392 OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
4393 PROVIDE THE PROGRAM ``AS IS'' WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
4394 OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
4395 MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE RISK AS
4396 TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.  SHOULD THE
4397 PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
4398 REPAIR OR CORRECTION.
4399
4400 @item
4401 IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL
4402 ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
4403 REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
4404 INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES
4405 ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT
4406 LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES
4407 SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE
4408 WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN
4409 ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
4410 @end enumerate
4411
4412 @iftex
4413 @heading END OF TERMS AND CONDITIONS
4414 @end iftex
4415 @ifinfo
4416 @center END OF TERMS AND CONDITIONS
4417 @end ifinfo
4418
4419 @page
4420 @unnumberedsec Appendix: How to Apply These Terms to Your New Programs
4421
4422   If you develop a new program, and you want it to be of the greatest
4423 possible use to humanity, the best way to achieve this is to make it
4424 free software which everyone can redistribute and change under these
4425 terms.
4426
4427   To do so, attach the following notices to the program.  It is safest to
4428 attach them to the start of each source file to most effectively convey
4429 the exclusion of warranty; and each file should have at least the
4430 ``copyright'' line and a pointer to where the full notice is found.
4431
4432 @smallexample
4433 @var{one line to give the program's name and a brief idea of what it does.}
4434 Copyright (C) 19@var{yy}  @var{name of author}
4435
4436 This program is free software; you can redistribute it and/or modify
4437 it under the terms of the GNU General Public License as published by
4438 the Free Software Foundation; either version 1, or (at your option)
4439 any later version.
4440
4441 This program is distributed in the hope that it will be useful,
4442 but WITHOUT ANY WARRANTY; without even the implied warranty of
4443 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
4444 GNU General Public License for more details.
4445
4446 You should have received a copy of the GNU General Public License
4447 along with this program; if not, write to the Free Software
4448 Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
4449 @end smallexample
4450
4451 Also add information on how to contact you by electronic and paper mail.
4452
4453 If the program is interactive, make it output a short notice like this
4454 when it starts in an interactive mode:
4455
4456 @smallexample
4457 Gnomovision version 69, Copyright (C) 19@var{yy} @var{name of author}
4458 Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
4459 This is free software, and you are welcome to redistribute it
4460 under certain conditions; type `show c' for details.
4461 @end smallexample
4462
4463 The hypothetical commands `show w' and `show c' should show the
4464 appropriate parts of the General Public License.  Of course, the
4465 commands you use may be called something other than `show w' and `show
4466 c'; they could even be mouse-clicks or menu items---whatever suits your
4467 program.
4468
4469 You should also get your employer (if you work as a programmer) or your
4470 school, if any, to sign a ``copyright disclaimer'' for the program, if
4471 necessary.  Here is a sample; alter the names:
4472
4473 @example
4474 Yoyodyne, Inc., hereby disclaims all copyright interest in the
4475 program `Gnomovision' (a program to direct compilers to make passes
4476 at assemblers) written by James Hacker.
4477
4478 @var{signature of Ty Coon}, 1 April 1989
4479 Ty Coon, President of Vice
4480 @end example
4481
4482 That's all there is to it!
4483
4484
4485 @summarycontents
4486 @contents
4487 @bye