8 This file documents the GNU Assembler "as".
10 Copyright (C) 1991 Free Software Foundation, Inc.
12 Permission is granted to make and distribute verbatim copies of
13 this manual provided the copyright notice and this permission notice
14 are preserved on all copies.
17 Permission is granted to process this file through TeX and print the
18 results, provided the printed document carries copying permission
19 notice identical to this one except for the removal of this paragraph
20 (this paragraph not being relevant to the printed manual).
23 Permission is granted to copy and distribute modified versions of this
24 manual under the conditions for verbatim copying, provided also that the
25 section entitled ``GNU General Public License'' is included exactly as
26 in the original, and provided that the entire resulting derived work is
27 distributed under the terms of a permission notice identical to this one.
30 Permission is granted to copy and distribute translations of this manual
31 into another language, under the above conditions for modified versions,
32 except that the section entitled ``GNU General Public License'' may be
33 included in a translation approved by the author instead of in the
40 @setchapternewpage odd
42 @c @settitle Using GNU as (680x0)
45 @settitle Using GNU as (AMD 29K)
49 @subtitle{The GNU Assembler}
51 @c @subtitle{for Motorola 680x0}
54 @subtitle{for the AMD 29K family}
57 @subtitle February 1991
59 The Free Software Foundation Inc. thanks The Nice Computer
60 Company of Australia for loaning Dean Elsner to write the
61 first (Vax) version of @code{as} for Project GNU.
62 The proprietors, management and staff of TNCCA thank FSF for
63 distracting the boss while they got some work
66 @author{Dean Elsner, Jay Fenlason & friends}
67 @author{revised by Roland Pesch for Cygnus Support}
71 \def\$#1${{#1}} % Kluge: collect RCS revision info without $...$
72 \xdef\manvers{\$Revision$} % For use in headers, footers too
74 \hfill Cygnus Support\par
76 \hfill \TeX{}info \texinfoversion\par
78 %"boxit" macro for figures:
79 %Modified from Knuth's ``boxit'' macro from TeXbook (answer to exercise 21.3)
80 \gdef\boxit#1#2{\vbox{\hrule\hbox{\vrule\kern3pt
81 \vbox{\parindent=0pt\parskip=0pt\hsize=#1\kern3pt\strut\hfil
82 #2\hfil\strut\kern3pt}\kern3pt\vrule}\hrule}}%box with visible outline
83 \gdef\ibox#1#2{\hbox to #1{#2\hfil}\kern8pt}% invisible box
86 @vskip 0pt plus 1filll
87 Copyright @copyright{} 1991 Free Software Foundation, Inc.
89 Permission is granted to make and distribute verbatim copies of
90 this manual provided the copyright notice and this permission notice
91 are preserved on all copies.
93 Permission is granted to copy and distribute modified versions of this
94 manual under the conditions for verbatim copying, provided also that the
95 section entitled ``GNU General Public License'' is included exactly as
96 in the original, and provided that the entire resulting derived work is
97 distributed under the terms of a permission notice identical to this one.
100 Permission is granted to copy and distribute translations of this manual
101 into another language, under the above conditions for modified versions,
102 except that the section entitled ``GNU General Public License'' may be
103 included in a translation approved by the author instead of in the
108 @node Top, Overview, (dir), (dir)
111 * Overview:: Overview
113 * Segments:: Segments and Relocation
115 * Expressions:: Expressions
116 * Pseudo Ops:: Assembler Directives
117 * Maintenance:: Maintaining the Assembler
118 * Retargeting:: Teaching the Assembler about a New Machine
119 * License:: GNU GENERAL PUBLIC LICENSE
121 --- The Detailed Node Listing ---
125 * Invoking:: Invoking @code{as}
126 * Manual:: Structure of this Manual
127 * GNU Assembler:: as, the GNU Assembler
128 * Command Line:: Command Line
129 * Input Files:: Input Files
130 * Object:: Output (Object) File
131 * Errors:: Error and Warning Messages
136 * Filenames:: Input Filenames and Line-numbers
140 * Pre-processing:: Pre-processing
141 * Whitespace:: Whitespace
142 * Comments:: Comments
143 * Symbol Intro:: Symbols
144 * Statements:: Statements
145 * Constants:: Constants
149 * Characters:: Character Constants
150 * Numbers:: Number Constants
157 Segments and Relocation
159 * Segs Background:: Background
160 * ld Segments:: ld Segments
161 * as Segments:: as Internal Segments
162 * Sub-Segments:: Sub-Segments
175 * Setting Symbols:: Giving Symbols Other Values
176 * Symbol Names:: Symbol Names
177 * Dot:: The Special Dot Symbol
178 * Symbol Attributes:: Symbol Attributes
182 * Local Symbols:: Local Symbol Names
186 * Symbol Value:: Value
188 * Symbol Desc:: Descriptor
189 * Symbol Other:: Other
193 * Empty Exprs:: Empty Expressions
194 * Integer Exprs:: Integer Expressions
198 * Arguments:: Arguments
199 * Operators:: Operators
200 * Prefix Ops:: Prefix Operators
201 * Infix Ops:: Infix Operators
205 * Abort:: The Abort directive causes as to abort
206 * Align:: Pad the location counter to a power of 2
207 * App-File:: Set the logical file name
208 * Ascii:: Fill memory with bytes of ASCII characters
209 * Asciz:: Fill memory with bytes of ASCII characters followed
211 * Byte:: Fill memory with 8-bit integers
212 * Comm:: Reserve public space in the BSS segment
213 * Data:: Change to the data segment
214 * Desc:: Set the n_desc of a symbol
215 * Double:: Fill memory with double-precision floating-point numbers
216 * Else:: @code{.else}
218 * Endif:: @code{.endif}
219 * Equ:: @code{.equ @var{symbol}, @var{expression}}
220 * Extern:: @code{.extern}
221 * Fill:: Fill memory with repeated values
222 * Float:: Fill memory with single-precision floating-point numbers
223 * Global:: Make a symbol visible to the linker
224 * Ident:: @code{.ident}
225 * If:: @code{.if @var{absolute expression}}
226 * Include:: @code{.include "@var{file}"}
227 * Int:: Fill memory with 32-bit integers
228 * Lcomm:: Reserve private space in the BSS segment
229 * Line:: Set the logical line number
230 * Ln:: @code{.ln @var{line-number}}
231 * List:: @code{.list}, @code{.nolist}, @code{.eject}, @code{.lflags}, @code{.title}, @code{.sbttl}
232 * Long:: Fill memory with 32-bit integers
233 * Lsym:: Create a local symbol
234 * Octa:: Fill memory with 128-bit integers
235 * Org:: Change the location counter
236 * Quad:: Fill memory with 64-bit integers
237 * Set:: Set the value of a symbol
238 * Short:: Fill memory with 16-bit integers
239 * Single:: @code{.single @var{flonums}}
240 * Stab:: Store debugging information
241 * Text:: Change to the text segment
243 * Word:: Fill memory with 32-bit integers
244 @c else (not am29k or sparc)
245 * Deprecated:: Deprecated Directives
246 * Machine Options:: Options
247 * Machine Syntax:: Syntax
248 * Floating Point:: Floating Point
249 * Machine Directives:: Machine Directives
254 * block:: @code{.block @var{size} , @var{fill}}
255 * cputype:: @code{.cputype}
256 * file:: @code{.file}
257 * hword:: @code{.hword @var{expressions}}
258 * line:: @code{.line}
259 * reg:: @code{.reg @var{symbol}, @var{expression}}
260 * sect:: @code{.sect}
261 * use:: @code{.use @var{segment name}}
264 @node Overview, Syntax, Top, Top
267 This manual is a user guide to the GNU assembler @code{as}.
269 @c The following should be conditional on machine config
271 @c This version of the manual describes @code{as} configured to generate
272 @c code for Motorola 680x0 architectures.
275 This version of the manual describes @code{as} configured to generate
276 code for Advanced Micro Devices' 29K architectures.
280 * Invoking:: Invoking @code{as}
281 * Manual:: Structure of this Manual
282 * GNU Assembler:: as, the GNU Assembler
283 * Command Line:: Command Line
284 * Input Files:: Input Files
285 * Object:: Output (Object) File
286 * Errors:: Error and Warning Messages
290 @node Invoking, Manual, Overview, Overview
291 @section Invoking @code{as}
293 Here is a brief summary of how to invoke GNU @code{as}. For details,
296 @c We don't use @deffn and friends for the following because they seem
297 @c to be limited to one line for the header.
299 as [ -D ] [ -f ] [ -I @var{path} ] [ -k ] [ -L ] [ -o @var{objfile} ] [ -R ] [ -v ] [ -w ]
301 @c [ -l ] [ -mc68000 | -mc68010 | -mc68020 ]
304 @c@c am29k has no machine-dependent assembler options
306 [ -- | @var{files} @dots{} ]
312 This option is accepted only for script compatibility with calls to
313 other assemblers; it has no effect on GNU @code{as}.
316 ``fast''---skip preprocessing (assume source is compiler output)
319 Add @var{path} to the search list for @code{.include} directives
323 This option is accepted but has no effect on the 29K family.
326 @c Issue warnings when difference tables altered for long displacements
330 Keep (in symbol table) local symbols, starting with @samp{L}
332 @item -o @var{objfile}
333 Name the object-file output from @code{as}
336 Fold data segment into text segment
339 Suppress warning messages
343 @c Shorten references to undefined symbols, to one word instead of two
345 @c @item -mc68000 | -mc68010 | -mc68020
346 @c Specify what processor in the 68000 family is the target (default 68020)
349 @item -- | @var{files} @dots{}
350 Source files to assemble, or standard input
353 @node Manual, GNU Assembler, Invoking, Overview
354 @section Structure of this Manual
355 This document is intended to describe what you need to know to use GNU
356 @code{as}. We cover the syntax expected in source files, including
357 notation for symbols, constants, and expressions; the directives that
358 @code{as} understands; and of course how to invoke @code{as}.
361 @c We also cover special features in the 68000 configuration of @code{as},
362 @c including pseudo-operations.
365 We also cover special features in the AMD 29K configuration of @code{as},
366 including assembler directives.
370 This document also describes some of the
371 machine-dependent features of various flavors of the assembler,
372 explains how the assembler works internally, and
373 provides some information that may be useful to people attempting to
374 port the assembler to another machine.
377 On the other hand, this manual is @emph{not} intended as an introduction
378 to programming in assembly language---let alone programming in general!
379 In a similar vein, we make no attempt to introduce the machine
380 architecture; we do @emph{not} describe the instruction set, standard
381 mnemonics, registers or addressing modes that are standard to a
382 particular architecture. You may want to consult the manufacturer's
383 machine architecture manual for this information.
386 @c I think this is premature---pesch@cygnus.com, 17jan1991
388 Throughout this document, we assume that you are running @dfn{GNU},
389 the portable operating system from the @dfn{Free Software
390 Foundation, Inc.}. This restricts our attention to certain kinds of
391 computer (in particular, the kinds of computers that GNU can run on);
392 once this assumption is granted examples and definitions need less
395 @code{as} is part of a team of programs that turn a high-level
396 human-readable series of instructions into a low-level
397 computer-readable series of instructions. Different versions of
398 @code{as} are used for different kinds of computer. In particular,
399 at the moment, @code{as} only works for the DEC Vax, the Motorola
400 680x0, the Intel 80386, the Sparc, and the National Semiconductor
404 @c There used to be a section "Terminology" here, which defined
405 @c "contents", "byte", "word", and "long". Defining "word" to any
406 @c particular size is confusing when the .word directive may generate 16
407 @c bits on one machine and 32 bits on another; in general, for the user
408 @c version of this manual, none of these terms seem essential to define.
409 @c They were used very little even in the former draft of the manual;
410 @c this draft makes an effort to avoid them (except in names of
413 @node GNU Assembler, Command Line, Manual, Overview
414 @section as, the GNU Assembler
415 @code{as} is primarily intended to assemble the output of the GNU C
416 compiler @code{gcc} for use by the linker @code{ld}. Nevertheless,
417 we've tried to make @code{as} assemble correctly everything that the native
421 Any exceptions are documented explicitly (@pxref{Machine Dependent}).
424 This doesn't mean @code{as} always uses the same syntax as another
425 assembler for the same architecture; for example, we know of several
426 incompatible versions of 680x0 assembly language syntax.
428 GNU @code{as} is really a family of assemblers. If you use (or have
429 used) GNU @code{as} on another architecture, you should find a fairly
430 similar environment. Each version has much in common with the others,
431 including object file formats, most assembler directives (often called
432 @dfn{pseudo-ops}) and assembler syntax.
434 Unlike older assemblers, @code{as} is designed to assemble a source
435 program in one pass of the source file. This has a subtle impact on the
436 @kbd{.org} directive (@pxref{Org}).
438 @node Command Line, Input Files, GNU Assembler, Overview
439 @section Command Line
441 After the program name @code{as}, the command line may contain
442 options and file names. Options may be in any order, and may be
443 before, after, or between file names. The order of file names is significant.
446 @file{--} (two hyphens) by itself names the standard input file
447 explicitly, as one of the files for @code{as} to assemble.
449 Except for @samp{--} any command line argument that begins with a
450 hyphen (@samp{-}) is an option. Each option changes the behavior of
451 @code{as}. No option changes the way another option works. An
452 option is a @samp{-} followed by one or more letters; the case of
453 the letter is important. All options are optional.
455 Some options expect exactly one file name to follow them. The file
456 name may either immediately follow the option's letter (compatible
457 with older assemblers) or it may be the next command argument (GNU
458 standard). These two command lines are equivalent:
461 as -o my-object-file.o mumble
462 as -omy-object-file.o mumble
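As a rough illustration of these rules, the following Python sketch (not part of @code{as}) separates options from file names. The function name @code{split_command_line} is hypothetical, and we assume for the sketch that @samp{-o} and @samp{-I} are the only options that take a file name.

```python
# Illustrative model only; this is not the argument parser used by as.
# Assumption: -o and -I are the only options that take a file name.
OPTIONS_WITH_FILE_NAME = ("o", "I")


def split_command_line(args):
    """Split the arguments after the program name into (options, files)."""
    options = []
    files = []
    i = 0
    while i < len(args):
        arg = args[i]
        if arg == "--":
            # Two hyphens by themselves name the standard input file.
            files.append(arg)
        elif arg.startswith("-") and len(arg) > 1:
            letter = arg[1]
            if letter in OPTIONS_WITH_FILE_NAME:
                if len(arg) > 2:
                    # Older style: the name immediately follows the letter.
                    options.append((letter, arg[2:]))
                elif i + 1 < len(args):
                    # GNU style: the name is the next command argument.
                    i += 1
                    options.append((letter, args[i]))
                else:
                    raise ValueError("option -" + letter + " needs a file name")
            else:
                # Assumption: several letters after one hyphen are separate flags.
                for flag in arg[1:]:
                    options.append((flag, None))
        else:
            # Anything with no special meaning is an input file name.
            files.append(arg)
        i += 1
    return options, files
```

Given either of the two command lines shown earlier, the sketch returns the same result: the option @samp{o} with the argument @file{my-object-file.o}, and the single input file @file{mumble}.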
465 @node Input Files, Object, Command Line, Overview
468 We use the phrase @dfn{source program}, abbreviated @dfn{source}, to
469 describe the program input to one run of @code{as}. The program may
470 be in one or more files; how the source is partitioned into files
471 doesn't change the meaning of the source.
473 @c I added "con" prefix to "catenation" just to prove I can overcome my
474 @c APL training... pesch@cygnus.com
475 The source program is a concatenation of the text in all the files, in the
478 Each time you run @code{as} it assembles exactly one source
479 program. The source program is made up of one or more files.
480 (The standard input is also a file.)
482 You give @code{as} a command line that has zero or more input file
483 names. The input files are read (from left file name to right). A
484 command line argument (in any position) that has no special meaning
485 is taken to be an input file name.
487 If @code{as} is given no file names it attempts to read one input file
488 from @code{as}'s standard input, which is normally your terminal. You
489 may have to type @key{ctl-D} to tell @code{as} there is no more program
492 Use @samp{--} if you need to explicitly name the standard input file
493 in your command line.
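The way the source program is built from its input files can be modelled as follows. This Python sketch is only an illustration, not code from @code{as}; the name @code{read_source} and the @code{open_file} parameter are hypothetical, and the latter exists only so that the example can run on in-memory text.

```python
import io


def read_source(file_names, stdin, open_file=open):
    """Return the source program: all input files joined, left to right."""
    if not file_names:
        # No file names given: read one input file from standard input.
        return stdin.read()
    parts = []
    for name in file_names:
        if name == "--":
            # "--" names the standard input file explicitly.
            parts.append(stdin.read())
        else:
            with open_file(name) as handle:
                parts.append(handle.read())
    return "".join(parts)


# Usage with in-memory stand-ins for two files and the terminal.
fake_files = dict(first="one\n", second="two\n")
source = read_source(
    ["first", "--", "second"],
    io.StringIO("typed\n"),
    open_file=lambda name: io.StringIO(fake_files[name]),
)
```

Here @code{source} holds the text of the first file, then the standard input, then the second file, in command-line order.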
495 If the source is empty, @code{as} will produce a small, empty object file.
499 * Filenames:: Input Filenames and Line-numbers
502 @node Filenames, , Input Files, Input Files
503 @subsection Input Filenames and Line-numbers
504 There are two ways of locating a line in the input file (or files) and both
505 are used in reporting error messages. One way refers to a line
506 number in a physical file; the other refers to a line number in a logical file.
509 @dfn{Physical files} are those files named in the command line given to @code{as}.
512 @dfn{Logical files} are simply names declared explicitly by assembler
513 directives; they bear no relation to physical files. Logical file names
514 help error messages reflect the original source file, when @code{as}
515 source is itself synthesized from other files. @xref{App-File}.
517 @node Object, Errors, Input Files, Overview
518 @section Output (Object) File
519 Every time you run @code{as} it produces an output file, which is
520 your assembly language program translated into numbers. This file
521 is the object file, named @code{a.out} unless you tell @code{as} to
522 give it another name by using the @code{-o} option. Conventionally,
523 object file names end with @file{.o}. The default name of
524 @file{a.out} is used for historical reasons: older assemblers were
525 capable of assembling self-contained programs directly into a
527 @c This may still work, but hasn't been tested.
529 The object file is meant for input to the linker @code{ld}. It contains
530 assembled program code, information to help @code{ld} integrate
531 the assembled program into a runnable file, and (optionally) symbolic
532 information for the debugger.
534 @comment link above to some info file(s) like the description of a.out.
535 @comment don't forget to describe GNU info as well as Unix lossage.
537 @node Errors, Options, Object, Overview
538 @section Error and Warning Messages
540 @code{as} may write warnings and error messages to the standard error
541 file (usually your terminal). This should not happen when @code{as} is
542 run automatically by a compiler. Warnings report an assumption made so
543 that @code{as} could keep assembling a flawed program; errors report a
544 grave problem that stops the assembly.
546 Warning messages have the format
548 file_name:@b{NNN}:Warning Message Text
550 @noindent(where @b{NNN} is a line number). If a logical file name has
551 been given (@pxref{App-File}) it is used for the filename, otherwise the
552 name of the current input file is used. If a logical line number was
560 then it is used to calculate the number printed,
561 otherwise the actual line in the current source file is printed. The
562 message text is intended to be self explanatory (in the grand Unix tradition).
565 Error messages have the format
567 file_name:@b{NNN}:FATAL:Error Message Text
569 The file name and line number are derived as for warning
570 messages. The actual message text may be rather less explanatory
571 because many of these errors aren't supposed to happen.
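The two message layouts can be summarized in a short Python sketch. It is an illustration only: the function name @code{format_diagnostic} and the sample message text are made up, and the sketch takes the line number as given rather than modelling logical line numbers.

```python
def format_diagnostic(text, line, input_file, logical_file=None, fatal=False):
    """Build one message in the warning or error layout shown above."""
    # A logical file name, if one has been given, replaces the input file name.
    file_name = input_file if logical_file is None else logical_file
    if fatal:
        return "%s:%d:FATAL:%s" % (file_name, line, text)
    return "%s:%d:%s" % (file_name, line, text)


# Hypothetical message text, for illustration only.
warning = format_diagnostic("sample warning text", 12, "foo.s")
error = format_diagnostic("sample error text", 12, "foo.s",
                          logical_file="foo.c", fatal=True)
```

With these arguments the warning names @file{foo.s}, while the error names the logical file @file{foo.c} and carries the @samp{FATAL} marker.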
574 @node Options, , Errors, Overview
576 @subsection @code{-D}
577 This option has no effect whatsoever, but it is accepted to make it more
578 likely that scripts written for other assemblers will also work with
582 @subsection Work Faster: @code{-f}
583 @samp{-f} should only be used when assembling programs written by a
584 (trusted) compiler. @samp{-f} stops the assembler from pre-processing
585 the input file(s) before assembling them.
587 @emph{Warning:} if the files actually need to be pre-processed (if they
588 contain comments, for example), @code{as} will not work correctly if
592 @subsection Add to @code{.include} search path: @code{-I} @var{path}
593 Use this option to add a @var{path} to the list of directories GNU
594 @code{as} will search for files specified in @code{.include} directives
595 (@pxref{Include}). You may use @code{-I} as many times as necessary to
596 include a variety of paths. The current working directory is always
597 searched first; after that, @code{as} searches any @samp{-I} directories
598 in the same order as they were specified (left to right) on the command line.
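The search order can be sketched in Python as follows. This is a model of the rule, not the code @code{as} uses; the name @code{find_include} and the @code{exists} parameter are hypothetical, and the latter is there only so the example can be tried without creating files.

```python
import os


def find_include(name, include_dirs, cwd=".", exists=os.path.isfile):
    """Return the first path at which name is found, or None."""
    # The current working directory is always searched first, then the
    # -I directories in the order they appeared on the command line.
    for directory in [cwd] + list(include_dirs):
        candidate = os.path.join(directory, name)
        if exists(candidate):
            return candidate
    return None
```

For instance, if @file{defs.h} exists only in the second of two @samp{-I} directories, that copy is the one returned; if a copy also exists in the current directory, the current directory wins.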
601 @subsection Warn if difference tables altered: @code{-k}
603 On the AMD 29K family, this option is allowed, but has no effect. It is
604 permitted for compatibility with GNU @code{as} on other platforms,
605 where it can be used to warn when @code{as} alters the machine code
606 generated for @samp{.word} directives in difference tables. The AMD 29K
607 family does not have the addressing limitations that sometimes lead to this
608 alteration on other platforms.
613 @code{as} sometimes alters the code emitted for directives of the form
614 @samp{.word @var{sym1}-@var{sym2}}; @pxref{Word}.
615 You can use the @samp{-k} option if you want a warning issued when this
620 @subsection Include Local Labels: @code{-L}
621 Labels beginning with @samp{L} (upper case only) are called @dfn{local
622 labels}. @xref{Symbol Names}. Normally you don't see such labels when
623 debugging, because they are intended for the use of programs (like
624 compilers) that compose assembler programs, not for your notice.
625 Both @code{as} and @code{ld} normally discard such labels, so you do not
626 usually debug with them.
628 This option tells @code{as} to retain those @samp{L@dots{}} symbols
629 in the object file. Usually if you do this you also tell the linker
630 @code{ld} to preserve symbols whose names begin with @samp{L}.
632 @subsection Name the Object File: @code{-o}
633 There is always one object file output when you run @code{as}. By
634 default it has the name @file{a.out}. You use this option (which
635 takes exactly one filename) to give the object file a different name.
637 Whatever the object file is called, @code{as} will overwrite any
638 existing file of the same name.
640 @subsection Fold Data Segment into Text Segment: @code{-R}
641 @code{-R} tells @code{as} to write the object file as if all
642 data-segment data lives in the text segment. This is only done at
643 the very last moment: your binary data are the same, but data
644 segment parts are relocated differently. The data segment part of
645 your object file is zero bytes long because all its bytes are
646 appended to the text segment. (@xref{Segments}.)
648 When you specify @code{-R} it would be possible to generate shorter
649 address displacements (because we don't have to cross between text and
650 data segment). We don't do this simply for compatibility with older
651 versions of @code{as}. In future, @code{-R} may work this way.
653 @subsection Suppress Warnings: @code{-W}
654 @code{as} should never give a warning or error message when
655 assembling compiler output. But programs written by people often
656 cause @code{as} to give a warning that a particular assumption was
657 made. All such warnings are directed to the standard error file.
658 If you use this option, no warnings are issued. This option only
659 affects the warning messages: it does not change any detail of how
660 @code{as} assembles your file. Errors, which stop the assembly, are still reported.
663 @node Syntax, Segments, Overview, Top
665 This chapter describes the machine-independent syntax allowed in a
666 source file. @code{as} syntax is similar to what many other assemblers
667 use; it is inspired by the BSD 4.2 assembler.
672 @c assembler, except that @code{as} does not
673 @c assemble Vax bit-fields.
677 * Pre-processing:: Pre-processing
678 * Whitespace:: Whitespace
679 * Comments:: Comments
680 * Symbol Intro:: Symbols
681 * Statements:: Statements
682 * Constants:: Constants
685 @node Pre-processing, Whitespace, Syntax, Syntax
686 @section Pre-processing
691 adjusts and removes extra whitespace. It leaves one space or tab before
692 the keywords on a line, and turns any other whitespace on the line into
696 removes all comments, replacing them with a single space, or an
697 appropriate number of newlines.
700 converts character constants into the appropriate numeric values.
703 Excess whitespace, comments, and character constants
704 cannot be used in the portions of the input text that are not pre-processed.
707 If the first line of an input file is @code{#NO_APP} or the @samp{-f}
708 option is given, the input file will not be pre-processed. Within such
709 an input file, parts of the file can be pre-processed by putting a line
710 that says @code{#APP} before the text that should be pre-processed, and
711 putting a line that says @code{#NO_APP} after them. This feature is
712 mainly intended to support @code{asm} statements in compilers whose output
713 normally does not need to be pre-processed.
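The effect of the two markers can be modelled with a small Python sketch. It is not taken from @code{as}; the name @code{preprocessed_lines} is hypothetical, and we assume that a marker must be alone on its line and is not itself passed on.

```python
def preprocessed_lines(lines, fast=False):
    """Return the lines that would be handed to the pre-processor."""
    starts_raw = fast or (len(lines) > 0 and lines[0].strip() == "#NO_APP")
    if not starts_raw:
        # An ordinary input file is pre-processed in full.
        return list(lines)
    active = False
    selected = []
    for line in lines:
        marker = line.strip()
        if marker == "#APP":
            active = True       # pre-process what follows
        elif marker == "#NO_APP":
            active = False      # back to unprocessed compiler output
        elif active:
            selected.append(line)
    return selected
```

In a file that begins with @code{#NO_APP}, only the text between a later @code{#APP} and the next @code{#NO_APP} is selected.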
715 @node Whitespace, Comments, Pre-processing, Syntax
717 @dfn{Whitespace} is one or more blanks or tabs, in any order.
718 Whitespace is used to separate symbols, and to make programs neater
719 for people to read. Unless within character constants
720 (@pxref{Characters}), any whitespace means the same as exactly one space.
723 @node Comments, Symbol Intro, Whitespace, Syntax
725 There are two ways of rendering comments to @code{as}. In both
726 cases the comment is equivalent to one space.
728 Anything from @samp{/*} through the next @samp{*/} is a comment.
729 This means you may not nest these comments.
733 The only way to include a newline ('\n') in a comment
734 is to use this sort of comment.
737 /* This sort of comment does not nest. */
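To show why these comments do not nest, here is a small Python sketch of their removal. It is an illustration, not the pre-processor itself: the name @code{strip_block_comments} is hypothetical, we assume a comment spanning several lines is replaced by the same number of newlines, and comment markers inside string constants are not treated specially.

```python
import re


def strip_block_comments(text):
    """Replace each /* ... */ comment by one space, or by its newlines."""
    def replacement(match):
        newlines = match.group(0).count("\n")
        # Assumption: keeping the newlines keeps later line numbers unchanged.
        return "\n" * newlines if newlines else " "

    # The non-greedy match stops at the next "*/", so comments cannot nest.
    return re.sub(r"/\*.*?\*/", replacement, text, flags=re.DOTALL)
```

Applied to @samp{/* a /* b */ c */}, the sketch removes everything up to the first @samp{*/} and leaves @samp{c */} behind as ordinary text.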
740 Anything from the @dfn{line comment} character to the next newline
741 is considered a comment and is ignored. The line comment character is
743 @c @samp{#} on the Vax. @xref{Machine Dependent}. @refill
746 @c @samp{|} on the 680x0. @xref{Machine Dependent}. @refill
749 @samp{;} for the AMD 29K family. @xref{Machine Dependent}. @refill
753 On some machines there are two different line comment characters. One
754 will only begin a comment if it is the first non-whitespace character on
755 a line, while the other will always begin a comment.
759 To be compatible with past assemblers a special interpretation is
760 given to lines that begin with @samp{#}. Following the @samp{#} an
761 absolute expression (@pxref{Expressions}) is expected: this will be
762 the logical line number of the @b{next} line. Then a string
763 (@pxref{Strings}) is allowed: if present it is a new logical file
764 name. The rest of the line, if any, should be whitespace.
766 If the first non-whitespace characters on the line are not numeric,
767 the line is ignored. (Just like a comment.)
769 # This is an ordinary comment.
770 # 42-6 "new_file_name" # New logical file name
771 # This is logical line # 36.
773 This feature is deprecated, and may disappear from future versions of @code{as}.
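The interpretation of lines that begin with @samp{#} can be sketched in Python. This is a simplified model, not the parser in @code{as}: the name @code{parse_hash_line} is hypothetical, and the sketch accepts only decimal integers joined by @samp{+} and @samp{-}, where @code{as} accepts any absolute expression.

```python
import re

# "#", an expression of decimal integers joined by + or -, then an
# optional string in double quotes.
HASH_LINE = re.compile(r'#\s*(\d+(?:\s*[-+]\s*\d+)*)\s*(?:"([^"]*)")?')


def parse_hash_line(line):
    """Return (logical line number of the next line, file name or None).

    Return None when the line is an ordinary comment.
    """
    match = HASH_LINE.match(line)
    if match is None:
        return None
    total = 0
    for term in re.findall(r"[-+]?\s*\d+", match.group(1)):
        total += int(term.replace(" ", "").replace("\t", ""))
    return total, match.group(2)
```

For the three example lines above, the sketch treats the first and third as ordinary comments and reads the second as logical line 36 with the logical file name @file{new_file_name}.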
776 @node Symbol Intro, Statements, Comments, Syntax
778 A @dfn{symbol} is one or more characters chosen from the set of all
779 letters (both upper and lower case), digits and the three characters
780 @samp{_.$}. No symbol may begin with a digit. Case is significant.
781 There is no length limit: all characters are significant. Symbols are
782 delimited by characters not in that set, or by the beginning of a file
783 (since the source program must end with a newline, the end of a file is
784 not a possible symbol delimiter). @xref{Symbols}.
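The definition translates directly into a pattern. The following Python sketch is an illustration only; the name @code{is_symbol} is hypothetical, and we assume ``letters'' means the ASCII letters.

```python
import re

# First character: a letter or one of _ . $ (never a digit).
# Remaining characters: letters, digits, or _ . $, with no length limit.
SYMBOL = re.compile(r"[A-Za-z_.$][A-Za-z0-9_.$]*\Z")


def is_symbol(text):
    """Return True if text is a well-formed symbol under the rule above."""
    return SYMBOL.match(text) is not None
```

Because case is significant, @code{Loop} and @code{loop} both pass the test but are different symbols.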
786 @node Statements, Constants, Symbol Intro, Syntax
788 A @dfn{statement} ends at a newline character (@samp{\n})
789 @c @if m680x0 (or is this if !am29k?)
790 @c or at a semicolon (@samp{;}). The newline or semicolon
791 @c fi m680x0 (or !am29k)
793 or an ``at'' sign (@samp{@@}). The newline or at sign is considered part
796 of the preceding statement. Newlines and at signs within
797 @c if m680x0 (or !am29k)
799 @c fi m680x0 (or !am29k)
804 character constants are an exception: they don't end statements.
805 It is an error to end any statement with end-of-file: the last
806 character of any input file should be a newline.@refill
808 You may write a statement on more than one line if you put a
809 backslash (@kbd{\}) immediately in front of any newlines within the
810 statement. When @code{as} reads a backslashed newline both
811 characters are ignored. You can even put backslashed newlines in
812 the middle of symbol names without changing the meaning of your
815 An empty statement is allowed, and may include whitespace. It is ignored.
817 @c "key symbol" is not used elsewhere in the document; seems pedantic to
818 @c @defn{} it in that case, as was done previously... pesch@cygnus.com,
820 A statement begins with zero or more labels, optionally followed by a
821 key symbol which determines what kind of statement it is. The key
822 symbol determines the syntax of the rest of the statement. If the
823 symbol begins with a dot @samp{.} then the statement is an assembler
824 directive: typically valid for any computer. If the symbol begins with
825 a letter the statement is an assembly language @dfn{instruction}: it
826 will assemble into a machine language instruction. Different versions
827 of @code{as} for different computers will recognize different
828 instructions. In fact, the same symbol may represent a different
829 instruction in a different computer's assembly language.
831 A label is a symbol immediately followed by a colon (@code{:}).
832 Whitespace before a label or after a colon is permitted, but you may not
833 have whitespace between a label's symbol and its colon. @xref{Labels}.
836 label: .directive followed by something
837 another$label: # This is an empty statement.
838 instruction operand_1, operand_2, @dots{}
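The shape of a statement can be modelled in Python as follows. This is a sketch for illustration, not the statement parser of @code{as}; the name @code{classify_statement} and the result words @samp{directive}, @samp{instruction}, @samp{empty} and @samp{other} are hypothetical.

```python
import re

# A label is a symbol immediately followed by a colon.
LABEL = re.compile(r"\s*([A-Za-z_.$][A-Za-z0-9_.$]*):")
KEY = re.compile(r"\s*([A-Za-z_.$][A-Za-z0-9_.$]*)")


def classify_statement(statement):
    """Return (labels, kind, key_symbol) for one statement."""
    labels = []
    position = 0
    while True:
        match = LABEL.match(statement, position)
        if match is None:
            break
        labels.append(match.group(1))
        position = match.end()
    match = KEY.match(statement, position)
    if match is None:
        rest = statement[position:].strip()
        return labels, ("empty" if rest == "" else "other"), None
    key = match.group(1)
    if key.startswith("."):
        kind = "directive"      # key symbol begins with a dot
    elif key[0].isalpha():
        kind = "instruction"    # key symbol begins with a letter
    else:
        kind = "other"
    return labels, kind, key
```

On the three example lines above (without the trailing comment), the sketch reports one label and a directive, one label and an empty statement, and no label and an instruction.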
841 @node Constants, , Statements, Syntax
843 A constant is a number, written so that its value is known by
844 inspection, without knowing any context. Like this:
846 .byte 74, 0112, 092, 0x4A, 0X4a, 'J, '\J # All the same value.
847 .ascii "Ring the bell\7" # A string constant.
848 .octa 0x123456789abcdef0123456789ABCDEF0 # A bignum.
849 .float 0f-314159265358979323846264338327\
850 95028841971.693993751E-40 # - pi, a flonum.
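The claim that the @code{.byte} operands all have the same value can be checked with a short Python sketch. It is an illustration only: the name @code{constant_value} is hypothetical, the sketch handles only the unescaped forms (it leaves out @samp{'\J}), and treating @samp{092} as octal with the digit 9 tolerated is our reading of the example, not something stated in this section.

```python
def constant_value(text):
    """Return the value of one simple integer or character constant."""
    if text.startswith("'"):
        # A single quote followed by a character: its ASCII code.
        return ord(text[1])
    if text[:2] in ("0x", "0X"):
        return int(text[2:], 16)
    if text.startswith("0"):
        # Assumption: leading 0 means octal, with 8 and 9 tolerated as digits.
        value = 0
        for digit in text:
            value = value * 8 + int(digit)
        return value
    return int(text)


values = [constant_value(t) for t in ("74", "0112", "092", "0x4A", "0X4a", "'J")]
```

Every entry of @code{values} is 74, the ASCII code of @samp{J}.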
854 * Characters:: Character Constants
855 * Numbers:: Number Constants
858 @node Characters, Numbers, Constants, Constants
859 @subsection Character Constants
860 There are two kinds of character constants. A @dfn{character} stands
861 for one character in one byte and its value may be used in
862 numeric expressions. String constants (properly called string
863 @emph{literals}) are potentially many bytes and their values may not be
864 used in arithmetic expressions.
871 @node Strings, Chars, Characters, Characters
872 @subsubsection Strings
873 A @dfn{string} is written between double-quotes. It may contain
874 double-quotes or null characters. The way to get special characters
875 into a string is to @dfn{escape} these characters: precede them with
876 a backslash @samp{\} character. For example @samp{\\} represents
877 one backslash: the first @code{\} is an escape which tells
878 @code{as} to interpret the second character literally as a backslash
879 (which prevents @code{as} from recognizing the second @code{\} as an
880 escape character). The complete list of escapes follows.
@table @kbd
@c Mnemonic for ACKnowledge; for ASCII this is octal code 007.
@item \b
Mnemonic for backspace; for ASCII this is octal code 010.
@c Mnemonic for EOText; for ASCII this is octal code 004.
@item \f
Mnemonic for FormFeed; for ASCII this is octal code 014.
@item \n
Mnemonic for newline; for ASCII this is octal code 012.
@c Mnemonic for prefix; for ASCII this is octal code 033, usually known as @code{escape}.
@item \r
Mnemonic for carriage-Return; for ASCII this is octal code 015.
@c Mnemonic for space; for ASCII this is octal code 040. Included for compliance with
@item \t
Mnemonic for horizontal Tab; for ASCII this is octal code 011.
@c Mnemonic for Vertical tab; for ASCII this is octal code 013.
@c @item \x @var{digit} @var{digit} @var{digit}
@c A hexadecimal character code. The numeric code is 3 hexadecimal digits.
@item \ @var{digit} @var{digit} @var{digit}
An octal character code. The numeric code is 3 octal digits.
For compatibility with other Unix systems, 8 and 9 are accepted as digits:
for example, @code{\008} has the value 010, and @code{\009} the value 011.
@item \\
Represents one @samp{\} character.
@c Represents one @samp{'} (accent acute) character.
@c This is needed in single character literals
@c (@xref{Characters}.) to represent
@item \"
Represents one @samp{"} character. Needed in strings to represent
this character, because an unescaped @samp{"} would end the string.
@item \ @var{anything-else}
Any other character when escaped by @kbd{\} will give a warning, but
assemble as if the @samp{\} was not present. The idea is that if
you used an escape sequence you clearly didn't want the literal
interpretation of the following character. However @code{as} has no
other interpretation, so @code{as} knows it is giving you the wrong
code and warns you of the fact.
@end table
929 Which characters are escapable, and what those escapes represent,
930 varies widely among assemblers. The current set is what we think
931 BSD 4.2 @code{as} recognizes, and is a subset of what most C
compilers recognize. If you are in doubt, don't use an escape
sequence.
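
As a brief sketch of these escapes in use (the strings themselves are
made up for illustration), each statement below assembles the bytes
described in its comment:

@example
.ascii "one\ttwo\n"    /* a tab between the words, a newline at the end */
.ascii "say \"hi\""    /* two literal double-quote characters */
.ascii "a\\b"          /* a single backslash between a and b */
.ascii "\101\102"      /* octal codes 101 and 102: the letters A and B */
@end example
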
935 @node Chars, , Strings, Characters
936 @subsubsection Characters
937 A single character may be written as a single quote immediately
938 followed by that character. The same escapes apply to characters as
939 to strings. So if you want to write the character backslash, you
940 must write @kbd{'\\} where the first @code{\} escapes the second
941 @code{\}. As you can see, the quote is an acute accent, not a
942 grave accent. A newline
943 @c if 680x0 (or !am29k)
944 @c (or semicolon @samp{;})
945 @c fi 680x0 (or !am29k)
947 (or at sign @samp{@@})
950 following an acute accent is taken as a literal character and does
951 not count as the end of a statement. The value of a character
952 constant in a numeric expression is the machine's byte-wide code for
953 that character. @code{as} assumes your character code is ASCII: @kbd{'A}
954 means 65, @kbd{'B} means 66, and so on. @refill
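
For instance, assuming ASCII as described above, these statements (an
illustrative sketch, not taken from any real program) each assemble one
byte:

@example
.byte 'A     /* 65 */
.byte 'B     /* 66 */
.byte '\\    /* the backslash character, octal 134 */
@end example
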
956 @node Numbers, , Characters, Constants
957 @subsection Number Constants
958 @code{as} distinguishes three kinds of numbers according to how they
959 are stored in the target machine. @emph{Integers} are numbers that
960 would fit into an @code{int} in the C language. @emph{Bignums} are
integers, but they are stored in more than 32 bits. @emph{Flonums}
962 are floating point numbers, described below.
964 @subsubsection Integers
965 A binary integer is @samp{0b} or @samp{0B} followed by zero or more of
966 the binary digits @samp{01}.
968 An octal integer is @samp{0} followed by zero or more of the octal
969 digits (@samp{01234567}).
971 A decimal integer starts with a non-zero digit followed by zero or
972 more digits (@samp{0123456789}).
974 A hexadecimal integer is @samp{0x} or @samp{0X} followed by one or
975 more hexadecimal digits chosen from @samp{0123456789abcdefABCDEF}.
977 Integers have the usual values. To denote a negative integer, use
978 the prefix operator @samp{-} discussed under expressions
979 (@pxref{Prefix Ops}).
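
As an illustration (the value 255 is an arbitrary choice), the first four
statements below assemble the same 32-bit value, each using a different
radix; the last applies the prefix operator @samp{-}:

@example
.long 0b11111111    /* binary */
.long 0377          /* octal */
.long 255           /* decimal */
.long 0xff          /* hexadecimal */
.long -255          /* negative, written with the prefix operator */
@end example
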
981 @subsubsection Bignums
982 A @dfn{bignum} has the same syntax and semantics as an integer
983 except that the number (or its negative) takes more than 32 bits to
984 represent in binary. The distinction is made because in some places
985 integers are permitted while bignums are not.
987 @subsubsection Flonums
988 A @dfn{flonum} represents a floating point number. The translation is
989 complex: a decimal floating point number from the text is converted by
990 @code{as} to a generic binary floating point number of more than
991 sufficient precision. This generic floating point number is converted
992 to a particular computer's floating point format (or formats) by a
993 portion of @code{as} specialized to that computer.
995 A flonum is written by writing (in order)
1001 One of the letters @samp{DFPRSX} (in upper or lower case), to tell
1002 @code{as} the rest of the number is a flonum.
1006 A letter, to tell @code{as} the rest of the number is a flonum. @kbd{e}
1007 is recommended. Case is not important. (Any otherwise illegal letter
1008 will work here, but that might be changed. Vax BSD 4.2 assembler seems
1009 to allow any of @samp{defghDEFGH}.)
1013 An optional sign: either @samp{+} or @samp{-}.
1015 An optional @dfn{integer part}: zero or more decimal digits.
1017 An optional @dfn{fraction part}: @samp{.} followed by zero
1018 or more decimal digits.
1020 An optional exponent, consisting of:
1024 An @samp{E} or @samp{e}.
1027 A letter; the exact significance varies according to
1028 the computer that executes the program. @code{as}
1029 accepts any letter for now. Case is not important.
1033 Optional sign: either @samp{+} or @samp{-}.
1035 One or more decimal digits.
1039 At least one of @var{integer part} or @var{fraction part} must be
1040 present. The floating point number has the usual base-10 value.
1042 @code{as} does all processing using integers. Flonums are computed
1043 independently of any floating point hardware in the computer running
1046 @node Segments, Symbols, Syntax, Top
1047 @chapter Segments and Relocation
1049 * Segs Background:: Background
1050 * ld Segments:: ld Segments
1051 * as Segments:: as Internal Segments
1052 * Sub-Segments:: Sub-Segments
1056 @node Segs Background, ld Segments, Segments, Segments
1058 Roughly, a segment is a range of addresses, with no gaps; all data
1059 ``in'' those addresses is treated the same for some particular purpose.
1060 For example there may be a ``read only'' segment.
1062 The linker @code{ld} reads many object files (partial programs) and
1063 combines their contents to form a runnable program. When @code{as}
1064 emits an object file, the partial program is assumed to start at address
1065 0. @code{ld} will assign the final addresses the partial program
1066 occupies, so that different partial programs don't overlap. This is
1067 actually an over-simplification, but it will suffice to explain how
1068 @code{as} uses segments.
1070 @code{ld} moves blocks of bytes of your program to their run-time
1071 addresses. These blocks slide to their run-time addresses as rigid
1072 units; their length does not change and neither does the order of bytes
1073 within them. Such a rigid unit is called a @emph{segment}. Assigning
1074 run-time addresses to segments is called @dfn{relocation}. It includes
1075 the task of adjusting mentions of object-file addresses so they refer to
1076 the proper run-time addresses.
1078 An object file written by @code{as} has three segments, any of which may
1079 be empty. These are named @dfn{text}, @dfn{data} and @dfn{bss}
1080 segments. Within the object file, the text segment starts at
1081 address @code{0}, the data segment follows, and the bss segment
1082 follows the data segment.
1084 To let @code{ld} know which data will change when the segments are
1085 relocated, and how to change that data, @code{as} also writes to the
1086 object file details of the relocation needed. To perform relocation
@code{ld} must know, each time an address in the object
file is mentioned:
@itemize @bullet
@item
Where in the object file is the beginning of this reference to
an address?
@item
How long (in bytes) is this reference?
@item
Which segment does the address refer to? What is the numeric value of
@display
(@var{address}) @minus{} (@var{start-address of segment})?
@end display
@item
Is the reference to an address ``Program-Counter relative''?
@end itemize
1104 In fact, every address @code{as} ever uses is expressed as
1105 @code{(@var{segment}) + (@var{offset into segment})}. Further, every
1106 expression @code{as} computes is of this segmented nature.
1107 @dfn{Absolute expression} means an expression with segment ``absolute''
1108 (@pxref{ld Segments}). A @dfn{pass1 expression} means an expression
1109 with segment ``pass1'' (@pxref{as Segments}). In this manual we use the
notation @{@var{segname} @var{N}@} to mean ``offset @var{N} into segment
@var{segname}''.
1113 Apart from text, data and bss segments you need to know about the
1114 @dfn{absolute} segment. When @code{ld} mixes partial programs,
1115 addresses in the absolute segment remain unchanged. That is, address
1116 @code{@{absolute 0@}} is ``relocated'' to run-time address 0 by @code{ld}.
1117 Although two partial programs' data segments will not overlap addresses
1118 after linking, @emph{by definition} their absolute segments will overlap.
1119 Address @code{@{absolute@ 239@}} in one partial program will always be the same
1120 address when the program is running as address @code{@{absolute@ 239@}} in any
1121 other partial program.
1123 The idea of segments is extended to the @dfn{undefined} segment. Any
1124 address whose segment is unknown at assembly time is by definition
1125 rendered @{undefined @var{U}@}---where @var{U} will be filled in later.
1126 Since numbers are always defined, the only way to generate an undefined
1127 address is to mention an undefined symbol. A reference to a named
1128 common block would be such a symbol: its value is unknown at assembly
1129 time so it has segment @emph{undefined}.
1131 By analogy the word @emph{segment} is used to describe groups of segments in
1132 the linked program. @code{ld} puts all partial programs' text
1133 segments in contiguous addresses in the linked program. It is
1134 customary to refer to the @emph{text segment} of a program, meaning all
the addresses of all partial programs' text segments. Likewise for
1136 data and bss segments.
1138 Some segments are manipulated by @code{ld}; others are invented for
1139 use of @code{as} and have no meaning except during assembly.
1142 * ld Segments:: ld Segments
1143 * as Segments:: as Internal Segments
1144 * Sub-Segments:: Sub-Segments
1148 @node ld Segments, as Segments, Segs Background, Segments
1149 @section ld Segments
1150 @code{ld} deals with just five kinds of segments, summarized below.
@table @strong

@item text segment
@itemx data segment
These segments hold your program. @code{as} and @code{ld} treat them as
separate but equal segments. Anything you can say of one segment is
true of the other. When the program is running, however, it is
customary for the text segment to be unalterable. The
text segment is often shared among processes: it will contain
instructions, constants and the like. The data segment of a running
program is usually alterable: for example, C variables would be stored
in the data segment.

@item bss segment
This segment contains zeroed bytes when your program begins running. It
is used to hold uninitialized variables or common storage. The length of
each partial program's bss segment is important, but because it starts
out containing zeroed bytes there is no need to store explicit zero
bytes in the object file. The bss segment was invented to eliminate
those explicit zeros from object files.

@item absolute segment
Address 0 of this segment is always ``relocated'' to runtime address 0.
This is useful if you want to refer to an address that @code{ld} must
not change when relocating. In this sense we speak of absolute
addresses being ``unrelocatable'': they don't change during relocation.

@item @code{undefined} segment
This ``segment'' is a catch-all for address references to objects not in
the preceding segments.
@c FIXME: ref to some other doc on obj-file formats could go here.
@end table
1186 An idealized example of the 3 relocatable segments follows. Memory
1187 addresses are on the horizontal axis.
@ifinfo
@example

                      +-----+----+--+
partial program # 1:  |ttttt|dddd|00|
                      +-----+----+--+

                      text   data bss

                      +---+---+---+
partial program # 2:  |TTT|DDD|000|
                      +---+---+---+

                  +--+---+-----+--+----+---+-----+~~
linked program:   |  |TTT|ttttt|  |dddd|DDD|00000|
                  +--+---+-----+--+----+---+-----+~~

    addresses:    0 @dots{}
@end example
@end ifinfo
1210 \halign{\hfil\rm #\quad&#\cr
1212 &\ibox{2.5cm}{\tt text}\ibox{2cm}{\tt data}\ibox{1cm}{\tt bss}\cr
1213 Partial program \#1:
1214 &\boxit{2.5cm}{\tt ttttt}\boxit{2cm}{\tt dddd}\boxit{1cm}{\tt 00}\cr
1216 &\ibox{1cm}{\tt text}\ibox{1.5cm}{\tt data}\ibox{1cm}{\tt bss}\cr
1217 Partial program \#2:
1218 &\boxit{1cm}{\tt TTT}\boxit{1.5cm}{\tt DDDD}\boxit{1cm}{\tt 000}\cr
1220 &\ibox{.5cm}{}\ibox{1cm}{\tt text}\ibox{2.5cm}{}\ibox{.75cm}{}\ibox{2cm}{\tt data}\ibox{1.5cm}{}\ibox{2cm}{\tt bss}\cr
1222 &\boxit{.5cm}{}\boxit{1cm}{\tt TTT}\boxit{2.5cm}{\tt
1223 ttttt}\boxit{.75cm}{}\boxit{2cm}{\tt dddd}\boxit{1.5cm}{\tt
1224 DDDD}\boxit{2cm}{00000}\ \dots\cr
1230 @node as Segments, Sub-Segments, ld Segments, Segments
1231 @section as Internal Segments
1232 These segments are invented for the internal use of @code{as}. They
1233 have no meaning at run-time. You don't need to know about these
1234 segments except that they might be mentioned in @code{as}' warning
1235 messages. These segments are invented to permit the value of every
expression in your assembly language program to be a segmented
address.
1240 @item absent segment
An expression was expected and none was
found.
1245 An internal assembler logic error has been
1246 found. This means there is a bug in the assembler.
1249 A @dfn{grand number} is a bignum or a flonum, but not an integer. If a
1250 number can't be written as a C @code{int} constant, it is a grand
1251 number. @code{as} has to remember that a flonum or a bignum does not
1252 fit into 32 bits, and cannot be an argument (@pxref{Arguments}) in an
1253 expression: this is done by making a flonum or bignum be in segment
1254 grand. This is purely for internal @code{as} convenience; grand
1255 segment behaves similarly to absolute segment.
1258 The expression was impossible to evaluate in the first pass. The
1259 assembler will attempt a second pass (second reading of the source) to
1260 evaluate the expression. Your expression mentioned an undefined symbol
1261 in a way that defies the one-pass (segment + offset in segment) assembly
1262 process. No compiler need emit such an expression.
1265 @emph{Warning:} the second pass is currently not implemented. @code{as}
1266 will abort with an error message if one is required.
1269 @item difference segment
1270 As an assist to the C compiler, expressions of the forms
(@var{undefined symbol}) @minus{} (@var{expression})
(@var{something}) @minus{} (@var{undefined symbol})
1274 (@var{undefined symbol}) @minus{} (@var{undefined symbol})
1276 are permitted, and belong to the difference segment. @code{as}
1277 re-evaluates such expressions after the source file has been read and
1278 the symbol table built. If by that time there are no undefined symbols
1279 in the expression then the expression assumes a new segment. The
1280 intention is to permit statements like
1281 @samp{.word label - base_of_table}
1282 to be assembled in one pass where both @code{label} and
1283 @code{base_of_table} are undefined. This is useful for compiling C and
1284 Algol switch statements, Pascal case statements, FORTRAN computed goto
1285 statements and the like.
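
A sketch of such a table follows. The label names are hypothetical;
the point is only that @code{case0} and @code{case1} are still undefined
when the @code{.word} statements are read, so both expressions start out
in the difference segment:

@example
base_of_table:
        .word case0 - base_of_table
        .word case1 - base_of_table
case0:  .long 0
case1:  .long 1
@end example
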
1288 @node Sub-Segments, bss, as Segments, Segments
1289 @section Sub-Segments
1290 Assembled bytes fall into two segments: text and data.
1291 Because you may have groups of text or data that you want to end up near
1292 to each other in the object file, @code{as} allows you to use
1293 @dfn{subsegments}. Within each segment, there can be numbered
1294 subsegments with values from 0 to 8192. Objects assembled into the same
1295 subsegment will be grouped with other objects in the same subsegment
1296 when they are all put into the object file. For example, a compiler
1297 might want to store constants in the text segment, but might not want to
1298 have them interspersed with the program being assembled. In this case,
the compiler could issue a @code{.text 0} before each section of code
being output, and a @code{.text 1} before each group of constants being
output.
1303 Subsegments are optional. If you don't use subsegments, everything
1304 will be stored in subsegment number zero.
1307 @c Each subsegment is zero-padded up to a multiple of four bytes.
1308 @c (Subsegments may be padded a different amount on different flavors
1312 On the AMD 29K family, no particular padding is added to segment sizes;
1313 GNU as forces no alignment on this platform.
1315 Subsegments appear in your object file in numeric order, lowest numbered
1316 to highest. (All this to be compatible with other people's assemblers.)
1317 The object file contains no representation of subsegments; @code{ld} and
1318 other programs that manipulate object files will see no trace of them.
1319 They just see all your text subsegments as a text segment, and all your
1320 data subsegments as a data segment.
1322 To specify which subsegment you want subsequent statements assembled
1323 into, use a @samp{.text @var{expression}} or a @samp{.data
1324 @var{expression}} statement. @var{Expression} should be an absolute
1325 expression. (@xref{Expressions}.) If you just say @samp{.text}
1326 then @samp{.text 0} is assumed. Likewise @samp{.data} means
1327 @samp{.data 0}. Assembly begins in @code{text 0}.
@example
.text 0     # The default subsegment is text 0 anyway.
.ascii "This lives in the first text subsegment. *"
.text 1
.ascii "But this lives in the second text subsegment."
.data 0
.ascii "This lives in the data segment,"
.ascii "in the first data subsegment."
.text 0
.ascii "This lives in the first text subsegment,"
.ascii "immediately following the asterisk (*)."
@end example
1342 Each segment has a @dfn{location counter} incremented by one for every
1343 byte assembled into that segment. Because subsegments are merely a
1344 convenience restricted to @code{as} there is no concept of a subsegment
1345 location counter. There is no way to directly manipulate a location
1346 counter---but the @code{.align} directive will change it, and any label
1347 definition will capture its current value. The location counter of the
1348 segment that statements are being assembled into is said to be the
1349 @dfn{active} location counter.
1351 @node bss, , Sub-Segments, Segments
1352 @section bss Segment
1353 The bss segment is used for local common variable storage.
1354 You may allocate address space in the bss segment, but you may
1355 not dictate data to load into it before your program executes. When
1356 your program starts running, all the contents of the bss
1357 segment are zeroed bytes.
1359 Addresses in the bss segment are allocated with special directives;
1360 you may not assemble anything directly into the bss segment. Hence
1361 there are no bss subsegments. @xref{Comm}; @pxref{Lcomm}.
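
For example (a sketch; the symbol names and sizes are invented):

@example
.lcomm scratch, 512    /* 512 bytes of private bss storage */
.comm  shared, 64      /* 64 bytes of common storage, visible to ld */
@end example
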
1363 @node Symbols, Expressions, Segments, Top
1365 Symbols are a central concept: the programmer uses symbols to name
things, the linker uses symbols to link, and the debugger uses symbols
to debug.
1370 @emph{Warning:} @code{as} does not place symbols in the object file in
1371 the same order they were declared. This may break some debuggers.
1376 * Setting Symbols:: Giving Symbols Other Values
1377 * Symbol Names:: Symbol Names
1378 * Dot:: The Special Dot Symbol
1379 * Symbol Attributes:: Symbol Attributes
1382 @node Labels, Setting Symbols, Symbols, Symbols
1384 A @dfn{label} is written as a symbol immediately followed by a colon
1385 @samp{:}. The symbol then represents the current value of the
1386 active location counter, and is, for example, a suitable instruction
1387 operand. You are warned if you use the same symbol to represent two
different locations: the first definition overrides any other
definitions.
1391 @node Setting Symbols, Symbol Names, Labels, Symbols
1392 @section Giving Symbols Other Values
1393 A symbol can be given an arbitrary value by writing a symbol, followed
1394 by an equals sign @samp{=}, followed by an expression
1395 (@pxref{Expressions}). This is equivalent to using the @code{.set}
1396 directive. @xref{Set}.
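
For instance, the two statements below are equivalent ways of giving a
symbol the value 512 (the name @code{bufsize} is hypothetical):

@example
bufsize = 512
.set bufsize, 512
@end example
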
1398 @node Symbol Names, Dot, Setting Symbols, Symbols
1399 @section Symbol Names
1400 Symbol names begin with a letter or with one of @samp{$._}. That
1401 character may be followed by any string of digits, letters,
1402 underscores and dollar signs. Case of letters is significant:
1403 @code{foo} is a different symbol name than @code{Foo}.
1406 For the AMD 29K family, @samp{?} is also allowed in the
1407 body of a symbol name, though not at its beginning.
1410 Each symbol has exactly one name. Each name in an assembly language
1411 program refers to exactly one symbol. You may use that symbol name any
1412 number of times in a program.
1415 * Local Symbols:: Local Symbol Names
1418 @node Local Symbols, , Symbol Names, Symbol Names
1419 @subsection Local Symbol Names
1421 Local symbols help compilers and programmers use names temporarily.
1422 There are ten local symbol names, which are re-used throughout the
1423 program. You may refer to them using the names @samp{0} @samp{1}
1424 @dots{} @samp{9}. To define a local symbol, write a label of the form
1425 @samp{@b{N}:} (where @b{N} represents any digit). To refer to the most
1426 recent previous definition of that symbol write @samp{@b{N}b}, using the
1427 same digit as when you defined the label. To refer to the next
1428 definition of a local label, write @samp{@b{N}f}---where @b{N} gives you
1429 a choice of 10 forward references. The @samp{b} stands for
1430 ``backwards'' and the @samp{f} stands for ``forwards''.
1432 Local symbols are not emitted by the current GNU C compiler.
1434 There is no restriction on how you can use these labels, but
1435 remember that at any point in the assembly you can refer to at most
1436 10 prior local labels and to at most 10 forward local labels.
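
As an illustrative sketch, using only data directives so that it does
not depend on any machine's instruction set:

@example
1:      .long 0
        .long 1b    /* the address of the 1: defined above */
        .long 1f    /* the address of the 1: defined below */
1:      .long 0
        .long 1b    /* now the address of the second 1: */
@end example
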
1438 Local symbol names are only a notation device. They are immediately
1439 transformed into more conventional symbol names before the assembler
1440 uses them. The symbol names stored in the symbol table, appearing in
error messages and optionally emitted to the object file have these
parts:
1446 All local labels begin with @samp{L}. Normally both @code{as} and
1447 @code{ld} forget symbols that start with @samp{L}. These labels are
1448 used for symbols you are never intended to see. If you give the
1449 @samp{-L} option then @code{as} will retain these symbols in the
1450 object file. If you also instruct @code{ld} to retain these symbols,
1451 you may use them in debugging.
1454 If the label is written @samp{0:} then the digit is @samp{0}.
1455 If the label is written @samp{1:} then the digit is @samp{1}.
1456 And so on up through @samp{9:}.
1459 This unusual character is included so you don't accidentally invent
a symbol of the same name. The character has ASCII value
@samp{\001}.
1463 @item @emph{ordinal number}
1464 This is a serial number to keep the labels distinct. The first
@samp{0:} gets the number @samp{1}; the 15th @samp{0:} gets the
number @samp{15}; @emph{etc.} Likewise for the other labels @samp{1:}
through @samp{9:}.
1470 For instance, the first @code{1:} is named @code{L1@ctrl{A}1}, the 44th
1471 @code{3:} is named @code{L3@ctrl{A}44}.
1473 @node Dot, Symbol Attributes, Symbol Names, Symbols
1474 @section The Special Dot Symbol
1476 The special symbol @samp{.} refers to the current address that
1477 @code{as} is assembling into. Thus, the expression @samp{melvin:
1478 .long .} will cause @code{melvin} to contain its own address.
Assigning a value to @code{.} is treated the same as a @code{.org}
directive. Thus, the expression @samp{.=.+4} advances the location
counter by four bytes.
1488 @node Symbol Attributes, , Dot, Symbols
1489 @section Symbol Attributes
1490 Every symbol has these attributes: Value, Type, Descriptor, and ``Other''.
1492 @c The detailed definitions are in <a.out.h>.
1495 If you use a symbol without defining it, @code{as} assumes zero for
1496 all these attributes, and probably won't warn you. This makes the
symbol an externally defined symbol, which is generally what you
would want.
1501 * Symbol Value:: Value
1502 * Symbol Type:: Type
1503 * Symbol Desc:: Descriptor
1504 * Symbol Other:: Other
1507 @node Symbol Value, Symbol Type, Symbol Attributes, Symbol Attributes
1509 The value of a symbol is (usually) 32 bits, the size of one GNU C
1510 @code{int}. For a symbol which labels a location in the
1511 text, data, bss or absolute segments the
1512 value is the number of addresses from the start of that segment to
1513 the label. Naturally for text, data and bss
1514 segments the value of a symbol changes as @code{ld} changes segment
base addresses during linking. Absolute symbols' values do
1516 not change during linking: that is why they are called absolute.
1518 The value of an undefined symbol is treated in a special way. If it is
1519 0 then the symbol is not defined in this assembler source program, and
1520 @code{ld} will try to determine its value from other programs it is
1521 linked with. You make this kind of symbol simply by mentioning a symbol
1522 name without defining it. A non-zero value represents a @code{.comm}
1523 common declaration. The value is how much common storage to reserve, in
bytes (addresses). The symbol refers to the first address of the
allocated storage.
1527 @node Symbol Type, Symbol Desc, Symbol Value, Symbol Attributes
1529 The type attribute of a symbol is 8 bits encoded in a devious way.
We kept this coding standard for compatibility with older operating
systems.
@ifinfo
@example

   7     6     5     4     3     2     1     0   bit numbers
+-----+-----+-----+-----+-----+-----+-----+-----+
|                 |                       |     |
|   N_STAB bits   |      N_TYPE bits      |N_EXT|
|                 |                       |     |
+-----+-----+-----+-----+-----+-----+-----+-----+

                    Type byte
@end example
@end ifinfo
1549 \ibox{3cm}{7}\ibox{4cm}{4}\ibox{1.1cm}{0}&bit numbers\cr
1550 \boxit{3cm}{{\tt N\_STAB} bits}\boxit{4cm}{{\tt N\_TYPE}
1551 bits}\boxit{1.1cm}{\tt N\_EXT}\cr
1552 \hfill {\bf Type} byte\hfill\cr
1556 @subsubsection @code{N_EXT} bit
1557 This bit is set if @code{ld} might need to use the symbol's type bits
1558 and value. If this bit is off, then @code{ld} can ignore the
1559 symbol while linking. It is set in two cases. If the symbol is
1560 undefined, then @code{ld} is expected to find the symbol's value
1561 elsewhere in another program module. Otherwise the symbol has the
1562 value given, but this symbol name and value are revealed to any other
1563 programs linked in the same executable program. This second use of
1564 the @code{N_EXT} bit is most often made by a @code{.globl} statement.
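
A minimal sketch of both cases (the symbol names are hypothetical):

@example
        .globl entry      /* entry is revealed to other modules */
entry:  .long 0
helper: .long 0           /* not revealed: no .globl statement */
        .long elsewhere   /* never defined here, so ld must find it */
@end example
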
1566 @subsubsection @code{N_TYPE} bits
1567 These establish the symbol's ``type'', which is mainly a relocation
1568 concept. Common values are detailed in the manual describing the
1569 executable file format.
1571 @subsubsection @code{N_STAB} bits
1572 Common values for these bits are described in the manual on the
1573 executable file format.
1575 @node Symbol Desc, Symbol Other, Symbol Type, Symbol Attributes
1576 @subsection Descriptor
1577 This is an arbitrary 16-bit value. You may establish a symbol's
1578 descriptor value by using a @code{.desc} statement (@pxref{Desc}).
1579 A descriptor value means nothing to @code{as}.
1581 @node Symbol Other, , Symbol Desc, Symbol Attributes
1583 This is an arbitrary 8-bit value. It means nothing to @code{as}.
1585 @node Expressions, Pseudo Ops, Symbols, Top
1586 @chapter Expressions
1587 An @dfn{expression} specifies an address or numeric value.
1588 Whitespace may precede and/or follow an expression.
1591 * Empty Exprs:: Empty Expressions
1592 * Integer Exprs:: Integer Expressions
1595 @node Empty Exprs, Integer Exprs, Expressions, Expressions
1596 @section Empty Expressions
1597 An empty expression has no value: it is just whitespace or null.
1598 Wherever an absolute expression is required, you may omit the
1599 expression and @code{as} will assume a value of (absolute) 0. This
1600 is compatible with other assemblers.
1602 @node Integer Exprs, , Empty Exprs, Expressions
1603 @section Integer Expressions
1604 An @dfn{integer expression} is one or more @emph{arguments} delimited
1605 by @emph{operators}.
1608 * Arguments:: Arguments
1609 * Operators:: Operators
1610 * Prefix Ops:: Prefix Operators
1611 * Infix Ops:: Infix Operators
1614 @node Arguments, Operators, Integer Exprs, Integer Exprs
1615 @subsection Arguments
1617 @dfn{Arguments} are symbols, numbers or subexpressions. In other
1618 contexts arguments are sometimes called ``arithmetic operands''. In
1619 this manual, to avoid confusing them with the ``instruction operands'' of
1620 the machine language, we use the term ``argument'' to refer to parts of
1621 expressions only, reserving the word ``operand'' to refer only to machine
1622 instruction operands.
1624 Symbols are evaluated to yield @{@var{segment} @var{NNN}@} where
1625 @var{segment} is one of text, data, bss, absolute,
or @code{undefined}. @var{NNN} is a signed, 2's complement 32 bit
integer.
1629 Numbers are usually integers.
1631 A number can be a flonum or bignum. In this case, you are warned
1632 that only the low order 32 bits are used, and @code{as} pretends
1633 these 32 bits are an integer. You may write integer-manipulating
instructions that act on exotic constants, compatible with other
assemblers.
1637 Subexpressions are a left parenthesis @samp{(} followed by an integer
1638 expression, followed by a right parenthesis @samp{)}; or a prefix
1639 operator followed by an argument.
1641 @node Operators, Prefix Ops, Arguments, Integer Exprs
1642 @subsection Operators
1643 @dfn{Operators} are arithmetic functions, like @code{+} or @code{%}. Prefix
1644 operators are followed by an argument. Infix operators appear
between their arguments. Operators may be preceded and/or followed by
whitespace.
1648 @node Prefix Ops, Infix Ops, Operators, Integer Exprs
1649 @subsection Prefix Operators
1650 @code{as} has the following @dfn{prefix operators}. They each take
1651 one argument, which must be absolute.
@table @code
@item -
@dfn{Negation}. Two's complement negation.
@item ~
@dfn{Complementation}. Bitwise not.
@end table
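
As a small sketch of the two prefix operators (the values are arbitrary):

@example
.long -5    /* two's complement negation of 5 */
.long ~0    /* bitwise not of 0: every bit set */
@end example
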
1659 @node Infix Ops, , Prefix Ops, Integer Exprs
1660 @subsection Infix Operators
1662 @dfn{Infix operators} take two arguments, one on either side. Operators
1663 have precedence, but operations with equal precedence are performed left
1664 to right. Apart from @code{+} or @code{-}, both arguments must be
1665 absolute, and the result is absolute.
@enumerate
@item
Highest Precedence
@table @code
@item *
@dfn{Multiplication}.
@item /
@dfn{Division}. Truncation is the same as the C operator @samp{/}.
@item %
@dfn{Remainder}.
@item <
@itemx <<
@dfn{Shift Left}. Same as the C operator @samp{<<}.
@item >
@itemx >>
@dfn{Shift Right}. Same as the C operator @samp{>>}.
@end table

@item
Intermediate precedence
@table @code
@item |
@dfn{Bitwise Inclusive Or}.
@item &
@dfn{Bitwise And}.
@item ^
@dfn{Bitwise Exclusive Or}.
@item !
@dfn{Bitwise Or Not}.
@end table

@item
Lowest Precedence
@table @code
@item +
@dfn{Addition}. If either argument is absolute, the result
has the segment of the other argument.
If either argument is pass1 or undefined, the result is pass1.
Otherwise @code{+} is illegal.
@item -
@dfn{Subtraction}. If the right argument is absolute, the
result has the segment of the left argument.
If either argument is pass1 the result is pass1.
If either argument is undefined the result is difference segment.
If both arguments are in the same segment, the result is absolute---provided
that segment is one of text, data or bss.
Otherwise subtraction is illegal.
@end table
@end enumerate
1718 The sense of the rule for addition is that it's only meaningful to add
1719 the @emph{offsets} in an address; you can only have a defined segment in
1720 one of the two arguments.
1722 Similarly, you can't subtract quantities from two different segments.
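
The rules can be seen in a sketch like this one, where the hypothetical
labels @code{start} and @code{finish} lie in the same segment:

@example
start:  .long 1, 2, 3
finish:
        .long finish - start          /* same segment: absolute, 12 */
        .long start + 4               /* result keeps the segment of start */
        .long (finish - start) / 4    /* both arguments absolute: 3 */
@end example
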
1724 @node Pseudo Ops, Machine Dependent, Expressions, Top
1725 @chapter Assembler Directives
1727 * Abort:: The Abort directive causes as to abort
1728 * Align:: Pad the location counter to a power of 2
1729 * App-File:: Set the logical file name
1730 * Ascii:: Fill memory with bytes of ASCII characters
1731 * Asciz:: Fill memory with bytes of ASCII characters followed
1733 * Byte:: Fill memory with 8-bit integers
1734 * Comm:: Reserve public space in the BSS segment
1735 * Data:: Change to the data segment
1736 * Desc:: Set the n_desc of a symbol
1737 * Double:: Fill memory with double-precision floating-point numbers
1738 * Else:: @code{.else}
1740 * Endif:: @code{.endif}
1741 * Equ:: @code{.equ @var{symbol}, @var{expression}}
1742 * Extern:: @code{.extern}
1743 * Fill:: Fill memory with repeated values
1744 * Float:: Fill memory with single-precision floating-point numbers
1745 * Global:: Make a symbol visible to the linker
1746 * Ident:: @code{.ident}
1747 * If:: @code{.if @var{absolute expression}}
1748 * Include:: @code{.include "@var{file}"}
1749 * Int:: Fill memory with 32-bit integers
1750 * Lcomm:: Reserve private space in the BSS segment
1751 * Line:: Set the logical line number
1752 * Ln:: @code{.ln @var{line-number}}
1753 * List:: @code{.list}, @code{.nolist}, @code{.eject}, @code{.lflags}, @code{.title}, @code{.sbttl}
1754 * Long:: Fill memory with 32-bit integers
1755 * Lsym:: Create a local symbol
1756 * Octa:: Fill memory with 128-bit integers
1757 * Org:: Change the location counter
1758 * Quad:: Fill memory with 64-bit integers
1759 * Set:: Set the value of a symbol
1760 * Short:: Fill memory with 16-bit integers
1761 * Single:: @code{.single @var{flonums}}
1762 * Stab:: Store debugging information
1763 * Text:: Change to the text segment
1764 @c if am29k or sparc
1765 * Word:: Fill memory with 32-bit integers
1766 @c else (not am29k or sparc)
1767 * Deprecated:: Deprecated Directives
1768 * Machine Options:: Options
1769 * Machine Syntax:: Syntax
1770 * Floating Point:: Floating Point
1771 * Machine Directives:: Machine Directives
1775 All assembler directives have names that begin with a period (@samp{.}).
1776 The rest of the name is letters: their case does not matter.
1778 This chapter discusses directives present in all versions of GNU
1779 @code{as}; @pxref{Machine Dependent} for additional directives.
1781 @node Abort, Align, Pseudo Ops, Pseudo Ops
1782 @section @code{.abort}
1783 This directive stops the assembly immediately. It is for
1784 compatibility with other assemblers. The original idea was that the
1785 assembler program would be piped into the assembler. If the sender
1786 of a program quit, it could use this directive to tell @code{as} to
1787 quit also. One day @code{.abort} will not be supported.
1789 @node Align, App-File, Abort, Pseudo Ops
1790 @section @code{.align @var{absolute-expression} , @var{absolute-expression}}
1791 Pad the location counter (in the current subsegment) to a particular
1792 storage boundary. The first expression is the number of low-order zero
1793 bits the location counter will have after advancement. For example
1794 @samp{.align 3} will advance the location counter until it is a multiple of
1795 8. If the location counter is already a multiple of 8, no change is needed.
1798 The second expression gives the value to be stored in the padding
1799 bytes. It (and the comma) may be omitted. If it is omitted, the
1800 padding bytes are zero.
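
The arithmetic behind @code{.align} can be sketched in a few lines of
Python. This is a model for illustration only, not the assembler's
implementation; the function name @code{align} and the assumption that the
fill value fits in one byte are ours.

```python
def align(location_counter, zero_bits, fill=0):
    """Model '.align zero_bits, fill'.

    Returns (new_location_counter, padding_bytes).
    """
    boundary = 1 << zero_bits                  # .align 3 -> boundary of 8
    pad_length = -location_counter % boundary  # 0 when already aligned
    padding = bytes([fill & 0xFF]) * pad_length
    return location_counter + pad_length, padding
```

With a location counter of 5, @code{align(5, 3)} advances to 8 and emits
three zero bytes; @code{align(8, 3)} changes nothing.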
1802 @node App-File, Ascii, Align, Pseudo Ops
1803 @section @code{.app-file @var{string}}
1804 @code{.app-file} tells @code{as} that we are about to start a new
1805 logical file. @var{String} is the new file name. In general, the
1806 filename is recognized whether or not it is surrounded by quotes @samp{"};
1807 but if you wish to specify an empty file name,
1808 you must give the quotes--@code{""}. This statement may go away in
1809 future: it is only recognized to be compatible with old @code{as} programs.
1812 @node Ascii, Asciz, App-File, Pseudo Ops
1813 @section @code{.ascii "@var{string}"}@dots{}
1814 @code{.ascii} expects zero or more string literals (@pxref{Strings})
1815 separated by commas. It assembles each string (with no automatic
1816 trailing zero byte) into consecutive addresses.
1818 @node Asciz, Byte, Ascii, Pseudo Ops
1819 @section @code{.asciz "@var{string}"}@dots{}
1820 @code{.asciz} is just like @code{.ascii}, but each string is followed by
1821 a zero byte. The ``z'' in @samp{.asciz} stands for ``zero''.
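
The difference between the two directives is easiest to see in the bytes
they produce. The Python sketch below models only that difference; the
function names are hypothetical, and escape sequences inside the strings
are not modeled.

```python
def ascii_directive(*strings):
    """Bytes for '.ascii': the strings back to back, no terminator."""
    return b"".join(s.encode("ascii") for s in strings)

def asciz_directive(*strings):
    """Bytes for '.asciz': each string followed by a zero byte."""
    return b"".join(s.encode("ascii") + b"\x00" for s in strings)
```

Given the strings @code{"ab"} and @code{"c"}, the first function returns
the three bytes @code{abc}, and the second returns five bytes, with a zero
byte after each string.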
1823 @node Byte, Comm, Asciz, Pseudo Ops
1824 @section @code{.byte @var{expressions}}
1826 @code{.byte} expects zero or more expressions, separated by commas.
1827 Each expression is assembled into the next byte.
1829 @node Comm, Data, Byte, Pseudo Ops
1830 @section @code{.comm @var{symbol} , @var{length} }
1831 @code{.comm} declares a named common area in the bss segment. Normally
1832 @code{ld} reserves memory addresses for it during linking, so no partial
1833 program defines the location of the symbol. Use @code{.comm} to tell
1834 @code{ld} that it must be at least @var{length} bytes long. @code{ld}
1835 will allocate space for each @code{.comm} symbol that is at least as
1836 long as the longest @code{.comm} request in any of the partial programs
1837 linked. @var{length} is an absolute expression.
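
As an illustration of how competing @code{.comm} requests are reconciled,
the following Python sketch picks the longest request for each symbol. It
is a model of the rule stated above, not @code{ld}'s code; the function
name and the list-of-pairs input are assumptions, and a real linker is
free to allocate more than the longest request.

```python
def resolve_common(requests):
    """requests: (symbol, length) pairs from all partial programs.

    Returns a dictionary mapping each symbol to the space reserved.
    """
    sizes = dict()
    for symbol, length in requests:
        sizes[symbol] = max(length, sizes.get(symbol, 0))
    return sizes
```

If one partial program asks for 16 bytes for @code{buf} and another asks
for 64, the model reserves 64 bytes.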
1839 @node Data, Desc, Comm, Pseudo Ops
1840 @section @code{.data @var{subsegment}}
1841 @code{.data} tells @code{as} to assemble the following statements onto the
1842 end of the data subsegment numbered @var{subsegment} (which is an
1843 absolute expression). If @var{subsegment} is omitted, it defaults to zero.
1846 @node Desc, Double, Data, Pseudo Ops
1847 @section @code{.desc @var{symbol}, @var{absolute-expression}}
1848 This directive sets the descriptor of the symbol (@pxref{Symbol Attributes})
1849 to the low 16 bits of @var{absolute-expression}.
1851 @node Double, Else, Desc, Pseudo Ops
1852 @section @code{.double @var{flonums}}
1853 @code{.double} expects zero or more flonums, separated by commas. It assembles
1854 floating point numbers.
1856 @c The exact kind of floating point numbers
1857 @c emitted depends on how @code{as} is configured. @xref{Machine
1861 On the AMD 29K family the floating point format used is IEEE.
1864 @node Else, End, Double, Pseudo Ops
1865 @section @code{.else}
1866 @code{.else} is part of the @code{as} support for conditional assembly;
1867 @pxref{If}. It marks the beginning of a section of code to be assembled
1868 if the condition for the preceding @code{.if} was false.
1871 @node End, Endif, Else, Pseudo Ops
1872 @section @code{.end}
1873 This doesn't do anything---but isn't an s_ignore, so I suspect it's
1874 meant to do something eventually (which is why it isn't documented here
1875 as "for compatibility with blah").
1878 @node Endif, Equ, End, Pseudo Ops
1879 @section @code{.endif}
1880 @code{.endif} is part of the @code{as} support for conditional assembly;
1881 it marks the end of a block of code that is only assembled
1882 conditionally. @xref{If}.
1884 @node Equ, Extern, Endif, Pseudo Ops
1885 @section @code{.equ @var{symbol}, @var{expression}}
1887 This directive sets the value of @var{symbol} to @var{expression}.
1888 It is synonymous with @samp{.set}; @pxref{Set}.
1890 @node Extern, Fill, Equ, Pseudo Ops
1891 @section @code{.extern}
1892 @code{.extern} is accepted in the source program---for compatibility
1893 with other assemblers---but it is ignored. GNU @code{as} treats
1894 all undefined symbols as external.
1896 @node Fill, Float, Extern, Pseudo Ops
1897 @section @code{.fill @var{repeat} , @var{size} , @var{value}}
1898 @var{repeat}, @var{size} and @var{value} are absolute expressions.
1899 This emits @var{repeat} copies of @var{size} bytes. @var{Repeat}
1900 may be zero or more. @var{Size} may be zero or more, but if it is
1901 more than 8, then it is deemed to have the value 8, compatible with
1902 other people's assemblers. The contents of each repetition
1903 are taken from an 8-byte number. The highest order 4 bytes are
1904 zero. The lowest order 4 bytes are @var{value} rendered in the
1905 byte-order of an integer on the computer @code{as} is assembling for.
1906 Each @var{size} bytes in a repetition is taken from the lowest order
1907 @var{size} bytes of this number. Again, this bizarre behavior is
1908 compatible with other people's assemblers.
1910 @var{Size} and @var{value} are optional.
1911 If the second comma and @var{value} are absent, @var{value} is
1912 assumed zero. If the first comma and following tokens are absent,
1913 @var{size} is assumed to be 1.
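
Because the rules for @code{.fill} are intricate, a small model may help.
The Python sketch below follows the description above literally; it is
not taken from the assembler's source. The function name and the
@code{byteorder} parameter (standing in for the byte order of the target
machine) are assumptions made for the example.

```python
def fill(repeat, size=1, value=0, byteorder="big"):
    """Model '.fill repeat, size, value' and return the bytes emitted."""
    size = min(size, 8)                     # sizes above 8 count as 8
    number = value & 0xFFFFFFFF             # high 4 of the 8 bytes are zero
    low = number & ((1 << (8 * size)) - 1)  # lowest-order `size` bytes
    return low.to_bytes(size, byteorder) * repeat
```

In this model @code{fill(3, 2, 0x1234)} produces the two bytes @code{12 34}
three times on a big-endian target, and @code{fill(2)} produces two zero
bytes.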
1915 @node Float, Global, Fill, Pseudo Ops
1916 @section @code{.float @var{flonums}}
1917 This directive assembles zero or more flonums, separated by commas. It
1918 has the same effect as @code{.single}.
1920 @c The exact kind of floating point numbers emitted depends on how
1921 @c @code{as} is configured.
1922 @c @xref{Machine Dependent}.
1925 The floating point format used for the AMD 29K family is IEEE.
1928 @node Global, Ident, Float, Pseudo Ops
1929 @section @code{.global @var{symbol}}, @code{.globl @var{symbol}}
1930 @code{.global} makes the symbol visible to @code{ld}. If you define
1931 @var{symbol} in your partial program, its value is made available to
1932 other partial programs that are linked with it. Otherwise,
1933 @var{symbol} will take its attributes from a symbol of the same name
1934 from another partial program it is linked with.
1936 This is done by setting the @code{N_EXT} bit of that symbol's type byte
1937 to 1. @xref{Symbol Attributes}.
1939 Both spellings (@samp{.globl} and @samp{.global}) are accepted, for
1940 compatibility with other assemblers.
1942 @node Ident, If, Global, Pseudo Ops
1943 @section @code{.ident}
1944 This directive is used by some assemblers to place tags in object files.
1945 GNU @code{as} simply accepts the directive for source-file
1946 compatibility with such assemblers, but does not actually emit anything for it.
1949 @node If, Include, Ident, Pseudo Ops
1950 @section @code{.if @var{absolute expression}}
1951 @code{.if} marks the beginning of a section of code which is only
1952 considered part of the source program being assembled if the argument
1953 (which must be an @var{absolute expression}) is non-zero. The end of
1954 the conditional section of code must be marked by @code{.endif}
1955 (@pxref{Endif}); optionally, you may include code for the
1956 alternative condition, flagged by @code{.else} (@pxref{Else}).
1958 The following variants of @code{.if} are also supported:
1960 @item ifdef @var{symbol}
1961 Assembles the following section of code if the specified @var{symbol}
1969 @item ifndef @var{symbol}
1970 @itemx ifnotdef @var{symbol}
1971 Assembles the following section of code if the specified @var{symbol}
1972 has not been defined. Both spelling variants are equivalent.
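
The way these directives select source lines can be modeled with a stack,
one entry per open conditional. The Python sketch below is an
illustration only, not the assembler's code. It assumes the condition of
@code{.if} is a plain integer literal, whereas @code{as} accepts any
absolute expression, and the function name @code{select_lines} is
hypothetical.

```python
def select_lines(lines, defined=()):
    """Yield the source lines that conditional assembly would keep."""
    defined = set(defined)
    stack = []        # one (parent_active, condition) pair per open .if
    active = True
    for line in lines:
        words = line.split()
        op = words[0] if words else ""
        if op in (".if", ".ifdef", ".ifndef", ".ifnotdef"):
            if op == ".if":
                condition = int(words[1], 0) != 0
            elif op == ".ifdef":
                condition = words[1] in defined
            else:                          # .ifndef and .ifnotdef
                condition = words[1] not in defined
            stack.append((active, condition))
            active = active and condition
        elif op == ".else":
            parent_active, condition = stack[-1]
            active = parent_active and not condition
        elif op == ".endif":
            active, _ = stack.pop()        # back to the enclosing state
        elif active:
            yield line
```

A block nested inside a false conditional stays inactive even if its own
condition is true, because each entry remembers whether its parent was
active.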
1980 @node Include, Int, If, Pseudo Ops
1981 @section @code{.include "@var{file}"}
1982 This directive provides a way to include supporting files at specified
1983 points in your source program. The code from @var{file} is assembled as
1984 if it followed the point of the @code{.include}; when the end of the
1985 included file is reached, assembly of the original file continues. You
1986 can control the search paths used with the @samp{-I} command-line option
1987 (@pxref{Options}). Quotation marks are required around @var{file}.
1989 @node Int, Lcomm, Include, Pseudo Ops
1990 @section @code{.int @var{expressions}}
1991 Expect zero or more @var{expressions}, of any segment, separated by
1992 commas. For each expression, emit a 32-bit number that will, at run
1993 time, be the value of that expression. The byte order of the
1994 expression depends on what kind of computer will run the program.
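
The effect of the target's byte order on the emitted number can be shown
with a short Python sketch. It is a model only; the function name and the
@code{byteorder} parameter are assumptions, as is the truncation of the
value to 32 bits.

```python
def emit_int(value, byteorder="big"):
    """Return the four bytes '.int value' would emit in this model."""
    return (value & 0xFFFFFFFF).to_bytes(4, byteorder)
```

The value @code{0x12345678} comes out as the bytes @code{12 34 56 78} on a
big-endian target and as @code{78 56 34 12} on a little-endian one.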
1996 @node Lcomm, Line, Int, Pseudo Ops
1997 @section @code{.lcomm @var{symbol} , @var{length}}
1998 Reserve @var{length} (an absolute expression) bytes for a local
1999 common denoted by @var{symbol}. The segment and value of @var{symbol} are
2000 those of the new local common. The addresses are allocated in the
2001 bss segment, so at run-time the bytes will start off zeroed.
2002 @var{Symbol} is not declared global (@pxref{Global}), so is normally
2003 not visible to @code{ld}.
2007 @node Line, Ln, Lcomm, Pseudo Ops
2008 @section @code{.line @var{line-number}}, @code{.ln @var{line-number}}
2009 @code{.line}, and its alternate spelling @code{.ln}, tell
2013 @node Ln, List, Line, Pseudo Ops
2014 @section @code{.ln @var{line-number}}
2017 @code{as} to change the logical line number. @var{line-number} must be
2018 an absolute expression. The next line will have that logical line
2019 number. So any other statements on the current line (after a statement
2027 will be reported as on logical line number
2028 @var{logical line number} @minus{} 1.
2029 One day this directive will be unsupported: it is used only
2030 for compatibility with existing assembler programs. @refill
2032 @node List, Long, Ln, Pseudo Ops
2033 @section @code{.list}, @code{.nolist}, @code{.eject}, @code{.lflags}, @code{.title}, @code{.sbttl}
2034 GNU @code{as} ignores these directives; however, they're
2035 accepted for compatibility with assemblers that use them.
2037 @node Long, Lsym, List, Pseudo Ops
2038 @section @code{.long @var{expressions}}
2039 @code{.long} is the same as @samp{.int}, @pxref{Int}.
2041 @node Lsym, Octa, Long, Pseudo Ops
2042 @section @code{.lsym @var{symbol}, @var{expression}}
2043 @code{.lsym} creates a new symbol named @var{symbol}, but does not put it in
2044 the hash table, ensuring it cannot be referenced by name during the
2045 rest of the assembly. This sets the attributes of the symbol to be
2046 the same as the expression value:
2048 @var{other} = @var{descriptor} = 0
2049 @var{type} = @r{(segment of @var{expression})}
2051 @var{value} = @var{expression}
2054 @node Octa, Org, Lsym, Pseudo Ops
2055 @section @code{.octa @var{bignums}}
2056 This directive expects zero or more bignums, separated by commas. For each
2057 bignum, it emits a 16-byte integer.
2059 The term ``octa'' comes from contexts in which a ``word'' was two bytes;
2060 hence @emph{octa}-word for 16 bytes.
2062 @node Org, Quad, Octa, Pseudo Ops
2063 @section @code{.org @var{new-lc} , @var{fill}}
2065 @code{.org} will advance the location counter of the current segment to
2066 @var{new-lc}. @var{new-lc} is either an absolute expression or an
2067 expression with the same segment as the current subsegment. That is,
2068 you can't use @code{.org} to cross segments: if @var{new-lc} has the
2069 wrong segment, the @code{.org} directive is ignored. To be compatible
2070 with former assemblers, if the segment of @var{new-lc} is absolute,
2071 @code{as} will issue a warning, then pretend the segment of @var{new-lc}
2072 is the same as the current subsegment.
2074 @code{.org} may only increase the location counter, or leave it
2075 unchanged; you cannot use @code{.org} to move the location counter backwards.
2078 @c double negative used below "not undefined" because this is a specific
2079 @c reference to "undefined" (as SEG_UNKNOWN is called in this manual)
2080 @c segment. pesch@cygnus.com 18feb91
2081 Because @code{as} tries to assemble programs in one pass, @var{new-lc}
2082 may not be undefined. If you really detest this restriction we eagerly await
2083 a chance to share your improved assembler.
2085 Beware that the origin is relative to the start of the segment, not
2086 to the start of the subsegment. This is compatible with other
2087 people's assemblers.
2089 When the location counter (of the current subsegment) is advanced, the
2090 intervening bytes are filled with @var{fill} which should be an
2091 absolute expression. If the comma and @var{fill} are omitted,
2092 @var{fill} defaults to zero.
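
The forward-only behavior of @code{.org} can be summarized in a short
Python model. This is a sketch for illustration, not the assembler's
code: the function name is hypothetical, both counters are taken as
offsets from the start of the segment, and the fill value is assumed to
fit in one byte.

```python
def org(location_counter, new_lc, fill=0):
    """Model '.org new_lc, fill'.

    Returns (new_location_counter, padding_bytes).
    """
    if new_lc < location_counter:
        raise ValueError(".org may not move the location counter backwards")
    padding = bytes([fill & 0xFF]) * (new_lc - location_counter)
    return new_lc, padding
```

Advancing from offset 4 to offset 8 emits four fill bytes; asking for an
offset below the current one is rejected.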
2094 @node Quad, Set, Org, Pseudo Ops
2095 @section @code{.quad @var{bignums}}
2096 @code{.quad} expects zero or more bignums, separated by commas. For
2097 each bignum, it emits an 8-byte integer. If the bignum won't fit in 8
2098 bytes, it prints a warning message; and just takes the lowest order 8
2099 bytes of the bignum.
2101 The term ``quad'' comes from contexts in which a ``word'' was two bytes;
2102 hence @emph{quad}-word for 8 bytes.
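
The truncation rule for @code{.quad} can be modeled as follows. The
Python sketch is illustrative only; the function name is hypothetical, the
bignum is assumed to be non-negative, and the result is returned as an
integer rather than as bytes so that no byte order has to be assumed.

```python
def quad(bignum):
    """Model '.quad bignum'.

    Returns (value_kept, warned), where value_kept is the lowest-order
    8 bytes of the bignum and warned tells whether anything was lost.
    """
    kept = bignum & 0xFFFFFFFFFFFFFFFF
    warned = kept != bignum
    return kept, warned
```

A value that fits in 8 bytes is returned unchanged with no warning; a
larger one loses its high-order part and sets the warning flag.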
2104 @node Set, Short, Quad, Pseudo Ops
2105 @section @code{.set @var{symbol}, @var{expression}}
2107 This directive sets the value of @var{symbol} to @var{expression}. This
2108 will change @var{symbol}'s value and type to conform to
2109 @var{expression}. If @code{N_EXT} is set, it remains set.
2110 (@xref{Symbol Attributes}.)
2112 You may @code{.set} a symbol many times in the same assembly.
2113 If the expression's segment is unknowable during pass 1, a second
2114 pass over the source program will be forced. The second pass is
2115 currently not implemented. @code{as} will abort with an error
2116 message if one is required.
2118 If you @code{.set} a global symbol, the value stored in the object
2119 file is the last value stored into it.
2121 @node Short, Single, Set, Pseudo Ops
2122 @section @code{.short @var{expressions}}
2123 @c if not (sparc or amd29k)
2124 @c @code{.short} is the same as @samp{.word}. @xref{Word}.
2125 @c fi not (sparc or amd29k)
2126 @c if (sparc or amd29k)
2127 This expects zero or more @var{expressions}, and emits
2128 a 16-bit number for each.
2129 @c fi (sparc or amd29k)
2131 @node Single, Space, Short, Pseudo Ops
2132 @section @code{.single @var{flonums}}
2133 This directive assembles zero or more flonums, separated by commas. It
2134 has the same effect as @code{.float}.
2136 @c The exact kind of floating point numbers emitted depends on how
2137 @c @code{as} is configured. @xref{Machine Dependent}.
2140 The floating point format used for the AMD 29K family is IEEE.
2144 @node Space, Space, Single, Pseudo Ops
2147 @section @code{.space @var{size} , @var{fill}}
2148 This directive emits @var{size} bytes, each of value @var{fill}. Both
2149 @var{size} and @var{fill} are absolute expressions. If the comma
2150 and @var{fill} are omitted, @var{fill} is assumed to be zero.
2155 @section @code{.space}
2156 This directive is ignored; it is accepted for compatibility with other assemblers.
2160 @emph{Warning:} In other versions of GNU @code{as}, the directive
2161 @code{.space} has the effect of @code{.block}. @xref{Machine Directives}.
2165 @node Stab, Text, Space, Pseudo Ops
2166 @section @code{.stabd, .stabn, .stabs}
2167 There are three directives that begin @samp{.stab}.
2168 All emit symbols (@pxref{Symbols}), for use by symbolic debuggers.
2169 The symbols are not entered in @code{as}' hash table: they
2170 cannot be referenced elsewhere in the source file.
2171 Up to five fields are required:
2174 This is the symbol's name. It may contain any character except @samp{\000},
2175 so is more general than ordinary symbol names. Some debuggers used to
2176 code arbitrarily complex structures into symbol names using this field.
2178 An absolute expression. The symbol's type is set to the low 8
2179 bits of this expression.
2180 Any bit pattern is permitted, but @code{ld} and debuggers will choke on
2183 An absolute expression.
2184 The symbol's ``other'' attribute is set to the low 8 bits of this expression.
2186 An absolute expression.
2187 The symbol's descriptor is set to the low 16 bits of this expression.
2189 An absolute expression which becomes the symbol's value.
2192 If a warning is detected while reading a @code{.stabd}, @code{.stabn},
2193 or @code{.stabs} statement, the symbol has probably already been created
2194 and you will get a half-formed symbol in your object file. This is
2195 compatible with earlier assemblers!
2198 @item .stabd @var{type} , @var{other} , @var{desc}
2200 The ``name'' of the symbol generated is not even an empty string.
2201 It is a null pointer, for compatibility. Older assemblers used a
2202 null pointer so they didn't waste space in object files with empty strings.
2205 The symbol's value is set to the location counter,
2206 relocatably. When your program is linked, the value of this symbol
2207 will be where the location counter was when the @code{.stabd} was assembled.
2210 @item .stabn @var{type} , @var{other} , @var{desc} , @var{value}
2212 The name of the symbol is set to the empty string @code{""}.
2214 @item .stabs @var{string} , @var{type} , @var{other} , @var{desc} , @var{value}
2216 All five fields are specified.
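
The relationship between the three directives and the five fields can be
summarized in a short Python model. It is a sketch only, not the
assembler's code: the @code{Stab} tuple and the function names are
hypothetical, and @code{None} stands in for the null name pointer used by
@code{.stabd}.

```python
from collections import namedtuple

Stab = namedtuple("Stab", "name type other desc value")

def stabs(string, type_, other, desc, value):
    """All five fields given; narrow fields keep only their low bits."""
    return Stab(string, type_ & 0xFF, other & 0xFF, desc & 0xFFFF, value)

def stabn(type_, other, desc, value):
    """Like stabs, but the name is the empty string."""
    return stabs("", type_, other, desc, value)

def stabd(type_, other, desc, location_counter):
    """No name at all; the value is the current location counter."""
    return stabs(None, type_, other, desc, location_counter)
```

In this model a type of @code{0x1FF} is stored as @code{0xFF}, since only
the low 8 bits are kept.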
2219 @node Text, Word, Stab, Pseudo Ops
2220 @section @code{.text @var{subsegment}}
2221 Tells @code{as} to assemble the following statements onto the end of
2222 the text subsegment numbered @var{subsegment}, which is an absolute
2223 expression. If @var{subsegment} is omitted, subsegment number zero
2226 @node Word, Deprecated, Text, Pseudo Ops
2227 @section @code{.word @var{expressions}}
2228 This directive expects zero or more @var{expressions}, of any segment,
2229 separated by commas.
2230 @c if sparc or amd29k
2231 For each expression, @code{as} emits a 32-bit number.
2232 @c fi sparc or amd29k
2233 @c if not (sparc or amd29k)
2234 @c For each expression, @code{as} emits a 16-bit number.
2235 @c fi not (sparc or amd29k)
2239 of the expression depends on what kind of computer will run the
2245 @c on the 29k this doesn't happen---32-bit addressability, period; no
2246 @c long/short jumps.
2248 @subsection Special Treatment to support Compilers
2250 In order to assemble compiler output into something that will work,
2251 @code{as} will occasionally do strange things to @samp{.word} directives.
2252 Directives of the form @samp{.word sym1-sym2} are often emitted by
2253 compilers as part of jump tables. Therefore, when @code{as} assembles a
2254 directive of the form @samp{.word sym1-sym2}, and the difference between
2255 @code{sym1} and @code{sym2} does not fit in 16 bits, @code{as} will
2256 create a @dfn{secondary jump table}, immediately before the next label.
2257 This @var{secondary jump table} will be preceded by a short-jump to the
2258 first byte after the secondary table. This short-jump prevents the flow
2259 of control from accidentally falling into the new table. Inside the
2260 table will be a long-jump to @code{sym2}. The original @samp{.word}
2261 will contain @code{sym1} minus the address of the long-jump to @code{sym2}.
2264 If there were several occurrences of @samp{.word sym1-sym2} before the
2265 secondary jump table, all of them will be adjusted. If there was a
2266 @samp{.word sym3-sym4}, that also did not fit in sixteen bits, a
2267 long-jump to @code{sym4} will be included in the secondary jump table,
2268 and the @code{.word} directives will be adjusted to contain @code{sym3}
2269 minus the address of the long-jump to @code{sym4}; and so on, for as many
2270 entries in the original jump table as necessary.
2274 @emph{This feature may be disabled by compiling @code{as} with the
2275 @samp{-DWORKING_DOT_WORD} option.} This feature is likely to confuse
2276 assembly language programmers.
2281 @node Deprecated, Machine Dependent, Word, Pseudo Ops
2282 @section Deprecated Directives
2283 One day these directives won't work.
2284 They are included for compatibility with older assemblers.
2291 @node Machine Dependent, Machine Dependent, Pseudo Ops, Top
2293 @c chapter Machine Dependent Features
2296 @c chapter Machine Dependent Features: Motorola 680x0
2299 @chapter Machine Dependent Features: AMD 29K
2301 @c pesch@cygnus.com: This version of the manual is specifically hacked
2302 @c for gas on a particular machine.
2303 @c We should have a config method of
2304 @c automating this; in the meantime, use ignore
2305 @c for the other architectures (or for their stubs)
2312 The Vax version of @code{as} accepts any of the following options,
2313 gives a warning message that the option was ignored and proceeds.
2314 These options are for compatibility with scripts designed for other
2315 people's assemblers.
2318 @item @kbd{-D} (Debug)
2319 @itemx @kbd{-S} (Symbol Table)
2320 @itemx @kbd{-T} (Token Trace)
2321 These are obsolete options used to debug old assemblers.
2323 @item @kbd{-d} (Displacement size for JUMPs)
2324 This option expects a number following the @kbd{-d}. Like options
2325 that expect filenames, the number may immediately follow the
2326 @kbd{-d} (old standard) or constitute the whole of the command line
2327 argument that follows @kbd{-d} (GNU standard).
2329 @item @kbd{-V} (Virtualize Interpass Temporary File)
2330 Some other assemblers use a temporary file. This option
2331 commanded them to keep the information in active memory rather
2332 than in a disk file. @code{as} always does this, so this
2333 option is redundant.
2335 @item @kbd{-J} (JUMPify Longer Branches)
2336 Many 32-bit computers permit a variety of branch instructions
2337 to do the same job. Some of these instructions are short (and
2338 fast) but have a limited range; others are long (and slow) but
2339 can branch anywhere in virtual memory. Often there are 3
2340 flavors of branch: short, medium and long. Some other
2341 assemblers would emit short and medium branches, unless told by
2342 this option to emit short and long branches.
2344 @item @kbd{-t} (Temporary File Directory)
2345 Some other assemblers may use a temporary file, and this option
2346 takes the name of the directory in which to put the temporary
2347 file. @code{as} does not use a temporary disk file, so this
2348 option makes no difference. @kbd{-t} needs exactly one
2352 The Vax version of the assembler accepts two options when
2353 compiled for VMS. They are @kbd{-h}, and @kbd{-+}. The
2354 @kbd{-h} option prevents @code{as} from modifying the
2355 symbol-table entries for symbols that contain lowercase
2356 characters (I think). The @kbd{-+} option causes @code{as} to
2357 print warning messages if the FILENAME part of the object file,
2358 or any symbol name is larger than 31 characters. The @kbd{-+}
2359 option also inserts some code following the @samp{_main}
2360 symbol so that the object file will be compatible with Vax-11
2363 @subsection Floating Point
2364 Conversion of flonums to floating point is correct, and
2365 compatible with previous assemblers. Rounding is
2366 towards zero if the remainder is exactly half the least significant bit.
2368 @code{D}, @code{F}, @code{G} and @code{H} floating point formats
2371 Immediate floating literals (@emph{e.g.} @samp{S`$6.9})
2372 are rendered correctly. Again, rounding is towards zero in the
2375 The @code{.float} directive produces @code{f} format numbers.
2376 The @code{.double} directive produces @code{d} format numbers.
2378 @subsection Machine Directives
2379 The Vax version of the assembler supports four directives for
2380 generating Vax floating point constants. They are described in the
2385 This expects zero or more flonums, separated by commas, and
2386 assembles Vax @code{d} format 64-bit floating point constants.
2389 This expects zero or more flonums, separated by commas, and
2390 assembles Vax @code{f} format 32-bit floating point constants.
2393 This expects zero or more flonums, separated by commas, and
2394 assembles Vax @code{g} format 64-bit floating point constants.
2397 This expects zero or more flonums, separated by commas, and
2398 assembles Vax @code{h} format 128-bit floating point constants.
2403 All DEC mnemonics are supported. Beware that @code{case@dots{}}
2404 instructions have exactly 3 operands. The dispatch table that
2405 follows the @code{case@dots{}} instruction should be made with
2406 @code{.word} statements. This is compatible with all unix
2407 assemblers we know of.
2409 @subsection Branch Improvement
2410 Certain pseudo opcodes are permitted. They are for branch
2411 instructions. They expand to the shortest branch instruction that
2412 will reach the target. Generally these mnemonics are made by
2413 substituting @samp{j} for @samp{b} at the start of a DEC mnemonic.
2414 This feature is included both for compatibility and to help
2415 compilers. If you don't need this feature, don't use these
2416 opcodes. Here are the mnemonics, and the code they can expand into.
2420 @samp{Jsb} is already an instruction mnemonic, so we chose @samp{jbsb}.
2422 @item (byte displacement)
2424 @item (word displacement)
2426 @item (long displacement)
2431 Unconditional branch.
2433 @item (byte displacement)
2435 @item (word displacement)
2437 @item (long displacement)
2441 @var{COND} may be any one of the conditional branches
2442 @code{neq nequ eql eqlu gtr geq lss gtru lequ vc vs gequ cc lssu cs}.
2443 @var{COND} may also be one of the bit tests
2444 @code{bs bc bss bcs bsc bcc bssi bcci lbs lbc}.
2445 @var{NOTCOND} is the opposite condition to @var{COND}.
2447 @item (byte displacement)
2448 @kbd{b@var{COND} @dots{}}
2449 @item (word displacement)
2450 @kbd{b@var{NOTCOND} foo ; brw @dots{} ; foo:}
2451 @item (long displacement)
2452 @kbd{b@var{NOTCOND} foo ; jmp @dots{} ; foo:}
2455 @var{X} may be one of @code{b d f g h l w}.
2457 @item (word displacement)
2458 @kbd{@var{OPCODE} @dots{}}
2459 @item (long displacement)
2460 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: jmp @dots{} ; bar:}
2463 @var{YYY} may be one of @code{lss leq}.
2465 @var{ZZZ} may be one of @code{geq gtr}.
2467 @item (byte displacement)
2468 @kbd{@var{OPCODE} @dots{}}
2469 @item (word displacement)
2470 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: brw @var{destination} ; bar:}
2471 @item (long displacement)
2472 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: jmp @var{destination} ; bar: }
2479 @item (byte displacement)
2480 @kbd{@var{OPCODE} @dots{}}
2481 @item (word displacement)
2482 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: brw @var{destination} ; bar:}
2483 @item (long displacement)
2484 @kbd{@var{OPCODE} @dots{}, foo ; brb bar ; foo: jmp @var{destination} ; bar:}
2488 @subsection Operands
2489 The immediate character is @samp{$} for Unix compatibility, not
2490 @samp{#} as DEC writes it.
2492 The indirect character is @samp{*} for Unix compatibility, not
2493 @samp{@@} as DEC writes it.
2495 The displacement sizing character is @samp{`} (an accent grave) for
2496 Unix compatibility, not @samp{^} as DEC writes it. The letter
2497 preceding @samp{`} may have either case. @samp{G} is not
2498 understood, but all other letters (@code{b i l s w}) are understood.
2500 Register names understood are @code{r0 r1 r2 @dots{} r15 ap fp sp
2501 pc}. Any case of letters will do.
2508 Any expression is permitted in an operand. Operands are comma separated.
2511 @c There is some bug to do with recognizing expressions
2512 @c in operands, but I forget what it is. It is
2513 @c a syntax clash because () is used as an address mode
2514 @c and to encapsulate sub-expressions.
2515 @subsection Not Supported
2516 Vax bit fields cannot be assembled with @code{as}. Someone
2517 can add the required code if they really need it.
2521 @node Machine Options, Machine Syntax, Machine Dependent, Machine Dependent
2523 GNU @code{as} has no additional command-line options for the AMD 29K family.
2526 @node Machine Syntax, Floating Point, Machine Options, Machine Dependent
2528 @subsection Special Characters
2529 @samp{;} is the line comment character.
2531 @samp{@@} can be used instead of a newline to separate statements.
2533 The character @samp{?} is permitted in identifiers (but may not begin an identifier).
2536 @subsection Register Names
2537 General-purpose registers are represented by predefined symbols of the
2538 form @samp{GR@var{nnn}} (for global registers) or @samp{LR@var{nnn}}
2539 (for local registers), where @var{nnn} represents a number between
2540 @code{0} and @code{127}, written with no leading zeros. The leading
2541 letters may be in either upper or lower case; for example, @samp{gr13}
2542 and @samp{LR7} are both valid register names.
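
The naming rule for general-purpose registers can be checked mechanically.
The Python sketch below is an illustration of the rule as stated above,
not the assembler's own parser; the function name and the return values
are assumptions made for the example.

```python
import re

# Prefix in either case, then a number written with no leading zeros.
_REGISTER = re.compile(r"(gr|lr)(0|[1-9][0-9]*)", re.IGNORECASE)

def parse_register(name):
    """Return ("global" or "local", number), or None if name is not valid."""
    match = _REGISTER.fullmatch(name)
    if match is None:
        return None
    number = int(match.group(2))
    if number > 127:
        return None                      # only 0 through 127 are allowed
    kind = "global" if match.group(1).lower() == "gr" else "local"
    return kind, number
```

Under this model @code{gr13} and @code{LR7} are accepted, while
@code{gr007} (leading zeros) and @code{lr128} (out of range) are not.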
2544 You may also refer to general-purpose registers by specifying the
2545 register number as the result of an expression (prefixed with @samp{%%}
2546 to flag the expression as a register number):
2550 @noindent---where @var{expression} must be an absolute expression
2551 evaluating to a number between @code{0} and @code{255}. The range
2552 [0, 127] refers to global registers, and the range [128, 255] to local registers.
2555 In addition, GNU @code{as} understands the following protected
2556 special-purpose register names for the AMD 29K family:
2566 These unprotected special-purpose register names are also recognized:
2574 @node Floating Point, Machine Directives, Machine Syntax, Machine Dependent
2575 @section Floating Point
2576 The AMD 29K family uses IEEE floating-point numbers.
2578 @node Machine Directives, Opcodes, Floating Point, Machine Dependent
2579 @section Machine Directives
2582 * block:: @code{.block @var{size} , @var{fill}}
2583 * cputype:: @code{.cputype}
2584 * file:: @code{.file}
2585 * hword:: @code{.hword @var{expressions}}
2586 * line:: @code{.line}
2587 * reg:: @code{.reg @var{symbol}, @var{expression}}
2588 * sect:: @code{.sect}
2589 * use:: @code{.use @var{segment name}}
2592 @node block, cputype, Machine Directives, Machine Directives
2593 @subsection @code{.block @var{size} , @var{fill}}
2594 This directive emits @var{size} bytes, each of value @var{fill}. Both
2595 @var{size} and @var{fill} are absolute expressions. If the comma
2596 and @var{fill} are omitted, @var{fill} is assumed to be zero.
2598 In other versions of GNU @code{as}, this directive is called @samp{.space}.
2601 @node cputype, file, block, Machine Directives
2602 @subsection @code{.cputype}
2603 This directive is ignored; it is accepted for compatibility with other assemblers.
2606 @node file, hword, cputype, Machine Directives
2607 @subsection @code{.file}
2608 This directive is ignored; it is accepted for compatibility with other assemblers.
2612 @emph{Warning:} in other versions of GNU @code{as}, @code{.file} is
2613 used for the directive called @code{.app-file} in the AMD 29K support.
2616 @node hword, line, file, Machine Directives
2617 @subsection @code{.hword @var{expressions}}
2618 This expects zero or more @var{expressions}, and emits
2619 a 16 bit number for each. (Synonym for @samp{.short}.)
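
For instance, assuming nothing beyond the description above, this line
should emit three 16-bit numbers, six bytes in all (the values are
arbitrary):

@example
        .hword  1, 0x1234, 65535
@end example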
2621 @node line, reg, hword, Machine Directives
2622 @subsection @code{.line}
2623 This directive is ignored; it is accepted for compatibility with other assemblers.
2626 @node reg, sect, line, Machine Directives
2627 @subsection @code{.reg @var{symbol}, @var{expression}}
2628 @code{.reg} has the same effect as @code{.lsym}; @pxref{Lsym}.
2630 @node sect, use, reg, Machine Directives
2631 @subsection @code{.sect}
2632 This directive is ignored; it is accepted for compatibility with other assemblers.
2635 @node use, , sect, Machine Directives
2636 @subsection @code{.use @var{segment name}}
2637 Establishes the segment and subsegment for the following code;
2638 @var{segment name} may be one of @code{.text}, @code{.data},
2639 @code{.data1}, or @code{.lit}. With one of the first three @var{segment
2640 name} options, @samp{.use} is equivalent to the machine directive
2641 @var{segment name}; the remaining case, @samp{.use .lit}, is the same as
2645 @node Opcodes, , Machine Directives, Machine Dependent
2647 GNU @code{as} implements all the standard AMD 29K opcodes. No
2648 additional pseudo-instructions are needed on this family.
2650 For information on the 29K machine instruction set, see @cite{Am29000
2651 User's Manual}, Advanced Micro Devices, Inc.
2658 The 680x0 version of @code{as} has two machine dependent options.
2659 One shortens undefined references from 32 to 16 bits, while the
2660 other is used to tell @code{as} what kind of machine it is
2663 You can use the @kbd{-l} option to shorten the size of references to
2664 undefined symbols. If the @kbd{-l} option is not given, references to
2665 undefined symbols will be a full long (32 bits) wide. (Since @code{as}
2666 cannot know where these symbols will end up, @code{as} can only allocate
2667 space for the linker to fill in later. Since @code{as} doesn't know how
2668 far away these symbols will be, it allocates as much space as it can.)
2669 If this option is given, the references will only be one word wide (16
2670 bits). This may be useful if you want the object file to be as small as
2671 possible, and you know that the relevant symbols will be less than 17
2674 The 680x0 version of @code{as} is most frequently used to assemble
2675 programs for the Motorola MC68020 microprocessor. Occasionally it is
2676 used to assemble programs for the mostly similar, but slightly different
2677 MC68000 or MC68010 microprocessors. You can give @code{as} the options
2678 @samp{-m68000}, @samp{-mc68000}, @samp{-m68010}, @samp{-mc68010},
2679 @samp{-m68020}, and @samp{-mc68020} to tell it what processor is the
2684 The 680x0 version of @code{as} uses syntax similar to the Sun assembler.
2685 Size modifiers are appended directly to the end of the opcode without an
2686 intervening period. For example, write @samp{movl} rather than
2689 @c pesch@cygnus.com: Vintage Release c1.37 isn't compiled with
2692 If @code{as} is compiled with SUN_ASM_SYNTAX defined, it will also allow
2693 Sun-style local labels of the form @samp{1$} through @samp{9$}.
2696 In the following table @dfn{apc} stands for any of the address
2697 registers (@samp{a0} through @samp{a7}), nothing (@samp{}), the
2698 Program Counter (@samp{pc}), or the zero-address relative to the
2699 program counter (@samp{zpc}).
2701 The following addressing modes are understood:
2704 @samp{#@var{digits}}
2707 @samp{d0} through @samp{d7}
2709 @item Address Register
2710 @samp{a0} through @samp{a7}
2712 @item Address Register Indirect
2713 @samp{a0@@} through @samp{a7@@}
2715 @item Address Register Postincrement
2716 @samp{a0@@+} through @samp{a7@@+}
2718 @item Address Register Predecrement
2719 @samp{a0@@-} through @samp{a7@@-}
2721 @item Indirect Plus Offset
2722 @samp{@var{apc}@@(@var{digits})}
2725 @samp{@var{apc}@@(@var{digits},@var{register}:@var{size}:@var{scale})}
2726 or @samp{@var{apc}@@(@var{register}:@var{size}:@var{scale})}
2729 @samp{@var{apc}@@(@var{digits})@@(@var{digits},@var{register}:@var{size}:@var{scale})}
2730 or @samp{@var{apc}@@(@var{digits})@@(@var{register}:@var{size}:@var{scale})}
2733 @samp{@var{apc}@@(@var{digits},@var{register}:@var{size}:@var{scale})@@(@var{digits})}
2734 or @samp{@var{apc}@@(@var{register}:@var{size}:@var{scale})@@(@var{digits})}
2736 @item Memory Indirect
2737 @samp{@var{apc}@@(@var{digits})@@(@var{digits})}
2740 @samp{@var{symbol}}, or @samp{@var{digits}}
2742 @c pesch@cygnus.com: gnu, rich concur the following needs careful
2743 @c research before documenting.
2744 , or either of the above followed
2745 by @samp{:b}, @samp{:w}, or @samp{:l}.
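
The following sketch shows how several of these addressing modes might
appear in source text. The choice of @samp{movl}, the registers, and
the offsets are assumptions made purely for illustration; the operand
order is source first, destination second, and @samp{|} starts a
comment. The scaled form in the last line requires a 68020.

@example
        movl    #10,d0                  | immediate
        movl    d0,a0@@                 | address register indirect
        movl    a0@@+,d1                | address register postincrement
        movl    d1,a1@@-                | address register predecrement
        movl    a6@@(8),d2              | indirect plus offset
        movl    a0@@(4,d1:l:2),d3       | offset 4, d1 as a long, scaled by 2
@end example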
2749 @section Floating Point
2750 The floating point code is not too well tested, and may have
2753 Packed decimal (P) format floating literals are not supported.
2754 Feel free to add the code!
2756 The floating point formats generated by directives are these.
2759 @code{Single} precision floating point constants.
2761 @code{Double} precision floating point constants.
2764 There is no directive to produce regions of memory holding
2765 extended precision numbers, however they can be used as
2766 immediate operands to floating-point instructions. Adding a
2767 directive to create extended precision numbers would not be
2768 hard, but it has not yet seemed necessary.
2770 @section Machine Directives
2771 In order to be compatible with the Sun assembler the 680x0 assembler
2772 understands the following directives.
2775 This directive is identical to a @code{.data 1} directive.
2777 This directive is identical to a @code{.data 2} directive.
2779 This directive is identical to a @code{.align 1} directive.
2780 @c Is this true? does it work???
2782 This directive is identical to a @code{.space} directive.
2786 @c pesch@cygnus.com: I don't see any point in the following
2787 @c paragraph. Bugs are bugs; how does saying this
2790 Danger: Several bugs have been found in the opcode table (and
2791 fixed). More bugs may exist. Be careful when using obscure
2795 @subsection Branch Improvement
2797 Certain pseudo opcodes are permitted for branch instructions.
2798 They expand to the shortest branch instruction that will reach the
2799 target. Generally these mnemonics are made by substituting @samp{j} for
2800 @samp{b} at the start of a Motorola mnemonic.
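
As a small sketch of how these pseudo opcodes are written (the labels
@code{subr} and @code{done} are hypothetical), the programmer names only
the target and leaves the choice of encoding to @code{as}:

@example
        jbsr    subr            | assembled as bsrs, bsr, bsrl, or jsr
        jra     done            | assembled as bras, bra, bral, or jmp
@end example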
2802 The following table summarizes the pseudo-operations. A @code{*} flags
2803 cases that are more fully described after the table:
2807 +---------------------------------------------------------
2809 Pseudo-Op |BYTE WORD LONG LONG non-PC relative
2810 +---------------------------------------------------------
2811 jbsr |bsrs bsr bsrl jsr jsr
2812 jra |bras bra bral jmp jmp
2813 * jXX |bXXs bXX bXXl bNXs;jmpl bNXs;jmp
2814 * dbXX |dbXX dbXX dbXX; bra; jmpl
2815 * fjXX |fbXXw fbXXw fbXXl fbNXw;jmp
2818 NX: negative of condition XX
2821 @center{@code{*}---see full description below}
2826 These are the simplest jump pseudo-operations; they always map to one
2827 particular machine instruction, depending on the displacement to the
2831 Here, @samp{j@var{XX}} stands for an entire family of pseudo-operations,
2832 where @var{XX} is a conditional branch or condition-code test. The full
2833 list of pseudo-ops in this family is:
2835 jhi jls jcc jcs jne jeq jvc
2836 jvs jpl jmi jge jlt jgt jle
2839 For the cases of non-PC relative displacements and long displacements on
2840 the 68000 or 68010, @code{as} will issue a longer code fragment in terms of
2841 @var{NX}, the opposite condition to @var{XX}:
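
As an illustrative sketch, take @samp{jeq foo} as the source line, so
that @var{XX} is @samp{eq} and @var{NX} is @samp{ne}. The longer
fragment then has roughly this shape (the label @code{oof} is
hypothetical and stands for whatever internal label @code{as} uses):

@example
        bnes    oof
        jmp     foo
oof:
@end example

The short branch on the opposite condition skips the unconditional
jump, so control reaches @code{foo} only when the original condition
holds.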
2853 The full family of pseudo-operations covered here is
2855 dbhi dbls dbcc dbcs dbne dbeq dbvc
2856 dbvs dbpl dbmi dbge dblt dbgt dble
2860 Other than for word and byte displacements, when the source reads
2861 @samp{db@var{XX} foo}, @code{as} will emit
2870 This family includes
2872 fjne fjeq fjge fjlt fjgt fjle fjf
2873 fjt fjgl fjgle fjnge fjngl fjngle fjngt
2874 fjnle fjnlt fjoge fjogl fjogt fjole fjolt
2875 fjor fjseq fjsf fjsne fjst fjueq fjuge
2876 fjugt fjule fjult fjun
2879 For branch targets that are not PC relative, @code{as} emits
2885 when it encounters @samp{fj@var{XX} foo}.
2889 @subsection Special Characters
2890 The immediate character is @samp{#} for Sun compatibility. The
2891 line-comment character is @samp{|}. If a @samp{#} appears at the
2892 beginning of a line, it is treated as a comment unless it looks like
2893 @samp{# line file}, in which case it is treated normally.
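
The three uses described above can be seen together in this short
sketch (the instruction and the value are arbitrary):

@example
| a comment introduced by the line-comment character
        movl    #42,d0          | here # marks the immediate operand 42
# a comment, because the # begins the line
@end example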
2897 @c pesch@cygnus.com: see remarks at ignore for vax.
2901 The 32x32 version of @code{as} accepts a @kbd{-m32032} option to
2902 specify that it is assembling for a 32032 processor, or a
2903 @kbd{-m32532} option to specify that it is assembling for a 32532 processor.
2904 The default (if neither is specified) is chosen when the assembler
2908 I don't know anything about the 32x32 syntax assembled by
2909 @code{as}. Someone who understands the processor (I've never seen
2910 one) and the possible syntaxes should write this section.
2912 @subsection Floating Point
2913 The 32x32 uses IEEE floating point numbers, but @code{as} will only
2914 create single or double precision values. I don't know if the 32x32
2915 understands extended precision numbers.
2917 @subsection Machine Directives
2918 The 32x32 has no machine dependent directives.
2922 The Sparc has no machine dependent options.
2925 I don't know anything about Sparc syntax. Someone who does
2926 will have to write this section.
2928 @subsection Floating Point
2929 The Sparc uses IEEE floating-point numbers.
2931 @subsection Machine Directives
2932 The Sparc version of @code{as} supports the following additional
2937 This must be followed by a symbol name, a positive number, and
2938 @code{"bss"}. This behaves somewhat like @code{.comm}, but the
2939 syntax is different.
2942 This is functionally identical to @code{.globl}.
2945 This is functionally identical to @code{.short}.
2948 This directive is ignored. Any text following it on the same
2949 line is also ignored.
2952 This must be followed by a symbol name, a positive number, and
2953 @code{"bss"}. This behaves somewhat like @code{.lcomm}, but the
2954 syntax is different.
2957 This must be followed by @code{"text"}, @code{"data"}, or
2958 @code{"data1"}. It behaves like @code{.text}, @code{.data}, or
2962 This is functionally identical to the @code{.space} directive.
2965 On the Sparc, the @code{.word} directive produces 32 bit values,
2966 instead of the 16 bit values it produces on every other machine.
2970 @section Intel 80386
2972 The 80386 has no machine dependent options.
2974 @subsection AT&T Syntax versus Intel Syntax
2975 In order to maintain compatibility with the output of @code{GCC},
2976 @code{as} supports AT&T System V/386 assembler syntax. This is quite
2977 different from Intel syntax. We mention these differences because
2978 almost all 80386 documents use only Intel syntax. Notable differences
2979 between the two syntaxes are:
2982 AT&T immediate operands are preceded by @samp{$}; Intel immediate
2983 operands are undelimited (Intel @samp{push 4} is AT&T @samp{pushl $4}).
2984 AT&T register operands are preceded by @samp{%}; Intel register operands
2985 are undelimited. AT&T absolute (as opposed to PC relative) jump/call
2986 operands are prefixed by @samp{*}; they are undelimited in Intel syntax.
2989 AT&T and Intel syntax use the opposite order for source and destination
2990 operands. Intel @samp{add eax, 4} is @samp{addl $4, %eax}. The
2991 @samp{source, dest} convention is maintained for compatibility with
2992 previous Unix assemblers.
2995 In AT&T syntax the size of memory operands is determined from the last
2996 character of the opcode name. Opcode suffixes of @samp{b}, @samp{w},
2997 and @samp{l} specify byte (8-bit), word (16-bit), and long (32-bit)
2998 memory references. Intel syntax accomplishes this by prefixing memory
2999 operands (@emph{not} the opcodes themselves) with @samp{byte ptr},
3000 @samp{word ptr}, and @samp{dword ptr}. Thus, Intel @samp{mov al, byte
3001 ptr @var{foo}} is @samp{movb @var{foo}, %al} in AT&T syntax.
3004 Immediate form long jumps and calls are
3005 @samp{lcall/ljmp $@var{segment}, $@var{offset}} in AT&T syntax; the
3007 @samp{call/jmp far @var{segment}:@var{offset}}. Also, the far return
3009 is @samp{lret $@var{stack-adjust}} in AT&T syntax; Intel syntax is
3010 @samp{ret far @var{stack-adjust}}.
3013 The AT&T assembler does not provide support for multiple segment
3014 programs. Unix style systems expect all programs to be single segments.
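
The differences listed above can be seen side by side in the following
sketch. Each line is AT&T syntax as accepted by @code{as}, with the
Intel equivalent given in the comment; the symbol @code{foo} and the
constants are hypothetical.

@example
        pushl   $4              # Intel: push 4
        addl    $4, %eax        # Intel: add eax, 4
        movb    foo, %al        # Intel: mov al, byte ptr foo
        jmp     *%eax           # absolute jump; Intel: jmp eax
        lret    $8              # Intel: ret far 8
@end example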
3017 @subsection Opcode Naming
3018 Opcode names are suffixed with one character modifiers which specify the
3019 size of operands. The letters @samp{b}, @samp{w}, and @samp{l} specify
3020 byte, word, and long operands. If no suffix is specified by an
3021 instruction and it contains no memory operands then @code{as} tries to
3022 fill in the missing suffix based on the destination register operand
3023 (the last one by convention). Thus, @samp{mov %ax, %bx} is equivalent
3024 to @samp{movw %ax, %bx}; also, @samp{mov $1, %bx} is equivalent to
3025 @samp{movw $1, %bx}. Note that this is incompatible with the AT&T Unix
3026 assembler which assumes that a missing opcode suffix implies long
3027 operand size. (This incompatibility does not affect compiler output
3028 since compilers always explicitly specify the opcode suffix.)
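
A short sketch of this rule follows (the symbol @code{foo} is
hypothetical). In the first two lines the destination register
supplies the missing suffix; in the third the destination is a memory
operand, so the suffix is written explicitly.

@example
        mov     %ax, %bx        # assembled as movw %ax, %bx
        mov     $1, %bx         # assembled as movw $1, %bx
        movl    $1, foo         # memory operand: suffix given explicitly
@end example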
3030 Almost all opcodes have the same names in AT&T and Intel format. There
3031 are a few exceptions. The sign extend and zero extend instructions need
3032 two sizes to specify them. They need a size to sign/zero extend
3033 @emph{from} and a size to sign/zero extend @emph{to}. This is accomplished
3034 by using two opcode suffixes in AT&T syntax. Base names for sign extend
3035 and zero extend are @samp{movs@dots{}} and @samp{movz@dots{}} in AT&T
3036 syntax (@samp{movsx} and @samp{movzx} in Intel syntax). The opcode
3037 suffixes are tacked on to this base name, the @emph{from} suffix before
3038 the @emph{to} suffix. Thus, @samp{movsbl %al, %edx} is AT&T syntax for
3039 ``move sign extend @emph{from} %al @emph{to} %edx.'' Possible suffixes,
3040 thus, are @samp{bl} (from byte to long), @samp{bw} (from byte to word),
3041 and @samp{wl} (from word to long).
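
For example, the three suffix pairs might be used as follows (the
register choices are arbitrary):

@example
        movsbl  %al, %edx       # sign-extend byte %al into long %edx
        movsbw  %al, %dx        # sign-extend byte %al into word %dx
        movzwl  %ax, %ecx       # zero-extend word %ax into long %ecx
@end example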
3043 The Intel syntax conversion instructions
3046 @samp{cbw} --- sign-extend byte in @samp{%al} to word in @samp{%ax},
3048 @samp{cwde} --- sign-extend word in @samp{%ax} to long in @samp{%eax},
3050 @samp{cwd} --- sign-extend word in @samp{%ax} to long in @samp{%dx:%ax},
3052 @samp{cdq} --- sign-extend dword in @samp{%eax} to quad in @samp{%edx:%eax},
3054 are called @samp{cbtw}, @samp{cwtl}, @samp{cwtd}, and @samp{cltd} in
3055 AT&T naming. @code{as} accepts either naming for these instructions.
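
The following sketch traces the AT&T names through a small example,
starting from the byte value @minus{}1 in @samp{%al}; the comments show
the register contents after each instruction.

@example
        movb    $-1, %al        # %al = 0xff
        cbtw                    # %ax = 0xffff
        cwtl                    # %eax = 0xffffffff
        cltd                    # %edx = 0xffffffff, %eax = 0xffffffff
@end example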
3057 Far call/jump instructions are @samp{lcall} and @samp{ljmp} in
3058 AT&T syntax, but are @samp{call far} and @samp{jmp far} in Intel
3061 @subsection Register Naming
3062 Register operands are always prefixed with @samp{%}. The 80386 registers
3066 the 8 32-bit registers @samp{%eax} (the accumulator), @samp{%ebx},
3067 @samp{%ecx}, @samp{%edx}, @samp{%edi}, @samp{%esi}, @samp{%ebp} (the
3068 frame pointer), and @samp{%esp} (the stack pointer).
3071 the 8 16-bit low-ends of these: @samp{%ax}, @samp{%bx}, @samp{%cx},
3072 @samp{%dx}, @samp{%di}, @samp{%si}, @samp{%bp}, and @samp{%sp}.
3075 the 8 8-bit registers: @samp{%ah}, @samp{%al}, @samp{%bh},
3076 @samp{%bl}, @samp{%ch}, @samp{%cl}, @samp{%dh}, and @samp{%dl} (These
3077 are the high-bytes and low-bytes of @samp{%ax}, @samp{%bx},
3078 @samp{%cx}, and @samp{%dx})
3081 the 6 segment registers @samp{%cs} (code segment), @samp{%ds}
3082 (data segment), @samp{%ss} (stack segment), @samp{%es}, @samp{%fs}, and @samp{%gs}.
3086 the 3 processor control registers @samp{%cr0}, @samp{%cr2}, and
3087 @samp{%cr3}.
3090 the 6 debug registers @samp{%db0}, @samp{%db1}, @samp{%db2},
3091 @samp{%db3}, @samp{%db6}, and @samp{%db7}.
3094 the 2 test registers @samp{%tr6} and @samp{%tr7}.
3097 the 8 floating point stack registers @samp{%st} or equivalently
3098 @samp{%st(0)}, @samp{%st(1)}, @samp{%st(2)}, @samp{%st(3)},
3099 @samp{%st(4)}, @samp{%st(5)}, @samp{%st(6)}, and @samp{%st(7)}.
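
To illustrate how the 32-, 16-, and 8-bit names overlap, this sketch
loads a constant into @samp{%eax} and copies parts of it elsewhere (the
constant and the destination registers are arbitrary):

@example
        movl    $0x12345678, %eax
        movw    %ax, %bx        # %bx = 0x5678, the low 16 bits of %eax
        movb    %ah, %cl        # %cl = 0x56, the high byte of %ax
        movb    %al, %ch        # %ch = 0x78, the low byte of %ax
@end example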
3102 @subsection Opcode Prefixes
3103 Opcode prefixes are used to modify the following opcode. They are used
3104 to repeat string instructions, to provide segment overrides, to perform
3105 bus lock operations, and to give operand and address size (16-bit
3106 operands are specified in an instruction by prefixing what would
3107 normally be 32-bit operands with an ``operand size'' opcode prefix).
3108 Opcode prefixes are usually given as single-line instructions with no
3109 operands, and must directly precede the instruction they act upon. For
3110 example, the @samp{scas} (scan string) instruction is repeated with:
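
A sketch of the form just described, with the prefix on its own line
directly before the instruction it modifies (the choice of
@samp{repne} and the @samp{b} size suffix are assumptions of this
example):

@example
        repne
        scasb
@end example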
3116 Here is a list of opcode prefixes:
3119 Segment override prefixes @samp{cs}, @samp{ds}, @samp{ss}, @samp{es},
3120 @samp{fs}, @samp{gs}. These are automatically added by specifying
3121 the @var{segment}:@var{memory-operand} form for memory references.
3124 Operand/Address size prefixes @samp{data16} and @samp{addr16}
3125 change 32-bit operands/addresses into 16-bit operands/addresses. Note
3126 that 16-bit addressing modes (i.e. 8086 and 80286 addressing modes)
3127 are not supported (yet).
3130 The bus lock prefix @samp{lock} inhibits interrupts during
3131 execution of the instruction it precedes. (This is only valid with
3132 certain instructions; see an 80386 manual for details).
3135 The wait for coprocessor prefix @samp{wait} waits for the
3136 coprocessor to complete the current instruction. This should never be
3137 needed for the 80386/80387 combination.
3140 The @samp{rep}, @samp{repe}, and @samp{repne} prefixes are added
3141 to string instructions to make them repeat @samp{%ecx} times.
3144 @subsection Memory References
3145 An Intel syntax indirect memory reference of the form
3147 @var{segment}:[@var{base} + @var{index}*@var{scale} + @var{disp}]
3149 is translated into the AT&T syntax
3151 @var{segment}:@var{disp}(@var{base}, @var{index}, @var{scale})
3153 where @var{base} and @var{index} are the optional 32-bit base and
3154 index registers, @var{disp} is the optional displacement, and
3155 @var{scale}, taking the values 1, 2, 4, and 8, multiplies @var{index}
3156 to calculate the address of the operand. If no @var{scale} is
3157 specified, @var{scale} is taken to be 1. @var{segment} specifies the
3158 optional segment register for the memory operand, and may override the
3159 default segment register (see an 80386 manual for segment register
3160 defaults). Note that segment overrides in AT&T syntax @emph{must}
3161 be preceded by a @samp{%}. If you specify a segment override which
3162 coincides with the default segment register, @code{as} will @emph{not}
3163 output any segment register override prefixes to assemble the given
3164 instruction. Thus, segment overrides can be specified to emphasize which
3165 segment register is used for a given memory operand.
3167 Here are some examples of Intel and AT&T style memory references:
3170 @item AT&T: @samp{-4(%ebp)}, Intel: @samp{[ebp - 4]}
3171 @var{base} is @samp{%ebp}; @var{disp} is @samp{-4}. @var{segment} is
3172 missing, and the default segment is used (@samp{%ss} for addressing with
3173 @samp{%ebp} as the base register). @var{index}, @var{scale} are both missing.
3175 @item AT&T: @samp{foo(,%eax,4)}, Intel: @samp{[foo + eax*4]}
3176 @var{index} is @samp{%eax} (scaled by a @var{scale} 4); @var{disp} is
3177 @samp{foo}. All other fields are missing. The segment register here
3178 defaults to @samp{%ds}.
3180 @item AT&T: @samp{foo(,1)}; Intel @samp{[foo]}
3181 This uses the value pointed to by @samp{foo} as a memory operand.
3182 Note that @var{base} and @var{index} are both missing, but there is only
3183 @emph{one} @samp{,}. This is a syntactic exception.
3185 @item AT&T: @samp{%gs:foo}; Intel @samp{gs:foo}
3186 This selects the contents of the variable @samp{foo} with segment
3187 register @var{segment} being @samp{%gs}.
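
Placed in complete instructions, memory references like these might
read as follows. The @samp{movl} instructions, the destination
registers, and the third line (which combines a base, an index, a
scale, and a displacement) are additions made for this sketch;
@code{foo} is a hypothetical symbol.

@example
        movl    -4(%ebp), %eax          # Intel: mov eax, [ebp - 4]
        movl    foo(,%eax,4), %ebx      # Intel: mov ebx, [foo + eax*4]
        movl    8(%ebx,%esi,2), %ecx    # Intel: mov ecx, [ebx + esi*2 + 8]
        movl    %gs:foo, %edx           # Intel: mov edx, gs:foo
@end example

In each case the operand address is @var{base} + @var{index} *
@var{scale} + @var{disp}, with any missing field contributing nothing.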
3191 Absolute (as opposed to PC relative) call and jump operands must be
3192 prefixed with @samp{*}. If no @samp{*} is specified, @code{as} will
3193 always choose PC relative addressing for jump/call labels.
3195 Any instruction that has a memory operand @emph{must} specify its size (byte,
3196 word, or long) with an opcode suffix (@samp{b}, @samp{w}, or @samp{l},
3199 @subsection Handling of Jump Instructions
3200 Jump instructions are always optimized to use the smallest possible
3201 displacements. This is accomplished by using byte (8-bit) displacement
3202 jumps whenever the target is sufficiently close. If a byte displacement
3203 is insufficient a long (32-bit) displacement is used. We do not support
3204 word (16-bit) displacement jumps (i.e. prefixing the jump instruction
3205 with the @samp{addr16} opcode prefix), since the 80386 insists upon masking
3206 @samp{%eip} to 16 bits after the word displacement is added.
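
The effect can be sketched with two jumps in one file (the labels
@code{close_by} and @code{far_off} and the 200-byte gap are
hypothetical). A byte displacement reaches at most 127 bytes forward,
so the first jump fits and the second does not:

@example
        jmp     close_by        # byte displacement suffices
close_by:
        jmp     far_off         # target is 200 bytes ahead: long displacement
        .space  200
far_off:
@end example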
3208 Note that the @samp{jcxz}, @samp{jecxz}, @samp{loop}, @samp{loopz},
3209 @samp{loope}, @samp{loopnz} and @samp{loopne} instructions only come in
3210 byte displacements, so that it is possible that use of these
3211 instructions (@code{GCC} does not use them) will cause the assembler to
3212 print an error message (and generate incorrect code). The AT&T 80386
3213 assembler tries to get around this problem by expanding @samp{jcxz foo} to
3221 @subsection Floating Point
3222 All 80387 floating point types except packed BCD are supported.
3223 (BCD support may be added without much difficulty). These data
3224 types are 16-, 32-, and 64- bit integers, and single (32-bit),
3225 double (64-bit), and extended (80-bit) precision floating point.
3226 Each supported type has an opcode suffix and a constructor
3227 associated with it. Opcode suffixes specify operand's data
3228 types. Constructors build these data types into memory.
3232 Floating point constructors are @samp{.float} or @samp{.single},
3233 @samp{.double}, and @samp{.tfloat} for 32-, 64-, and 80-bit formats.
3234 These correspond to opcode suffixes @samp{s}, @samp{l}, and @samp{t}.
3235 @samp{t} stands for temporary real; note that the 80387 only supports
3236 this format via the @samp{fldt} (load temporary real to stack top) and
3237 @samp{fstpt} (store temporary real and pop stack) instructions.
3240 Integer constructors are @samp{.word}, @samp{.long} or @samp{.int}, and
3241 @samp{.quad} for the 16-, 32-, and 64-bit integer formats. The corresponding
3242 opcode suffixes are @samp{s} (single), @samp{l} (long), and @samp{q}
3243 (quad). As with the temporary real format the 64-bit @samp{q} format is
3244 only present in the @samp{fildq} (load quad integer to stack top) and
3245 @samp{fistpq} (store quad integer and pop stack) instructions.
3248 Register to register operations do not require opcode suffixes,
3249 so that @samp{fst %st, %st(1)} is equivalent to @samp{fstl %st, %st(1)}.
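
The pairing of constructors and opcode suffixes can be sketched as
follows. The labels, the values, and the particular instructions are
assumptions chosen for illustration only.

@example
        .data
x:      .single 1.5
y:      .double 2.5
z:      .tfloat 3.5
n:      .quad   7
        .text
        flds    x               # push the 32-bit single at x
        faddl   y               # add the 64-bit double at y
        fldt    z               # push the 80-bit temporary real at z
        fildq   n               # push the 64-bit integer at n
        fstp    %st(1)          # register operand: no suffix needed
@end example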
3251 Since the 80387 automatically synchronizes with the 80386, @samp{fwait}
3252 instructions are almost never needed (this is not the case for the
3253 80286/80287 and 8086/8087 combinations). Therefore, @code{as} suppresses
3254 the @samp{fwait} instruction whenever it is implicitly selected by one
3255 of the @samp{fn@dots{}} instructions. For example, @samp{fsave} and
3256 @samp{fnsave} are treated identically. In general, all the @samp{fn@dots{}}
3257 instructions are made equivalent to @samp{f@dots{}} instructions. If
3258 @samp{fwait} is desired it must be explicitly coded.
3261 There is some trickery concerning the @samp{mul} and @samp{imul}
3262 instructions that deserves mention. The 16-, 32-, and 64-bit expanding
3263 multiplies (base opcode @samp{0xf6}; extension 4 for @samp{mul} and 5
3264 for @samp{imul}) can be output only in the one operand form. Thus,
3265 @samp{imul %ebx, %eax} does @emph{not} select the expanding multiply;
3266 the expanding multiply would clobber the @samp{%edx} register, and this
3267 would confuse @code{GCC} output. Use @samp{imul %ebx} to get the
3268 64-bit product in @samp{%edx:%eax}.
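
The two forms contrast as follows (the @samp{l} suffix is written out
here for clarity):

@example
        imull   %ebx            # expanding: %edx:%eax = %eax * %ebx
        imull   %ebx, %eax      # %eax = %eax * %ebx; %edx is left alone
@end example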
3270 We have added a two operand form of @samp{imul} when the first operand
3271 is an immediate mode expression and the second operand is a register.
3272 This is just a shorthand, so that, multiplying @samp{%eax} by 69, for
3273 example, can be done with @samp{imul $69, %eax} rather than @samp{imul
3276 @c pesch@cygnus.com: we also ignore the following chapters, but for
3277 @c a different reason---internals are changing
3278 @c rapidly. These may need to be moved to another
3279 @c book anyhow, if we adopt the model of user/modifier
3282 @node Maintenance, Retargeting, Machine Dependent, Top
3283 @chapter Maintaining the Assembler
3284 [[this chapter is still being built]]
3287 We had these goals, in descending priority:
3290 For every program composed by a compiler, @code{as} should emit
3291 ``correct'' code. This leaves some latitude in choosing addressing
3292 modes, order of @code{relocation_info} structures in the object
3295 @item Speed, for usual case.
3296 By far the most common use of @code{as} will be assembling compiler
3299 @item Upward compatibility for existing assembler code.
3300 Well @dots{} we don't support Vax bit fields but everything else
3301 seems to be upward compatible.
3304 The code should be maintainable with few surprises. (JF: ha!)
3308 We assumed that disk I/O was slow and expensive while memory was
3309 fast and access to memory was cheap. We expect the in-memory data
3310 structures to be less than 10 times the size of the emitted object
3311 file. (Contrast this with the C compiler where in-memory structures
3312 might be 100 times object file size!)
3316 Try to read the source file from disk only one time. For other
3317 reasons, we keep large chunks of the source file in memory during
3318 assembly so this is not a problem. Also the assembly algorithm
3319 should only scan the source text once if the compiler composed the
3320 text according to a few simple rules.
3322 Emit the object code bytes only once. Don't store values and then
3325 Build the object file in memory and do direct writes to disk of
3329 RMS suggested a one-pass algorithm which seems to work well. By not
3330 parsing text during a second pass considerable time is saved on
3331 large programs (@emph{e.g.} the sort of C program @code{yacc} would
3334 It happened that the data structures needed to emit relocation
3335 information to the object file were neatly subsumed into the data
3336 structures that do backpatching of addresses after pass 1.
3338 Many of the functions began life as re-usable modules, loosely
3339 connected. RMS changed this to gain speed. For example, input
3340 parsing routines which used to work on pre-sanitized strings now
3341 must parse raw data. Hence they have to import knowledge of the
3342 assembler's comment conventions @emph{etc}.
3344 @section Deprecated Feature(?)s
3345 We have stopped supporting some features:
3348 @code{.org} statements must have @b{defined} expressions.
3350 Vax Bit fields (@kbd{:} operator) are entirely unsupported.
3353 It might be a good idea to not support these features in a future release:
3356 @kbd{#} should begin a comment, even in column 1.
3358 Why support the logical line & file concept any more?
3360 Subsegments are a good candidate for flushing.
3361 Depends on which compilers need them I guess.
3364 @section Bugs, Ideas, Further Work
3365 Clearly the major improvement is DON'T USE A TEXT-READING
3366 ASSEMBLER for the back end of a compiler. It is much faster to
3367 interpret binary gobbledygook from a compiler's tables than to
3368 ask the compiler to write out human-readable code just so the
3369 assembler can parse it back to binary.
3371 Assuming you use @code{as} for human written programs: here are
3375 Document (here) @code{APP}.
3377 Take advantage of knowing no spaces except after opcode
3378 to speed up @code{as}. (Modify @code{app.c} to flush useless spaces:
3379 only keep space/tabs at begin of line or between 2
3382 Put pointers in this documentation to @file{a.out} documentation.
3384 Split the assembler into parts so it can gobble direct binary
3385 from @emph{e.g.} @code{cc}. It is silly for @code{cc} to compose text
3386 just so @code{as} can parse it back to binary.
3388 Rewrite hash functions: I want a more modular, faster library.
3390 Clean up LOTS of code.
3392 Include all the non-@file{.c} files in the maintenance chapter.
3396 Implement flonum short literals.
3398 Change all talk of expression operands to expression quantities,
3399 or perhaps to expression arguments.
3403 Whenever a @code{.text} or @code{.data} statement is seen, we close
3404 off the current frag with an imaginary @code{.fill 0}. This is
3405 because we only have one obstack for frags, and we can't grow new
3406 frags for a new subsegment, then go back to the old subsegment and
3407 append bytes to the old frag. All this nonsense goes away if we
3408 give each subsegment its own obstack. It makes code simpler in
3409 about 10 places, but nobody has bothered to do it because C compiler
3410 output rarely changes subsegments (compared to ending frags with
3411 relaxable addresses, which is common).
3415 @c The following files in the @file{as} directory
3416 @c are symbolic links to other files, of
3417 @c the same name, in a different directory.
3420 @c @file{atof_generic.c}
3422 @c @file{atof_vax.c}
3424 @c @file{flonum_const.c}
3426 @c @file{flonum_copy.c}
3428 @c @file{flonum_get.c}
3430 @c @file{flonum_multip.c}
3432 @c @file{flonum_normal.c}
3434 @c @file{flonum_print.c}
3437 Here is a list of the source files in the @file{as} directory.
3441 This contains the pre-processing phase, which deletes comments,
3442 handles whitespace, etc. This was recently re-written, since app
3443 used to be a separate program, but RMS wanted it to be inline.
3446 This is a subroutine to append a string to another string returning a
3447 pointer just after the last @code{char} appended. (JF: All these
3448 little routines should probably all be put in one file.)
3451 Here you will find the main program of the assembler @code{as}.
3454 This is a branch office of @file{read.c}. This understands
3455 expressions, arguments. Inside @code{as}, arguments are called
3456 (expression) @emph{operands}. This is confusing, because we also talk
3457 (elsewhere) about instruction @emph{operands}. Also, expression
3458 operands are called @emph{quantities} explicitly to avoid confusion
3459 with instruction operands. What a mess.
3462 This implements the @b{frag} concept. Without frags, finding the
3463 right size for branch instructions would be a lot harder.
3466 This contains the symbol table, opcode table @emph{etc.} hashing
3470 This is a table of values of digits, for use in atoi() type
3471 functions. Could probably be flushed by using calls to strtol(), or
3475 This contains Operating system dependent source file reading
3476 routines. Since error messages often say where we are in reading
3477 the source file, they live here too. Since @code{as} is intended to
3478 run under GNU and Unix only, this might be worth flushing. Anyway,
3479 almost all C compilers support stdio.
3482 This deals with calling the pre-processor (if needed) and feeding the
3483 chunks back to the rest of the assembler the right way.
3486 This contains operating system independent parts of fatal and
3487 warning message reporting. See @file{append.c} above.
3490 This contains operating system dependent functions that write an
3491 object file for @code{as}. See @file{input-file.c} above.
3494 This implements all the directives of @code{as}. This also deals
3495 with passing input lines to the machine dependent part of the
3499 This is a C library function that isn't in most C libraries yet.
3500 See @file{append.c} above.
3503 This implements subsegments.
3506 This implements symbols.
3509 This contains the code to perform relaxation, and to write out
3510 the object file. It is mostly operating system independent, but
3511 different OSes have different object file formats in any case.
3514 This implements @code{malloc()} or bust. See @file{append.c} above.
3517 This implements @code{realloc()} or bust. See @file{append.c} above.
3519 @item atof-generic.c
3520 The following files were taken from a machine-independent subroutine
3521 library for manipulating floating point numbers and very large
3524 @file{atof-generic.c} turns a string into a flonum internal format
3525 floating-point number.
3527 @item flonum-const.c
3528 This contains some potentially useful floating point numbers in
3532 This copies a flonum.
3534 @item flonum-multip.c
3535 This multiplies two flonums together.
3538 This copies a bignum.
3542 Here is a table of all the machine-specific files (this includes
3543 both source and header files). Typically, there is a
3544 @var{machine}.c file, a @var{machine}-opcode.h file, and an
3545 atof-@var{machine}.c file. The @var{machine}-opcode.h file should
3546 be identical to the one used by GDB (which uses it for disassembly.)
This contains code to turn a flonum into an IEEE literal constant.
This is used by the 680x0, 32x32, sparc, and i386 versions of @code{as}.
3555 This is the opcode-table for the i386 version of the assembler.
3558 This contains all the code for the i386 version of the assembler.
3561 This defines constants and macros used by the i386 version of the assembler.
3564 generic 68020 header file. To be linked to m68k.h on a
3565 non-sun3, non-hpux system.
3568 68010 header file for Sun2 workstations. Not well tested. To be linked
3569 to m68k.h on a sun2. (See also @samp{-DSUN_ASM_SYNTAX} in the
3573 68020 header file for Sun3 workstations. To be linked to m68k.h before
3574 compiling on a Sun3 system. (See also @samp{-DSUN_ASM_SYNTAX} in the
3578 68020 header file for a HPUX (system 5?) box. Which box, which
3579 version of HPUX, etc? I don't know.
A hard or symbolic link to one of @file{m-generic.h},
3583 @file{m-hpux.h} or @file{m-sun3.h} depending on which kind of
3584 680x0 you are assembling for. (See also @samp{-DSUN_ASM_SYNTAX} in the
3588 Opcode table for 68020. This is now a link to the opcode table
3589 in the @code{GDB} source directory.
3592 All the mc680x0 code, in one huge, slow-to-compile file.
3595 This contains the code for the ns32032/ns32532 version of the
3598 @item ns32k-opcode.h
3599 This contains the opcode table for the ns32032/ns32532 version
3603 Vax specific file for describing Vax operands and other Vax-ish things.
3609 Vax specific parts of @code{as}. Also includes the former files
3610 @file{vax-ins-parse.c}, @file{vax-reg-parse.c} and @file{vip-op.c}.
3613 Turns a flonum into a Vax constant.
3616 This file contains the special code needed to put out a VMS
3617 style object file for the Vax.
3621 Here is a list of the header files in the source directory.
3622 (Warning: This section may not be very accurate. I didn't
3623 write the header files; I just report them.) Also note that I
3624 think many of these header files could be cleaned up or
3630 This describes the structures used to create the binary header data
3631 inside the object file. Perhaps we should use the one in
3632 @file{/usr/include}?
3635 This defines all the globally useful things, and pulls in <stdio.h>
3639 This defines macros useful for dealing with bignums.
3642 Structure and macros for dealing with expression()
3645 This defines the structure for dealing with floating point
3646 numbers. It #includes @file{bignum.h}.
This contains a macro for appending a byte to the current frag.
3652 Structures and function definitions for the hashing functions.
3655 Function headers for the input-file.c functions.
3658 structures and function headers for things defined in the
3659 machine dependent part of the assembler.
3662 This is the GNU systemwide include file for manipulating obstacks.
3663 Since nobody is running under real GNU yet, we include this file.
3666 Macros and function headers for reading in source files.
3668 @item struct-symbol.h
3669 Structure definition and macros for dealing with the gas
3670 internal form of a symbol.
3673 structure definition for dealing with the numbered subsegments
3674 of the text and data segments.
3677 Macros and function headers for dealing with symbols.
3680 Structure for doing segment fixups.
3683 @comment ~subsection Test Directory
3684 @comment (Note: The test directory seems to have disappeared somewhere
3685 @comment along the line. If you want it, you'll probably have to find a
3686 @comment REALLY OLD dump tape~dots{})
3688 @comment The ~file{test/} directory is used for regression testing.
3689 @comment After you modify ~@code{as}, you can get a quick go/nogo
3690 @comment confidence test by running the new ~@code{as} over the source
3691 @comment files in this directory. You use a shell script ~file{test/do}.
3693 @comment The tests in this suite are evolving. They are not comprehensive.
3694 @comment They have, however, caught hundreds of bugs early in the debugging
3695 @comment cycle of ~@code{as}. Most test statements in this suite were naturally
3696 @comment selected: they were used to demonstrate actual ~@code{as} bugs rather
3697 @comment than being written ~i{a prioi}.
3699 @comment Another testing suggestion: over 30 bugs have been found simply by
3700 @comment running examples from this manual through ~@code{as}.
3701 @comment Some examples in this manual are selected
3702 @comment to distinguish boundary conditions; they are good for testing ~@code{as}.
3704 @comment ~subsubsection Regression Testing
3705 @comment Each regression test involves assembling a file and comparing the
3706 @comment actual output of ~@code{as} to ``known good'' output files. Both
3707 @comment the object file and the error/warning message file (stderr) are
3708 @comment inspected. Optionally ~@code{as}' exit status may be checked.
3709 @comment Discrepencies are reported. Each discrepency means either that
3710 @comment you broke some part of ~@code{as} or that the ``known good'' files
3711 @comment are now out of date and should be changed to reflect the new
3712 @comment definition of ``good''.
3714 @comment Each regression test lives in its own directory, in a tree
3715 @comment rooted in the directory ~file{test/}. Each such directory
3716 @comment has a name ending in ~file{.ret}, where `ret' stands for
3717 @comment REgression Test. The ~file{.ret} ending allows ~code{find
3718 @comment (1)} to find all regression tests in the tree, without
3719 @comment needing to list them explicitly.
3721 @comment Any ~file{.ret} directory must contain a file called
3722 @comment ~file{input} which is the source file to assemble. During
3723 @comment testing an object file ~file{output} is created, as well as
3724 @comment a file ~file{stdouterr} which contains the output to both
3725 @comment stderr and stderr. If there is a file ~file{output.good} in
3726 @comment the directory, and if ~file{output} contains exactly the
3727 @comment same data as ~file{output.good}, the file ~file{output} is
3728 @comment deleted. Likewise ~file{stdouterr} is removed if it exactly
3729 @comment matches a file ~file{stdouterr.good}. If file
3730 @comment ~file{status.good} is present, containing a decimal number
3731 @comment before a newline, the exit status of ~@code{as} is compared
3732 @comment to this number. If the status numbers are not equal, a file
3733 @comment ~file{status} is written to the directory, containing the
3734 @comment actual status as a decimal number followed by newline.
3736 @comment Should any of the ~file{*.good} files fail to match their corresponding
3737 @comment actual files, this is noted by a 1-line message on the screen during
3738 @comment the regression test, and you can use ~@code{find (1)} to find any
3739 @comment files named ~file{status}, ~file {output} or ~file{stdouterr}.
3741 @node Retargeting, License, Maintenance, Top
3742 @chapter Teaching the Assembler about a New Machine
3744 This chapter describes the steps required in order to make the
3745 assembler work with another machine's assembly language. This
3746 chapter is not complete, and only describes the steps in the
3747 broadest terms. You should look at the source for the
3748 currently supported machine in order to discover some of the
3749 details that aren't mentioned here.
3751 You should create a new file called @file{@var{machine}.c}, and
3752 add the appropriate lines to the file @file{Makefile} so that
3753 you can compile your new version of the assembler. This should
be straightforward; simply add lines similar to the ones there
3755 for the four current versions of the assembler.
If you want to be compatible with GDB (and the current
3758 machine-dependent versions of the assembler), you should create
3759 a file called @file{@var{machine}-opcode.h} which should
3760 contain all the information about the names of the machine
3761 instructions, their opcodes, and what addressing modes they
3762 support. If you do this right, the assembler and GDB can share
3763 this file, and you'll only have to write it once. Note that
3764 while you're writing @code{as}, you may want to use an
independent program (if you have access to one) to make sure
3766 that @code{as} is emitting the correct bytes. Since @code{as}
3767 and @code{GDB} share the opcode table, an incorrect opcode
3768 table entry may make invalid bytes look OK when you disassemble
3769 them with @code{GDB}.
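
As a rough illustration of the kind of information such a table carries,
here is a minimal sketch in C. Everything in it is hypothetical: the
@code{toy_opcode} structure, the @code{TOY_MODE_} flags, the three
instructions and the @code{toy_lookup} helper are invented for this
example and are not taken from any existing @file{@var{machine}-opcode.h}.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical addressing-mode flags; a real table defines its own. */
#define TOY_MODE_REG 0x1 /* register operand allowed */
#define TOY_MODE_IMM 0x2 /* immediate operand allowed */

/* Hypothetical table entry: mnemonic, opcode byte, permitted modes. */
struct toy_opcode {
  const char *name;
  unsigned char opcode;
  unsigned int modes;
};

static const struct toy_opcode toy_opcodes[] = {
  { "nop", 0x00, 0 },
  { "add", 0x10, TOY_MODE_REG | TOY_MODE_IMM },
  { "jmp", 0x20, TOY_MODE_IMM },
};

/* Linear lookup by mnemonic; returns NULL if the name is unknown. */
static const struct toy_opcode *toy_lookup(const char *name)
{
  size_t i;

  assert(name != NULL);
  for (i = 0; i < sizeof toy_opcodes / sizeof toy_opcodes[0]; i++)
    if (strcmp(toy_opcodes[i].name, name) == 0)
      return &toy_opcodes[i];
  return NULL;
}
```

An assembler would search such a table by name, while a disassembler
would search the same entries by opcode value. That is why a wrong
entry can go unnoticed when both tools share the table.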
3771 @section Functions You will Have to Write
3773 Your file @file{@var{machine}.c} should contain definitions for
3774 the following functions and variables. It will need to include
3775 some header files in order to use some of the structures
3776 defined in the machine-independent part of the assembler. The
3777 needed header files are mentioned in the descriptions of the
3778 functions that will need them.
3783 This long integer holds the value to place at the beginning of
3784 the @file{a.out} file. It is usually @samp{OMAGIC}, except on
3785 machines that store additional information in the magic-number.
3787 @item char comment_chars[];
3788 This character array holds the values of the characters that
3789 start a comment anywhere in a line. Comments are stripped off
3790 automatically by the machine independent part of the
3791 assembler. Note that the @samp{/*} will always start a
3792 comment, and that only @samp{*/} will end a comment started by
3795 @item char line_comment_chars[];
3796 This character array holds the values of the chars that start a
3797 comment only if they are the first (non-whitespace) character
3798 on a line. If the character @samp{#} does not appear in this
3799 list, you may get unexpected results. (Various
3800 machine-independent parts of the assembler treat the comments
3801 @samp{#APP} and @samp{#NO_APP} specially, and assume that lines
3802 that start with @samp{#} are comments.)
3804 @item char EXP_CHARS[];
3805 This character array holds the letters that can separate the
3806 mantissa and the exponent of a floating point number. Typical
3807 values are @samp{e} and @samp{E}.
3809 @item char FLT_CHARS[];
3810 This character array holds the letters that--when they appear
3811 immediately after a leading zero--indicate that a number is a
3812 floating-point number. (Sort of how 0x indicates that a
3813 hexadecimal number follows.)
3815 @item pseudo_typeS md_pseudo_table[];
3816 (@var{pseudo_typeS} is defined in @file{md.h})
This array contains a list of the machine-dependent directives
the assembler must support. It contains the name of each
pseudo-op (without the leading @samp{.}), a pointer to a
3820 function to be called when that directive is encountered, and
3821 an integer argument to be passed to that function.
3823 @item void md_begin(void)
3824 This function is called as part of the assembler's
3825 initialization. It should do any initialization required by
3826 any of your other routines.
3828 @item int md_parse_option(char **optionPTR, int *argcPTR, char ***argvPTR)
3829 This routine is called once for each option on the command line
3830 that the machine-independent part of @code{as} does not
3831 understand. This function should return non-zero if the option
3832 pointed to by @var{optionPTR} is a valid option. If it is not
3833 a valid option, this routine should return zero. The variables
3834 @var{argcPTR} and @var{argvPTR} are provided in case the option
3835 requires a filename or something similar as an argument. If
3836 the option is multi-character, @var{optionPTR} should be
3837 advanced past the end of the option, otherwise every letter in
3838 the option will be treated as a separate single-character
3841 @item void md_assemble(char *string)
3842 This routine is called for every machine-dependent
3843 non-directive line in the source file. It does all the real
3844 work involved in reading the opcode, parsing the operands,
3845 etc. @var{string} is a pointer to a null-terminated string,
3846 that comprises the input line, with all excess whitespace and
3849 @item void md_number_to_chars(char *outputPTR,long value,int nbytes)
3850 This routine is called to turn a C long int, short int, or char
3851 into the series of bytes that represents that number on the
3852 target machine. @var{outputPTR} points to an array where the
3853 result should be stored; @var{value} is the value to store; and
@var{nbytes} is the number of bytes in @var{value} that should be
3857 @item void md_number_to_imm(char *outputPTR,long value,int nbytes)
3858 This routine is called to turn a C long int, short int, or char
3859 into the series of bytes that represent an immediate value on
3860 the target machine. It is identical to the function @code{md_number_to_chars},
3861 except on NS32K machines.@refill
3863 @item void md_number_to_disp(char *outputPTR,long value,int nbytes)
3864 This routine is called to turn a C long int, short int, or char
into the series of bytes that represent a displacement value on
3866 the target machine. It is identical to the function @code{md_number_to_chars},
3867 except on NS32K machines.@refill
3869 @item void md_number_to_field(char *outputPTR,long value,int nbytes)
3870 This routine is identical to @code{md_number_to_chars},
3871 except on NS32K machines.
3873 @item void md_ri_to_chars(struct relocation_info *riPTR,ri)
3874 (@code{struct relocation_info} is defined in @file{a.out.h})
3875 This routine emits the relocation info in @var{ri}
3876 in the appropriate bit-pattern for the target machine.
3877 The result should be stored in the location pointed
3878 to by @var{riPTR}. This routine may be a no-op unless you are
3879 attempting to do cross-assembly.
3881 @item char *md_atof(char type,char *outputPTR,int *sizePTR)
3882 This routine turns a series of digits into the appropriate
3883 internal representation for a floating-point number.
3884 @var{type} is a character from @var{FLT_CHARS[]} that describes
3885 what kind of floating point number is wanted; @var{outputPTR}
3886 is a pointer to an array that the result should be stored in;
3887 and @var{sizePTR} is a pointer to an integer where the size (in
3888 bytes) of the result should be stored. This routine should
3889 return an error message, or an empty string (not (char *)0) for
3892 @item int md_short_jump_size;
3893 This variable holds the (maximum) size in bytes of a short (16
3894 bit or so) jump created by @code{md_create_short_jump()}. This
3895 variable is used as part of the broken-word feature, and isn't
3896 needed if the assembler is compiled with
3897 @samp{-DWORKING_DOT_WORD}.
3899 @item int md_long_jump_size;
3900 This variable holds the (maximum) size in bytes of a long (32
3901 bit or so) jump created by @code{md_create_long_jump()}. This
3902 variable is used as part of the broken-word feature, and isn't
3903 needed if the assembler is compiled with
3904 @samp{-DWORKING_DOT_WORD}.
3906 @item void md_create_short_jump(char *resultPTR,long from_addr,
3907 @code{long to_addr,fragS *frag,symbolS *to_symbol)}
3908 This function emits a jump from @var{from_addr} to @var{to_addr} in
3909 the array of bytes pointed to by @var{resultPTR}. If this creates a
3910 type of jump that must be relocated, this function should call
3911 @code{fix_new()} with @var{frag} and @var{to_symbol}. The jump
3912 emitted by this function may be smaller than @var{md_short_jump_size},
3913 but it must never create a larger one.
3914 (If it creates a smaller jump, the extra bytes of memory will not be
3915 used.) This function is used as part of the broken-word feature,
3916 and isn't needed if the assembler is compiled with
3917 @samp{-DWORKING_DOT_WORD}.@refill
3919 @item void md_create_long_jump(char *ptr,long from_addr,
3920 @code{long to_addr,fragS *frag,symbolS *to_symbol)}
3921 This function is similar to the previous function,
3922 @code{md_create_short_jump()}, except that it creates a long
3923 jump instead of a short one. This function is used as part of
3924 the broken-word feature, and isn't needed if the assembler is
3925 compiled with @samp{-DWORKING_DOT_WORD}.
3927 @item int md_estimate_size_before_relax(fragS *fragPTR,int segment_type)
3928 This function does the initial setting up for relaxation. This
3929 includes forcing references to still-undefined symbols to the
3930 appropriate addressing modes.
3932 @item relax_typeS md_relax_table[];
(@var{relax_typeS} is defined in @file{md.h})
3934 This array describes the various machine dependent states a
3935 frag may be in before relaxation. You will need one group of
3936 entries for each type of addressing mode you intend to relax.
3938 @item void md_convert_frag(fragS *fragPTR)
3939 (@var{fragS} is defined in @file{as.h})
3940 This routine does the required cleanup after relaxation.
3941 Relaxation has changed the type of the frag to a type that can
3942 reach its destination. This function should adjust the opcode
3943 of the frag to use the appropriate addressing mode.
3944 @var{fragPTR} points to the frag to clean up.
3946 @item void md_end(void)
3947 This function is called just before the assembler exits. It
3948 need not free up memory unless the operating system doesn't do
3949 it automatically on exit. (In which case you'll also have to
3950 track down all the other places where the assembler allocates
3951 space but never frees it.)
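
To make a few of the entries above more concrete, here is a minimal
sketch of the four character arrays and of @code{md_number_to_chars},
assuming a little-endian target. The array contents are illustrative
only (the @code{FLT_CHARS} value in particular is made up), and a real
@file{@var{machine}.c} must choose values that match its own syntax and
byte order.

```c
#include <assert.h>

/* Illustrative values only; each target chooses its own. */
char comment_chars[] = "#";      /* start a comment anywhere in a line */
char line_comment_chars[] = "#"; /* start a comment only at line start */
char EXP_CHARS[] = "eE";         /* separate mantissa from exponent */
char FLT_CHARS[] = "fd";         /* hypothetical: 0f... and 0d... floats */

/* Store the low NBYTES bytes of VALUE at OUTPUTPTR, least significant
   byte first.  This assumes a little-endian target; a big-endian
   target would fill the array from the far end instead.  */
void md_number_to_chars(char *outputPTR, long value, int nbytes)
{
  unsigned long bits = (unsigned long) value;
  int i;

  assert(outputPTR != 0 && nbytes > 0 && nbytes <= (int) sizeof(long));
  for (i = 0; i < nbytes; i++) {
    outputPTR[i] = (char) (bits & 0xff);
    bits >>= 8;
  }
}
```

Going through an unsigned copy of @var{value} keeps the shifts well
defined for negative numbers, so the bytes written do not depend on how
the host compiler shifts signed values.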
3955 @section External Variables You will Need to Use
3957 You will need to refer to or change the following external variables
3958 from within the machine-dependent part of the assembler.
3961 @item extern char flagseen[];
3962 This array holds non-zero values in locations corresponding to
3963 the options that were on the command line. Thus, if the
3964 assembler was called with @samp{-W}, @var{flagseen['W']} would
3967 @item extern fragS *frag_now;
3968 This pointer points to the current frag--the frag that bytes
3969 are currently being added to. If nothing else, you will need
3970 to pass it as an argument to various machine-independent
3971 functions. It is maintained automatically by the
3972 frag-manipulating functions; you should never have to change it
3975 @item extern LITTLENUM_TYPE generic_bignum[];
(@var{LITTLENUM_TYPE} is defined in @file{bignum.h}.)
3977 This is where @dfn{bignums}--numbers larger than 32 bits--are
3978 returned when they are encountered in an expression. You will
3979 need to use this if you need to implement directives (or
3980 anything else) that must deal with these large numbers.
3981 @code{Bignums} are of @code{segT} @code{SEG_BIG} (defined in
@file{as.h}), and have a positive @code{X_add_number}. The
3983 @code{X_add_number} of a @code{bignum} is the number of
3984 @code{LITTLENUMS} in @var{generic_bignum} that the number takes
3987 @item extern FLONUM_TYPE generic_floating_point_number;
(@var{FLONUM_TYPE} is defined in @file{flonum.h}.)
This is where @dfn{flonums}--floating-point numbers within
3990 expressions--are returned. @code{Flonums} are of @code{segT}
3991 @code{SEG_BIG}, and have a negative @code{X_add_number}.
3992 @code{Flonums} are returned in a generic format. You will have
3993 to write a routine to turn this generic format into the
3994 appropriate floating-point format for your machine.
3996 @item extern int need_pass_2;
3997 If this variable is non-zero, the assembler has encountered an
3998 expression that cannot be assembled in a single pass. Since
3999 the second pass isn't implemented, this flag means that the
4000 assembler is punting, and is only looking for additional syntax
4001 errors. (Or something like that.)
4003 @item extern segT now_seg;
4004 This variable holds the value of the segment the assembler is
4005 currently assembling into.
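
The sign convention that separates bignums from flonums can be sketched
as follows. The type definitions here are stand-ins that model only the
members used. The real @code{segT} and @code{expressionS} come from
@file{as.h} and @file{expr.h}. The @code{big_kind} enumeration and the
@code{classify_big} function are hypothetical names invented for this
example.

```c
#include <assert.h>

/* Stand-ins for declarations that really come from as.h and expr.h;
   only the members used below are modelled. */
typedef enum { SEG_ABSOLUTE, SEG_BIG } segT;
typedef struct { long X_add_number; } expressionS;

enum big_kind { NOT_BIG, IS_BIGNUM, IS_FLONUM };

/* Classify a parsed expression: within SEG_BIG, a positive
   X_add_number marks a bignum and a negative one marks a flonum.
   Zero is not covered by that convention, so it is reported as
   NOT_BIG here. */
enum big_kind classify_big(segT seg, const expressionS *exp)
{
  assert(exp != 0);
  if (seg != SEG_BIG)
    return NOT_BIG;
  if (exp->X_add_number > 0)
    return IS_BIGNUM;
  if (exp->X_add_number < 0)
    return IS_FLONUM;
  return NOT_BIG;
}
```

A directive that accepts large constants would presumably make a check
of this kind before reading @var{generic_bignum} or
@var{generic_floating_point_number}.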
@section External Functions You will Need
4011 You will find the following external functions useful (or
4012 indispensable) when you're writing the machine-dependent part
4017 @item char *frag_more(int bytes)
4018 This function allocates @var{bytes} more bytes in the current
frag (or starts a new frag, if it can't expand the current frag
any more) for you to store some object-file bytes in. It
4021 returns a pointer to the bytes, ready for you to store data in.
4023 @item void fix_new(fragS *frag, int where, short size, symbolS *add_symbol, symbolS *sub_symbol, long offset, int pcrel)
4024 This function stores a relocation fixup to be acted on later.
4025 @var{frag} points to the frag the relocation belongs in;
4026 @var{where} is the location within the frag where the relocation begins;
4027 @var{size} is the size of the relocation, and is usually 1 (a single byte),
4028 2 (sixteen bits), or 4 (a longword).
4029 The value @var{add_symbol} @minus{} @var{sub_symbol} + @var{offset}, is added to the byte(s)
4030 at @var{frag->literal[where]}. If @var{pcrel} is non-zero, the address of the
4031 location is subtracted from the result. A relocation entry is also added
to the @file{a.out} file. @var{add_symbol} and/or @var{sub_symbol} may be
NULL, and @var{offset} may be zero.@refill
4035 @item char *frag_var(relax_stateT type, int max_chars, int var,
4036 @code{relax_substateT subtype, symbolS *symbol, char *opcode)}
4037 This function creates a machine-dependent frag of type @var{type}
4038 (usually @code{rs_machine_dependent}).
4039 @var{max_chars} is the maximum size in bytes that the frag may grow by;
4040 @var{var} is the current size of the variable end of the frag;
4041 @var{subtype} is the sub-type of the frag. The sub-type is used to index into
4042 @var{md_relax_table[]} during @code{relaxation}.
@var{symbol} is the symbol whose value should be used when relaxing this frag.
4044 @var{opcode} points into a byte whose value may have to be modified if the
4045 addressing mode used by this frag changes. It typically points into the
4046 @var{fr_literal[]} of the previous frag, and is used to point to a location
that @code{md_convert_frag()} may have to change.@refill
4049 @item void frag_wane(fragS *fragPTR)
4050 This function is useful from within @code{md_convert_frag}. It
changes a frag to type @code{rs_fill}, and sets the variable-sized
4052 piece of the frag to zero. The frag will never change in size
4055 @item segT expression(expressionS *retval)
4056 (@var{segT} is defined in @file{as.h}; @var{expressionS} is defined in @file{expr.h})
4057 This function parses the string pointed to by the external char
4058 pointer @var{input_line_pointer}, and returns the segment-type
4059 of the expression. It also stores the results in the
4060 @var{expressionS} pointed to by @var{retval}.
4061 @var{input_line_pointer} is advanced to point past the end of
4062 the expression. (@var{input_line_pointer} is used by other
4063 parts of the assembler. If you modify it, be sure to restore
4064 it to its original value.)
4066 @item as_warn(char *message,@dots{})
4067 If warning messages are disabled, this function does nothing.
4068 Otherwise, it prints out the current file name, and the current
4069 line number, then uses @code{fprintf} to print the
4070 @var{message} and any arguments it was passed.
4072 @item as_bad(char *message,@dots{})
4073 This function should be called when @code{as} encounters
4074 conditions that are bad enough that @code{as} should not
4075 produce an object file, but should continue reading input and
4076 printing warning and bad error messages.
4078 @item as_fatal(char *message,@dots{})
4079 This function prints out the current file name and line number,
4080 prints the word @samp{FATAL:}, then uses @code{fprintf} to
4081 print the @var{message} and any arguments it was passed. Then
4082 the assembler exits. This function should only be used for
4083 serious, unrecoverable errors.
4085 @item void float_const(int float_type)
4086 This function reads floating-point constants from the current
4087 input line, and calls @code{md_atof} to assemble them. It is
4088 useful as the function to call for the directives
4089 @samp{.single}, @samp{.double}, @samp{.float}, etc.
4090 @var{float_type} must be a character from @var{FLT_CHARS}.
4092 @item void demand_empty_rest_of_line(void);
4093 This function can be used by machine-dependent directives to
4094 make sure the rest of the input line is empty. It prints a
4095 warning message if there are additional characters on the line.
4097 @item long int get_absolute_expression(void)
4098 This function can be used by machine-dependent directives to
4099 read an absolute number from the current input line. It
4100 returns the result. If it isn't given an absolute expression,
4101 it prints a warning message and returns zero.
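
The usual pattern for emitting bytes is to ask @code{frag_more} for
space and then fill it in. The sketch below shows that pattern in
isolation. The @code{frag_more} defined here is only a stand-in that
hands out space from one fixed buffer so that the example can be
compiled on its own, and @code{toy_emit}, @code{toy_output} and
@code{toy_used} are hypothetical names. The little-endian operand
layout is likewise just an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the real frag_more(): hands out space from one fixed
   buffer instead of managing frags. */
static char toy_output[64];
static size_t toy_used;

char *frag_more(int bytes)
{
  char *p;

  assert(bytes >= 0 && toy_used + (size_t) bytes <= sizeof toy_output);
  p = toy_output + toy_used;
  toy_used += (size_t) bytes;
  return p;
}

/* Hypothetical helper: emit a one-byte opcode followed by a 16-bit
   operand, least significant byte first. */
void toy_emit(unsigned char opcode, unsigned int operand)
{
  char *p = frag_more(3);

  p[0] = (char) opcode;
  p[1] = (char) (operand & 0xff);
  p[2] = (char) ((operand >> 8) & 0xff);
}
```

In a real @code{md_assemble}, the operand bytes would more likely be
written with @code{md_number_to_chars}, and any operand that refers to
a symbol would also need a call to @code{fix_new}.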
4106 @section The concept of Frags
This assembler works to optimize the size of certain addressing
modes (e.g., branch instructions). This means the size of many
4110 pieces of object code cannot be determined until after assembly
4111 is finished. (This means that the addresses of symbols cannot be
4112 determined until assembly is finished.) In order to do this,
4113 @code{as} stores the output bytes as @dfn{frags}.
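
To see why addresses have to wait, consider the following toy model.
It is a sketch only: @code{struct toy_frag} is not the structure from
@file{as.h}, the member names @code{fixed_size} and @code{var_size} are
invented here, and the two functions are hypothetical. It assumes that
every frag's variable piece has already been given its final size.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a frag; member names other than fr_address and fr_next
   are invented for this sketch. */
struct toy_frag {
  unsigned long fixed_size;  /* bytes whose size is already known */
  unsigned long var_size;    /* bytes chosen for the variable piece */
  unsigned long fr_address;  /* valid only after toy_assign_addresses() */
  struct toy_frag *fr_next;
};

/* Walk the chain once sizes are settled, giving each frag its start
   address.  Returns the total size of the chain. */
unsigned long toy_assign_addresses(struct toy_frag *first)
{
  unsigned long address = 0;
  struct toy_frag *f;

  for (f = first; f != NULL; f = f->fr_next) {
    f->fr_address = address;
    address += f->fixed_size + f->var_size;
  }
  return address;
}

/* Turn a (frag, offset) pair into a final address. */
unsigned long toy_resolve(const struct toy_frag *f, unsigned long offset)
{
  assert(f != NULL && offset <= f->fixed_size + f->var_size);
  return f->fr_address + offset;
}
```

Until @code{toy_assign_addresses} has run, a location can only be named
by the pair (frag, offset), because changing the size of any earlier
frag moves everything that follows it.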
Here is the definition of a frag (from @file{as.h}):
4121 relax_stateT fr_type;
4122 relax_substateT fr_substate;
4123 unsigned long fr_address;
4125 struct symbol *fr_symbol;
4127 struct frag *fr_next;
4134 is the size of the fixed-size piece of the frag.
4137 is the maximum (?) size of the variable-sized piece of the frag.
4140 is the type of the frag.
4145 rs_machine_dependent
4148 This stores the type of machine-dependent frag this is. (what
4149 kind of addressing mode is being used, and what size is being
4153 @var{fr_address} is only valid after relaxation is finished.
4154 Before relaxation, the only way to store an address is (pointer
4155 to frag containing the address) plus (offset into the frag).
4158 This contains a number, whose meaning depends on the type of
4160 for machine_dependent frags, this contains the offset from
4161 fr_symbol that the frag wants to go to. Thus, for branch
4162 instructions it is usually zero. (unless the instruction was
4163 @samp{jba foo+12} or something like that.)
4166 for machine_dependent frags, this points to the symbol the frag
4170 This points to the location in the frag (or in a previous frag)
4171 of the opcode for the instruction that caused this to be a frag.
4172 @var{fr_opcode} is needed if the actual opcode must be changed
4173 in order to use a different form of the addressing mode.
4174 (For example, if a conditional branch only comes in size tiny,
4175 a large-size branch could be implemented by reversing the sense
4176 of the test, and turning it into a tiny branch over a large jump.
4177 This would require changing the opcode.)
4179 @var{fr_literal} is a variable-size array that contains the
4180 actual object bytes. A frag consists of a fixed size piece of
4181 object data, (which may be zero bytes long), followed by a
4182 piece of object data whose size may not have been determined
4183 yet. Other information includes the type of the frag (which
4184 controls how it is relaxed),
4187 This is the next frag in the singly-linked list. This is
4188 usually only needed by the machine-independent part of
4194 @node License, , Retargeting, Top
4195 @unnumbered GNU GENERAL PUBLIC LICENSE
4196 @center Version 1, February 1989
4199 Copyright @copyright{} 1989 Free Software Foundation, Inc.
4200 675 Mass Ave, Cambridge, MA 02139, USA
4202 Everyone is permitted to copy and distribute verbatim copies
4203 of this license document, but changing it is not allowed.
4206 @unnumberedsec Preamble
4208 The license agreements of most software companies try to keep users
4209 at the mercy of those companies. By contrast, our General Public
4210 License is intended to guarantee your freedom to share and change free
4211 software---to make sure the software is free for all its users. The
4212 General Public License applies to the Free Software Foundation's
4213 software and to any other program whose authors commit to using it.
4214 You can use it for your programs, too.
4216 When we speak of free software, we are referring to freedom, not
4217 price. Specifically, the General Public License is designed to make
4218 sure that you have the freedom to give away or sell copies of free
4219 software, that you receive source code or can get it if you want it,
4220 that you can change the software or use pieces of it in new free
4221 programs; and that you know you can do these things.
4223 To protect your rights, we need to make restrictions that forbid
4224 anyone to deny you these rights or to ask you to surrender the rights.
4225 These restrictions translate to certain responsibilities for you if you
4226 distribute copies of the software, or if you modify it.
4228 For example, if you distribute copies of a such a program, whether
4229 gratis or for a fee, you must give the recipients all the rights that
4230 you have. You must make sure that they, too, receive or can get the
4231 source code. And you must tell them their rights.
4233 We protect your rights with two steps: (1) copyright the software, and
4234 (2) offer you this license which gives you legal permission to copy,
4235 distribute and/or modify the software.
4237 Also, for each author's protection and ours, we want to make certain
4238 that everyone understands that there is no warranty for this free
4239 software. If the software is modified by someone else and passed on, we
4240 want its recipients to know that what they have is not the original, so
4241 that any problems introduced by others will not reflect on the original
4242 authors' reputations.
4244 The precise terms and conditions for copying, distribution and
4245 modification follow.
4248 @unnumberedsec TERMS AND CONDITIONS
4251 @center TERMS AND CONDITIONS
4256 This License Agreement applies to any program or other work which
4257 contains a notice placed by the copyright holder saying it may be
4258 distributed under the terms of this General Public License. The
4259 ``Program'', below, refers to any such program or work, and a ``work based
4260 on the Program'' means either the Program or any work containing the
4261 Program or a portion of it, either verbatim or with modifications. Each
4262 licensee is addressed as ``you''.
4265 You may copy and distribute verbatim copies of the Program's source
4266 code as you receive it, in any medium, provided that you conspicuously and
4267 appropriately publish on each copy an appropriate copyright notice and
4268 disclaimer of warranty; keep intact all the notices that refer to this
4269 General Public License and to the absence of any warranty; and give any
4270 other recipients of the Program a copy of this General Public License
4271 along with the Program. You may charge a fee for the physical act of
4272 transferring a copy.
4275 You may modify your copy or copies of the Program or any portion of
4276 it, and copy and distribute such modifications under the terms of Paragraph
4277 1 above, provided that you also do the following:
4281 cause the modified files to carry prominent notices stating that
4282 you changed the files and the date of any change; and
4285 cause the whole of any work that you distribute or publish, that
4286 in whole or in part contains the Program or any part thereof, either
4287 with or without modifications, to be licensed at no charge to all
4288 third parties under the terms of this General Public License (except
4289 that you may choose to grant warranty protection to some or all
4290 third parties, at your option).
4293 If the modified program normally reads commands interactively when
4294 run, you must cause it, when started running for such interactive use
4295 in the simplest and most usual way, to print or display an
4296 announcement including an appropriate copyright notice and a notice
4297 that there is no warranty (or else, saying that you provide a
4298 warranty) and that users may redistribute the program under these
4299 conditions, and telling the user how to view a copy of this General Public License.
4303 You may charge a fee for the physical act of transferring a
4304 copy, and you may at your option offer warranty protection in exchange for a fee.
4308 Mere aggregation of another independent work with the Program (or its
4309 derivative) on a volume of a storage or distribution medium does not bring
4310 the other work under the scope of these terms.
4313 You may copy and distribute the Program (or a portion or derivative of
4314 it, under Paragraph 2) in object code or executable form under the terms of
4315 Paragraphs 1 and 2 above provided that you also do one of the following:
4319 accompany it with the complete corresponding machine-readable
4320 source code, which must be distributed under the terms of
4321 Paragraphs 1 and 2 above; or,
4324 accompany it with a written offer, valid for at least three
4325 years, to give any third party free (except for a nominal charge
4326 for the cost of distribution) a complete machine-readable copy of the
4327 corresponding source code, to be distributed under the terms of
4328 Paragraphs 1 and 2 above; or,
4331 accompany it with the information you received as to where the
4332 corresponding source code may be obtained. (This alternative is
4333 allowed only for noncommercial distribution and only if you
4334 received the program in object code or executable form alone.)
4337 Source code for a work means the preferred form of the work for making
4338 modifications to it. For an executable file, complete source code means
4339 all the source code for all modules it contains; but, as a special
4340 exception, it need not include source code for modules which are standard
4341 libraries that accompany the operating system on which the executable
4342 file runs, or for standard header files or definitions files that
4343 accompany that operating system.
4346 You may not copy, modify, sublicense, distribute or transfer the
4347 Program except as expressly provided under this General Public License.
4348 Any attempt otherwise to copy, modify, sublicense, distribute or transfer
4349 the Program is void, and will automatically terminate your rights to use
4350 the Program under this License. However, parties who have received
4351 copies, or rights to use copies, from you under this General Public
4352 License will not have their licenses terminated so long as such parties
4353 remain in full compliance.
4356 By copying, distributing or modifying the Program (or any work based
4357 on the Program) you indicate your acceptance of this license to do so,
4358 and all its terms and conditions.
4361 Each time you redistribute the Program (or any work based on the
4362 Program), the recipient automatically receives a license from the original
4363 licensor to copy, distribute or modify the Program subject to these
4364 terms and conditions. You may not impose any further restrictions on the
4365 recipients' exercise of the rights granted herein.
4368 The Free Software Foundation may publish revised and/or new versions
4369 of the General Public License from time to time. Such new versions will
4370 be similar in spirit to the present version, but may differ in detail to
4371 address new problems or concerns.
4373 Each version is given a distinguishing version number. If the Program
4374 specifies a version number of the license which applies to it and ``any
4375 later version'', you have the option of following the terms and conditions
4376 either of that version or of any later version published by the Free
4377 Software Foundation. If the Program does not specify a version number of
4378 the license, you may choose any version ever published by the Free Software Foundation.
4382 If you wish to incorporate parts of the Program into other free
4383 programs whose distribution conditions are different, write to the author
4384 to ask for permission. For software which is copyrighted by the Free
4385 Software Foundation, write to the Free Software Foundation; we sometimes
4386 make exceptions for this. Our decision will be guided by the two goals
4387 of preserving the free status of all derivatives of our free software and
4388 of promoting the sharing and reuse of software generally.
4391 @heading NO WARRANTY
4398 BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
4399 FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
4400 OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
4401 PROVIDE THE PROGRAM ``AS IS'' WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
4402 OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
4403 MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
4404 TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
4405 PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
4406 REPAIR OR CORRECTION.
4409 IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL
4410 ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
4411 REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
4412 INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES
4413 ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT
4414 LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES
4415 SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE
4416 WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN
4417 ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
4421 @heading END OF TERMS AND CONDITIONS
4424 @center END OF TERMS AND CONDITIONS
4428 @unnumberedsec Appendix: How to Apply These Terms to Your New Programs
4430 If you develop a new program, and you want it to be of the greatest
4431 possible use to humanity, the best way to achieve this is to make it
4432 free software which everyone can redistribute and change under these terms.
4435 To do so, attach the following notices to the program. It is safest to
4436 attach them to the start of each source file to most effectively convey
4437 the exclusion of warranty; and each file should have at least the
4438 ``copyright'' line and a pointer to where the full notice is found.
4441 @var{one line to give the program's name and a brief idea of what it does.}
4442 Copyright (C) 19@var{yy} @var{name of author}
4444 This program is free software; you can redistribute it and/or modify
4445 it under the terms of the GNU General Public License as published by
4446 the Free Software Foundation; either version 1, or (at your option) any later version.
4449 This program is distributed in the hope that it will be useful,
4450 but WITHOUT ANY WARRANTY; without even the implied warranty of
4451 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
4452 GNU General Public License for more details.
4454 You should have received a copy of the GNU General Public License
4455 along with this program; if not, write to the Free Software
4456 Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
4459 Also add information on how to contact you by electronic and paper mail.
4461 If the program is interactive, make it output a short notice like this
4462 when it starts in an interactive mode:
4465 Gnomovision version 69, Copyright (C) 19@var{yy} @var{name of author}
4466 Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
4467 This is free software, and you are welcome to redistribute it
4468 under certain conditions; type `show c' for details.
4471 The hypothetical commands `show w' and `show c' should show the
4472 appropriate parts of the General Public License. Of course, the
4473 commands you use may be called something other than `show w' and `show
4474 c'; they could even be mouse-clicks or menu items---whatever suits your program.
4477 You should also get your employer (if you work as a programmer) or your
4478 school, if any, to sign a ``copyright disclaimer'' for the program, if
4479 necessary. Here is a sample; alter the names:
4482 Yoyodyne, Inc., hereby disclaims all copyright interest in the
4483 program `Gnomovision' (a program to direct compilers to make passes
4484 at assemblers) written by James Hacker.
4486 @var{signature of Ty Coon}, 1 April 1989
4487 Ty Coon, President of Vice
4490 That's all there is to it!