X-Git-Url: https://git.libre-soc.org/?a=blobdiff_plain;f=gas%2Fdoc%2Fc-i386.texi;h=c4a39670146b6e9882d080333c2038ade21f4d00;hb=408520bcaa874edb0e37506e8559b2e4194dca05;hp=e8685dd046ff74fe1d21a32773bed3aab855e958;hpb=42e04b360158941f4d490a38cad14b11593854fa;p=binutils-gdb.git diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi index e8685dd046f..c4a39670146 100644 --- a/gas/doc/c-i386.texi +++ b/gas/doc/c-i386.texi @@ -1,4 +1,4 @@ -@c Copyright (C) 1991-2020 Free Software Foundation, Inc. +@c Copyright (C) 1991-2022 Free Software Foundation, Inc. @c This is part of the GAS manual. @c For copying conditions, see the file as.texinfo. @c man end @@ -37,6 +37,7 @@ extending the Intel architecture to 64-bits. * i386-TBM:: AMD's Trailing Bit Manipulation Instructions * i386-16bit:: Writing 16-bit Code * i386-Arch:: Specifying an x86 CPU architecture +* i386-ISA:: AMD64 ISA vs. Intel64 ISA * i386-Bugs:: AT&T Syntax bugs * i386-Notes:: Notes @end menu @@ -109,8 +110,6 @@ processor names are recognized: @code{core}, @code{core2}, @code{corei7}, -@code{l1om}, -@code{k1om}, @code{iamcu}, @code{k6}, @code{k6_2}, @@ -124,6 +123,7 @@ processor names are recognized: @code{bdver4}, @code{znver1}, @code{znver2}, +@code{znver3}, @code{btver1}, @code{btver2}, @code{generic32} and @@ -150,6 +150,7 @@ accept various extension mnemonics. For example, @code{sse}, @code{sse2}, @code{sse3}, +@code{sse4a}, @code{ssse3}, @code{sse4.1}, @code{sse4.2}, @@ -157,6 +158,7 @@ accept various extension mnemonics. For example, @code{nosse}, @code{nosse2}, @code{nosse3}, +@code{nosse4a}, @code{nossse3}, @code{nosse4.1}, @code{nosse4.2}, @@ -184,6 +186,13 @@ accept various extension mnemonics. For example, @code{movdiri}, @code{movdir64b}, @code{enqcmd}, +@code{serialize}, +@code{tsxldtrk}, +@code{kl}, +@code{nokl}, +@code{widekl}, +@code{nowidekl}, +@code{hreset}, @code{avx512f}, @code{avx512cd}, @code{avx512er}, @@ -199,7 +208,11 @@ accept various extension mnemonics. For example, @code{avx512_vbmi2}, @code{avx512_vnni}, @code{avx512_bitalg}, +@code{avx512_vp2intersect}, +@code{tdx}, @code{avx512_bf16}, +@code{avx_vnni}, +@code{avx512_fp16}, @code{noavx512f}, @code{noavx512cd}, @code{noavx512er}, @@ -216,8 +229,21 @@ accept various extension mnemonics. For example, @code{noavx512_vnni}, @code{noavx512_bitalg}, @code{noavx512_vp2intersect}, +@code{notdx}, @code{noavx512_bf16}, +@code{noavx_vnni}, +@code{noavx512_fp16}, @code{noenqcmd}, +@code{noserialize}, +@code{notsxldtrk}, +@code{amx_int8}, +@code{noamx_int8}, +@code{amx_bf16}, +@code{noamx_bf16}, +@code{amx_tile}, +@code{noamx_tile}, +@code{nouintr}, +@code{nohreset}, @code{vmx}, @code{vmfunc}, @code{smx}, @@ -235,6 +261,7 @@ accept various extension mnemonics. For example, @code{movbe}, @code{ept}, @code{lzcnt}, +@code{popcnt}, @code{hle}, @code{rtm}, @code{invpcid}, @@ -244,9 +271,11 @@ accept various extension mnemonics. For example, @code{wbnoinvd}, @code{pconfig}, @code{waitpkg}, +@code{uintr}, @code{cldemote}, @code{rdpru}, @code{mcommit}, +@code{sev_es}, @code{lwp}, @code{fma4}, @code{xop}, @@ -257,8 +286,10 @@ accept various extension mnemonics. For example, @code{3dnowa}, @code{sse4a}, @code{sse5}, -@code{svme}, -@code{abm} and +@code{snp}, +@code{invlpgb}, +@code{tlbsync}, +@code{svme} and @code{padlock}. Note that rather than extending a basic instruction set, the extension mnemonics starting with @code{no} revoke the respective functionality. @@ -283,6 +314,12 @@ Valid @var{CPU} values are identical to the processor list of This option specifies that the assembler should encode SSE instructions with VEX prefix. +@cindex @samp{-muse-unaligned-vector-move} option, i386 +@cindex @samp{-muse-unaligned-vector-move} option, x86-64 +@item -muse-unaligned-vector-move +This option specifies that the assembler should encode aligned vector +move as unaligned vector move. + @cindex @samp{-msse-check=} option, i386 @cindex @samp{-msse-check=} option, x86-64 @item -msse-check=@var{none} @@ -382,9 +419,10 @@ with default visibility can be preempted. The resulting code is slightly bigger. This option only affects the handling of branch instructions. +@cindex @samp{-mbig-obj} option, i386 @cindex @samp{-mbig-obj} option, x86-64 @item -mbig-obj -On x86-64 PE/COFF target this option forces the use of big object file +On PE/COFF target this option forces the use of big object file format, which allows more than 32768 sections. @cindex @samp{-momit-lock-prefix=} option, i386 @@ -460,6 +498,53 @@ on an instruction. It is equivalent to @option{-malign-branch-prefix-size=5}. The default doesn't align branches. +@cindex @samp{-mlfence-after-load=} option, i386 +@cindex @samp{-mlfence-after-load=} option, x86-64 +@item -mlfence-after-load=@var{no} +@itemx -mlfence-after-load=@var{yes} +These options control whether the assembler should generate lfence +after load instructions. @option{-mlfence-after-load=@var{yes}} will +generate lfence. @option{-mlfence-after-load=@var{no}} will not generate +lfence, which is the default. + +@cindex @samp{-mlfence-before-indirect-branch=} option, i386 +@cindex @samp{-mlfence-before-indirect-branch=} option, x86-64 +@item -mlfence-before-indirect-branch=@var{none} +@item -mlfence-before-indirect-branch=@var{all} +@item -mlfence-before-indirect-branch=@var{register} +@itemx -mlfence-before-indirect-branch=@var{memory} +These options control whether the assembler should generate lfence +before indirect near branch instructions. +@option{-mlfence-before-indirect-branch=@var{all}} will generate lfence +before indirect near branch via register and issue a warning before +indirect near branch via memory. +It also implicitly sets @option{-mlfence-before-ret=@var{shl}} when +there's no explicit @option{-mlfence-before-ret=}. +@option{-mlfence-before-indirect-branch=@var{register}} will generate +lfence before indirect near branch via register. +@option{-mlfence-before-indirect-branch=@var{memory}} will issue a +warning before indirect near branch via memory. +@option{-mlfence-before-indirect-branch=@var{none}} will not generate +lfence nor issue warning, which is the default. Note that lfence won't +be generated before indirect near branch via register with +@option{-mlfence-after-load=@var{yes}} since lfence will be generated +after loading branch target register. + +@cindex @samp{-mlfence-before-ret=} option, i386 +@cindex @samp{-mlfence-before-ret=} option, x86-64 +@item -mlfence-before-ret=@var{none} +@item -mlfence-before-ret=@var{shl} +@item -mlfence-before-ret=@var{or} +@item -mlfence-before-ret=@var{yes} +@itemx -mlfence-before-ret=@var{not} +These options control whether the assembler should generate lfence +before ret. @option{-mlfence-before-ret=@var{or}} will generate +generate or instruction with lfence. +@option{-mlfence-before-ret=@var{shl/yes}} will generate shl instruction +with lfence. @option{-mlfence-before-ret=@var{not}} will generate not +instruction with lfence. @option{-mlfence-before-ret=@var{none}} will not +generate lfence, which is the default. + @cindex @samp{-mx86-used-note=} option, i386 @cindex @samp{-mx86-used-note=} option, x86-64 @item -mx86-used-note=@var{no} @@ -487,7 +572,8 @@ with 01, 10 and 11 RC bits, respectively. @item -mamd64 @itemx -mintel64 This option specifies that the assembler should accept only AMD64 or -Intel64 ISA in 64-bit mode. The default is to accept both. +Intel64 ISA in 64-bit mode. The default is to accept common, Intel64 +only and AMD64 ISAs. @cindex @samp{-O0} option, i386 @cindex @samp{-O0} option, x86-64 @@ -723,21 +809,30 @@ assembler which assumes that a missing mnemonic suffix implies long operand size. (This incompatibility does not affect compiler output since compilers always explicitly specify the mnemonic suffix.) -Almost all instructions have the same names in AT&T and Intel format. -There are a few exceptions. The sign extend and zero extend -instructions need two sizes to specify them. They need a size to -sign/zero extend @emph{from} and a size to zero extend @emph{to}. This -is accomplished by using two instruction mnemonic suffixes in AT&T -syntax. Base names for sign extend and zero extend are -@samp{movs@dots{}} and @samp{movz@dots{}} in AT&T syntax (@samp{movsx} -and @samp{movzx} in Intel syntax). The instruction mnemonic suffixes -are tacked on to this base name, the @emph{from} suffix before the -@emph{to} suffix. Thus, @samp{movsbl %al, %edx} is AT&T syntax for -``move sign extend @emph{from} %al @emph{to} %edx.'' Possible suffixes, -thus, are @samp{bl} (from byte to long), @samp{bw} (from byte to word), -@samp{wl} (from word to long), @samp{bq} (from byte to quadruple word), -@samp{wq} (from word to quadruple word), and @samp{lq} (from long to -quadruple word). +When there is no sizing suffix and no (suitable) register operands to +deduce the size of memory operands, with a few exceptions and where long +operand size is possible in the first place, operand size will default +to long in 32- and 64-bit modes. Similarly it will default to short in +16-bit mode. Noteworthy exceptions are + +@itemize @bullet +@item +Instructions with an implicit on-stack operand as well as branches, +which default to quad in 64-bit mode. + +@item +Sign- and zero-extending moves, which default to byte size source +operands. + +@item +Floating point insns with integer operands, which default to short (for +perhaps historical reasons). + +@item +CRC32 with a 64-bit destination, which defaults to a quad source +operand. + +@end itemize @cindex encoding options, i386 @cindex encoding options, x86-64 @@ -751,6 +846,9 @@ Different encoding options can be specified via pseudo prefixes: @item @samp{@{disp32@}} -- prefer 32-bit displacement. +@item +@samp{@{disp16@}} -- prefer 16-bit displacement. + @item @samp{@{load@}} -- prefer load-form instruction. @@ -775,6 +873,10 @@ prefix which generates REX prefix unconditionally. @samp{@{nooptimize@}} -- disable instruction size optimization. @end itemize +Mnemonics of Intel VNNI instructions are encoded with the EVEX prefix +by default. The pseudo @samp{@{vex@}} prefix can be used to encode +mnemonics of Intel VNNI instructions with the VEX prefix. + @cindex conversion instructions, i386 @cindex i386 conversion instructions @cindex conversion instructions, x86-64 @@ -808,6 +910,59 @@ are called @samp{cbtw}, @samp{cwtl}, @samp{cwtd}, @samp{cltd}, @samp{cltq}, and @samp{cqto} in AT&T naming. @code{@value{AS}} accepts either naming for these instructions. +@cindex extension instructions, i386 +@cindex i386 extension instructions +@cindex extension instructions, x86-64 +@cindex x86-64 extension instructions +The Intel-syntax extension instructions + +@itemize @bullet +@item +@samp{movsx} --- sign-extend @samp{reg8/mem8} to @samp{reg16}. + +@item +@samp{movsx} --- sign-extend @samp{reg8/mem8} to @samp{reg32}. + +@item +@samp{movsx} --- sign-extend @samp{reg8/mem8} to @samp{reg64} +(x86-64 only). + +@item +@samp{movsx} --- sign-extend @samp{reg16/mem16} to @samp{reg32} + +@item +@samp{movsx} --- sign-extend @samp{reg16/mem16} to @samp{reg64} +(x86-64 only). + +@item +@samp{movsxd} --- sign-extend @samp{reg32/mem32} to @samp{reg64} +(x86-64 only). + +@item +@samp{movzx} --- zero-extend @samp{reg8/mem8} to @samp{reg16}. + +@item +@samp{movzx} --- zero-extend @samp{reg8/mem8} to @samp{reg32}. + +@item +@samp{movzx} --- zero-extend @samp{reg8/mem8} to @samp{reg64} +(x86-64 only). + +@item +@samp{movzx} --- zero-extend @samp{reg16/mem16} to @samp{reg32} + +@item +@samp{movzx} --- zero-extend @samp{reg16/mem16} to @samp{reg64} +(x86-64 only). +@end itemize + +@noindent +are called @samp{movsbw/movsxb/movsx}, @samp{movsbl/movsxb/movsx}, +@samp{movsbq/movsxb/movsx}, @samp{movswl/movsxw}, @samp{movswq/movsxw}, +@samp{movslq/movsxl}, @samp{movzbw/movzxb/movzx}, +@samp{movzbl/movzxb/movzx}, @samp{movzbq/movzxb/movzx}, +@samp{movzwl/movzxw} and @samp{movzwq/movzxw} in AT&T syntax. + @cindex jump instructions, i386 @cindex call instructions, i386 @cindex jump instructions, x86-64 @@ -831,6 +986,12 @@ Several x87 instructions, @samp{fadd}, @samp{fdiv}, @samp{fdivp}, assembler with different mnemonics from those in Intel IA32 specification. @code{@value{GCC}} generates those instructions with AT&T mnemonic. +@itemize @bullet +@item @samp{movslq} with AT&T mnemonic only accepts 64-bit destination +register. @samp{movsxd} should be used to encode 16-bit or 32-bit +destination register with both AT&T and Intel mnemonics. +@end itemize + @node i386-Regs @section Register Naming @@ -928,16 +1089,6 @@ available in 32-bit mode). The bottom 128 bits are overlaid with the @end itemize -The AVX2 extensions made in 64-bit mode more registers available: - -@itemize @bullet - -@item -the 16 128-bit registers @samp{%xmm16}--@samp{%xmm31} and the 16 256-bit -registers @samp{%ymm16}--@samp{%ymm31}. - -@end itemize - The AVX512 extensions added the following registers: @itemize @bullet @@ -1166,18 +1317,23 @@ data type. Constructors build these data types into memory. @cindex @code{single} directive, i386 @cindex @code{double} directive, i386 @cindex @code{tfloat} directive, i386 +@cindex @code{hfloat} directive, i386 +@cindex @code{bfloat16} directive, i386 @cindex @code{float} directive, x86-64 @cindex @code{single} directive, x86-64 @cindex @code{double} directive, x86-64 @cindex @code{tfloat} directive, x86-64 +@cindex @code{hfloat} directive, x86-64 +@cindex @code{bfloat16} directive, x86-64 @itemize @bullet @item Floating point constructors are @samp{.float} or @samp{.single}, -@samp{.double}, and @samp{.tfloat} for 32-, 64-, and 80-bit formats. -These correspond to instruction mnemonic suffixes @samp{s}, @samp{l}, -and @samp{t}. @samp{t} stands for 80-bit (ten byte) real. The 80387 -only supports this format via the @samp{fldt} (load 80-bit real to stack -top) and @samp{fstpt} (store 80-bit real and pop stack) instructions. +@samp{.double}, @samp{.tfloat}, @samp{.hfloat}, and @samp{.bfloat16} for 32-, +64-, 80-, and 16-bit (two flavors) formats respectively. The former three +correspond to instruction mnemonic suffixes @samp{s}, @samp{l}, and @samp{t}. +@samp{t} stands for 80-bit (ten byte) real. The 80387 only supports this +format via the @samp{fldt} (load 80-bit real to stack top) and @samp{fstpt} +(store 80-bit real and pop stack) instructions. @cindex @code{word} directive, i386 @cindex @code{long} directive, i386 @@ -1190,7 +1346,7 @@ top) and @samp{fstpt} (store 80-bit real and pop stack) instructions. @item Integer constructors are @samp{.word}, @samp{.long} or @samp{.int}, and @samp{.quad} for the 16-, 32-, and 64-bit integer formats. The -corresponding instruction mnemonic suffixes are @samp{s} (single), +corresponding instruction mnemonic suffixes are @samp{s} (short), @samp{l} (long), and @samp{q} (quad). As with the 80-bit real format, the 64-bit @samp{q} format is only present in the @samp{fildq} (load quad integer to stack top) and @samp{fistpq} (store quad integer and pop @@ -1348,23 +1504,25 @@ directive enables a warning when gas detects an instruction that is not supported on the CPU specified. The choices for @var{cpu_type} are: @multitable @columnfractions .20 .20 .20 .20 +@item @samp{default} @tab @samp{push} @tab @samp{pop} @item @samp{i8086} @tab @samp{i186} @tab @samp{i286} @tab @samp{i386} @item @samp{i486} @tab @samp{i586} @tab @samp{i686} @tab @samp{pentium} @item @samp{pentiumpro} @tab @samp{pentiumii} @tab @samp{pentiumiii} @tab @samp{pentium4} @item @samp{prescott} @tab @samp{nocona} @tab @samp{core} @tab @samp{core2} -@item @samp{corei7} @tab @samp{l1om} @tab @samp{k1om} @tab @samp{iamcu} +@item @samp{corei7} @tab @samp{iamcu} @item @samp{k6} @tab @samp{k6_2} @tab @samp{athlon} @tab @samp{k8} @item @samp{amdfam10} @tab @samp{bdver1} @tab @samp{bdver2} @tab @samp{bdver3} -@item @samp{bdver4} @tab @samp{znver1} @tab @samp{znver2} @tab @samp{btver1} -@item @samp{btver2} @tab @samp{generic32} @tab @samp{generic64} +@item @samp{bdver4} @tab @samp{znver1} @tab @samp{znver2} @tab @samp{znver3} +@item @samp{btver1} @tab @samp{btver2} @tab @samp{generic32} @tab @samp{generic64} @item @samp{.cmov} @tab @samp{.fxsr} @tab @samp{.mmx} -@item @samp{.sse} @tab @samp{.sse2} @tab @samp{.sse3} +@item @samp{.sse} @tab @samp{.sse2} @tab @samp{.sse3} @tab @samp{.sse4a} @item @samp{.ssse3} @tab @samp{.sse4.1} @tab @samp{.sse4.2} @tab @samp{.sse4} @item @samp{.avx} @tab @samp{.vmx} @tab @samp{.smx} @tab @samp{.ept} @item @samp{.clflush} @tab @samp{.movbe} @tab @samp{.xsave} @tab @samp{.xsaveopt} @item @samp{.aes} @tab @samp{.pclmul} @tab @samp{.fma} @tab @samp{.fsgsbase} @item @samp{.rdrnd} @tab @samp{.f16c} @tab @samp{.avx2} @tab @samp{.bmi2} -@item @samp{.lzcnt} @tab @samp{.invpcid} @tab @samp{.vmfunc} @tab @samp{.hle} +@item @samp{.lzcnt} @tab @samp{.popcnt} @tab @samp{.invpcid} @tab @samp{.vmfunc} +@item @samp{.hle} @item @samp{.rtm} @tab @samp{.adx} @tab @samp{.rdseed} @tab @samp{.prfchw} @item @samp{.smap} @tab @samp{.mpx} @tab @samp{.sha} @tab @samp{.prefetchwt1} @item @samp{.clflushopt} @tab @samp{.xsavec} @tab @samp{.xsaves} @tab @samp{.se1} @@ -1373,15 +1531,19 @@ supported on the CPU specified. The choices for @var{cpu_type} are: @item @samp{.avx512vbmi} @tab @samp{.avx512_4fmaps} @tab @samp{.avx512_4vnniw} @item @samp{.avx512_vpopcntdq} @tab @samp{.avx512_vbmi2} @tab @samp{.avx512_vnni} @item @samp{.avx512_bitalg} @tab @samp{.avx512_bf16} @tab @samp{.avx512_vp2intersect} -@item @samp{.clwb} @tab @samp{.rdpid} @tab @samp{.ptwrite} @tab @item @samp{.ibt} +@item @samp{.tdx} @tab @samp{.avx_vnni} @tab @samp{.avx512_fp16} +@item @samp{.clwb} @tab @samp{.rdpid} @tab @samp{.ptwrite} @tab @samp{.ibt} @item @samp{.wbnoinvd} @tab @samp{.pconfig} @tab @samp{.waitpkg} @tab @samp{.cldemote} @item @samp{.shstk} @tab @samp{.gfni} @tab @samp{.vaes} @tab @samp{.vpclmulqdq} -@item @samp{.movdiri} @tab @samp{.movdir64b} @tab @samp{.enqcmd} +@item @samp{.movdiri} @tab @samp{.movdir64b} @tab @samp{.enqcmd} @tab @samp{.tsxldtrk} +@item @samp{.amx_int8} @tab @samp{.amx_bf16} @tab @samp{.amx_tile} +@item @samp{.kl} @tab @samp{.widekl} @tab @samp{.uintr} @tab @samp{.hreset} @item @samp{.3dnow} @tab @samp{.3dnowa} @tab @samp{.sse4a} @tab @samp{.sse5} -@item @samp{.syscall} @tab @samp{.rdtscp} @tab @samp{.svme} @tab @samp{.abm} +@item @samp{.syscall} @tab @samp{.rdtscp} @tab @samp{.svme} @item @samp{.lwp} @tab @samp{.fma4} @tab @samp{.xop} @tab @samp{.cx16} @item @samp{.padlock} @tab @samp{.clzero} @tab @samp{.mwaitx} @tab @samp{.rdpru} -@item @samp{.mcommit} +@item @samp{.mcommit} @tab @samp{.sev_es} @tab @samp{.snp} @tab @samp{.invlpgb} +@item @samp{.tlbsync} @end multitable Apart from the warning, there are only two other effects on @@ -1413,6 +1575,29 @@ For example .arch i8086,nojumps @end smallexample +@node i386-ISA +@section AMD64 ISA vs. Intel64 ISA + +There are some discrepancies between AMD64 and Intel64 ISAs. + +@itemize @bullet +@item For @samp{movsxd} with 16-bit destination register, AMD64 +supports 32-bit source operand and Intel64 supports 16-bit source +operand. + +@item For far branches (with explicit memory operand), both ISAs support +32- and 16-bit operand size. Intel64 additionally supports 64-bit +operand size, encoded as @samp{ljmpq} and @samp{lcallq} in AT&T syntax +and with an explicit @samp{tbyte ptr} operand size specifier in Intel +syntax. + +@item @samp{lfs}, @samp{lgs}, and @samp{lss} similarly allow for 16- +and 32-bit operand size (32- and 48-bit memory operand) in both ISAs, +while Intel64 additionally supports 64-bit operand sise (80-bit memory +operands). + +@end itemize + @node i386-Bugs @section AT&T Syntax bugs