+2019-01-22 Andrew Stubbs <ams@codesourcery.com>
+
+ * doc/extend.texi (AMD GCN Function Attributes): New section.
+ * doc/install.texi (amdgcn-unknown-amdhsa): New instructions.
+ * doc/invoke.texi (AMD GCN Options): New section.
+ * doc/md.texi (Constraints for Particular Machines): Add AMD GCN.
+
2019-01-22 Eric Botcazou <ebotcazou@adacore.com>
* config/sparc/sparc.c (sparc_delegitimize_address): Recognize the GOT
@menu
* Common Function Attributes::
* AArch64 Function Attributes::
+* AMD GCN Function Attributes::
* ARC Function Attributes::
* ARM Function Attributes::
* AVR Function Attributes::
@option{-mcpu=} option or the @code{cpu=} attribute conflicts with the
architectural feature rules specified above.
+@node AMD GCN Function Attributes
+@subsection AMD GCN Function Attributes
+
+These function attributes are supported by the AMD GCN back end:
+
+@table @code
+@item amdgpu_hsa_kernel
+@cindex @code{amdgpu_hsa_kernel} function attribute, AMD GCN
+This attribute indicates that the corresponding function should be compiled as
+a kernel function, that is, an entry point that can be invoked from the host
+via the HSA runtime library. By default, functions are callable only from
+other GCN functions.
+
+This attribute is implicitly applied to any function named @code{main}, using
+default parameters.
+
+Kernel functions may return an integer value, which will be written to a
+conventional place within the HSA "kernargs" region.
+
+The attribute parameters configure what values are passed into the kernel
+function by the GPU drivers, via the initial register state. Some values are
+used by the compiler, and therefore forced on. Enabling other options may
+break assumptions in the compiler and/or run-time libraries.
+
+@table @code
+@item private_segment_buffer
+Set @code{enable_sgpr_private_segment_buffer} flag. Always on (required to
+locate the stack).
+
+@item dispatch_ptr
+Set @code{enable_sgpr_dispatch_ptr} flag. Always on (required to locate the
+launch dimensions).
+
+@item queue_ptr
+Set @code{enable_sgpr_queue_ptr} flag. Always on (required to convert address
+spaces).
+
+@item kernarg_segment_ptr
+Set @code{enable_sgpr_kernarg_segment_ptr} flag. Always on (required to
+locate the kernel arguments, "kernargs").
+
+@item dispatch_id
+Set @code{enable_sgpr_dispatch_id} flag.
+
+@item flat_scratch_init
+Set @code{enable_sgpr_flat_scratch_init} flag.
+
+@item private_segment_size
+Set @code{enable_sgpr_private_segment_size} flag.
+
+@item grid_workgroup_count_X
+Set @code{enable_sgpr_grid_workgroup_count_x} flag. Always on (required to
+use OpenACC/OpenMP).
+
+@item grid_workgroup_count_Y
+Set @code{enable_sgpr_grid_workgroup_count_y} flag.
+
+@item grid_workgroup_count_Z
+Set @code{enable_sgpr_grid_workgroup_count_z} flag.
+
+@item workgroup_id_X
+Set @code{enable_sgpr_workgroup_id_x} flag.
+
+@item workgroup_id_Y
+Set @code{enable_sgpr_workgroup_id_y} flag.
+
+@item workgroup_id_Z
+Set @code{enable_sgpr_workgroup_id_z} flag.
+
+@item workgroup_info
+Set @code{enable_sgpr_workgroup_info} flag.
+
+@item private_segment_wave_offset
+Set @code{enable_sgpr_private_segment_wave_byte_offset} flag. Always on
+(required to locate the stack).
+
+@item work_item_id_X
+Set @code{enable_vgpr_workitem_id} parameter. Always on (can't be disabled).
+
+@item work_item_id_Y
+Set @code{enable_vgpr_workitem_id} parameter. Always on (required to enable
+vectorization).
+
+@item work_item_id_Z
+Set @code{enable_vgpr_workitem_id} parameter. Always on (required to use
+OpenACC/OpenMP).
+
+@end table
+@end table
+
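As an illustration of the attribute described above, here is a minimal sketch in C of a function marked as a kernel. The function name @code{example_kernel}, the macro @code{GCN_KERNEL}, and the return value are hypothetical. The guard assumes the GCN back end predefines @code{__AMDGCN__}; adjust it if your toolchain differs. No attribute parameters are shown, since their exact syntax is not spelled out here.

```c
/* Hypothetical helper macro: apply the attribute only when compiling
   for AMD GCN (assumes __AMDGCN__ is predefined by that back end).  */
#ifdef __AMDGCN__
#define GCN_KERNEL __attribute__ ((amdgpu_hsa_kernel))
#else
#define GCN_KERNEL /* no-op on other targets */
#endif

/* Illustrative kernel entry point.  On GCN, the integer result is
   written to a conventional place within the kernargs region.  */
GCN_KERNEL int
example_kernel (void)
{
  return 42;
}
```

On other targets the macro expands to nothing, so the same source still builds as an ordinary function.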
@node ARC Function Attributes
@subsection ARC Function Attributes
@heading amd64-*-solaris2.1[0-9]*
This is a synonym for @samp{x86_64-*-solaris2.1[0-9]*}.
+@html
+<hr />
+@end html
+@anchor{amdgcn-unknown-amdhsa}
+@heading amdgcn-unknown-amdhsa
+AMD GCN GPU target.
+
+Instead of GNU Binutils, you will need to install LLVM 6, or later, and copy
+@file{bin/llvm-mc} to @file{amdgcn-unknown-amdhsa/bin/as},
+@file{bin/lld} to @file{amdgcn-unknown-amdhsa/bin/ld},
+@file{bin/llvm-nm} to @file{amdgcn-unknown-amdhsa/bin/nm}, and
+@file{bin/llvm-ar} to both @file{bin/amdgcn-unknown-amdhsa-ar} and
+@file{bin/amdgcn-unknown-amdhsa-ranlib}.
+
+Use Newlib (2019-01-16, or newer).
+
+To run the binaries, install the HSA Runtime from the
+@uref{https://rocm.github.io,,ROCm Platform}, and use
+@file{libexec/gcc/amdgcn-unknown-amdhsa/@var{version}/gcn-run} to launch them
+on the GPU.
+
@html
<hr />
@end html
-mfp-mode=@var{mode} -mvect-double -max-vect-align=@var{num} @gol
-msplit-vecmove-early -m1reg-@var{reg}}
+@emph{AMD GCN Options}
+@gccoptlist{-march=@var{gpu} -mtune=@var{gpu} -mstack-size=@var{bytes}}
+
@emph{ARC Options}
@gccoptlist{-mbarrel-shifter -mjli-always @gol
-mcpu=@var{cpu} -mA6 -mARC600 -mA7 -mARC700 @gol
@menu
* AArch64 Options::
* Adapteva Epiphany Options::
+* AMD GCN Options::
* ARC Options::
* ARM Options::
* AVR Options::
@end table
+@node AMD GCN Options
+@subsection AMD GCN Options
+@cindex AMD GCN options
+
+These options are defined specifically for the AMD GCN port.
+
+@table @gcctabopt
+
+@item -march=@var{gpu}
+@opindex march
+@itemx -mtune=@var{gpu}
+@opindex mtune
+Set architecture type or tuning for @var{gpu}. Supported values for @var{gpu}
+are
+
+@table @samp
+@opindex fiji
+@item fiji
+Compile for GCN3 Fiji devices (gfx803).
+
+@item gfx900
+Compile for GCN5 Vega 10 devices (gfx900).
+
+@end table
+
+@item -mstack-size=@var{bytes}
+@opindex mstack-size
+Specify how many @var{bytes} of stack space will be requested for each GPU
+thread (wave-front). Beware that there may be many threads and limited memory
+available. The size of the stack allocation may also have an impact on
+run-time performance. The default is 32KB when using OpenACC or OpenMP, and
+1MB otherwise.
+
+@end table
+
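To make the memory warning for @option{-mstack-size} concrete, the following sketch in C multiplies a per-wavefront stack size by a number of wavefronts. The function name @code{total_stack_bytes} and the wavefront count used in the note below are hypothetical; the real number of concurrent wavefronts depends on the device and the launch configuration.

```c
#include <stdint.h>

/* Hypothetical helper: total stack memory requested when every
   wavefront gets its own stack of the given size.  */
uint64_t
total_stack_bytes (uint64_t wavefronts, uint64_t stack_bytes_per_wavefront)
{
  return wavefronts * stack_bytes_per_wavefront;
}
```

Taking 1KB as 1024 bytes, an assumed 1000 wavefronts would request 32768000 bytes at the 32KB default and 1048576000 bytes at the 1MB default.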
@node ARC Options
@subsection ARC Options
@cindex ARC options
@end table
+@item AMD GCN ---@file{config/gcn/constraints.md}
+@table @code
+@item I
+Immediate integer in the range @minus{}16 to 64
+
+@item J
+Immediate 16-bit signed integer
+
+@item Kf
+Immediate constant @minus{}1
+
+@item L
+Immediate 15-bit unsigned integer
+
+@item A
+Immediate constant that can be inlined in an instruction encoding: integer
+@minus{}16..64, or float 0.0, +/@minus{}0.5, +/@minus{}1.0, +/@minus{}2.0,
++/@minus{}4.0, 1.0/(2.0*PI)
+
+@item B
+Immediate 32-bit signed integer that can be attached to an instruction encoding
+
+@item C
+Immediate 32-bit integer in range @minus{}16..4294967295 (i.e. 32-bit unsigned
+integer or @samp{A} constraint)
+
+@item DA
+Immediate 64-bit constant that can be split into two @samp{A} constants
+
+@item DB
+Immediate 64-bit constant that can be split into two @samp{B} constants
+
+@item U
+Any @code{unspec}
+
+@item Y
+Any @code{symbol_ref} or @code{label_ref}
+
+@item v
+VGPR register
+
+@item Sg
+SGPR register
+
+@item SD
+SGPR registers valid for instruction destinations, including VCC, M0 and EXEC
+
+@item SS
+SGPR registers valid for instruction sources, including VCC, M0, EXEC and SCC
+
+@item Sm
+SGPR registers valid as a source for scalar memory instructions (excludes M0
+and EXEC)
+
+@item Sv
+SGPR registers valid as a source or destination for vector instructions
+(excludes EXEC)
+
+@item ca
+All condition registers: SCC, VCCZ, EXECZ
+
+@item cs
+Scalar condition register: SCC
+
+@item cV
+Vector condition register: VCC, VCC_LO, VCC_HI
+
+@item e
+EXEC register (EXEC_LO and EXEC_HI)
+
+@item RB
+Memory operand with address space suitable for @code{buffer_*} instructions
+
+@item RF
+Memory operand with address space suitable for @code{flat_*} instructions
+
+@item RS
+Memory operand with address space suitable for @code{s_*} instructions
+
+@item RL
+Memory operand with address space suitable for @code{ds_*} LDS instructions
+
+@item RG
+Memory operand with address space suitable for @code{ds_*} GDS instructions
+
+@item RD
+Memory operand with address space suitable for any @code{ds_*} instructions
+
+@item RM
+Memory operand with address space suitable for @code{global_*} instructions
+
+@end table
+
+
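The integer ranges given for the @samp{I}, @samp{J}, @samp{L} and @samp{C} constraints above can be sketched as plain range checks in C. The helper names are hypothetical, and the sketch assumes that both ends of each stated range are inclusive. It does not reproduce the predicates in @file{config/gcn/constraints.md}.

```c
#include <stdbool.h>
#include <stdint.h>

/* 'I': immediate integer in the range -16 to 64 (assumed inclusive).  */
bool fits_I (int64_t v) { return v >= -16 && v <= 64; }

/* 'J': immediate 16-bit signed integer.  */
bool fits_J (int64_t v) { return v >= INT16_MIN && v <= INT16_MAX; }

/* 'L': immediate 15-bit unsigned integer, that is 0 to 32767.  */
bool fits_L (int64_t v) { return v >= 0 && v <= 0x7fff; }

/* 'C': 32-bit integer in the range -16 to 4294967295.  */
bool fits_C (int64_t v) { return v >= -16 && v <= (int64_t) UINT32_MAX; }
```

For example, 64 satisfies all four checks, while 32768 is rejected by @samp{I}, @samp{J} and @samp{L} but accepted by @samp{C}.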
@item ARC ---@file{config/arc/constraints.md}
@table @code
@item q