This patch creates the framework required for MVE ACLE intrinsics.
The following changes are made in this patch to support MVE ACLE intrinsics.
The header file arm_mve.h is added to the source tree; it contains the definitions of the MVE ACLE
intrinsics and of the data types used by MVE. The machine description file mve.md is also added; it
contains the RTL patterns defined for MVE.
A new register "p0" is added, which is used by the MVE predicated patterns. A new register class
"VPR_REG" is added and its contents are defined in REG_CLASS_CONTENTS.
The vec-common.md file is modified to support the standard move patterns. The prefix of the Neon
functions which are also used by MVE is changed from "neon_" to "simd_",
e.g. neon_immediate_valid_for_move is renamed to simd_immediate_valid_for_move.
In this patch the standard patterns mve_move, mve_store and mve_load are added for MVE, and the
neon.md and vfp.md files are modified to support these common patterns.
Please refer to the Arm reference manual [1] for more details.
[1] https://developer.arm.com/docs/ddi0553/latest
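
As a rough illustration, here is a minimal sketch along the lines of the new tests (built with, for
example, "-march=armv8.1-m.main+mve.fp -mfloat-abi=hard -mthumb"; the function name below is only
illustrative) showing the new header and the standard move patterns in use:

#include "arm_mve.h"

float32x4_t
copy_vec (float32x4_t value)
{
  /* float32x4_t is one of the vector types provided by arm_mve.h.  The copy
     below is expanded through the new standard mov<mode> pattern in
     vec-common.md and is emitted using MVE move/load/store instructions
     (vmov/vldmia/vstrb), which the new tests scan for.  */
  float32x4_t b = value;
  return b;
}
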
2020-03-06 Andre Vieira <andre.simoesdiasvieira@arm.com>
Mihail Ionescu <mihail.ionescu@arm.com>
Srinath Parvathaneni <srinath.parvathaneni@arm.com>
* config.gcc (arm_mve.h): Include mve intrinsics header file.
* config/arm/aout.h (p0): Add new register name for MVE predicated
cases.
* config/arm-builtins.c (ARM_BUILTIN_SIMD_LANE_CHECK): Define macro
common to Neon and MVE.
(ARM_BUILTIN_NEON_LANE_CHECK): Renamed to ARM_BUILTIN_SIMD_LANE_CHECK.
(arm_init_simd_builtin_types): Disable poly types for MVE.
(arm_init_neon_builtins): Move a check to arm_init_builtins function.
(arm_init_builtins): Use ARM_BUILTIN_SIMD_LANE_CHECK instead of
ARM_BUILTIN_NEON_LANE_CHECK.
(mve_dereference_pointer): Add function.
(arm_expand_builtin_args): Call to mve_dereference_pointer when MVE is
enabled.
(arm_expand_neon_builtin): Moved to arm_expand_builtin function.
(arm_expand_builtin): Moved from arm_expand_neon_builtin function.
* config/arm/arm-c.c (__ARM_FEATURE_MVE): Define macro for MVE and MVE
with floating point enabled.
* config/arm/arm-protos.h (neon_immediate_valid_for_move): Renamed to
simd_immediate_valid_for_move.
(simd_immediate_valid_for_move): Renamed from
neon_immediate_valid_for_move function.
* config/arm/arm.c (arm_options_perform_arch_sanity_checks): Generate
error if vfpv2 feature bit is disabled and mve feature bit is also
disabled for HARD_FLOAT_ABI.
(use_return_insn): Check to not push VFP regs for MVE.
(aapcs_vfp_allocate): Add MVE check to have same Procedure Call Standard
as Neon.
(aapcs_vfp_allocate_return_reg): Likewise.
(thumb2_legitimate_address_p): Check to return 0 on valid Thumb-2
address operand for MVE.
(arm_rtx_costs_internal): MVE check to determine cost of rtx.
(neon_valid_immediate): Rename to simd_valid_immediate.
(simd_valid_immediate): Rename from neon_valid_immediate.
(simd_valid_immediate): MVE check on size of vector is 128 bits.
(neon_immediate_valid_for_move): Rename to
simd_immediate_valid_for_move.
(simd_immediate_valid_for_move): Rename from
neon_immediate_valid_for_move.
(neon_immediate_valid_for_logic): Modify call to neon_valid_immediate
function.
(neon_make_constant): Modify call to neon_valid_immediate function.
(neon_vector_mem_operand): Return VFP register for POST_INC or PRE_DEC
for MVE.
(output_move_neon): Add MVE check to generate vldm/vstm instructions.
(arm_compute_frame_layout): Calculate space for saved VFP registers for
MVE.
(arm_save_coproc_regs): Save coproc registers for MVE.
(arm_print_operand): Add case 'E' to print memory operands for MVE.
(arm_print_operand_address): Check to print register number for MVE.
(arm_hard_regno_mode_ok): Check for arm hard regno mode ok for MVE.
(arm_modes_tieable_p): Check to allow structure mode for MVE.
(arm_regno_class): Add VPR_REGNUM check.
(arm_expand_epilogue_apcs_frame): MVE check to calculate epilogue code
for APCS frame.
(arm_expand_epilogue): MVE check for enabling pop instructions in
epilogue.
(arm_print_asm_arch_directives): Modify function to disable print of
.arch_extension "mve" and "fp" for cases where MVE is enabled with
"SOFT FLOAT ABI".
(arm_vector_mode_supported_p): Check for modes available in MVE integer
and MVE floating point.
(arm_array_mode_supported_p): Add TARGET_HAVE_MVE check for array mode
pointer support.
(arm_conditional_register_usage): Enable usage of conditional register
for MVE.
(fixed_regs[VPR_REGNUM]): Enable VPR_REG for MVE.
(arm_declare_function_name): Modify function to disable print of
.arch_extension "mve" and "fp" for cases where MVE is enabled with
"SOFT FLOAT ABI".
* config/arm/arm.h (TARGET_HAVE_MVE): Disable for soft float abi and
when target general registers are required.
(TARGET_HAVE_MVE_FLOAT): Likewise.
(FIXED_REGISTERS): Add bit for the VPR register, which is enabled in arm.c
for MVE.
(CALL_USED_REGISTERS): Set bit for the VPR register in CALL_USED_REGISTERS,
which indicates it is not preserved across function calls.
(FIRST_PSEUDO_REGISTER): Modify.
(VALID_MVE_MODE): Define valid MVE mode.
(VALID_MVE_SI_MODE): Define valid MVE SI mode.
(VALID_MVE_SF_MODE): Define valid MVE SF mode.
(VALID_MVE_STRUCT_MODE): Define valid MVE struct mode.
(VPR_REGNUM): Add Vector Predication Register in arm_regs_in_sequence
for MVE.
(IS_VPR_REGNUM): Macro to check for VPR_REG register.
(REG_ALLOC_ORDER): Add VPR_REGNUM entry.
(enum reg_class): Add VPR_REG entry.
(REG_CLASS_NAMES): Add VPR_REG entry.
* config/arm/arm.md (VPR_REGNUM): Define.
(conds): Check is_mve_type attribute to differentiate "conditional" and
"unconditional" instructions.
(arm_movsf_soft_insn): Modify RTL to not allow for MVE.
(movdf_soft_insn): Modify RTL to not allow for MVE.
(vfp_pop_multiple_with_writeback): Enable for MVE.
(include "mve.md"): Include mve.md file.
* config/arm/arm_mve.h: Add MVE intrinsics header file.
* config/arm/constraints.md (Up): Constraint to enable "p0" register in MVE
for vector predicated operands.
* config/arm/iterators.md (VNIM1): Define.
(VNINOTM1): Define.
(VHFBF_split): Define.
* config/arm/mve.md: New file.
(mve_mov<mode>): Define RTL for move, store and load in MVE.
(mve_mov<mode>): Define move RTL pattern with vec_duplicate operator for
second operand.
* config/arm/neon.md (neon_immediate_valid_for_move): Rename to
simd_immediate_valid_for_move.
(neon_mov<mode>): Split pattern and move expand pattern "movv8hf" which
is common to MVE and NEON to vec-common.md file.
(vec_init<mode><V_elem_l>): Add TARGET_HAVE_MVE check.
* config/arm/predicates.md (vpr_register_operand): Define.
* config/arm/t-arm: Add mve.md file.
* config/arm/types.md (mve_move): Add MVE instructions mve_move to
attribute "type".
(mve_store): Add MVE instructions mve_store to attribute "type".
(mve_load): Add MVE instructions mve_load to attribute "type".
(is_mve_type): Define attribute.
* config/arm/vec-common.md (mov<mode>): Modify RTL expand to support
standard move patterns in MVE along with NEON and IWMMXT with mode
iterator VNIM1.
(mov<mode>): Modify RTL expand to support standard move patterns in NEON
and IWMMXT with mode iterator V8HF.
(movv8hf): Define RTL expand to support standard "movv8hf" pattern in
NEON and MVE.
* config/arm/vfp.md (neon_immediate_valid_for_move): Rename to
simd_immediate_valid_for_move.
2020-03-16 Andre Vieira <andre.simoesdiasvieira@arm.com>
Mihail Ionescu <mihail.ionescu@arm.com>
Srinath Parvathaneni <srinath.parvathaneni@arm.com>
* gcc.target/arm/mve/intrinsics/mve_vector_float.c: New test.
* gcc.target/arm/mve/intrinsics/mve_vector_float1.c: Likewise.
* gcc.target/arm/mve/intrinsics/mve_vector_float2.c: Likewise.
* gcc.target/arm/mve/intrinsics/mve_vector_int.c: Likewise.
* gcc.target/arm/mve/intrinsics/mve_vector_int1.c: Likewise.
* gcc.target/arm/mve/intrinsics/mve_vector_int2.c: Likewise.
* gcc.target/arm/mve/intrinsics/mve_vector_uint.c: Likewise.
* gcc.target/arm/mve/intrinsics/mve_vector_uint1.c: Likewise.
* gcc.target/arm/mve/intrinsics/mve_vector_uint2.c: Likewise.
* gcc.target/arm/mve/mve.exp: New file.
* lib/target-supports.exp
(check_effective_target_arm_v8_1m_mve_fp_ok_nocache): Proc to check
armv8.1-m.main+mve.fp support and return the corresponding options.
(check_effective_target_arm_v8_1m_mve_fp_ok): Proc to call
check_effective_target_arm_v8_1m_mve_fp_ok_nocache to check support of
MVE with floating point on the current target.
(add_options_for_arm_v8_1m_mve_fp): Proc to call
check_effective_target_arm_v8_1m_mve_fp_ok to return corresponding
compiler options for MVE with floating point.
(check_effective_target_arm_v8_1m_mve_ok_nocache): Modify to test and
return hard float-abi on success.
+2020-03-16 Andre Vieira <andre.simoesdiasvieira@arm.com>
+ Mihail Ionescu <mihail.ionescu@arm.com>
+ Srinath Parvathaneni <srinath.parvathaneni@arm.com>
+
+ * config.gcc (arm_mve.h): Include mve intrinsics header file.
+ * config/arm/aout.h (p0): Add new register name for MVE predicated
+ cases.
+ * config/arm-builtins.c (ARM_BUILTIN_SIMD_LANE_CHECK): Define macro
+ common to Neon and MVE.
+ (ARM_BUILTIN_NEON_LANE_CHECK): Renamed to ARM_BUILTIN_SIMD_LANE_CHECK.
+ (arm_init_simd_builtin_types): Disable poly types for MVE.
+ (arm_init_neon_builtins): Move a check to arm_init_builtins function.
+ (arm_init_builtins): Use ARM_BUILTIN_SIMD_LANE_CHECK instead of
+ ARM_BUILTIN_NEON_LANE_CHECK.
+ (mve_dereference_pointer): Add function.
+ (arm_expand_builtin_args): Call to mve_dereference_pointer when MVE is
+ enabled.
+ (arm_expand_neon_builtin): Moved to arm_expand_builtin function.
+ (arm_expand_builtin): Moved from arm_expand_neon_builtin function.
+ * config/arm/arm-c.c (__ARM_FEATURE_MVE): Define macro for MVE and MVE
+ with floating point enabled.
+ * config/arm/arm-protos.h (neon_immediate_valid_for_move): Renamed to
+ simd_immediate_valid_for_move.
+ (simd_immediate_valid_for_move): Renamed from
+ neon_immediate_valid_for_move function.
+ * config/arm/arm.c (arm_options_perform_arch_sanity_checks): Generate
+ error if vfpv2 feature bit is disabled and mve feature bit is also
+ disabled for HARD_FLOAT_ABI.
+ (use_return_insn): Check to not push VFP regs for MVE.
+ (aapcs_vfp_allocate): Add MVE check to have same Procedure Call Standard
+ as Neon.
+ (aapcs_vfp_allocate_return_reg): Likewise.
+ (thumb2_legitimate_address_p): Check to return 0 on valid Thumb-2
+ address operand for MVE.
+ (arm_rtx_costs_internal): MVE check to determine cost of rtx.
+ (neon_valid_immediate): Rename to simd_valid_immediate.
+ (simd_valid_immediate): Rename from neon_valid_immediate.
+ (simd_valid_immediate): MVE check on size of vector is 128 bits.
+ (neon_immediate_valid_for_move): Rename to
+ simd_immediate_valid_for_move.
+ (simd_immediate_valid_for_move): Rename from
+ neon_immediate_valid_for_move.
+ (neon_immediate_valid_for_logic): Modify call to neon_valid_immediate
+ function.
+ (neon_make_constant): Modify call to neon_valid_immediate function.
+ (neon_vector_mem_operand): Return VFP register for POST_INC or PRE_DEC
+ for MVE.
+ (output_move_neon): Add MVE check to generate vldm/vstm instructions.
+ (arm_compute_frame_layout): Calculate space for saved VFP registers for
+ MVE.
+ (arm_save_coproc_regs): Save coproc registers for MVE.
+ (arm_print_operand): Add case 'E' to print memory operands for MVE.
+ (arm_print_operand_address): Check to print register number for MVE.
+ (arm_hard_regno_mode_ok): Check for arm hard regno mode ok for MVE.
+ (arm_modes_tieable_p): Check to allow structure mode for MVE.
+ (arm_regno_class): Add VPR_REGNUM check.
+ (arm_expand_epilogue_apcs_frame): MVE check to calculate epilogue code
+ for APCS frame.
+ (arm_expand_epilogue): MVE check for enabling pop instructions in
+ epilogue.
+ (arm_print_asm_arch_directives): Modify function to disable print of
+ .arch_extension "mve" and "fp" for cases where MVE is enabled with
+ "SOFT FLOAT ABI".
+ (arm_vector_mode_supported_p): Check for modes available in MVE integer
+ and MVE floating point.
+ (arm_array_mode_supported_p): Add TARGET_HAVE_MVE check for array mode
+ pointer support.
+ (arm_conditional_register_usage): Enable usage of conditional register
+ for MVE.
+ (fixed_regs[VPR_REGNUM]): Enable VPR_REG for MVE.
+ (arm_declare_function_name): Modify function to disable print of
+ .arch_extension "mve" and "fp" for cases where MVE is enabled with
+ "SOFT FLOAT ABI".
+ * config/arm/arm.h (TARGET_HAVE_MVE): Disable for soft float abi and
+ when target general registers are required.
+ (TARGET_HAVE_MVE_FLOAT): Likewise.
+ (FIXED_REGISTERS): Add bit for the VPR register, which is enabled in arm.c
+ for MVE.
+ (CALL_USED_REGISTERS): Set bit for the VPR register in CALL_USED_REGISTERS,
+ which indicates it is not preserved across function calls.
+ (FIRST_PSEUDO_REGISTER): Modify.
+ (VALID_MVE_MODE): Define valid MVE mode.
+ (VALID_MVE_SI_MODE): Define valid MVE SI mode.
+ (VALID_MVE_SF_MODE): Define valid MVE SF mode.
+ (VALID_MVE_STRUCT_MODE): Define valid MVE struct mode.
+ (VPR_REGNUM): Add Vector Predication Register in arm_regs_in_sequence
+ for MVE.
+ (IS_VPR_REGNUM): Macro to check for VPR_REG register.
+ (REG_ALLOC_ORDER): Add VPR_REGNUM entry.
+ (enum reg_class): Add VPR_REG entry.
+ (REG_CLASS_NAMES): Add VPR_REG entry.
+ * config/arm/arm.md (VPR_REGNUM): Define.
+ (conds): Check is_mve_type attribute to differentiate "conditional" and
+ "unconditional" instructions.
+ (arm_movsf_soft_insn): Modify RTL to not allow for MVE.
+ (movdf_soft_insn): Modify RTL to not allow for MVE.
+ (vfp_pop_multiple_with_writeback): Enable for MVE.
+ (include "mve.md"): Include mve.md file.
+ * config/arm/arm_mve.h: Add MVE intrinsics header file.
+ * config/arm/constraints.md (Up): Constraint to enable "p0" register in MVE
+ for vector predicated operands.
+ * config/arm/iterators.md (VNIM1): Define.
+ (VNINOTM1): Define.
+ (VHFBF_split): Define.
+ * config/arm/mve.md: New file.
+ (mve_mov<mode>): Define RTL for move, store and load in MVE.
+ (mve_mov<mode>): Define move RTL pattern with vec_duplicate operator for
+ second operand.
+ * config/arm/neon.md (neon_immediate_valid_for_move): Rename to
+ simd_immediate_valid_for_move.
+ (neon_mov<mode>): Split pattern and move expand pattern "movv8hf" which
+ is common to MVE and NEON to vec-common.md file.
+ (vec_init<mode><V_elem_l>): Add TARGET_HAVE_MVE check.
+ * config/arm/predicates.md (vpr_register_operand): Define.
+ * config/arm/t-arm: Add mve.md file.
+ * config/arm/types.md (mve_move): Add MVE instructions mve_move to
+ attribute "type".
+ (mve_store): Add MVE instructions mve_store to attribute "type".
+ (mve_load): Add MVE instructions mve_load to attribute "type".
+ (is_mve_type): Define attribute.
+ * config/arm/vec-common.md (mov<mode>): Modify RTL expand to support
+ standard move patterns in MVE along with NEON and IWMMXT with mode
+ iterator VNIM1.
+ (mov<mode>): Modify RTL expand to support standard move patterns in NEON
+ and IWMMXT with mode iterator V8HF.
+ (movv8hf): Define RTL expand to support standard "movv8hf" pattern in
+ NEON and MVE.
+ * config/arm/vfp.md (neon_immediate_valid_for_move): Rename to
+ simd_immediate_valid_for_move.
+
+
2020-03-16 H.J. Lu <hongjiu.lu@intel.com>
PR target/89229
arm*-*-*)
cpu_type=arm
extra_objs="arm-builtins.o aarch-common.o"
- extra_headers="mmintrin.h arm_neon.h arm_acle.h arm_fp16.h arm_cmse.h arm_bf16.h"
+ extra_headers="mmintrin.h arm_neon.h arm_acle.h arm_fp16.h arm_cmse.h arm_bf16.h arm_mve.h"
target_type_format_char='%'
c_target_objs="arm-c.o"
cxx_target_objs="arm-c.o"
/* The assembler's names for the registers. Note that the ?xx registers are
there so that VFPv3/NEON registers D16-D31 have the same spacing as D0-D15
(each of which is overlaid on two S registers), although there are no
- actual single-precision registers which correspond to D16-D31. */
+ actual single-precision registers which correspond to D16-D31. New register
+ p0 is added which is used for MVE predicated cases. */
+
#ifndef REGISTER_NAMES
#define REGISTER_NAMES \
{ \
"wr8", "wr9", "wr10", "wr11", \
"wr12", "wr13", "wr14", "wr15", \
"wcgr0", "wcgr1", "wcgr2", "wcgr3", \
- "cc", "vfpcc", "sfp", "afp", "apsrq", "apsrge" \
+ "cc", "vfpcc", "sfp", "afp", "apsrq", "apsrge", "p0" \
}
#endif
ARM_BUILTIN_SET_FPSCR,
ARM_BUILTIN_CMSE_NONSECURE_CALLER,
+ ARM_BUILTIN_SIMD_LANE_CHECK,
#undef CRYPTO1
#undef CRYPTO2
#include "arm_vfp_builtins.def"
ARM_BUILTIN_NEON_BASE,
- ARM_BUILTIN_NEON_LANE_CHECK = ARM_BUILTIN_NEON_BASE,
#include "arm_neon_builtins.def"
an entry in our mangling table, consequently, they get default
mangling. As a further gotcha, poly8_t and poly16_t are signed
types, poly64_t and poly128_t are unsigned types. */
- arm_simd_polyQI_type_node
- = build_distinct_type_copy (intQI_type_node);
- (*lang_hooks.types.register_builtin_type) (arm_simd_polyQI_type_node,
- "__builtin_neon_poly8");
- arm_simd_polyHI_type_node
- = build_distinct_type_copy (intHI_type_node);
- (*lang_hooks.types.register_builtin_type) (arm_simd_polyHI_type_node,
- "__builtin_neon_poly16");
- arm_simd_polyDI_type_node
- = build_distinct_type_copy (unsigned_intDI_type_node);
- (*lang_hooks.types.register_builtin_type) (arm_simd_polyDI_type_node,
- "__builtin_neon_poly64");
- arm_simd_polyTI_type_node
- = build_distinct_type_copy (unsigned_intTI_type_node);
- (*lang_hooks.types.register_builtin_type) (arm_simd_polyTI_type_node,
- "__builtin_neon_poly128");
-
- /* Prevent front-ends from transforming poly vectors into string
- literals. */
- TYPE_STRING_FLAG (arm_simd_polyQI_type_node) = false;
- TYPE_STRING_FLAG (arm_simd_polyHI_type_node) = false;
-
+ if (!TARGET_HAVE_MVE)
+ {
+ arm_simd_polyQI_type_node
+ = build_distinct_type_copy (intQI_type_node);
+ (*lang_hooks.types.register_builtin_type) (arm_simd_polyQI_type_node,
+ "__builtin_neon_poly8");
+ arm_simd_polyHI_type_node
+ = build_distinct_type_copy (intHI_type_node);
+ (*lang_hooks.types.register_builtin_type) (arm_simd_polyHI_type_node,
+ "__builtin_neon_poly16");
+ arm_simd_polyDI_type_node
+ = build_distinct_type_copy (unsigned_intDI_type_node);
+ (*lang_hooks.types.register_builtin_type) (arm_simd_polyDI_type_node,
+ "__builtin_neon_poly64");
+ arm_simd_polyTI_type_node
+ = build_distinct_type_copy (unsigned_intTI_type_node);
+ (*lang_hooks.types.register_builtin_type) (arm_simd_polyTI_type_node,
+ "__builtin_neon_poly128");
+ /* Init poly vector element types with scalar poly types. */
+ arm_simd_types[Poly8x8_t].eltype = arm_simd_polyQI_type_node;
+ arm_simd_types[Poly8x16_t].eltype = arm_simd_polyQI_type_node;
+ arm_simd_types[Poly16x4_t].eltype = arm_simd_polyHI_type_node;
+ arm_simd_types[Poly16x8_t].eltype = arm_simd_polyHI_type_node;
+ /* Note: poly64x2_t is defined in arm_neon.h, to ensure it gets default
+ mangling. */
+
+ /* Prevent front-ends from transforming poly vectors into string
+ literals. */
+ TYPE_STRING_FLAG (arm_simd_polyQI_type_node) = false;
+ TYPE_STRING_FLAG (arm_simd_polyHI_type_node) = false;
+ }
/* Init all the element types built by the front-end. */
arm_simd_types[Int8x8_t].eltype = intQI_type_node;
arm_simd_types[Int8x16_t].eltype = intQI_type_node;
arm_simd_types[Uint32x4_t].eltype = unsigned_intSI_type_node;
arm_simd_types[Uint64x2_t].eltype = unsigned_intDI_type_node;
- /* Init poly vector element types with scalar poly types. */
- arm_simd_types[Poly8x8_t].eltype = arm_simd_polyQI_type_node;
- arm_simd_types[Poly8x16_t].eltype = arm_simd_polyQI_type_node;
- arm_simd_types[Poly16x4_t].eltype = arm_simd_polyHI_type_node;
- arm_simd_types[Poly16x8_t].eltype = arm_simd_polyHI_type_node;
/* Note: poly64x2_t is defined in arm_neon.h, to ensure it gets default
mangling. */
tree eltype = arm_simd_types[i].eltype;
machine_mode mode = arm_simd_types[i].mode;
+ if (eltype == NULL)
+ continue;
if (arm_simd_types[i].itype == NULL)
arm_simd_types[i].itype =
build_distinct_type_copy
system. */
arm_init_simd_builtin_scalar_types ();
- tree lane_check_fpr = build_function_type_list (void_type_node,
- intSI_type_node,
- intSI_type_node,
- NULL);
- arm_builtin_decls[ARM_BUILTIN_NEON_LANE_CHECK] =
- add_builtin_function ("__builtin_arm_lane_check", lane_check_fpr,
- ARM_BUILTIN_NEON_LANE_CHECK, BUILT_IN_MD,
- NULL, NULL_TREE);
-
for (i = 0; i < ARRAY_SIZE (neon_builtin_data); i++, fcode++)
{
arm_builtin_datum *d = &neon_builtin_data[i];
if (TARGET_MAYBE_HARD_FLOAT)
{
+ tree lane_check_fpr = build_function_type_list (void_type_node,
+ intSI_type_node,
+ intSI_type_node,
+ NULL);
+ arm_builtin_decls[ARM_BUILTIN_SIMD_LANE_CHECK]
+ = add_builtin_function ("__builtin_arm_lane_check", lane_check_fpr,
+ ARM_BUILTIN_SIMD_LANE_CHECK, BUILT_IN_MD,
+ NULL, NULL_TREE);
+
arm_init_neon_builtins ();
arm_init_vfp_builtins ();
arm_init_crypto_builtins ();
build_int_cst (build_pointer_type (array_type), 0));
}
+/* EXP is a pointer argument to a vector scatter store intrinsic.
+
+ Consider the following example:
+ VSTRW<v>.<dt> Qd, [Qm{, #+/-<imm>}]!
+ When <Qm> is used as the base register for the target address,
+ this function is used to derive and return an expression for the
+ accessed memory.
+
+ The intrinsic function operates on a block of registers that has mode
+ REG_MODE. This block contains vectors of type TYPE_MODE. The function
+ references the memory at EXP of type TYPE and in mode MEM_MODE. This
+ mode may be BLKmode if no more suitable mode is available. */
+
+static tree
+mve_dereference_pointer (tree exp, tree type, machine_mode reg_mode,
+ machine_mode vector_mode)
+{
+ HOST_WIDE_INT reg_size, vector_size, nelems;
+ tree elem_type, upper_bound, array_type;
+
+ /* Work out the size of each vector in bytes. */
+ vector_size = GET_MODE_SIZE (vector_mode);
+
+ /* Work out the size of the register block in bytes. */
+ reg_size = GET_MODE_SIZE (reg_mode);
+
+ /* Work out the type of each element. */
+ gcc_assert (POINTER_TYPE_P (type));
+ elem_type = TREE_TYPE (type);
+
+ nelems = reg_size / vector_size;
+
+ /* Create a type that describes the full access. */
+ upper_bound = build_int_cst (size_type_node, nelems - 1);
+ array_type = build_array_type (elem_type, build_index_type (upper_bound));
+
+ /* Dereference EXP using that type. */
+ return fold_build2 (MEM_REF, array_type, exp,
+ build_int_cst (build_pointer_type (array_type), 0));
+}
+
/* Expand a builtin. */
static rtx
arm_expand_builtin_args (rtx target, machine_mode map_mode, int fcode,
{
machine_mode other_mode
= insn_data[icode].operand[1 - opno].mode;
- arg[argc] = neon_dereference_pointer (arg[argc],
+ if (TARGET_HAVE_MVE && mode[argc] != other_mode)
+ {
+ arg[argc] = mve_dereference_pointer (arg[argc],
TREE_VALUE (formals),
- mode[argc], other_mode,
- map_mode);
+ other_mode, map_mode);
+ }
+ else
+ arg[argc] = neon_dereference_pointer (arg[argc],
+ TREE_VALUE (formals),
+ mode[argc], other_mode,
+ map_mode);
}
/* Use EXPAND_MEMORY for ARG_BUILTIN_MEMORY and
return const0_rtx;
}
- if (fcode == ARM_BUILTIN_NEON_LANE_CHECK)
- {
- /* Builtin is only to check bounds of the lane passed to some intrinsics
- that are implemented with gcc vector extensions in arm_neon.h. */
-
- tree nlanes = CALL_EXPR_ARG (exp, 0);
- gcc_assert (TREE_CODE (nlanes) == INTEGER_CST);
- rtx lane_idx = expand_normal (CALL_EXPR_ARG (exp, 1));
- if (CONST_INT_P (lane_idx))
- neon_lane_bounds (lane_idx, 0, TREE_INT_CST_LOW (nlanes), exp);
- else
- error ("%Klane index must be a constant immediate", exp);
- /* Don't generate any RTL. */
- return const0_rtx;
- }
-
arm_builtin_datum *d
= &neon_builtin_data[fcode - ARM_BUILTIN_NEON_PATTERN_START];
int mask;
int imm;
+ if (fcode == ARM_BUILTIN_SIMD_LANE_CHECK)
+ {
+ /* Builtin is only to check bounds of the lane passed to some intrinsics
+ that are implemented with gcc vector extensions in arm_neon.h. */
+
+ tree nlanes = CALL_EXPR_ARG (exp, 0);
+ gcc_assert (TREE_CODE (nlanes) == INTEGER_CST);
+ rtx lane_idx = expand_normal (CALL_EXPR_ARG (exp, 1));
+ if (CONST_INT_P (lane_idx))
+ neon_lane_bounds (lane_idx, 0, TREE_INT_CST_LOW (nlanes), exp);
+ else
+ error ("%Klane index must be a constant immediate", exp);
+ /* Don't generate any RTL. */
+ return const0_rtx;
+ }
+
if (fcode >= ARM_BUILTIN_ACLE_BASE)
return arm_expand_acle_builtin (fcode, exp, target);
def_or_undef_macro (pfile, "__ARM_FEATURE_COMPLEX", TARGET_COMPLEX);
def_or_undef_macro (pfile, "__ARM_32BIT_STATE", TARGET_32BIT);
+ cpp_undef (pfile, "__ARM_FEATURE_MVE");
+ if (TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT)
+ {
+ builtin_define_with_int_value ("__ARM_FEATURE_MVE", 3);
+ }
+ else if (TARGET_HAVE_MVE)
+ {
+ builtin_define_with_int_value ("__ARM_FEATURE_MVE", 1);
+ }
+
cpp_undef (pfile, "__ARM_FEATURE_CMSE");
if (arm_arch8 && !arm_arch_notm)
{
extern bool clear_operation_p (rtx, bool);
extern int arm_const_double_rtx (rtx);
extern int vfp3_const_double_rtx (rtx);
-extern int neon_immediate_valid_for_move (rtx, machine_mode, rtx *, int *);
+extern int simd_immediate_valid_for_move (rtx, machine_mode, rtx *, int *);
extern int neon_immediate_valid_for_logic (rtx, machine_mode, int, rtx *,
int *);
extern int neon_immediate_valid_for_shift (rtx, machine_mode, rtx *,
else if (TARGET_HARD_FLOAT_ABI)
{
arm_pcs_default = ARM_PCS_AAPCS_VFP;
- if (!bitmap_bit_p (arm_active_target.isa, isa_bit_vfpv2))
+ if (!bitmap_bit_p (arm_active_target.isa, isa_bit_vfpv2)
+ && !bitmap_bit_p (arm_active_target.isa, isa_bit_mve))
error ("%<-mfloat-abi=hard%>: selected processor lacks an FPU");
}
else
/* Can't be done if any of the VFP regs are pushed,
since this also requires an insn. */
- if (TARGET_HARD_FLOAT)
+ if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
for (regno = FIRST_VFP_REGNUM; regno <= LAST_VFP_REGNUM; regno++)
if (df_regs_ever_live_p (regno) && !call_used_or_fixed_reg_p (regno))
return 0;
{
pcum->aapcs_vfp_reg_alloc = mask << regno;
if (mode == BLKmode
- || (mode == TImode && ! TARGET_NEON)
+ || (mode == TImode && ! (TARGET_NEON || TARGET_HAVE_MVE))
|| ! arm_hard_regno_mode_ok (FIRST_VFP_REGNUM + regno, mode))
{
int i;
int rshift = shift;
machine_mode rmode = pcum->aapcs_vfp_rmode;
rtx par;
- if (!TARGET_NEON)
+ if (!(TARGET_NEON || TARGET_HAVE_MVE))
{
/* Avoid using unsupported vector modes. */
if (rmode == V2SImode)
if (mode == BLKmode
|| (GET_MODE_CLASS (mode) == MODE_INT
&& GET_MODE_SIZE (mode) >= GET_MODE_SIZE (TImode)
- && !TARGET_NEON))
+ && !(TARGET_NEON || TARGET_HAVE_MVE)))
{
int count;
machine_mode ag_mode;
aapcs_vfp_is_call_or_return_candidate (pcs_variant, mode, type,
&ag_mode, &count);
- if (!TARGET_NEON)
+ if (!(TARGET_NEON || TARGET_HAVE_MVE))
{
if (ag_mode == V2SImode)
ag_mode = DImode;
&& CONST_INT_P (XEXP (XEXP (x, 0), 1)))))
return 1;
- else if (mode == TImode || (TARGET_NEON && VALID_NEON_STRUCT_MODE (mode)))
+ else if (mode == TImode
+ || (TARGET_NEON && VALID_NEON_STRUCT_MODE (mode))
+ || (TARGET_HAVE_MVE && VALID_MVE_STRUCT_MODE (mode)))
return 0;
else if (code == PLUS)
/* Assume that most copies can be done with a single insn,
unless we don't have HW FP, in which case everything
larger than word mode will require two insns. */
- *cost = COSTS_N_INSNS (((!TARGET_HARD_FLOAT
+ *cost = COSTS_N_INSNS (((!(TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
&& GET_MODE_SIZE (mode) > 4)
|| mode == DImode)
? 2 : 1);
case CONST_VECTOR:
/* Fixme. */
- if (TARGET_NEON
- && TARGET_HARD_FLOAT
- && (VALID_NEON_DREG_MODE (mode) || VALID_NEON_QREG_MODE (mode))
- && neon_immediate_valid_for_move (x, mode, NULL, NULL))
+ if (((TARGET_NEON && TARGET_HARD_FLOAT
+ && (VALID_NEON_DREG_MODE (mode) || VALID_NEON_QREG_MODE (mode)))
+ || TARGET_HAVE_MVE)
+ && simd_immediate_valid_for_move (x, mode, NULL, NULL))
*cost = COSTS_N_INSNS (1);
else
*cost = COSTS_N_INSNS (4);
return vfp3_const_double_index (x) != -1;
}
-/* Recognize immediates which can be used in various Neon instructions. Legal
- immediates are described by the following table (for VMVN variants, the
+/* Recognize immediates which can be used in various Neon and MVE instructions.
+ Legal immediates are described by the following table (for VMVN variants, the
bitwise inverse of the constant shown is recognized. In either case, VMOV
is output and the correct instruction to use for a given constant is chosen
by the assembler). The constant shown is replicated across all elements of
-1 if the given value doesn't match any of the listed patterns.
*/
static int
-neon_valid_immediate (rtx op, machine_mode mode, int inverse,
+simd_valid_immediate (rtx op, machine_mode mode, int inverse,
rtx *modconst, int *elementwidth)
{
#define CHECK(STRIDE, ELSIZE, CLASS, TEST) \
innersize = GET_MODE_UNIT_SIZE (mode);
+ /* Only support 128-bit vectors for MVE. */
+ if (TARGET_HAVE_MVE && (!vector || n_elts * innersize != 16))
+ return -1;
+
/* Vectors of float constants. */
if (GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT)
{
#undef CHECK
}
-/* Return TRUE if rtx X is legal for use as either a Neon VMOV (or, implicitly,
- VMVN) immediate. Write back width per element to *ELEMENTWIDTH (or zero for
- float elements), and a modified constant (whatever should be output for a
- VMOV) in *MODCONST. */
-
+/* Return TRUE if rtx X is legal for use as either a Neon or MVE VMOV (or,
+ implicitly, VMVN) immediate. Write back width per element to *ELEMENTWIDTH
+ (or zero for float elements), and a modified constant (whatever should be
+ output for a VMOV) in *MODCONST. "neon_immediate_valid_for_move" function is
+ modified to "simd_immediate_valid_for_move" as this function will be used
+ both by neon and mve. */
int
-neon_immediate_valid_for_move (rtx op, machine_mode mode,
+simd_immediate_valid_for_move (rtx op, machine_mode mode,
rtx *modconst, int *elementwidth)
{
rtx tmpconst;
int tmpwidth;
- int retval = neon_valid_immediate (op, mode, 0, &tmpconst, &tmpwidth);
+ int retval = simd_valid_immediate (op, mode, 0, &tmpconst, &tmpwidth);
if (retval == -1)
return 0;
/* Return TRUE if rtx X is legal for use in a VORR or VBIC instruction. If
the immediate is valid, write a constant suitable for using as an operand
to VORR/VBIC/VAND/VORN to *MODCONST and the corresponding element width to
- *ELEMENTWIDTH. See neon_valid_immediate for description of INVERSE. */
+ *ELEMENTWIDTH. See simd_valid_immediate for description of INVERSE. */
int
neon_immediate_valid_for_logic (rtx op, machine_mode mode, int inverse,
{
rtx tmpconst;
int tmpwidth;
- int retval = neon_valid_immediate (op, mode, inverse, &tmpconst, &tmpwidth);
+ int retval = simd_valid_immediate (op, mode, inverse, &tmpconst, &tmpwidth);
if (retval < 0 || retval > 5)
return 0;
gcc_unreachable ();
if (const_vec != NULL
- && neon_immediate_valid_for_move (const_vec, mode, NULL, NULL))
+ && simd_immediate_valid_for_move (const_vec, mode, NULL, NULL))
/* Load using VMOV. On Cortex-A8 this takes one cycle. */
return const_vec;
else if ((target = neon_vdup_constant (vals)) != NULL_RTX)
&& (INTVAL (XEXP (ind, 1)) & 3) == 0)
return TRUE;
+ if (type == 1 && TARGET_HAVE_MVE
+ && (GET_CODE (ind) == POST_INC || GET_CODE (ind) == PRE_DEC))
+ {
+ rtx ind1 = XEXP (ind, 0);
+ if (!REG_P (ind1))
+ return 0;
+ return VFP_REGNO_OK_FOR_SINGLE (REGNO (ind1));
+ }
+
return FALSE;
}
{
case POST_INC:
/* We have to use vldm / vstm for too-large modes. */
- if (nregs > 4)
+ if (nregs > 4 || (TARGET_HAVE_MVE && nregs >= 2))
{
templ = "v%smia%%?\t%%0!, %%h1";
ops[0] = XEXP (addr, 0);
/* We have to use vldm / vstm for too-large modes. */
if (nregs > 1)
{
- if (nregs > 4)
+ if (nregs > 4 || (TARGET_HAVE_MVE && nregs >= 2))
templ = "v%smia%%?\t%%m0, %%h1";
else
templ = "v%s1.64\t%%h1, %%A0";
{
int i;
int overlap = -1;
- for (i = 0; i < nregs; i++)
+ if (TARGET_HAVE_MVE && !BYTES_BIG_ENDIAN
+ && GET_CODE (addr) != LABEL_REF)
+ {
+ sprintf (buff, "v%srw.32\t%%q0, %%1", load ? "ld" : "st");
+ ops[0] = reg;
+ ops[1] = mem;
+ output_asm_insn (buff, ops);
+ }
+ else
{
- /* We're only using DImode here because it's a convenient size. */
- ops[0] = gen_rtx_REG (DImode, REGNO (reg) + 2 * i);
- ops[1] = adjust_address (mem, DImode, 8 * i);
- if (reg_overlap_mentioned_p (ops[0], mem))
+ for (i = 0; i < nregs; i++)
{
- gcc_assert (overlap == -1);
- overlap = i;
+ /* We're only using DImode here because it's a convenient
+ size. */
+ ops[0] = gen_rtx_REG (DImode, REGNO (reg) + 2 * i);
+ ops[1] = adjust_address (mem, DImode, 8 * i);
+ if (reg_overlap_mentioned_p (ops[0], mem))
+ {
+ gcc_assert (overlap == -1);
+ overlap = i;
+ }
+ else
+ {
+ if (TARGET_HAVE_MVE && GET_CODE (addr) == LABEL_REF)
+ sprintf (buff, "v%sr.64\t%%P0, %%1", load ? "ld" : "st");
+ else
+ sprintf (buff, "v%sr%%?\t%%P0, %%1", load ? "ld" : "st");
+ output_asm_insn (buff, ops);
+ }
}
- else
+ if (overlap != -1)
{
- sprintf (buff, "v%sr%%?\t%%P0, %%1", load ? "ld" : "st");
+ ops[0] = gen_rtx_REG (DImode, REGNO (reg) + 2 * overlap);
+ ops[1] = adjust_address (mem, SImode, 8 * overlap);
+ if (TARGET_HAVE_MVE && GET_CODE (addr) == LABEL_REF)
+ sprintf (buff, "v%sr.32\t%%P0, %%1", load ? "ld" : "st");
+ else
+ sprintf (buff, "v%sr%%?\t%%P0, %%1", load ? "ld" : "st");
output_asm_insn (buff, ops);
}
}
- if (overlap != -1)
- {
- ops[0] = gen_rtx_REG (DImode, REGNO (reg) + 2 * overlap);
- ops[1] = adjust_address (mem, SImode, 8 * overlap);
- sprintf (buff, "v%sr%%?\t%%P0, %%1", load ? "ld" : "st");
- output_asm_insn (buff, ops);
- }
return "";
}
func_type = arm_current_func_type ();
/* Space for saved VFP registers. */
if (! IS_VOLATILE (func_type)
- && TARGET_HARD_FLOAT)
+ && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE))
saved += arm_get_vfp_saved_size ();
/* Allocate space for saving/restoring FPCXTNS in Armv8.1-M Mainline
saved_size += 8;
}
- if (TARGET_HARD_FLOAT)
+ if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
{
start_reg = FIRST_VFP_REGNUM;
}
return;
+ /* To print the memory operand with "Us" constraint.  Based on the rtx_code,
+ the memory operand output looks like one of the following:
+ 1. [Rn], #+/-<imm>
+ 2. [Rn, #+/-<imm>]!
+ 3. [Rn]. */
+ case 'E':
+ {
+ rtx addr;
+ rtx postinc_reg = NULL;
+ unsigned inc_val = 0;
+ enum rtx_code code;
+
+ gcc_assert (MEM_P (x));
+ addr = XEXP (x, 0);
+ code = GET_CODE (addr);
+ if (code == POST_INC || code == POST_DEC || code == PRE_INC
+ || code == PRE_DEC)
+ {
+ asm_fprintf (stream, "[%r", REGNO (XEXP (addr, 0)));
+ inc_val = GET_MODE_SIZE (GET_MODE (x));
+ if (code == POST_INC || code == POST_DEC)
+ asm_fprintf (stream, "], #%s%d",(code == POST_INC)
+ ? "": "-", inc_val);
+ else
+ asm_fprintf (stream, ", #%s%d]!",(code == PRE_INC)
+ ? "": "-", inc_val);
+ }
+ else if (code == POST_MODIFY || code == PRE_MODIFY)
+ {
+ asm_fprintf (stream, "[%r", REGNO (XEXP (addr, 0)));
+ postinc_reg = XEXP ( XEXP (x, 1), 1);
+ if (postinc_reg && CONST_INT_P (postinc_reg))
+ {
+ if (code == POST_MODIFY)
+ asm_fprintf (stream, "], #%wd",INTVAL (postinc_reg));
+ else
+ asm_fprintf (stream, ", #%wd]!",INTVAL (postinc_reg));
+ }
+ }
+ else
+ {
+ gcc_assert (REG_P (addr));
+ asm_fprintf (stream, "[%r]",REGNO (addr));
+ }
+ }
+ return;
+
case 'C':
{
rtx addr;
REGNO (XEXP (x, 0)),
GET_CODE (x) == PRE_DEC ? "-" : "",
GET_MODE_SIZE (mode));
+ else if (TARGET_HAVE_MVE && (mode == OImode || mode == XImode))
+ asm_fprintf (stream, "[%r]!", REGNO (XEXP (x,0)));
else
- asm_fprintf (stream, "[%r], #%s%d",
- REGNO (XEXP (x, 0)),
+ asm_fprintf (stream, "[%r], #%s%d", REGNO (XEXP (x, 0)),
GET_CODE (x) == POST_DEC ? "-" : "",
GET_MODE_SIZE (mode));
}
{
if (GET_MODE_CLASS (mode) == MODE_CC)
return (regno == CC_REGNUM
- || (TARGET_HARD_FLOAT
+ || ((TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
&& regno == VFPCC_REGNUM));
if (regno == CC_REGNUM && GET_MODE_CLASS (mode) != MODE_CC)
return false;
+ if (IS_VPR_REGNUM (regno))
+ return true;
+
if (TARGET_THUMB1)
/* For the Thumb we only allow values bigger than SImode in
registers 0 - 6, so that there is always a second low
start of an even numbered register pair. */
return (ARM_NUM_REGS (mode) < 2) || (regno < LAST_LO_REGNUM);
- if (TARGET_HARD_FLOAT && IS_VFP_REGNUM (regno))
+ if ((TARGET_HARD_FLOAT || TARGET_HAVE_MVE) && IS_VFP_REGNUM (regno))
{
if (mode == DFmode)
return VFP_REGNO_OK_FOR_DOUBLE (regno);
|| (mode == OImode && NEON_REGNO_OK_FOR_NREGS (regno, 4))
|| (mode == CImode && NEON_REGNO_OK_FOR_NREGS (regno, 6))
|| (mode == XImode && NEON_REGNO_OK_FOR_NREGS (regno, 8));
+ if (TARGET_HAVE_MVE)
+ return ((VALID_MVE_MODE (mode) && NEON_REGNO_OK_FOR_QUAD (regno))
+ || (mode == OImode && NEON_REGNO_OK_FOR_NREGS (regno, 4))
+ || (mode == XImode && NEON_REGNO_OK_FOR_NREGS (regno, 8)));
return false;
}
/* We specifically want to allow elements of "structure" modes to
be tieable to the structure. This more general condition allows
other rarer situations too. */
- if (TARGET_NEON
- && (VALID_NEON_DREG_MODE (mode1)
- || VALID_NEON_QREG_MODE (mode1)
- || VALID_NEON_STRUCT_MODE (mode1))
- && (VALID_NEON_DREG_MODE (mode2)
- || VALID_NEON_QREG_MODE (mode2)
- || VALID_NEON_STRUCT_MODE (mode2)))
+ if ((TARGET_NEON
+ && (VALID_NEON_DREG_MODE (mode1)
+ || VALID_NEON_QREG_MODE (mode1)
+ || VALID_NEON_STRUCT_MODE (mode1))
+ && (VALID_NEON_DREG_MODE (mode2)
+ || VALID_NEON_QREG_MODE (mode2)
+ || VALID_NEON_STRUCT_MODE (mode2)))
+ || (TARGET_HAVE_MVE
+ && (VALID_MVE_MODE (mode1)
+ || VALID_MVE_STRUCT_MODE (mode1))
+ && (VALID_MVE_MODE (mode2)
+ || VALID_MVE_STRUCT_MODE (mode2))))
return true;
return false;
if (regno == PC_REGNUM)
return NO_REGS;
+ if (IS_VPR_REGNUM (regno))
+ return VPR_REG;
+
if (TARGET_THUMB1)
{
if (regno == STACK_POINTER_REGNUM)
floats_from_frame += 4;
}
- if (TARGET_HARD_FLOAT)
+ if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
{
int start_reg;
rtx ip_rtx = gen_rtx_REG (SImode, IP_REGNUM);
}
}
- if (TARGET_HARD_FLOAT)
+ if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
{
/* Generate VFP register multi-pop. */
int end_reg = LAST_VFP_REGNUM + 1;
add_reg_note (insn, REG_FRAME_RELATED_EXPR, dwarf);
RTX_FRAME_RELATED_P (insn) = 1;
}
- }
+ }
if (!really_return)
return;
{
arm_initialize_isa (opt_bits, opt->isa_bits);
+ /* For the cases "-march=armv8.1-m.main+mve -mfloat-abi=soft" and
+ "-march=armv8.1-m.main+mve.fp -mfloat-abi=soft", MVE and MVE with
+ floating point instructions are disabled. So the following check
+ restricts the printing of ".arch_extension mve" and
+ ".arch_extension fp" (for mve.fp) in the assembly file. MVE needs
+ this special behaviour because the feature bits "mve" and
+ "mve_float" are not part of the "fpu bits", so they are not cleared
+ when -mfloat-abi=soft (i.e. nofp) but the macros TARGET_HAVE_MVE and
+ TARGET_HAVE_MVE_FLOAT are disabled. */
+ if ((bitmap_bit_p (opt_bits, isa_bit_mve) && !TARGET_HAVE_MVE)
+ || (bitmap_bit_p (opt_bits, isa_bit_mve_float)
+ && !TARGET_HAVE_MVE_FLOAT))
+ continue;
+
/* If every feature bit of this option is set in the target
ISA specification, print out the option name. However,
don't print anything if all the bits are part of the
|| mode == V2HAmode))
return true;
+ if (TARGET_HAVE_MVE
+ && (mode == V2DImode || mode == V4SImode || mode == V8HImode
+ || mode == V16QImode))
+ return true;
+
+ if (TARGET_HAVE_MVE_FLOAT
+ && (mode == V2DFmode || mode == V4SFmode || mode == V8HFmode))
+ return true;
+
return false;
}
&& (nelems >= 2 && nelems <= 4))
return true;
+ if (TARGET_HAVE_MVE && !BYTES_BIG_ENDIAN
+ && VALID_MVE_MODE (mode) && (nelems == 2 || nelems == 4))
+ return true;
+
return false;
}
if (TARGET_THUMB1)
fixed_regs[LR_REGNUM] = call_used_regs[LR_REGNUM] = 1;
- if (TARGET_32BIT && TARGET_HARD_FLOAT)
+ if (TARGET_32BIT && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE))
{
/* VFPv3 registers are disabled when earlier VFP
versions are selected due to the definition of
call_used_regs[regno] = regno < FIRST_VFP_REGNUM + 16
|| regno >= FIRST_VFP_REGNUM + 32;
}
+ if (TARGET_HAVE_MVE)
+ fixed_regs[VPR_REGNUM] = 0;
}
if (TARGET_REALLY_IWMMXT && !TARGET_GENERAL_REGS_ONLY)
if (!opt->remove)
{
arm_initialize_isa (opt_bits, opt->isa_bits);
+ /* For the cases "-march=armv8.1-m.main+mve -mfloat-abi=soft"
+ and "-march=armv8.1-m.main+mve.fp -mfloat-abi=soft", MVE and
+ MVE with floating point instructions are disabled. So the
+ following check restricts the printing of ".arch_extension
+ mve" and ".arch_extension fp" (for mve.fp) in the assembly
+ file. MVE needs this special behaviour because the
+ feature bits "mve" and "mve_float" are not part of the
+ "fpu bits", so they are not cleared when -mfloat-abi=soft
+ (i.e. nofp) but the macros TARGET_HAVE_MVE and
+ TARGET_HAVE_MVE_FLOAT are disabled. */
+ if ((bitmap_bit_p (opt_bits, isa_bit_mve) && !TARGET_HAVE_MVE)
+ || (bitmap_bit_p (opt_bits, isa_bit_mve_float)
+ && !TARGET_HAVE_MVE_FLOAT))
+ continue;
if (bitmap_subset_p (opt_bits, arm_active_target.isa)
&& !bitmap_subset_p (opt_bits, isa_all_fpubits_internal))
asm_fprintf (asm_out_file, "\t.arch_extension %s\n",
instructions (most are floating-point related). */
#define TARGET_HAVE_FPCXT_CMSE (arm_arch8_1m_main)
-#define TARGET_HAVE_MVE (bitmap_bit_p (arm_active_target.isa, \
- isa_bit_mve))
+#define TARGET_HAVE_MVE (arm_float_abi != ARM_FLOAT_ABI_SOFT \
+ && bitmap_bit_p (arm_active_target.isa, \
+ isa_bit_mve) \
+ && !TARGET_GENERAL_REGS_ONLY)
-#define TARGET_HAVE_MVE_FLOAT (bitmap_bit_p (arm_active_target.isa, \
- isa_bit_mve_float))
+#define TARGET_HAVE_MVE_FLOAT (arm_float_abi != ARM_FLOAT_ABI_SOFT \
+ && bitmap_bit_p (arm_active_target.isa, \
+ isa_bit_mve_float) \
+ && !TARGET_GENERAL_REGS_ONLY)
/* Nonzero if integer division instructions supported. */
#define TARGET_IDIV ((TARGET_ARM && arm_arch_arm_hwdiv) \
/* s0-s15 VFP scratch (aka d0-d7).
s16-s31 S VFP variable (aka d8-d15).
vfpcc Not a real register. Represents the VFP condition
- code flags. */
+ code flags.
+ vpr Used to represent MVE VPR predication. */
/* The stack backtrace structure is as follows:
fp points to here: | save code pointer | [fp]
1,1,1,1,1,1,1,1, \
1,1,1,1, \
/* Specials. */ \
- 1,1,1,1,1,1 \
+ 1,1,1,1,1,1,1 \
}
/* 1 for registers not available across function calls.
1,1,1,1,1,1,1,1, \
1,1,1,1, \
/* Specials. */ \
- 1,1,1,1,1,1 \
+ 1,1,1,1,1,1,1 \
}
#ifndef SUBTARGET_CONDITIONAL_REGISTER_USAGE
&& (LAST_VFP_REGNUM - (REGNUM) >= 2 * (N) - 1))
/* The number of hard registers is 16 ARM + 1 CC + 1 SFP + 1 AFP
- + 1 APSRQ + 1 APSRGE. */
+ + 1 APSRQ + 1 APSRGE + 1 VPR. */
/* Intel Wireless MMX Technology registers add 16 + 4 more. */
/* VFP (VFP3) adds 32 (64) + 1 VFPCC. */
-#define FIRST_PSEUDO_REGISTER 106
+#define FIRST_PSEUDO_REGISTER 107
#define DBX_REGISTER_NUMBER(REGNO) arm_dbx_register_number (REGNO)
|| (MODE) == V8HFmode || (MODE) == V4SFmode || (MODE) == V2DImode \
|| (MODE) == V8BFmode)
+#define VALID_MVE_MODE(MODE) \
+ ((MODE) == V2DImode ||(MODE) == V4SImode || (MODE) == V8HImode \
+ || (MODE) == V16QImode || (MODE) == V8HFmode || (MODE) == V4SFmode \
+ || (MODE) == V2DFmode)
+
+#define VALID_MVE_SI_MODE(MODE) \
+ ((MODE) == V2DImode ||(MODE) == V4SImode || (MODE) == V8HImode \
+ || (MODE) == V16QImode)
+
+#define VALID_MVE_SF_MODE(MODE) \
+ ((MODE) == V8HFmode || (MODE) == V4SFmode || (MODE) == V2DFmode)
+
/* Structure modes valid for Neon registers. */
#define VALID_NEON_STRUCT_MODE(MODE) \
((MODE) == TImode || (MODE) == EImode || (MODE) == OImode \
|| (MODE) == CImode || (MODE) == XImode)
+#define VALID_MVE_STRUCT_MODE(MODE) \
+ ((MODE) == TImode || (MODE) == OImode || (MODE) == XImode)
+
/* The register numbers in sequence, for passing to arm_gen_load_multiple. */
extern int arm_regs_in_sequence[];
/* Registers not for general use. */ \
CC_REGNUM, VFPCC_REGNUM, \
FRAME_POINTER_REGNUM, ARG_POINTER_REGNUM, \
- SP_REGNUM, PC_REGNUM, APSRQ_REGNUM, APSRGE_REGNUM \
+ SP_REGNUM, PC_REGNUM, APSRQ_REGNUM, \
+ APSRGE_REGNUM, VPR_REGNUM \
}
+#define IS_VPR_REGNUM(REGNUM) \
+ ((REGNUM) == VPR_REGNUM)
+
/* Use different register alloc ordering for Thumb. */
#define ADJUST_REG_ALLOC_ORDER arm_order_regs_for_local_alloc ()
VFPCC_REG,
SFP_REG,
AFP_REG,
+ VPR_REG,
ALL_REGS,
LIM_REG_CLASSES
};
#define N_REG_CLASSES (int) LIM_REG_CLASSES
/* Give names of register classes as strings for dump file. */
-#define REG_CLASS_NAMES \
+#define REG_CLASS_NAMES \
{ \
"NO_REGS", \
"LO_REGS", \
"VFPCC_REG", \
"SFP_REG", \
"AFP_REG", \
+ "VPR_REG", \
"ALL_REGS" \
}
{ 0x00000000, 0x00000000, 0x00000000, 0x00000020 }, /* VFPCC_REG */ \
{ 0x00000000, 0x00000000, 0x00000000, 0x00000040 }, /* SFP_REG */ \
{ 0x00000000, 0x00000000, 0x00000000, 0x00000080 }, /* AFP_REG */ \
- { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000000F } /* ALL_REGS */ \
+ { 0x00000000, 0x00000000, 0x00000000, 0x00000400 }, /* VPR_REG. */ \
+ { 0xFFFF7FFF, 0xFFFFFFFF, 0xFFFFFFFF, 0x0000000F } /* ALL_REGS. */ \
}
#define FP_SYSREGS \
(VFPCC_REGNUM 101) ; VFP Condition code pseudo register
(APSRQ_REGNUM 104) ; Q bit pseudo register
(APSRGE_REGNUM 105) ; GE bits pseudo register
+ (VPR_REGNUM 106) ; Vector Predication Register - MVE register.
]
)
;; 3rd operand to select_dominance_cc_mode
(ior (eq_attr "is_thumb1" "yes")
(eq_attr "type" "call"))
(const_string "clob")
- (if_then_else (eq_attr "is_neon_type" "no")
- (const_string "nocond")
- (const_string "unconditional"))))
+ (if_then_else
+ (ior (eq_attr "is_neon_type" "no")
+ (eq_attr "is_mve_type" "no"))
+ (const_string "nocond")
+ (const_string "unconditional"))))
; Predicable means that the insn can be conditionally executed based on
; an automatically added predicate (additional patterns are generated by
[(set (match_operand:SF 0 "nonimmediate_operand" "=r,r,m")
(match_operand:SF 1 "general_operand" "r,mE,r"))]
"TARGET_32BIT
- && TARGET_SOFT_FLOAT
+ && TARGET_SOFT_FLOAT && !TARGET_HAVE_MVE
&& (!MEM_P (operands[0])
|| register_operand (operands[1], SFmode))"
{
(define_insn "*movdf_soft_insn"
[(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=r,r,r,r,m")
- (match_operand:DF 1 "soft_df_operand" "rDa,Db,Dc,mF,r"))]
- "TARGET_32BIT && TARGET_SOFT_FLOAT
+ (match_operand:DF 1 "soft_df_operand" "rDa,Db,Dc,mF,r"))]
+ "TARGET_32BIT && TARGET_SOFT_FLOAT && !TARGET_HAVE_MVE
&& ( register_operand (operands[0], DFmode)
|| register_operand (operands[1], DFmode))"
"*
(match_operand:SI 2 "const_int_I_operand" "I")))
(set (match_operand:DF 3 "vfp_hard_register_operand" "")
(mem:DF (match_dup 1)))])]
- "TARGET_32BIT && TARGET_HARD_FLOAT"
+ "TARGET_32BIT && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)"
"*
{
int num_regs = XVECLEN (operands[0], 0);
(set_attr "length" "8")]
)
-;; Vector bits common to IWMMXT and Neon
+;; Vector bits common to IWMMXT, Neon and MVE
(include "vec-common.md")
;; Load the Intel Wireless Multimedia Extension patterns
(include "iwmmxt.md")
(include "sync.md")
;; Fixed-point patterns
(include "arm-fixed.md")
+;; M-profile Vector Extension
+(include "mve.md")
--- /dev/null
+/* Arm MVE intrinsics include file.
+
+ Copyright (C) 2019-2020 Free Software Foundation, Inc.
+ Contributed by Arm.
+
+ This file is part of GCC.
+
+ GCC is free software; you can redistribute it and/or modify it
+ under the terms of the GNU General Public License as published
+ by the Free Software Foundation; either version 3, or (at your
+ option) any later version.
+
+ GCC is distributed in the hope that it will be useful, but WITHOUT
+ ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+ or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
+ License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with GCC; see the file COPYING3. If not see
+ <http://www.gnu.org/licenses/>. */
+
+#ifndef _GCC_ARM_MVE_H
+#define _GCC_ARM_MVE_H
+
+#if !__ARM_FEATURE_MVE
+#error "MVE feature not supported"
+#endif
+
+#include <stdint.h>
+#ifndef __cplusplus
+#include <stdbool.h>
+#endif
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#if (__ARM_FEATURE_MVE & 2) /* MVE Floating point. */
+typedef __fp16 float16_t;
+typedef float float32_t;
+typedef __simd128_float16_t float16x8_t;
+typedef __simd128_float32_t float32x4_t;
+#endif
+
+typedef uint16_t mve_pred16_t;
+typedef __simd128_uint8_t uint8x16_t;
+typedef __simd128_uint16_t uint16x8_t;
+typedef __simd128_uint32_t uint32x4_t;
+typedef __simd128_uint64_t uint64x2_t;
+typedef __simd128_int8_t int8x16_t;
+typedef __simd128_int16_t int16x8_t;
+typedef __simd128_int32_t int32x4_t;
+typedef __simd128_int64_t int64x2_t;
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _GCC_ARM_MVE_H. */
;; in all states: Pf, Pg
;; The following memory constraints have been used:
-;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us
+;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us, Up
;; in ARM state: Uq
;; in Thumb state: Uu, Uw
;; in all states: Q
+(define_register_constraint "Up" "TARGET_HAVE_MVE ? VPR_REG : NO_REGS"
+ "MVE VPR register")
(define_register_constraint "t" "TARGET_32BIT ? VFP_LO_REGS : NO_REGS"
"The VFP registers @code{s0}-@code{s31}.")
;; Integer and float modes supported by Neon and IWMMXT.
(define_mode_iterator VALL [V2DI V2SI V4HI V8QI V2SF V4SI V8HI V16QI V4SF])
+;; Integer and float modes supported by Neon, IWMMXT and MVE.
+(define_mode_iterator VNIM1 [V16QI V8HI V4SI V4SF V2DI])
+
+;; Integer and float modes supported by Neon and IWMMXT but not MVE.
+(define_mode_iterator VNINOTM1 [V2SI V4HI V8QI V2SF])
+
;; Integer and float modes supported by Neon and IWMMXT, except V2DI.
(define_mode_iterator VALLW [V2SI V4HI V8QI V2SF V4SI V8HI V16QI V4SF])
;; 16-bit floating-point vector modes suitable for moving (includes BFmode).
(define_mode_iterator VHFBF [V8HF V4HF V4BF V8BF])
+;; 16-bit floating-point vector modes suitable for moving (includes BFmode,
+;; but not V8HF).
+(define_mode_iterator VHFBF_split [V4HF V4BF V8BF])
+
;; 16-bit floating-point scalar modes suitable for moving (includes BFmode).
(define_mode_iterator HFBF [HF BF])
--- /dev/null
+;; Arm M-profile Vector Extension Machine Description
+;; Copyright (C) 2019-2020 Free Software Foundation, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but
+;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+;; General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3. If not see
+;; <http://www.gnu.org/licenses/>.
+
+(define_mode_iterator MVE_types [V16QI V8HI V4SI V2DI TI V8HF V4SF V2DF])
+(define_mode_attr V_sz_elem2 [(V16QI "s8") (V8HI "u16") (V4SI "u32")
+ (V2DI "u64")])
+
+(define_insn "*mve_mov<mode>"
+ [(set (match_operand:MVE_types 0 "nonimmediate_operand" "=w,w,r,w,w,r,w,Us")
+ (match_operand:MVE_types 1 "general_operand" "w,r,w,Dn,Usi,r,Dm,w"))]
+ "TARGET_HAVE_MVE || TARGET_HAVE_MVE_FLOAT"
+{
+ if (which_alternative == 3 || which_alternative == 6)
+ {
+ int width, is_valid;
+ static char templ[40];
+
+ is_valid = simd_immediate_valid_for_move (operands[1], <MODE>mode,
+ &operands[1], &width);
+
+ gcc_assert (is_valid != 0);
+
+ if (width == 0)
+ return "vmov.f32\t%q0, %1 @ <mode>";
+ else
+ sprintf (templ, "vmov.i%d\t%%q0, %%x1 @ <mode>", width);
+ return templ;
+ }
+ switch (which_alternative)
+ {
+ case 0:
+ return "vmov\t%q0, %q1";
+ case 1:
+ return "vmov\t%e0, %Q1, %R1 @ <mode>\;vmov\t%f0, %J1, %K1";
+ case 2:
+ return "vmov\t%Q0, %R0, %e1 @ <mode>\;vmov\t%J0, %K0, %f1";
+ case 4:
+ if ((TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE (<MODE>mode))
+ || (MEM_P (operands[1])
+ && GET_CODE (XEXP (operands[1], 0)) == LABEL_REF))
+ return output_move_neon (operands);
+ else
+ return "vldrb.8 %q0, %E1";
+ case 5:
+ return output_move_neon (operands);
+ case 7:
+ return "vstrb.8 %q1, %E0";
+ default:
+ gcc_unreachable ();
+ return "";
+ }
+}
+ [(set_attr "type" "mve_move,mve_move,mve_move,mve_move,mve_load,mve_move,mve_move,mve_store")
+ (set_attr "length" "4,8,8,4,8,8,4,4")
+ (set_attr "thumb2_pool_range" "*,*,*,*,1018,*,*,*")
+ (set_attr "neg_pool_range" "*,*,*,*,996,*,*,*")])
+
+(define_insn "*mve_mov<mode>"
+ [(set (match_operand:MVE_types 0 "s_register_operand" "=w,w")
+ (vec_duplicate:MVE_types
+ (match_operand:SI 1 "nonmemory_operand" "r,i")))]
+ "TARGET_HAVE_MVE || TARGET_HAVE_MVE_FLOAT"
+{
+ if (which_alternative == 0)
+ return "vdup.<V_sz_elem>\t%q0, %1";
+ return "vmov.<V_sz_elem>\t%q0, %1";
+}
+ [(set_attr "length" "4,4")
+ (set_attr "type" "mve_move,mve_move")])
int width, is_valid;
static char templ[40];
- is_valid = neon_immediate_valid_for_move (operands[1], <MODE>mode,
+ is_valid = simd_immediate_valid_for_move (operands[1], <MODE>mode,
&operands[1], &width);
gcc_assert (is_valid != 0);
int width, is_valid;
static char templ[40];
- is_valid = neon_immediate_valid_for_move (operands[1], <MODE>mode,
+ is_valid = simd_immediate_valid_for_move (operands[1], <MODE>mode,
&operands[1], &width);
gcc_assert (is_valid != 0);
}
})
+;; The mov<mode> patterns where mode is V8HF, V4HF, V4BF or V8BF are split
+;; into two groups.  The movv8hf pattern is common to MVE and NEON, so it is
+;; moved into the vec-common.md file.  The remaining mov expand patterns for
+;; the half-float and bfloat modes are implemented below.
(define_expand "mov<mode>"
- [(set (match_operand:VHFBF 0 "s_register_operand")
- (match_operand:VHFBF 1 "s_register_operand"))]
+ [(set (match_operand:VHFBF_split 0 "s_register_operand")
+ (match_operand:VHFBF_split 1 "s_register_operand"))]
"TARGET_NEON"
{
gcc_checking_assert (aligned_operand (operands[0], <MODE>mode));
(define_expand "vec_init<mode><V_elem_l>"
[(match_operand:VDQ 0 "s_register_operand")
(match_operand 1 "" "")]
- "TARGET_NEON"
+ "TARGET_NEON || TARGET_HAVE_MVE"
{
neon_expand_vector_init (operands[0], operands[1]);
DONE;
return guard_addr_operand (XEXP (op, 0), mode);
})
+(define_predicate "vpr_register_operand"
+ (match_code "reg")
+{
+ return REG_P (op)
+ && (REGNO (op) >= FIRST_PSEUDO_REGISTER
+ || IS_VPR_REGNUM (REGNO (op)));
+})
+
(define_predicate "imm_for_neon_inv_logic_operand"
(match_code "const_vector")
{
(define_predicate "imm_for_neon_mov_operand"
(match_code "const_vector,const_int")
{
- return neon_immediate_valid_for_move (op, mode, NULL, NULL);
+ return simd_immediate_valid_for_move (op, mode, NULL, NULL);
})
(define_predicate "imm_for_neon_lshift_operand"
$(srcdir)/config/arm/ldmstm.md \
$(srcdir)/config/arm/ldrdstrd.md \
$(srcdir)/config/arm/marvell-f-iwmmxt.md \
+ $(srcdir)/config/arm/mve.md \
$(srcdir)/config/arm/neon.md \
$(srcdir)/config/arm/predicates.md \
$(srcdir)/config/arm/sync.md \
; The classification below is for TME instructions
;
; tme
+; The classification below is for M-profile Vector Extension instructions
+;
+; mve_move
+; mve_store
+; mve_load
(define_attr "type"
"adc_imm,\
crypto_sm4,\
coproc,\
tme,\
- memtag"
+ memtag,\
+ mve_move,\
+ mve_store,\
+ mve_load"
(const_string "untyped"))
; Is this an (integer side) multiply with a 32-bit (or smaller) result?
(const_string "yes")
(const_string "no")))
+;; Yes if the "type" attribute assigned to the insn denotes an MVE instruction,
+;; no otherwise.
+(define_attr "is_mve_type" "yes,no"
+ (if_then_else (eq_attr "type"
+ "mve_move, mve_load, mve_store, mrs")
+ (const_string "yes")
+ (const_string "no")))
+
(define_insn_reservation "no_reservation" 0
(eq_attr "type" "no_insn")
"nothing")
;; Vector Moves
(define_expand "mov<mode>"
- [(set (match_operand:VALL 0 "nonimmediate_operand")
- (match_operand:VALL 1 "general_operand"))]
+ [(set (match_operand:VNIM1 0 "nonimmediate_operand")
+ (match_operand:VNIM1 1 "general_operand"))]
+ "TARGET_NEON
+ || (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE (<MODE>mode))
+ || (TARGET_HAVE_MVE && VALID_MVE_SI_MODE (<MODE>mode))
+ || (TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE (<MODE>mode))"
+ {
+ gcc_checking_assert (aligned_operand (operands[0], <MODE>mode));
+ gcc_checking_assert (aligned_operand (operands[1], <MODE>mode));
+ if (can_create_pseudo_p ())
+ {
+ if (!REG_P (operands[0]))
+ operands[1] = force_reg (<MODE>mode, operands[1]);
+ else if ((TARGET_NEON || TARGET_HAVE_MVE || TARGET_HAVE_MVE_FLOAT)
+ && (CONSTANT_P (operands[1])))
+ {
+ operands[1] = neon_make_constant (operands[1]);
+ gcc_assert (operands[1] != NULL_RTX);
+ }
+ }
+})
+
+(define_expand "mov<mode>"
+ [(set (match_operand:VNINOTM1 0 "nonimmediate_operand")
+ (match_operand:VNINOTM1 1 "general_operand"))]
"TARGET_NEON
|| (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE (<MODE>mode))"
{
}
})
+(define_expand "movv8hf"
+ [(set (match_operand:V8HF 0 "s_register_operand")
+ (match_operand:V8HF 1 "s_register_operand"))]
+ "TARGET_NEON || TARGET_HAVE_MVE_FLOAT"
+{
+ gcc_checking_assert (aligned_operand (operands[0], E_V8HFmode));
+ gcc_checking_assert (aligned_operand (operands[1], E_V8HFmode));
+ if (can_create_pseudo_p ())
+ {
+ if (!REG_P (operands[0]))
+ operands[1] = force_reg (E_V8HFmode, operands[1]);
+ }
+})
+
;; Vector arithmetic. Expanders are blank, then unnamed insns implement
;; patterns separately for IWMMXT and Neon.
&& ( register_operand (operands[0], DImode)
|| register_operand (operands[1], DImode))
&& !(TARGET_NEON && CONST_INT_P (operands[1])
- && neon_immediate_valid_for_move (operands[1], DImode, NULL, NULL))"
+ && simd_immediate_valid_for_move (operands[1], DImode, NULL, NULL))"
"*
switch (which_alternative)
{
+2020-03-16 Andre Vieira <andre.simoesdiasvieira@arm.com>
+ Mihail Ionescu <mihail.ionescu@arm.com>
+ Srinath Parvathaneni <srinath.parvathaneni@arm.com>
+
+ * gcc.target/arm/mve/intrinsics/mve_vector_float.c: New test.
+ * gcc.target/arm/mve/intrinsics/mve_vector_float1.c: Likewise.
+ * gcc.target/arm/mve/intrinsics/mve_vector_float2.c: Likewise.
+ * gcc.target/arm/mve/intrinsics/mve_vector_int.c: Likewise.
+ * gcc.target/arm/mve/intrinsics/mve_vector_int1.c: Likewise.
+ * gcc.target/arm/mve/intrinsics/mve_vector_int2.c: Likewise.
+ * gcc.target/arm/mve/intrinsics/mve_vector_uint.c: Likewise.
+ * gcc.target/arm/mve/intrinsics/mve_vector_uint1.c: Likewise.
+ * gcc.target/arm/mve/intrinsics/mve_vector_uint2.c: Likewise.
+ * gcc.target/arm/mve/mve.exp: New file.
+ * lib/target-supports.exp
+ (check_effective_target_arm_v8_1m_mve_fp_ok_nocache): New proc to check
+ for armv8.1-m.main+mve.fp support and record the corresponding options.
+ (check_effective_target_arm_v8_1m_mve_fp_ok): New proc that calls
+ check_effective_target_arm_v8_1m_mve_fp_ok_nocache to check whether the
+ current target supports MVE with floating point.
+ (add_options_for_arm_v8_1m_mve_fp): New proc that uses
+ check_effective_target_arm_v8_1m_mve_fp_ok to add the compiler options
+ for MVE with floating point.
+ (check_effective_target_arm_v8_1m_mve_ok_nocache): Try the hard float
+ ABI first and add -march=armv8.1-m.main+mve to the options tried.
+
2020-03-16 H.J. Lu <hongjiu.lu@intel.com>
PR target/89229
--- /dev/null
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-add-options arm_v8_1m_mve_fp } */
+
+#include "arm_mve.h"
+
+float32x4_t
+foo32 (float32x4_t value)
+{
+ float32x4_t b = value;
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldmia.*" } } */
+
+float16x8_t
+foo16 (float16x8_t value)
+{
+ float16x8_t b = value;
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldmia.*" } } */
--- /dev/null
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-add-options arm_v8_1m_mve_fp } */
+
+#include "arm_mve.h"
+
+float32x4_t value;
+
+float32x4_t
+foo32 ()
+{
+ float32x4_t b = value;
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldmia.*" } } */
+
+float16x8_t value1;
+
+float16x8_t
+foo16 ()
+{
+ float16x8_t b = value1;
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldmia.*" } } */
--- /dev/null
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-add-options arm_v8_1m_mve_fp } */
+
+#include "arm_mve.h"
+
+float32x4_t
+foo32 ()
+{
+ float32x4_t b = {10.0, 12.0, 14.0, 16.0};
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldr.64.*" } } */
+
+float16x8_t
+foo16 ()
+{
+ float16x8_t b = {32.01};
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldr.64.*" } } */
--- /dev/null
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+
+#include "arm_mve.h"
+
+int8x16_t
+foo8 (int8x16_t value)
+{
+ int8x16_t b = value;
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldrb.8*" } } */
+
+int16x8_t
+foo16 (int16x8_t value)
+{
+ int16x8_t b = value;
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldrb.8*" } } */
+
+int32x4_t
+foo32 (int32x4_t value)
+{
+ int32x4_t b = value;
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldrb.8*" } } */
+
+int64x2_t
+foo64 (int64x2_t value)
+{
+ int64x2_t b = value;
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldrb.8*" } } */
--- /dev/null
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+
+#include "arm_mve.h"
+
+int8x16_t value1;
+int16x8_t value2;
+int32x4_t value3;
+int64x2_t value4;
+
+int8x16_t
+foo8 ()
+{
+ int8x16_t b = value1;
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldrb.8*" } } */
+
+int16x8_t
+foo16 ()
+{
+ int16x8_t b = value2;
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldrb.8*" } } */
+
+int32x4_t
+foo32 ()
+{
+ int32x4_t b = value3;
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldrb.8" } } */
+
+int64x2_t
+foo64 ()
+{
+ int64x2_t b = value4;
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldrb.8" } } */
--- /dev/null
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+
+#include "arm_mve.h"
+
+int8x16_t
+foo8 ()
+{
+ int8x16_t b = {1, 2, 3, 4};
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldr.64.*" } } */
+
+int16x8_t
+foo16 (int16x8_t value)
+{
+ int16x8_t b = {1, 2, 3};
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldr.64.*" } } */
+
+int32x4_t
+foo32 (int32x4_t value)
+{
+ int32x4_t b = {1, 2};
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldr.64.*" } } */
+
+int64x2_t
+foo64 (int64x2_t value)
+{
+ int64x2_t b = {1};
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldr.64.*" } } */
--- /dev/null
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+
+#include "arm_mve.h"
+
+uint8x16_t
+foo8 (uint8x16_t value)
+{
+ uint8x16_t b = value;
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldrb.8*" } } */
+
+uint16x8_t
+foo16 (uint16x8_t value)
+{
+ uint16x8_t b = value;
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldrb.8*" } } */
+
+uint32x4_t
+foo32 (uint32x4_t value)
+{
+ uint32x4_t b = value;
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldrb.8*" } } */
+
+uint64x2_t
+foo64 (uint64x2_t value)
+{
+ uint64x2_t b = value;
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldrb.8*" } } */
--- /dev/null
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+
+#include "arm_mve.h"
+
+uint8x16_t value1;
+uint16x8_t value2;
+uint32x4_t value3;
+uint64x2_t value4;
+
+uint8x16_t
+foo8 ()
+{
+ uint8x16_t b = value1;
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldrb.8*" } } */
+
+uint16x8_t
+foo16 ()
+{
+ uint16x8_t b = value2;
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldrb.8*" } } */
+
+uint32x4_t
+foo32 ()
+{
+ uint32x4_t b = value3;
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldrb.8*" } } */
+
+uint64x2_t
+foo64 ()
+{
+ uint64x2_t b = value4;
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldrb.8*" } } */
--- /dev/null
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+
+#include "arm_mve.h"
+
+uint8x16_t
+foo8 (uint8x16_t value)
+{
+ uint8x16_t b = {1, 2, 3, 4};
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldr.64.*" } } */
+
+uint16x8_t
+foo16 (uint16x8_t value)
+{
+ uint16x8_t b = {1, 2, 3};
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldr.64.*" } } */
+
+uint32x4_t
+foo32 (uint32x4_t value)
+{
+ uint32x4_t b = {1, 2};
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldr.64.*" } } */
+
+uint64x2_t
+foo64 (uint64x2_t value)
+{
+ uint64x2_t b = {1};
+ return b;
+}
+
+/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
+/* { dg-final { scan-assembler "vstrb.*" } } */
+/* { dg-final { scan-assembler "vldr.64.*" } } */
--- /dev/null
+# Copyright (C) 2019-2020 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3. If not see
+# <http://www.gnu.org/licenses/>.
+
+# GCC testsuite that uses the `dg.exp' driver.
+
+# Exit immediately if this isn't an ARM target.
+if ![istarget arm*-*-*] then {
+ return
+}
+
+# Load support procs.
+load_lib gcc-dg.exp
+
+# If a testcase doesn't have special options, use these.
+global DEFAULT_CFLAGS
+if ![info exists DEFAULT_CFLAGS] then {
+ set DEFAULT_CFLAGS " -ansi -pedantic-errors"
+}
+
+# This variable should only apply to tests called in this exp file.
+global dg_runtest_extra_prunes
+set dg_runtest_extra_prunes ""
+lappend dg_runtest_extra_prunes "warning: switch -m(cpu|arch)=.* conflicts with -m(cpu|arch)=.* switch"
+
+# Initialize `dg'.
+dg-init
+
+# Main loop.
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/intrinsics/*.\[cCS\]]] \
+ "" $DEFAULT_CFLAGS
+
+# All done.
+set dg_runtest_extra_prunes ""
+dg-finish
return [check_configured_with "enable-standard-branch-protection"]
}
+# Return 1 if the target supports ARMv8.1-M MVE with floating point
+# instructions, 0 otherwise. The test is valid for ARM.
+# Record the command line options needed.
+
+proc check_effective_target_arm_v8_1m_mve_fp_ok_nocache { } {
+ global et_arm_v8_1m_mve_fp_flags
+ set et_arm_v8_1m_mve_fp_flags ""
+
+ if { ![istarget arm*-*-*] } {
+ return 0;
+ }
+
+ # Iterate through sets of options to find the compiler flags that
+ # need to be added to the -march option.
+ foreach flags {"" "-mfloat-abi=hard -mfpu=auto -march=armv8.1-m.main+mve.fp" "-mfloat-abi=softfp -mfpu=auto -march=armv8.1-m.main+mve.fp"} {
+ if { [check_no_compiler_messages_nocache \
+ arm_v8_1m_mve_fp_ok object {
+ #include <arm_mve.h>
+ #if !(__ARM_FEATURE_MVE & 2)
+ #error "__ARM_FEATURE_MVE for floating point not defined"
+ #endif
+ } "$flags -mthumb"] } {
+ set et_arm_v8_1m_mve_fp_flags "$flags -mthumb"
+ return 1
+ }
+ }
+
+ return 0;
+}
+
+proc check_effective_target_arm_v8_1m_mve_fp_ok { } {
+ return [check_cached_effective_target arm_v8_1m_mve_fp_ok \
+ check_effective_target_arm_v8_1m_mve_fp_ok_nocache]
+}
+
+proc add_options_for_arm_v8_1m_mve_fp { flags } {
+ if { ! [check_effective_target_arm_v8_1m_mve_fp_ok] } {
+ return "$flags"
+ }
+ global et_arm_v8_1m_mve_fp_flags
+ return "$flags $et_arm_v8_1m_mve_fp_flags"
+}
+
# Return 1 if the target supports the ARMv8.1 Adv.SIMD extension, 0
# otherwise. The test is valid for AArch64 and ARM. Record the command
# line options needed.
# Iterate through sets of options to find the compiler flags that
# need to be added to the -march option.
- foreach flags {"" "-mfloat-abi=softfp -mfpu=auto" "-mfloat-abi=hard -mfpu=auto"} {
+ foreach flags {"" "-mfloat-abi=hard -mfpu=auto -march=armv8.1-m.main+mve" "-mfloat-abi=softfp -mfpu=auto -march=armv8.1-m.main+mve"} {
if { [check_no_compiler_messages_nocache \
arm_v8_1m_mve_ok object {
#if !defined (__ARM_FEATURE_MVE)