From bcc3c3f1ca89628f02802fda20f2232b9deef5f9 Mon Sep 17 00:00:00 2001
From: "Jose E. Marchesi"
Date: Fri, 7 Jul 2017 15:59:30 +0200
Subject: [PATCH] Support for the SPARC M8 cpu.

This patch series adds support for the SPARC M8 processor to GCC.
The SPARC M8 processor implements the Oracle SPARC Architecture 2017.

- bmask* instructions are put in their own instruction type.  It makes
  little sense to have them in the same category as array
  instructions.

- Similarly, VIS compare instructions are put in their own instruction
  type.  This is to better accommodate subtypes, which are not quite
  the same as the subtypes of `visl' instructions.

- The introduction of a new `subtype' insn attribute in sparc.md
  avoids the need for adjusting the instruction scheduler DFAs for
  previous cpu models every time a new cpu is introduced.

- The full set of SPARC instructions used in sparc.md, and their
  position in the type/subtype hierarchy, is documented in a comment.
  This eases the modification of the DFA schedulers, and the addition
  of new cpus.

- The M7 DFA scheduler is reworked:

  + It uses the new type/subtype hierarchy.
  + The v3pipe insn attribute is no longer needed.
  + Instruction latencies are more accurate.
  + The S4 core pipeline is documented in a comment in niagara7.md.

- Support for -mcpu=m8 (we thus suggest abandoning the niagaraN
  naming for M8 and later processors.)

- Support for a new VIS level, VIS4B, covering the new VIS
  instructions introduced in OSA2017 and implemented in the M8.
  Built-ins for these instructions are also added.

- An M8 DFA scheduler:

  + It is also based on the new type/subtype hierarchy.
  + The functional units in the S5 core are explicitly documented in a
    comment in m8.md.

gcc/ChangeLog:

	* config/sparc/m8.md: New file.
	* config/sparc/sparc.md: Include m8.md.
	* config/sparc/sparc.opt: New option -mvis4b.
	* config/sparc/sparc.c (dump_target_flag_bits): Handle MASK_VIS4B.
	(sparc_option_override): Handle VIS4B.
	(enum sparc_builtins): Define
	SPARC_BUILTIN_DICTUNPACK{8,16,32},
	SPARC_BUILTIN_FPCMP{LE,GT,EQ,NE}{8,16,32}SHL,
	SPARC_BUILTIN_FPCMPU{LE,GT}{8,16,32}SHL,
	SPARC_BUILTIN_FPCMPDE{8,16,32}SHL and
	SPARC_BUILTIN_FPCMPUR{8,16,32}SHL.
	(check_constant_argument): New function.
	(sparc_vis_init_builtins): Define builtins
	__builtin_vis_dictunpack{8,16,32},
	__builtin_vis_fpcmp{le,gt,eq,ne}{8,16,32}shl,
	__builtin_vis_fpcmpu{le,gt}{8,16,32}shl,
	__builtin_vis_fpcmpde{8,16,32}shl and
	__builtin_vis_fpcmpur{8,16,32}shl.
	(sparc_expand_builtin): Check that the constant operands to
	__builtin_vis_fpcmp*shl and __builtin_vis_dictunpack* are indeed
	constant and in range.
	* config/sparc/sparc-c.c (sparc_target_macros): Handle
	TARGET_VIS4B.
	* config/sparc/sparc.h (SPARC_IMM2_P): Define.
	(SPARC_IMM5_P): Likewise.
	* config/sparc/sparc.md (cpu_feature): Add new feature "vis4b".
	(enabled): Handle vis4b.
	(UNSPEC_DICTUNPACK): New unspec.
	(UNSPEC_FPCMPSHL): Likewise.
	(UNSPEC_FPUCMPSHL): Likewise.
	(UNSPEC_FPCMPDESHL): Likewise.
	(UNSPEC_FPCMPURSHL): Likewise.
	(cpu_feature): New CPU feature `vis4b'.
	(dictunpack{8,16,32}): New insns.
	(FPCSMODE): New mode iterator.
	(fpcscond): New code iterator.
	(fpcsucond): Likewise.
	(fpcmp{le,gt,eq,ne}{8,16,32}{si,di}shl): New insns.
	(fpcmpu{le,gt}{8,16,32}{si,di}shl): Likewise.
	(fpcmpde{8,16,32}{si,di}shl): Likewise.
	(fpcmpur{8,16,32}{si,di}shl): Likewise.
	* config/sparc/constraints.md: Define constraints `q' for unsigned
	2-bit integer constants and `t' for unsigned 5-bit integer
	constants.
	* config/sparc/predicates.md (imm5_operand_dictunpack8): New
	predicate.
	(imm5_operand_dictunpack16): Likewise.
	(imm5_operand_dictunpack32): Likewise.
	(imm2_operand): Likewise.
	* doc/invoke.texi (SPARC Options): Document -mvis4b.
	* doc/extend.texi (SPARC VIS Built-in Functions): Document the
	dictunpack* and fpcmp*shl builtins.
	* config.gcc: Handle m8 in --with-{cpu,tune} options.
	* config.in: Add HAVE_AS_SPARC6 define.
	* config/sparc/driver-sparc.c (cpu_names): Add entry for the SPARC
	M8.
	* config/sparc/sol2.h (CPP_CPU64_DEFAULT_SPEC): Define for
	TARGET_CPU_m8.
	(ASM_CPU32_DEFAULT_SPEC): Likewise.
	(CPP_CPU_SPEC): Handle m8.
	(ASM_CPU_SPEC): Likewise.
	* config/sparc/sparc-opts.h (enum processor_type): Add
	PROCESSOR_M8.
	* config/sparc/sparc.c (m8_costs): New struct.
	(sparc_option_override): Handle TARGET_CPU_m8.
	(sparc32_initialize_trampoline): Likewise.
	(sparc64_initialize_trampoline): Likewise.
	(sparc_issue_rate): Likewise.
	(sparc_register_move_cost): Likewise.
	* config/sparc/sparc.h (TARGET_CPU_m8): Define.
	(CPP_CPU64_DEFAULT_SPEC): Define for M8.
	(ASM_CPU64_DEFAULT_SPEC): Likewise.
	(CPP_CPU_SPEC): Handle M8.
	(ASM_CPU_SPEC): Likewise.
	(AS_M8_FLAG): Define.
	* config/sparc/sparc.md: Add m8 to the cpu attribute.
	* config/sparc/sparc.opt: New option -mcpu=m8 for sparc targets.
	* configure.ac (HAVE_AS_SPARC6): Check for assembler support for
	M8 instructions.
	* configure: Regenerate.
	* doc/invoke.texi (SPARC Options): Document -mcpu=m8 and
	-mtune=m8.
	* config/sparc/niagara7.md: Rework the DFA scheduler to use insn
	subtypes.
	* config/sparc/sparc.md: Remove the `v3pipe' insn attribute.
	("*movdi_insn_sp32"): Do not set v3pipe.
	("*movsi_insn"): Likewise.
	("*movdi_insn_sp64"): Likewise.
	("*movsf_insn"): Likewise.
	("*movdf_insn_sp32"): Likewise.
	("*movdf_insn_sp64"): Likewise.
	("*zero_extendsidi2_insn_sp64"): Likewise.
	("*sign_extendsidi2_insn"): Likewise.
	("*mov_insn"): Likewise.
	("*mov_insn_sp64"): Likewise.
	("*mov_insn_sp32"): Likewise.
	("3"): Likewise.
	("3"): Likewise.
	("*not_3"): Likewise.
	("*nand_vis"): Likewise.
	("*_not1_vis"): Likewise.
	("*_not2_vis"): Likewise.
	("one_cmpl2"): Likewise.
	("faligndata_vis"): Likewise.
	("alignaddrsi_vis"): Likewise.
	("alignaddrdi_vis"): Likewise.
	("alignaddrlsi_vis"): Likewise.
	("alignaddrldi_vis"): Likewise.
	("fcmp_vis"): Likewise.
	("bmaskdi_vis"): Likewise.
	("bmasksi_vis"): Likewise.
	("bshuffle_vis"): Likewise.
	("cmask8_vis"): Likewise.
	("cmask16_vis"): Likewise.
	("cmask32_vis"): Likewise.
	("pdistn_vis"): Likewise.
	("3"): Likewise.
	* config/sparc/sparc.md ("subtype"): New insn attribute.
	("*wrgsr_sp64"): Set insn subtype.
	("*rdgsr_sp64"): Likewise.
	("alignaddrsi_vis"): Likewise.
	("alignaddrdi_vis"): Likewise.
	("alignaddrlsi_vis"): Likewise.
	("alignaddrldi_vis"): Likewise.
	("3"): Likewise.
	("fexpand_vis"): Likewise.
	("fpmerge_vis"): Likewise.
	("faligndata_vis"): Likewise.
	("bshuffle_vis"): Likewise.
	("cmask8_vis"): Likewise.
	("cmask16_vis"): Likewise.
	("cmask32_vis"): Likewise.
	("fchksm16_vis"): Likewise.
	("v3"): Likewise.
	("fmean16_vis"): Likewise.
	("fp64_vis"): Likewise.
	("v8qi3"): Likewise.
	("3"): Likewise.
	("3"): Likewise.
	("3"): Likewise.
	("v8qi3"): Likewise.
	("3"): Likewise.
	("*movqi_insn"): Likewise.
	("*movhi_insn"): Likewise.
	("*movsi_insn"): Likewise.
	("movsi_pic_gotdata_op"): Likewise.
	("*movdi_insn_sp32"): Likewise.
	("*movdi_insn_sp64"): Likewise.
	("movdi_pic_gotdata_op"): Likewise.
	("*movsf_insn"): Likewise.
	("*movdf_insn_sp32"): Likewise.
	("*movdf_insn_sp64"): Likewise.
	("*zero_extendhisi2_insn"): Likewise.
	("*zero_extendqihi2_insn"): Likewise.
	("*zero_extendqisi2_insn"): Likewise.
	("*zero_extendqidi2_insn"): Likewise.
	("*zero_extendhidi2_insn"): Likewise.
	("*zero_extendsidi2_insn_sp64"): Likewise.
	("ldfsr"): Likewise.
	("prefetch_64"): Likewise.
	("prefetch_32"): Likewise.
	("tie_ld32"): Likewise.
	("tie_ld64"): Likewise.
	("*tldo_ldub_sp32"): Likewise.
	("*tldo_ldub1_sp32"): Likewise.
	("*tldo_ldub2_sp32"): Likewise.
	("*tldo_ldub_sp64"): Likewise.
	("*tldo_ldub1_sp64"): Likewise.
	("*tldo_ldub2_sp64"): Likewise.
	("*tldo_ldub3_sp64"): Likewise.
	("*tldo_lduh_sp32"): Likewise.
	("*tldo_lduh1_sp32"): Likewise.
	("*tldo_lduh_sp64"): Likewise.
	("*tldo_lduh1_sp64"): Likewise.
	("*tldo_lduh2_sp64"): Likewise.
	("*tldo_lduw_sp32"): Likewise.
	("*tldo_lduw_sp64"): Likewise.
	("*tldo_lduw1_sp64"): Likewise.
	("*tldo_ldx_sp64"): Likewise.
	("*mov_insn"): Likewise.
	("*mov_insn_sp64"): Likewise.
	("*mov_insn_sp32"): Likewise.
	* config/sparc/sparc.md ("type"): New insn type viscmp.
	("fcmp_vis"): Set insn type to viscmp.
	("fpcmp8_vis"): Likewise.
	("fucmp8_vis"): Likewise.
	("fpcmpu_vis"): Likewise.
	* config/sparc/niagara7.md ("n7_vis_logical_v3pipe"): Handle
	viscmp.
	("n7_vis_logical_11cycle"): Likewise.
	* config/sparc/niagara4.md ("n4_vis_logical"): Likewise.
	* config/sparc/niagara2.md ("niag3_vis"): Likewise.
	* config/sparc/niagara.md ("niag_vis"): Likewise.
	* config/sparc/ultra3.md ("us3_fga"): Likewise.
	* config/sparc/ultra1_2.md ("us1_fga_double"): Likewise.
	* config/sparc/sparc.md: New instruction type `bmask'.
	(bmaskdi_vis): Use the `bmask' type.
	(bmasksi_vis): Likewise.
	* config/sparc/ultra3.md (us3_array): Likewise.
	* config/sparc/niagara7.md (n7_array): Likewise.
	* config/sparc/niagara4.md (n4_array): Likewise.
	* config/sparc/niagara2.md (niag2_vis): Likewise.
	(niag3_vis): Likewise.
	* config/sparc/niagara.md (niag_vis): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/sparc/dictunpack.c: New file.
	* gcc.target/sparc/fpcmpdeshl.c: Likewise.
	* gcc.target/sparc/fpcmpshl.c: Likewise.
	* gcc.target/sparc/fpcmpurshl.c: Likewise.
	* gcc.target/sparc/fpcmpushl.c: Likewise.

From-SVN: r250049 --- gcc/ChangeLog | 226 ++++++++++++ gcc/config.gcc | 2 +- gcc/config.in | 4 + gcc/config/sparc/constraints.md | 12 +- gcc/config/sparc/driver-sparc.c | 1 + gcc/config/sparc/m8.md | 242 +++++++++++++ gcc/config/sparc/niagara.md | 2 +- gcc/config/sparc/niagara2.md | 4 +- gcc/config/sparc/niagara4.md | 7 +- gcc/config/sparc/niagara7.md | 181 +++++++--- gcc/config/sparc/predicates.md | 27 ++ gcc/config/sparc/sol2.h | 14 +- gcc/config/sparc/sparc-c.c | 7 +- gcc/config/sparc/sparc-opts.h | 1 + gcc/config/sparc/sparc.c | 312 ++++++++++++++++- gcc/config/sparc/sparc.h | 20 +- gcc/config/sparc/sparc.md | 364 +++++++++++++++----- gcc/config/sparc/sparc.opt | 7 + gcc/config/sparc/ultra1_2.md | 8 +- gcc/config/sparc/ultra3.md | 4 +- gcc/configure | 35 ++ gcc/configure.ac | 12 + gcc/doc/extend.texi | 39 +++ gcc/doc/invoke.texi | 25 +- gcc/testsuite/ChangeLog | 8 + gcc/testsuite/gcc.target/sparc/dictunpack.c | 25 ++ gcc/testsuite/gcc.target/sparc/fpcmpdeshl.c | 25 ++ gcc/testsuite/gcc.target/sparc/fpcmpshl.c | 81 +++++ gcc/testsuite/gcc.target/sparc/fpcmpurshl.c | 25 ++ gcc/testsuite/gcc.target/sparc/fpcmpushl.c | 43 +++ 30 files changed, 1579 insertions(+), 184 deletions(-) create mode 100644 gcc/config/sparc/m8.md create mode 100644 gcc/testsuite/gcc.target/sparc/dictunpack.c create mode 100644 gcc/testsuite/gcc.target/sparc/fpcmpdeshl.c create mode 100644 gcc/testsuite/gcc.target/sparc/fpcmpshl.c create mode 100644 gcc/testsuite/gcc.target/sparc/fpcmpurshl.c create mode 100644 gcc/testsuite/gcc.target/sparc/fpcmpushl.c diff --git a/gcc/ChangeLog b/gcc/ChangeLog index decb508344c..a642e4a45b3 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,229 @@ +2017-07-07 Jose E. Marchesi + + * config/sparc/m8.md: New file. + * config/sparc/sparc.md: Include m8.md. + +2017-07-07 Jose E. Marchesi + + * config/sparc/sparc.opt: New option -mvis4b. + * config/sparc/sparc.c (dump_target_flag_bits): Handle MASK_VIS4B. + (sparc_option_override): Handle VIS4B. 
+ (enum sparc_builtins): Define + SPARC_BUILTIN_DICTUNPACK{8,16,32}, + SPARC_BUILTIN_FPCMP{LE,GT,EQ,NE}{8,16,32}SHL, + SPARC_BUILTIN_FPCMPU{LE,GT}{8,16,32}SHL, + SPARC_BUILTIN_FPCMPDE{8,16,32}SHL and + SPARC_BUILTIN_FPCMPUR{8,16,32}SHL. + (check_constant_argument): New function. + (sparc_vis_init_builtins): Define builtins + __builtin_vis_dictunpack{8,16,32}, + __builtin_vis_fpcmp{le,gt,eq,ne}{8,16,32}shl, + __builtin_vis_fpcmpu{le,gt}{8,16,32}shl, + __builtin_vis_fpcmpde{8,16,32}shl and + __builtin_vis_fpcmpur{8,16,32}shl. + (sparc_expand_builtin): Check that the constant operands to + __builtin_vis_fpcmp*shl and _builtin_vis_dictunpack* are indeed + constant and in range. + * config/sparc/sparc-c.c (sparc_target_macros): Handle + TARGET_VIS4B. + * config/sparc/sparc.h (SPARC_IMM2_P): Define. + (SPARC_IMM5_P): Likewise. + * config/sparc/sparc.md (cpu_feature): Add new feagure "vis4b". + (enabled): Handle vis4b. + (UNSPEC_DICTUNPACK): New unspec. + (UNSPEC_FPCMPSHL): Likewise. + (UNSPEC_FPUCMPSHL): Likewise. + (UNSPEC_FPCMPDESHL): Likewise. + (UNSPEC_FPCMPURSHL): Likewise. + (cpu_feature): New CPU feature `vis4b'. + (dictunpack{8,16,32}): New insns. + (FPCSMODE): New mode iterator. + (fpcscond): New code iterator. + (fpcsucond): Likewise. + (fpcmp{le,gt,eq,ne}{8,16,32}{si,di}shl): New insns. + (fpcmpu{le,gt}{8,16,32}{si,di}shl): Likewise. + (fpcmpde{8,16,32}{si,di}shl): Likewise. + (fpcmpur{8,16,32}{si,di}shl): Likewise. + * config/sparc/constraints.md: Define constraints `q' for unsigned + 2-bit integer constants and `t' for unsigned 5-bit integer + constants. + * config/sparc/predicates.md (imm5_operand_dictunpack8): New + predicate. + (imm5_operand_dictunpack16): Likewise. + (imm5_operand_dictunpack32): Likewise. + (imm2_operand): Likewise. + * doc/invoke.texi (SPARC Options): Document -mvis4b. + * doc/extend.texi (SPARC VIS Built-in Functions): Document the + ditunpack* and fpcmp*shl builtins. + +2017-07-07 Jose E. 
Marchesi + + * config.gcc: Handle m8 in --with-{cpu,tune} options. + * config.in: Add HAVE_AS_SPARC6 define. + * config/sparc/driver-sparc.c (cpu_names): Add entry for the SPARC + M8. + * config/sparc/sol2.h (CPP_CPU64_DEFAULT_SPEC): Define for + TARGET_CPU_m8. + (ASM_CPU32_DEFAUILT_SPEC): Likewise. + (CPP_CPU_SPEC): Handle m8. + (ASM_CPU_SPEC): Likewise. + * config/sparc/sparc-opts.h (enum processor_type): Add + PROCESSOR_M8. + * config/sparc/sparc.c (m8_costs): New struct. + (sparc_option_override): Handle TARGET_CPU_m8. + (sparc32_initialize_trampoline): Likewise. + (sparc64_initialize_trampoline): Likewise. + (sparc_issue_rate): Likewise. + (sparc_register_move_cost): Likewise. + * config/sparc/sparc.h (TARGET_CPU_m8): Define. + (CPP_CPU64_DEFAULT_SPEC): Define for M8. + (ASM_CPU64_DEFAULT_SPEC): Likewise. + (CPP_CPU_SPEC): Handle M8. + (ASM_CPU_SPEC): Likewise. + (AS_M8_FLAG): Define. + * config/sparc/sparc.md: Add m8 to the cpu attribute. + * config/sparc/sparc.opt: New option -mcpu=m8 for sparc targets. + * configure.ac (HAVE_AS_SPARC6): Check for assembler support for + M8 instructions. + * configure: Regenerate. + * doc/invoke.texi (SPARC Options): Document -mcpu=m8 and + -mtune=m8. + +2017-07-07 Jose E. Marchesi + + * config/sparc/niagara7.md: Rework the DFA scheduler to use insn + subtypes. + * config/sparc/sparc.md: Remove the `v3pipe' insn attribute. + ("*movdi_insn_sp32"): Do not set v3pipe. + ("*movsi_insn"): Likewise. + ("*movdi_insn_sp64"): Likewise. + ("*movsf_insn"): Likewise. + ("*movdf_insn_sp32"): Likewise. + ("*movdf_insn_sp64"): Likewise. + ("*zero_extendsidi2_insn_sp64"): Likewise. + ("*sign_extendsidi2_insn"): Likewise. + ("*mov_insn"): Likewise. + ("*mov_insn_sp64"): Likewise. + ("*mov_insn_sp32"): Likewise. + ("3"): Likewise. + ("3"): Likewise. + ("*not_3"): Likewise. + ("*nand_vis"): Likewise. + ("*_not1_vis"): Likewise. + ("*_not2_vis"): Likewise. + ("one_cmpl2"): Likewise. + ("faligndata_vis"): Likewise. 
+ ("alignaddrsi_vis"): Likewise. + ("alignaddrdi_vis"): Likweise. + ("alignaddrlsi_vis"): Likewise. + ("alignaddrldi_vis"): Likewise. + ("fcmp_vis"): Likewise. + ("bmaskdi_vis"): Likewise. + ("bmasksi_vis"): Likewise. + ("bshuffle_vis"): Likewise. + ("cmask8_vis"): Likewise. + ("cmask16_vis"): Likewise. + ("cmask32_vis"): Likewise. + ("pdistn_vis"): Likewise. + ("3"): Likewise. + +2017-07-07 Jose E. Marchesi + + * config/sparc/sparc.md ("subtype"): New insn attribute. + ("*wrgsr_sp64"): Set insn subtype. + ("*rdgsr_sp64"): Likewise. + ("alignaddrsi_vis"): Likewise. + ("alignaddrdi_vis"): Likewise. + ("alignaddrlsi_vis"): Likewise. + ("alignaddrldi_vis"): Likewise. + ("3"): Likewise. + ("fexpand_vis"): Likewise. + ("fpmerge_vis"): Likewise. + ("faligndata_vis"): Likewise. + ("bshuffle_vis"): Likewise. + ("cmask8_vis"): Likewise. + ("cmask16_vis"): Likewise. + ("cmask32_vis"): Likewise. + ("fchksm16_vis"): Likewise. + ("v3"): Likewise. + ("fmean16_vis"): Likewise. + ("fp64_vis"): Likewise. + ("v8qi3"): Likewise. + ("3"): Likewise. + ("3"): Likewise. + ("3"): Likewise. + ("v8qi3"): Likewise. + ("3"): Likewise. + ("*movqi_insn"): Likewise. + ("*movhi_insn"): Likewise. + ("*movsi_insn"): Likewise. + ("movsi_pic_gotdata_op"): Likewise. + ("*movdi_insn_sp32"): Likewise. + ("*movdi_insn_sp64"): Likewise. + ("movdi_pic_gotdata_op"): Likewise. + ("*movsf_insn"): Likewise. + ("*movdf_insn_sp32"): Likewise. + ("*movdf_insn_sp64"): Likewise. + ("*zero_extendhisi2_insn"): Likewise. + ("*zero_extendqihi2_insn"): Likewise. + ("*zero_extendqisi2_insn"): Likewise. + ("*zero_extendqidi2_insn"): Likewise. + ("*zero_extendhidi2_insn"): Likewise. + ("*zero_extendsidi2_insn_sp64"): Likewise. + ("ldfsr"): Likewise. + ("prefetch_64"): Likewise. + ("prefetch_32"): Likewise. + ("tie_ld32"): Likewise. + ("tie_ld64"): Likewise. + ("*tldo_ldub_sp32"): Likewise. + ("*tldo_ldub1_sp32"): Likewise. + ("*tldo_ldub2_sp32"): Likewise. + ("*tldo_ldub_sp64"): Likewise. + ("*tldo_ldub1_sp64"): Likewise. 
+ ("*tldo_ldub2_sp64"): Likewise. + ("*tldo_ldub3_sp64"): Likewise. + ("*tldo_lduh_sp32"): Likewise. + ("*tldo_lduh1_sp32"): Likewise. + ("*tldo_lduh_sp64"): Likewise. + ("*tldo_lduh1_sp64"): Likewise. + ("*tldo_lduh2_sp64"): Likewise. + ("*tldo_lduw_sp32"): Likewise. + ("*tldo_lduw_sp64"): Likewise. + ("*tldo_lduw1_sp64"): Likewise. + ("*tldo_ldx_sp64"): Likewise. + ("*mov_insn"): Likewise. + ("*mov_insn_sp64"): Likewise. + ("*mov_insn_sp32"): Likewise. + +2017-07-07 Jose E. Marchesi + + * config/sparc/sparc.md ("type"): New insn type viscmp. + ("fcmp_vis"): Set insn type to + viscmp. + ("fpcmp8_vis"): Likewise. + ("fucmp8_vis"): Likewise. + ("fpcmpu_vis"): Likewise. + * config/sparc/niagara7.md ("n7_vis_logical_v3pipe"): Handle + viscmp. + ("n7_vis_logical_11cycle"): Likewise. + * config/sparc/niagara4.md ("n4_vis_logical"): Likewise. + * config/sparc/niagara2.md ("niag3_vis": Likewise. + * config/sparc/niagara.md ("niag_vis"): Likewise. + * config/sparc/ultra3.md ("us3_fga"): Likewise. + * config/sparc/ultra1_2.md ("us1_fga_double"): Likewise. + +2017-07-07 Jose E. Marchesi + + * config/sparc/sparc.md: New instruction type `bmask'. + (bmaskdi_vis): Use the `bmask' type. + (bmasksi_vis): Likewise. + * config/sparc/ultra3.md (us3_array): Likewise. + * config/sparc/niagara7.md (n7_array): Likewise. + * config/sparc/niagara4.md (n4_array): Likewise. + * config/sparc/niagara2.md (niag2_vis): Likewise. + (niag3_vis): Likewise. + * config/sparc/niagara.md (niag_vis): Likewise. + 2017-07-06 Jan Hubicka * ipa-comdats.c: Remove optimize check from gate. 
diff --git a/gcc/config.gcc b/gcc/config.gcc index 4a729507200..a1e0f8f1e4d 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -4435,7 +4435,7 @@ case "${target}" in | sparclite | f930 | f934 | sparclite86x \ | sparclet | tsc701 \ | v9 | ultrasparc | ultrasparc3 | niagara | niagara2 \ - | niagara3 | niagara4 | niagara7) + | niagara3 | niagara4 | niagara7 | m8) # OK ;; *) diff --git a/gcc/config.in b/gcc/config.in index 44c7a68eaa8..73c9f92bb5d 100644 --- a/gcc/config.in +++ b/gcc/config.in @@ -660,6 +660,10 @@ #undef HAVE_AS_SPARC5_VIS4 #endif +/* Define if your assembler supports SPARC6 instructions. */ +#ifndef USED_FOR_TARGET +#undef HAVE_AS_SPARC6 +#endif /* Define if your assembler and linker support GOTDATA_OP relocs. */ #ifndef USED_FOR_TARGET diff --git a/gcc/config/sparc/constraints.md b/gcc/config/sparc/constraints.md index 7c9ef74ce6a..cff5a61b1de 100644 --- a/gcc/config/sparc/constraints.md +++ b/gcc/config/sparc/constraints.md @@ -19,7 +19,7 @@ ;;; Unused letters: ;;; B -;;; a jkl q tuv xyz +;;; a jkl uv xyz ;; Register constraints @@ -58,6 +58,16 @@ ;; Integer constant constraints +(define_constraint "q" + "Unsigned 2-bit integer constant" + (and (match_code "const_int") + (match_test "SPARC_IMM2_P (ival)"))) + +(define_constraint "t" + "Unsigned 5-bit integer constant" + (and (match_code "const_int") + (match_test "SPARC_IMM5_P (ival)"))) + (define_constraint "A" "Signed 5-bit integer constant" (and (match_code "const_int") diff --git a/gcc/config/sparc/driver-sparc.c b/gcc/config/sparc/driver-sparc.c index b96ef47ac60..0c25d6cfa15 100644 --- a/gcc/config/sparc/driver-sparc.c +++ b/gcc/config/sparc/driver-sparc.c @@ -79,6 +79,7 @@ static const struct cpu_names { #endif { "SPARC-M7", "niagara7" }, { "SPARC-S7", "niagara7" }, + { "SPARC-M8", "m8" }, { NULL, NULL } }; diff --git a/gcc/config/sparc/m8.md b/gcc/config/sparc/m8.md new file mode 100644 index 00000000000..f0fe1b27a20 --- /dev/null +++ b/gcc/config/sparc/m8.md @@ -0,0 +1,242 @@ +;; 
Scheduling description for the SPARC M8. +;; Copyright (C) 2017 Free Software Foundation, Inc. +;; +;; This file is part of GCC. +;; +;; GCC is free software; you can redistribute it and/or modify +;; it under the terms of the GNU General Public License as published by +;; the Free Software Foundation; either version 3, or (at your option) +;; any later version. +;; +;; GCC is distributed in the hope that it will be useful, +;; but WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +;; GNU General Public License for more details. +;; +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING3. If not see +;; . + +;; Thigs to improve: +;; +;; - Store instructions are implemented by micro-ops, one of which +;; generates the store address and is executed in the store address +;; generation unit in the slot0. We need to model that. +;; +;; - There are two V3 pipes connected to different slots. The current +;; implementation assumes that all the instructions executing in a +;; V3 pipe are issued to the unit in slot3. +;; +;; - Single-issue ALU operations incur an additional cycle of latency to +;; slot 0 and slot 1 instructions. This is not currently reflected +;; in the DFA. + +(define_automaton "m8_0") + +;; The S5 core has two dual-issue queues, PQLS and PQEX. Each queue +;; is divided into two slots: PQLS corresponds to slots 0 and 1, and +;; PQEX corresponds to slots 2 and 3. The core can issue 4 +;; instructions per-cycle, and up to 4 instructions are committed each +;; cycle. +;; +;; +;; m8_slot0 - Load Unit. +;; - Store address gen. Unit. +;; +;; +;; === PQLS ==> m8_slot1 - Store data unit. +;; - Branch unit. +;; +;; +;; === PQEX ==> m8_slot2 - Integer Unit (EXU2). +;; - 3-cycles Crypto Unit (SPU2). +;; +;; m8_slot3 - Integer Unit (EXU3). +;; - 3-cycles Crypto Unit (SPU3). +;; - Floating-point and graphics unit (FPG). 
+;; - Long-latency Crypto Unit. +;; - Oracle Numbers Unit (ONU). + +(define_cpu_unit "m8_slot0,m8_slot1,m8_slot2,m8_slot3" "m8_0") + +;; Some instructions stall the pipeline and avoid any other +;; instruction to be issued in the same cycle. We assume the same for +;; multi-instruction insns. + +(define_reservation "m8_single_issue" "m8_slot0 + m8_slot1 + m8_slot2 + m8_slot3") + +(define_insn_reservation "m8_single" 1 + (and (eq_attr "cpu" "m8") + (eq_attr "type" "multi,savew,flushw,trap,bmask")) + "m8_single_issue") + +;; Most of the instructions executing in the integer units have a +;; latency of 1. + +(define_insn_reservation "m8_integer" 1 + (and (eq_attr "cpu" "m8") + (eq_attr "type" "ialu,ialuX,shift,cmove,compare,bmask")) + "(m8_slot2 | m8_slot3)") + +;; Flushing the instruction memory takes 27 cycles. + + +(define_insn_reservation "m8_iflush" 27 + (and (eq_attr "cpu" "m8") + (eq_attr "type" "iflush")) + "(m8_slot2 | m8_slot3), nothing*26") + +;; The integer multiplication instructions have a latency of 10 cycles +;; and execute in integer units. +;; +;; Likewise for array*, edge* and pdistn instructions. +;; +;; However, the latency is only 9 cycles if the consumer of the +;; operation is also capable of 9 cycles latency. We model this with +;; a bypass. + +(define_insn_reservation "m8_imul" 10 + (and (eq_attr "cpu" "m8") + (eq_attr "type" "imul,array,edge,edgen,pdistn")) + "(m8_slot2 | m8_slot3), nothing*12") + +(define_bypass 9 "m8_imul" "m8_imul") + +;; The integer division instructions `sdiv' and `udivx' have a latency +;; of 30 cycles and execute in integer units. + +(define_insn_reservation "m8_idiv" 30 + (and (eq_attr "cpu" "m8") + (eq_attr "type" "idiv")) + "(m8_slot2 | m8_slot3), nothing*29") + +;; Both integer and floating-point load instructions have a latency of +;; only 3 cycles,and execute in the slot0. +;; +;; Misaligned load instructions feature a latency of 11 cycles. 
+;; +;; The prefetch instruction also executes in the load unit, but it's +;; latency is only 1 cycle. + +(define_insn_reservation "m8_load" 3 + (and (eq_attr "cpu" "m8") + (ior (eq_attr "type" "fpload,sload") + (and (eq_attr "type" "load") + (eq_attr "subtype" "regular")))) + "m8_slot0, nothing*2") + +;; (define_insn_reservation "m8_load_misalign" 11 +;; (and (eq_attr "cpu" "m8") +;; (eq_attr "type" "load_mis,fpload_mis")) +;; "m8_slot0, nothing*10") + +(define_insn_reservation "m8_prefetch" 1 + (and (eq_attr "cpu" "m8") + (eq_attr "type" "load") + (eq_attr "subtype" "prefetch")) + "m8_slot0") + +;; Both integer and floating-point store instructions have a latency +;; of 1 cycle, and execute in the store data unit in slot1. +;; +;; However, misaligned store instructions feature a latency of 3 +;; cycles. + +(define_insn_reservation "m8_store" 1 + (and (eq_attr "cpu" "m8") + (eq_attr "type" "store,fpstore")) + "m8_slot1") + +;; (define_insn_reservation "m8_store_misalign" 3 +;; (and (eq_attr "cpu" "m8") +;; (eq_attr "type" "store_mis,fpstore_mis")) +;; "m8_slot1, nothing*2") + +;; Control-transfer instructions execute in the Branch Unit in the +;; slot1. + +(define_insn_reservation "m8_cti" 1 + (and (eq_attr "cpu" "m8") + (eq_attr "type" "cbcond,uncond_cbcond,branch,call,sibcall,call_no_delay_slot,uncond_branch,return")) + "m8_slot1") + +;; Many instructions executing in the Floating-point and Graphics Unit +;; (FGU) serving slot3 feature a default latency of 9 cycles. + +(define_insn_reservation "m8_fp" 9 + (and (eq_attr "cpu" "m8") + (ior (eq_attr "type" "fpmove,fpcmove,fpcrmove,fp,fpcmp,fpmul,fgm_pack,fgm_mul,pdist") + (and (eq_attr "type" "fga") + (eq_attr "subtype" "fpu")))) + "m8_slot3, nothing*8") + +;; Floating-point division and floating-point square-root instructions +;; have high latencies. They execute in the FGU. 
+ +(define_insn_reservation "m8_fpdivs" 26 + (and (eq_attr "cpu" "m8") + (eq_attr "type" "fpdivs")) + "m8_slot3, nothing*25") + +(define_insn_reservation "m8_fpsqrts" 33 + (and (eq_attr "cpu" "m8") + (eq_attr "type" "fpsqrts")) + "m8_slot3, nothing*32") + +(define_insn_reservation "m8_fpdivd" 30 + (and (eq_attr "cpu" "m8") + (eq_attr "type" "fpdivd")) + "m8_slot3, nothing*29") + +(define_insn_reservation "m8_fpsqrtd" 41 + (and (eq_attr "cpu" "m8") + (eq_attr "type" "fpsqrtd")) + "m8_slot3, nothing*40") + +;; SIMD VIS instructions executing in the Floating-point and graphics +;; unit (FPG) in slot3 usually have a latency of 5 cycles. +;; +;; However, the latency for many instructions is only 3 cycles if the +;; consumer can also be executed in 3 cycles. We model this with a +;; bypass. In these cases the instructions are executed in one of the +;; two 3-cycle crypto units (SPU, also known as "v3-pipes") in slots 2 +;; and 3. + +(define_insn_reservation "m8_vis" 5 + (and (eq_attr "cpu" "m8") + (ior (eq_attr "type" "viscmp,lzd") + (and (eq_attr "type" "fga") + (eq_attr "subtype" "maxmin,cmask,other")) + (and (eq_attr "type" "vismv") + (eq_attr "subtype" "single,movstouw")) + (and (eq_attr "type" "visl") + (eq_attr "subtype" "single")))) + "m8_slot3, nothing*4") + +(define_bypass 3 "m8_vis" "m8_vis") + +(define_insn_reservation "m8_gsr" 5 + (and (eq_attr "cpu" "m8") + (eq_attr "type" "gsr") + (eq_attr "subtype" "alignaddr")) + "m8_slot3, nothing*4") + +;; A few VIS instructions have a latency of 1. + +(define_insn_reservation "m8_vis_1cycle" 1 + (and (eq_attr "cpu" "m8") + (ior (and (eq_attr "type" "vismv") + (eq_attr "subtype" "double,movxtod,movdtox")) + (and (eq_attr "type" "visl") + (eq_attr "subtype" "double")) + (and (eq_attr "type" "fga") + (eq_attr "subtype" "addsub64")))) + "m8_slot3") + +;; Reading and writing to the gsr register takes more than 70 cycles. 
+ +(define_insn_reservation "m8_gsr_reg" 70 + (and (eq_attr "cpu" "m8") + (eq_attr "type" "gsr") + (eq_attr "subtype" "reg")) + "m8_slot3, nothing*69") diff --git a/gcc/config/sparc/niagara.md b/gcc/config/sparc/niagara.md index f79771fc2f3..a8e23b8f894 100644 --- a/gcc/config/sparc/niagara.md +++ b/gcc/config/sparc/niagara.md @@ -114,5 +114,5 @@ */ (define_insn_reservation "niag_vis" 8 (and (eq_attr "cpu" "niagara") - (eq_attr "type" "fga,visl,vismv,fgm_pack,fgm_mul,pdist,edge,edgen,gsr,array")) + (eq_attr "type" "fga,visl,viscmp,vismv,fgm_pack,fgm_mul,pdist,edge,edgen,gsr,array,bmask")) "niag_pipe*8") diff --git a/gcc/config/sparc/niagara2.md b/gcc/config/sparc/niagara2.md index 9bcdd064f36..3190d556e53 100644 --- a/gcc/config/sparc/niagara2.md +++ b/gcc/config/sparc/niagara2.md @@ -111,10 +111,10 @@ (define_insn_reservation "niag2_vis" 6 (and (eq_attr "cpu" "niagara2") - (eq_attr "type" "fga,vismv,visl,fgm_pack,fgm_mul,pdist,edge,edgen,array,gsr")) + (eq_attr "type" "fga,vismv,visl,viscmp,fgm_pack,fgm_mul,pdist,edge,edgen,array,bmask,gsr")) "niag2_pipe*6") (define_insn_reservation "niag3_vis" 9 (and (eq_attr "cpu" "niagara3") - (eq_attr "type" "fga,vismv,visl,fgm_pack,fgm_mul,pdist,pdistn,edge,edgen,array,gsr")) + (eq_attr "type" "fga,vismv,visl,viscmp,fgm_pack,fgm_mul,pdist,pdistn,edge,edgen,array,bmask,gsr")) "niag2_pipe*9") diff --git a/gcc/config/sparc/niagara4.md b/gcc/config/sparc/niagara4.md index ad0a04b12d3..a3417d21c71 100644 --- a/gcc/config/sparc/niagara4.md +++ b/gcc/config/sparc/niagara4.md @@ -66,7 +66,7 @@ (define_insn_reservation "n4_array" 12 (and (eq_attr "cpu" "niagara4") - (eq_attr "type" "array,edge,edgen")) + (eq_attr "type" "array,bmask,edge,edgen")) "n4_slot1, nothing*11") (define_insn_reservation "n4_vis_move_1cycle" 1 @@ -90,8 +90,9 @@ (define_insn_reservation "n4_vis_logical" 3 (and (eq_attr "cpu" "niagara4") - (and (eq_attr "type" "visl,pdistn") - (eq_attr "fptype" "double"))) + (ior (and (eq_attr "type" "visl,pdistn") + (eq_attr 
"fptype" "double")) + (eq_attr "type" "viscmp"))) "n4_slot1, nothing*2") (define_insn_reservation "n4_vis_logical_11cycle" 11 diff --git a/gcc/config/sparc/niagara7.md b/gcc/config/sparc/niagara7.md index 12d6ab0fba5..23b67075e2b 100644 --- a/gcc/config/sparc/niagara7.md +++ b/gcc/config/sparc/niagara7.md @@ -19,64 +19,120 @@ (define_automaton "niagara7_0") -(define_cpu_unit "n7_slot0,n7_slot1,n7_slot2" "niagara7_0") -(define_reservation "n7_single_issue" "n7_slot0 + n7_slot1 + n7_slot2") +;; The S4 core has a dual-issue queue. This queue is divided into two +;; slots. One instruction can be issued each cycle to each slot, and +;; up to 2 instructions are committed each cycle. Each slot serves +;; several execution units, as depicted below: +;; +;; +;; m7_slot0 - Integer unit. +;; - Load/Store unit. +;; === QUEUE ==> +;; +;; m7_slot1 - Integer unit. +;; - Branch unit. +;; - Floating-point and graphics unit. +;; - 3-cycles crypto unit. -(define_cpu_unit "n7_load_store" "niagara7_0") +(define_cpu_unit "n7_slot0,n7_slot1" "niagara7_0") + +;; Some instructions stall the pipeline and avoid any other +;; instruction to be issued in the same cycle. We assume the same for +;; multi-instruction insns. + +(define_reservation "n7_single_issue" "n7_slot0 + n7_slot1") (define_insn_reservation "n7_single" 1 (and (eq_attr "cpu" "niagara7") (eq_attr "type" "multi,savew,flushw,trap")) "n7_single_issue") -(define_insn_reservation "n7_iflush" 27 - (and (eq_attr "cpu" "niagara7") - (eq_attr "type" "iflush")) - "(n7_slot0 | n7_slot1), nothing*26") +;; Most of the instructions executing in the integer unit have a +;; latency of 1. (define_insn_reservation "n7_integer" 1 (and (eq_attr "cpu" "niagara7") (eq_attr "type" "ialu,ialuX,shift,cmove,compare")) "(n7_slot0 | n7_slot1)") +;; Flushing the instruction memory takes 27 cycles. 
+ +(define_insn_reservation "n7_iflush" 27 + (and (eq_attr "cpu" "niagara7") + (eq_attr "type" "iflush")) + "(n7_slot0 | n7_slot1), nothing*26") + +;; The integer multiplication instructions have a latency of 12 cycles +;; and execute in the integer unit. +;; +;; Likewise for array*, edge* and pdistn instructions. + (define_insn_reservation "n7_imul" 12 (and (eq_attr "cpu" "niagara7") - (eq_attr "type" "imul")) - "n7_slot1, nothing*11") + (eq_attr "type" "imul,array,edge,edgen,pdistn")) + "(n7_slot0 | n7_slot1), nothing*11") + +;; The integer division instructions have a latency of 35 cycles and +;; execute in the integer unit. (define_insn_reservation "n7_idiv" 35 (and (eq_attr "cpu" "niagara7") (eq_attr "type" "idiv")) - "n7_slot1, nothing*34") + "(n7_slot0 | n7_slot1), nothing*34") + +;; Both integer and floating-point load instructions have a latency of +;; 5 cycles, and execute in the slot0. +;; +;; The prefetch instruction also executes in the load/store unit, but +;; its latency is only 1 cycle. (define_insn_reservation "n7_load" 5 (and (eq_attr "cpu" "niagara7") - (eq_attr "type" "load,fpload,sload")) - "(n7_slot0 + n7_load_store), nothing*4") + (ior (eq_attr "type" "fpload,sload") + (and (eq_attr "type" "load") + (eq_attr "subtype" "regular")))) + "n7_slot0, nothing*4") + +(define_insn_reservation "n7_prefetch" 1 + (and (eq_attr "cpu" "niagara7") + (eq_attr "type" "load") + (eq_attr "subtype" "prefetch")) + "n7_slot0") + +;; Both integer and floating-point store instructions have a latency +;; of 1 cycle, and execute in the load/store unit in slot0. (define_insn_reservation "n7_store" 1 (and (eq_attr "cpu" "niagara7") (eq_attr "type" "store,fpstore")) - "(n7_slot0 | n7_slot2) + n7_load_store") + "n7_slot0") + +;; Control-transfer instructions execute in the Branch Unit in the +;; slot1. 
(define_insn_reservation "n7_cti" 1 (and (eq_attr "cpu" "niagara7") (eq_attr "type" "cbcond,uncond_cbcond,branch,call,sibcall,call_no_delay_slot,uncond_branch,return")) "n7_slot1") +;; Many instructions executing in the Floating-point and Graphics unit +;; in the slot1 feature a latency of 11 cycles. + (define_insn_reservation "n7_fp" 11 (and (eq_attr "cpu" "niagara7") - (eq_attr "type" "fpmove,fpcmove,fpcrmove,fp,fpcmp,fpmul")) + (ior (eq_attr "type" "fpmove,fpcmove,fpcrmove,fp,fpcmp,fpmul,fgm_pack,fgm_mul,pdist") + (and (eq_attr "type" "fga") + (eq_attr "subtype" "fpu,maxmin")))) "n7_slot1, nothing*10") -(define_insn_reservation "n7_array" 12 - (and (eq_attr "cpu" "niagara7") - (eq_attr "type" "array,edge,edgen")) - "n7_slot1, nothing*11") +;; Floating-point division and floating-point square-root instructions +;; have high latencies. They execute in the floating-point and +;; graphics unit in the slot1. + (define_insn_reservation "n7_fpdivs" 24 (and (eq_attr "cpu" "niagara7") - (eq_attr "type" "fpdivs,fpsqrts")) + (eq_attr "type" "fpdivs,fpsqrts")) "n7_slot1, nothing*23") (define_insn_reservation "n7_fpdivd" 37 @@ -84,53 +140,66 @@ (eq_attr "type" "fpdivd,fpsqrtd")) "n7_slot1, nothing*36") -(define_insn_reservation "n7_lzd" 12 - (and (eq_attr "cpu" "niagara7") - (eq_attr "type" "lzd")) - "(n7_slot0 | n7_slot1), nothing*11") - -;; There is an internal unit called the "V3 pipe", that was originally -;; intended to process some of the short cryptographic instructions. -;; However, as soon as in the T4 several of the VIS instructions -;; (notably non-FP instructions) have been moved to the V3 pipe. -;; Consequently, these instructions feature a latency of 3 instead of -;; 11 or 12 cycles, provided their consumers also execute in the V3 -;; pipe. +;; SIMD VIS instructions executing in the Floating-point and graphics +;; unit (FPG) in slot1 usually have a latency of either 11 or 12 +;; cycles. ;; -;; This is modelled here with a bypass. 
+;; However, the latency for many instructions is only 3 cycles if the +;; consumer can also be executed in 3 cycles. We model this with a +;; bypass. In these cases the instructions are executed in the +;; 3-cycle crypto unit which also serves slot1. + +(define_insn_reservation "n7_vis_11cycles" 11 + (and (eq_attr "cpu" "niagara7") + (ior (and (eq_attr "type" "fga") + (eq_attr "subtype" "addsub64,other")) + (and (eq_attr "type" "vismv") + (eq_attr "subtype" "double,single")) + (and (eq_attr "type" "visl") + (eq_attr "subtype" "double,single")))) + "n7_slot1, nothing*10") -(define_insn_reservation "n7_vis_fga" 11 +(define_insn_reservation "n7_vis_12cycles" 12 (and (eq_attr "cpu" "niagara7") - (eq_attr "type" "fga,gsr")) - "n7_slot1, nothing*10") + (ior (eq_attr "type" "bmask,viscmp") + (and (eq_attr "type" "fga") + (eq_attr "subtype" "cmask")) + (and (eq_attr "type" "vismv") + (eq_attr "subtype" "movstouw")))) + "n7_slot1, nothing*11") + +(define_bypass 3 "n7_vis_*" "n7_vis_*") + +;; Some other VIS instructions have a latency of 12 cycles, and won't +;; be executed in the 3-cycle crypto pipe. -(define_insn_reservation "n7_vis_fgm" 11 +(define_insn_reservation "n7_lzd" 12 (and (eq_attr "cpu" "niagara7") - (eq_attr "type" "fgm_pack,fgm_mul,pdist")) - "n7_slot1, nothing*10") + (ior (eq_attr "type" "lzd") + (and (eq_attr "type" "gsr") + (eq_attr "subtype" "alignaddr")))) + "n7_slot1, nothing*11") -(define_insn_reservation "n7_vis_move_v3pipe" 11 +;; A couple of VIS instructions feature very low latencies in the M7.
+ +(define_insn_reservation "n7_single_vis" 1 (and (eq_attr "cpu" "niagara7") - (and (eq_attr "type" "vismv") - (eq_attr "v3pipe" "true"))) + (eq_attr "type" "vismv") + (eq_attr "subtype" "movxtod")) "n7_slot1") -(define_insn_reservation "n7_vis_move_11cycle" 11 +(define_insn_reservation "n7_double_vis" 2 (and (eq_attr "cpu" "niagara7") - (and (eq_attr "type" "vismv") - (eq_attr "v3pipe" "false"))) - "n7_slot1, nothing*10") + (eq_attr "type" "vismv") + (eq_attr "subtype" "movdtox")) + "n7_slot1, nothing") -(define_insn_reservation "n7_vis_logical_v3pipe" 11 - (and (eq_attr "cpu" "niagara7") - (and (eq_attr "type" "visl,pdistn") - (eq_attr "v3pipe" "true"))) - "n7_slot1, nothing*2") +;; Reading and writing to the gsr register takes a high number of +;; cycles that is not documented in the PRM. Let's use the same value +;; as the M8. -(define_insn_reservation "n7_vis_logical_11cycle" 11 +(define_insn_reservation "n7_gsr_reg" 70 (and (eq_attr "cpu" "niagara7") - (and (eq_attr "type" "visl") - (eq_attr "v3pipe" "false"))) - "n7_slot1, nothing*10") - -(define_bypass 3 "*_v3pipe" "*_v3pipe") + (eq_attr "type" "gsr") + (eq_attr "subtype" "reg")) + "n7_slot1, nothing*70") diff --git a/gcc/config/sparc/predicates.md b/gcc/config/sparc/predicates.md index 951933efb39..3f8526dc3ef 100644 --- a/gcc/config/sparc/predicates.md +++ b/gcc/config/sparc/predicates.md @@ -328,6 +328,33 @@ (and (match_code "const_int") (match_test "SPARC_SIMM5_P (INTVAL (op))")))) +;; Return true if OP is a constant in the range 0..7. This is an +;; acceptable second operand for dictunpack instructions setting a +;; V8QI mode in the destination register. +(define_predicate "imm5_operand_dictunpack8" + (and (match_code "const_int") + (match_test "(INTVAL (op) >= 0 && INTVAL (op) < 8)"))) + +;; Return true if OP is a constant in the range 8..15. This is an +;; acceptable second operand for dictunpack instructions setting a +;; V4HI mode in the destination register.
+(define_predicate "imm5_operand_dictunpack16" + (and (match_code "const_int") + (match_test "(INTVAL (op) >= 8 && INTVAL (op) < 16)"))) + +;; Return true if OP is a constant in the range 16..31. This is an +;; acceptable second operand for dictunpack instructions setting a +;; V2SI mode in the destination register. +(define_predicate "imm5_operand_dictunpack32" + (and (match_code "const_int") + (match_test "(INTVAL (op) >= 16 && INTVAL (op) < 32)"))) + +;; Return true if OP is a constant that is representable by a 2-bit +;; unsigned field. This is an acceptable third operand for +;; fpcmp*shl instructions. +(define_predicate "imm2_operand" + (and (match_code "const_int") + (match_test "SPARC_IMM2_P (INTVAL (op))"))) ;; Predicates for miscellaneous instructions. diff --git a/gcc/config/sparc/sol2.h b/gcc/config/sparc/sol2.h index 8a50bfeefc7..b8177c0b692 100644 --- a/gcc/config/sparc/sol2.h +++ b/gcc/config/sparc/sol2.h @@ -174,13 +174,22 @@ along with GCC; see the file COPYING3. If not see #define ASM_CPU64_DEFAULT_SPEC AS_SPARC64_FLAG AS_NIAGARA7_FLAG #endif +#if TARGET_CPU_DEFAULT == TARGET_CPU_m8 +#undef CPP_CPU64_DEFAULT_SPEC +#define CPP_CPU64_DEFAULT_SPEC "" +#undef ASM_CPU32_DEFAULT_SPEC +#define ASM_CPU32_DEFAULT_SPEC AS_SPARC32_FLAG AS_M8_FLAG +#undef ASM_CPU64_DEFAULT_SPEC +#define ASM_CPU64_DEFAULT_SPEC AS_SPARC64_FLAG AS_M8_FLAG +#endif + #undef CPP_CPU_SPEC #define CPP_CPU_SPEC "\ %{mcpu=sparclet|mcpu=tsc701:-D__sparclet__} \ %{mcpu=sparclite|mcpu-f930|mcpu=f934:-D__sparclite__} \ %{mcpu=v8:" DEF_ARCH32_SPEC("-D__sparcv8") "} \ %{mcpu=supersparc:-D__supersparc__ " DEF_ARCH32_SPEC("-D__sparcv8") "} \ -%{mcpu=v9|mcpu=ultrasparc|mcpu=ultrasparc3|mcpu=niagara|mcpu=niagara2|mcpu=niagara3|mcpu=niagara4|mcpu=niagara7:" DEF_ARCH32_SPEC("-D__sparcv8") "} \ +%{mcpu=v9|mcpu=ultrasparc|mcpu=ultrasparc3|mcpu=niagara|mcpu=niagara2|mcpu=niagara3|mcpu=niagara4|mcpu=niagara7|mcpu=m8:" DEF_ARCH32_SPEC("-D__sparcv8") "} \ %{!mcpu*:%(cpp_cpu_default)} \ " @@ -290,7
+299,8 @@ extern const char *host_detect_local_cpu (int argc, const char **argv); %{mcpu=niagara3:" DEF_ARCH32_SPEC("-xarch=v8plus" AS_NIAGARA3_FLAG) DEF_ARCH64_SPEC("-xarch=v9" AS_NIAGARA3_FLAG) "} \ %{mcpu=niagara4:" DEF_ARCH32_SPEC(AS_SPARC32_FLAG AS_NIAGARA4_FLAG) DEF_ARCH64_SPEC(AS_SPARC64_FLAG AS_NIAGARA4_FLAG) "} \ %{mcpu=niagara7:" DEF_ARCH32_SPEC(AS_SPARC32_FLAG AS_NIAGARA7_FLAG) DEF_ARCH64_SPEC(AS_SPARC64_FLAG AS_NIAGARA7_FLAG) "} \ -%{!mcpu=niagara7:%{!mcpu=niagara4:%{!mcpu=niagara3:%{!mcpu=niagara2:%{!mcpu=niagara:%{!mcpu=ultrasparc3:%{!mcpu=ultrasparc:%{!mcpu=v9:%{mcpu*:" DEF_ARCH32_SPEC("-xarch=v8") DEF_ARCH64_SPEC("-xarch=v9") "}}}}}}}}} \ +%{mcpu=m8:" DEF_ARCH32_SPEC(AS_SPARC32_FLAG AS_M8_FLAG) DEF_ARCH64_SPEC(AS_SPARC64_FLAG AS_M8_FLAG) "} \ +%{!mcpu=m8:%{!mcpu=niagara7:%{!mcpu=niagara4:%{!mcpu=niagara3:%{!mcpu=niagara2:%{!mcpu=niagara:%{!mcpu=ultrasparc3:%{!mcpu=ultrasparc:%{!mcpu=v9:%{mcpu*:" DEF_ARCH32_SPEC("-xarch=v8") DEF_ARCH64_SPEC("-xarch=v9") "}}}}}}}}}} \ %{!mcpu*:%(asm_cpu_default)} \ " diff --git a/gcc/config/sparc/sparc-c.c b/gcc/config/sparc/sparc-c.c index 960317350fe..4aacfff05ff 100644 --- a/gcc/config/sparc/sparc-c.c +++ b/gcc/config/sparc/sparc-c.c @@ -40,7 +40,12 @@ sparc_target_macros (void) cpp_assert (parse_in, "machine=sparc"); } - if (TARGET_VIS4) + if (TARGET_VIS4B) + { + cpp_define (parse_in, "__VIS__=0x410"); + cpp_define (parse_in, "__VIS=0x410"); + } + else if (TARGET_VIS4) { cpp_define (parse_in, "__VIS__=0x400"); cpp_define (parse_in, "__VIS=0x400"); diff --git a/gcc/config/sparc/sparc-opts.h b/gcc/config/sparc/sparc-opts.h index 6e7c2ace277..581e86e49d1 100644 --- a/gcc/config/sparc/sparc-opts.h +++ b/gcc/config/sparc/sparc-opts.h @@ -46,6 +46,7 @@ enum processor_type { PROCESSOR_NIAGARA3, PROCESSOR_NIAGARA4, PROCESSOR_NIAGARA7, + PROCESSOR_M8, PROCESSOR_NATIVE }; diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c index 790a0367b67..9f9a29ac4d2 100644 --- a/gcc/config/sparc/sparc.c +++ 
b/gcc/config/sparc/sparc.c @@ -448,6 +448,30 @@ struct processor_costs niagara7_costs = { 0, /* shift penalty */ }; +static const +struct processor_costs m8_costs = { + COSTS_N_INSNS (3), /* int load */ + COSTS_N_INSNS (3), /* int signed load */ + COSTS_N_INSNS (3), /* int zeroed load */ + COSTS_N_INSNS (3), /* float load */ + COSTS_N_INSNS (9), /* fmov, fneg, fabs */ + COSTS_N_INSNS (9), /* fadd, fsub */ + COSTS_N_INSNS (9), /* fcmp */ + COSTS_N_INSNS (9), /* fmov, fmovr */ + COSTS_N_INSNS (9), /* fmul */ + COSTS_N_INSNS (26), /* fdivs */ + COSTS_N_INSNS (30), /* fdivd */ + COSTS_N_INSNS (33), /* fsqrts */ + COSTS_N_INSNS (41), /* fsqrtd */ + COSTS_N_INSNS (12), /* imul */ + COSTS_N_INSNS (10), /* imulX */ + 0, /* imul bit factor */ + COSTS_N_INSNS (57), /* udiv/sdiv */ + COSTS_N_INSNS (30), /* udivx/sdivx */ + COSTS_N_INSNS (1), /* movcc/movr */ + 0, /* shift penalty */ +}; + static const struct processor_costs *sparc_costs = &cypress_costs; #ifdef HAVE_AS_RELAX_OPTION @@ -1222,6 +1246,8 @@ dump_target_flag_bits (const int flags) fprintf (stderr, "VIS3 "); if (flags & MASK_VIS4) fprintf (stderr, "VIS4 "); + if (flags & MASK_VIS4B) + fprintf (stderr, "VIS4B "); if (flags & MASK_CBCOND) fprintf (stderr, "CBCOND "); if (flags & MASK_DEPRECATED_V8_INSNS) @@ -1286,6 +1312,7 @@ sparc_option_override (void) { TARGET_CPU_niagara3, PROCESSOR_NIAGARA3 }, { TARGET_CPU_niagara4, PROCESSOR_NIAGARA4 }, { TARGET_CPU_niagara7, PROCESSOR_NIAGARA7 }, + { TARGET_CPU_m8, PROCESSOR_M8 }, { -1, PROCESSOR_V7 } }; const struct cpu_default *def; @@ -1337,7 +1364,11 @@ sparc_option_override (void) MASK_V9|MASK_POPC|MASK_VIS3|MASK_FMAF|MASK_CBCOND }, /* UltraSPARC M7 */ { "niagara7", MASK_ISA, - MASK_V9|MASK_POPC|MASK_VIS4|MASK_FMAF|MASK_CBCOND|MASK_SUBXC } + MASK_V9|MASK_POPC|MASK_VIS4|MASK_FMAF|MASK_CBCOND|MASK_SUBXC }, + /* UltraSPARC M8 */ + { "m8", MASK_ISA, + MASK_V9|MASK_POPC|MASK_VIS4|MASK_FMAF|MASK_CBCOND|MASK_SUBXC + |MASK_VIS4B } }; const struct cpu_table *cpu; unsigned int i; 
@@ -1467,6 +1498,9 @@ sparc_option_override (void) #ifndef HAVE_AS_SPARC5_VIS4 & ~(MASK_VIS4 | MASK_SUBXC) #endif +#ifndef HAVE_AS_SPARC6 + & ~(MASK_VIS4B) +#endif #ifndef HAVE_AS_LEON & ~(MASK_LEON | MASK_LEON3) #endif @@ -1485,11 +1519,15 @@ sparc_option_override (void) if (TARGET_VIS4) target_flags |= MASK_VIS3 | MASK_VIS2 | MASK_VIS; - /* Don't allow -mvis, -mvis2, -mvis3, -mvis4 or -mfmaf if FPU is - disabled. */ + /* -mvis4b implies -mvis4, -mvis3, -mvis2 and -mvis */ + if (TARGET_VIS4B) + target_flags |= MASK_VIS4 | MASK_VIS3 | MASK_VIS2 | MASK_VIS; + + /* Don't allow -mvis, -mvis2, -mvis3, -mvis4, -mvis4b and -mfmaf if + FPU is disabled. */ if (! TARGET_FPU) target_flags &= ~(MASK_VIS | MASK_VIS2 | MASK_VIS3 | MASK_VIS4 - | MASK_FMAF); + | MASK_VIS4B | MASK_FMAF); /* -mvis assumes UltraSPARC+, so we are sure v9 instructions are available; -m64 also implies v9. */ @@ -1529,7 +1567,8 @@ sparc_option_override (void) || sparc_cpu == PROCESSOR_NIAGARA3 || sparc_cpu == PROCESSOR_NIAGARA4) align_functions = 32; - else if (sparc_cpu == PROCESSOR_NIAGARA7) + else if (sparc_cpu == PROCESSOR_NIAGARA7 + || sparc_cpu == PROCESSOR_M8) align_functions = 64; } @@ -1597,6 +1636,9 @@ sparc_option_override (void) case PROCESSOR_NIAGARA7: sparc_costs = &niagara7_costs; break; + case PROCESSOR_M8: + sparc_costs = &m8_costs; + break; case PROCESSOR_NATIVE: gcc_unreachable (); }; @@ -1659,13 +1701,14 @@ sparc_option_override (void) || sparc_cpu == PROCESSOR_NIAGARA4) ? 2 : (sparc_cpu == PROCESSOR_ULTRASPARC3 - ? 8 : (sparc_cpu == PROCESSOR_NIAGARA7 + ? 8 : ((sparc_cpu == PROCESSOR_NIAGARA7 + || sparc_cpu == PROCESSOR_M8) ? 32 : 3))), global_options.x_param_values, global_options_set.x_param_values); - /* For PARAM_L1_CACHE_LINE_SIZE we use the default 32 bytes (see - params.def), so no maybe_set_param_value is needed. + /* PARAM_L1_CACHE_LINE_SIZE is the size of the L1 cache line, in + bytes. 
The Oracle SPARC Architecture (previously the UltraSPARC Architecture) specification states that when a PREFETCH[A] @@ -1681,6 +1724,11 @@ sparc_option_override (void) L2 and L3, but only 32B are brought into the L1D$. (Assuming it is a read_n prefetch, which is the only type which allocates to the L1.) */ + maybe_set_param_value (PARAM_L1_CACHE_LINE_SIZE, + (sparc_cpu == PROCESSOR_M8 + ? 64 : 32), + global_options.x_param_values, + global_options_set.x_param_values); /* PARAM_L1_CACHE_SIZE is the size of the L1D$ (most SPARC chips use Hardvard level-1 caches) in kilobytes. Both UltraSPARC and @@ -1692,7 +1740,8 @@ sparc_option_override (void) || sparc_cpu == PROCESSOR_NIAGARA2 || sparc_cpu == PROCESSOR_NIAGARA3 || sparc_cpu == PROCESSOR_NIAGARA4 - || sparc_cpu == PROCESSOR_NIAGARA7) + || sparc_cpu == PROCESSOR_NIAGARA7 + || sparc_cpu == PROCESSOR_M8) ? 16 : 64), global_options.x_param_values, global_options_set.x_param_values); @@ -1701,7 +1750,8 @@ sparc_option_override (void) /* PARAM_L2_CACHE_SIZE is the size fo the L2 in kilobytes. Note that 512 is the default in params.def. */ maybe_set_param_value (PARAM_L2_CACHE_SIZE, - (sparc_cpu == PROCESSOR_NIAGARA4 + ((sparc_cpu == PROCESSOR_NIAGARA4 + || sparc_cpu == PROCESSOR_M8) ? 128 : (sparc_cpu == PROCESSOR_NIAGARA7 ? 
256 : 512)), global_options.x_param_values, @@ -9478,7 +9528,8 @@ sparc32_initialize_trampoline (rtx m_tramp, rtx fnaddr, rtx cxt) && sparc_cpu != PROCESSOR_NIAGARA2 && sparc_cpu != PROCESSOR_NIAGARA3 && sparc_cpu != PROCESSOR_NIAGARA4 - && sparc_cpu != PROCESSOR_NIAGARA7) + && sparc_cpu != PROCESSOR_NIAGARA7 + && sparc_cpu != PROCESSOR_M8) emit_insn (gen_flushsi (validize_mem (adjust_address (m_tramp, SImode, 8)))); /* Call __enable_execute_stack after writing onto the stack to make sure @@ -9524,7 +9575,8 @@ sparc64_initialize_trampoline (rtx m_tramp, rtx fnaddr, rtx cxt) && sparc_cpu != PROCESSOR_NIAGARA2 && sparc_cpu != PROCESSOR_NIAGARA3 && sparc_cpu != PROCESSOR_NIAGARA4 - && sparc_cpu != PROCESSOR_NIAGARA7) + && sparc_cpu != PROCESSOR_NIAGARA7 + && sparc_cpu != PROCESSOR_M8) emit_insn (gen_flushdi (validize_mem (adjust_address (m_tramp, DImode, 8)))); /* Call __enable_execute_stack after writing onto the stack to make sure @@ -9724,7 +9776,8 @@ sparc_use_sched_lookahead (void) || sparc_cpu == PROCESSOR_NIAGARA3) return 0; if (sparc_cpu == PROCESSOR_NIAGARA4 - || sparc_cpu == PROCESSOR_NIAGARA7) + || sparc_cpu == PROCESSOR_NIAGARA7 + || sparc_cpu == PROCESSOR_M8) return 2; if (sparc_cpu == PROCESSOR_ULTRASPARC || sparc_cpu == PROCESSOR_ULTRASPARC3) @@ -9758,6 +9811,7 @@ sparc_issue_rate (void) return 2; case PROCESSOR_ULTRASPARC: case PROCESSOR_ULTRASPARC3: + case PROCESSOR_M8: return 4; } } @@ -10340,6 +10394,45 @@ enum sparc_builtins SPARC_BUILTIN_FPSUBS8, SPARC_BUILTIN_FPSUBUS8, SPARC_BUILTIN_FPSUBUS16, + + /* VIS 4.0B builtins. */ + + /* Note that all the DICTUNPACK* entries should be kept + contiguous. */ + SPARC_BUILTIN_FIRST_DICTUNPACK, + SPARC_BUILTIN_DICTUNPACK8 = SPARC_BUILTIN_FIRST_DICTUNPACK, + SPARC_BUILTIN_DICTUNPACK16, + SPARC_BUILTIN_DICTUNPACK32, + SPARC_BUILTIN_LAST_DICTUNPACK = SPARC_BUILTIN_DICTUNPACK32, + + /* Note that all the FPCMP*SHL entries should be kept + contiguous. 
*/ + SPARC_BUILTIN_FIRST_FPCMPSHL, + SPARC_BUILTIN_FPCMPLE8SHL = SPARC_BUILTIN_FIRST_FPCMPSHL, + SPARC_BUILTIN_FPCMPGT8SHL, + SPARC_BUILTIN_FPCMPEQ8SHL, + SPARC_BUILTIN_FPCMPNE8SHL, + SPARC_BUILTIN_FPCMPLE16SHL, + SPARC_BUILTIN_FPCMPGT16SHL, + SPARC_BUILTIN_FPCMPEQ16SHL, + SPARC_BUILTIN_FPCMPNE16SHL, + SPARC_BUILTIN_FPCMPLE32SHL, + SPARC_BUILTIN_FPCMPGT32SHL, + SPARC_BUILTIN_FPCMPEQ32SHL, + SPARC_BUILTIN_FPCMPNE32SHL, + SPARC_BUILTIN_FPCMPULE8SHL, + SPARC_BUILTIN_FPCMPUGT8SHL, + SPARC_BUILTIN_FPCMPULE16SHL, + SPARC_BUILTIN_FPCMPUGT16SHL, + SPARC_BUILTIN_FPCMPULE32SHL, + SPARC_BUILTIN_FPCMPUGT32SHL, + SPARC_BUILTIN_FPCMPDE8SHL, + SPARC_BUILTIN_FPCMPDE16SHL, + SPARC_BUILTIN_FPCMPDE32SHL, + SPARC_BUILTIN_FPCMPUR8SHL, + SPARC_BUILTIN_FPCMPUR16SHL, + SPARC_BUILTIN_FPCMPUR32SHL, + SPARC_BUILTIN_LAST_FPCMPSHL = SPARC_BUILTIN_FPCMPUR32SHL, SPARC_BUILTIN_MAX }; @@ -10347,6 +10440,27 @@ enum sparc_builtins static GTY (()) tree sparc_builtins[(int) SPARC_BUILTIN_MAX]; static enum insn_code sparc_builtins_icode[(int) SPARC_BUILTIN_MAX]; +/* Return true if OPVAL can be used for operand OPNUM of instruction ICODE. + The instruction should require a constant operand of some sort. The + function prints an error if OPVAL is not valid. */ + +static int +check_constant_argument (enum insn_code icode, int opnum, rtx opval) +{ + if (GET_CODE (opval) != CONST_INT) + { + error ("%qs expects a constant argument", insn_data[icode].name); + return false; + } + + if (!(*insn_data[icode].operand[opnum].predicate) (opval, VOIDmode)) + { + error ("constant argument out of range for %qs", insn_data[icode].name); + return false; + } + return true; +} + /* Add a SPARC builtin function with NAME, ICODE, CODE and TYPE. Return the function decl or NULL_TREE if the builtin was not added. 
*/ @@ -10440,6 +10554,12 @@ sparc_vis_init_builtins (void) v8qi, v8qi, 0); tree si_ftype_v8qi_v8qi = build_function_type_list (intSI_type_node, v8qi, v8qi, 0); + tree v8qi_ftype_df_si = build_function_type_list (v8qi, double_type_node, + intSI_type_node, 0); + tree v4hi_ftype_df_si = build_function_type_list (v4hi, double_type_node, + intSI_type_node, 0); + tree v2si_ftype_df_si = build_function_type_list (v2si, double_type_node, + intSI_type_node, 0); tree di_ftype_di_di = build_function_type_list (intDI_type_node, intDI_type_node, intDI_type_node, 0); @@ -10894,6 +11014,156 @@ sparc_vis_init_builtins (void) def_builtin_const ("__builtin_vis_fpsubus16", CODE_FOR_ussubv4hi3, SPARC_BUILTIN_FPSUBUS16, v4hi_ftype_v4hi_v4hi); } + + if (TARGET_VIS4B) + { + def_builtin_const ("__builtin_vis_dictunpack8", CODE_FOR_dictunpack8, + SPARC_BUILTIN_DICTUNPACK8, v8qi_ftype_df_si); + def_builtin_const ("__builtin_vis_dictunpack16", CODE_FOR_dictunpack16, + SPARC_BUILTIN_DICTUNPACK16, v4hi_ftype_df_si); + def_builtin_const ("__builtin_vis_dictunpack32", CODE_FOR_dictunpack32, + SPARC_BUILTIN_DICTUNPACK32, v2si_ftype_df_si); + + if (TARGET_ARCH64) + { + tree di_ftype_v8qi_v8qi_si = build_function_type_list (intDI_type_node, + v8qi, v8qi, + intSI_type_node, 0); + tree di_ftype_v4hi_v4hi_si = build_function_type_list (intDI_type_node, + v4hi, v4hi, + intSI_type_node, 0); + tree di_ftype_v2si_v2si_si = build_function_type_list (intDI_type_node, + v2si, v2si, + intSI_type_node, 0); + + def_builtin_const ("__builtin_vis_fpcmple8shl", CODE_FOR_fpcmple8dishl, + SPARC_BUILTIN_FPCMPLE8SHL, di_ftype_v8qi_v8qi_si); + def_builtin_const ("__builtin_vis_fpcmpgt8shl", CODE_FOR_fpcmpgt8dishl, + SPARC_BUILTIN_FPCMPGT8SHL, di_ftype_v8qi_v8qi_si); + def_builtin_const ("__builtin_vis_fpcmpeq8shl", CODE_FOR_fpcmpeq8dishl, + SPARC_BUILTIN_FPCMPEQ8SHL, di_ftype_v8qi_v8qi_si); + def_builtin_const ("__builtin_vis_fpcmpne8shl", CODE_FOR_fpcmpne8dishl, + SPARC_BUILTIN_FPCMPNE8SHL, di_ftype_v8qi_v8qi_si); + +
def_builtin_const ("__builtin_vis_fpcmple16shl", CODE_FOR_fpcmple16dishl, + SPARC_BUILTIN_FPCMPLE16SHL, di_ftype_v4hi_v4hi_si); + def_builtin_const ("__builtin_vis_fpcmpgt16shl", CODE_FOR_fpcmpgt16dishl, + SPARC_BUILTIN_FPCMPGT16SHL, di_ftype_v4hi_v4hi_si); + def_builtin_const ("__builtin_vis_fpcmpeq16shl", CODE_FOR_fpcmpeq16dishl, + SPARC_BUILTIN_FPCMPEQ16SHL, di_ftype_v4hi_v4hi_si); + def_builtin_const ("__builtin_vis_fpcmpne16shl", CODE_FOR_fpcmpne16dishl, + SPARC_BUILTIN_FPCMPNE16SHL, di_ftype_v4hi_v4hi_si); + + def_builtin_const ("__builtin_vis_fpcmple32shl", CODE_FOR_fpcmple32dishl, + SPARC_BUILTIN_FPCMPLE32SHL, di_ftype_v2si_v2si_si); + def_builtin_const ("__builtin_vis_fpcmpgt32shl", CODE_FOR_fpcmpgt32dishl, + SPARC_BUILTIN_FPCMPGT32SHL, di_ftype_v2si_v2si_si); + def_builtin_const ("__builtin_vis_fpcmpeq32shl", CODE_FOR_fpcmpeq32dishl, + SPARC_BUILTIN_FPCMPEQ32SHL, di_ftype_v2si_v2si_si); + def_builtin_const ("__builtin_vis_fpcmpne32shl", CODE_FOR_fpcmpne32dishl, + SPARC_BUILTIN_FPCMPNE32SHL, di_ftype_v2si_v2si_si); + + + def_builtin_const ("__builtin_vis_fpcmpule8shl", CODE_FOR_fpcmpule8dishl, + SPARC_BUILTIN_FPCMPULE8SHL, di_ftype_v8qi_v8qi_si); + def_builtin_const ("__builtin_vis_fpcmpugt8shl", CODE_FOR_fpcmpugt8dishl, + SPARC_BUILTIN_FPCMPUGT8SHL, di_ftype_v8qi_v8qi_si); + + def_builtin_const ("__builtin_vis_fpcmpule16shl", CODE_FOR_fpcmpule16dishl, + SPARC_BUILTIN_FPCMPULE16SHL, di_ftype_v4hi_v4hi_si); + def_builtin_const ("__builtin_vis_fpcmpugt16shl", CODE_FOR_fpcmpugt16dishl, + SPARC_BUILTIN_FPCMPUGT16SHL, di_ftype_v4hi_v4hi_si); + + def_builtin_const ("__builtin_vis_fpcmpule32shl", CODE_FOR_fpcmpule32dishl, + SPARC_BUILTIN_FPCMPULE32SHL, di_ftype_v2si_v2si_si); + def_builtin_const ("__builtin_vis_fpcmpugt32shl", CODE_FOR_fpcmpugt32dishl, + SPARC_BUILTIN_FPCMPUGT32SHL, di_ftype_v2si_v2si_si); + + def_builtin_const ("__builtin_vis_fpcmpde8shl", CODE_FOR_fpcmpde8dishl, + SPARC_BUILTIN_FPCMPDE8SHL, di_ftype_v8qi_v8qi_si); + def_builtin_const 
("__builtin_vis_fpcmpde16shl", CODE_FOR_fpcmpde16dishl, + SPARC_BUILTIN_FPCMPDE16SHL, di_ftype_v4hi_v4hi_si); + def_builtin_const ("__builtin_vis_fpcmpde32shl", CODE_FOR_fpcmpde32dishl, + SPARC_BUILTIN_FPCMPDE32SHL, di_ftype_v2si_v2si_si); + + def_builtin_const ("__builtin_vis_fpcmpur8shl", CODE_FOR_fpcmpur8dishl, + SPARC_BUILTIN_FPCMPUR8SHL, di_ftype_v8qi_v8qi_si); + def_builtin_const ("__builtin_vis_fpcmpur16shl", CODE_FOR_fpcmpur16dishl, + SPARC_BUILTIN_FPCMPUR16SHL, di_ftype_v4hi_v4hi_si); + def_builtin_const ("__builtin_vis_fpcmpur32shl", CODE_FOR_fpcmpur32dishl, + SPARC_BUILTIN_FPCMPUR32SHL, di_ftype_v2si_v2si_si); + + } + else + { + tree si_ftype_v8qi_v8qi_si = build_function_type_list (intSI_type_node, + v8qi, v8qi, + intSI_type_node, 0); + tree si_ftype_v4hi_v4hi_si = build_function_type_list (intSI_type_node, + v4hi, v4hi, + intSI_type_node, 0); + tree si_ftype_v2si_v2si_si = build_function_type_list (intSI_type_node, + v2si, v2si, + intSI_type_node, 0); + + def_builtin_const ("__builtin_vis_fpcmple8shl", CODE_FOR_fpcmple8sishl, + SPARC_BUILTIN_FPCMPLE8SHL, si_ftype_v8qi_v8qi_si); + def_builtin_const ("__builtin_vis_fpcmpgt8shl", CODE_FOR_fpcmpgt8sishl, + SPARC_BUILTIN_FPCMPGT8SHL, si_ftype_v8qi_v8qi_si); + def_builtin_const ("__builtin_vis_fpcmpeq8shl", CODE_FOR_fpcmpeq8sishl, + SPARC_BUILTIN_FPCMPEQ8SHL, si_ftype_v8qi_v8qi_si); + def_builtin_const ("__builtin_vis_fpcmpne8shl", CODE_FOR_fpcmpne8sishl, + SPARC_BUILTIN_FPCMPNE8SHL, si_ftype_v8qi_v8qi_si); + + def_builtin_const ("__builtin_vis_fpcmple16shl", CODE_FOR_fpcmple16sishl, + SPARC_BUILTIN_FPCMPLE16SHL, si_ftype_v4hi_v4hi_si); + def_builtin_const ("__builtin_vis_fpcmpgt16shl", CODE_FOR_fpcmpgt16sishl, + SPARC_BUILTIN_FPCMPGT16SHL, si_ftype_v4hi_v4hi_si); + def_builtin_const ("__builtin_vis_fpcmpeq16shl", CODE_FOR_fpcmpeq16sishl, + SPARC_BUILTIN_FPCMPEQ16SHL, si_ftype_v4hi_v4hi_si); + def_builtin_const ("__builtin_vis_fpcmpne16shl", CODE_FOR_fpcmpne16sishl, + SPARC_BUILTIN_FPCMPNE16SHL, 
si_ftype_v4hi_v4hi_si); + + def_builtin_const ("__builtin_vis_fpcmple32shl", CODE_FOR_fpcmple32sishl, + SPARC_BUILTIN_FPCMPLE32SHL, si_ftype_v2si_v2si_si); + def_builtin_const ("__builtin_vis_fpcmpgt32shl", CODE_FOR_fpcmpgt32sishl, + SPARC_BUILTIN_FPCMPGT32SHL, si_ftype_v2si_v2si_si); + def_builtin_const ("__builtin_vis_fpcmpeq32shl", CODE_FOR_fpcmpeq32sishl, + SPARC_BUILTIN_FPCMPEQ32SHL, si_ftype_v2si_v2si_si); + def_builtin_const ("__builtin_vis_fpcmpne32shl", CODE_FOR_fpcmpne32sishl, + SPARC_BUILTIN_FPCMPNE32SHL, si_ftype_v2si_v2si_si); + + + def_builtin_const ("__builtin_vis_fpcmpule8shl", CODE_FOR_fpcmpule8sishl, + SPARC_BUILTIN_FPCMPULE8SHL, si_ftype_v8qi_v8qi_si); + def_builtin_const ("__builtin_vis_fpcmpugt8shl", CODE_FOR_fpcmpugt8sishl, + SPARC_BUILTIN_FPCMPUGT8SHL, si_ftype_v8qi_v8qi_si); + + def_builtin_const ("__builtin_vis_fpcmpule16shl", CODE_FOR_fpcmpule16sishl, + SPARC_BUILTIN_FPCMPULE16SHL, si_ftype_v4hi_v4hi_si); + def_builtin_const ("__builtin_vis_fpcmpugt16shl", CODE_FOR_fpcmpugt16sishl, + SPARC_BUILTIN_FPCMPUGT16SHL, si_ftype_v4hi_v4hi_si); + + def_builtin_const ("__builtin_vis_fpcmpule32shl", CODE_FOR_fpcmpule32sishl, + SPARC_BUILTIN_FPCMPULE32SHL, si_ftype_v2si_v2si_si); + def_builtin_const ("__builtin_vis_fpcmpugt32shl", CODE_FOR_fpcmpugt32sishl, + SPARC_BUILTIN_FPCMPUGT32SHL, si_ftype_v2si_v2si_si); + + def_builtin_const ("__builtin_vis_fpcmpde8shl", CODE_FOR_fpcmpde8sishl, + SPARC_BUILTIN_FPCMPDE8SHL, si_ftype_v8qi_v8qi_si); + def_builtin_const ("__builtin_vis_fpcmpde16shl", CODE_FOR_fpcmpde16sishl, + SPARC_BUILTIN_FPCMPDE16SHL, si_ftype_v4hi_v4hi_si); + def_builtin_const ("__builtin_vis_fpcmpde32shl", CODE_FOR_fpcmpde32sishl, + SPARC_BUILTIN_FPCMPDE32SHL, si_ftype_v2si_v2si_si); + + def_builtin_const ("__builtin_vis_fpcmpur8shl", CODE_FOR_fpcmpur8sishl, + SPARC_BUILTIN_FPCMPUR8SHL, si_ftype_v8qi_v8qi_si); + def_builtin_const ("__builtin_vis_fpcmpur16shl", CODE_FOR_fpcmpur16sishl, + SPARC_BUILTIN_FPCMPUR16SHL, si_ftype_v4hi_v4hi_si); + 
def_builtin_const ("__builtin_vis_fpcmpur32shl", CODE_FOR_fpcmpur32sishl, + SPARC_BUILTIN_FPCMPUR32SHL, si_ftype_v2si_v2si_si); + } + } } /* Implement TARGET_BUILTIN_DECL hook. */ @@ -10948,6 +11218,19 @@ sparc_expand_builtin (tree exp, rtx target, insn_op = &insn_data[icode].operand[idx]; op[arg_count] = expand_normal (arg); + /* Some of the builtins require constant arguments. We check + for this here. */ + if ((code >= SPARC_BUILTIN_FIRST_FPCMPSHL + && code <= SPARC_BUILTIN_LAST_FPCMPSHL + && arg_count == 3) + || (code >= SPARC_BUILTIN_FIRST_DICTUNPACK + && code <= SPARC_BUILTIN_LAST_DICTUNPACK + && arg_count == 2)) + { + if (!check_constant_argument (icode, idx, op[arg_count])) + return const0_rtx; + } + if (code == SPARC_BUILTIN_LDFSR || code == SPARC_BUILTIN_STFSR) { if (!address_operand (op[arg_count], SImode)) @@ -11458,7 +11741,8 @@ sparc_register_move_cost (machine_mode mode ATTRIBUTE_UNUSED, || sparc_cpu == PROCESSOR_NIAGARA2 || sparc_cpu == PROCESSOR_NIAGARA3 || sparc_cpu == PROCESSOR_NIAGARA4 - || sparc_cpu == PROCESSOR_NIAGARA7) + || sparc_cpu == PROCESSOR_NIAGARA7 + || sparc_cpu == PROCESSOR_M8) return 12; return 6; diff --git a/gcc/config/sparc/sparc.h b/gcc/config/sparc/sparc.h index 581774e586b..d7c617e06c3 100644 --- a/gcc/config/sparc/sparc.h +++ b/gcc/config/sparc/sparc.h @@ -143,6 +143,7 @@ extern enum cmodel sparc_cmodel; #define TARGET_CPU_niagara3 15 #define TARGET_CPU_niagara4 16 #define TARGET_CPU_niagara7 19 +#define TARGET_CPU_m8 20 #if TARGET_CPU_DEFAULT == TARGET_CPU_v9 \ || TARGET_CPU_DEFAULT == TARGET_CPU_ultrasparc \ @@ -151,7 +152,8 @@ extern enum cmodel sparc_cmodel; || TARGET_CPU_DEFAULT == TARGET_CPU_niagara2 \ || TARGET_CPU_DEFAULT == TARGET_CPU_niagara3 \ || TARGET_CPU_DEFAULT == TARGET_CPU_niagara4 \ - || TARGET_CPU_DEFAULT == TARGET_CPU_niagara7 + || TARGET_CPU_DEFAULT == TARGET_CPU_niagara7 \ + || TARGET_CPU_DEFAULT == TARGET_CPU_m8 #define CPP_CPU32_DEFAULT_SPEC "" #define ASM_CPU32_DEFAULT_SPEC "" @@ -192,6 +194,10 @@ 
extern enum cmodel sparc_cmodel; #define CPP_CPU64_DEFAULT_SPEC "-D__sparc_v9__" #define ASM_CPU64_DEFAULT_SPEC AS_NIAGARA7_FLAG #endif +#if TARGET_CPU_DEFAULT == TARGET_CPU_m8 +#define CPP_CPU64_DEFAULT_SPEC "-D__sparc_v9__" +#define ASM_CPU64_DEFAULT_SPEC AS_M8_FLAG +#endif #else @@ -295,6 +301,7 @@ extern enum cmodel sparc_cmodel; %{mcpu=niagara3:-D__sparc_v9__} \ %{mcpu=niagara4:-D__sparc_v9__} \ %{mcpu=niagara7:-D__sparc_v9__} \ +%{mcpu=m8:-D__sparc_v9__} \ %{!mcpu*:%(cpp_cpu_default)} \ " #define CPP_ARCH32_SPEC "" @@ -347,6 +354,7 @@ extern enum cmodel sparc_cmodel; %{mcpu=niagara3:%{!mv8plus:-Av9" AS_NIAGARA3_FLAG "}} \ %{mcpu=niagara4:%{!mv8plus:" AS_NIAGARA4_FLAG "}} \ %{mcpu=niagara7:%{!mv8plus:" AS_NIAGARA7_FLAG "}} \ +%{mcpu=m8:%{!mv8plus:" AS_M8_FLAG "}} \ %{!mcpu*:%(asm_cpu_default)} \ " @@ -1039,6 +1047,10 @@ extern char leaf_reg_remap[]; /* Local macro to handle the two v9 classes of FP regs. */ #define FP_REG_CLASS_P(CLASS) ((CLASS) == FP_REGS || (CLASS) == EXTRA_FP_REGS) +/* Predicates for 2-bit and 5-bit unsigned constants. */ +#define SPARC_IMM2_P(X) (((unsigned HOST_WIDE_INT) (X) & ~0x3) == 0) +#define SPARC_IMM5_P(X) (((unsigned HOST_WIDE_INT) (X) & ~0x1F) == 0) + /* Predicates for 5-bit, 10-bit, 11-bit and 13-bit signed constants.
*/ #define SPARC_SIMM5_P(X) ((unsigned HOST_WIDE_INT) (X) + 0x10 < 0x20) #define SPARC_SIMM10_P(X) ((unsigned HOST_WIDE_INT) (X) + 0x200 < 0x400) @@ -1799,6 +1811,12 @@ extern int sparc_indent_opcode; #define AS_NIAGARA7_FLAG AS_NIAGARA4_FLAG #endif +#ifdef HAVE_AS_SPARC6 +#define AS_M8_FLAG "-xarch=sparc6" +#else +#define AS_M8_FLAG AS_NIAGARA7_FLAG +#endif + #ifdef HAVE_AS_LEON #define AS_LEON_FLAG "-Aleon" #define AS_LEONV7_FLAG "-Aleon" diff --git a/gcc/config/sparc/sparc.md b/gcc/config/sparc/sparc.md index 5c5096bca2a..cac1bd9343f 100644 --- a/gcc/config/sparc/sparc.md +++ b/gcc/config/sparc/sparc.md @@ -94,6 +94,12 @@ UNSPEC_ADDV UNSPEC_SUBV UNSPEC_NEGV + + UNSPEC_DICTUNPACK + UNSPEC_FPCMPSHL + UNSPEC_FPUCMPSHL + UNSPEC_FPCMPDESHL + UNSPEC_FPCMPURSHL ]) (define_c_enum "unspecv" [ @@ -238,7 +244,8 @@ niagara2, niagara3, niagara4, - niagara7" + niagara7, + m8" (const (symbol_ref "sparc_cpu_attr"))) ;; Attribute for the instruction set. @@ -251,7 +258,7 @@ (symbol_ref "TARGET_SPARCLET") (const_string "sparclet")] (const_string "v7")))) -(define_attr "cpu_feature" "none,fpu,fpunotv9,v9,vis,vis3,vis4" +(define_attr "cpu_feature" "none,fpu,fpunotv9,v9,vis,vis3,vis4,vis4b" (const_string "none")) (define_attr "lra" "disabled,enabled" @@ -265,10 +272,92 @@ (eq_attr "cpu_feature" "v9") (symbol_ref "TARGET_V9") (eq_attr "cpu_feature" "vis") (symbol_ref "TARGET_VIS") (eq_attr "cpu_feature" "vis3") (symbol_ref "TARGET_VIS3") - (eq_attr "cpu_feature" "vis4") (symbol_ref "TARGET_VIS4")] + (eq_attr "cpu_feature" "vis4") (symbol_ref "TARGET_VIS4") + (eq_attr "cpu_feature" "vis4b") (symbol_ref "TARGET_VIS4B")] (const_int 0))) -;; Insn type. +;; The SPARC instructions used by the backend are organized into a +;; hierarchy using the insn attributes "type" and "subtype". +;; +;; The mnemonics used in the list below are the architectural names +;; used in the Oracle SPARC Architecture specs. A / character +;; separates the type from the subtype where appropriate. 
For +;; brevity, text enclosed in {} denotes alternatives, while text +;; enclosed in [] is optional. +;; +;; Please keep this list updated. It is of great help for keeping the +;; correctness and coherence of the DFA schedulers. +;; +;; ialu: +;; ialuX: ADD[X]C SUB[X]C +;; shift: SLL[X] SRL[X] SRA[X] +;; cmove: MOV{A,N,NE,E,G,LE,GE,L,GU,LEU,CC,CS,POS,NEG,VC,VS} +;; MOVF{A,N,U,G,UG,L,UL,LG,NE,E,UE,GE,UGE,LE,ULE,O} +;; MOVR{Z,LEZ,LZ,NZ,GZ,GEZ} +;; compare: ADDcc ADDCcc ANDcc ORcc SUBcc SUBCcc XORcc XNORcc +;; imul: MULX SMUL[cc] UMUL UMULXHI XMULX XMULXHI +;; idiv: UDIVX SDIVX +;; flush: FLUSH +;; load/regular: LD{UB,UH,UW} LDFSR +;; load/prefetch: PREFETCH +;; fpload: LDF LDDF LDQF +;; sload: LD{SB,SH,SW} +;; store: ST{B,H,W,X} STFSR +;; fpstore: STF STDF STQF +;; cbcond: CWB{NE,E,G,LE,GE,L,GU,LEU,CC,CS,POS,NEG,VC,VS} +;; CXB{NE,E,G,LE,GE,L,GU,LEU,CC,CS,POS,NEG,VC,VS} +;; uncond_branch: BA BPA JMPL +;; branch: B{NE,E,G,LE,GE,L,GU,LEU,CC,CS,POS,NEG,VC,VS} +;; BP{NE,E,G,LE,GE,L,GU,LEU,CC,CS,POS,NEG,VC,VS} +;; FB{U,G,UG,L,UL,LG,NE,BE,UE,GE,UGE,LE,ULE,O} +;; call: CALL +;; return: RESTORE RETURN +;; fpmove: FABS{s,d,q} FMOV{s,d,q} FNEG{s,d,q} +;; fpcmove: FMOV{S,D,Q}{icc,xcc,fcc} +;; fpcrmove: FMOVR{s,d,q}{Z,LEZ,LZ,NZ,GZ,GEZ} +;; fp: FADD{s,d,q} FSUB{s,d,q} FHSUB{s,d} FNHADD{s,d} FNADD{s,d} +;; FiTO{s,d,q} FsTO{i,x,d,q} FdTO{i,x,s,q} FxTO{d,s,q} FqTO{i,x,s,d} +;; fpcmp: FCMP{s,d,q} FCMPE{s,d,q} +;; fpmul: FMADD{s,d} FMSUB{s,d} FMUL{s,d,q} FNMADD{s,d} +;; FNMSUB{s,d} FNMUL{s,d} FNsMULd FsMULd +;; FdMULq +;; array: ARRAY{8,16,32} +;; bmask: BMASK +;; edge: EDGE{8,16,32}[L]cc +;; edgen: EDGE{8,16,32}[L]n +;; fpdivs: FDIV{s,q} +;; fpsqrts: FSQRT{s,q} +;; fpdivd: FDIVd +;; fpsqrtd: FSQRTd +;; lzd: LZCNT +;; fga/addsub64: FP{ADD,SUB}64 +;; fga/fpu: FCHKSM16 FEXPANd FMEAN16 FPMERGE +;; FS{LL,RA,RL}{16,32} +;; fga/maxmin: FP{MAX,MIN}[U]{8,16,32} +;; fga/cmask: CMASK{8,16,32} +;; fga/other: BSHUFFLE FALIGNDATAg FP{ADD,SUB}[S]{8,16,32} +;; FP{ADD,SUB}US{8,16} DICTUNPACK +;; 
gsr/reg: RDGSR WRGSR +;; gsr/alignaddr: ALIGNADDRESS[_LITTLE] +;; vismv/double: FSRC2d +;; vismv/single: MOVwTOs FSRC2s +;; vismv/movstouw: MOVsTOuw +;; vismv/movxtod: MOVxTOd +;; vismv/movdtox: MOVdTOx +;; visl/single: F{AND,NAND,NOR,OR,NOT1}s +;; F{AND,OR}NOT{1,2}s +;; FONEs F{ZERO,XNOR,XOR}s FNOT2s +;; visl/double: FONEd FZEROd FNOT1d F{OR,AND,XOR}d F{NOR,NAND,XNOR}d +;; F{OR,AND}NOT1d F{OR,AND}NOT2d +;; viscmp: FPCMP{LE,GT,NE,EQ}{8,16,32} FPCMPU{LE,GT,NE,EQ}{8,16,32} +;; FPCMP{LE,GT,EQ,NE}{8,16,32}SHL FPCMPU{LE,GT,EQ,NE}{8,16,32}SHL +;; FPCMPDE{8,16,32}SHL FPCMPUR{8,16,32}SHL +;; fgm_pack: FPACKFIX FPACK{8,16,32} +;; fgm_mul: FMUL8SUx16 FMUL8ULx16 FMUL8x16 FMUL8x16AL +;; FMUL8x16AU FMULD8SUx16 FMULD8ULx16 +;; pdist: PDIST +;; pdistn: PDISTN + (define_attr "type" "ialu,compare,shift, load,sload,store, @@ -281,12 +370,20 @@ fpcmp, fpmul,fpdivs,fpdivd, fpsqrts,fpsqrtd, - fga,visl,vismv,fgm_pack,fgm_mul,pdist,pdistn,edge,edgen,gsr,array, + fga,visl,vismv,viscmp, + fgm_pack,fgm_mul,pdist,pdistn,edge,edgen,gsr,array,bmask, cmove, ialuX, multi,savew,flushw,iflush,trap,lzd" (const_string "ialu")) +(define_attr "subtype" + "single,double,movstouw,movxtod,movdtox, + addsub64,cmask,fpu,maxmin,other, + reg,alignaddr, + prefetch,regular" + (const_string "single")) + ;; True if branch/call has empty delay slot and will emit a nop in it (define_attr "empty_delay_slot" "false,true" (symbol_ref "(empty_delay_slot (insn) @@ -487,9 +584,6 @@ (const_string "true") ] (const_string "false"))) -;; True if the instruction executes in the V3 pipeline, in M7 and later processors. 
-(define_attr "v3pipe" "false,true" (const_string "false")) - (define_delay (eq_attr "type" "call") [(eq_attr "in_call_delay" "true") (nil) (nil)]) @@ -519,6 +613,7 @@ (include "niagara2.md") (include "niagara4.md") (include "niagara7.md") +(include "m8.md") ;; Operand and operator predicates and constraints @@ -1507,6 +1602,7 @@ ldub\t%1, %0 stb\t%r1, %0" [(set_attr "type" "*,load,store") + (set_attr "subtype" "*,regular,*") (set_attr "us3load_type" "*,3cycle,*")]) (define_expand "movhi" @@ -1529,6 +1625,7 @@ lduh\t%1, %0 sth\t%r1, %0" [(set_attr "type" "*,*,load,store") + (set_attr "subtype" "*,*,regular,*") (set_attr "us3load_type" "*,*,3cycle,*")]) ;; We always work with constants here. @@ -1566,8 +1663,8 @@ fzeros\t%0 fones\t%0" [(set_attr "type" "*,*,load,store,vismv,vismv,fpmove,fpload,fpstore,visl,visl") - (set_attr "cpu_feature" "*,*,*,*,vis3,vis3,*,*,*,vis,vis") - (set_attr "v3pipe" "*,*,*,*,true,true,*,*,*,true,true")]) + (set_attr "subtype" "*,*,regular,*,movstouw,single,*,*,*,single,single") + (set_attr "cpu_feature" "*,*,*,*,vis3,vis3,*,*,*,vis,vis")]) (define_insn "*movsi_lo_sum" [(set (match_operand:SI 0 "register_operand" "=r") @@ -1624,7 +1721,8 @@ return "ld\t[%1 + %2], %0"; #endif } - [(set_attr "type" "load")]) + [(set_attr "type" "load") + (set_attr "subtype" "regular")]) (define_expand "movsi_pic_label_ref" [(set (match_dup 3) (high:SI @@ -1733,11 +1831,12 @@ std\t%1, %0 fzero\t%0 fone\t%0" - [(set_attr "type" "store,*,load,store,load,store,*,*,fpload,fpstore,*,*,fpmove,*,*,*,fpload,fpstore,visl,visl") + [(set_attr "type" "store,*,load,store,load,store,*,*,fpload,fpstore,*,*,fpmove,*,*,*,fpload,fpstore,visl, +visl") + (set_attr "subtype" "*,*,regular,*,regular,*,*,*,*,*,*,*,*,*,*,*,*,*,double,double") (set_attr "length" "*,2,*,*,*,*,2,2,*,*,2,2,*,2,2,2,*,*,*,*") (set_attr "fptype" "*,*,*,*,*,*,*,*,*,*,*,*,double,*,*,*,*,*,double,double") (set_attr "cpu_feature" "v9,*,*,*,*,*,*,*,fpu,fpu,fpu,fpu,v9,fpunotv9,vis3,vis3,fpu,fpu,vis,vis") - 
(set_attr "v3pipe" "*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,true,true") (set_attr "lra" "*,*,disabled,disabled,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*")]) (define_insn "*movdi_insn_sp64" @@ -1759,9 +1858,9 @@ fzero\t%0 fone\t%0" [(set_attr "type" "*,*,load,store,vismv,vismv,fpmove,fpload,fpstore,visl,visl") + (set_attr "subtype" "*,*,regular,*,movdtox,movxtod,*,*,*,double,double") (set_attr "fptype" "*,*,*,*,*,*,double,*,*,double,double") - (set_attr "cpu_feature" "*,*,*,*,vis3,vis3,*,*,*,vis,vis") - (set_attr "v3pipe" "*,*,*,*,*,*,*,*,*,true,true")]) + (set_attr "cpu_feature" "*,*,*,*,vis3,vis3,*,*,*,vis,vis")]) (define_expand "movdi_pic_label_ref" [(set (match_dup 3) (high:DI @@ -1847,7 +1946,8 @@ return "ldx\t[%1 + %2], %0"; #endif } - [(set_attr "type" "load")]) + [(set_attr "type" "load") + (set_attr "subtype" "regular")]) (define_insn "*sethi_di_medlow_embmedany_pic" [(set (match_operand:DI 0 "register_operand" "=r") @@ -2289,8 +2389,8 @@ } } [(set_attr "type" "visl,visl,fpmove,*,*,*,vismv,vismv,fpload,load,fpstore,store") - (set_attr "cpu_feature" "vis,vis,fpu,*,*,*,vis3,vis3,fpu,*,fpu,*") - (set_attr "v3pipe" "true,true,*,*,*,*,true,true,*,*,*,*")]) + (set_attr "subtype" "single,single,*,*,*,*,movstouw,single,*,regular,*,*") + (set_attr "cpu_feature" "vis,vis,fpu,*,*,*,vis3,vis3,fpu,*,fpu,*")]) ;; The following 3 patterns build SFmode constants in integer registers. 
@@ -2362,10 +2462,10 @@ ldd\t%1, %0 std\t%1, %0" [(set_attr "type" "store,*,visl,visl,fpmove,*,*,*,fpload,fpstore,load,store,*,*,*,load,store") + (set_attr "subtype" "*,*,double,double,*,*,*,*,*,*,regular,*,*,*,*,regular,*") (set_attr "length" "*,2,*,*,*,2,2,2,*,*,*,*,2,2,2,*,*") (set_attr "fptype" "*,*,double,double,double,*,*,*,*,*,*,*,*,*,*,*,*") (set_attr "cpu_feature" "v9,*,vis,vis,v9,fpunotv9,vis3,vis3,fpu,fpu,*,*,fpu,fpu,*,*,*") - (set_attr "v3pipe" "*,*,true,true,*,*,*,*,*,*,*,*,*,*,*,*,*") (set_attr "lra" "*,*,*,*,*,*,*,*,*,*,disabled,disabled,*,*,*,*,*")]) (define_insn "*movdf_insn_sp64" @@ -2387,10 +2487,10 @@ stx\t%r1, %0 #" [(set_attr "type" "visl,visl,fpmove,vismv,vismv,load,store,*,load,store,*") + (set_attr "subtype" "double,double,*,movdtox,movxtod,regular,*,*,regular,*,*") (set_attr "length" "*,*,*,*,*,*,*,*,*,*,2") (set_attr "fptype" "double,double,double,double,double,*,*,*,*,*,*") - (set_attr "cpu_feature" "vis,vis,fpu,vis3,vis3,fpu,fpu,*,*,*,*") - (set_attr "v3pipe" "true,true,*,*,*,*,*,*,*,*,*")]) + (set_attr "cpu_feature" "vis,vis,fpu,vis3,vis3,fpu,fpu,*,*,*,*")]) ;; This pattern builds DFmode constants in integer registers. 
(define_split @@ -2916,6 +3016,7 @@ "" "lduh\t%1, %0" [(set_attr "type" "load") + (set_attr "subtype" "regular") (set_attr "us3load_type" "3cycle")]) (define_expand "zero_extendqihi2" @@ -2932,6 +3033,7 @@ and\t%1, 0xff, %0 ldub\t%1, %0" [(set_attr "type" "*,load") + (set_attr "subtype" "*,regular") (set_attr "us3load_type" "*,3cycle")]) (define_expand "zero_extendqisi2" @@ -2948,6 +3050,7 @@ and\t%1, 0xff, %0 ldub\t%1, %0" [(set_attr "type" "*,load") + (set_attr "subtype" "*,regular") (set_attr "us3load_type" "*,3cycle")]) (define_expand "zero_extendqidi2" @@ -2964,6 +3067,7 @@ and\t%1, 0xff, %0 ldub\t%1, %0" [(set_attr "type" "*,load") + (set_attr "subtype" "*,regular") (set_attr "us3load_type" "*,3cycle")]) (define_expand "zero_extendhidi2" @@ -2995,6 +3099,7 @@ "TARGET_ARCH64" "lduh\t%1, %0" [(set_attr "type" "load") + (set_attr "subtype" "regular") (set_attr "us3load_type" "3cycle")]) ;; ??? Write truncdisi pattern using sra? @@ -3015,8 +3120,8 @@ lduw\t%1, %0 movstouw\t%1, %0" [(set_attr "type" "shift,load,vismv") - (set_attr "cpu_feature" "*,*,vis3") - (set_attr "v3pipe" "*,*,true")]) + (set_attr "subtype" "*,regular,movstouw") + (set_attr "cpu_feature" "*,*,vis3")]) (define_insn_and_split "*zero_extendsidi2_insn_sp32" [(set (match_operand:DI 0 "register_operand" "=r") @@ -3331,8 +3436,7 @@ movstosw\t%1, %0" [(set_attr "type" "shift,sload,vismv") (set_attr "us3load_type" "*,3cycle,*") - (set_attr "cpu_feature" "*,*,vis3") - (set_attr "v3pipe" "*,*,true")]) + (set_attr "cpu_feature" "*,*,vis3")]) ;; Special pattern for optimizing bit-field compares. 
This is needed @@ -7356,7 +7460,8 @@ [(unspec_volatile [(match_operand:SI 0 "memory_operand" "m")] UNSPECV_LDFSR)] "TARGET_FPU" "ld\t%0, %%fsr" - [(set_attr "type" "load")]) + [(set_attr "type" "load") + (set_attr "subtype" "regular")]) (define_insn "stfsr" [(set (match_operand:SI 0 "memory_operand" "=m") @@ -7720,7 +7825,8 @@ gcc_assert (locality >= 0 && locality < 4); return prefetch_instr [read_or_write][locality == 0 ? 0 : 1]; } - [(set_attr "type" "load")]) + [(set_attr "type" "load") + (set_attr "subtype" "prefetch")]) (define_insn "prefetch_32" [(prefetch (match_operand:SI 0 "address_operand" "p") @@ -7745,7 +7851,8 @@ gcc_assert (locality >= 0 && locality < 4); return prefetch_instr [read_or_write][locality == 0 ? 0 : 1]; } - [(set_attr "type" "load")]) + [(set_attr "type" "load") + (set_attr "subtype" "prefetch")]) ;; Trap instructions. @@ -7966,7 +8073,8 @@ UNSPEC_TLSIE))] "TARGET_TLS && TARGET_ARCH32" "ld\\t[%1 + %2], %0, %%tie_ld(%a3)" - [(set_attr "type" "load")]) + [(set_attr "type" "load") + (set_attr "subtype" "regular")]) (define_insn "tie_ld64" [(set (match_operand:DI 0 "register_operand" "=r") @@ -7976,7 +8084,8 @@ UNSPEC_TLSIE))] "TARGET_TLS && TARGET_ARCH64" "ldx\\t[%1 + %2], %0, %%tie_ldx(%a3)" - [(set_attr "type" "load")]) + [(set_attr "type" "load") + (set_attr "subtype" "regular")]) (define_insn "tie_add32" [(set (match_operand:SI 0 "register_operand" "=r") @@ -8036,6 +8145,7 @@ "TARGET_TLS && TARGET_ARCH32" "ldub\t[%1 + %2], %0, %%tldo_add(%3)" [(set_attr "type" "load") + (set_attr "subtype" "regular") (set_attr "us3load_type" "3cycle")]) (define_insn "*tldo_ldub1_sp32" @@ -8048,6 +8158,7 @@ "TARGET_TLS && TARGET_ARCH32" "ldub\t[%1 + %2], %0, %%tldo_add(%3)" [(set_attr "type" "load") + (set_attr "subtype" "regular") (set_attr "us3load_type" "3cycle")]) (define_insn "*tldo_ldub2_sp32" @@ -8060,6 +8171,7 @@ "TARGET_TLS && TARGET_ARCH32" "ldub\t[%1 + %2], %0, %%tldo_add(%3)" [(set_attr "type" "load") + (set_attr "subtype" "regular") (set_attr 
"us3load_type" "3cycle")]) (define_insn "*tldo_ldsb1_sp32" @@ -8095,6 +8207,7 @@ "TARGET_TLS && TARGET_ARCH64" "ldub\t[%1 + %2], %0, %%tldo_add(%3)" [(set_attr "type" "load") + (set_attr "subtype" "regular") (set_attr "us3load_type" "3cycle")]) (define_insn "*tldo_ldub1_sp64" @@ -8107,6 +8220,7 @@ "TARGET_TLS && TARGET_ARCH64" "ldub\t[%1 + %2], %0, %%tldo_add(%3)" [(set_attr "type" "load") + (set_attr "subtype" "regular") (set_attr "us3load_type" "3cycle")]) (define_insn "*tldo_ldub2_sp64" @@ -8119,6 +8233,7 @@ "TARGET_TLS && TARGET_ARCH64" "ldub\t[%1 + %2], %0, %%tldo_add(%3)" [(set_attr "type" "load") + (set_attr "subtype" "regular") (set_attr "us3load_type" "3cycle")]) (define_insn "*tldo_ldub3_sp64" @@ -8131,6 +8246,7 @@ "TARGET_TLS && TARGET_ARCH64" "ldub\t[%1 + %2], %0, %%tldo_add(%3)" [(set_attr "type" "load") + (set_attr "subtype" "regular") (set_attr "us3load_type" "3cycle")]) (define_insn "*tldo_ldsb1_sp64" @@ -8178,6 +8294,7 @@ "TARGET_TLS && TARGET_ARCH32" "lduh\t[%1 + %2], %0, %%tldo_add(%3)" [(set_attr "type" "load") + (set_attr "subtype" "regular") (set_attr "us3load_type" "3cycle")]) (define_insn "*tldo_lduh1_sp32" @@ -8190,6 +8307,7 @@ "TARGET_TLS && TARGET_ARCH32" "lduh\t[%1 + %2], %0, %%tldo_add(%3)" [(set_attr "type" "load") + (set_attr "subtype" "regular") (set_attr "us3load_type" "3cycle")]) (define_insn "*tldo_ldsh1_sp32" @@ -8213,6 +8331,7 @@ "TARGET_TLS && TARGET_ARCH64" "lduh\t[%1 + %2], %0, %%tldo_add(%3)" [(set_attr "type" "load") + (set_attr "subtype" "regular") (set_attr "us3load_type" "3cycle")]) (define_insn "*tldo_lduh1_sp64" @@ -8225,6 +8344,7 @@ "TARGET_TLS && TARGET_ARCH64" "lduh\t[%1 + %2], %0, %%tldo_add(%3)" [(set_attr "type" "load") + (set_attr "subtype" "regular") (set_attr "us3load_type" "3cycle")]) (define_insn "*tldo_lduh2_sp64" @@ -8237,6 +8357,7 @@ "TARGET_TLS && TARGET_ARCH64" "lduh\t[%1 + %2], %0, %%tldo_add(%3)" [(set_attr "type" "load") + (set_attr "subtype" "regular") (set_attr "us3load_type" "3cycle")]) 
(define_insn "*tldo_ldsh1_sp64" @@ -8271,7 +8392,8 @@ (match_operand:SI 1 "register_operand" "r"))))] "TARGET_TLS && TARGET_ARCH32" "ld\t[%1 + %2], %0, %%tldo_add(%3)" - [(set_attr "type" "load")]) + [(set_attr "type" "load") + (set_attr "subtype" "regular")]) (define_insn "*tldo_lduw_sp64" [(set (match_operand:SI 0 "register_operand" "=r") @@ -8281,7 +8403,8 @@ (match_operand:DI 1 "register_operand" "r"))))] "TARGET_TLS && TARGET_ARCH64" "lduw\t[%1 + %2], %0, %%tldo_add(%3)" - [(set_attr "type" "load")]) + [(set_attr "type" "load") + (set_attr "subtype" "regular")]) (define_insn "*tldo_lduw1_sp64" [(set (match_operand:DI 0 "register_operand" "=r") @@ -8292,7 +8415,8 @@ (match_operand:DI 1 "register_operand" "r")))))] "TARGET_TLS && TARGET_ARCH64" "lduw\t[%1 + %2], %0, %%tldo_add(%3)" - [(set_attr "type" "load")]) + [(set_attr "type" "load") + (set_attr "subtype" "regular")]) (define_insn "*tldo_ldsw1_sp64" [(set (match_operand:DI 0 "register_operand" "=r") @@ -8314,7 +8438,8 @@ (match_operand:DI 1 "register_operand" "r"))))] "TARGET_TLS && TARGET_ARCH64" "ldx\t[%1 + %2], %0, %%tldo_add(%3)" - [(set_attr "type" "load")]) + [(set_attr "type" "load") + (set_attr "subtype" "regular")]) (define_insn "*tldo_stb_sp32" [(set (mem:QI (plus:SI (unspec:SI [(match_operand:SI 2 "register_operand" "r") @@ -8519,8 +8644,8 @@ movstouw\t%1, %0 movwtos\t%1, %0" [(set_attr "type" "visl,visl,vismv,fpload,fpstore,store,load,store,*,vismv,vismv") - (set_attr "cpu_feature" "vis,vis,vis,*,*,*,*,*,*,vis3,vis3") - (set_attr "v3pipe" "true,true,true,*,*,*,*,*,*,true,true")]) + (set_attr "subtype" "single,single,single,*,*,*,regular,*,*,movstouw,single") + (set_attr "cpu_feature" "vis,vis,vis,*,*,*,*,*,*,vis3,vis3")]) (define_insn "*mov_insn_sp64" [(set (match_operand:VM64 0 "nonimmediate_operand" "=e,e,e,e,W,m,*r, m,*r, e,*r") @@ -8542,8 +8667,8 @@ movxtod\t%1, %0 mov\t%1, %0" [(set_attr "type" "visl,visl,vismv,fpload,fpstore,store,load,store,vismv,vismv,*") - (set_attr "cpu_feature" 
"vis,vis,vis,*,*,*,*,*,vis3,vis3,*") - (set_attr "v3pipe" "true,true,true,*,*,*,*,*,*,*,*")]) + (set_attr "subtype" "double,double,double,*,*,*,regular,*,movdtox,movxtod,*") + (set_attr "cpu_feature" "vis,vis,vis,*,*,*,*,*,vis3,vis3,*")]) (define_insn "*mov_insn_sp32" [(set (match_operand:VM64 0 "nonimmediate_operand" @@ -8572,9 +8697,9 @@ ldd\t%1, %0 std\t%1, %0" [(set_attr "type" "store,*,visl,visl,vismv,*,*,fpload,fpstore,load,store,*,*,*,load,store") + (set_attr "subtype" "*,*,double,double,double,*,*,*,*,regular,*,*,*,*,regular,*") (set_attr "length" "*,2,*,*,*,2,2,*,*,*,*,2,2,2,*,*") (set_attr "cpu_feature" "*,*,vis,vis,vis,vis3,vis3,*,*,*,*,*,*,*,*,*") - (set_attr "v3pipe" "*,*,true,true,true,*,*,*,*,*,*,*,*,*,*,*") (set_attr "lra" "*,*,*,*,*,*,*,*,*,disabled,disabled,*,*,*,*,*")]) (define_split @@ -8652,8 +8777,8 @@ "TARGET_VIS" "fp\t%1, %2, %0" [(set_attr "type" "fga") - (set_attr "fptype" "") - (set_attr "v3pipe" "true")]) + (set_attr "subtype" "other") + (set_attr "fptype" "")]) (define_mode_iterator VL [V1SI V2HI V4QI V1DI V2SI V4HI V8QI]) (define_mode_attr vlsuf [(V1SI "s") (V2HI "s") (V4QI "s") @@ -8669,8 +8794,7 @@ "TARGET_VIS" "f\t%1, %2, %0" [(set_attr "type" "visl") - (set_attr "fptype" "") - (set_attr "v3pipe" "true")]) + (set_attr "fptype" "")]) (define_insn "*not_3" [(set (match_operand:VL 0 "register_operand" "=") @@ -8679,8 +8803,7 @@ "TARGET_VIS" "f\t%1, %2, %0" [(set_attr "type" "visl") - (set_attr "fptype" "") - (set_attr "v3pipe" "true")]) + (set_attr "fptype" "")]) ;; (ior (not (op1)) (not (op2))) is the canonical form of NAND. 
(define_insn "*nand_vis" @@ -8690,8 +8813,7 @@ "TARGET_VIS" "fnand\t%1, %2, %0" [(set_attr "type" "visl") - (set_attr "fptype" "") - (set_attr "v3pipe" "true")]) + (set_attr "fptype" "")]) (define_code_iterator vlnotop [ior and]) @@ -8702,8 +8824,7 @@ "TARGET_VIS" "fnot1\t%1, %2, %0" [(set_attr "type" "visl") - (set_attr "fptype" "") - (set_attr "v3pipe" "true")]) + (set_attr "fptype" "")]) (define_insn "*_not2_vis" [(set (match_operand:VL 0 "register_operand" "=") @@ -8712,8 +8833,7 @@ "TARGET_VIS" "fnot2\t%1, %2, %0" [(set_attr "type" "visl") - (set_attr "fptype" "") - (set_attr "v3pipe" "true")]) + (set_attr "fptype" "")]) (define_insn "one_cmpl2" [(set (match_operand:VL 0 "register_operand" "=") @@ -8721,8 +8841,7 @@ "TARGET_VIS" "fnot1\t%1, %0" [(set_attr "type" "visl") - (set_attr "fptype" "") - (set_attr "v3pipe" "true")]) + (set_attr "fptype" "")]) ;; Hard to generate VIS instructions. We have builtins for these. @@ -8764,6 +8883,7 @@ "TARGET_VIS" "fexpand\t%1, %0" [(set_attr "type" "fga") + (set_attr "subtype" "fpu") (set_attr "fptype" "double")]) (define_insn "fpmerge_vis" @@ -8778,6 +8898,7 @@ "TARGET_VIS" "fpmerge\t%1, %2, %0" [(set_attr "type" "fga") + (set_attr "subtype" "fpu") (set_attr "fptype" "double")]) ;; Partitioned multiply instructions @@ -8866,7 +8987,8 @@ [(set (reg:DI GSR_REG) (match_operand:DI 0 "arith_operand" "rI"))] "TARGET_VIS && TARGET_ARCH64" "wr\t%%g0, %0, %%gsr" - [(set_attr "type" "gsr")]) + [(set_attr "type" "gsr") + (set_attr "subtype" "reg")]) (define_insn "wrgsr_v8plus" [(set (reg:DI GSR_REG) (match_operand:DI 0 "arith_operand" "I,r")) @@ -8897,7 +9019,8 @@ [(set (match_operand:DI 0 "register_operand" "=r") (reg:DI GSR_REG))] "TARGET_VIS && TARGET_ARCH64" "rd\t%%gsr, %0" - [(set_attr "type" "gsr")]) + [(set_attr "type" "gsr") + (set_attr "subtype" "reg")]) (define_insn "rdgsr_v8plus" [(set (match_operand:DI 0 "register_operand" "=r") (reg:DI GSR_REG)) @@ -8920,8 +9043,8 @@ "TARGET_VIS" "faligndata\t%1, %2, %0" [(set_attr 
"type" "fga") - (set_attr "fptype" "double") - (set_attr "v3pipe" "true")]) + (set_attr "subtype" "other") + (set_attr "fptype" "double")]) (define_insn "alignaddrsi_vis" [(set (match_operand:SI 0 "register_operand" "=r") @@ -8932,7 +9055,7 @@ "TARGET_VIS" "alignaddr\t%r1, %r2, %0" [(set_attr "type" "gsr") - (set_attr "v3pipe" "true")]) + (set_attr "subtype" "alignaddr")]) (define_insn "alignaddrdi_vis" [(set (match_operand:DI 0 "register_operand" "=r") @@ -8943,7 +9066,7 @@ "TARGET_VIS" "alignaddr\t%r1, %r2, %0" [(set_attr "type" "gsr") - (set_attr "v3pipe" "true")]) + (set_attr "subtype" "alignaddr")]) (define_insn "alignaddrlsi_vis" [(set (match_operand:SI 0 "register_operand" "=r") @@ -8955,7 +9078,7 @@ "TARGET_VIS" "alignaddrl\t%r1, %r2, %0" [(set_attr "type" "gsr") - (set_attr "v3pipe" "true")]) + (set_attr "subtype" "alignaddr")]) (define_insn "alignaddrldi_vis" [(set (match_operand:DI 0 "register_operand" "=r") @@ -8967,7 +9090,7 @@ "TARGET_VIS" "alignaddrl\t%r1, %r2, %0" [(set_attr "type" "gsr") - (set_attr "v3pipe" "true")]) + (set_attr "subtype" "alignaddr")]) (define_insn "pdist_vis" [(set (match_operand:DI 0 "register_operand" "=e") @@ -9059,9 +9182,7 @@ UNSPEC_FCMP))] "TARGET_VIS" "fcmp\t%1, %2, %0" - [(set_attr "type" "visl") - (set_attr "fptype" "double") - (set_attr "v3pipe" "true")]) + [(set_attr "type" "viscmp")]) (define_insn "fpcmp8_vis" [(set (match_operand:P 0 "register_operand" "=r") @@ -9070,8 +9191,7 @@ UNSPEC_FCMP))] "TARGET_VIS4" "fpcmp8\t%1, %2, %0" - [(set_attr "type" "visl") - (set_attr "fptype" "double")]) + [(set_attr "type" "viscmp")]) (define_expand "vcond" [(match_operand:GCM 0 "register_operand" "") @@ -9134,8 +9254,7 @@ (plus:DI (match_dup 1) (match_dup 2)))] "TARGET_VIS2 && TARGET_ARCH64" "bmask\t%r1, %r2, %0" - [(set_attr "type" "array") - (set_attr "v3pipe" "true")]) + [(set_attr "type" "bmask")]) (define_insn "bmasksi_vis" [(set (match_operand:SI 0 "register_operand" "=r") @@ -9145,8 +9264,7 @@ (zero_extend:DI (plus:SI 
(match_dup 1) (match_dup 2))))] "TARGET_VIS2" "bmask\t%r1, %r2, %0" - [(set_attr "type" "array") - (set_attr "v3pipe" "true")]) + [(set_attr "type" "bmask")]) (define_insn "bshuffle_vis" [(set (match_operand:VM64 0 "register_operand" "=e") @@ -9157,8 +9275,8 @@ "TARGET_VIS2" "bshuffle\t%1, %2, %0" [(set_attr "type" "fga") - (set_attr "fptype" "double") - (set_attr "v3pipe" "true")]) + (set_attr "subtype" "other") + (set_attr "fptype" "double")]) ;; The rtl expanders will happily convert constant permutations on other ;; modes down to V8QI. Rely on this to avoid the complexity of the byte @@ -9261,7 +9379,7 @@ "TARGET_VIS3" "cmask8\t%r0" [(set_attr "type" "fga") - (set_attr "v3pipe" "true")]) + (set_attr "subtype" "cmask")]) (define_insn "cmask16_vis" [(set (reg:DI GSR_REG) @@ -9271,7 +9389,7 @@ "TARGET_VIS3" "cmask16\t%r0" [(set_attr "type" "fga") - (set_attr "v3pipe" "true")]) + (set_attr "subtype" "cmask")]) (define_insn "cmask32_vis" [(set (reg:DI GSR_REG) @@ -9281,7 +9399,7 @@ "TARGET_VIS3" "cmask32\t%r0" [(set_attr "type" "fga") - (set_attr "v3pipe" "true")]) + (set_attr "subtype" "cmask")]) (define_insn "fchksm16_vis" [(set (match_operand:V4HI 0 "register_operand" "=e") @@ -9290,7 +9408,8 @@ UNSPEC_FCHKSM16))] "TARGET_VIS3" "fchksm16\t%1, %2, %0" - [(set_attr "type" "fga")]) + [(set_attr "type" "fga") + (set_attr "subtype" "fpu")]) (define_code_iterator vis3_shift [ashift ss_ashift lshiftrt ashiftrt]) (define_code_attr vis3_shift_insn @@ -9304,7 +9423,8 @@ (match_operand:GCM 2 "register_operand" "")))] "TARGET_VIS3" "\t%1, %2, %0" - [(set_attr "type" "fga")]) + [(set_attr "type" "fga") + (set_attr "subtype" "fpu")]) (define_insn "pdistn_vis" [(set (match_operand:P 0 "register_operand" "=r") @@ -9314,8 +9434,7 @@ "TARGET_VIS3" "pdistn\t%1, %2, %0" [(set_attr "type" "pdistn") - (set_attr "fptype" "double") - (set_attr "v3pipe" "true")]) + (set_attr "fptype" "double")]) (define_insn "fmean16_vis" [(set (match_operand:V4HI 0 "register_operand" "=e") @@ -9332,7 
+9451,8 @@ (const_int 1))))] "TARGET_VIS3" "fmean16\t%1, %2, %0" - [(set_attr "type" "fga")]) + [(set_attr "type" "fga") + (set_attr "subtype" "fpu")]) (define_insn "fp64_vis" [(set (match_operand:V1DI 0 "register_operand" "=e") @@ -9340,7 +9460,8 @@ (match_operand:V1DI 2 "register_operand" "e")))] "TARGET_VIS3" "fp64\t%1, %2, %0" - [(set_attr "type" "fga")]) + [(set_attr "type" "fga") + (set_attr "subtype" "addsub64")]) (define_insn "v8qi3" [(set (match_operand:V8QI 0 "register_operand" "=e") @@ -9348,7 +9469,8 @@ (match_operand:V8QI 2 "register_operand" "e")))] "TARGET_VIS4" "fp8\t%1, %2, %0" - [(set_attr "type" "fga")]) + [(set_attr "type" "fga") + (set_attr "subtype" "other")]) (define_mode_iterator VASS [V4HI V2SI V2HI V1SI]) (define_code_iterator vis3_addsub_ss [ss_plus ss_minus]) @@ -9364,7 +9486,7 @@ "TARGET_VIS3" "\t%1, %2, %0" [(set_attr "type" "fga") - (set_attr "v3pipe" "true")]) + (set_attr "subtype" "other")]) (define_mode_iterator VMMAX [V8QI V4HI V2SI]) (define_code_iterator vis4_minmax [smin smax]) @@ -9379,7 +9501,8 @@ (match_operand:VMMAX 2 "register_operand" "")))] "TARGET_VIS4" "\t%1, %2, %0" - [(set_attr "type" "fga")]) + [(set_attr "type" "fga") + (set_attr "subtype" "maxmin")]) (define_code_iterator vis4_uminmax [umin umax]) (define_code_attr vis4_uminmax_insn @@ -9393,7 +9516,8 @@ (match_operand:VMMAX 2 "register_operand" "")))] "TARGET_VIS4" "\t%1, %2, %0" - [(set_attr "type" "fga")]) + [(set_attr "type" "fga") + (set_attr "subtype" "maxmin")]) ;; The use of vis3_addsub_ss_patname in the VIS4 instruction below is ;; intended. 
@@ -9403,7 +9527,8 @@ (match_operand:V8QI 2 "register_operand" "e")))] "TARGET_VIS4" "8\t%1, %2, %0" - [(set_attr "type" "fga")]) + [(set_attr "type" "fga") + (set_attr "subtype" "other")]) (define_mode_iterator VAUS [V4HI V8QI]) (define_code_iterator vis4_addsub_us [us_plus us_minus]) @@ -9418,7 +9543,8 @@ (match_operand:VAUS 2 "register_operand" "")))] "TARGET_VIS4" "\t%1, %2, %0" - [(set_attr "type" "fga")]) + [(set_attr "type" "fga") + (set_attr "subtype" "other")]) (define_insn "fucmp8_vis" [(set (match_operand:P 0 "register_operand" "=r") @@ -9427,8 +9553,7 @@ UNSPEC_FUCMP))] "TARGET_VIS3" "fucmp8\t%1, %2, %0" - [(set_attr "type" "visl") - (set_attr "v3pipe" "true")]) + [(set_attr "type" "viscmp")]) (define_insn "fpcmpu_vis" [(set (match_operand:P 0 "register_operand" "=r") @@ -9437,8 +9562,7 @@ UNSPEC_FUCMP))] "TARGET_VIS4" "fpcmpu\t%1, %2, %0" - [(set_attr "type" "visl") - (set_attr "fptype" "double")]) + [(set_attr "type" "viscmp")]) (define_insn "*naddsf3" [(set (match_operand:SF 0 "register_operand" "=f") @@ -9542,4 +9666,62 @@ [(set_attr "type" "fp") (set_attr "fptype" "double")]) +;; VIS4B instructions. 
+ +(define_mode_iterator DUMODE [V2SI V4HI V8QI]) + +(define_insn "dictunpack<DUMODE:vbits>" + [(set (match_operand:DUMODE 0 "register_operand" "=e") + (unspec:DUMODE [(match_operand:DF 1 "register_operand" "e") + (match_operand:SI 2 "imm5_operand_dictunpack<DUMODE:vbits>" "t")] + UNSPEC_DICTUNPACK))] + "TARGET_VIS4B" + "dictunpack\t%1, %2, %0" + [(set_attr "type" "fga") + (set_attr "subtype" "other")]) + +(define_mode_iterator FPCSMODE [V2SI V4HI V8QI]) +(define_code_iterator fpcscond [le gt eq ne]) +(define_code_iterator fpcsucond [le gt]) + +(define_insn "fpcmp<fpcscond:code><FPCSMODE:vbits><P:mode>shl" + [(set (match_operand:P 0 "register_operand" "=r") + (unspec:P [(fpcscond:FPCSMODE (match_operand:FPCSMODE 1 "register_operand" "e") + (match_operand:FPCSMODE 2 "register_operand" "e")) + (match_operand:SI 3 "imm2_operand" "q")] + UNSPEC_FPCMPSHL))] + "TARGET_VIS4B" + "fpcmp<fpcscond:code><FPCSMODE:vbits>shl\t%1, %2, %3, %0" + [(set_attr "type" "viscmp")]) + +(define_insn "fpcmpu<fpcsucond:code><FPCSMODE:vbits><P:mode>shl" + [(set (match_operand:P 0 "register_operand" "=r") + (unspec:P [(fpcsucond:FPCSMODE (match_operand:FPCSMODE 1 "register_operand" "e") + (match_operand:FPCSMODE 2 "register_operand" "e")) + (match_operand:SI 3 "imm2_operand" "q")] + UNSPEC_FPUCMPSHL))] + "TARGET_VIS4B" + "fpcmpu<fpcsucond:code><FPCSMODE:vbits>shl\t%1, %2, %3, %0" + [(set_attr "type" "viscmp")]) + +(define_insn "fpcmpde<FPCSMODE:vbits><P:mode>shl" + [(set (match_operand:P 0 "register_operand" "=r") + (unspec:P [(match_operand:FPCSMODE 1 "register_operand" "e") + (match_operand:FPCSMODE 2 "register_operand" "e") + (match_operand:SI 3 "imm2_operand" "q")] + UNSPEC_FPCMPDESHL))] + "TARGET_VIS4B" + "fpcmpde<FPCSMODE:vbits>shl\t%1, %2, %3, %0" + [(set_attr "type" "viscmp")]) + +(define_insn "fpcmpur<FPCSMODE:vbits><P:mode>shl" + [(set (match_operand:P 0 "register_operand" "=r") + (unspec:P [(match_operand:FPCSMODE 1 "register_operand" "e") + (match_operand:FPCSMODE 2 "register_operand" "e") + (match_operand:SI 3 "imm2_operand" "q")] + UNSPEC_FPCMPURSHL))] + "TARGET_VIS4B" + "fpcmpur<FPCSMODE:vbits>shl\t%1, %2, %3, %0" + [(set_attr "type" "viscmp")]) + (include "sync.md") diff --git a/gcc/config/sparc/sparc.opt 
b/gcc/config/sparc/sparc.opt index 86f85d9058f..cc51bd4b584 100644 --- a/gcc/config/sparc/sparc.opt +++ b/gcc/config/sparc/sparc.opt @@ -81,6 +81,10 @@ mvis4 Target Report Mask(VIS4) Use UltraSPARC Visual Instruction Set version 4.0 extensions. +mvis4b +Target Report Mask(VIS4B) +Use additional VIS instructions introduced in OSA2017. + mcbcond Target Report Mask(CBCOND) Use UltraSPARC Compare-and-Branch extensions. @@ -209,6 +213,9 @@ Enum(sparc_processor_type) String(niagara4) Value(PROCESSOR_NIAGARA4) EnumValue Enum(sparc_processor_type) String(niagara7) Value(PROCESSOR_NIAGARA7) +EnumValue +Enum(sparc_processor_type) String(m8) Value(PROCESSOR_M8) + mcmodel= Target RejectNegative Joined Var(sparc_cmodel_string) Use given SPARC-V9 code model. diff --git a/gcc/config/sparc/ultra1_2.md b/gcc/config/sparc/ultra1_2.md index 6af285931e4..a4fb88345d6 100644 --- a/gcc/config/sparc/ultra1_2.md +++ b/gcc/config/sparc/ultra1_2.md @@ -263,10 +263,10 @@ (define_insn_reservation "us1_fga_double" 2 - (and (and - (eq_attr "cpu" "ultrasparc") - (eq_attr "type" "fga,visl,vismv")) - (eq_attr "fptype" "double")) + (and (eq_attr "cpu" "ultrasparc") + (ior (and (eq_attr "type" "fga,visl,vismv") + (eq_attr "fptype" "double")) + (eq_attr "type" "viscmp"))) "us1_fpa + us1_fp_double + us1_slotany, nothing") (define_bypass 1 "us1_fga_double" "us1_fga_double") diff --git a/gcc/config/sparc/ultra3.md b/gcc/config/sparc/ultra3.md index 6296b38cbbd..db20cd9c982 100644 --- a/gcc/config/sparc/ultra3.md +++ b/gcc/config/sparc/ultra3.md @@ -56,7 +56,7 @@ (define_insn_reservation "us3_array" 2 (and (eq_attr "cpu" "ultrasparc3") - (eq_attr "type" "array,edgen")) + (eq_attr "type" "array,edgen,bmask")) "us3_ms + us3_slotany, nothing") ;; ??? Not entirely accurate. 
@@ -176,7 +176,7 @@ (define_insn_reservation "us3_fga" 3 (and (eq_attr "cpu" "ultrasparc3") - (eq_attr "type" "fga,visl,vismv")) + (eq_attr "type" "fga,visl,viscmp,vismv")) "us3_fpa + us3_slotany, nothing*2") (define_insn_reservation "us3_fgm" diff --git a/gcc/configure b/gcc/configure index 4c5900fc1ba..893f9587efa 100755 --- a/gcc/configure +++ b/gcc/configure @@ -25282,6 +25282,41 @@ $as_echo "#define HAVE_AS_SPARC5_VIS4 1" >>confdefs.h fi + { $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for SPARC6 instructions" >&5 +$as_echo_n "checking assembler for SPARC6 instructions... " >&6; } +if test "${gcc_cv_as_sparc_sparc6+set}" = set; then : + $as_echo_n "(cached) " >&6 +else + gcc_cv_as_sparc_sparc6=no + if test x$gcc_cv_as != x; then + $as_echo '.text + .register %g2, #scratch + .register %g3, #scratch + .align 4 + rd %entropy, %g1 + fpsll64x %f0, %f2, %f4' > conftest.s + if { ac_try='$gcc_cv_as $gcc_cv_as_flags -xarch=sparc6 -o conftest.o conftest.s >&5' + { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5 + (eval $ac_try) 2>&5 + ac_status=$? + $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 + test $ac_status = 0; }; } + then + gcc_cv_as_sparc_sparc6=yes + else + echo "configure: failed program was" >&5 + cat conftest.s >&5 + fi + rm -f conftest.o conftest.s + fi +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_as_sparc_sparc6" >&5 +$as_echo "$gcc_cv_as_sparc_sparc6" >&6; } +if test $gcc_cv_as_sparc_sparc6 = yes; then + +$as_echo "#define HAVE_AS_SPARC6 1" >>confdefs.h + +fi { $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for LEON instructions" >&5 $as_echo_n "checking assembler for LEON instructions... 
" >&6; } diff --git a/gcc/configure.ac b/gcc/configure.ac index f50223a70ba..c6a9929a093 100644 --- a/gcc/configure.ac +++ b/gcc/configure.ac @@ -4003,6 +4003,18 @@ foo: [AC_DEFINE(HAVE_AS_SPARC5_VIS4, 1, [Define if your assembler supports SPARC5 and VIS 4.0 instructions.])]) + gcc_GAS_CHECK_FEATURE([SPARC6 instructions], + gcc_cv_as_sparc_sparc6,, + [-xarch=sparc6], + [.text + .register %g2, #scratch + .register %g3, #scratch + .align 4 + rd %entropy, %g1 + fpsll64x %f0, %f2, %f4],, + [AC_DEFINE(HAVE_AS_SPARC6, 1, + [Define if your assembler supports SPARC6 instructions.])]) + gcc_GAS_CHECK_FEATURE([LEON instructions], gcc_cv_as_sparc_leon,, [-Aleon], diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index 5cb512fe575..3bef461c8f3 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -19253,6 +19253,45 @@ v4hi __builtin_vis_fpminu16 (v4hi, v4hi); v2si __builtin_vis_fpminu32 (v2si, v2si); @end smallexample +When you use the @option{-mvis4b} switch, the VIS version 4.0B +built-in functions also become available: + +@smallexample +v8qi __builtin_vis_dictunpack8 (double, int); +v4hi __builtin_vis_dictunpack16 (double, int); +v2si __builtin_vis_dictunpack32 (double, int); + +long __builtin_vis_fpcmple8shl (v8qi, v8qi, int); +long __builtin_vis_fpcmpgt8shl (v8qi, v8qi, int); +long __builtin_vis_fpcmpeq8shl (v8qi, v8qi, int); +long __builtin_vis_fpcmpne8shl (v8qi, v8qi, int); + +long __builtin_vis_fpcmple16shl (v4hi, v4hi, int); +long __builtin_vis_fpcmpgt16shl (v4hi, v4hi, int); +long __builtin_vis_fpcmpeq16shl (v4hi, v4hi, int); +long __builtin_vis_fpcmpne16shl (v4hi, v4hi, int); + +long __builtin_vis_fpcmple32shl (v2si, v2si, int); +long __builtin_vis_fpcmpgt32shl (v2si, v2si, int); +long __builtin_vis_fpcmpeq32shl (v2si, v2si, int); +long __builtin_vis_fpcmpne32shl (v2si, v2si, int); + +long __builtin_vis_fpcmpule8shl (v8qi, v8qi, int); +long __builtin_vis_fpcmpugt8shl (v8qi, v8qi, int); +long __builtin_vis_fpcmpule16shl (v4hi, v4hi, int); +long 
__builtin_vis_fpcmpugt16shl (v4hi, v4hi, int); +long __builtin_vis_fpcmpule32shl (v2si, v2si, int); +long __builtin_vis_fpcmpugt32shl (v2si, v2si, int); + +long __builtin_vis_fpcmpde8shl (v8qi, v8qi, int); +long __builtin_vis_fpcmpde16shl (v4hi, v4hi, int); +long __builtin_vis_fpcmpde32shl (v2si, v2si, int); + +long __builtin_vis_fpcmpur8shl (v8qi, v8qi, int); +long __builtin_vis_fpcmpur16shl (v4hi, v4hi, int); +long __builtin_vis_fpcmpur32shl (v2si, v2si, int); +@end smallexample + @node SPU Built-in Functions @subsection SPU Built-in Functions diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index d0b90503ced..aa848bb2348 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -1125,6 +1125,7 @@ See RS/6000 and PowerPC Options. -muser-mode -mno-user-mode @gol -mv8plus -mno-v8plus -mvis -mno-vis @gol -mvis2 -mno-vis2 -mvis3 -mno-vis3 @gol +-mvis4 -mno-vis4 -mvis4b -mno-vis4b @gol -mcbcond -mno-cbcond -mfmaf -mno-fmaf @gol -mpopc -mno-popc -msubxc -mno-subxc@gol -mfix-at697f -mfix-ut699 @gol @@ -23893,7 +23894,7 @@ for machine type @var{cpu_type}. Supported values for @var{cpu_type} are @samp{leon}, @samp{leon3}, @samp{leon3v7}, @samp{sparclite}, @samp{f930}, @samp{f934}, @samp{sparclite86x}, @samp{sparclet}, @samp{tsc701}, @samp{v9}, @samp{ultrasparc}, @samp{ultrasparc3}, @samp{niagara}, @samp{niagara2}, -@samp{niagara3}, @samp{niagara4} and @samp{niagara7}. +@samp{niagara3}, @samp{niagara4}, @samp{niagara7} and @samp{m8}. Native Solaris and GNU/Linux toolchains also support the value @samp{native}, which selects the best architecture option for the host processor. @@ -23921,7 +23922,8 @@ f930, f934, sparclite86x tsc701 @item v9 -ultrasparc, ultrasparc3, niagara, niagara2, niagara3, niagara4, niagara7 +ultrasparc, ultrasparc3, niagara, niagara2, niagara3, niagara4, +niagara7, m8 @end table By default (unless configured otherwise), GCC generates code for the V7 @@ -23965,7 +23967,8 @@ additionally optimizes it for Sun UltraSPARC T2 chips. 
With @option{-mcpu=niagara3}, the compiler additionally optimizes it for Sun UltraSPARC T3 chips. With @option{-mcpu=niagara4}, the compiler additionally optimizes it for Sun UltraSPARC T4 chips. With @option{-mcpu=niagara7}, the compiler additionally optimizes it for -Oracle SPARC M7 chips. +Oracle SPARC M7 chips. With @option{-mcpu=m8}, the compiler +additionally optimizes it for Oracle SPARC M8 chips. @item -mtune=@var{cpu_type} @opindex mtune @@ -23980,8 +23983,8 @@ that select a particular CPU implementation. Those are @samp{leon3}, @samp{leon3v7}, @samp{f930}, @samp{f934}, @samp{sparclite86x}, @samp{tsc701}, @samp{ultrasparc}, @samp{ultrasparc3}, @samp{niagara}, @samp{niagara2}, @samp{niagara3}, -@samp{niagara4} and @samp{niagara7}. With native Solaris and -GNU/Linux toolchains, @samp{native} can also be used. +@samp{niagara4}, @samp{niagara7} and @samp{m8}. With native Solaris +and GNU/Linux toolchains, @samp{native} can also be used. @item -mv8plus @itemx -mno-v8plus @@ -24029,6 +24032,18 @@ default is @option{-mvis4} when targeting a cpu that supports such instructions, such as niagara-7 and later. Setting @option{-mvis4} also sets @option{-mvis3}, @option{-mvis2} and @option{-mvis}. +@item -mvis4b +@itemx -mno-vis4b +@opindex mvis4b +@opindex mno-vis4b +With @option{-mvis4b}, GCC generates code that takes advantage of +version 4.0 of the UltraSPARC Visual Instruction Set extensions, plus +the additional VIS instructions introduced in the Oracle SPARC +Architecture 2017. The default is @option{-mvis4b} when targeting a +cpu that supports such instructions, such as m8 and later. Setting +@option{-mvis4b} also sets @option{-mvis4}, @option{-mvis3}, +@option{-mvis2} and @option{-mvis}. + @item -mcbcond @itemx -mno-cbcond @opindex mcbcond diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index f0f068b663f..6e53e295b5a 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,11 @@ +2017-07-07 Jose E. Marchesi + + * gcc.target/sparc/dictunpack.c: New file. + * gcc.target/sparc/fpcmpdeshl.c: Likewise.
+ * gcc.target/sparc/fpcmpshl.c: Likewise. + * gcc.target/sparc/fpcmpurshl.c: Likewise. + * gcc.target/sparc/fpcmpushl.c: Likewise. + 2017-07-06 Harald Anlauf PR fortran/70071 diff --git a/gcc/testsuite/gcc.target/sparc/dictunpack.c b/gcc/testsuite/gcc.target/sparc/dictunpack.c new file mode 100644 index 00000000000..4334dee2b2e --- /dev/null +++ b/gcc/testsuite/gcc.target/sparc/dictunpack.c @@ -0,0 +1,25 @@ +/* { dg-do compile } */ +/* { dg-options "-mvis4b" } */ + +typedef unsigned char vec8 __attribute__((vector_size(8))); +typedef short vec16 __attribute__((vector_size(8))); +typedef int vec32 __attribute__((vector_size(8))); + +vec8 test_dictunpack8 (double a) +{ + return __builtin_vis_dictunpack8 (a, 6); +} + +vec16 test_dictunpack16 (double a) +{ + return __builtin_vis_dictunpack16 (a, 14); +} + +vec32 test_dictunpack32 (double a) +{ + return __builtin_vis_dictunpack32 (a, 30); +} + +/* { dg-final { scan-assembler "dictunpack\t%" } } */ +/* { dg-final { scan-assembler "dictunpack\t%" } } */ +/* { dg-final { scan-assembler "dictunpack\t%" } } */ diff --git a/gcc/testsuite/gcc.target/sparc/fpcmpdeshl.c b/gcc/testsuite/gcc.target/sparc/fpcmpdeshl.c new file mode 100644 index 00000000000..3e3daa6e99f --- /dev/null +++ b/gcc/testsuite/gcc.target/sparc/fpcmpdeshl.c @@ -0,0 +1,25 @@ +/* { dg-do compile } */ +/* { dg-options "-mvis4b" } */ + +typedef unsigned char vec8 __attribute__((vector_size(8))); +typedef short vec16 __attribute__((vector_size(8))); +typedef int vec32 __attribute__((vector_size(8))); + +long test_fpcmpde8shl (vec8 a, vec8 b) +{ + return __builtin_vis_fpcmpde8shl (a, b, 2); +} + +long test_fpcmpde16shl (vec16 a, vec16 b) +{ + return __builtin_vis_fpcmpde16shl (a, b, 2); +} + +long test_fpcmpde32shl (vec32 a, vec32 b) +{ + return __builtin_vis_fpcmpde32shl (a, b, 2); +} + +/* { dg-final { scan-assembler "fpcmpde8shl\t%" } } */ +/* { dg-final { scan-assembler "fpcmpde16shl\t%" } } */ +/* { dg-final { scan-assembler "fpcmpde32shl\t%" } } */ diff 
--git a/gcc/testsuite/gcc.target/sparc/fpcmpshl.c b/gcc/testsuite/gcc.target/sparc/fpcmpshl.c new file mode 100644 index 00000000000..0985251cbfd --- /dev/null +++ b/gcc/testsuite/gcc.target/sparc/fpcmpshl.c @@ -0,0 +1,81 @@ +/* { dg-do compile } */ +/* { dg-options "-mvis4b" } */ + +typedef unsigned char vec8 __attribute__((vector_size(8))); +typedef short vec16 __attribute__((vector_size(8))); +typedef int vec32 __attribute__((vector_size(8))); + +long test_fpcmple8shl (vec8 a, vec8 b) +{ + return __builtin_vis_fpcmple8shl (a, b, 2); +} + +long test_fpcmpgt8shl (vec8 a, vec8 b) +{ + return __builtin_vis_fpcmpgt8shl (a, b, 2); +} + +long test_fpcmpeq8shl (vec8 a, vec8 b) +{ + return __builtin_vis_fpcmpeq8shl (a, b, 2); +} + +long test_fpcmpne8shl (vec8 a, vec8 b) +{ + return __builtin_vis_fpcmpne8shl (a, b, 2); +} + +long test_fpcmple16shl (vec16 a, vec16 b) +{ + return __builtin_vis_fpcmple16shl (a, b, 2); +} + +long test_fpcmpgt16shl (vec16 a, vec16 b) +{ + return __builtin_vis_fpcmpgt16shl (a, b, 2); +} + +long test_fpcmpeq16shl (vec16 a, vec16 b) +{ + return __builtin_vis_fpcmpeq16shl (a, b, 2); +} + +long test_fpcmpne16shl (vec16 a, vec16 b) +{ + return __builtin_vis_fpcmpne16shl (a, b, 2); +} + +long test_fpcmple32shl (vec32 a, vec32 b) +{ + return __builtin_vis_fpcmple32shl (a, b, 2); +} + +long test_fpcmpgt32shl (vec32 a, vec32 b) +{ + return __builtin_vis_fpcmpgt32shl (a, b, 2); +} + +long test_fpcmpeq32shl (vec32 a, vec32 b) +{ + return __builtin_vis_fpcmpeq32shl (a, b, 2); +} + +long test_fpcmpne32shl (vec32 a, vec32 b) +{ + return __builtin_vis_fpcmpne32shl (a, b, 2); +} + +/* { dg-final { scan-assembler "fpcmple8shl\t%" } } */ +/* { dg-final { scan-assembler "fpcmpgt8shl\t%" } } */ +/* { dg-final { scan-assembler "fpcmpeq8shl\t%" } } */ +/* { dg-final { scan-assembler "fpcmpne8shl\t%" } } */ + +/* { dg-final { scan-assembler "fpcmple16shl\t%" } } */ +/* { dg-final { scan-assembler "fpcmpgt16shl\t%" } } */ +/* { dg-final { scan-assembler 
"fpcmpeq16shl\t%" } } */ +/* { dg-final { scan-assembler "fpcmpne16shl\t%" } } */ + +/* { dg-final { scan-assembler "fpcmple32shl\t%" } } */ +/* { dg-final { scan-assembler "fpcmpgt32shl\t%" } } */ +/* { dg-final { scan-assembler "fpcmpeq32shl\t%" } } */ +/* { dg-final { scan-assembler "fpcmpne32shl\t%" } } */ diff --git a/gcc/testsuite/gcc.target/sparc/fpcmpurshl.c b/gcc/testsuite/gcc.target/sparc/fpcmpurshl.c new file mode 100644 index 00000000000..db74e01b5f2 --- /dev/null +++ b/gcc/testsuite/gcc.target/sparc/fpcmpurshl.c @@ -0,0 +1,25 @@ +/* { dg-do compile } */ +/* { dg-options "-mvis4b" } */ + +typedef unsigned char vec8 __attribute__((vector_size(8))); +typedef short vec16 __attribute__((vector_size(8))); +typedef int vec32 __attribute__((vector_size(8))); + +long test_fpcmpur8shl (vec8 a, vec8 b) +{ + return __builtin_vis_fpcmpur8shl (a, b, 2); +} + +long test_fpcmpur16shl (vec16 a, vec16 b) +{ + return __builtin_vis_fpcmpur16shl (a, b, 2); +} + +long test_fpcmpur32shl (vec32 a, vec32 b) +{ + return __builtin_vis_fpcmpur32shl (a, b, 2); +} + +/* { dg-final { scan-assembler "fpcmpur8shl\t%" } } */ +/* { dg-final { scan-assembler "fpcmpur16shl\t%" } } */ +/* { dg-final { scan-assembler "fpcmpur32shl\t%" } } */ diff --git a/gcc/testsuite/gcc.target/sparc/fpcmpushl.c b/gcc/testsuite/gcc.target/sparc/fpcmpushl.c new file mode 100644 index 00000000000..fc58deddb45 --- /dev/null +++ b/gcc/testsuite/gcc.target/sparc/fpcmpushl.c @@ -0,0 +1,43 @@ +/* { dg-do compile } */ +/* { dg-options "-mvis4b" } */ + +typedef unsigned char vec8 __attribute__((vector_size(8))); +typedef short vec16 __attribute__((vector_size(8))); +typedef int vec32 __attribute__((vector_size(8))); + +long test_fpcmpule8shl (vec8 a, vec8 b) +{ + return __builtin_vis_fpcmpule8shl (a, b, 2); +} + +long test_fpcmpugt8shl (vec8 a, vec8 b) +{ + return __builtin_vis_fpcmpugt8shl (a, b, 2); +} + +long test_fpcmpule16shl (vec16 a, vec16 b) +{ + return __builtin_vis_fpcmpule16shl (a, b, 2); +} + +long 
test_fpcmpugt16shl (vec16 a, vec16 b) +{ + return __builtin_vis_fpcmpugt16shl (a, b, 2); +} + +long test_fpcmpule32shl (vec32 a, vec32 b) +{ + return __builtin_vis_fpcmpule32shl (a, b, 2); +} + +long test_fpcmpugt32shl (vec32 a, vec32 b) +{ + return __builtin_vis_fpcmpugt32shl (a, b, 2); +} + +/* { dg-final { scan-assembler "fpcmpule8shl\t%" } } */ +/* { dg-final { scan-assembler "fpcmpugt8shl\t%" } } */ +/* { dg-final { scan-assembler "fpcmpule16shl\t%" } } */ +/* { dg-final { scan-assembler "fpcmpugt16shl\t%" } } */ +/* { dg-final { scan-assembler "fpcmpule32shl\t%" } } */ +/* { dg-final { scan-assembler "fpcmpugt32shl\t%" } } */ -- 2.30.2
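Note on the constant operands: the tests above pass small literal values as the last argument of each built-in (2 for the fpcmp*shl family; 6, 14 and 30 for dictunpack), and the ChangeLog describes these operands as unsigned 2-bit and unsigned 5-bit integer constants. The portable C sketch below only illustrates that kind of range check. The names fits_uimm2, fits_uimm5 and check_examples are hypothetical and are not the macros or predicates added by this patch. The per-size dictunpack predicates may accept a narrower range than the plain 5-bit check, which is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helpers for illustration only; these are assumptions,
   not the SPARC_IMM2_P / SPARC_IMM5_P macros from sparc.h.  */

/* Return true if X fits in an unsigned 2-bit field, i.e. 0..3.  */
static bool
fits_uimm2 (long x)
{
  return ((unsigned long) x & ~0x3UL) == 0;
}

/* Return true if X fits in an unsigned 5-bit field, i.e. 0..31.  */
static bool
fits_uimm5 (long x)
{
  return ((unsigned long) x & ~0x1fUL) == 0;
}

/* Exercise the helpers with the literal values used in the tests.  */
void
check_examples (void)
{
  assert (fits_uimm2 (2));     /* Value passed to the fpcmp*shl tests.  */
  assert (!fits_uimm2 (4));    /* One past the 2-bit range.  */
  assert (fits_uimm5 (30));    /* Value passed to the dictunpack32 test.  */
  assert (!fits_uimm5 (32));   /* One past the 5-bit range.  */
  assert (!fits_uimm5 (-1));   /* Negative values have high bits set.  */
}
```

Masking the value after a cast to an unsigned type rejects negative inputs and too-large inputs with a single comparison, which is why the sketch uses it instead of a pair of bounds checks.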