From: Michael Meissner
Date: Wed, 15 Jun 2016 18:17:58 +0000 (+0000)
Subject: vsx.md (VSINT_84): Add DImode to enable loading DImode constants with XXSPLTIB in...
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=1a3c3ee9bc4639fc67e037b6837d2625327555fd;p=gcc.git

vsx.md (VSINT_84): Add DImode to enable loading DImode constants with XXSPLTIB in vector registers.

[gcc]
2016-06-15 Michael Meissner

	* config/rs6000/vsx.md (VSINT_84): Add DImode to enable loading
	DImode constants with XXSPLTIB in vector registers.
	(vsx_extract_, V2DImode/V2DFmode): Combine both
	vsx_extract__internal{1,2} into a single insn that handles
	direct move (both ISA 2.07 and ISA 3.0 versions), and optimizes
	extraction of the element at the top of the register as a scalar
	value.
	(vsx_extract__internal1): Likewise.
	(vsx_extract__internal2): Likewise.
	* config/rs6000/constraints.md (wi constraint): Remove a comment
	about DImode not being allowed in Altivec registers.
	(wB constraint): New constraint for constants that can be
	generated in Altivec registers with VSPLTISW/VUPKHSW.
	* config/rs6000/predicates.md (xxspltib_constant_split): Update
	comments.
	(xxspltib_constant_nosplit): Likewise.
	* config/rs6000/rs6000-cpus.def (ISA_2_6_MASKS_SERVER): Add
	support for -mupper-regs-di to enable DImode to go into Altivec
	registers.
	(POWERPC_MASKS): Likewise.
	(power7 cpu): Likewise.
	* config/rs6000/rs6000.opt (-mupper-regs-di): Likewise.
	* config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): Add support
	for DImode being allowed in Altivec registers. Update wi/wj
	constraints. Set scalar_in_vmx_p flag.
	(rs6000_option_override_internal): Add checks for -mupper-regs-di.
	(xxspltib_constant_p): Allow CONST_INT's with VOIDmode. Don't
	return true if we could use VSPLTISW/VUPKHSW instead of XXSPLTIB.
	(rs6000_opt_masks): Add -mupper-regs-di.
	* config/rs6000/rs6000.md (lfiwax): Update clobbers that don't use
	direct move to use wi and not wj.
	(lfiwzx): Likewise.
	(floatsi2_lfiwax_mem): Combine alternatives into a single
	alternative.
	(floatunssi2_lfiwzx_mem): Likewise.
	(fix_truncdi2_fctidz): Change second alternative to allow
	any VSX register, instead of just Altivec registers, to allow
	either operand to be an Altivec register or both.
	(fixuns_truncdi2_fctiduz): Likewise.
	(movdi_internal32): Add support for -mupper-regs-di. Add support
	to load constants via XXSPLTIB or VSPLTISW. Add spacing to allow
	the alternatives and attributes to be lined up to be easier to
	read.
	(movdi_internal64): Likewise.
	(64-bit DImode splitters): Change predicates to only split loading
	up GPR registers. Add splits for using XXSPLTIB or VSPLTISW to
	load constants in ISA 3.0 or ISA 2.07 respectively.
	* doc/invoke.texi (RS/6000 and PowerPC Options): Document
	-mupper-regs-di. Update -mupper-regs-df and -mupper-regs-sf to
	mention -mcpu=power9 sets these options.
	* doc/md.texi (PowerPC and IBM RS6000 constraints): Document the
	wB constraint.

[gcc/testsuite]
2016-06-15 Michael Meissner

	* gcc.target/powerpc/p9-dimode1.c: New test.
	* gcc.target/powerpc/p9-dimode2.c: Likewise.

From-SVN: r237490
---

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 7f3fe7e2fea..2650405a3f7 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,58 @@
+2016-06-15 Michael Meissner
+
+	* config/rs6000/vsx.md (VSINT_84): Add DImode to enable loading
+	DImode constants with XXSPLTIB in vector registers.
+	(vsx_extract_, V2DImode/V2DFmode): Combine both
+	vsx_extract__internal{1,2} into a single insn that handles
+	direct move (both ISA 2.07 and ISA 3.0 versions), and optimizes
+	extraction of the element at the top of the register as a scalar
+	value.
+	(vsx_extract__internal1): Likewise.
+	(vsx_extract__internal2): Likewise.
+	* config/rs6000/constraints.md (wi constraint): Remove a comment
+	about DImode not being allowed in Altivec registers.
+	(wB constraint): New constraint for constants that can be
+	generated in Altivec registers with VSPLTISW/VUPKHSW.
+	* config/rs6000/predicates.md (xxspltib_constant_split): Update
+	comments.
+	(xxspltib_constant_nosplit): Likewise.
+	* config/rs6000/rs6000-cpus.def (ISA_2_6_MASKS_SERVER): Add
+	support for -mupper-regs-di to enable DImode to go into Altivec
+	registers.
+	(POWERPC_MASKS): Likewise.
+	(power7 cpu): Likewise.
+	* config/rs6000/rs6000.opt (-mupper-regs-di): Likewise.
+	* config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): Add support
+	for DImode being allowed in Altivec registers. Update wi/wj
+	constraints. Set scalar_in_vmx_p flag.
+	(rs6000_option_override_internal): Add checks for -mupper-regs-di.
+	(xxspltib_constant_p): Allow CONST_INT's with VOIDmode. Don't
+	return true if we could use VSPLTISW/VUPKHSW instead of XXSPLTIB.
+	(rs6000_opt_masks): Add -mupper-regs-di.
+	* config/rs6000/rs6000.md (lfiwax): Update clobbers that don't use
+	direct move to use wi and not wj.
+	(lfiwzx): Likewise.
+	(floatsi2_lfiwax_mem): Combine alternatives into a single
+	alternative.
+	(floatunssi2_lfiwzx_mem): Likewise.
+	(fix_truncdi2_fctidz): Change second alternative to allow
+	any VSX register, instead of just Altivec registers, to allow
+	either operand to be an Altivec register or both.
+	(fixuns_truncdi2_fctiduz): Likewise.
+	(movdi_internal32): Add support for -mupper-regs-di. Add support
+	to load constants via XXSPLTIB or VSPLTISW. Add spacing to allow
+	the alternatives and attributes to be lined up to be easier to
+	read.
+	(movdi_internal64): Likewise.
+	(64-bit DImode splitters): Change predicates to only split loading
+	up GPR registers. Add splits for using XXSPLTIB or VSPLTISW to
+	load constants in ISA 3.0 or ISA 2.07 respectively.
+	* doc/invoke.texi (RS/6000 and PowerPC Options): Document
+	-mupper-regs-di. Update -mupper-regs-df and -mupper-regs-sf to
+	mention -mcpu=power9 sets these options.
+	* doc/md.texi (PowerPC and IBM RS6000 constraints): Document the
+	wB constraint.
+ 2016-06-15 Pitchumani Sivanupandi PR target/67353 diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md index ef8f617d9a8..8ef8f9b429e 100644 --- a/gcc/config/rs6000/constraints.md +++ b/gcc/config/rs6000/constraints.md @@ -77,8 +77,6 @@ (define_register_constraint "wh" "rs6000_constraints[RS6000_CONSTRAINT_wh]" "Floating point register if direct moves are available, or NO_REGS.") -;; At present, DImode is not allowed in the Altivec registers. If in the -;; future it is allowed, wi/wj can be set to VSX_REGS instead of FLOAT_REGS. (define_register_constraint "wi" "rs6000_constraints[RS6000_CONSTRAINT_wi]" "FP or VSX register to hold 64-bit integers for VSX insns or NO_REGS.") @@ -135,6 +133,13 @@ (define_register_constraint "wz" "rs6000_constraints[RS6000_CONSTRAINT_wz]" "Floating point register if the LFIWZX instruction is enabled or NO_REGS.") +;; wB needs ISA 2.07 VUPKHSW +(define_constraint "wB" + "Signed 5-bit constant integer that can be loaded into an altivec register." + (and (match_code "const_int") + (and (match_test "TARGET_P8_VECTOR") + (match_operand 0 "s5bit_cint_operand")))) + (define_constraint "wD" "Int constant that is the element number of the 64-bit scalar in a vector." (and (match_code "const_int") diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md index ed3e84ebb14..3d0f48ea712 100644 --- a/gcc/config/rs6000/predicates.md +++ b/gcc/config/rs6000/predicates.md @@ -565,9 +565,8 @@ } }) -;; Return 1 if the operand is a CONST_VECTOR or VEC_DUPLICATE of a constant -;; that can loaded with a XXSPLTIB instruction and then a VUPKHSB, VECSB2W or -;; VECSB2D instruction. +;; Return 1 if the operand is a constant that can loaded with a XXSPLTIB +;; instruction and then a VUPKHSB, VECSB2W or VECSB2D instruction. 
(define_predicate "xxspltib_constant_split" (match_code "const_vector,vec_duplicate,const_int") @@ -582,8 +581,8 @@ }) -;; Return 1 if the operand is a CONST_VECTOR that can loaded directly with a -;; XXSPLTIB instruction. +;; Return 1 if the operand is constant that can loaded directly with a XXSPLTIB +;; instruction. (define_predicate "xxspltib_constant_nosplit" (match_code "const_vector,vec_duplicate,const_int") diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def index 27239f1d371..a67b2d91b4e 100644 --- a/gcc/config/rs6000/rs6000-cpus.def +++ b/gcc/config/rs6000/rs6000-cpus.def @@ -45,6 +45,7 @@ | OPTION_MASK_POPCNTD \ | OPTION_MASK_ALTIVEC \ | OPTION_MASK_VSX \ + | OPTION_MASK_UPPER_REGS_DI \ | OPTION_MASK_UPPER_REGS_DF) /* For now, don't provide an embedded version of ISA 2.07. */ @@ -119,6 +120,7 @@ | OPTION_MASK_SOFT_FLOAT \ | OPTION_MASK_STRICT_ALIGN_OPTIONAL \ | OPTION_MASK_TOC_FUSION \ + | OPTION_MASK_UPPER_REGS_DI \ | OPTION_MASK_UPPER_REGS_DF \ | OPTION_MASK_UPPER_REGS_SF \ | OPTION_MASK_VSX \ @@ -211,7 +213,8 @@ RS6000_CPU ("power6x", PROCESSOR_POWER6, MASK_POWERPC64 | MASK_PPC_GPOPT RS6000_CPU ("power7", PROCESSOR_POWER7, /* Don't add MASK_ISEL by default */ POWERPC_7400_MASK | MASK_POWERPC64 | MASK_PPC_GPOPT | MASK_MFCRF | MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP | MASK_POPCNTD - | MASK_VSX | MASK_RECIP_PRECISION | OPTION_MASK_UPPER_REGS_DF) + | MASK_VSX | MASK_RECIP_PRECISION | OPTION_MASK_UPPER_REGS_DF + | OPTION_MASK_UPPER_REGS_DI) RS6000_CPU ("power8", PROCESSOR_POWER8, MASK_POWERPC64 | ISA_2_7_MASKS_SERVER) RS6000_CPU ("power9", PROCESSOR_POWER9, MASK_POWERPC64 | ISA_3_0_MASKS_SERVER) RS6000_CPU ("powerpc", PROCESSOR_POWERPC, 0) diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 2d7df6b3b7c..7e9e908619a 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -1938,7 +1938,8 @@ rs6000_hard_regno_mode_ok (int regno, machine_mode mode) || FLOAT128_VECTOR_P (mode) || 
reg_addr[mode].scalar_in_vmx_p || (TARGET_VSX_TIMODE && mode == TImode) - || (TARGET_VADDUQM && mode == V1TImode))) + || (TARGET_VADDUQM && mode == V1TImode) + || (TARGET_UPPER_REGS_DI && mode == DImode))) { if (FP_REGNO_P (regno)) return FP_REGNO_P (last_regno); @@ -3082,7 +3083,6 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p) rs6000_constraints[RS6000_CONSTRAINT_wa] = VSX_REGS; rs6000_constraints[RS6000_CONSTRAINT_wd] = VSX_REGS; /* V2DFmode */ rs6000_constraints[RS6000_CONSTRAINT_wf] = VSX_REGS; /* V4SFmode */ - rs6000_constraints[RS6000_CONSTRAINT_wi] = FLOAT_REGS; /* DImode */ if (TARGET_VSX_TIMODE) rs6000_constraints[RS6000_CONSTRAINT_wt] = VSX_REGS; /* TImode */ @@ -3094,6 +3094,11 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p) } else rs6000_constraints[RS6000_CONSTRAINT_ws] = FLOAT_REGS; + + if (TARGET_UPPER_REGS_DF) /* DImode */ + rs6000_constraints[RS6000_CONSTRAINT_wi] = VSX_REGS; + else + rs6000_constraints[RS6000_CONSTRAINT_wi] = FLOAT_REGS; } /* Add conditional constraints based on various options, to allow us to @@ -3306,6 +3311,9 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p) if (TARGET_UPPER_REGS_DF) reg_addr[DFmode].scalar_in_vmx_p = true; + if (TARGET_UPPER_REGS_DI) + reg_addr[DImode].scalar_in_vmx_p = true; + if (TARGET_UPPER_REGS_SF) reg_addr[SFmode].scalar_in_vmx_p = true; } @@ -4085,9 +4093,9 @@ rs6000_option_override_internal (bool global_init_p) rs6000_isa_flags &= ~OPTION_MASK_DFP; } - /* Allow an explicit -mupper-regs to set both -mupper-regs-df and - -mupper-regs-sf, depending on the cpu, unless the user explicitly also set - the individual option. */ + /* Allow an explicit -mupper-regs to set -mupper-regs-df, -mupper-regs-di, + and -mupper-regs-sf, depending on the cpu, unless the user explicitly also + set the individual option. 
*/ if (TARGET_UPPER_REGS > 0) { if (TARGET_VSX @@ -4096,6 +4104,12 @@ rs6000_option_override_internal (bool global_init_p) rs6000_isa_flags |= OPTION_MASK_UPPER_REGS_DF; rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DF; } + if (TARGET_VSX + && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_DI)) + { + rs6000_isa_flags |= OPTION_MASK_UPPER_REGS_DI; + rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DI; + } if (TARGET_P8_VECTOR && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_SF)) { @@ -4111,6 +4125,12 @@ rs6000_option_override_internal (bool global_init_p) rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_DF; rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DF; } + if (TARGET_VSX + && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_DI)) + { + rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_DI; + rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DI; + } if (TARGET_P8_VECTOR && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_SF)) { @@ -4126,6 +4146,13 @@ rs6000_option_override_internal (bool global_init_p) rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_DF; } + if (TARGET_UPPER_REGS_DI && !TARGET_VSX) + { + if (rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_DF) + error ("-mupper-regs-di requires -mvsx"); + rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_DF; + } + if (TARGET_UPPER_REGS_SF && !TARGET_P8_VECTOR) { if (rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_SF) @@ -4386,6 +4413,7 @@ rs6000_option_override_internal (bool global_init_p) if (TARGET_FLOAT128_HW && (rs6000_isa_flags & (OPTION_MASK_P9_VECTOR | OPTION_MASK_DIRECT_MOVE + | OPTION_MASK_UPPER_REGS_DI | OPTION_MASK_UPPER_REGS_DF | OPTION_MASK_UPPER_REGS_SF)) == 0) { @@ -6284,7 +6312,7 @@ xxspltib_constant_p (rtx op, if (mode == VOIDmode) mode = GET_MODE (op); - else if (mode != GET_MODE (op)) + else if (mode != GET_MODE (op) && GET_MODE (op) != VOIDmode) return false; /* Handle (vec_duplicate ). 
*/ @@ -6337,8 +6365,8 @@ xxspltib_constant_p (rtx op, } /* Handle integer constants being loaded into the upper part of the VSX - register as a scalar. If the value isn't 0/-1, only allow it if - the mode can go in Altivec registers. */ + register as a scalar. If the value isn't 0/-1, only allow it if the mode + can go in Altivec registers. Prefer VSPLTISW/VUPKHSW over XXSPLITIB. */ else if (CONST_INT_P (op)) { if (!SCALAR_INT_MODE_P (mode)) @@ -6348,9 +6376,14 @@ xxspltib_constant_p (rtx op, if (!IN_RANGE (value, -128, 127)) return false; - if (!IN_RANGE (value, -1, 0) - && (reg_addr[mode].addr_mask[RELOAD_REG_VMX] & RELOAD_REG_VALID) == 0) - return false; + if (!IN_RANGE (value, -1, 0)) + { + if (!(reg_addr[mode].addr_mask[RELOAD_REG_VMX] & RELOAD_REG_VALID)) + return false; + + if (EASY_VECTOR_15 (value)) + return false; + } } else @@ -35485,6 +35518,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] = { "string", OPTION_MASK_STRING, false, true }, { "toc-fusion", OPTION_MASK_TOC_FUSION, false, true }, { "update", OPTION_MASK_NO_UPDATE, true , true }, + { "upper-regs-di", OPTION_MASK_UPPER_REGS_DI, false, true }, { "upper-regs-df", OPTION_MASK_UPPER_REGS_DF, false, true }, { "upper-regs-sf", OPTION_MASK_UPPER_REGS_SF, false, true }, { "vsx", OPTION_MASK_VSX, false, true }, diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 133eef1c14a..3825cc011d6 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -4866,7 +4866,7 @@ (define_insn_and_split "floatsi2_lfiwax" [(set (match_operand:SFDF 0 "gpc_reg_operand" "=") (float:SFDF (match_operand:SI 1 "nonimmediate_operand" "r"))) - (clobber (match_scratch:DI 2 "=wj"))] + (clobber (match_scratch:DI 2 "=wi"))] "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWAX && && can_create_pseudo_p ()" "#" @@ -4905,11 +4905,11 @@ (set_attr "type" "fpload")]) (define_insn_and_split "floatsi2_lfiwax_mem" - [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,") 
+ [(set (match_operand:SFDF 0 "gpc_reg_operand" "=") (float:SFDF (sign_extend:DI - (match_operand:SI 1 "indexed_or_indirect_operand" "Z,Z")))) - (clobber (match_scratch:DI 2 "=0,d"))] + (match_operand:SI 1 "indexed_or_indirect_operand" "Z")))) + (clobber (match_scratch:DI 2 "=wi"))] "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWAX && " "#" @@ -4941,7 +4941,7 @@ (define_insn_and_split "floatunssi2_lfiwzx" [(set (match_operand:SFDF 0 "gpc_reg_operand" "=") (unsigned_float:SFDF (match_operand:SI 1 "nonimmediate_operand" "r"))) - (clobber (match_scratch:DI 2 "=wj"))] + (clobber (match_scratch:DI 2 "=wi"))] "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWZX && " "#" @@ -4980,11 +4980,11 @@ (set_attr "type" "fpload")]) (define_insn_and_split "floatunssi2_lfiwzx_mem" - [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,") + [(set (match_operand:SFDF 0 "gpc_reg_operand" "=") (unsigned_float:SFDF (zero_extend:DI - (match_operand:SI 1 "indexed_or_indirect_operand" "Z,Z")))) - (clobber (match_scratch:DI 2 "=0,d"))] + (match_operand:SI 1 "indexed_or_indirect_operand" "Z")))) + (clobber (match_scratch:DI 2 "=wi"))] "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWZX && " "#" @@ -5288,7 +5288,7 @@ (define_insn "*fix_truncdi2_fctidz" [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wi") - (fix:DI (match_operand:SFDF 1 "gpc_reg_operand" ",")))] + (fix:DI (match_operand:SFDF 1 "gpc_reg_operand" ",")))] "TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_FPRS && TARGET_FCFID" "@ @@ -5360,7 +5360,7 @@ (define_insn "*fixuns_truncdi2_fctiduz" [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wi") - (unsigned_fix:DI (match_operand:SFDF 1 "gpc_reg_operand" ",")))] + (unsigned_fix:DI (match_operand:SFDF 1 "gpc_reg_operand" ",")))] "TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_FPRS && TARGET_FCTIDUZ" "@ @@ -7700,9 +7700,25 @@ ;; non-offsettable address by using r->r which won't make progress. 
;; Use of fprs is disparaged slightly otherwise reload prefers to reload ;; a gpr into a fpr instead of reloading an invalid 'Y' address + +;; GPR store GPR load GPR move FPR store FPR load FPR move +;; GPR const AVX store AVX store AVX load AVX load VSX move +;; P9 0 P9 -1 AVX 0/-1 VSX 0 VSX -1 P9 const +;; AVX const + (define_insn "*movdi_internal32" - [(set (match_operand:DI 0 "rs6000_nonimmediate_operand" "=Y,r,r,?m,?*d,?*d,r") - (match_operand:DI 1 "input_operand" "r,Y,r,d,m,d,IJKnGHF"))] + [(set (match_operand:DI 0 "rs6000_nonimmediate_operand" + "=Y, r, r, ?m, ?*d, ?*d, + r, ?Y, ?Z, ?*wb, ?*wv, ?wi, + ?wo, ?wo, ?wv, ?wi, ?wi, ?wv, + ?wv") + + (match_operand:DI 1 "input_operand" + "r, Y, r, d, m, d, + IJKnGHF, wb, wv, Y, Z, wi, + Oj, wM, OjwM, Oj, wM, wS, + wB"))] + "! TARGET_POWERPC64 && (gpc_reg_operand (operands[0], DImode) || gpc_reg_operand (operands[1], DImode))" @@ -7713,8 +7729,24 @@ stfd%U0%X0 %1,%0 lfd%U1%X1 %0,%1 fmr %0,%1 + # + stxsd %1,%0 + stxsdx %x1,%y0 + lxsd %0,%1 + lxsdx %x0,%y1 + xxlor %x0,%x1,%x1 + xxspltib %x0,0 + xxspltib %x0,255 + vspltisw %0,%1 + xxlxor %x0,%x0,%x0 + xxlorc %x0,%x0,%x0 + # #" - [(set_attr "type" "store,load,*,fpstore,fpload,fp,*")]) + [(set_attr "type" + "store, load, *, fpstore, fpload, fp, + *, fpstore, fpstore, fpload, fpload, vecsimple, + vecsimple, vecsimple, vecsimple, vecsimple, vecsimple, vecsimple, + vecsimple")]) (define_split [(set (match_operand:DI 0 "gpc_reg_operand" "") @@ -7744,9 +7776,26 @@ [(pc)] { rs6000_split_multireg_move (operands[0], operands[1]); DONE; }) +;; GPR store GPR load GPR move GPR li GPR lis GPR # +;; FPR store FPR load FPR move AVX store AVX store AVX load +;; AVX load VSX move P9 0 P9 -1 AVX 0/-1 VSX 0 +;; VSX -1 P9 const AVX const From SPR To SPR SPR<->SPR +;; FPR->GPR GPR->FPR VSX->GPR GPR->VSX (define_insn "*movdi_internal64" - [(set (match_operand:DI 0 "nonimmediate_operand" "=Y,r,r,r,r,r,?m,?*d,?*d,r,*h,*h,r,?*wg,r,?*wj,?*wi") - (match_operand:DI 1 "input_operand" 
"r,Y,r,I,L,nF,d,m,d,*h,r,0,*wg,r,*wj,r,O"))] + [(set (match_operand:DI 0 "nonimmediate_operand" + "=Y, r, r, r, r, r, + ?m, ?*d, ?*d, ?Y, ?Z, ?*wb, + ?*wv, ?wi, ?wo, ?wo, ?wv, ?wi, + ?wi, ?wv, ?wv, r, *h, *h, + ?*r, ?*wg, ?*r, ?*wj") + + (match_operand:DI 1 "input_operand" + "r, Y, r, I, L, nF, + d, m, d, wb, wv, Y, + Z, wi, Oj, wM, OjwM, Oj, + wM, wS, wB, *h, r, 0, + wg, r, wj, r"))] + "TARGET_POWERPC64 && (gpc_reg_operand (operands[0], DImode) || gpc_reg_operand (operands[1], DImode))" @@ -7760,21 +7809,43 @@ stfd%U0%X0 %1,%0 lfd%U1%X1 %0,%1 fmr %0,%1 + stxsd %1,%0 + stxsdx %x1,%y0 + lxsd %0,%1 + lxsdx %x0,%y1 + xxlor %x0,%x1,%x1 + xxspltib %x0,0 + xxspltib %x0,255 + vspltisw %0,%1 + xxlxor %x0,%x0,%x0 + xxlorc %x0,%x0,%x0 + # + # mf%1 %0 mt%0 %1 nop mftgpr %0,%1 mffgpr %0,%1 mfvsrd %0,%x1 - mtvsrd %x0,%1 - xxlxor %x0,%x0,%x0" - [(set_attr "type" "store,load,*,*,*,*,fpstore,fpload,fp,mfjmpr,mtjmpr,*,mftgpr,mffgpr,mftgpr,mffgpr,vecsimple") - (set_attr "length" "4,4,4,4,4,20,4,4,4,4,4,4,4,4,4,4,4")]) + mtvsrd %x0,%1" + [(set_attr "type" + "store, load, *, *, *, *, + fpstore, fpload, fp, fpstore, fpstore, fpload, + fpload, vecsimple, vecsimple, vecsimple, vecsimple, vecsimple, + vecsimple, vecsimple, vecsimple, mfjmpr, mtjmpr, *, + mftgpr, mffgpr, mftgpr, mffgpr") + + (set_attr "length" + "4, 4, 4, 4, 4, 20, + 4, 4, 4, 4, 4, 4, + 4, 4, 4, 4, 4, 8, + 8, 4, 4, 4, 4, 4, + 4, 4, 4, 4")]) ; Some DImode loads are best done as a load of -1 followed by a mask ; instruction. (define_split - [(set (match_operand:DI 0 "gpc_reg_operand") + [(set (match_operand:DI 0 "int_reg_operand_not_pseudo") (match_operand:DI 1 "const_int_operand"))] "TARGET_POWERPC64 && num_insns_constant (operands[1], DImode) > 1 @@ -7791,7 +7862,7 @@ ;; When non-easy constants can go in the TOC, this should use ;; easy_fp_constant predicate. 
(define_split - [(set (match_operand:DI 0 "gpc_reg_operand" "") + [(set (match_operand:DI 0 "int_reg_operand_not_pseudo" "") (match_operand:DI 1 "const_int_operand" ""))] "TARGET_POWERPC64 && num_insns_constant (operands[1], DImode) > 1" [(set (match_dup 0) (match_dup 2)) @@ -7805,7 +7876,7 @@ }") (define_split - [(set (match_operand:DI 0 "gpc_reg_operand" "") + [(set (match_operand:DI 0 "int_reg_operand_not_pseudo" "") (match_operand:DI 1 "const_scalar_int_operand" ""))] "TARGET_POWERPC64 && num_insns_constant (operands[1], DImode) > 1" [(set (match_dup 0) (match_dup 2)) @@ -7817,6 +7888,43 @@ else FAIL; }") + +(define_split + [(set (match_operand:DI 0 "altivec_register_operand" "") + (match_operand:DI 1 "s5bit_cint_operand" ""))] + "TARGET_UPPER_REGS_DI && TARGET_VSX && reload_completed" + [(const_int 0)] +{ + rtx op0 = operands[0]; + rtx op1 = operands[1]; + int r = REGNO (op0); + rtx op0_v4si = gen_rtx_REG (V4SImode, r); + + emit_insn (gen_altivec_vspltisw (op0_v4si, op1)); + if (op1 != const0_rtx && op1 != constm1_rtx) + { + rtx op0_v2di = gen_rtx_REG (V2DImode, r); + emit_insn (gen_altivec_vupkhsw (op0_v2di, op0_v4si)); + } + DONE; +}) + +(define_split + [(set (match_operand:DI 0 "altivec_register_operand" "") + (match_operand:DI 1 "xxspltib_constant_split" ""))] + "TARGET_UPPER_REGS_DI && TARGET_P9_VECTOR && reload_completed" + [(const_int 0)] +{ + rtx op0 = operands[0]; + rtx op1 = operands[1]; + int r = REGNO (op0); + rtx op0_v16qi = gen_rtx_REG (V16QImode, r); + + emit_insn (gen_xxspltib_v16qi (op0_v16qi, op1)); + emit_insn (gen_vsx_sign_extend_qi_di (operands[0], op0_v16qi)); + DONE; +}) + ;; TImode/PTImode is similar, except that we usually want to compute the ;; address into a register and use lsi/stsi (the exception is during reload). 
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt index 92c5396c47e..4b9905fe767 100644 --- a/gcc/config/rs6000/rs6000.opt +++ b/gcc/config/rs6000/rs6000.opt @@ -597,6 +597,10 @@ mupper-regs Target Report Var(TARGET_UPPER_REGS) Init(-1) Save Allow float/double variables in upper registers if cpu allows it. +mupper-regs-di +Target Report Mask(UPPER_REGS_DI) Var(rs6000_isa_flags) +Allow 64-bit integer variables in upper registers with -mcpu=power7 or -mvsx. + moptimize-swaps Target Undocumented Var(rs6000_optimize_swaps) Init(1) Save Analyze and remove doubleword swaps from VSX computations. diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md index 58e1cb52b97..a07d66e17f0 100644 --- a/gcc/config/rs6000/vsx.md +++ b/gcc/config/rs6000/vsx.md @@ -260,7 +260,7 @@ (V2DI "wi")]) ;; Iterators for loading constants with xxspltib -(define_mode_iterator VSINT_84 [V4SI V2DI]) +(define_mode_iterator VSINT_84 [V4SI V2DI DI]) (define_mode_iterator VSINT_842 [V8HI V4SI V2DI]) ;; Constants for creating unspecs @@ -2095,77 +2095,69 @@ [(set_attr "type" "vecperm")]) ;; Extract a DF/DI element from V2DF/V2DI -(define_expand "vsx_extract_" - [(set (match_operand: 0 "register_operand" "") - (vec_select: (match_operand:VSX_D 1 "register_operand" "") - (parallel - [(match_operand:QI 2 "u5bit_cint_operand" "")])))] - "VECTOR_MEM_VSX_P (mode)" - "") - ;; Optimize cases were we can do a simple or direct move. ;; Or see if we can avoid doing the move at all -(define_insn "*vsx_extract__internal1" - [(set (match_operand: 0 "register_operand" "=d,,r,r") + +;; There are some unresolved problems with reload that show up if an Altivec +;; register was picked. Limit the scalar value to FPRs for now. 
+ +(define_insn "vsx_extract_" + [(set (match_operand: 0 "gpc_reg_operand" + "=d, wm, wo, d") + (vec_select: - (match_operand:VSX_D 1 "register_operand" "d,,,") + (match_operand:VSX_D 1 "gpc_reg_operand" + ", , , ") + (parallel - [(match_operand:QI 2 "vsx_scalar_64bit" "wD,wD,wD,wL")])))] - "VECTOR_MEM_VSX_P (mode) && TARGET_POWERPC64 && TARGET_DIRECT_MOVE" + [(match_operand:QI 2 "const_0_to_1_operand" + "wD, wD, wL, n")])))] + "VECTOR_MEM_VSX_P (mode)" { + int element = INTVAL (operands[2]); int op0_regno = REGNO (operands[0]); int op1_regno = REGNO (operands[1]); + int fldDM; - if (op0_regno == op1_regno) - return "nop"; - - if (INT_REGNO_P (op0_regno)) - return ((INTVAL (operands[2]) == VECTOR_ELEMENT_MFVSRLD_64BIT) - ? "mfvsrdl %0,%x1" - : "mfvsrd %0,%x1"); + gcc_assert (IN_RANGE (element, 0, 1)); + gcc_assert (VSX_REGNO_P (op1_regno)); - if (FP_REGNO_P (op0_regno) && FP_REGNO_P (op1_regno)) - return "fmr %0,%1"; + if (element == VECTOR_ELEMENT_SCALAR_64BIT) + { + if (op0_regno == op1_regno) + return ASM_COMMENT_START " vec_extract to same register"; - return "xxlor %x0,%x1,%x1"; -} - [(set_attr "type" "fp,vecsimple,mftgpr,mftgpr") - (set_attr "length" "4")]) + else if (INT_REGNO_P (op0_regno) && TARGET_DIRECT_MOVE + && TARGET_POWERPC64) + return "mfvsrd %0,%x1"; -(define_insn "*vsx_extract__internal2" - [(set (match_operand: 0 "vsx_register_operand" "=d,,") - (vec_select: - (match_operand:VSX_D 1 "vsx_register_operand" "d,wd,wd") - (parallel [(match_operand:QI 2 "u5bit_cint_operand" "wD,wD,i")])))] - "VECTOR_MEM_VSX_P (mode) - && (!TARGET_POWERPC64 || !TARGET_DIRECT_MOVE - || INTVAL (operands[2]) != VECTOR_ELEMENT_SCALAR_64BIT)" -{ - int fldDM; - gcc_assert (UINTVAL (operands[2]) <= 1); + else if (FP_REGNO_P (op0_regno) && FP_REGNO_P (op1_regno)) + return "fmr %0,%1"; - if (INTVAL (operands[2]) == VECTOR_ELEMENT_SCALAR_64BIT) - { - int op0_regno = REGNO (operands[0]); - int op1_regno = REGNO (operands[1]); + else if (VSX_REGNO_P (op0_regno)) + return "xxlor 
%x0,%x1,%x1"; - if (op0_regno == op1_regno) - return "nop"; + else + gcc_unreachable (); + } - if (FP_REGNO_P (op0_regno) && FP_REGNO_P (op1_regno)) - return "fmr %0,%1"; + else if (element == VECTOR_ELEMENT_MFVSRLD_64BIT && INT_REGNO_P (op0_regno) + && TARGET_P9_VECTOR && TARGET_POWERPC64 && TARGET_DIRECT_MOVE) + return "mfvsrdl %0,%x1"; - return "xxlor %x0,%x1,%x1"; + else if (VSX_REGNO_P (op0_regno)) + { + fldDM = element << 1; + if (!BYTES_BIG_ENDIAN) + fldDM = 3 - fldDM; + operands[3] = GEN_INT (fldDM); + return "xxpermdi %x0,%x1,%x1,%3"; } - fldDM = INTVAL (operands[2]) << 1; - if (!BYTES_BIG_ENDIAN) - fldDM = 3 - fldDM; - operands[3] = GEN_INT (fldDM); - return "xxpermdi %x0,%x1,%x1,%3"; + else + gcc_unreachable (); } - [(set_attr "type" "fp,vecsimple,vecperm") - (set_attr "length" "4")]) + [(set_attr "type" "vecsimple,mftgpr,mftgpr,vecperm")]) ;; Optimize extracting a single scalar element from memory if the scalar is in ;; the correct location to use a single load. diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 4a89f5f556c..78734786829 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -1009,6 +1009,7 @@ See RS/6000 and PowerPC Options. -mquad-memory-atomic -mno-quad-memory-atomic @gol -mcompat-align-parm -mno-compat-align-parm @gol -mupper-regs-df -mno-upper-regs-df -mupper-regs-sf -mno-upper-regs-sf @gol +-mupper-regs-di -mno-upper-regs-di @gol -mupper-regs -mno-upper-regs -mmodulo -mno-modulo @gol -mfloat128 -mno-float128 -mfloat128-hardware -mno-float128-hardware @gol -mpower9-fusion -mno-mpower9-fusion -mpower9-vector -mno-power9-vector @gol @@ -20255,6 +20256,17 @@ Generate code that uses (does not use) the atomic quad word memory instructions. The @option{-mquad-memory-atomic} option requires use of 64-bit mode. 
+@item -mupper-regs-di +@itemx -mno-upper-regs-di +@opindex mupper-regs-di +@opindex mno-upper-regs-di +Generate code that uses (does not use) the scalar instructions that +target all 64 registers in the vector/scalar floating point register +set that were added in version 2.06 of the PowerPC ISA when processing +integers. @option{-mupper-regs-di} is turned on by default if you use +any of the @option{-mcpu=power7}, @option{-mcpu=power8}, +@option{-mcpu=power9}, or @option{-mvsx} options. + @item -mupper-regs-df @itemx -mno-upper-regs-df @opindex mupper-regs-df @@ -20263,8 +20275,8 @@ Generate code that uses (does not use) the scalar double precision instructions that target all 64 registers in the vector/scalar floating point register set that were added in version 2.06 of the PowerPC ISA. @option{-mupper-regs-df} is turned on by default if you -use any of the @option{-mcpu=power7}, @option{-mcpu=power8}, or -@option{-mvsx} options. +use any of the @option{-mcpu=power7}, @option{-mcpu=power8}, +@option{-mcpu=power9}, or @option{-mvsx} options. @item -mupper-regs-sf @itemx -mno-upper-regs-sf @@ -20274,8 +20286,8 @@ Generate code that uses (does not use) the scalar single precision instructions that target all 64 registers in the vector/scalar floating point register set that were added in version 2.07 of the PowerPC ISA. @option{-mupper-regs-sf} is turned on by default if you -use either of the @option{-mcpu=power8} or @option{-mpower8-vector} -options. +use either of the @option{-mcpu=power8}, @option{-mpower8-vector}, or +@option{-mpower9} options. @item -mupper-regs @itemx -mno-upper-regs diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index 12fc7128343..1a52ff0688f 100644 --- a/gcc/doc/md.texi +++ b/gcc/doc/md.texi @@ -3211,6 +3211,9 @@ FP or VSX register to perform ISA 2.07 float ops or NO_REGS. @item wz Floating point register if the LFIWZX instruction is enabled or NO_REGS. +@item wB +Signed 5-bit constant integer that can be loaded into an altivec register. 
+
 @item wD
 Int constant that is the element number of the 64-bit scalar in a vector.
 
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index a10f14353f2..be12713dd9b 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2016-06-15 Michael Meissner
+
+	* gcc.target/powerpc/p9-dimode1.c: New test.
+	* gcc.target/powerpc/p9-dimode2.c: Likewise.
+
 2016-06-15 Jakub Jelinek
 
 	* gcc.c-torture/compile/20160615-1.c: New test.
diff --git a/gcc/testsuite/gcc.target/powerpc/p9-dimode1.c b/gcc/testsuite/gcc.target/powerpc/p9-dimode1.c
new file mode 100644
index 00000000000..6ba610ba938
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/p9-dimode1.c
@@ -0,0 +1,50 @@
+/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2 -mupper-regs-di" } */
+
+/* Verify P9 changes to allow DImode into Altivec registers, and generate
+   constants using XXSPLTIB. */
+
+#ifndef _ARCH_PPC64
+#error "This code is 64-bit."
+#endif
+
+double
+p9_zero (void)
+{
+  long l = 0;
+  double ret;
+
+  __asm__ ("xxlor %x0,%x1,%x1" : "=&d" (ret) : "wi" (l));
+
+  return ret;
+}
+
+double
+p9_plus_1 (void)
+{
+  long l = 1;
+  double ret;
+
+  __asm__ ("xxlor %x0,%x1,%x1" : "=&d" (ret) : "wi" (l));
+
+  return ret;
+}
+
+double
+p9_minus_1 (void)
+{
+  long l = -1;
+  double ret;
+
+  __asm__ ("xxlor %x0,%x1,%x1" : "=&d" (ret) : "wi" (l));
+
+  return ret;
+}
+
+/* { dg-final { scan-assembler "xxspltib" } } */
+/* { dg-final { scan-assembler-not "mtvsrd" } } */
+/* { dg-final { scan-assembler-not "lfd" } } */
+/* { dg-final { scan-assembler-not "ld" } } */
+/* { dg-final { scan-assembler-not "lxsd" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/p9-dimode2.c b/gcc/testsuite/gcc.target/powerpc/p9-dimode2.c
new file mode 100644
index 00000000000..0567a655277
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/p9-dimode2.c
@@ -0,0 +1,27 @@
+/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2 -mupper-regs-di" } */
+
+/* Verify that large integer constants are loaded via direct move instead of being
+   loaded from memory. */
+
+#ifndef _ARCH_PPC64
+#error "This code is 64-bit."
+#endif
+
+double
+p9_large (void)
+{
+  long l = 0x12345678;
+  double ret;
+
+  __asm__ ("xxlor %x0,%x1,%x1" : "=&d" (ret) : "wi" (l));
+
+  return ret;
+}
+
+/* { dg-final { scan-assembler "mtvsrd" } } */
+/* { dg-final { scan-assembler-not "ld" } } */
+/* { dg-final { scan-assembler-not "lfd" } } */
+/* { dg-final { scan-assembler-not "lxsd" } } */
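
Reviewer note (not part of the commit): the scalar CONST_INT checks that the xxspltib_constant_p hunk adds can be sketched outside the compiler as below. This is a rough, standalone approximation. The names scalar_const_uses_xxspltib and fits_signed_5bit are hypothetical. The boolean parameter stands in for the reg_addr[mode].addr_mask[RELOAD_REG_VMX] test. The -16..15 range for EASY_VECTOR_15 is an assumption about that macro. The rest of the real function, which this hunk does not show, is ignored.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for EASY_VECTOR_15: assumed to accept the signed
   5-bit range that VSPLTISW can splat directly.  */
static inline bool
fits_signed_5bit (long value)
{
  return value >= -16 && value <= 15;
}

/* Sketch of the scalar CONST_INT branch shown in the hunk.
   mode_ok_in_altivec is a hypothetical flag replacing the RELOAD_REG_VALID
   check on the mode's Altivec addressing mask.  */
static inline bool
scalar_const_uses_xxspltib (long value, bool mode_ok_in_altivec)
{
  /* XXSPLTIB only takes a signed 8-bit immediate.  */
  if (value < -128 || value > 127)
    return false;

  /* 0 and -1 skip the remaining checks in the hunk.  */
  if (value == 0 || value == -1)
    return true;

  /* Other values need the mode to be valid in Altivec registers.  */
  if (!mode_ok_in_altivec)
    return false;

  /* Prefer VSPLTISW/VUPKHSW when the constant fits in 5 bits.  */
  if (fits_signed_5bit (value))
    return false;

  return true;
}

/* A few sample classifications under the assumptions above.  */
static inline void
check_examples (void)
{
  assert (scalar_const_uses_xxspltib (0, false));
  assert (!scalar_const_uses_xxspltib (1, true));    /* VSPLTISW path */
  assert (scalar_const_uses_xxspltib (16, true));
  assert (!scalar_const_uses_xxspltib (128, true));  /* out of byte range */
}
```

Under these assumptions, the constant 1 in p9_plus_1 would take the VSPLTISW/VUPKHSW route, while 0 and -1 still qualify for XXSPLTIB, which fits the test only requiring that "xxspltib" appears somewhere in the output.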