From: Michael Meissner Date: Thu, 10 Nov 2016 19:38:33 +0000 (+0000) Subject: rs6000.c (rs6000_hard_regno_mode_ok): If ISA 3.0... X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=456f0dfa1c8975b3d456dc6ad06e998d8eed22ed;p=gcc.git rs6000.c (rs6000_hard_regno_mode_ok): If ISA 3.0... [gcc] 2016-11-10 Michael Meissner * config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): If ISA 3.0, enable HImode and QImode to go in vector registers by default if the -mvsx-small-integer option is enabled. (rs6000_secondary_reload_simple_move): Likewise. (rs6000_preferred_reload_class): Don't force integer constants to be loaded into vector registers that we can easily make into memory (or being created in the GPRs and moved over with direct move). * config/rs6000/vsx.md (UNSPEC_P9_MEMORY): Delete, no longer used. (vsx_extract_): Rework V4SImode, V8HImode, and V16QImode vector extraction on ISA 3.0 when the scalar integer can be allocated in vector registers. Generate the VEC_SELECT directy, and don't use UNSPEC's to avoid having the scalar type in a vector register. Make the expander target registers, and let the combiner fold in results storing to memory, if the machine supports stores. (vsx_extract__di): Likewise. (vsx_extract__p9): Likewise. (vsx_extract__di_p9): Likewise. (vsx_extract__store_p9): Likewise. (vsx_extract_si): Likewise. (vsx_extract__p8): Likewise. (p9_lxsizx): Delete, no longer used. (p9_stxsix): Likewise. * config/rs6000/rs6000.md (INT_ISA3): New mode iterator for integers in vector registers for ISA 3.0. (QHI): Update comment. (zero_extendqi2): Add support for ISA 3.0 scalar load or vector extract instructions in sign/zero extend. (zero_extendhi): Likewise. (extendqi): Likewise. (extendhi2): Likewise. (HImode splitter for load/sign extend in vector register): Likewise. (float2): Eliminate old method of optimizing floating point conversions to/from small data types and rewrite it to support QImode/HImode being allowed in vector registers on ISA 3.0. (float2_internal): Likewise. (floatuns2): Likewise. (floatuns2_internal): Likewise. (fix_trunc2): Likewise. (fix_trunc2_internal): Likewise. (fixuns_trunc2): Likewise. (fixuns_trunc2_internal): Likewise. VSPLITISW on ISA 2.07. (movhi_internal): Combine movhi_internal and movqi_internal into one mov_internal with an iterator. Add support for QImode and HImode being allowed in vector registers. Make large number of attributes and constraints easier to read. (movqi_internal): Likewise. (mov_internal): Likewise. (movdi_internal64): Fix constraint to allow loading -16..15 with VSPLITISW on ISA 2.07. (integer XXSPLTIB splitter): Add support for QI, HI, and SImode as well as DImode. [gcc/testsuite] 2016-11-10 Michael Meissner * gcc.target/powerpc/vsx-qimode.c: New test for QImode, HImode being allowed in vector registers. * gcc.target/powerpc/vsx-qimode2.c: Likewise. * gcc.target/powerpc/vsx-qimode3.c: Likewise. * gcc.target/powerpc/vsx-himode.c: Likewise. * gcc.target/powerpc/vsx-himode2.c: Likewise. * gcc.target/powerpc/vsx-himode3.c: Likewise. * gcc.target/powerpc/p9-extract-1.c: Change MFVSRD to just MFVSR, to allow matching MFVSRD or MFVSRW. From-SVN: r242048 --- diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 66ff3618c8e..5fed9715531 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,63 @@ +2016-11-10 Michael Meissner + + * config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): If ISA 3.0, + enable HImode and QImode to go in vector registers by default if + the -mvsx-small-integer option is enabled. 
+ (rs6000_secondary_reload_simple_move): Likewise. + (rs6000_preferred_reload_class): Don't force integer constants to + be loaded into vector registers that we can easily make into + memory (or being created in the GPRs and moved over with direct + move). + * config/rs6000/vsx.md (UNSPEC_P9_MEMORY): Delete, no longer + used. + (vsx_extract_): Rework V4SImode, V8HImode, and V16QImode + vector extraction on ISA 3.0 when the scalar integer can be + allocated in vector registers. Generate the VEC_SELECT directy, + and don't use UNSPEC's to avoid having the scalar type in a vector + register. Make the expander target registers, and let the + combiner fold in results storing to memory, if the machine + supports stores. + (vsx_extract__di): Likewise. + (vsx_extract__p9): Likewise. + (vsx_extract__di_p9): Likewise. + (vsx_extract__store_p9): Likewise. + (vsx_extract_si): Likewise. + (vsx_extract__p8): Likewise. + (p9_lxsizx): Delete, no longer used. + (p9_stxsix): Likewise. + * config/rs6000/rs6000.md (INT_ISA3): New mode iterator for + integers in vector registers for ISA 3.0. + (QHI): Update comment. + (zero_extendqi2): Add support for ISA 3.0 scalar load or + vector extract instructions in sign/zero extend. + (zero_extendhi): Likewise. + (extendqi): Likewise. + (extendhi2): Likewise. + (HImode splitter for load/sign extend in vector register): + Likewise. + (float2): Eliminate old method of + optimizing floating point conversions to/from small data types and + rewrite it to support QImode/HImode being allowed in vector + registers on ISA 3.0. + (float2_internal): Likewise. + (floatuns2): Likewise. + (floatuns2_internal): Likewise. + (fix_trunc2): Likewise. + (fix_trunc2_internal): Likewise. + (fixuns_trunc2): Likewise. + (fixuns_trunc2_internal): Likewise. + VSPLITISW on ISA 2.07. + (movhi_internal): Combine movhi_internal and movqi_internal into + one mov_internal with an iterator. Add support for QImode + and HImode being allowed in vector registers. Make large number + of attributes and constraints easier to read. + (movqi_internal): Likewise. + (mov_internal): Likewise. + (movdi_internal64): Fix constraint to allow loading -16..15 with + VSPLITISW on ISA 2.07. + (integer XXSPLTIB splitter): Add support for QI, HI, and SImode as + well as DImode. + 2016-11-10 Pat Haugen PR rtl-optimization/78241 diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index b0c2b2e69ee..ac0bcbdcd50 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -2019,8 +2019,14 @@ rs6000_hard_regno_mode_ok (int regno, machine_mode mode) if(GET_MODE_SIZE (mode) == UNITS_PER_FP_WORD) return 1; - if (TARGET_VSX_SMALL_INTEGER && mode == SImode) - return 1; + if (TARGET_VSX_SMALL_INTEGER) + { + if (mode == SImode) + return 1; + + if (TARGET_P9_VECTOR && (mode == HImode || mode == QImode)) + return 1; + } } if (PAIRED_SIMD_REGNO_P (regno) && TARGET_PAIRED_FLOAT @@ -3403,7 +3409,14 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p) reg_addr[SFmode].scalar_in_vmx_p = true; if (TARGET_VSX_SMALL_INTEGER) - reg_addr[SImode].scalar_in_vmx_p = true; + { + reg_addr[SImode].scalar_in_vmx_p = true; + if (TARGET_P9_VECTOR) + { + reg_addr[HImode].scalar_in_vmx_p = true; + reg_addr[QImode].scalar_in_vmx_p = true; + } + } } /* Setup the fusion operations. */ @@ -20606,8 +20619,14 @@ rs6000_secondary_reload_simple_move (enum rs6000_reg_type to_type, } /* ISA 2.07: MTVSRWZ or MFVSRWZ. 
*/ - if (TARGET_VSX_SMALL_INTEGER && mode == SImode) - return true; + if (TARGET_VSX_SMALL_INTEGER) + { + if (mode == SImode) + return true; + + if (TARGET_P9_VECTOR && (mode == HImode || mode == QImode)) + return true; + } /* ISA 2.07: MTVSRWZ or MFVSRWZ. */ if (mode == SDmode) @@ -21412,6 +21431,33 @@ rs6000_preferred_reload_class (rtx x, enum reg_class rclass) if (GET_CODE (x) == CONST_VECTOR && easy_vector_constant (x, mode)) return ALTIVEC_REGS; + /* If this is an integer constant that can easily be loaded into + vector registers, allow it. */ + if (CONST_INT_P (x)) + { + HOST_WIDE_INT value = INTVAL (x); + + /* ISA 2.07 can generate -1 in all registers with XXLORC. ISA + 2.06 can generate it in the Altivec registers with + VSPLTI. */ + if (value == -1) + { + if (TARGET_P8_VECTOR) + return rclass; + else if (rclass == ALTIVEC_REGS || rclass == VSX_REGS) + return ALTIVEC_REGS; + else + return NO_REGS; + } + + /* ISA 3.0 can load -128..127 using the XXSPLTIB instruction and + a sign extend in the Altivec registers. */ + if (IN_RANGE (value, -128, 127) && TARGET_P9_VECTOR + && TARGET_VSX_SMALL_INTEGER + && (rclass == ALTIVEC_REGS || rclass == VSX_REGS)) + return ALTIVEC_REGS; + } + /* Force constant to memory. */ return NO_REGS; } diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index d4095498981..b3fe92a899c 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -325,6 +325,9 @@ ; Any supported integer mode that fits in one register. (define_mode_iterator INT1 [QI HI SI (DI "TARGET_POWERPC64")]) +; Integer modes supported in VSX registers with ISA 3.0 instructions +(define_mode_iterator INT_ISA3 [QI HI SI DI]) + ; Everything we can extend QImode to. (define_mode_iterator EXTQI [SI (DI "TARGET_POWERPC64")]) @@ -334,7 +337,7 @@ ; Everything we can extend SImode to. (define_mode_iterator EXTSI [(DI "TARGET_POWERPC64")]) -; QImode or HImode for small atomic ops +; QImode or HImode for small integer moves and small atomic ops (define_mode_iterator QHI [QI HI]) ; QImode, HImode, SImode for fused ops only for GPR loads @@ -735,13 +738,15 @@ ;; complex forms. Basic data transfer is done later. 
(define_insn "zero_extendqi2" - [(set (match_operand:EXTQI 0 "gpc_reg_operand" "=r,r") - (zero_extend:EXTQI (match_operand:QI 1 "reg_or_mem_operand" "m,r")))] + [(set (match_operand:EXTQI 0 "gpc_reg_operand" "=r,r,?*wJwK,?*wK") + (zero_extend:EXTQI (match_operand:QI 1 "reg_or_mem_operand" "m,r,Z,*wK")))] "" "@ lbz%U1%X1 %0,%1 - rlwinm %0,%1,0,0xff" - [(set_attr "type" "load,shift")]) + rlwinm %0,%1,0,0xff + lxsibzx %x0,%y1 + vextractub %0,%1,7" + [(set_attr "type" "load,shift,fpload,vecperm")]) (define_insn_and_split "*zero_extendqi2_dot" [(set (match_operand:CC 2 "cc_reg_operand" "=x,?y") @@ -786,13 +791,15 @@ (define_insn "zero_extendhi2" - [(set (match_operand:EXTHI 0 "gpc_reg_operand" "=r,r") - (zero_extend:EXTHI (match_operand:HI 1 "reg_or_mem_operand" "m,r")))] + [(set (match_operand:EXTHI 0 "gpc_reg_operand" "=r,r,?*wJwK,?*wK") + (zero_extend:EXTHI (match_operand:HI 1 "reg_or_mem_operand" "m,r,Z,wK")))] "" "@ lhz%U1%X1 %0,%1 - rlwinm %0,%1,0,0xffff" - [(set_attr "type" "load,shift")]) + rlwinm %0,%1,0,0xffff + lxsihzx %x0,%y1 + vextractuh %0,%1,6" + [(set_attr "type" "load,shift,fpload,vecperm")]) (define_insn_and_split "*zero_extendhi2_dot" [(set (match_operand:CC 2 "cc_reg_operand" "=x,?y") @@ -893,11 +900,13 @@ (define_insn "extendqi2" - [(set (match_operand:EXTQI 0 "gpc_reg_operand" "=r") - (sign_extend:EXTQI (match_operand:QI 1 "gpc_reg_operand" "r")))] + [(set (match_operand:EXTQI 0 "gpc_reg_operand" "=r,?*wK") + (sign_extend:EXTQI (match_operand:QI 1 "gpc_reg_operand" "r,?*wK")))] "" - "extsb %0,%1" - [(set_attr "type" "exts")]) + "@ + extsb %0,%1 + vextsb2d %0,%1" + [(set_attr "type" "exts,vecperm")]) (define_insn_and_split "*extendqi2_dot" [(set (match_operand:CC 2 "cc_reg_operand" "=x,?y") @@ -942,20 +951,36 @@ (define_expand "extendhi2" - [(set (match_operand:EXTHI 0 "gpc_reg_operand" "") - (sign_extend:EXTHI (match_operand:HI 1 "gpc_reg_operand" "")))] + [(set (match_operand:EXTHI 0 "gpc_reg_operand") + (sign_extend:EXTHI (match_operand:HI 1 "gpc_reg_operand")))] "" "") (define_insn "*extendhi2" - [(set (match_operand:EXTHI 0 "gpc_reg_operand" "=r,r") - (sign_extend:EXTHI (match_operand:HI 1 "reg_or_mem_operand" "m,r")))] + [(set (match_operand:EXTHI 0 "gpc_reg_operand" "=r,r,?*wK,?*wK") + (sign_extend:EXTHI (match_operand:HI 1 "reg_or_mem_operand" "m,r,Z,wK")))] "rs6000_gen_cell_microcode" "@ lha%U1%X1 %0,%1 - extsh %0,%1" - [(set_attr "type" "load,exts") - (set_attr "sign_extend" "yes")]) + extsh %0,%1 + # + vextsh2d %0,%1" + [(set_attr "type" "load,exts,fpload,vecperm") + (set_attr "sign_extend" "yes") + (set_attr "length" "4,4,8,4")]) + +(define_split + [(set (match_operand:EXTHI 0 "altivec_register_operand") + (sign_extend:EXTHI + (match_operand:HI 1 "indexed_or_indirect_operand")))] + "TARGET_P9_VECTOR && reload_completed" + [(set (match_dup 2) + (match_dup 1)) + (set (match_dup 0) + (sign_extend:EXTHI (match_dup 2)))] +{ + operands[2] = gen_rtx_REG (HImode, REGNO (operands[1])); +}) (define_insn "*extendhi2_noload" [(set (match_operand:EXTHI 0 "gpc_reg_operand" "=r") @@ -5307,30 +5332,33 @@ (set_attr "type" "fp")]) ;; ISA 3.0 adds instructions lxsi[bh]zx to directly load QImode and HImode to -;; vector registers. At the moment, QI/HImode are not allowed in floating -;; point or vector registers, so we use UNSPEC's to use the load byte and -;; half-word instructions. +;; vector registers. These insns favor doing the sign/zero extension in +;; the vector registers, rather then loading up a GPR, doing a sign/zero +;; extension and then a direct move. 
(define_expand "float2" [(parallel [(set (match_operand:FP_ISA3 0 "vsx_register_operand") (float:FP_ISA3 (match_operand:QHI 1 "input_operand"))) (clobber (match_scratch:DI 2)) - (clobber (match_scratch:DI 3))])] - "TARGET_P9_VECTOR && TARGET_DIRECT_MOVE && TARGET_POWERPC64" + (clobber (match_scratch:DI 3)) + (clobber (match_scratch: 4))])] + "TARGET_P9_VECTOR && TARGET_DIRECT_MOVE && TARGET_POWERPC64 + && TARGET_VSX_SMALL_INTEGER" { if (MEM_P (operands[1])) operands[1] = rs6000_address_for_fpconvert (operands[1]); }) (define_insn_and_split "*float2_internal" - [(set (match_operand:FP_ISA3 0 "vsx_register_operand" "=,") + [(set (match_operand:FP_ISA3 0 "vsx_register_operand" "=,,") (float:FP_ISA3 - (match_operand:QHI 1 "reg_or_indexed_operand" "r,Z"))) - (clobber (match_scratch:DI 2 "=wi,v")) - (clobber (match_scratch:DI 3 "=r,X"))] + (match_operand:QHI 1 "reg_or_indexed_operand" "wK,r,Z"))) + (clobber (match_scratch:DI 2 "=wK,wi,wK")) + (clobber (match_scratch:DI 3 "=X,r,X")) + (clobber (match_scratch: 4 "=X,X,wK"))] "TARGET_P9_VECTOR && TARGET_DIRECT_MOVE && TARGET_POWERPC64 - && TARGET_UPPER_REGS_DI" + && TARGET_UPPER_REGS_DI && TARGET_VSX_SMALL_INTEGER" "#" "&& reload_completed" [(const_int 0)] @@ -5341,26 +5369,20 @@ if (!MEM_P (input)) { - rtx tmp = operands[3]; - emit_insn (gen_extenddi2 (tmp, input)); - emit_move_insn (di, tmp); + if (altivec_register_operand (input, mode)) + emit_insn (gen_extenddi2 (di, input)); + else + { + rtx tmp = operands[3]; + emit_insn (gen_extenddi2 (tmp, input)); + emit_move_insn (di, tmp); + } } else { - machine_mode vmode; - rtx di_vector; - - emit_insn (gen_p9_lxsizx (di, input)); - - if (mode == QImode) - vmode = V16QImode; - else if (mode == HImode) - vmode = V8HImode; - else - gcc_unreachable (); - - di_vector = gen_rtx_REG (vmode, REGNO (di)); - emit_insn (gen_vsx_sign_extend__di (di, di_vector)); + rtx tmp = operands[4]; + emit_move_insn (tmp, input); + emit_insn (gen_extenddi2 (di, tmp)); } emit_insn (gen_floatdi2 (result, di)); @@ -5368,24 +5390,26 @@ }) (define_expand "floatuns2" - [(parallel [(set (match_operand:FP_ISA3 0 "vsx_register_operand" "") + [(parallel [(set (match_operand:FP_ISA3 0 "vsx_register_operand") (unsigned_float:FP_ISA3 (match_operand:QHI 1 "input_operand" ""))) (clobber (match_scratch:DI 2 "")) (clobber (match_scratch:DI 3 ""))])] - "TARGET_P9_VECTOR && TARGET_DIRECT_MOVE && TARGET_POWERPC64" + "TARGET_P9_VECTOR && TARGET_DIRECT_MOVE && TARGET_POWERPC64 + && TARGET_VSX_SMALL_INTEGER" { if (MEM_P (operands[1])) operands[1] = rs6000_address_for_fpconvert (operands[1]); }) (define_insn_and_split "*floatuns2_internal" - [(set (match_operand:FP_ISA3 0 "vsx_register_operand" "=,") + [(set (match_operand:FP_ISA3 0 "vsx_register_operand" "=,,") (unsigned_float:FP_ISA3 - (match_operand:QHI 1 "reg_or_indexed_operand" "r,Z"))) - (clobber (match_scratch:DI 2 "=wi,wi")) - (clobber (match_scratch:DI 3 "=r,X"))] - "TARGET_P9_VECTOR && TARGET_DIRECT_MOVE && TARGET_POWERPC64" + (match_operand:QHI 1 "reg_or_indexed_operand" "wJwK,r,Z"))) + (clobber (match_scratch:DI 2 "=wK,wi,wJwK")) + (clobber (match_scratch:DI 3 "=X,r,X"))] + "TARGET_P9_VECTOR && TARGET_DIRECT_MOVE && TARGET_POWERPC64 + && TARGET_VSX_SMALL_INTEGER" "#" "&& reload_completed" [(const_int 0)] @@ -5393,15 +5417,15 @@ rtx result = operands[0]; rtx input = operands[1]; rtx di = operands[2]; - rtx tmp = operands[3]; - if (!MEM_P (input)) + if (MEM_P (input) || altivec_register_operand (input, mode)) + emit_insn (gen_zero_extenddi2 (di, input)); + else { + rtx tmp = operands[3]; 
emit_insn (gen_zero_extenddi2 (tmp, input)); emit_move_insn (di, tmp); } - else - emit_insn (gen_p9_lxsizx (di, input)); emit_insn (gen_floatdi2 (result, di)); DONE; @@ -5516,19 +5540,43 @@ [(set_attr "type" "fp")]) (define_expand "fix_trunc2" - [(use (match_operand:QHI 0 "rs6000_nonimmediate_operand" "")) - (use (match_operand:SFDF 1 "vsx_register_operand" ""))] - "TARGET_P9_VECTOR && TARGET_DIRECT_MOVE && TARGET_POWERPC64" + [(parallel [(set (match_operand: 0 "nonimmediate_operand") + (fix:QHI (match_operand:SFDF 1 "gpc_reg_operand"))) + (clobber (match_scratch:DI 2))])] + "TARGET_P9_VECTOR && TARGET_DIRECT_MOVE_64BIT + && TARGET_VSX_SMALL_INTEGER" +{ + if (MEM_P (operands[0])) + operands[0] = rs6000_address_for_fpconvert (operands[0]); +}) + +(define_insn_and_split "*fix_trunc2_internal" + [(set (match_operand: 0 "reg_or_indexed_operand" "=wIwJ,rZ") + (fix:QHI + (match_operand:SFDF 1 "gpc_reg_operand" ","))) + (clobber (match_scratch:DI 2 "=X,wi"))] + "TARGET_P9_VECTOR && TARGET_DIRECT_MOVE_64BIT + && TARGET_VSX_SMALL_INTEGER" + "#" + "&& reload_completed" + [(const_int 0)] { - rtx op0 = operands[0]; - rtx op1 = operands[1]; - rtx di_tmp = gen_reg_rtx (DImode); + rtx dest = operands[0]; + rtx src = operands[1]; - if (MEM_P (op0)) - op0 = rs6000_address_for_fpconvert (op0); + if (vsx_register_operand (dest, mode)) + { + rtx di_dest = gen_rtx_REG (DImode, REGNO (dest)); + emit_insn (gen_fix_truncdi2 (di_dest, src)); + } + else + { + rtx tmp = operands[2]; + rtx tmp2 = gen_rtx_REG (mode, REGNO (tmp)); - emit_insn (gen_fctiwz_ (di_tmp, op1)); - emit_insn (gen_p9_stxsix (op0, di_tmp)); + emit_insn (gen_fix_truncdi2 (tmp, src)); + emit_move_insn (dest, tmp2); + } DONE; }) @@ -5605,22 +5653,45 @@ [(set_attr "type" "fp")]) (define_expand "fixuns_trunc2" - [(use (match_operand:QHI 0 "rs6000_nonimmediate_operand" "")) - (use (match_operand:SFDF 1 "vsx_register_operand" ""))] - "TARGET_P9_VECTOR && TARGET_DIRECT_MOVE && TARGET_POWERPC64" + [(parallel [(set (match_operand: 0 "nonimmediate_operand") + (unsigned_fix:QHI (match_operand:SFDF 1 "gpc_reg_operand"))) + (clobber (match_scratch:DI 2))])] + "TARGET_P9_VECTOR && TARGET_DIRECT_MOVE_64BIT + && TARGET_VSX_SMALL_INTEGER" +{ + if (MEM_P (operands[0])) + operands[0] = rs6000_address_for_fpconvert (operands[0]); +}) + +(define_insn_and_split "*fixuns_trunc2_internal" + [(set (match_operand: 0 "reg_or_indexed_operand" "=wIwJ,rZ") + (unsigned_fix:QHI + (match_operand:SFDF 1 "gpc_reg_operand" ","))) + (clobber (match_scratch:DI 2 "=X,wi"))] + "TARGET_P9_VECTOR && TARGET_DIRECT_MOVE_64BIT + && TARGET_VSX_SMALL_INTEGER" + "#" + "&& reload_completed" + [(const_int 0)] { - rtx op0 = operands[0]; - rtx op1 = operands[1]; - rtx di_tmp = gen_reg_rtx (DImode); + rtx dest = operands[0]; + rtx src = operands[1]; - if (MEM_P (op0)) - op0 = rs6000_address_for_fpconvert (op0); + if (vsx_register_operand (dest, mode)) + { + rtx di_dest = gen_rtx_REG (DImode, REGNO (dest)); + emit_insn (gen_fixuns_truncdi2 (di_dest, src)); + } + else + { + rtx tmp = operands[2]; + rtx tmp2 = gen_rtx_REG (mode, REGNO (tmp)); - emit_insn (gen_fctiwuz_ (di_tmp, op1)); - emit_insn (gen_p9_stxsix (op0, di_tmp)); + emit_insn (gen_fixuns_truncdi2 (tmp, src)); + emit_move_insn (dest, tmp2); + } DONE; }) - ; Here, we use (set (reg) (unspec:DI [(fix:SI ...)] UNSPEC_FCTIWZ)) ; rather than (set (subreg:SI (reg)) (fix:SI ...)) ; because the first makes it clear that operand 0 is not live @@ -6643,8 +6714,8 @@ ;; Split loading -128..127 to use XXSPLITB and VEXTSW2D (define_split - [(set 
(match_operand:DI 0 "altivec_register_operand" "") - (match_operand:DI 1 "xxspltib_constant_split" ""))] + [(set (match_operand:DI 0 "altivec_register_operand") + (match_operand:DI 1 "xxspltib_constant_split"))] "TARGET_VSX_SMALL_INTEGER && TARGET_P9_VECTOR && reload_completed" [(const_int 0)] { @@ -6684,41 +6755,55 @@ (const_int 0)))] "") -(define_insn "*movhi_internal" - [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r,m,r,r,*c*l,*h") - (match_operand:HI 1 "input_operand" "r,m,r,i,*h,r,0"))] - "gpc_reg_operand (operands[0], HImode) - || gpc_reg_operand (operands[1], HImode)" - "@ - mr %0,%1 - lhz%U1%X1 %0,%1 - sth%U0%X0 %1,%0 - li %0,%w1 - mf%1 %0 - mt%0 %1 - nop" - [(set_attr "type" "*,load,store,*,mfjmpr,mtjmpr,*")]) - (define_expand "mov" [(set (match_operand:INT 0 "general_operand" "") (match_operand:INT 1 "any_operand" ""))] "" "{ rs6000_emit_move (operands[0], operands[1], mode); DONE; }") -(define_insn "*movqi_internal" - [(set (match_operand:QI 0 "nonimmediate_operand" "=r,r,m,r,r,*c*l,*h") - (match_operand:QI 1 "input_operand" "r,m,r,i,*h,r,0"))] - "gpc_reg_operand (operands[0], QImode) - || gpc_reg_operand (operands[1], QImode)" +;; MR LHZ/LBZ LXSI*ZX STH/STB STXSI*X LI +;; XXLOR load 0 load -1 VSPLTI* # MFVSRWZ +;; MTVSRWZ MF%1 MT%1 NOP +(define_insn "*mov_internal" + [(set (match_operand:QHI 0 "nonimmediate_operand" + "=r, r, ?*wJwK, m, Z, r, + ?*wJwK, ?*wJwK, ?*wJwK, ?*wK, ?*wK, r, + ?*wJwK, r, *c*l, *h") + + (match_operand:QHI 1 "input_operand" + "r, m, Z, r, wJwK, i, + wJwK, O, wM, wB, wS, ?*wJwK, + r, *h, r, 0"))] + + "gpc_reg_operand (operands[0], mode) + || gpc_reg_operand (operands[1], mode)" "@ mr %0,%1 - lbz%U1%X1 %0,%1 - stb%U0%X0 %1,%0 + lz%U1%X1 %0,%1 + lxsizx %x0,%y1 + st%U0%X0 %1,%0 + stxsix %1,%y0 li %0,%1 + xxlor %x0,%x1,%x1 + xxspltib %x0,0 + xxspltib %x0,255 + vspltis %0,%1 + # + mfvsrwz %0,%x1 + mtvsrwz %x0,%1 mf%1 %0 mt%0 %1 nop" - [(set_attr "type" "*,load,store,*,mfjmpr,mtjmpr,*")]) + [(set_attr "type" + "*, load, fpload, store, fpstore, *, + vecsimple, vecperm, vecperm, vecperm, vecperm, mftgpr, + mffgpr, mfjmpr, mtjmpr, *") + + (set_attr "length" + "4, 4, 4, 4, 4, 4, + 4, 4, 4, 4, 8, 4, + 4, 4, 4, 4")]) + ;; Here is how to move condition codes around. When we store CC data in ;; an integer register or memory, we store just the high-order 4 bits. @@ -8142,7 +8227,7 @@ xxlor %x0,%x1,%x1 xxspltib %x0,0 xxspltib %x0,255 - vspltisw %0,%1 + # xxlxor %x0,%x0,%x0 xxlorc %x0,%x0,%x0 # @@ -8236,9 +8321,11 @@ DONE; }) +;; Split integer constants that can be loaded with XXSPLTIB and a +;; sign extend operation. 
(define_split - [(set (match_operand:DI 0 "altivec_register_operand" "") - (match_operand:DI 1 "xxspltib_constant_split" ""))] + [(set (match_operand:INT_ISA3 0 "altivec_register_operand" "") + (match_operand:INT_ISA3 1 "xxspltib_constant_split" ""))] "TARGET_UPPER_REGS_DI && TARGET_P9_VECTOR && reload_completed" [(const_int 0)] { @@ -8248,7 +8335,15 @@ rtx op0_v16qi = gen_rtx_REG (V16QImode, r); emit_insn (gen_xxspltib_v16qi (op0_v16qi, op1)); - emit_insn (gen_vsx_sign_extend_qi_di (operands[0], op0_v16qi)); + if (mode == DImode) + emit_insn (gen_vsx_sign_extend_qi_di (operands[0], op0_v16qi)); + else if (mode == SImode) + emit_insn (gen_vsx_sign_extend_qi_si (operands[0], op0_v16qi)); + else if (mode == HImode) + { + rtx op0_v8hi = gen_rtx_REG (V8HImode, r); + emit_insn (gen_altivec_vupkhsb (op0_v8hi, op0_v16qi)); + } DONE; }) diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md index 2c74a8ebbe2..ebb0f6dc099 100644 --- a/gcc/config/rs6000/vsx.md +++ b/gcc/config/rs6000/vsx.md @@ -338,7 +338,6 @@ UNSPEC_VSX_XVCVDPSXDS UNSPEC_VSX_XVCVDPUXDS UNSPEC_VSX_SIGN_EXTEND - UNSPEC_P9_MEMORY UNSPEC_VSX_VSLO UNSPEC_VSX_EXTRACT UNSPEC_VSX_SXEXPDP @@ -2519,72 +2518,29 @@ ;; types are currently allowed in a vector register, so we extract to a DImode ;; and either do a direct move or store. (define_expand "vsx_extract_" - [(parallel [(set (match_operand: 0 "nonimmediate_operand") + [(parallel [(set (match_operand: 0 "gpc_reg_operand") (vec_select: (match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand") (parallel [(match_operand:QI 2 "const_int_operand")]))) - (clobber (match_dup 3))])] + (clobber (match_scratch:VSX_EXTRACT_I 3))])] "VECTOR_MEM_VSX_P (mode) && TARGET_DIRECT_MOVE_64BIT" { - machine_mode smode = ((mode != V4SImode && TARGET_VEXTRACTUB) - ? DImode : mode); - operands[3] = gen_rtx_SCRATCH (smode); -}) - -;; Under ISA 3.0, we can use the byte/half-word/word integer stores if we are -;; extracting a vector element and storing it to memory, rather than using -;; direct move to a GPR and a GPR store. -(define_insn_and_split "*vsx_extract__p9" - [(set (match_operand: 0 "nonimmediate_operand" "=r,Z") - (vec_select: - (match_operand:VSX_EXTRACT_I2 1 "gpc_reg_operand" "v,v") - (parallel [(match_operand:QI 2 "" "n,n")]))) - (clobber (match_scratch:DI 3 "=v,v"))] - "VECTOR_MEM_VSX_P (mode) && TARGET_VEXTRACTUB" - "#" - "&& (reload_completed || MEM_P (operands[0]))" - [(const_int 0)] -{ - rtx dest = operands[0]; - rtx src = operands[1]; - rtx element = operands[2]; - rtx di_tmp = operands[3]; - - if (GET_CODE (di_tmp) == SCRATCH) - di_tmp = gen_reg_rtx (DImode); - - emit_insn (gen_vsx_extract__di (di_tmp, src, element)); - - if (REG_P (dest)) - emit_move_insn (gen_rtx_REG (DImode, REGNO (dest)), di_tmp); - else if (SUBREG_P (dest)) - emit_move_insn (gen_rtx_REG (DImode, subreg_regno (dest)), di_tmp); - else if (MEM_P (operands[0])) + /* If we have ISA 3.0, we can do a xxextractuw/vextractu{b,h}. 
*/ + if (TARGET_VSX_SMALL_INTEGER && TARGET_P9_VECTOR) { - if (can_create_pseudo_p ()) - dest = rs6000_address_for_fpconvert (dest); - - if (mode == V16QImode) - emit_insn (gen_p9_stxsibx (dest, di_tmp)); - else if (mode == V8HImode) - emit_insn (gen_p9_stxsihx (dest, di_tmp)); - else - gcc_unreachable (); + emit_insn (gen_vsx_extract__p9 (operands[0], operands[1], + operands[2])); + DONE; } - else - gcc_unreachable (); - - DONE; -} - [(set_attr "type" "vecsimple,fpstore")]) +}) -(define_insn "vsx_extract__di" - [(set (match_operand:DI 0 "gpc_reg_operand" "=") - (zero_extend:DI - (vec_select: - (match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand" "") - (parallel [(match_operand:QI 2 "" "n")]))))] - "VECTOR_MEM_VSX_P (mode) && TARGET_VEXTRACTUB" +(define_insn "vsx_extract__p9" + [(set (match_operand: 0 "gpc_reg_operand" "=") + (vec_select: + (match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand" "") + (parallel [(match_operand:QI 2 "" "n")])))] + "VECTOR_MEM_VSX_P (mode) && TARGET_VEXTRACTUB + && TARGET_VSX_SMALL_INTEGER" { /* Note, the element number has already been adjusted for endianness, so we don't have to adjust it here. */ @@ -2599,13 +2555,51 @@ } [(set_attr "type" "vecsimple")]) +;; Optimize zero extracts to eliminate the AND after the extract. +(define_insn_and_split "*vsx_extract__di_p9" + [(set (match_operand:DI 0 "gpc_reg_operand" "=") + (zero_extend:DI + (vec_select: + (match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand" "") + (parallel [(match_operand:QI 2 "const_int_operand" "n")]))))] + "VECTOR_MEM_VSX_P (mode) && TARGET_VEXTRACTUB + && TARGET_VSX_SMALL_INTEGER" + "#" + "&& reload_completed" + [(set (match_dup 3) + (vec_select: + (match_dup 1) + (parallel [(match_dup 2)])))] +{ + operands[3] = gen_rtx_REG (mode, REGNO (operands[0])); +}) + +;; Optimize stores to use the ISA 3.0 scalar store instructions +(define_insn_and_split "*vsx_extract__store_p9" + [(set (match_operand: 0 "memory_operand" "=Z") + (vec_select: + (match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand" "") + (parallel [(match_operand:QI 2 "const_int_operand" "n")]))) + (clobber (match_scratch: 3 "="))] + "VECTOR_MEM_VSX_P (mode) && TARGET_VEXTRACTUB + && TARGET_VSX_SMALL_INTEGER" + "#" + "&& reload_completed" + [(set (match_dup 3) + (vec_select: + (match_dup 1) + (parallel [(match_dup 2)]))) + (set (match_dup 0) + (match_dup 3))]) + (define_insn_and_split "*vsx_extract_si" [(set (match_operand:SI 0 "nonimmediate_operand" "=r,wHwI,Z") (vec_select:SI (match_operand:V4SI 1 "gpc_reg_operand" "wJv,wJv,wJv") (parallel [(match_operand:QI 2 "const_0_to_3_operand" "n,n,n")]))) (clobber (match_scratch:V4SI 3 "=wJv,wJv,wJv"))] - "VECTOR_MEM_VSX_P (V4SImode) && TARGET_DIRECT_MOVE_64BIT" + "VECTOR_MEM_VSX_P (V4SImode) && TARGET_DIRECT_MOVE_64BIT + && (!TARGET_P9_VECTOR || !TARGET_VSX_SMALL_INTEGER)" "#" "&& reload_completed" [(const_int 0)] @@ -2624,10 +2618,10 @@ value = INTVAL (element); if (value != 1) { - if (TARGET_VEXTRACTUB) + if (TARGET_P9_VECTOR && TARGET_VSX_SMALL_INTEGER) { - rtx di_tmp = gen_rtx_REG (DImode, REGNO (vec_tmp)); - emit_insn (gen_vsx_extract_v4si_di (di_tmp,src, element)); + rtx si_tmp = gen_rtx_REG (SImode, REGNO (vec_tmp)); + emit_insn (gen_vsx_extract_v4si_p9 (si_tmp,src, element)); } else emit_insn (gen_altivec_vspltw_direct (vec_tmp, src, element)); @@ -2663,7 +2657,8 @@ (match_operand:VSX_EXTRACT_I2 1 "gpc_reg_operand" "v") (parallel [(match_operand:QI 2 "" "n")]))) (clobber (match_scratch:VSX_EXTRACT_I2 3 "=v"))] - "VECTOR_MEM_VSX_P (mode) && TARGET_DIRECT_MOVE_64BIT" + "VECTOR_MEM_VSX_P (mode) && 
TARGET_DIRECT_MOVE_64BIT + && (!TARGET_P9_VECTOR || !TARGET_VSX_SMALL_INTEGER)" "#" "&& reload_completed" [(const_int 0)] @@ -3253,26 +3248,6 @@ [(set_attr "type" "vecexts")]) -;; ISA 3.0 memory operations -(define_insn "p9_lxsizx" - [(set (match_operand:DI 0 "vsx_register_operand" "=wi") - (unspec:DI [(zero_extend:DI - (match_operand:QHI 1 "indexed_or_indirect_operand" "Z"))] - UNSPEC_P9_MEMORY))] - "TARGET_P9_VECTOR" - "lxsizx %x0,%y1" - [(set_attr "type" "fpload")]) - -(define_insn "p9_stxsix" - [(set (match_operand:QHI 0 "reg_or_indexed_operand" "=r,Z") - (unspec:QHI [(match_operand:DI 1 "vsx_register_operand" "wi,wi")] - UNSPEC_P9_MEMORY))] - "TARGET_P9_VECTOR" - "@ - mfvsrd %0,%x1 - stxsix %x1,%y0" - [(set_attr "type" "mffgpr,fpstore")]) - ;; ISA 3.0 Binary Floating-Point Support ;; VSX Scalar Extract Exponent Double-Precision diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index b3f944221f5..6c82dbf2812 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,15 @@ +2016-11-10 Michael Meissner + + * gcc.target/powerpc/vsx-qimode.c: New test for QImode, HImode + being allowed in vector registers. + * gcc.target/powerpc/vsx-qimode2.c: Likewise. + * gcc.target/powerpc/vsx-qimode3.c: Likewise. + * gcc.target/powerpc/vsx-himode.c: Likewise. + * gcc.target/powerpc/vsx-himode2.c: Likewise. + * gcc.target/powerpc/vsx-himode3.c: Likewise. + * gcc.target/powerpc/p9-extract-1.c: Change MFVSRD to just MFVSR, + to allow matching MFVSRD or MFVSRW. + 2016-11-10 Pat Haugen PR rtl-optimization/78241 diff --git a/gcc/testsuite/gcc.target/powerpc/p9-extract-1.c b/gcc/testsuite/gcc.target/powerpc/p9-extract-1.c index 1aefc8f3bf1..fceb334195e 100644 --- a/gcc/testsuite/gcc.target/powerpc/p9-extract-1.c +++ b/gcc/testsuite/gcc.target/powerpc/p9-extract-1.c @@ -17,7 +17,7 @@ int extract_schar_3 (vector signed char a) { return vec_extract (a, 15); } /* { dg-final { scan-assembler "vextractub" } } */ /* { dg-final { scan-assembler "vextractuh" } } */ /* { dg-final { scan-assembler "xxextractuw" } } */ -/* { dg-final { scan-assembler "mfvsrd" } } */ +/* { dg-final { scan-assembler "mfvsr" } } */ /* { dg-final { scan-assembler-not "stxvd2x" } } */ /* { dg-final { scan-assembler-not "stxv" } } */ /* { dg-final { scan-assembler-not "lwa" } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-himode.c b/gcc/testsuite/gcc.target/powerpc/vsx-himode.c new file mode 100644 index 00000000000..883864e5885 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vsx-himode.c @@ -0,0 +1,22 @@ +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ +/* { dg-require-effective-target powerpc_p9vector_ok } */ +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */ +/* { dg-options "-mcpu=power9 -O2 -mvsx-small-integer" } */ + +double load_asm_d_constraint (short *p) +{ + double ret; + __asm__ ("xxlor %x0,%x1,%x1\t# load d constraint" : "=d" (ret) : "d" (*p)); + return ret; +} + +void store_asm_d_constraint (short *p, double x) +{ + short i; + __asm__ ("xxlor %x0,%x1,%x1\t# store d constraint" : "=d" (i) : "d" (x)); + *p = i; +} + +/* { dg-final { scan-assembler "lxsihzx" } } */ +/* { dg-final { scan-assembler "stxsihx" } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-himode2.c b/gcc/testsuite/gcc.target/powerpc/vsx-himode2.c new file mode 100644 index 00000000000..96de758e988 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vsx-himode2.c @@ -0,0 +1,15 @@ +/* { dg-do 
compile { target { powerpc*-*-* && lp64 } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ +/* { dg-require-effective-target powerpc_p9vector_ok } */ +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */ +/* { dg-options "-mcpu=power9 -O2 -mvsx-small-integer" } */ + +unsigned int foo (unsigned short u) +{ + unsigned int ret; + __asm__ ("xxlor %x0,%x1,%x1\t# v, v constraints" : "=v" (ret) : "v" (u)); + return ret; +} + +/* { dg-final { scan-assembler "mtvsrwz" } } */ +/* { dg-final { scan-assembler "mfvsrwz" } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-himode3.c b/gcc/testsuite/gcc.target/powerpc/vsx-himode3.c new file mode 100644 index 00000000000..2f4a858d3d8 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vsx-himode3.c @@ -0,0 +1,22 @@ +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ +/* { dg-require-effective-target powerpc_p9vector_ok } */ +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */ +/* { dg-options "-mcpu=power9 -O2 -mvsx-small-integer" } */ + +double load_asm_v_constraint (short *p) +{ + double ret; + __asm__ ("xxlor %x0,%x1,%x1\t# load v constraint" : "=d" (ret) : "v" (*p)); + return ret; +} + +void store_asm_v_constraint (short *p, double x) +{ + short i; + __asm__ ("xxlor %x0,%x1,%x1\t# store v constraint" : "=v" (i) : "d" (x)); + *p = i; +} + +/* { dg-final { scan-assembler "lxsihzx" } } */ +/* { dg-final { scan-assembler "stxsihx" } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-qimode.c b/gcc/testsuite/gcc.target/powerpc/vsx-qimode.c new file mode 100644 index 00000000000..eb82c56c29b --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vsx-qimode.c @@ -0,0 +1,22 @@ +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ +/* { dg-require-effective-target powerpc_p9vector_ok } */ +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */ +/* { dg-options "-mcpu=power9 -O2 -mvsx-small-integer" } */ + +double load_asm_d_constraint (signed char *p) +{ + double ret; + __asm__ ("xxlor %x0,%x1,%x1\t# load d constraint" : "=d" (ret) : "d" (*p)); + return ret; +} + +void store_asm_d_constraint (signed char *p, double x) +{ + signed char i; + __asm__ ("xxlor %x0,%x1,%x1\t# store d constraint" : "=d" (i) : "d" (x)); + *p = i; +} + +/* { dg-final { scan-assembler "lxsibzx" } } */ +/* { dg-final { scan-assembler "stxsibx" } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-qimode2.c b/gcc/testsuite/gcc.target/powerpc/vsx-qimode2.c new file mode 100644 index 00000000000..02aa2072836 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vsx-qimode2.c @@ -0,0 +1,15 @@ +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ +/* { dg-require-effective-target powerpc_p9vector_ok } */ +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */ +/* { dg-options "-mcpu=power9 -O2 -mvsx-small-integer" } */ + +unsigned int foo (unsigned char u) +{ + unsigned int ret; + __asm__ ("xxlor %x0,%x1,%x1\t# v, v constraints" : "=v" (ret) : "v" (u)); + return ret; +} + +/* { dg-final { scan-assembler "mtvsrwz" } } */ +/* { dg-final { scan-assembler "mfvsrwz" } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-qimode3.c b/gcc/testsuite/gcc.target/powerpc/vsx-qimode3.c new file mode 
100644 index 00000000000..0e1da329105 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vsx-qimode3.c @@ -0,0 +1,22 @@ +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ +/* { dg-require-effective-target powerpc_p9vector_ok } */ +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */ +/* { dg-options "-mcpu=power9 -O2 -mvsx-small-integer" } */ + +double load_asm_v_constraint (signed char *p) +{ + double ret; + __asm__ ("xxlor %x0,%x1,%x1\t# load v constraint" : "=d" (ret) : "v" (*p)); + return ret; +} + +void store_asm_v_constraint (signed char *p, double x) +{ + signed char i; + __asm__ ("xxlor %x0,%x1,%x1\t# store v constraint" : "=v" (i) : "d" (x)); + *p = i; +} + +/* { dg-final { scan-assembler "lxsibzx" } } */ +/* { dg-final { scan-assembler "stxsibx" } } */
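
The new vsx-qimode*/vsx-himode* tests above pin down the asm-constraint behaviour; as a further illustration (not part of the commit, file and function names are hypothetical), here is a minimal C sketch of the element-extract-and-store case that the reworked vsx_extract patterns in vsx.md target. Assuming -mcpu=power9 -O2 -mvsx-small-integer, the intent described in the ChangeLog is that the extract stays in the vector registers (vextractub/vextractuh) and the store uses the scalar byte/halfword store forms (stxsibx/stxsihx) instead of a direct move to a GPR followed by stb/sth.

/* Illustrative sketch only -- not part of this commit.  Assumes the same
   options as the tests above: -mcpu=power9 -O2 -mvsx-small-integer.  */
#include <altivec.h>

void
store_uchar_element (unsigned char *p, vector unsigned char v)
{
  /* Expected to become vextractub + stxsibx, with no GPR round trip.  */
  *p = vec_extract (v, 5);
}

void
store_ushort_element (unsigned short *p, vector unsigned short v)
{
  /* Likewise, vextractuh + stxsihx.  */
  *p = vec_extract (v, 3);
}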
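
Similarly, a hedged sketch (again hypothetical, not taken from the commit) of the small-integer/floating-point conversions whose expanders were rewritten in rs6000.md: per the new comment in that file, the sign/zero extension is now favoured in the vector registers rather than loading a GPR, extending there, and doing a direct move.

/* Illustrative sketch only -- not part of this commit; assumes
   -mcpu=power9 -O2 -mvsx-small-integer.  */
double
short_to_double (const short *p)
{
  /* The rewritten float expanders aim to keep this entirely in the VSX
     registers: vector halfword load and sign extend feeding the convert,
     instead of lha + direct move + convert.  */
  return (double) *p;
}

unsigned char
double_to_uchar (double x)
{
  /* The rewritten fixuns_trunc expanders convert in a vector register and
     only move the result out when it is wanted in a GPR or memory, as it
     is here for the return value.  */
  return (unsigned char) x;
}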