From: Ilya Leoshkevich
Date: Mon, 21 Sep 2020 11:31:05 +0000 (+0200)
Subject: IBM Z: Store long doubles in vector registers when possible
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=e627cda56865;p=gcc.git

IBM Z: Store long doubles in vector registers when possible

On z14+, there are instructions for working with 128-bit floats (long doubles) in vector registers. It's beneficial to use them instead of instructions that operate on floating point register pairs, because this allows storing 4 times more data in registers at a time, relieving register pressure. The raw performance of the new instructions is almost the same as that of the old ones.

Implement by storing TFmode values in vector registers on z14+. Since not all operations are available with the new instructions, keep the old ones available using the new FPRX2 mode, and convert between it and TFmode when necessary (the expanders that do this are called "forwarder" expanders below). Change the existing TFmode expanders to call either new- or old-style ones depending on whether we are on z14+ or older machines ("dispatcher" expanders).

gcc/ChangeLog: 2020-11-03 Ilya Leoshkevich * config/s390/s390-modes.def (FPRX2): New mode. * config/s390/s390-protos.h (s390_fma_allowed_p): New function. * config/s390/s390.c (s390_fma_allowed_p): Likewise. (s390_build_signbit_mask): Support 128-bit masks. (print_operand): Support printing the second word of a TFmode operand as vector register. (constant_modes): Add FPRX2mode. (s390_class_max_nregs): Return 1 for TFmode on z14+. (s390_is_fpr128): New function. (s390_is_vr128): Likewise. (s390_can_change_mode_class): Use s390_is_fpr128 and s390_is_vr128 in order to determine whether mode refers to a FPR pair or to a VR. (s390_emit_compare): Force TFmode operands into registers on z14+. * config/s390/s390.h (HAVE_TF): New macro. (EXPAND_MOVTF): New macro. (EXPAND_TF): Likewise. * config/s390/s390.md (PFPO_OP_TYPE_FPRX2): PFPO_OP_TYPE_TF alias. (ALL): Add FPRX2. 
(FP_ALL): Add FPRX2 for z14+, restrict TFmode to z13-. (FP): Likewise. (FP_ANYTF): New mode iterator. (BFP): Add FPRX2 for z14+, restrict TFmode to z13-. (TD_TF): Likewise. (xde): Add FPRX2. (nBFP): Likewise. (nDFP): Likewise. (DSF): Likewise. (DFDI): Likewise. (SFSI): Likewise. (DF): Likewise. (SF): Likewise. (fT0): Likewise. (bt): Likewise. (_d): Likewise. (HALF_TMODE): Likewise. (tf_fpr): New mode_attr. (type): New mode_attr. (*cmp_ccz_0): Use type instead of mode with fsimp. (*cmp_ccs_0_fastmath): Likewise. (*cmptf_ccs): New pattern for wfcxb. (*cmptf_ccsfps): New pattern for wfkxb. (mov): Rename to mov. (signbit2): Rename to signbit2. (isinf2): Renamed to isinf2. (*TDC_insn_): Use type instead of mode with fsimp. (fixuns_trunc2): Rename to fixuns_trunc2. (fix_trunctf2): Rename to fix_trunctf2_fpr. (floatdi2): Rename to floatdi2, use type instead of mode with itof. (floatsi2): Rename to floatsi2, use type instead of mode with itof. (*floatuns2): Use type instead of mode for itof. (floatuns2): Rename to floatuns2. (trunctf2): Rename to trunctf2_fpr, use type instead of mode with fsimp. (extend2): Rename to extend2. (2): Rename to 2, use type instead of mode with fsimp. (rint2): Rename to rint2, use type instead of mode with fsimp. (2): Use type instead of mode for fsimp. (rint2): Likewise. (trunc2): Rename to trunc2. (trunc2): Rename to trunc2. (extend2): Rename to extend2. (extend2): Rename to extend2. (add3): Rename to add3, use type instead of mode with fsimp. (*add3_cc): Use type instead of mode with fsimp. (*add3_cconly): Likewise. (sub3): Rename to sub3, use type instead of mode with fsimp. (*sub3_cc): Use type instead of mode with fsimp. (*sub3_cconly): Likewise. (mul3): Rename to mul3, use type instead of mode with fsimp. (fma4): Restrict using s390_fma_allowed_p. (fms4): Restrict using s390_fma_allowed_p. (div3): Rename to div3, use type instead of mode with fdiv. (neg2): Rename to neg2. (*neg2_cc): Use type instead of mode with fsimp. 
(*neg2_cconly): Likewise. (*neg2_nocc): Likewise. (*neg2): Likewise. (abs2): Rename to abs2, use type instead of mode with fdiv. (*abs2_cc): Use type instead of mode with fsimp. (*abs2_cconly): Likewise. (*abs2_nocc): Likewise. (*abs2): Likewise. (*negabs2_cc): Likewise. (*negabs2_cconly): Likewise. (*negabs2_nocc): Likewise. (*negabs2): Likewise. (sqrt2): Rename to sqrt2, use type instead of mode with fsqrt. (cbranch4): Use FP_ANYTF instead of FP. (copysign3): Rename to copysign3, use type instead of mode with fsimp. * config/s390/s390.opt (flag_vx_long_double_fma): New undocumented option. * config/s390/vector.md (V_HW): Add TF for z14+. (V_HW2): Likewise. (VFT): Likewise. (VF_HW): Likewise. (V_128): Likewise. (tf_vr): New mode_attr. (tointvec): Add TF. (mov): Rename to mov. (movtf): New dispatcher. (*vec_tf_to_v1tf): Rename to *vec_tf_to_v1tf_fpr, restrict to z13-. (*vec_tf_to_v1tf_vr): New pattern for z14+. (*fprx2_to_tf): Likewise. (*mov_tf_to_fprx2_0): Likewise. (*mov_tf_to_fprx2_1): Likewise. (add3): Rename to add3. (addtf3): New dispatcher. (sub3): Rename to sub3. (subtf3): New dispatcher. (mul3): Rename to mul3. (multf3): New dispatcher. (div3): Rename to div3. (divtf3): New dispatcher. (sqrt2): Rename to sqrt2. (sqrttf2): New dispatcher. (fma4): Restrict using s390_fma_allowed_p. (fms4): Likewise. (neg_fma4): Likewise. (neg_fms4): Likewise. (neg2): Rename to neg2. (negtf2): New dispatcher. (abs2): Rename to abs2. (abstf2): New dispatcher. (floattf2_vr): New forwarder. (floattf2): New dispatcher. (floatunstf2_vr): New forwarder. (floatunstf2): New dispatcher. (fix_trunctf2_vr): New forwarder. (fix_trunctf2): New dispatcher. (fixuns_trunctf2_vr): New forwarder. (fixuns_trunctf2): New dispatcher. (2): New pattern. (tf2): New forwarder. (rint2): New pattern. (rinttf2): New forwarder. (*trunctfdf2_vr): New pattern. (trunctfdf2_vr): New forwarder. (trunctfdf2): New dispatcher. (trunctfsf2_vr): New forwarder. (trunctfsf2): New dispatcher. 
(extenddftf2_vr): New pattern. (extenddftf2): New dispatcher. (extendsftf2_vr): New forwarder. (extendsftf2): New dispatcher. (signbittf2_vr): New forwarder. (signbittf2): New dispatcher. (isinftf2_vr): New forwarder. (isinftf2): New dispatcher. * config/s390/vx-builtins.md (*vftci_cconly): Use VF_HW instead of VECF_HW, add missing constraint, add vw support. (vftci_intcconly): Use VF_HW instead of VECF_HW. (*vftci): Rename to vftci, use VF_HW instead of VECF_HW, add vw support. (vftci_intcc): Use VF_HW instead of VECF_HW. --- diff --git a/gcc/config/s390/s390-modes.def b/gcc/config/s390/s390-modes.def index b1f8e1fc9e3..316ca5cf58b 100644 --- a/gcc/config/s390/s390-modes.def +++ b/gcc/config/s390/s390-modes.def @@ -22,9 +22,12 @@ along with GCC; see the file COPYING3. If not see /* 256-bit integer mode is needed for STACK_SAVEAREA_MODE. */ INT_MODE (OI, 32); -/* Define TFmode to work around reload problem PR 20927. */ +/* 128-bit float stored in a VR on z14+ or a FPR pair on older machines. */ FLOAT_MODE (TF, 16, ieee_quad_format); +/* 128-bit float stored in a FPR pair. */ +FLOAT_MODE (FPRX2, 16, ieee_quad_format); + /* Add any extra modes needed to represent the condition code. 
*/ /* diff --git a/gcc/config/s390/s390-protos.h b/gcc/config/s390/s390-protos.h index 029f7289fac..ad2f7f77c18 100644 --- a/gcc/config/s390/s390-protos.h +++ b/gcc/config/s390/s390-protos.h @@ -51,6 +51,7 @@ extern bool s390_hard_regno_rename_ok (unsigned int, unsigned int); extern int s390_class_max_nregs (enum reg_class, machine_mode); extern bool s390_function_arg_vector (machine_mode, const_tree); extern bool s390_return_addr_from_memory(void); +extern bool s390_fma_allowed_p (machine_mode); #if S390_USE_TARGET_ATTRIBUTE extern tree s390_valid_target_attribute_tree (tree args, struct gcc_options *opts, diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c index 847cedde674..2300a517b64 100644 --- a/gcc/config/s390/s390.c +++ b/gcc/config/s390/s390.c @@ -456,6 +456,16 @@ s390_return_addr_from_memory () return cfun_gpr_save_slot(RETURN_REGNUM) == SAVE_SLOT_STACK; } +/* Return nonzero if it's OK to use fused multiply-add for MODE. */ +bool +s390_fma_allowed_p (machine_mode mode) +{ + if (TARGET_VXE && mode == TFmode) + return flag_vx_long_double_fma; + + return true; +} + /* Indicate which ABI has been used for passing vector args. 0 - no vector type arguments have been passed where the ABI is relevant 1 - the old ABI has been used @@ -1850,6 +1860,10 @@ s390_emit_compare (enum rtx_code code, rtx op0, rtx op1) machine_mode mode = s390_select_ccmode (code, op0, op1); rtx cc; + /* Force OP1 into register in order to satisfy VXE TFmode patterns. 
*/ + if (TARGET_VXE && GET_MODE (op1) == TFmode) + op1 = force_reg (TFmode, op1); + if (GET_MODE_CLASS (GET_MODE (op0)) == MODE_CC) { /* Do not output a redundant compare instruction if a @@ -6959,6 +6973,13 @@ s390_expand_vec_init (rtx target, rtx vals) extern rtx s390_build_signbit_mask (machine_mode mode) { + if (mode == TFmode && TARGET_VXE) + { + wide_int mask_val = wi::set_bit_in_zero (127, 128); + rtx mask = immed_wide_int_const (mask_val, TImode); + return gen_lowpart (TFmode, mask); + } + /* Generate the integral element mask value. */ machine_mode inner_mode = GET_MODE_INNER (mode); int inner_bitsize = GET_MODE_BITSIZE (inner_mode); @@ -7902,6 +7923,7 @@ print_operand_address (FILE *file, rtx addr) CONST_VECTOR: Generate a bitmask for vgbm instruction. 'x': print integer X as if it's an unsigned halfword. 'v': print register number as vector register (v1 instead of f1). + 'V': print the second word of a TFmode operand as vector register. */ void @@ -8071,13 +8093,13 @@ print_operand (FILE *file, rtx x, int code) case REG: /* Print FP regs as fx instead of vx when they are accessed through non-vector mode. */ - if (code == 'v' + if ((code == 'v' || code == 'V') || VECTOR_NOFP_REG_P (x) || (FP_REG_P (x) && VECTOR_MODE_P (GET_MODE (x))) || (VECTOR_REG_P (x) && (GET_MODE_SIZE (GET_MODE (x)) / s390_class_max_nregs (FP_REGS, GET_MODE (x))) > 8)) - fprintf (file, "%%v%s", reg_names[REGNO (x)] + 2); + fprintf (file, "%%v%s", reg_names[REGNO (x) + (code == 'V')] + 2); else fprintf (file, "%s", reg_names[REGNO (x)]); break; @@ -8623,7 +8645,7 @@ replace_constant_pool_ref (rtx_insn *insn, rtx ref, rtx offset) static machine_mode constant_modes[] = { - TFmode, TImode, TDmode, + TFmode, FPRX2mode, TImode, TDmode, V16QImode, V8HImode, V4SImode, V2DImode, V1TImode, V4SFmode, V2DFmode, V1TFmode, DFmode, DImode, DDmode, @@ -10418,7 +10440,8 @@ s390_class_max_nregs (enum reg_class rclass, machine_mode mode) full VRs. 
*/ if (TARGET_VX && SCALAR_FLOAT_MODE_P (mode) - && GET_MODE_SIZE (mode) >= 16) + && GET_MODE_SIZE (mode) >= 16 + && !(TARGET_VXE && mode == TFmode)) reg_pair_required_p = true; /* Even if complex types would fit into a single FPR/VR we force @@ -10441,6 +10464,24 @@ s390_class_max_nregs (enum reg_class rclass, machine_mode mode) return (GET_MODE_SIZE (mode) + reg_size - 1) / reg_size; } +/* Return nonzero if mode M describes a 128-bit float in a floating point + register pair. */ + +static bool +s390_is_fpr128 (machine_mode m) +{ + return m == FPRX2mode || (!TARGET_VXE && m == TFmode); +} + +/* Return nonzero if mode M describes a 128-bit float in a vector + register. */ + +static bool +s390_is_vr128 (machine_mode m) +{ + return m == V1TFmode || (TARGET_VXE && m == TFmode); +} + /* Implement TARGET_CAN_CHANGE_MODE_CLASS. */ static bool @@ -10451,11 +10492,11 @@ s390_can_change_mode_class (machine_mode from_mode, machine_mode small_mode; machine_mode big_mode; - /* V1TF and TF have different representations in vector - registers. */ + /* 128-bit values have different representations in floating point and + vector registers. */ if (reg_classes_intersect_p (VEC_REGS, rclass) - && ((from_mode == V1TFmode && to_mode == TFmode) - || (from_mode == TFmode && to_mode == V1TFmode))) + && ((s390_is_fpr128 (from_mode) && s390_is_vr128 (to_mode)) + || (s390_is_vr128 (from_mode) && s390_is_fpr128 (to_mode)))) return false; if (GET_MODE_SIZE (from_mode) == GET_MODE_SIZE (to_mode)) diff --git a/gcc/config/s390/s390.h b/gcc/config/s390/s390.h index ec5128c0af2..8c028317b6b 100644 --- a/gcc/config/s390/s390.h +++ b/gcc/config/s390/s390.h @@ -1186,5 +1186,40 @@ struct GTY(()) machine_function #define TARGET_INDIRECT_BRANCH_TABLE s390_indirect_branch_table +#ifdef GENERATOR_FILE +/* gencondmd.c is built before insn-flags.h. */ +#define HAVE_TF(icode) true +#else +#define HAVE_TF(icode) (HAVE_##icode##_fpr || HAVE_##icode##_vr) +#endif + +/* Dispatcher for movtf. 
*/ +#define EXPAND_MOVTF(icode) \ + do \ + { \ + if (TARGET_VXE) \ + emit_insn (gen_##icode##_vr (operands[0], operands[1])); \ + else \ + emit_insn (gen_##icode##_fpr (operands[0], operands[1])); \ + DONE; \ + } \ + while (false) + +/* Like EXPAND_MOVTF, but also legitimizes operands. */ +#define EXPAND_TF(icode, nops) \ + do \ + { \ + const size_t __nops = (nops); \ + expand_operand ops[__nops]; \ + create_output_operand (&ops[0], operands[0], GET_MODE (operands[0])); \ + for (size_t i = 1; i < __nops; i++) \ + create_input_operand (&ops[i], operands[i], GET_MODE (operands[i])); \ + if (TARGET_VXE) \ + expand_insn (CODE_FOR_##icode##_vr, __nops, ops); \ + else \ + expand_insn (CODE_FOR_##icode##_fpr, __nops, ops); \ + DONE; \ + } \ + while (false) #endif /* S390_H */ diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md index 050374980ae..a2c033b2515 100644 --- a/gcc/config/s390/s390.md +++ b/gcc/config/s390/s390.md @@ -405,6 +405,7 @@ (PFPO_OP_TYPE_SF 0x5) (PFPO_OP_TYPE_DF 0x6) (PFPO_OP_TYPE_TF 0x7) + (PFPO_OP_TYPE_FPRX2 0x7) (PFPO_OP_TYPE_SD 0x8) (PFPO_OP_TYPE_DD 0x9) (PFPO_OP_TYPE_TD 0xa) @@ -627,20 +628,29 @@ ;; Iterators -(define_mode_iterator ALL [TI DI SI HI QI TF DF SF TD DD SD V1QI V2QI V4QI V8QI V16QI V1HI V2HI V4HI V8HI V1SI V2SI V4SI V1DI V2DI V1SF V2SF V4SF V1TI V1DF V2DF V1TF]) +(define_mode_iterator ALL [TI DI SI HI QI TF FPRX2 DF SF TD DD SD V1QI V2QI + V4QI V8QI V16QI V1HI V2HI V4HI V8HI V1SI V2SI V4SI + V1DI V2DI V1SF V2SF V4SF V1TI V1DF V2DF V1TF]) ;; These mode iterators allow floating point patterns to be generated from the ;; same template. 
-(define_mode_iterator FP_ALL [TF DF SF (TD "TARGET_HARD_DFP") (DD "TARGET_HARD_DFP") +(define_mode_iterator FP_ALL [(TF "!TARGET_VXE") (FPRX2 "TARGET_VXE") DF SF + (TD "TARGET_HARD_DFP") (DD "TARGET_HARD_DFP") (SD "TARGET_HARD_DFP")]) -(define_mode_iterator FP [TF DF SF (TD "TARGET_HARD_DFP") (DD "TARGET_HARD_DFP")]) -(define_mode_iterator BFP [TF DF SF]) +(define_mode_iterator FP [(TF "!TARGET_VXE") (FPRX2 "TARGET_VXE") DF SF + (TD "TARGET_HARD_DFP") (DD "TARGET_HARD_DFP")]) +;; Like FP, but without a condition on TF. Useful for expanders that must be +;; the same for FP and VR variants of TF. +(define_mode_iterator FP_ANYTF [TF (FPRX2 "TARGET_VXE") DF SF + (TD "TARGET_HARD_DFP") + (DD "TARGET_HARD_DFP")]) +(define_mode_iterator BFP [(TF "!TARGET_VXE") (FPRX2 "TARGET_VXE") DF SF]) (define_mode_iterator DFP [TD DD]) (define_mode_iterator DFP_ALL [TD DD SD]) (define_mode_iterator DSF [DF SF]) (define_mode_iterator SD_SF [SF SD]) (define_mode_iterator DD_DF [DF DD]) -(define_mode_iterator TD_TF [TF TD]) +(define_mode_iterator TD_TF [(TF "!TARGET_VXE") (FPRX2 "TARGET_VXE") TD]) ; 32 bit int<->fp conversion instructions are available since VXE2 (z15). (define_mode_iterator VX_CONV_BFP [DF (SF "TARGET_VXE2")]) @@ -714,7 +724,8 @@ ;; In FP templates, a string like "ltbr" will expand to "ltxbr" in ;; TF/TDmode, "ltdbr" in DF/DDmode, and "ltebr" in SF/SDmode. -(define_mode_attr xde [(TF "x") (DF "d") (SF "e") (TD "x") (DD "d") (SD "e") (V4SF "e") (V2DF "d")]) +(define_mode_attr xde [(TF "x") (FPRX2 "x") (DF "d") (SF "e") (TD "x") + (DD "d") (SD "e") (V4SF "e") (V2DF "d")]) ;; In FP templates, a in "mr" will expand to "mxr" in ;; TF/TDmode, "mdr" in DF/DDmode, "meer" in SFmode and "mer in @@ -727,19 +738,22 @@ ;; These mode attributes are supposed to be used in the `enabled' insn ;; attribute to disable certain alternatives for certain modes. 
-(define_mode_attr nBFP [(TF "0") (DF "0") (SF "0") (TD "*") (DD "*") (DD "*")]) -(define_mode_attr nDFP [(TF "*") (DF "*") (SF "*") (TD "0") (DD "0") (DD "0")]) -(define_mode_attr DSF [(TF "0") (DF "*") (SF "*") (TD "0") (DD "0") (SD "0")]) -(define_mode_attr DFDI [(TF "0") (DF "*") (SF "0") +(define_mode_attr nBFP [(TF "0") (FPRX2 "0") (DF "0") (SF "0") (TD "*") + (DD "*") (DD "*")]) +(define_mode_attr nDFP [(TF "*") (FPRX2 "*") (DF "*") (SF "*") (TD "0") + (DD "0") (DD "0")]) +(define_mode_attr DSF [(TF "0") (FPRX2 "0") (DF "*") (SF "*") (TD "0") + (DD "0") (SD "0")]) +(define_mode_attr DFDI [(TF "0") (FPRX2 "0") (DF "*") (SF "0") (TD "0") (DD "0") (DD "0") (TI "0") (DI "*") (SI "0")]) -(define_mode_attr SFSI [(TF "0") (DF "0") (SF "*") +(define_mode_attr SFSI [(TF "0") (FPRX2 "0") (DF "0") (SF "*") (TD "0") (DD "0") (DD "0") (TI "0") (DI "0") (SI "*")]) -(define_mode_attr DF [(TF "0") (DF "*") (SF "0") +(define_mode_attr DF [(TF "0") (FPRX2 "0") (DF "*") (SF "0") (TD "0") (DD "0") (DD "0") (TI "0") (DI "0") (SI "0")]) -(define_mode_attr SF [(TF "0") (DF "0") (SF "*") +(define_mode_attr SF [(TF "0") (FPRX2 "0") (DF "0") (SF "*") (TD "0") (DD "0") (DD "0") (TI "0") (DI "0") (SI "0")]) @@ -749,15 +763,17 @@ ;; sign bit instructions only handle single source and target fp registers ;; these instructions can only be used for TFmode values if the source and ;; target operand uses the same fp register. -(define_mode_attr fT0 [(TF "0") (DF "f") (SF "f")]) +(define_mode_attr fT0 [(TF "0") (FPRX2 "0") (DF "f") (SF "f")]) ;; This attribute adds b for bfp instructions and t for dfp instructions and is used ;; within instruction mnemonics. -(define_mode_attr bt [(TF "b") (DF "b") (SF "b") (TD "t") (DD "t") (SD "t")]) +(define_mode_attr bt [(TF "b") (FPRX2 "b") (DF "b") (SF "b") (TD "t") (DD "t") + (SD "t")]) ;; This attribute is used within instruction mnemonics. It evaluates to d for dfp ;; modes and to an empty string for bfp modes. 
-(define_mode_attr _d [(TF "") (DF "") (SF "") (TD "d") (DD "d") (SD "d")]) +(define_mode_attr _d [(TF "") (FPRX2 "") (DF "") (SF "") (TD "d") (DD "d") + (SD "d")]) ;; In GPR and P templates, a constraint like "" will expand to "d" in DImode ;; and "0" in SImode. This allows to combine instructions of which the 31bit @@ -829,7 +845,7 @@ ;; This attribute expands to DF for TFmode and to DD for TDmode . It is ;; used for Txmode splitters splitting a Txmode copy into 2 Dxmode copies. -(define_mode_attr HALF_TMODE [(TF "DF") (TD "DD")]) +(define_mode_attr HALF_TMODE [(TF "DF") (FPRX2 "DF") (TD "DD")]) ;; Maximum unsigned integer that fits in MODE. (define_mode_attr max_uint [(HI "65535") (QI "255")]) @@ -850,6 +866,13 @@ ;; Allow return and simple_return to be defined from a single template. (define_code_iterator ANY_RETURN [return simple_return]) +;; Facilitate dispatching TFmode expanders on z14+. +(define_mode_attr tf_fpr [(TF "_fpr") (FPRX2 "") (DF "") (SF "") (TD "") + (DD "") (SD "")]) + +;; Mode names as seen in type mode_attr values. +(define_mode_attr type [(TF "tf") (FPRX2 "tf") (DF "df") (SF "sf") (TD "td") + (DD "dd") (SD "sd")]) ; Condition code modes generated by vector fp comparisons. 
These will @@ -1421,7 +1444,7 @@ "TARGET_HARD_FLOAT" "ltr\t%0,%0" [(set_attr "op_type" "RRE") - (set_attr "type" "fsimp")]) + (set_attr "type" "fsimp")]) (define_insn "*cmp_ccs_0_fastmath" [(set (reg CC_REGNUM) @@ -1433,7 +1456,7 @@ && !flag_signaling_nans" "ltr\t%0,%0" [(set_attr "op_type" "RRE") - (set_attr "type" "fsimp")]) + (set_attr "type" "fsimp")]) ; VX: TFmode in FPR pairs: use cxbr instead of wfcxb ; cxtr, cdtr, cxbr, cdbr, cebr, cdb, ceb, wfcsb, wfcdb @@ -1451,6 +1474,18 @@ (set_attr "cpu_facility" "*,*,vx,vxe") (set_attr "enabled" "*,,,")]) +; VX: TFmode in VR: use wfcxb +(define_insn "*cmptf_ccs" + [(set (reg CC_REGNUM) + (compare (match_operand:TF 0 "register_operand" "v") + (match_operand:TF 1 "register_operand" "v")))] + "s390_match_ccmode(insn, CCSmode) && TARGET_VXE" + "wfcxb\t%0,%1" + [(set_attr "op_type" "VRR") + (set_attr "cpu_facility" "vxe")]) + +; VX: TFmode in FPR pairs: use kxbr instead of wfkxb +; kxtr, kdtr, kxbr, kdbr, kebr, kdb, keb, wfksb, wfkdb (define_insn "*cmp_ccsfps" [(set (reg CC_REGNUM) (compare (match_operand:FP 0 "register_operand" "f,f,v,v") @@ -1465,6 +1500,16 @@ (set_attr "cpu_facility" "*,*,vx,vxe") (set_attr "enabled" "*,,,")]) +; VX: TFmode in VR: use wfkxb +(define_insn "*cmptf_ccsfps" + [(set (reg CC_REGNUM) + (compare (match_operand:TF 0 "register_operand" "v") + (match_operand:TF 1 "register_operand" "v")))] + "s390_match_ccmode (insn, CCSFPSmode) && TARGET_VXE" + "wfkxb\t%0,%1" + [(set_attr "op_type" "VRR") + (set_attr "cpu_facility" "vxe")]) + ; Compare and Branch instructions ; cij, cgij, crj, cgrj, cfi, cgfi, cr, cgr @@ -2489,7 +2534,7 @@ ; mov(tf|td) instruction pattern(s). ; -(define_expand "mov" +(define_expand "mov" [(set (match_operand:TD_TF 0 "nonimmediate_operand" "") (match_operand:TD_TF 1 "general_operand" ""))] "" @@ -3418,7 +3463,7 @@ ; Test data class. 
; -(define_expand "signbit2" +(define_expand "signbit2" [(set (reg:CCZ CC_REGNUM) (unspec:CCZ [(match_operand:FP_ALL 1 "register_operand" "f") (match_dup 2)] @@ -3430,7 +3475,7 @@ operands[2] = GEN_INT (S390_TDC_SIGNBIT_SET); }) -(define_expand "isinf2" +(define_expand "isinf2" [(set (reg:CCZ CC_REGNUM) (unspec:CCZ [(match_operand:FP_ALL 1 "register_operand" "f") (match_dup 2)] @@ -3468,7 +3513,7 @@ "TARGET_HARD_FLOAT" "t<_d>c\t%0,%1" [(set_attr "op_type" "RXE") - (set_attr "type" "fsimp")]) + (set_attr "type" "fsimp")]) @@ -4984,7 +5029,7 @@ ; This is the only entry point for fixuns_trunc. It multiplexes the ; expansion to either the *_emu expanders below for pre z196 machines ; or emits the default pattern otherwise. -(define_expand "fixuns_trunc2" +(define_expand "fixuns_trunc2" [(parallel [(set (match_operand:GPR 0 "register_operand" "") (unsigned_fix:GPR (match_operand:FP 1 "register_operand" ""))) @@ -5247,12 +5292,12 @@ ; fix_trunctf(si|di)2 instruction pattern(s). ; -(define_expand "fix_trunctf2" +(define_expand "fix_trunctf2_fpr" [(parallel [(set (match_operand:GPR 0 "register_operand" "") (fix:GPR (match_operand:TF 1 "register_operand" ""))) (unspec:GPR [(const_int BFP_RND_TOWARD_0)] UNSPEC_ROUND) (clobber (reg:CC CC_REGNUM))])] - "TARGET_HARD_FLOAT" + "TARGET_HARD_FLOAT && !TARGET_VXE" "") @@ -5261,7 +5306,7 @@ ; ; cxgbr, cdgbr, cegbr, cxgtr, cdgtr -(define_insn "floatdi2" +(define_insn "floatdi2" [(set (match_operand:FP 0 "register_operand" "=f,v") (float:FP (match_operand:DI 1 "register_operand" "d,v")))] "TARGET_ZARCH && TARGET_HARD_FLOAT" @@ -5269,12 +5314,12 @@ cgr\t%0,%1 wcdgb\t%v0,%v1,0,0" [(set_attr "op_type" "RRE,VRR") - (set_attr "type" "itof" ) + (set_attr "type" "itof" ) (set_attr "cpu_facility" "*,vx") (set_attr "enabled" "*,")]) ; cxfbr, cdfbr, cefbr, wcefb -(define_insn "floatsi2" +(define_insn "floatsi2" [(set (match_operand:BFP 0 "register_operand" "=f,v") (float:BFP (match_operand:SI 1 "register_operand" "d,v")))] "TARGET_HARD_FLOAT" @@ 
-5282,7 +5327,7 @@ cfbr\t%0,%1 wcefb\t%v0,%v1,0,0" [(set_attr "op_type" "RRE,VRR") - (set_attr "type" "itof" ) + (set_attr "type" "itof" ) (set_attr "cpu_facility" "*,vxe2") (set_attr "enabled" "*,")]) @@ -5293,7 +5338,7 @@ "TARGET_Z196 && TARGET_HARD_FLOAT" "cftr\t%0,0,%1,0" [(set_attr "op_type" "RRE") - (set_attr "type" "itof" )]) + (set_attr "type" "itof")]) ; ; floatuns(si|di)(tf|df|sf|td|dd)2 instruction pattern(s). @@ -5319,9 +5364,9 @@ && (!TARGET_VX || mode != DFmode || mode != DImode)" "clr\t%0,0,%1,0" [(set_attr "op_type" "RRE") - (set_attr "type" "itof")]) + (set_attr "type" "itof")]) -(define_expand "floatuns2" +(define_expand "floatuns2" [(set (match_operand:FP 0 "register_operand" "") (unsigned_float:FP (match_operand:GPR 1 "register_operand" "")))] "TARGET_Z196 && TARGET_HARD_FLOAT") @@ -5347,7 +5392,7 @@ ; ; ldxbr, lexbr -(define_insn "trunctf2" +(define_insn "trunctf2_fpr" [(set (match_operand:DSF 0 "register_operand" "=f") (float_truncate:DSF (match_operand:TF 1 "register_operand" "f"))) (clobber (match_scratch:TF 2 "=f"))] @@ -5427,9 +5472,9 @@ lbr\t%0,%1 lb\t%0,%1" [(set_attr "op_type" "RRE,RXE") - (set_attr "type" "fsimp, fload")]) + (set_attr "type" "fsimp, fload")]) -(define_expand "extend2" +(define_expand "extend2" [(set (match_operand:BFP 0 "register_operand" "") (float_extend:BFP (match_operand:DSF 1 "nonimmediate_operand" "")))] "TARGET_HARD_FLOAT @@ -5471,27 +5516,27 @@ ; For all of them the inexact exceptions are suppressed. ; fiebra, fidbra, fixbra -(define_insn "2" +(define_insn "2" [(set (match_operand:BFP 0 "register_operand" "=f") (unspec:BFP [(match_operand:BFP 1 "register_operand" "f")] FPINT))] "TARGET_Z196" "fibra\t%0,,%1,4" [(set_attr "op_type" "RRF") - (set_attr "type" "fsimp")]) + (set_attr "type" "fsimp")]) ; rint is supposed to raise an inexact exception so we can use the ; older instructions. 
; fiebr, fidbr, fixbr -(define_insn "rint2" +(define_insn "rint2" [(set (match_operand:BFP 0 "register_operand" "=f") (unspec:BFP [(match_operand:BFP 1 "register_operand" "f")] UNSPEC_FPINT_RINT))] "" "fibr\t%0,0,%1" [(set_attr "op_type" "RRF") - (set_attr "type" "fsimp")]) + (set_attr "type" "fsimp")]) ; Decimal Floating Point - load fp integer @@ -5504,7 +5549,7 @@ "TARGET_HARD_DFP" "fitr\t%0,,%1,4" [(set_attr "op_type" "RRF") - (set_attr "type" "fsimp")]) + (set_attr "type" "fsimp")]) ; fidtr, fixtr (define_insn "rint2" @@ -5514,7 +5559,7 @@ "TARGET_HARD_DFP" "fitr\t%0,0,%1,0" [(set_attr "op_type" "RRF") - (set_attr "type" "fsimp")]) + (set_attr "type" "fsimp")]) ; ; Binary <-> Decimal floating point trunc patterns @@ -5538,7 +5583,7 @@ "TARGET_HARD_DFP" "pfpo") -(define_expand "trunc2" +(define_expand "trunc2" [(set (reg:BFP FPR4_REGNUM) (match_operand:BFP 1 "nonimmediate_operand" "")) (set (reg:SI GPR0_REGNUM) (match_dup 2)) (parallel @@ -5565,7 +5610,7 @@ operands[2] = GEN_INT (flags); }) -(define_expand "trunc2" +(define_expand "trunc2" [(set (reg:DFP_ALL FPR4_REGNUM) (match_operand:DFP_ALL 1 "nonimmediate_operand" "")) (set (reg:SI GPR0_REGNUM) (match_dup 2)) @@ -5611,7 +5656,7 @@ "TARGET_HARD_DFP" "pfpo") -(define_expand "extend2" +(define_expand "extend2" [(set (reg:BFP FPR4_REGNUM) (match_operand:BFP 1 "nonimmediate_operand" "")) (set (reg:SI GPR0_REGNUM) (match_dup 2)) (parallel @@ -5638,7 +5683,7 @@ operands[2] = GEN_INT (flags); }) -(define_expand "extend2" +(define_expand "extend2" [(set (reg:DFP_ALL FPR4_REGNUM) (match_operand:DFP_ALL 1 "nonimmediate_operand" "")) (set (reg:SI GPR0_REGNUM) (match_dup 2)) @@ -6117,7 +6162,7 @@ ; axbr, adbr, aebr, axb, adb, aeb, adtr, axtr ; FIXME: wfadb does not clobber cc -(define_insn "add3" +(define_insn "add3" [(set (match_operand:FP 0 "register_operand" "=f,f,f,v,v") (plus:FP (match_operand:FP 1 "nonimmediate_operand" "%f,0,0,v,v") (match_operand:FP 2 "general_operand" "f,f,R,v,v"))) @@ -6130,7 +6175,7 @@ 
wfadb\t%v0,%v1,%v2 wfasb\t%v0,%v1,%v2" [(set_attr "op_type" "RRF,RRE,RXE,VRR,VRR") - (set_attr "type" "fsimp") + (set_attr "type" "fsimp") (set_attr "cpu_facility" "*,*,*,vx,vxe") (set_attr "enabled" ",,,,")]) @@ -6148,7 +6193,7 @@ abr\t%0,%2 ab\t%0,%2" [(set_attr "op_type" "RRF,RRE,RXE") - (set_attr "type" "fsimp") + (set_attr "type" "fsimp") (set_attr "enabled" ",,")]) ; axbr, adbr, aebr, axb, adb, aeb, adtr, axtr @@ -6164,7 +6209,7 @@ abr\t%0,%2 ab\t%0,%2" [(set_attr "op_type" "RRF,RRE,RXE") - (set_attr "type" "fsimp") + (set_attr "type" "fsimp") (set_attr "enabled" ",,")]) ; @@ -6562,7 +6607,7 @@ ; FIXME: (clobber (match_scratch:CC 3 "=c,c,c,X,X")) does not work - why? ; sxbr, sdbr, sebr, sdb, seb, sxtr, sdtr -(define_insn "sub3" +(define_insn "sub3" [(set (match_operand:FP 0 "register_operand" "=f,f,f,v,v") (minus:FP (match_operand:FP 1 "register_operand" "f,0,0,v,v") (match_operand:FP 2 "general_operand" "f,f,R,v,v"))) @@ -6575,7 +6620,7 @@ wfsdb\t%v0,%v1,%v2 wfssb\t%v0,%v1,%v2" [(set_attr "op_type" "RRF,RRE,RXE,VRR,VRR") - (set_attr "type" "fsimp") + (set_attr "type" "fsimp") (set_attr "cpu_facility" "*,*,*,vx,vxe") (set_attr "enabled" ",,,,")]) @@ -6593,7 +6638,7 @@ sbr\t%0,%2 sb\t%0,%2" [(set_attr "op_type" "RRF,RRE,RXE") - (set_attr "type" "fsimp") + (set_attr "type" "fsimp") (set_attr "enabled" ",,")]) ; sxbr, sdbr, sebr, sdb, seb, sxtr, sdtr @@ -6609,7 +6654,7 @@ sbr\t%0,%2 sb\t%0,%2" [(set_attr "op_type" "RRF,RRE,RXE") - (set_attr "type" "fsimp") + (set_attr "type" "fsimp") (set_attr "enabled" ",,")]) @@ -7143,7 +7188,7 @@ ; ; mxbr, mdbr, meebr, mxb, mxb, meeb, mdtr, mxtr -(define_insn "mul3" +(define_insn "mul3" [(set (match_operand:FP 0 "register_operand" "=f,f,f,v,v") (mult:FP (match_operand:FP 1 "nonimmediate_operand" "%f,0,0,v,v") (match_operand:FP 2 "general_operand" "f,f,R,v,v")))] @@ -7155,7 +7200,7 @@ wfmdb\t%v0,%v1,%v2 wfmsb\t%v0,%v1,%v2" [(set_attr "op_type" "RRF,RRE,RXE,VRR,VRR") - (set_attr "type" "fmul") + (set_attr "type" "fmul") 
(set_attr "cpu_facility" "*,*,*,vx,vxe") (set_attr "enabled" ",,,,")]) @@ -7165,7 +7210,7 @@ (fma:DSF (match_operand:DSF 1 "nonimmediate_operand" "%f,f,v,v") (match_operand:DSF 2 "nonimmediate_operand" "f,R,v,v") (match_operand:DSF 3 "register_operand" "0,0,v,v")))] - "TARGET_HARD_FLOAT" + "TARGET_HARD_FLOAT && s390_fma_allowed_p (mode)" "@ mabr\t%0,%1,%2 mab\t%0,%1,%2 @@ -7182,7 +7227,7 @@ (fma:DSF (match_operand:DSF 1 "nonimmediate_operand" "%f,f,v,v") (match_operand:DSF 2 "nonimmediate_operand" "f,R,v,v") (neg:DSF (match_operand:DSF 3 "register_operand" "0,0,v,v"))))] - "TARGET_HARD_FLOAT" + "TARGET_HARD_FLOAT && s390_fma_allowed_p (mode)" "@ msbr\t%0,%1,%2 msb\t%0,%1,%2 @@ -7448,7 +7493,7 @@ ; ; dxbr, ddbr, debr, dxb, ddb, deb, ddtr, dxtr -(define_insn "div3" +(define_insn "div3" [(set (match_operand:FP 0 "register_operand" "=f,f,f,v,v") (div:FP (match_operand:FP 1 "register_operand" "f,0,0,v,v") (match_operand:FP 2 "general_operand" "f,f,R,v,v")))] @@ -7460,7 +7505,7 @@ wfddb\t%v0,%v1,%v2 wfdsb\t%v0,%v1,%v2" [(set_attr "op_type" "RRF,RRE,RXE,VRR,VRR") - (set_attr "type" "fdiv") + (set_attr "type" "fdiv") (set_attr "cpu_facility" "*,*,*,vx,vxe") (set_attr "enabled" ",,,,")]) @@ -8777,10 +8822,10 @@ operands[6] = gen_label_rtx ();") ; -; neg(df|sf)2 instruction pattern(s). +; neg(tf|df|sf)2 instruction pattern(s). 
; -(define_expand "neg2" +(define_expand "neg2" [(parallel [(set (match_operand:BFP 0 "register_operand") (neg:BFP (match_operand:BFP 1 "register_operand"))) @@ -8797,7 +8842,7 @@ "s390_match_ccmode (insn, CCSmode) && TARGET_HARD_FLOAT" "lcbr\t%0,%1" [(set_attr "op_type" "RRE") - (set_attr "type" "fsimp")]) + (set_attr "type" "fsimp")]) ; lcxbr, lcdbr, lcebr (define_insn "*neg2_cconly" @@ -8808,7 +8853,7 @@ "s390_match_ccmode (insn, CCSmode) && TARGET_HARD_FLOAT" "lcbr\t%0,%1" [(set_attr "op_type" "RRE") - (set_attr "type" "fsimp")]) + (set_attr "type" "fsimp")]) ; lcdfr (define_insn "*neg2_nocc" @@ -8817,7 +8862,7 @@ "TARGET_DFP" "lcdfr\t%0,%1" [(set_attr "op_type" "RRE") - (set_attr "type" "fsimp")]) + (set_attr "type" "fsimp")]) ; lcxbr, lcdbr, lcebr ; FIXME: wflcdb does not clobber cc @@ -8833,7 +8878,7 @@ wflcsb\t%0,%1" [(set_attr "op_type" "RRE,VRR,VRR") (set_attr "cpu_facility" "*,vx,vxe") - (set_attr "type" "fsimp,*,*") + (set_attr "type" "fsimp,*,*") (set_attr "enabled" "*,,")]) @@ -8901,10 +8946,10 @@ (set_attr "z10prop" "z10_c")]) ; -; abs(df|sf)2 instruction pattern(s). +; abs(tf|df|sf)2 instruction pattern(s). 
; -(define_expand "abs2" +(define_expand "abs2" [(parallel [(set (match_operand:BFP 0 "register_operand" "=f") (abs:BFP (match_operand:BFP 1 "register_operand" "f"))) @@ -8922,7 +8967,7 @@ "s390_match_ccmode (insn, CCSmode) && TARGET_HARD_FLOAT" "lpbr\t%0,%1" [(set_attr "op_type" "RRE") - (set_attr "type" "fsimp")]) + (set_attr "type" "fsimp")]) ; lpxbr, lpdbr, lpebr (define_insn "*abs2_cconly" @@ -8933,7 +8978,7 @@ "s390_match_ccmode (insn, CCSmode) && TARGET_HARD_FLOAT" "lpbr\t%0,%1" [(set_attr "op_type" "RRE") - (set_attr "type" "fsimp")]) + (set_attr "type" "fsimp")]) ; lpdfr (define_insn "*abs2_nocc" @@ -8942,7 +8987,7 @@ "TARGET_DFP" "lpdfr\t%0,%1" [(set_attr "op_type" "RRE") - (set_attr "type" "fsimp")]) + (set_attr "type" "fsimp")]) ; lpxbr, lpdbr, lpebr ; FIXME: wflpdb does not clobber cc @@ -8956,7 +9001,7 @@ wflpdb\t%0,%1" [(set_attr "op_type" "RRE,VRR") (set_attr "cpu_facility" "*,vx") - (set_attr "type" "fsimp,*") + (set_attr "type" "fsimp