From: Richard Sandiford Date: Mon, 1 Jul 2019 08:58:23 +0000 (+0000) Subject: Make ira_get_dup_out_num handle more cases X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=ed680e2cc18c73f90e6bfbd3f346a8820476371b;p=gcc.git Make ira_get_dup_out_num handle more cases SVE has a prefix instruction (MOVPRFX) that acts as a move but is designed to be easily fusible with the following instruction. The SVE port therefore has lots of patterns with constraints of the form: A: operand 0: =w,?w ... operand n: 0, w where the first alternative is a single instruction and the second alternative uses MOVPRFX. Ideally we want operand n to be allocated to the same register as operand 0 in this case. add_insn_allocno_copies is the main IRA routine that deals with tied operands. It is (rightly) very conservative, and only handles cases in which we're confident about saving a full move. So for a pattern like: B: operand 0: =w,w ... operand n: 0,w we don't (and shouldn't) assume that tying operands 0 and n would save the cost of a move. But in A, the second alternative has a ? marker, which makes it more expensive than the first alternative by a full reload. So I think for copy elision we should ignore the untied operand n in the second alternative of A. One approach would be to add '*' markers to each pattern and make ira_get_dup_out_num honour them. But I think the rule applies on first principles, so marking with '*' shouldn't be necessary. This patch instead makes ira_get_dup_out_num ignore expensive alternatives if there are other alternatives that match exactly. The cheapest way of doing that seemed to be to take expensive alternatives out of consideration in ira_setup_alts, which provides a bitmask of alternatives and has all the information available. add_insn_allocno_copies is the only current user of ira_setup_alts, so no other code should be affected. If all available alternatives are disparaged or need a reload, there's not much we can do to cut them down at this stage, since it's hard to predict which operands will be reloaded and which registers will need to be spilled. An interesting case is patterns like this msp430 one: ;; Alternatives 2 and 3 are to handle cases generated by reload. (define_insn "subqi3" [(set (match_operand:QI 0 "nonimmediate_operand" "=rYs, rm, &?r, ?&r") (minus:QI (match_operand:QI 1 "general_operand" "0, 0, !r, !i") (match_operand:QI 2 "general_operand" " riYs, rmi, rmi, r")))] "" "@ SUB.B\t%2, %0 SUB%X0.B\t%2, %0 MOV%X0.B\t%1, %0 { SUB%X0.B\t%2, %0 MOV%X0.B\t%1, %0 { SUB%X0.B\t%2, %0" ) Here alternative 3 is significantly more expensive then alternative 0 (reject costs 0 and 606 respectively). But if operand 1 is an integer constant, we'll still use alternative 3 if operand 2 is an allocated register. On the other hand, if operand 1 is an integer constant but operand 2 is spilled to memory, we'll move the constant into a register and use the first alternative. So in this case, if operand 1 is a register, we should consider only the first two alternatives and thus try to tie operand 1 to operand 0 (which we didn't do previously). If operand 1 is a constant integer, we should consider at least alternatives 0, 1 and 3. We could exclude alternative 2, but I don't have any evidence that that's useful. 2019-07-01 Richard Sandiford gcc/ * ira.c (ira_setup_alts): If any valid alternatives have zero cost, exclude any others that are disparaged or that are bound to need a reload or spill. (ira_get_dup_out_num): Expand comment. From-SVN: r272849 --- diff --git a/gcc/ChangeLog b/gcc/ChangeLog index d006f50b524..8f70d1b385e 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,10 @@ +2019-07-01 Richard Sandiford + + * ira.c (ira_setup_alts): If any valid alternatives have zero cost, + exclude any others that are disparaged or that are bound to need + a reload or spill. + (ira_get_dup_out_num): Expand comment. + 2019-07-01 Richard Sandiford * ira.c (ira_setup_alts): Use preprocess_constraints to get the diff --git a/gcc/ira.c b/gcc/ira.c index 4075522ab1d..d9327db4e8f 100644 --- a/gcc/ira.c +++ b/gcc/ira.c @@ -1787,7 +1787,9 @@ setup_prohibited_mode_move_regs (void) /* Extract INSN and return the set of alternatives that we should consider. This excludes any alternatives whose constraints are obviously impossible to meet (e.g. because the constraint requires a constant and the operand - is nonconstant). */ + is nonconstant). It also excludes alternatives that are bound to need + a spill or reload, as long as we have other alternatives that match + exactly. */ alternative_mask ira_setup_alts (rtx_insn *insn) { @@ -1800,6 +1802,7 @@ ira_setup_alts (rtx_insn *insn) preprocess_constraints (insn); alternative_mask preferred = get_preferred_alternatives (insn); alternative_mask alts = 0; + alternative_mask exact_alts = 0; /* Check that the hard reg set is enough for holding all alternatives. It is hard to imagine the situation when the assertion is wrong. */ @@ -1816,20 +1819,24 @@ ira_setup_alts (rtx_insn *insn) { for (nalt = 0; nalt < recog_data.n_alternatives; nalt++) { - if (!TEST_BIT (preferred, nalt) || TEST_BIT (alts, nalt)) + if (!TEST_BIT (preferred, nalt) || TEST_BIT (exact_alts, nalt)) continue; const operand_alternative *op_alt = &recog_op_alt[nalt * recog_data.n_operands]; + int this_reject = 0; for (nop = 0; nop < recog_data.n_operands; nop++) { int c, len; + this_reject += op_alt[nop].reject; + rtx op = recog_data.operand[nop]; p = op_alt[nop].constraint; if (*p == 0 || *p == ',') continue; - + + bool win_p = false; do switch (c = *p, len = CONSTRAINT_LEN (c, p), c) { @@ -1847,7 +1854,14 @@ ira_setup_alts (rtx_insn *insn) case '0': case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8': case '9': - goto op_success; + { + rtx other = recog_data.operand[c - '0']; + if (MEM_P (other) + ? rtx_equal_p (other, op) + : REG_P (op) || SUBREG_P (op)) + goto op_success; + win_p = true; + } break; case 'g': @@ -1861,7 +1875,11 @@ ira_setup_alts (rtx_insn *insn) { case CT_REGISTER: if (reg_class_for_constraint (cn) != NO_REGS) - goto op_success; + { + if (REG_P (op) || SUBREG_P (op)) + goto op_success; + win_p = true; + } break; case CT_CONST_INT: @@ -1872,9 +1890,14 @@ ira_setup_alts (rtx_insn *insn) break; case CT_ADDRESS: + goto op_success; + case CT_MEMORY: case CT_SPECIAL_MEMORY: - goto op_success; + if (MEM_P (op)) + goto op_success; + win_p = true; + break; case CT_FIXED_FORM: if (constraint_satisfied_p (op, cn)) @@ -1885,12 +1908,22 @@ ira_setup_alts (rtx_insn *insn) } } while (p += len, c); - break; + if (!win_p) + break; + /* We can make the alternative match by spilling a register + to memory or loading something into a register. Count a + cost of one reload (the equivalent of the '?' constraint). */ + this_reject += 6; op_success: ; } + if (nop >= recog_data.n_operands) - alts |= ALTERNATIVE_BIT (nalt); + { + alts |= ALTERNATIVE_BIT (nalt); + if (this_reject == 0) + exact_alts |= ALTERNATIVE_BIT (nalt); + } } if (commutative < 0) break; @@ -1900,13 +1933,13 @@ ira_setup_alts (rtx_insn *insn) if (curr_swapped) break; } - return alts; + return exact_alts ? exact_alts : alts; } /* Return the number of the output non-early clobber operand which should be the same in any case as operand with number OP_NUM (or - negative value if there is no such operand). The function takes - only really possible alternatives into consideration. */ + negative value if there is no such operand). ALTS is the mask + of alternatives that we should consider. */ int ira_get_dup_out_num (int op_num, alternative_mask alts) {