From: Richard Sandiford Date: Tue, 31 Dec 2019 08:28:24 +0000 (+0000) Subject: Fix SSA update when vectorisation adds a vdef to a read-only loop X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=e3969868d63b2930a0714ca3cb466831fd4be634;p=gcc.git Fix SSA update when vectorisation adds a vdef to a read-only loop This patch fixes an awkward corner case in which: (a) we apply if-conversion to a loop; (b) the original scalar loop doesn't have a vdef, and thus doesn't need a virtual phi; (c) the vectorised main loop does need a vdef and a virtual phi (see below); (d) we also vectorise the epilogue; and (e) the vectorised epilogue still needs a scalar epilogue The specific case in which (c) applies is if a read-only loop is vectorised using IFN_LOAD_LANES, which uses clobber statements to mark the lifetime of the temporary array. The vectoriser relies on the SSA renamer to update virtual operands. All would probably be well if it postponed this update until after it had vectorised both the main loop and the epilogue loop. However, when vectorising the epilogue, vect_do_peeling does: create_lcssa_for_virtual_phi (loop); update_ssa (TODO_update_ssa_only_virtuals); (with "loop" in this case being the to-be-vectorised epilogue loop). So the vectoriser puts the virtual operand into SSA form for the vectorised main loop as a separate step, during the early stages of vectorising the epilogue. I wasn't sure at first why that update_ssa was there. It looked initially like it was related to create_lcssa_for_virtual_phi, which seemed strange when create_lcssa_for_virtual_phi keeps the SSA form up-to-date. But before r241099 it had the following comment, which AFAICT is still the reason: /* We might have a queued need to update virtual SSA form. As we delete the update SSA machinery below after doing a regular incremental SSA update during loop copying make sure we don't lose that fact. ??? Needing to update virtual SSA form by renaming is unfortunate but not all of the vectorizer code inserting new loads / stores properly assigns virtual operands to those statements. */ The patch restores that comment since IMO it's helpful. (a), (d) and (e) mean that we copy the original un-if-converted scalar loop to act as the scalar epilogue. The update_ssa above means that this copying needs to cope with any new virtual SSA names in the main loop. The code to do that (reasonably) assumed that one of two things was true: (1) the scalar loop and the vector loops don't have vdefs, and so no virtual operand update is needed. The definition that applies on entry to the loops is the same in all cases. (2) the scalar loop and the vector loops have virtual phis, and so -- after applying create_lcssa_for_virtual_phi on the to-be-vectorised epilogue loop -- the virtual operand update can be handled in the same way as for normal SSA names. But (b) and (c) together mean that the scalar loop and the still-to-be-vectorised epilogue loop have no virtual phi that (2) can use. We'd therefore keep the original vuses when duplicating, rather than updating them to the definition that applies on exit from the epilogue loop. (Since the epilogue is still unvectorised and has no vdefs, the definition that applies on exit is the same as the one that applies on entry.) This patch therefore adds a third case: the scalar loop and to-be-vectorised epilogue have no virtual defs, but the main loop does. 2019-12-31 Richard Sandiford gcc/ * tree-vect-loop-manip.c (create_lcssa_for_virtual_phi): Return the incoming virtual operand definition. (vect_do_peeling): When vectorizing an epilogue loop, handle the case in which the main loop has a virtual phi and the epilogue and scalar loops don't. Restore an earlier comment about the update_ssa call. gcc/testsuite/ * gcc.dg/vect/vect-epilogues-2.c: New test. From-SVN: r279802 --- diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 03e3f803616..a75b9012e58 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,12 @@ +2020-12-31 Richard Sandiford + + * tree-vect-loop-manip.c (create_lcssa_for_virtual_phi): Return + the incoming virtual operand definition. + (vect_do_peeling): When vectorizing an epilogue loop, handle the + case in which the main loop has a virtual phi and the epilogue + and scalar loops don't. Restore an earlier comment about the + update_ssa call. + 2019-12-30 Olivier Hainque * config/vxworks/stdint.h: New file. Include _yvals.h diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index 76bcf820e8d..097c61d6f7d 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,7 @@ +2020-12-31 Richard Sandiford + + * gcc.dg/vect/vect-epilogues-2.c: New test. + 2019-12-30 Olivier Hainque * gcc.dg/vect/tree-vect.h: #undef OFFSET. diff --git a/gcc/testsuite/gcc.dg/vect/vect-epilogues-2.c b/gcc/testsuite/gcc.dg/vect/vect-epilogues-2.c new file mode 100644 index 00000000000..b251e1f2dfd --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/vect-epilogues-2.c @@ -0,0 +1,57 @@ +/* { dg-do compile } */ + +int +f1 (int *x, int n) +{ + int res = 0; + for (int i = 0; i < n; ++i) + res += x[i * 2] == 1 ? 2 : 3; + return res; +} + +int +f2 (int *x) +{ + int res = 0; + for (int i = 0; i < 0x83; ++i) + res += x[i * 2] == 1 ? 2 : 3; + return res; +} + +int +f3 (int *x, int n) +{ + int res = 0; + for (int i = 0; i < n; ++i) + res += x[i * 2] == 1 ? 2 : 3; + return res + x[0x100]; +} + +int +f4 (int *x) +{ + int res = 0; + for (int i = 0; i < 0x83; ++i) + res += x[i * 2] == 1 ? 2 : 3; + return res + x[0x100]; +} + +int +f5 (int *x, int n, int a) +{ + int res = 0; + for (int i = 0; i < n; ++i) + res += x[i * 2] == 1 ? 2 : 3; + x[a] += 1; + return res; +} + +int +f6 (int *x, int a) +{ + int res = 0; + for (int i = 0; i < 0x83; ++i) + res += x[i * 2] == 1 ? 2 : 3; + x[a] += 1; + return res; +} diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c index b4dda971b18..dd2d943f167 100644 --- a/gcc/tree-vect-loop-manip.c +++ b/gcc/tree-vect-loop-manip.c @@ -1249,9 +1249,12 @@ slpeel_can_duplicate_loop_p (const class loop *loop, const_edge e) the *guard[12] routines, which assume loop closed SSA form for all PHIs (but normally loop closed SSA form doesn't require virtual PHIs to be in the same form). Doing this early simplifies the checking what - uses should be renamed. */ + uses should be renamed. -static void + If we create a new phi after the loop, return the definition that + applies on entry to the loop, otherwise return null. */ + +static tree create_lcssa_for_virtual_phi (class loop *loop) { gphi_iterator gsi; @@ -1283,10 +1286,12 @@ create_lcssa_for_virtual_phi (class loop *loop) && !flow_bb_inside_loop_p (loop, gimple_bb (stmt))) FOR_EACH_IMM_USE_ON_STMT (use_p, imm_iter) SET_USE (use_p, new_vop); + + return PHI_ARG_DEF_FROM_EDGE (phi, loop_preheader_edge (loop)); } break; } - + return NULL_TREE; } /* Function vect_get_loop_location. @@ -2483,9 +2488,42 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1, class loop *prolog, *epilog = NULL, *loop = LOOP_VINFO_LOOP (loop_vinfo); class loop *first_loop = loop; bool irred_flag = loop_preheader_edge (loop)->flags & EDGE_IRREDUCIBLE_LOOP; - create_lcssa_for_virtual_phi (loop); + + /* We might have a queued need to update virtual SSA form. As we + delete the update SSA machinery below after doing a regular + incremental SSA update during loop copying make sure we don't + lose that fact. + ??? Needing to update virtual SSA form by renaming is unfortunate + but not all of the vectorizer code inserting new loads / stores + properly assigns virtual operands to those statements. */ update_ssa (TODO_update_ssa_only_virtuals); + create_lcssa_for_virtual_phi (loop); + + /* If we're vectorizing an epilogue loop, the update_ssa above will + have ensured that the virtual operand is in SSA form throughout the + vectorized main loop. Normally it is possible to trace the updated + vector-stmt vdefs back to scalar-stmt vdefs and vector-stmt vuses + back to scalar-stmt vuses, meaning that the effect of the SSA update + remains local to the main loop. However, there are rare cases in + which the vectorized loop has vdefs even when the original scalar + loop didn't. For example, vectorizing a load with IFN_LOAD_LANES + introduces clobbers of the temporary vector array, which in turn + needs new vdefs. If the scalar loop doesn't write to memory, these + new vdefs will be the only ones in the vector loop. + + In that case, update_ssa will have added a new virtual phi to the + main loop, which previously didn't need one. Ensure that we (locally) + maintain LCSSA form for the virtual operand, just as we would have + done if the virtual phi had existed from the outset. This makes it + easier to duplicate the scalar epilogue loop below. */ + tree vop_to_rename = NULL_TREE; + if (loop_vec_info orig_loop_vinfo = LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo)) + { + class loop *orig_loop = LOOP_VINFO_LOOP (orig_loop_vinfo); + vop_to_rename = create_lcssa_for_virtual_phi (orig_loop); + } + if (MAY_HAVE_DEBUG_BIND_STMTS) { gcc_assert (!adjust_vec.exists ()); @@ -2706,6 +2744,26 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1, as the transformations mentioned above make less or no sense when not vectorizing. */ epilog = vect_epilogues ? get_loop_copy (loop) : scalar_loop; + if (vop_to_rename) + { + /* Vectorizing the main loop can sometimes introduce a vdef to + a loop that previously didn't have one; see the comment above + the definition of VOP_TO_RENAME for details. The definition + D that holds on E will then be different from the definition + VOP_TO_RENAME that holds during SCALAR_LOOP, so we need to + rename VOP_TO_RENAME to D when copying the loop. + + The virtual operand is in LCSSA form for the main loop, + and no stmt between the main loop and E needs a vdef, + so we know that D is provided by a phi rather than by a + vdef on a normal gimple stmt. */ + basic_block vdef_bb = e->src; + gphi *vphi; + while (!(vphi = get_virtual_phi (vdef_bb))) + vdef_bb = get_immediate_dominator (CDI_DOMINATORS, vdef_bb); + gcc_assert (vop_to_rename != gimple_phi_result (vphi)); + set_current_def (vop_to_rename, gimple_phi_result (vphi)); + } epilog = slpeel_tree_duplicate_loop_to_edge_cfg (loop, epilog, e); if (!epilog) {