optimize permutes in SLP, remove vect_attempt_slp_rearrange_stmts
This introduces a permute optimization phase for SLP which is
intended to cover the existing permute eliding for SLP reductions
plus handling commonizing the easy cases.
It currently uses graphds to compute a postorder on the reverse
SLP graph and it handles all cases vect_attempt_slp_rearrange_stmts
did (hopefully - I've adjusted most testcases that triggered it
a few days ago). It restricts itself to move around bijective
permutations to simplify things for now, mainly around constant nodes.
As a prerequesite it makes the SLP graph cyclic (ugh). It looks
like it would pay off to compute a PRE/POST order visit array
once and elide all the recursive SLP graph walks and their
visited hash-set. At least for the time where we do not change
the SLP graph during such walk.
I do not like using graphds too much but at least I don't have to
re-implement yet another RPO walk, so maybe it isn't too bad.
It now computes permute placement during iteration and thus should
get cycles more obviously correct.
Richard.
2020-10-06 Richard Biener <rguenther@suse.de>
* tree-vect-data-refs.c (vect_slp_analyze_instance_dependence):
Use SLP_TREE_REPRESENTATIVE.
* tree-vectorizer.h (_slp_tree::vertex): New member used
for graphds interfacing.
* tree-vect-slp.c (vect_build_slp_tree_2): Allocate space
for PHI SLP children.
(vect_analyze_slp_backedges): New function filling in SLP
node children for PHIs that correspond to backedge values.
(vect_analyze_slp): Call vect_analyze_slp_backedges for the
graph.
(vect_slp_analyze_node_operations): Deal with a cyclic graph.
(vect_schedule_slp_instance): Likewise.
(vect_schedule_slp): Likewise.
(slp_copy_subtree): Remove.
(vect_slp_rearrange_stmts): Likewise.
(vect_attempt_slp_rearrange_stmts): Likewise.
(vect_slp_build_vertices): New functions.
(vect_slp_permute): Likewise.
(vect_slp_perms_eq): Likewise.
(vect_optimize_slp): Remove special code to elide
permutations with SLP reductions. Implement generic
permute optimization.
* gcc.dg/vect/bb-slp-50.c: New testcase.
* gcc.dg/vect/bb-slp-51.c: Likewise.