/* Predictive commoning.
   Copyright (C) 2005-2017 Free Software Foundation, Inc.

This file is part of GCC.

GCC is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 3, or (at your option) any
later version.

GCC is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
for more details.

You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3.  If not see
<http://www.gnu.org/licenses/>.  */

/* This file implements the predictive commoning optimization.  Predictive
   commoning can be viewed as CSE around a loop, and with some improvements,
   as generalized strength reduction -- i.e., reusing values computed in
   earlier iterations of a loop in the later ones.  So far, the pass only
   handles the most useful case, that is, reusing values of memory references.
   If you think this is all just a special case of PRE, you are sort of right;
   however, concentrating on loops is simpler, and makes it possible to
   incorporate data dependence analysis to detect the opportunities, perform
   loop unrolling to avoid copies together with renaming immediately,
   and if needed, we could also take register pressure into account.

   Let us demonstrate what is done on an example:

   for (i = 0; i < 100; i++)
     {
       a[i+2] = a[i] + a[i+1];
       b[10] = b[10] + i;
       c[i] = c[99 - i];
       d[i] = d[i + 1];
     }

   1) We find data references in the loop, and split them to mutually
      independent groups (i.e., we find components of a data dependence
      graph).  We ignore read-read dependences whose distance is not constant.
      (TODO -- we could also ignore antidependences).  In this example, we
      find the following groups:

      a[i]{read}, a[i+1]{read}, a[i+2]{write}
      b[10]{read}, b[10]{write}
      c[99 - i]{read}, c[i]{write}
      d[i + 1]{read}, d[i]{write}

   2) Inside each of the groups, we verify several conditions:
      a) all the references must differ in indices only, and the indices
	 must all have the same step
      b) the references must dominate the loop latch (and thus, they must be
	 ordered by the dominance relation).
      c) the distance of the indices must be a small multiple of the step
      We are then able to compute the difference of the references (# of
      iterations before they point to the same place as the first of them).
      Also, in case there are writes in the loop, we split the groups into
      chains whose head is the write whose values are used by the reads in
      the same chain.  The chains are then processed independently,
      making the further transformations simpler.  Also, the shorter chains
      need the same number of registers, but may require a lower unrolling
      factor in order to get rid of the copies on the loop latch.

      In our example, we get the following chains (the chain for c is invalid).

      a[i]{read,+0}, a[i+1]{read,-1}, a[i+2]{write,-2}
      b[10]{read,+0}, b[10]{write,+0}
      d[i + 1]{read,+0}, d[i]{write,+1}

   3) For each read, we determine the read or write whose value it reuses,
      together with the distance of this reuse.  I.e. we take the last
      reference before it with distance 0, or the last of the references
      with the smallest positive distance to the read.  Then, we remove
      the references that are not used in any of these chains, discard the
      empty groups, and propagate all the links so that they point to the
      single root reference of the chain (adjusting their distance
      appropriately).  Some extra care needs to be taken for references with
      step 0.  In our example (the numbers indicate the distance of the
      reuse),

      a[i] --> (*) 2, a[i+1] --> (*) 1, a[i+2] (*)
      b[10] --> (*) 1, b[10] (*)

   4) The chains are combined together if possible.  If the corresponding
      elements of two chains are always combined together with the same
      operator, we remember just the result of this combination, instead
      of remembering the values separately.  We may need to perform
      reassociation to enable combining, for example

      e[i] + f[i+1] + e[i+1] + f[i]

      can be reassociated as

      (e[i] + f[i]) + (e[i+1] + f[i+1])

      and we can combine the chains for e and f into one chain.

   5) For each root reference (end of the chain) R, let N be maximum distance
      of a reference reusing its value.  Variables R0 up to RN are created,
      together with phi nodes that transfer values from R1 .. RN to
      R0 .. R(N-1).
      Initial values are loaded to R0..R(N-1) (in case not all references
      must necessarily be accessed and they may trap, we may fail here;
      TODO sometimes, the loads could be guarded by a check for the number
      of iterations).  Values loaded/stored in roots are also copied to
      RN.  Other reads are replaced with the appropriate variable Ri.
      Everything is put into SSA form.

      As a small improvement, if R0 is dead after the root (i.e., all uses of
      the value with the maximum distance dominate the root), we can avoid
      creating RN and use R0 instead of it.

      In our example, we get (only the parts concerning a and b are shown):
      for (i = 0; i < 100; i++)
	{
	  f = phi (a[0], s);
	  s = phi (a[1], f);
	  x = phi (b[10], x);

	  f = f + s;
	  a[i+2] = f;
	  x = x + i;
	  b[10] = x;
	}

   6) Factor F for unrolling is determined as the smallest common multiple of
      (N + 1) for each root reference (N for references for which we avoided
      creating RN).  If F and the loop are small enough, the loop is unrolled
      F times.  The stores to RN (R0) in the copies of the loop body are
      periodically replaced with R0, R1, ... (R1, R2, ...), so that they can
      be coalesced and the copies can be eliminated.

      TODO -- copy propagation and other optimizations may change the live
      ranges of the temporary registers and prevent them from being coalesced;
      this may increase the register pressure.

      In our case, F = 2 and the (main loop of the) result is

      for (i = 0; i < ...; i += 2)
	{
	  f = phi (a[0], f);
	  s = phi (a[1], s);
	  x = phi (b[10], x);

	  f = f + s;
	  a[i+2] = f;
	  x = x + i;
	  b[10] = x;

	  s = s + f;
	  a[i+3] = s;
	  x = x + i;
	  b[10] = x;
	}

   Apart from predictive commoning on Load-Load and Store-Load chains, we
   also support Store-Store chains -- stores killed by another store can be
   eliminated.  Given the example below:

     for (i = 0; i < n; i++)
       {
	 a[i] = 1;
	 a[i+2] = 2;
       }

   it can be replaced with:

     t0 = a[0];
     t1 = a[1];
     for (i = 0; i < n; i++)
       {
	 a[i] = 1;
	 t2 = 2;
	 t0 = t1;
	 t1 = t2;
       }
     a[n] = t0;
     a[n+1] = t1;

   If the loop runs more than one iteration, it can be further simplified into:

     for (i = 0; i < n; i++)
       {
	 a[i] = 1;
       }
     a[n] = 2;
     a[n+1] = 2;

   The interesting part is that this can be viewed either as general store
   motion or as general dead store elimination, working both intra- and
   inter-iteration.

   TODO: For now, we don't support store-store chains in multi-exit loops.
   We force the loop not to be unrolled when a store-store chain is present,
   even if other chains might ask for unrolling.

   Predictive commoning can be generalized for arbitrary computations (not
   just memory loads), and also nontrivial transfer functions (e.g., replacing
   i * i with ii_last + 2 * i + 1), to generalize strength reduction.  */
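
/* As an illustration of the nontrivial transfer functions mentioned in the
   TODO above (a sketch of the generalization, not something the pass does
   today): the value of i * i in one iteration is computable from the
   previous one, since (i + 1) * (i + 1) == i * i + 2 * i + 1, so

     for (i = 0; i < 100; i++)
       use (i * i);

   could carry the square across iterations as

     ii = 0;
     for (i = 0; i < 100; i++)
       {
	 use (ii);
	 ii = ii + 2 * i + 1;
       }  */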

#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "backend.h"
#include "rtl.h"
#include "tree.h"
#include "gimple.h"
#include "predict.h"
#include "tree-pass.h"
#include "ssa.h"
#include "gimple-pretty-print.h"
#include "alias.h"
#include "fold-const.h"
#include "cfgloop.h"
#include "tree-eh.h"
#include "gimplify.h"
#include "gimple-iterator.h"
#include "gimplify-me.h"
#include "tree-ssa-loop-ivopts.h"
#include "tree-ssa-loop-manip.h"
#include "tree-ssa-loop-niter.h"
#include "tree-ssa-loop.h"
#include "tree-into-ssa.h"
#include "tree-dfa.h"
#include "tree-ssa.h"
#include "tree-data-ref.h"
#include "tree-scalar-evolution.h"
#include "params.h"
#include "tree-affine.h"
#include "builtins.h"

/* The maximum number of iterations between the considered memory
   references.  */

#define MAX_DISTANCE (target_avail_regs < 16 ? 4 : 8)
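
/* For instance, with MAX_DISTANCE 8 a reference may reuse a value accessed
   at most seven iterations earlier; references further apart are not placed
   into the same chain (see add_ref_to_chain and determine_roots_comp below).
   The smaller bound on targets with few available registers limits how many
   temporaries the transformation keeps live at once.  */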

/* Data references (or phi nodes that carry data reference values across
   loop iterations).  */

typedef struct dref_d
{
  /* The reference itself.  */
  struct data_reference *ref;

  /* The statement in which the reference appears.  */
  gimple *stmt;

  /* In case that STMT is a phi node, this field is set to the SSA name
     defined by it in replace_phis_by_defined_names (in order to avoid
     pointing to a phi node that got reallocated in the meantime).  */
  tree name_defined_by_phi;

  /* Distance of the reference from the root of the chain (in number of
     iterations of the loop).  */
  unsigned distance;

  /* Number of iterations offset from the first reference in the component.  */
  widest_int offset;

  /* Number of the reference in a component, in dominance ordering.  */
  unsigned pos;

  /* True if the memory reference is always accessed when the loop is
     entered.  */
  unsigned always_accessed : 1;
} *dref;


/* Type of the chain of the references.  */

enum chain_type
{
  /* The addresses of the references in the chain are constant.  */
  CT_INVARIANT,

  /* There are only loads in the chain.  */
  CT_LOAD,

  /* The root of the chain is a store, the rest are loads.  */
  CT_STORE_LOAD,

  /* There are only stores in the chain.  */
  CT_STORE_STORE,

  /* A combination of two chains.  */
  CT_COMBINATION
};

/* Chains of data references.  */

typedef struct chain
{
  /* Type of the chain.  */
  enum chain_type type;

  /* For combination chains, the operator and the two chains that are
     combined, and the type of the result.  */
  enum tree_code op;
  tree rslt_type;
  struct chain *ch1, *ch2;

  /* The references in the chain.  */
  vec<dref> refs;

  /* The maximum distance of a reference in the chain from the root.  */
  unsigned length;

  /* The variables used to copy the value throughout iterations.  */
  vec<tree> vars;

  /* Initializers for the variables.  */
  vec<tree> inits;

  /* Finalizers for the eliminated stores.  */
  vec<tree> finis;

  /* gimple stmts initializing the initial variables of the chain.  */
  gimple_seq init_seq;

  /* gimple stmts finalizing the eliminated stores of the chain.  */
  gimple_seq fini_seq;

  /* True if there is a use of a variable with the maximal distance
     that comes after the root in the loop.  */
  unsigned has_max_use_after : 1;

  /* True if all the memory references in the chain are always accessed.  */
  unsigned all_always_accessed : 1;

  /* True if this chain was combined together with some other chain.  */
  unsigned combined : 1;

  /* True if this is a store elimination chain and the eliminated stores
     store a loop invariant value into memory.  */
  unsigned inv_store_elimination : 1;
} *chain_p;


/* Describes the knowledge about the step of the memory references in
   the component.  */

enum ref_step_type
{
  /* The step is zero.  */
  RS_INVARIANT,

  /* The step is nonzero.  */
  RS_NONZERO,

  /* The step may or may not be nonzero.  */
  RS_ANY
};

/* Components of the data dependence graph.  */

struct component
{
  /* The references in the component.  */
  vec<dref> refs;

  /* What we know about the step of the references in the component.  */
  enum ref_step_type comp_step;

  /* True if all references in the component are stores and we try to do
     intra/inter loop iteration dead store elimination.  */
  bool eliminate_store_p;

  /* Next component in the list.  */
  struct component *next;
};

/* Bitmap of ssa names defined by looparound phi nodes covered by chains.  */

static bitmap looparound_phis;

/* Cache used by tree_to_aff_combination_expand.  */

static hash_map<tree, name_expansion *> *name_expansions;

/* Dumps data reference REF to FILE.  */

extern void dump_dref (FILE *, dref);
void
dump_dref (FILE *file, dref ref)
{
  if (ref->ref)
    {
      fprintf (file, "    ");
      print_generic_expr (file, DR_REF (ref->ref), TDF_SLIM);
      fprintf (file, " (id %u%s)\n", ref->pos,
	       DR_IS_READ (ref->ref) ? "" : ", write");

      fprintf (file, "      offset ");
      print_decs (ref->offset, file);
      fprintf (file, "\n");

      fprintf (file, "      distance %u\n", ref->distance);
    }
  else
    {
      if (gimple_code (ref->stmt) == GIMPLE_PHI)
	fprintf (file, "    looparound ref\n");
      else
	fprintf (file, "    combination ref\n");
      fprintf (file, "      in statement ");
      print_gimple_stmt (file, ref->stmt, 0, TDF_SLIM);
      fprintf (file, "\n");
      fprintf (file, "      distance %u\n", ref->distance);
    }

}

/* Dumps CHAIN to FILE.  */

extern void dump_chain (FILE *, chain_p);
void
dump_chain (FILE *file, chain_p chain)
{
  dref a;
  const char *chain_type;
  unsigned i;
  tree var;

  switch (chain->type)
    {
    case CT_INVARIANT:
      chain_type = "Load motion";
      break;

    case CT_LOAD:
      chain_type = "Loads-only";
      break;

    case CT_STORE_LOAD:
      chain_type = "Store-loads";
      break;

    case CT_STORE_STORE:
      chain_type = "Store-stores";
      break;

    case CT_COMBINATION:
      chain_type = "Combination";
      break;

    default:
      gcc_unreachable ();
    }

  fprintf (file, "%s chain %p%s\n", chain_type, (void *) chain,
	   chain->combined ? " (combined)" : "");
  if (chain->type != CT_INVARIANT)
    fprintf (file, "  max distance %u%s\n", chain->length,
	     chain->has_max_use_after ? "" : ", may reuse first");

  if (chain->type == CT_COMBINATION)
    {
      fprintf (file, "  equal to %p %s %p in type ",
	       (void *) chain->ch1, op_symbol_code (chain->op),
	       (void *) chain->ch2);
      print_generic_expr (file, chain->rslt_type, TDF_SLIM);
      fprintf (file, "\n");
    }

  if (chain->vars.exists ())
    {
      fprintf (file, "  vars");
      FOR_EACH_VEC_ELT (chain->vars, i, var)
	{
	  fprintf (file, " ");
	  print_generic_expr (file, var, TDF_SLIM);
	}
      fprintf (file, "\n");
    }

  if (chain->inits.exists ())
    {
      fprintf (file, "  inits");
      FOR_EACH_VEC_ELT (chain->inits, i, var)
	{
	  fprintf (file, " ");
	  print_generic_expr (file, var, TDF_SLIM);
	}
      fprintf (file, "\n");
    }

  fprintf (file, "  references:\n");
  FOR_EACH_VEC_ELT (chain->refs, i, a)
    dump_dref (file, a);

  fprintf (file, "\n");
}

/* Dumps CHAINS to FILE.  */

extern void dump_chains (FILE *, vec<chain_p> );
void
dump_chains (FILE *file, vec<chain_p> chains)
{
  chain_p chain;
  unsigned i;

  FOR_EACH_VEC_ELT (chains, i, chain)
    dump_chain (file, chain);
}

/* Dumps COMP to FILE.  */

extern void dump_component (FILE *, struct component *);
void
dump_component (FILE *file, struct component *comp)
{
  dref a;
  unsigned i;

  fprintf (file, "Component%s:\n",
	   comp->comp_step == RS_INVARIANT ? " (invariant)" : "");
  FOR_EACH_VEC_ELT (comp->refs, i, a)
    dump_dref (file, a);
  fprintf (file, "\n");
}

/* Dumps COMPS to FILE.  */

extern void dump_components (FILE *, struct component *);
void
dump_components (FILE *file, struct component *comps)
{
  struct component *comp;

  for (comp = comps; comp; comp = comp->next)
    dump_component (file, comp);
}

/* Frees a chain CHAIN.  */

static void
release_chain (chain_p chain)
{
  dref ref;
  unsigned i;

  if (chain == NULL)
    return;

  FOR_EACH_VEC_ELT (chain->refs, i, ref)
    free (ref);

  chain->refs.release ();
  chain->vars.release ();
  chain->inits.release ();
  if (chain->init_seq)
    gimple_seq_discard (chain->init_seq);

  chain->finis.release ();
  if (chain->fini_seq)
    gimple_seq_discard (chain->fini_seq);

  free (chain);
}

/* Frees CHAINS.  */

static void
release_chains (vec<chain_p> chains)
{
  unsigned i;
  chain_p chain;

  FOR_EACH_VEC_ELT (chains, i, chain)
    release_chain (chain);
  chains.release ();
}

/* Frees a component COMP.  */

static void
release_component (struct component *comp)
{
  comp->refs.release ();
  free (comp);
}

/* Frees list of components COMPS.  */

static void
release_components (struct component *comps)
{
  struct component *act, *next;

  for (act = comps; act; act = next)
    {
      next = act->next;
      release_component (act);
    }
}
/* Finds the root of the tree given by FATHERS that contains A, and performs
   path shortening.  */

static unsigned
component_of (unsigned fathers[], unsigned a)
{
  unsigned root, n;

  for (root = a; root != fathers[root]; root = fathers[root])
    continue;

  for (; a != root; a = n)
    {
      n = fathers[a];
      fathers[a] = root;
    }

  return root;
}

/* Join operation for the DFU (union-find) structure.  FATHERS gives the
   tree, SIZES are sizes of the components, A and B are components to
   merge.  */

static void
merge_comps (unsigned fathers[], unsigned sizes[], unsigned a, unsigned b)
{
  unsigned ca = component_of (fathers, a);
  unsigned cb = component_of (fathers, b);

  if (ca == cb)
    return;

  if (sizes[ca] < sizes[cb])
    {
      sizes[cb] += sizes[ca];
      fathers[ca] = cb;
    }
  else
    {
      sizes[ca] += sizes[cb];
      fathers[cb] = ca;
    }
}
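
/* An illustrative use of the two union-find helpers above (hypothetical
   indices, not taken from the pass): starting from the singleton components

     unsigned fathers[4] = { 0, 1, 2, 3 };
     unsigned sizes[4] = { 1, 1, 1, 1 };

   merge_comps (fathers, sizes, 0, 1) makes 0 the root of the component
   {0, 1}, and merge_comps (fathers, sizes, 1, 2) first finds the root of 1
   (which is 0) and links 2 under it; afterwards component_of (fathers, 2)
   returns 0 while component_of (fathers, 3) still returns 3.  */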

/* Returns true if A is a reference that is suitable for predictive commoning
   in the innermost loop that contains it.  REF_STEP is set according to the
   step of the reference A.  */

static bool
suitable_reference_p (struct data_reference *a, enum ref_step_type *ref_step)
{
  tree ref = DR_REF (a), step = DR_STEP (a);

  if (!step
      || TREE_THIS_VOLATILE (ref)
      || !is_gimple_reg_type (TREE_TYPE (ref))
      || tree_could_throw_p (ref))
    return false;

  if (integer_zerop (step))
    *ref_step = RS_INVARIANT;
  else if (integer_nonzerop (step))
    *ref_step = RS_NONZERO;
  else
    *ref_step = RS_ANY;

  return true;
}

/* Stores DR_OFFSET (DR) + DR_INIT (DR) to OFFSET.  */

static void
aff_combination_dr_offset (struct data_reference *dr, aff_tree *offset)
{
  tree type = TREE_TYPE (DR_OFFSET (dr));
  aff_tree delta;

  tree_to_aff_combination_expand (DR_OFFSET (dr), type, offset,
				  &name_expansions);
  aff_combination_const (&delta, type, wi::to_widest (DR_INIT (dr)));
  aff_combination_add (offset, &delta);
}

/* Determines the number of iterations of the innermost enclosing loop before
   B refers to exactly the same location as A and stores it to OFF.  If A and
   B do not have the same step, they never meet, or anything else fails,
   returns false, otherwise returns true.  Both A and B are assumed to
   satisfy suitable_reference_p.  */

static bool
determine_offset (struct data_reference *a, struct data_reference *b,
		  widest_int *off)
{
  aff_tree diff, baseb, step;
  tree typea, typeb;

  /* Check that both references access the location in the same type.  */
  typea = TREE_TYPE (DR_REF (a));
  typeb = TREE_TYPE (DR_REF (b));
  if (!useless_type_conversion_p (typeb, typea))
    return false;

  /* Check whether the base address and the step of both references are the
     same.  */
  if (!operand_equal_p (DR_STEP (a), DR_STEP (b), 0)
      || !operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0))
    return false;

  if (integer_zerop (DR_STEP (a)))
    {
      /* If the references have loop invariant address, check that they access
	 exactly the same location.  */
      *off = 0;
      return (operand_equal_p (DR_OFFSET (a), DR_OFFSET (b), 0)
	      && operand_equal_p (DR_INIT (a), DR_INIT (b), 0));
    }

  /* Compare the offsets of the addresses, and check whether the difference
     is a multiple of the step.  */
  aff_combination_dr_offset (a, &diff);
  aff_combination_dr_offset (b, &baseb);
  aff_combination_scale (&baseb, -1);
  aff_combination_add (&diff, &baseb);

  tree_to_aff_combination_expand (DR_STEP (a), TREE_TYPE (DR_STEP (a)),
				  &step, &name_expansions);
  return aff_combination_constant_multiple_p (&diff, &step, off);
}
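
/* A worked instance tied to the overview at the top of the file
   (illustrative values; the DR fields come from the data-ref analysis): for
   the int references A = a[i] and B = a[i+2], both DRs have base address &a
   and step 4, and their byte offsets differ by 8.  The difference
   offset(A) - offset(B) = -8 is the constant multiple -2 of the step, so
   *OFF is set to -2 -- exactly the "-2" in the a[i+2]{write,-2} annotation
   of the example in the initial comment.  */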

/* Returns the last basic block in LOOP for which we are sure that
   it is executed whenever the loop is entered.  */

static basic_block
last_always_executed_block (struct loop *loop)
{
  unsigned i;
  vec<edge> exits = get_loop_exit_edges (loop);
  edge ex;
  basic_block last = loop->latch;

  FOR_EACH_VEC_ELT (exits, i, ex)
    last = nearest_common_dominator (CDI_DOMINATORS, last, ex->src);
  exits.release ();

  return last;
}

/* Splits the dependence graph on DATAREFS described by DEPENDS into
   components.  */

static struct component *
split_data_refs_to_components (struct loop *loop,
			       vec<data_reference_p> datarefs,
			       vec<ddr_p> depends)
{
  unsigned i, n = datarefs.length ();
  unsigned ca, ia, ib, bad;
  unsigned *comp_father = XNEWVEC (unsigned, n + 1);
  unsigned *comp_size = XNEWVEC (unsigned, n + 1);
  struct component **comps;
  struct data_reference *dr, *dra, *drb;
  struct data_dependence_relation *ddr;
  struct component *comp_list = NULL, *comp;
  dref dataref;
  /* Don't do store elimination if the loop has multiple exit edges.  */
  bool eliminate_store_p = single_exit (loop) != NULL;
  basic_block last_always_executed = last_always_executed_block (loop);

  FOR_EACH_VEC_ELT (datarefs, i, dr)
    {
      if (!DR_REF (dr))
	{
	  /* A fake reference for call or asm_expr that may clobber memory;
	     just fail.  */
	  goto end;
	}
      /* The predcom pass isn't prepared to handle calls with data
	 references.  */
      if (is_gimple_call (DR_STMT (dr)))
	goto end;
      dr->aux = (void *) (size_t) i;
      comp_father[i] = i;
      comp_size[i] = 1;
    }

  /* A component reserved for the "bad" data references.  */
  comp_father[n] = n;
  comp_size[n] = 1;

  FOR_EACH_VEC_ELT (datarefs, i, dr)
    {
      enum ref_step_type dummy;

      if (!suitable_reference_p (dr, &dummy))
	{
	  ia = (unsigned) (size_t) dr->aux;
	  merge_comps (comp_father, comp_size, n, ia);
	}
    }

  FOR_EACH_VEC_ELT (depends, i, ddr)
    {
      widest_int dummy_off;

      if (DDR_ARE_DEPENDENT (ddr) == chrec_known)
	continue;

      dra = DDR_A (ddr);
      drb = DDR_B (ddr);

      /* Don't do store elimination if there is any unknown dependence for
	 any store data reference.  */
      if ((DR_IS_WRITE (dra) || DR_IS_WRITE (drb))
	  && (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know
	      || DDR_NUM_DIST_VECTS (ddr) == 0))
	eliminate_store_p = false;

      ia = component_of (comp_father, (unsigned) (size_t) dra->aux);
      ib = component_of (comp_father, (unsigned) (size_t) drb->aux);
      if (ia == ib)
	continue;

      bad = component_of (comp_father, n);

      /* If both A and B are reads, we may ignore unsuitable dependences.  */
      if (DR_IS_READ (dra) && DR_IS_READ (drb))
	{
	  if (ia == bad || ib == bad
	      || !determine_offset (dra, drb, &dummy_off))
	    continue;
	}
      /* If A is a read and B a write or vice versa and there is an
	 unsuitable dependence, instead of merging both components into a
	 component that will certainly not pass suitable_component_p, just
	 put the read into the bad component; perhaps at least the write
	 together with all the other data refs in its component will be
	 optimizable.  */
      else if (DR_IS_READ (dra) && ib != bad)
	{
	  if (ia == bad)
	    continue;
	  else if (!determine_offset (dra, drb, &dummy_off))
	    {
	      merge_comps (comp_father, comp_size, bad, ia);
	      continue;
	    }
	}
      else if (DR_IS_READ (drb) && ia != bad)
	{
	  if (ib == bad)
	    continue;
	  else if (!determine_offset (dra, drb, &dummy_off))
	    {
	      merge_comps (comp_father, comp_size, bad, ib);
	      continue;
	    }
	}
      else if (DR_IS_WRITE (dra) && DR_IS_WRITE (drb)
	       && ia != bad && ib != bad
	       && !determine_offset (dra, drb, &dummy_off))
	{
	  merge_comps (comp_father, comp_size, bad, ia);
	  merge_comps (comp_father, comp_size, bad, ib);
	  continue;
	}

      merge_comps (comp_father, comp_size, ia, ib);
    }

  if (eliminate_store_p)
    {
      tree niters = number_of_latch_executions (loop);

      /* Don't do store elimination if niters info is unknown because stores
	 in the last iteration can't be eliminated and we need to recover
	 them after the loop.  */
      eliminate_store_p = (niters != NULL_TREE && niters != chrec_dont_know);
    }

  comps = XCNEWVEC (struct component *, n);
  bad = component_of (comp_father, n);
  FOR_EACH_VEC_ELT (datarefs, i, dr)
    {
      ia = (unsigned) (size_t) dr->aux;
      ca = component_of (comp_father, ia);
      if (ca == bad)
	continue;

      comp = comps[ca];
      if (!comp)
	{
	  comp = XCNEW (struct component);
	  comp->refs.create (comp_size[ca]);
	  comp->eliminate_store_p = eliminate_store_p;
	  comps[ca] = comp;
	}

      dataref = XCNEW (struct dref_d);
      dataref->ref = dr;
      dataref->stmt = DR_STMT (dr);
      dataref->offset = 0;
      dataref->distance = 0;

      dataref->always_accessed
	= dominated_by_p (CDI_DOMINATORS, last_always_executed,
			  gimple_bb (dataref->stmt));
      dataref->pos = comp->refs.length ();
      comp->refs.quick_push (dataref);
      if (DR_IS_READ (dr))
	comp->eliminate_store_p = false;
    }

  for (i = 0; i < n; i++)
    {
      comp = comps[i];
      if (comp)
	{
	  comp->next = comp_list;
	  comp_list = comp;
	}
    }
  free (comps);

end:
  free (comp_father);
  free (comp_size);
  return comp_list;
}

/* Returns true if the component COMP satisfies the conditions
   described in 2) at the beginning of this file.  LOOP is the current
   loop.  */

static bool
suitable_component_p (struct loop *loop, struct component *comp)
{
  unsigned i;
  dref a, first;
  basic_block ba, bp = loop->header;
  bool ok, has_write = false;

  FOR_EACH_VEC_ELT (comp->refs, i, a)
    {
      ba = gimple_bb (a->stmt);

      if (!just_once_each_iteration_p (loop, ba))
	return false;

      gcc_assert (dominated_by_p (CDI_DOMINATORS, ba, bp));
      bp = ba;

      if (DR_IS_WRITE (a->ref))
	has_write = true;
    }

  first = comp->refs[0];
  ok = suitable_reference_p (first->ref, &comp->comp_step);
  gcc_assert (ok);
  first->offset = 0;

  for (i = 1; comp->refs.iterate (i, &a); i++)
    {
      if (!determine_offset (first->ref, a->ref, &a->offset))
	return false;

      enum ref_step_type a_step;
      gcc_checking_assert (suitable_reference_p (a->ref, &a_step)
			   && a_step == comp->comp_step);
    }

  /* If there is a write inside the component, we must know whether the
     step is nonzero or not -- we would not otherwise be able to recognize
     whether the value accessed by reads comes from the OFFSET-th iteration
     or the previous one.  */
  if (has_write && comp->comp_step == RS_ANY)
    return false;

  return true;
}

/* Check the conditions on references inside each of components COMPS,
   and remove the unsuitable components from the list.  The new list
   of components is returned.  The conditions are described in 2) at
   the beginning of this file.  LOOP is the current loop.  */

static struct component *
filter_suitable_components (struct loop *loop, struct component *comps)
{
  struct component **comp, *act;

  for (comp = &comps; *comp; )
    {
      act = *comp;
      if (suitable_component_p (loop, act))
	comp = &act->next;
      else
	{
	  dref ref;
	  unsigned i;

	  *comp = act->next;
	  FOR_EACH_VEC_ELT (act->refs, i, ref)
	    free (ref);
	  release_component (act);
	}
    }

  return comps;
}

/* Compares two drefs A and B by their offset and position.  Callback for
   qsort.  */

static int
order_drefs (const void *a, const void *b)
{
  const dref *const da = (const dref *) a;
  const dref *const db = (const dref *) b;
  int offcmp = wi::cmps ((*da)->offset, (*db)->offset);

  if (offcmp != 0)
    return offcmp;

  return (*da)->pos - (*db)->pos;
}

/* Returns root of the CHAIN.  */

static inline dref
get_chain_root (chain_p chain)
{
  return chain->refs[0];
}

/* Given CHAIN, returns the last ref at DISTANCE, or NULL if it doesn't
   exist.  */

static inline dref
get_chain_last_ref_at (chain_p chain, unsigned distance)
{
  unsigned i;

  for (i = chain->refs.length (); i > 0; i--)
    if (distance == chain->refs[i - 1]->distance)
      break;

  return (i > 0) ? chain->refs[i - 1] : NULL;
}

/* Adds REF to the chain CHAIN.  */

static void
add_ref_to_chain (chain_p chain, dref ref)
{
  dref root = get_chain_root (chain);

  gcc_assert (wi::les_p (root->offset, ref->offset));
  widest_int dist = ref->offset - root->offset;
  if (wi::leu_p (MAX_DISTANCE, dist))
    {
      free (ref);
      return;
    }
  gcc_assert (wi::fits_uhwi_p (dist));

  chain->refs.safe_push (ref);

  ref->distance = dist.to_uhwi ();

  if (ref->distance >= chain->length)
    {
      chain->length = ref->distance;
      chain->has_max_use_after = false;
    }

  /* Don't set the flag for a store-store chain since there is no use.  */
  if (chain->type != CT_STORE_STORE
      && ref->distance == chain->length
      && ref->pos > root->pos)
    chain->has_max_use_after = true;

  chain->all_always_accessed &= ref->always_accessed;
}
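
/* For the a[] component of the initial example: sorted by offset the refs
   are a[i+2]{-2}, a[i+1]{-1}, a[i]{0}, the chain is rooted at a[i+2], and
   adding a[i+1] and a[i] here gives them distances 1 and 2, so chain->length
   becomes 2.  Had the component contained two references MAX_DISTANCE or
   more iterations apart, the farther one would be freed above instead of
   extending the chain.  */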

/* Returns the chain for invariant component COMP.  */

static chain_p
make_invariant_chain (struct component *comp)
{
  chain_p chain = XCNEW (struct chain);
  unsigned i;
  dref ref;

  chain->type = CT_INVARIANT;

  chain->all_always_accessed = true;

  FOR_EACH_VEC_ELT (comp->refs, i, ref)
    {
      chain->refs.safe_push (ref);
      chain->all_always_accessed &= ref->always_accessed;
    }

  chain->inits = vNULL;
  chain->finis = vNULL;

  return chain;
}

/* Make a new chain of type TYPE rooted at REF.  */

static chain_p
make_rooted_chain (dref ref, enum chain_type type)
{
  chain_p chain = XCNEW (struct chain);

  chain->type = type;
  chain->refs.safe_push (ref);
  chain->all_always_accessed = ref->always_accessed;
  ref->distance = 0;

  chain->inits = vNULL;
  chain->finis = vNULL;

  return chain;
}

/* Returns true if CHAIN is not trivial.  */

static bool
nontrivial_chain_p (chain_p chain)
{
  return chain != NULL && chain->refs.length () > 1;
}

/* Returns the ssa name that contains the value of REF, or NULL_TREE if there
   is no such name.  */

static tree
name_for_ref (dref ref)
{
  tree name;

  if (is_gimple_assign (ref->stmt))
    {
      if (!ref->ref || DR_IS_READ (ref->ref))
	name = gimple_assign_lhs (ref->stmt);
      else
	name = gimple_assign_rhs1 (ref->stmt);
    }
  else
    name = PHI_RESULT (ref->stmt);

  return (TREE_CODE (name) == SSA_NAME ? name : NULL_TREE);
}

/* Returns true if REF is a valid initializer for ROOT with given DISTANCE (in
   iterations of the innermost enclosing loop).  */

static bool
valid_initializer_p (struct data_reference *ref,
		     unsigned distance, struct data_reference *root)
{
  aff_tree diff, base, step;
  widest_int off;

  /* Both REF and ROOT must be accessing the same object.  */
  if (!operand_equal_p (DR_BASE_ADDRESS (ref), DR_BASE_ADDRESS (root), 0))
    return false;

  /* The initializer is defined outside of the loop, hence its address must
     be invariant inside the loop.  */
  gcc_assert (integer_zerop (DR_STEP (ref)));

  /* If the address of the reference is invariant, the initializer must
     access exactly the same location.  */
  if (integer_zerop (DR_STEP (root)))
    return (operand_equal_p (DR_OFFSET (ref), DR_OFFSET (root), 0)
	    && operand_equal_p (DR_INIT (ref), DR_INIT (root), 0));

  /* Verify that this index of REF is equal to the root's index at
     -DISTANCE-th iteration.  */
  aff_combination_dr_offset (root, &diff);
  aff_combination_dr_offset (ref, &base);
  aff_combination_scale (&base, -1);
  aff_combination_add (&diff, &base);

  tree_to_aff_combination_expand (DR_STEP (root), TREE_TYPE (DR_STEP (root)),
				  &step, &name_expansions);
  if (!aff_combination_constant_multiple_p (&diff, &step, &off))
    return false;

  if (off != distance)
    return false;

  return true;
}

/* Finds the looparound phi node of LOOP that copies the value of REF, and if
   its initial value is correct (equal to the initial value of REF shifted by
   one iteration), returns the phi node.  Otherwise, NULL is returned.  ROOT
   is the root of the current chain.  */

static gphi *
find_looparound_phi (struct loop *loop, dref ref, dref root)
{
  tree name, init, init_ref;
  gphi *phi = NULL;
  gimple *init_stmt;
  edge latch = loop_latch_edge (loop);
  struct data_reference init_dr;
  gphi_iterator psi;

  if (is_gimple_assign (ref->stmt))
    {
      if (DR_IS_READ (ref->ref))
	name = gimple_assign_lhs (ref->stmt);
      else
	name = gimple_assign_rhs1 (ref->stmt);
    }
  else
    name = PHI_RESULT (ref->stmt);
  if (!name)
    return NULL;

  for (psi = gsi_start_phis (loop->header); !gsi_end_p (psi); gsi_next (&psi))
    {
      phi = psi.phi ();
      if (PHI_ARG_DEF_FROM_EDGE (phi, latch) == name)
	break;
    }

  if (gsi_end_p (psi))
    return NULL;

  init = PHI_ARG_DEF_FROM_EDGE (phi, loop_preheader_edge (loop));
  if (TREE_CODE (init) != SSA_NAME)
    return NULL;
  init_stmt = SSA_NAME_DEF_STMT (init);
  if (gimple_code (init_stmt) != GIMPLE_ASSIGN)
    return NULL;
  gcc_assert (gimple_assign_lhs (init_stmt) == init);

  init_ref = gimple_assign_rhs1 (init_stmt);
  if (!REFERENCE_CLASS_P (init_ref)
      && !DECL_P (init_ref))
    return NULL;

  /* Analyze the behavior of INIT_REF with respect to LOOP (innermost
     loop enclosing PHI).  */
  memset (&init_dr, 0, sizeof (struct data_reference));
  DR_REF (&init_dr) = init_ref;
  DR_STMT (&init_dr) = phi;
  if (!dr_analyze_innermost (&DR_INNERMOST (&init_dr), init_ref, loop))
    return NULL;

  if (!valid_initializer_p (&init_dr, ref->distance + 1, root->ref))
    return NULL;

  return phi;
}
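
/* A source-level situation in which such a phi exists (a sketch; the
   variable names are made up): if PRE or the user already carries a value
   around the loop, as in

     t = a[0];
     for (i = 0; i < n; i++)
       {
	 use (t);
	 t = a[i + 1];
       }

   then the header phi for t has the load of a[i + 1] on the latch edge and
   the load of a[0] as its initial value, and valid_initializer_p verifies
   that a[0] is a[i + 1] shifted back by ref->distance + 1 iterations.  */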

/* Adds a reference for the looparound copy of REF in PHI to CHAIN.  */

static void
insert_looparound_copy (chain_p chain, dref ref, gphi *phi)
{
  dref nw = XCNEW (struct dref_d), aref;
  unsigned i;

  nw->stmt = phi;
  nw->distance = ref->distance + 1;
  nw->always_accessed = 1;

  FOR_EACH_VEC_ELT (chain->refs, i, aref)
    if (aref->distance >= nw->distance)
      break;
  chain->refs.safe_insert (i, nw);

  if (nw->distance > chain->length)
    {
      chain->length = nw->distance;
      chain->has_max_use_after = false;
    }
}

/* For references in CHAIN that are copied around the LOOP (created
   previously by PRE, or by the user), add the results of such copies to the
   chain.  This enables us to remove the copies by unrolling, and may need
   fewer registers (also, it may allow us to combine chains together).  */

static void
add_looparound_copies (struct loop *loop, chain_p chain)
{
  unsigned i;
  dref ref, root = get_chain_root (chain);
  gphi *phi;

  if (chain->type == CT_STORE_STORE)
    return;

  FOR_EACH_VEC_ELT (chain->refs, i, ref)
    {
      phi = find_looparound_phi (loop, ref, root);
      if (!phi)
	continue;

      bitmap_set_bit (looparound_phis, SSA_NAME_VERSION (PHI_RESULT (phi)));
      insert_looparound_copy (chain, ref, phi);
    }
}

/* Find roots of the values and determine distances in the component COMP.
   The references are redistributed into CHAINS.  LOOP is the current
   loop.  */

static void
determine_roots_comp (struct loop *loop,
		      struct component *comp,
		      vec<chain_p> *chains)
{
  unsigned i;
  dref a;
  chain_p chain = NULL;
  widest_int last_ofs = 0;
  enum chain_type type;

  /* Invariants are handled specially.  */
  if (comp->comp_step == RS_INVARIANT)
    {
      chain = make_invariant_chain (comp);
      chains->safe_push (chain);
      return;
    }

  /* Trivial component.  */
  if (comp->refs.length () <= 1)
    return;

  comp->refs.qsort (order_drefs);
  FOR_EACH_VEC_ELT (comp->refs, i, a)
    {
      if (!chain
	  || (!comp->eliminate_store_p && DR_IS_WRITE (a->ref))
	  || wi::leu_p (MAX_DISTANCE, a->offset - last_ofs))
	{
	  if (nontrivial_chain_p (chain))
	    {
	      add_looparound_copies (loop, chain);
	      chains->safe_push (chain);
	    }
	  else
	    release_chain (chain);

	  if (DR_IS_READ (a->ref))
	    type = CT_LOAD;
	  else
	    type = comp->eliminate_store_p ? CT_STORE_STORE : CT_STORE_LOAD;

	  chain = make_rooted_chain (a, type);
	  last_ofs = a->offset;
	  continue;
	}

      add_ref_to_chain (chain, a);
    }

  if (nontrivial_chain_p (chain))
    {
      add_looparound_copies (loop, chain);
      chains->safe_push (chain);
    }
  else
    release_chain (chain);
}
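
/* To make the chain-splitting rule above concrete (a sketch over sorted
   offsets, not code from the pass): in a component whose refs sort to
   offsets { 0, 1, 9 } with MAX_DISTANCE 8, the ref at offset 9 is too far
   from the root at offset 0 (9 - 0 >= 8) and therefore starts a fresh
   chain; and when store elimination is not enabled for the component, every
   write also starts a new chain, which is what roots Store-Load chains at
   the write.  */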

/* Find roots of the values and determine distances in components COMPS, and
   separate the references into CHAINS.  LOOP is the current loop.  */

static void
determine_roots (struct loop *loop,
		 struct component *comps, vec<chain_p> *chains)
{
  struct component *comp;

  for (comp = comps; comp; comp = comp->next)
    determine_roots_comp (loop, comp, chains);
}

/* Replace the reference in statement STMT with temporary variable
   NEW_TREE.  If SET is true, NEW_TREE is instead initialized to the value of
   the reference in the statement.  IN_LHS is true if the reference
   is in the lhs of STMT, false if it is in rhs.  */

static void
replace_ref_with (gimple *stmt, tree new_tree, bool set, bool in_lhs)
{
  tree val;
  gassign *new_stmt;
  gimple_stmt_iterator bsi, psi;

  if (gimple_code (stmt) == GIMPLE_PHI)
    {
      gcc_assert (!in_lhs && !set);

      val = PHI_RESULT (stmt);
      bsi = gsi_after_labels (gimple_bb (stmt));
      psi = gsi_for_stmt (stmt);
      remove_phi_node (&psi, false);

      /* Turn the phi node into GIMPLE_ASSIGN.  */
      new_stmt = gimple_build_assign (val, new_tree);
      gsi_insert_before (&bsi, new_stmt, GSI_NEW_STMT);
      return;
    }

  /* Since the reference is of gimple_reg type, it should only
     appear as lhs or rhs of a modify statement.  */
  gcc_assert (is_gimple_assign (stmt));

  bsi = gsi_for_stmt (stmt);

  /* If we do not need to initialize NEW_TREE, just replace the use of OLD.  */
  if (!set)
    {
      gcc_assert (!in_lhs);
      gimple_assign_set_rhs_from_tree (&bsi, new_tree);
      stmt = gsi_stmt (bsi);
      update_stmt (stmt);
      return;
    }

  if (in_lhs)
    {
      /* We have statement

	 OLD = VAL

	 If OLD is a memory reference, then VAL is gimple_val, and we
	 transform this to

	 OLD = VAL
	 NEW = VAL

	 Otherwise, we are replacing a combination chain,
	 VAL is the expression that performs the combination, and OLD is an
	 SSA name.  In this case, we transform the assignment to

	 OLD = VAL
	 NEW = OLD

	 */

      val = gimple_assign_lhs (stmt);
      if (TREE_CODE (val) != SSA_NAME)
	{
	  val = gimple_assign_rhs1 (stmt);
	  gcc_assert (gimple_assign_single_p (stmt));
	  if (TREE_CLOBBER_P (val))
	    val = get_or_create_ssa_default_def (cfun, SSA_NAME_VAR (new_tree));
	  else
	    gcc_assert (gimple_assign_copy_p (stmt));
	}
    }
  else
    {
      /* VAL = OLD

	 is transformed to

	 VAL = OLD
	 NEW = VAL  */

      val = gimple_assign_lhs (stmt);
    }

  new_stmt = gimple_build_assign (new_tree, unshare_expr (val));
  gsi_insert_after (&bsi, new_stmt, GSI_NEW_STMT);
}

/* Returns a memory reference to DR in the (NITERS + ITER)-th iteration
   of the loop it was analyzed in.  Append init stmts to STMTS.  */

static tree
ref_at_iteration (data_reference_p dr, int iter,
		  gimple_seq *stmts, tree niters = NULL_TREE)
{
  tree off = DR_OFFSET (dr);
  tree coff = DR_INIT (dr);
  tree ref = DR_REF (dr);
  enum tree_code ref_code = ERROR_MARK;
  tree ref_type = NULL_TREE;
  tree ref_op1 = NULL_TREE;
  tree ref_op2 = NULL_TREE;
  tree new_offset;

  if (iter != 0)
    {
      new_offset = size_binop (MULT_EXPR, DR_STEP (dr), ssize_int (iter));
      if (TREE_CODE (new_offset) == INTEGER_CST)
	coff = size_binop (PLUS_EXPR, coff, new_offset);
      else
	off = size_binop (PLUS_EXPR, off, new_offset);
    }

  if (niters != NULL_TREE)
    {
      niters = fold_convert (ssizetype, niters);
      new_offset = size_binop (MULT_EXPR, DR_STEP (dr), niters);
      if (TREE_CODE (niters) == INTEGER_CST)
	coff = size_binop (PLUS_EXPR, coff, new_offset);
      else
	off = size_binop (PLUS_EXPR, off, new_offset);
    }

  /* While data-ref analysis punts on bit offsets it still handles
     bitfield accesses at byte boundaries.  Cope with that.  Note that
     if the bitfield object also starts at a byte-boundary we can simply
     replicate the COMPONENT_REF, but we have to subtract the component's
     byte-offset from the MEM_REF address first.
     Otherwise we simply build a BIT_FIELD_REF knowing that the bits
     start at offset zero.  */
  if (TREE_CODE (ref) == COMPONENT_REF
      && DECL_BIT_FIELD (TREE_OPERAND (ref, 1)))
    {
      unsigned HOST_WIDE_INT boff;
      tree field = TREE_OPERAND (ref, 1);
      tree offset = component_ref_field_offset (ref);
      ref_type = TREE_TYPE (ref);
      boff = tree_to_uhwi (DECL_FIELD_BIT_OFFSET (field));
      /* This can occur in Ada.  See the comment in get_bit_range.  */
      if (boff % BITS_PER_UNIT != 0
	  || !tree_fits_uhwi_p (offset))
	{
	  ref_code = BIT_FIELD_REF;
	  ref_op1 = DECL_SIZE (field);
	  ref_op2 = bitsize_zero_node;
	}
      else
	{
	  boff >>= LOG2_BITS_PER_UNIT;
	  boff += tree_to_uhwi (offset);
	  coff = size_binop (MINUS_EXPR, coff, ssize_int (boff));
	  ref_code = COMPONENT_REF;
	  ref_op1 = field;
	  ref_op2 = TREE_OPERAND (ref, 2);
	  ref = TREE_OPERAND (ref, 0);
	}
    }
  tree addr = fold_build_pointer_plus (DR_BASE_ADDRESS (dr), off);
  addr = force_gimple_operand_1 (unshare_expr (addr), stmts,
				 is_gimple_mem_ref_addr, NULL_TREE);
  tree alias_ptr = fold_convert (reference_alias_ptr_type (ref), coff);
  tree type = build_aligned_type (TREE_TYPE (ref),
				  get_object_alignment (ref));
  ref = build2 (MEM_REF, type, addr, alias_ptr);
  if (ref_type)
    ref = build3 (ref_code, ref_type, ref, ref_op1, ref_op2);
  return ref;
}
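
/* For example (an illustration of the parameters, not a verbatim call from
   this file): given the DR for a[i] with DR_STEP 4, ref_at_iteration (dr,
   -2, &stmts) builds a MEM_REF equivalent to a[i - 2]; the constant
   4 * -2 = -8 is folded into the MEM_REF offset operand.  This is the shape
   in which the initial values R0..R(N-1) described in 5) of the overview
   are loaded before the loop.  */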

/* Get the initialization expression for the INDEX-th temporary variable
   of CHAIN.  */

static tree
get_init_expr (chain_p chain, unsigned index)
{
  if (chain->type == CT_COMBINATION)
    {
      tree e1 = get_init_expr (chain->ch1, index);
      tree e2 = get_init_expr (chain->ch2, index);

      return fold_build2 (chain->op, chain->rslt_type, e1, e2);
    }
  else
    return chain->inits[index];
}

/* Returns a new temporary variable used for the I-th variable carrying
   value of REF.  The variable's uid is marked in TMP_VARS.  */

static tree
predcom_tmp_var (tree ref, unsigned i, bitmap tmp_vars)
{
  tree type = TREE_TYPE (ref);
  /* We never access the components of the temporary variable in predictive
     commoning.  */
  tree var = create_tmp_reg (type, get_lsm_tmp_name (ref, i));
  bitmap_set_bit (tmp_vars, DECL_UID (var));
  return var;
}

/* Creates the variables for CHAIN, as well as phi nodes for them and
   initialization on entry to LOOP.  Uids of the newly created
   temporary variables are marked in TMP_VARS.  */

static void
initialize_root_vars (struct loop *loop, chain_p chain, bitmap tmp_vars)
{
  unsigned i;
  unsigned n = chain->length;
  dref root = get_chain_root (chain);
  bool reuse_first = !chain->has_max_use_after;
  tree ref, init, var, next;
  gphi *phi;
  gimple_seq stmts;
  edge entry = loop_preheader_edge (loop), latch = loop_latch_edge (loop);

  /* If N == 0, then all the references are within the single iteration.  And
     since this is a nonempty chain, reuse_first cannot be true.  */
  gcc_assert (n > 0 || !reuse_first);

  chain->vars.create (n + 1);

  if (chain->type == CT_COMBINATION)
    ref = gimple_assign_lhs (root->stmt);
  else
    ref = DR_REF (root->ref);

  for (i = 0; i < n + (reuse_first ? 0 : 1); i++)
    {
      var = predcom_tmp_var (ref, i, tmp_vars);
      chain->vars.quick_push (var);
    }
  if (reuse_first)
    chain->vars.quick_push (chain->vars[0]);

  FOR_EACH_VEC_ELT (chain->vars, i, var)
    chain->vars[i] = make_ssa_name (var);

  for (i = 0; i < n; i++)
    {
      var = chain->vars[i];
      next = chain->vars[i + 1];
      init = get_init_expr (chain, i);

      init = force_gimple_operand (init, &stmts, true, NULL_TREE);
      if (stmts)
	gsi_insert_seq_on_edge_immediate (entry, stmts);

      phi = create_phi_node (var, loop->header);
      add_phi_arg (phi, init, entry, UNKNOWN_LOCATION);
      add_phi_arg (phi, next, latch, UNKNOWN_LOCATION);
    }
}

/* For inter-iteration store elimination CHAIN in LOOP, returns true if
   all stores to be eliminated store loop invariant values into memory.
   In this case, we can use these invariant values directly after LOOP.  */

static bool
is_inv_store_elimination_chain (struct loop *loop, chain_p chain)
{
  if (chain->length == 0 || chain->type != CT_STORE_STORE)
    return false;

  gcc_assert (!chain->has_max_use_after);

  /* If the loop iterates an unknown number of times or fewer times than
     chain->length, we still need to set up the root variable and propagate
     it with a PHI node.  */
  tree niters = number_of_latch_executions (loop);
  if (TREE_CODE (niters) != INTEGER_CST || wi::leu_p (niters, chain->length))
    return false;

  /* Check stores in the chain for elimination if they only store loop
     invariant values.  */
  for (unsigned i = 0; i < chain->length; i++)
    {
      dref a = get_chain_last_ref_at (chain, i);
      if (a == NULL)
	continue;

      gimple *def_stmt, *stmt = a->stmt;
      if (!gimple_assign_single_p (stmt))
	return false;

      tree val = gimple_assign_rhs1 (stmt);
      if (TREE_CLOBBER_P (val))
	return false;

      if (CONSTANT_CLASS_P (val))
	continue;

      if (TREE_CODE (val) != SSA_NAME)
	return false;

      def_stmt = SSA_NAME_DEF_STMT (val);
      if (gimple_nop_p (def_stmt))
	continue;

      if (flow_bb_inside_loop_p (loop, gimple_bb (def_stmt)))
	return false;
    }
  return true;
}
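
/* The store-store example from the overview qualifies: in

     for (i = 0; i < n; i++)
       {
	 a[i] = 1;
	 a[i+2] = 2;
       }

   both eliminable stores write constants, so when the number of latch
   executions is a known constant larger than the chain length the predicate
   returns true, and the pass can emit a[n] = 2 and a[n+1] = 2 directly
   after the loop instead of carrying the values in temporaries.  */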

/* Creates root variables for store elimination CHAIN in which the stores for
   elimination only store loop invariant values.  In this case, we neither
   need to load root variables before the loop nor propagate them with PHI
   nodes.  */

static void
initialize_root_vars_store_elim_1 (chain_p chain)
{
  tree var;
  unsigned i, n = chain->length;

  chain->vars.create (n);
  chain->vars.safe_grow_cleared (n);

  /* Initialize root value for eliminated stores at each distance.  */
  for (i = 0; i < n; i++)
    {
      dref a = get_chain_last_ref_at (chain, i);
      if (a == NULL)
	continue;

      var = gimple_assign_rhs1 (a->stmt);
      chain->vars[a->distance] = var;
    }

  /* We don't propagate values with PHI nodes, so manually propagate the
     value to bubble positions.  */
  var = chain->vars[0];
  for (i = 1; i < n; i++)
    {
      if (chain->vars[i] != NULL_TREE)
	{
	  var = chain->vars[i];
	  continue;
	}
      chain->vars[i] = var;
    }

  /* Reverse the vector.  */
  for (i = 0; i < n / 2; i++)
    std::swap (chain->vars[i], chain->vars[n - i - 1]);
}
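
/* A small trace of the two loops above (a hypothetical chain of length 3
   with stores only at distances 0 and 2, storing r0 and r2): the first loop
   fills vars = { r0, NULL, r2 }, the propagation turns the NULL "bubble" at
   distance 1 into r0, giving { r0, r0, r2 }, and the final reversal yields
   { r2, r0, r0 }.  */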
1734
1735 /* Creates root variables for store elimination CHAIN in which stores for
1736 elimination store loop variant values. In this case, we may need to
1737 load root variables before LOOP and propagate it with PHI nodes. Uids
1738 of the newly created root variables are marked in TMP_VARS. */
1739
1740 static void
1741 initialize_root_vars_store_elim_2 (struct loop *loop,
1742 chain_p chain, bitmap tmp_vars)
1743 {
1744 unsigned i, n = chain->length;
1745 tree ref, init, var, next, val, phi_result;
1746 gimple *stmt;
1747 gimple_seq stmts;
1748
1749 chain->vars.create (n);
1750
1751 ref = DR_REF (get_chain_root (chain)->ref);
1752 for (i = 0; i < n; i++)
1753 {
1754 var = predcom_tmp_var (ref, i, tmp_vars);
1755 chain->vars.quick_push (var);
1756 }
1757
1758 FOR_EACH_VEC_ELT (chain->vars, i, var)
1759 chain->vars[i] = make_ssa_name (var);
1760
1761 /* Root values are either rhs operand of stores to be eliminated, or
1762 loaded from memory before loop. */
1763 auto_vec<tree> vtemps;
1764 vtemps.safe_grow_cleared (n);
1765 for (i = 0; i < n; i++)
1766 {
1767 init = get_init_expr (chain, i);
1768 if (init == NULL_TREE)
1769 {
1770 /* Root value is rhs operand of the store to be eliminated if
1771 it isn't loaded from memory before loop. */
1772 dref a = get_chain_last_ref_at (chain, i);
1773 val = gimple_assign_rhs1 (a->stmt);
1774 if (TREE_CLOBBER_P (val))
1775 val = get_or_create_ssa_default_def (cfun, SSA_NAME_VAR (var));
1776
1777 vtemps[n - i - 1] = val;
1778 }
1779 else
1780 {
1781 edge latch = loop_latch_edge (loop);
1782 edge entry = loop_preheader_edge (loop);
1783
1784 /* Root value is loaded from memory before loop, we also need
1785 to add PHI nodes to propagate the value across iterations. */
1786 init = force_gimple_operand (init, &stmts, true, NULL_TREE);
1787 if (stmts)
1788 gsi_insert_seq_on_edge_immediate (entry, stmts);
1789
1790 next = chain->vars[n - i];
1791 phi_result = copy_ssa_name (next);
1792 gphi *phi = create_phi_node (phi_result, loop->header);
1793 add_phi_arg (phi, init, entry, UNKNOWN_LOCATION);
1794 add_phi_arg (phi, next, latch, UNKNOWN_LOCATION);
1795 vtemps[n - i - 1] = phi_result;
1796 }
1797 }
1798
1799 /* Find the insertion position. */
1800 dref last = get_chain_root (chain);
1801 for (i = 0; i < chain->refs.length (); i++)
1802 {
1803 if (chain->refs[i]->pos > last->pos)
1804 last = chain->refs[i];
1805 }
1806
1807 gimple_stmt_iterator gsi = gsi_for_stmt (last->stmt);
1808
1809 /* Insert statements copying root value to root variable. */
1810 for (i = 0; i < n; i++)
1811 {
1812 var = chain->vars[i];
1813 val = vtemps[i];
1814 stmt = gimple_build_assign (var, val);
1815 gsi_insert_after (&gsi, stmt, GSI_NEW_STMT);
1816 }
1817 }
1818
1819 /* Generates stores for CHAIN's eliminated stores in LOOP's last
1820 (CHAIN->length - 1) iterations. */
1821
1822 static void
1823 finalize_eliminated_stores (struct loop *loop, chain_p chain)
1824 {
1825 unsigned i, n = chain->length;
1826
1827 for (i = 0; i < n; i++)
1828 {
1829 tree var = chain->vars[i];
1830 tree fini = chain->finis[n - i - 1];
1831 gimple *stmt = gimple_build_assign (fini, var);
1832
1833 gimple_seq_add_stmt_without_update (&chain->fini_seq, stmt);
1834 }
1835
1836 if (chain->fini_seq)
1837 {
1838 gsi_insert_seq_on_edge_immediate (single_exit (loop), chain->fini_seq);
1839 chain->fini_seq = NULL;
1840 }
1841 }
1842
1843 /* Initializes a variable for load motion for ROOT and prepares phi nodes and
1844 initialization on entry to LOOP if necessary. The ssa name for the variable
1845 is stored in VARS. If WRITTEN is true, also a phi node to copy its value
1846 around the loop is created. Uid of the newly created temporary variable
1847 is marked in TMP_VARS. INITS is the list containing the (single)
1848 initializer. */
1849
1850 static void
1851 initialize_root_vars_lm (struct loop *loop, dref root, bool written,
1852 vec<tree> *vars, vec<tree> inits,
1853 bitmap tmp_vars)
1854 {
1855 unsigned i;
1856 tree ref = DR_REF (root->ref), init, var, next;
1857 gimple_seq stmts;
1858 gphi *phi;
1859 edge entry = loop_preheader_edge (loop), latch = loop_latch_edge (loop);
1860
1861 /* Find the initializer for the variable, and check that it cannot
1862 trap. */
1863 init = inits[0];
1864
1865 vars->create (written ? 2 : 1);
1866 var = predcom_tmp_var (ref, 0, tmp_vars);
1867 vars->quick_push (var);
1868 if (written)
1869 vars->quick_push ((*vars)[0]);
1870
1871 FOR_EACH_VEC_ELT (*vars, i, var)
1872 (*vars)[i] = make_ssa_name (var);
1873
1874 var = (*vars)[0];
1875
1876 init = force_gimple_operand (init, &stmts, written, NULL_TREE);
1877 if (stmts)
1878 gsi_insert_seq_on_edge_immediate (entry, stmts);
1879
1880 if (written)
1881 {
1882 next = (*vars)[1];
1883 phi = create_phi_node (var, loop->header);
1884 add_phi_arg (phi, init, entry, UNKNOWN_LOCATION);
1885 add_phi_arg (phi, next, latch, UNKNOWN_LOCATION);
1886 }
1887 else
1888 {
1889 gassign *init_stmt = gimple_build_assign (var, init);
1890 gsi_insert_on_edge_immediate (entry, init_stmt);
1891 }
1892 }
1893
1894
1895 /* Execute load motion for references in chain CHAIN. Uids of the newly
1896 created temporary variables are marked in TMP_VARS. */
1897
1898 static void
1899 execute_load_motion (struct loop *loop, chain_p chain, bitmap tmp_vars)
1900 {
1901 auto_vec<tree> vars;
1902 dref a;
1903 unsigned n_writes = 0, ridx, i;
1904 tree var;
1905
1906 gcc_assert (chain->type == CT_INVARIANT);
1907 gcc_assert (!chain->combined);
1908 FOR_EACH_VEC_ELT (chain->refs, i, a)
1909 if (DR_IS_WRITE (a->ref))
1910 n_writes++;
1911
1912 /* If there are no reads in the loop, there is nothing to do. */
1913 if (n_writes == chain->refs.length ())
1914 return;
1915
1916 initialize_root_vars_lm (loop, get_chain_root (chain), n_writes > 0,
1917 &vars, chain->inits, tmp_vars);
1918
1919 ridx = 0;
1920 FOR_EACH_VEC_ELT (chain->refs, i, a)
1921 {
1922 bool is_read = DR_IS_READ (a->ref);
1923
1924 if (DR_IS_WRITE (a->ref))
1925 {
1926 n_writes--;
1927 if (n_writes)
1928 {
1929 var = vars[0];
1930 var = make_ssa_name (SSA_NAME_VAR (var));
1931 vars[0] = var;
1932 }
1933 else
1934 ridx = 1;
1935 }
1936
1937 replace_ref_with (a->stmt, vars[ridx],
1938 !is_read, !is_read);
1939 }
1940 }
1941
1942 /* Returns the single statement in that NAME is used, excepting
1943 the looparound phi nodes contained in one of the chains. If there is no
1944 such statement, or more statements, NULL is returned. */
1945
1946 static gimple *
1947 single_nonlooparound_use (tree name)
1948 {
1949 use_operand_p use;
1950 imm_use_iterator it;
1951 gimple *stmt, *ret = NULL;
1952
1953 FOR_EACH_IMM_USE_FAST (use, it, name)
1954 {
1955 stmt = USE_STMT (use);
1956
1957 if (gimple_code (stmt) == GIMPLE_PHI)
1958 {
1959 /* Ignore uses in looparound phi nodes. Uses in other phi nodes
1960 could not be processed anyway, so just fail for them. */
1961 if (bitmap_bit_p (looparound_phis,
1962 SSA_NAME_VERSION (PHI_RESULT (stmt))))
1963 continue;
1964
1965 return NULL;
1966 }
1967 else if (is_gimple_debug (stmt))
1968 continue;
1969 else if (ret != NULL)
1970 return NULL;
1971 else
1972 ret = stmt;
1973 }
1974
1975 return ret;
1976 }
1977
/* Remove statement STMT, as well as the chain of assignments in which it is
   used.  */
1980
1981 static void
1982 remove_stmt (gimple *stmt)
1983 {
1984 tree name;
1985 gimple *next;
1986 gimple_stmt_iterator psi;
1987
1988 if (gimple_code (stmt) == GIMPLE_PHI)
1989 {
1990 name = PHI_RESULT (stmt);
1991 next = single_nonlooparound_use (name);
1992 reset_debug_uses (stmt);
1993 psi = gsi_for_stmt (stmt);
1994 remove_phi_node (&psi, true);
1995
1996 if (!next
1997 || !gimple_assign_ssa_name_copy_p (next)
1998 || gimple_assign_rhs1 (next) != name)
1999 return;
2000
2001 stmt = next;
2002 }
2003
2004 while (1)
2005 {
2006 gimple_stmt_iterator bsi;
2007
2008 bsi = gsi_for_stmt (stmt);
2009
2010 name = gimple_assign_lhs (stmt);
2011 if (TREE_CODE (name) == SSA_NAME)
2012 {
2013 next = single_nonlooparound_use (name);
2014 reset_debug_uses (stmt);
2015 }
2016 else
2017 {
2018 /* This is a store to be eliminated. */
2019 gcc_assert (gimple_vdef (stmt) != NULL);
2020 next = NULL;
2021 }
2022
2023 unlink_stmt_vdef (stmt);
2024 gsi_remove (&bsi, true);
2025 release_defs (stmt);
2026
2027 if (!next
2028 || !gimple_assign_ssa_name_copy_p (next)
2029 || gimple_assign_rhs1 (next) != name)
2030 return;
2031
2032 stmt = next;
2033 }
2034 }
2035
/* Perform the predictive commoning optimization for a chain CHAIN.
   Uids of the newly created temporary variables are marked in TMP_VARS.  */
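
/* A sketch for a load-load chain (an invented example; t0 and t1 stand for
   the temporary variables this transformation introduces): in

     for (i = 0; i < n; i++)
       y[i] = x[i] + x[i + 1];

   the value loaded from x[i + 1] can be reused as x[i] in the next
   iteration, giving roughly

     t0 = x[0];
     for (i = 0; i < n; i++)
       {
	 t1 = x[i + 1];
	 y[i] = t0 + t1;
	 t0 = t1;
       }

   where the copy t0 = t1 on the latch can later be removed by unrolling
   the loop and renaming the temporaries.  */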
2038
2039 static void
2040 execute_pred_commoning_chain (struct loop *loop, chain_p chain,
2041 bitmap tmp_vars)
2042 {
2043 unsigned i, n;
2044 dref a;
2045 tree var;
2046 bool in_lhs;
2047
2048 if (chain->combined)
2049 {
2050 /* For combined chains, just remove the statements that are used to
2051 compute the values of the expression (except for the root one).
2052 We delay this until after all chains are processed. */
2053 }
2054 else if (chain->type == CT_STORE_STORE)
2055 {
2056 if (chain->length > 0)
2057 {
2058 if (chain->inv_store_elimination)
2059 {
	      /* If the dead stores in this chain only store loop invariant
		 values, we can simply record the invariant value and use it
		 directly after the loop.  */
2063 initialize_root_vars_store_elim_1 (chain);
2064 }
2065 else
2066 {
	      /* If the dead stores in this chain store loop variant values,
		 we need to set up the variables by loading from memory
		 before the loop and propagating them with PHI nodes.  */
2070 initialize_root_vars_store_elim_2 (loop, chain, tmp_vars);
2071 }
2072
	  /* For an inter-iteration store elimination chain, stores at each
	     distance in the loop's last (chain->length - 1) iterations can't
	     be eliminated, because there is no following killing store.
	     We need to generate these stores after the loop.  */
2077 finalize_eliminated_stores (loop, chain);
2078 }
2079
      /* Eliminate the stores killed by a following store.  */
2081 n = chain->refs.length ();
2082 for (i = 0; i < n - 1; i++)
2083 remove_stmt (chain->refs[i]->stmt);
2084 }
2085 else
2086 {
      /* For a non-combined chain, set up the variables that hold its value.  */
2088 initialize_root_vars (loop, chain, tmp_vars);
2089 a = get_chain_root (chain);
2090 in_lhs = (chain->type == CT_STORE_LOAD
2091 || chain->type == CT_COMBINATION);
2092 replace_ref_with (a->stmt, chain->vars[chain->length], true, in_lhs);
2093
2094 /* Replace the uses of the original references by these variables. */
2095 for (i = 1; chain->refs.iterate (i, &a); i++)
2096 {
2097 var = chain->vars[chain->length - a->distance];
2098 replace_ref_with (a->stmt, var, false, false);
2099 }
2100 }
2101 }
2102
2103 /* Determines the unroll factor necessary to remove as many temporary variable
2104 copies as possible. CHAINS is the list of chains that will be
2105 optimized. */
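
/* For example (hypothetical chains), if one chain needs 2 temporary
   variables and another needs 3, the overall factor computed below is
   lcm (2, 3) = 2 * 3 / gcd (2, 3) = 6, as long as it does not exceed
   PARAM_MAX_UNROLL_TIMES; a chain whose factor would push the result past
   the limit is simply not merged in.  */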
2106
2107 static unsigned
2108 determine_unroll_factor (vec<chain_p> chains)
2109 {
2110 chain_p chain;
2111 unsigned factor = 1, af, nfactor, i;
2112 unsigned max = PARAM_VALUE (PARAM_MAX_UNROLL_TIMES);
2113
2114 FOR_EACH_VEC_ELT (chains, i, chain)
2115 {
2116 if (chain->type == CT_INVARIANT)
2117 continue;
2118 /* For now we can't handle unrolling when eliminating stores. */
2119 else if (chain->type == CT_STORE_STORE)
2120 return 1;
2121
2122 if (chain->combined)
2123 {
2124 /* For combined chains, we can't handle unrolling if we replace
2125 looparound PHIs. */
2126 dref a;
2127 unsigned j;
2128 for (j = 1; chain->refs.iterate (j, &a); j++)
2129 if (gimple_code (a->stmt) == GIMPLE_PHI)
2130 return 1;
2131 continue;
2132 }
2133
2134 /* The best unroll factor for this chain is equal to the number of
2135 temporary variables that we create for it. */
2136 af = chain->length;
2137 if (chain->has_max_use_after)
2138 af++;
2139
2140 nfactor = factor * af / gcd (factor, af);
2141 if (nfactor <= max)
2142 factor = nfactor;
2143 }
2144
2145 return factor;
2146 }
2147
2148 /* Perform the predictive commoning optimization for CHAINS.
2149 Uids of the newly created temporary variables are marked in TMP_VARS. */
2150
2151 static void
2152 execute_pred_commoning (struct loop *loop, vec<chain_p> chains,
2153 bitmap tmp_vars)
2154 {
2155 chain_p chain;
2156 unsigned i;
2157
2158 FOR_EACH_VEC_ELT (chains, i, chain)
2159 {
2160 if (chain->type == CT_INVARIANT)
2161 execute_load_motion (loop, chain, tmp_vars);
2162 else
2163 execute_pred_commoning_chain (loop, chain, tmp_vars);
2164 }
2165
2166 FOR_EACH_VEC_ELT (chains, i, chain)
2167 {
2168 if (chain->type == CT_INVARIANT)
2169 ;
2170 else if (chain->combined)
2171 {
2172 /* For combined chains, just remove the statements that are used to
2173 compute the values of the expression (except for the root one). */
2174 dref a;
2175 unsigned j;
2176 for (j = 1; chain->refs.iterate (j, &a); j++)
2177 remove_stmt (a->stmt);
2178 }
2179 }
2180
2181 update_ssa (TODO_update_ssa_only_virtuals);
2182 }
2183
/* For each reference in CHAINS, if its defining statement is a phi node,
   record the ssa name that is defined by it.  */
2186
2187 static void
2188 replace_phis_by_defined_names (vec<chain_p> chains)
2189 {
2190 chain_p chain;
2191 dref a;
2192 unsigned i, j;
2193
2194 FOR_EACH_VEC_ELT (chains, i, chain)
2195 FOR_EACH_VEC_ELT (chain->refs, j, a)
2196 {
2197 if (gimple_code (a->stmt) == GIMPLE_PHI)
2198 {
2199 a->name_defined_by_phi = PHI_RESULT (a->stmt);
2200 a->stmt = NULL;
2201 }
2202 }
2203 }
2204
2205 /* For each reference in CHAINS, if name_defined_by_phi is not
2206 NULL, use it to set the stmt field. */
2207
2208 static void
2209 replace_names_by_phis (vec<chain_p> chains)
2210 {
2211 chain_p chain;
2212 dref a;
2213 unsigned i, j;
2214
2215 FOR_EACH_VEC_ELT (chains, i, chain)
2216 FOR_EACH_VEC_ELT (chain->refs, j, a)
2217 if (a->stmt == NULL)
2218 {
2219 a->stmt = SSA_NAME_DEF_STMT (a->name_defined_by_phi);
2220 gcc_assert (gimple_code (a->stmt) == GIMPLE_PHI);
2221 a->name_defined_by_phi = NULL_TREE;
2222 }
2223 }
2224
2225 /* Wrapper over execute_pred_commoning, to pass it as a callback
2226 to tree_transform_and_unroll_loop. */
2227
2228 struct epcc_data
2229 {
2230 vec<chain_p> chains;
2231 bitmap tmp_vars;
2232 };
2233
2234 static void
2235 execute_pred_commoning_cbck (struct loop *loop, void *data)
2236 {
2237 struct epcc_data *const dta = (struct epcc_data *) data;
2238
2239 /* Restore phi nodes that were replaced by ssa names before
2240 tree_transform_and_unroll_loop (see detailed description in
2241 tree_predictive_commoning_loop). */
2242 replace_names_by_phis (dta->chains);
2243 execute_pred_commoning (loop, dta->chains, dta->tmp_vars);
2244 }
2245
2246 /* Base NAME and all the names in the chain of phi nodes that use it
2247 on variable VAR. The phi nodes are recognized by being in the copies of
2248 the header of the LOOP. */
2249
2250 static void
2251 base_names_in_chain_on (struct loop *loop, tree name, tree var)
2252 {
2253 gimple *stmt, *phi;
2254 imm_use_iterator iter;
2255
2256 replace_ssa_name_symbol (name, var);
2257
2258 while (1)
2259 {
2260 phi = NULL;
2261 FOR_EACH_IMM_USE_STMT (stmt, iter, name)
2262 {
2263 if (gimple_code (stmt) == GIMPLE_PHI
2264 && flow_bb_inside_loop_p (loop, gimple_bb (stmt)))
2265 {
2266 phi = stmt;
2267 BREAK_FROM_IMM_USE_STMT (iter);
2268 }
2269 }
2270 if (!phi)
2271 return;
2272
2273 name = PHI_RESULT (phi);
2274 replace_ssa_name_symbol (name, var);
2275 }
2276 }
2277
/* Given an unrolled LOOP after predictive commoning, remove the
   register copies arising from phi nodes by changing the base
   variables of SSA names.  TMP_VARS is the set of the temporary
   variables for which we want to perform this.  */
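
/* Continuing the earlier load-load sketch: after unrolling by a factor
   of 2, the copies on the latch can be removed by renaming, roughly

     t0 = x[0];
     for (i = 0; i < n; i += 2)
       {
	 t1 = x[i + 1];
	 y[i] = t0 + t1;
	 t0 = x[i + 2];
	 y[i + 1] = t1 + t0;
       }

   (an illustrative sketch; this function only rewrites the base variables
   of the ssa names so that out-of-ssa coalesces them, it does not itself
   restructure the loop).  */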
2282
2283 static void
2284 eliminate_temp_copies (struct loop *loop, bitmap tmp_vars)
2285 {
2286 edge e;
2287 gphi *phi;
2288 gimple *stmt;
2289 tree name, use, var;
2290 gphi_iterator psi;
2291
2292 e = loop_latch_edge (loop);
2293 for (psi = gsi_start_phis (loop->header); !gsi_end_p (psi); gsi_next (&psi))
2294 {
2295 phi = psi.phi ();
2296 name = PHI_RESULT (phi);
2297 var = SSA_NAME_VAR (name);
2298 if (!var || !bitmap_bit_p (tmp_vars, DECL_UID (var)))
2299 continue;
2300 use = PHI_ARG_DEF_FROM_EDGE (phi, e);
2301 gcc_assert (TREE_CODE (use) == SSA_NAME);
2302
2303 /* Base all the ssa names in the ud and du chain of NAME on VAR. */
2304 stmt = SSA_NAME_DEF_STMT (use);
2305 while (gimple_code (stmt) == GIMPLE_PHI
	     /* In case we could not unroll the loop enough to eliminate
		all copies, we may reach the loop header before the defining
		statement (in that case, some register copies will be present
		in the loop latch in the final code, corresponding to the
		newly created looparound phi nodes).  */
2311 && gimple_bb (stmt) != loop->header)
2312 {
2313 gcc_assert (single_pred_p (gimple_bb (stmt)));
2314 use = PHI_ARG_DEF (stmt, 0);
2315 stmt = SSA_NAME_DEF_STMT (use);
2316 }
2317
2318 base_names_in_chain_on (loop, use, var);
2319 }
2320 }
2321
2322 /* Returns true if CHAIN is suitable to be combined. */
2323
2324 static bool
2325 chain_can_be_combined_p (chain_p chain)
2326 {
2327 return (!chain->combined
2328 && (chain->type == CT_LOAD || chain->type == CT_COMBINATION));
2329 }
2330
/* Returns the modify statement that uses NAME.  Skips over assignment
   statements; NAME is replaced with the actual name used in the returned
   statement.  */
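
/* E.g. (an invented snippet), starting from a_1 in

     b_1 = a_1;
     c_1 = b_1 + d_1;

   this skips the copy, sets *NAME to b_1 and returns the statement
   computing c_1.  */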
2334
2335 static gimple *
2336 find_use_stmt (tree *name)
2337 {
2338 gimple *stmt;
2339 tree rhs, lhs;
2340
2341 /* Skip over assignments. */
2342 while (1)
2343 {
2344 stmt = single_nonlooparound_use (*name);
2345 if (!stmt)
2346 return NULL;
2347
2348 if (gimple_code (stmt) != GIMPLE_ASSIGN)
2349 return NULL;
2350
2351 lhs = gimple_assign_lhs (stmt);
2352 if (TREE_CODE (lhs) != SSA_NAME)
2353 return NULL;
2354
2355 if (gimple_assign_copy_p (stmt))
2356 {
2357 rhs = gimple_assign_rhs1 (stmt);
2358 if (rhs != *name)
2359 return NULL;
2360
2361 *name = lhs;
2362 }
2363 else if (get_gimple_rhs_class (gimple_assign_rhs_code (stmt))
2364 == GIMPLE_BINARY_RHS)
2365 return stmt;
2366 else
2367 return NULL;
2368 }
2369 }
2370
2371 /* Returns true if we may perform reassociation for operation CODE in TYPE. */
2372
2373 static bool
2374 may_reassociate_p (tree type, enum tree_code code)
2375 {
2376 if (FLOAT_TYPE_P (type)
2377 && !flag_unsafe_math_optimizations)
2378 return false;
2379
2380 return (commutative_tree_code (code)
2381 && associative_tree_code (code));
2382 }
2383
/* If the operation used in STMT is associative and commutative, go through
   the tree of the same operations and return its root.  The distance to the
   root is stored in DISTANCE.  */
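
/* E.g. (an invented snippet, assuming s_3 is not itself an operand of a
   further addition), for

     s_1 = a_1 + b_1;
     s_2 = s_1 + c_1;
     s_3 = s_2 + d_1;

   starting from the first statement this returns the statement computing
   s_3, with *DISTANCE set to 2.  */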
2387
2388 static gimple *
2389 find_associative_operation_root (gimple *stmt, unsigned *distance)
2390 {
2391 tree lhs;
2392 gimple *next;
2393 enum tree_code code = gimple_assign_rhs_code (stmt);
2394 tree type = TREE_TYPE (gimple_assign_lhs (stmt));
2395 unsigned dist = 0;
2396
2397 if (!may_reassociate_p (type, code))
2398 return NULL;
2399
2400 while (1)
2401 {
2402 lhs = gimple_assign_lhs (stmt);
2403 gcc_assert (TREE_CODE (lhs) == SSA_NAME);
2404
2405 next = find_use_stmt (&lhs);
2406 if (!next
2407 || gimple_assign_rhs_code (next) != code)
2408 break;
2409
2410 stmt = next;
2411 dist++;
2412 }
2413
2414 if (distance)
2415 *distance = dist;
2416 return stmt;
2417 }
2418
/* Returns the common statement in which NAME1 and NAME2 have a use.  If
   there is no such statement, returns NULL.  In case the operation used on
   NAME1 and NAME2 is associative and commutative, returns the root of the
   tree formed by this operation instead of the statement that uses NAME1 or
   NAME2.  */
2424
2425 static gimple *
2426 find_common_use_stmt (tree *name1, tree *name2)
2427 {
2428 gimple *stmt1, *stmt2;
2429
2430 stmt1 = find_use_stmt (name1);
2431 if (!stmt1)
2432 return NULL;
2433
2434 stmt2 = find_use_stmt (name2);
2435 if (!stmt2)
2436 return NULL;
2437
2438 if (stmt1 == stmt2)
2439 return stmt1;
2440
2441 stmt1 = find_associative_operation_root (stmt1, NULL);
2442 if (!stmt1)
2443 return NULL;
2444 stmt2 = find_associative_operation_root (stmt2, NULL);
2445 if (!stmt2)
2446 return NULL;
2447
2448 return (stmt1 == stmt2 ? stmt1 : NULL);
2449 }
2450
2451 /* Checks whether R1 and R2 are combined together using CODE, with the result
2452 in RSLT_TYPE, in order R1 CODE R2 if SWAP is false and in order R2 CODE R1
2453 if it is true. If CODE is ERROR_MARK, set these values instead. */
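
/* E.g. (a hypothetical pair of references), if R1 is x[i], R2 is y[i],
   and both are used only in

     z[i] = y[i] - x[i];

   then with *CODE == ERROR_MARK on entry this returns true and sets *CODE
   to MINUS_EXPR, *SWAP to true (the references are combined in the order
   R2 CODE R1, since MINUS_EXPR is not commutative) and *RSLT_TYPE to the
   type of z[i].  */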
2454
2455 static bool
2456 combinable_refs_p (dref r1, dref r2,
2457 enum tree_code *code, bool *swap, tree *rslt_type)
2458 {
2459 enum tree_code acode;
2460 bool aswap;
2461 tree atype;
2462 tree name1, name2;
2463 gimple *stmt;
2464
2465 name1 = name_for_ref (r1);
2466 name2 = name_for_ref (r2);
2467 gcc_assert (name1 != NULL_TREE && name2 != NULL_TREE);
2468
2469 stmt = find_common_use_stmt (&name1, &name2);
2470
2471 if (!stmt
2472 /* A simple post-dominance check - make sure the combination
2473 is executed under the same condition as the references. */
2474 || (gimple_bb (stmt) != gimple_bb (r1->stmt)
2475 && gimple_bb (stmt) != gimple_bb (r2->stmt)))
2476 return false;
2477
2478 acode = gimple_assign_rhs_code (stmt);
2479 aswap = (!commutative_tree_code (acode)
2480 && gimple_assign_rhs1 (stmt) != name1);
2481 atype = TREE_TYPE (gimple_assign_lhs (stmt));
2482
2483 if (*code == ERROR_MARK)
2484 {
2485 *code = acode;
2486 *swap = aswap;
2487 *rslt_type = atype;
2488 return true;
2489 }
2490
2491 return (*code == acode
2492 && *swap == aswap
2493 && *rslt_type == atype);
2494 }
2495
/* Remove OP from the operation on the rhs of STMT, and replace STMT with
   an assignment of the remaining operand.  */
2498
2499 static void
2500 remove_name_from_operation (gimple *stmt, tree op)
2501 {
2502 tree other_op;
2503 gimple_stmt_iterator si;
2504
2505 gcc_assert (is_gimple_assign (stmt));
2506
2507 if (gimple_assign_rhs1 (stmt) == op)
2508 other_op = gimple_assign_rhs2 (stmt);
2509 else
2510 other_op = gimple_assign_rhs1 (stmt);
2511
2512 si = gsi_for_stmt (stmt);
2513 gimple_assign_set_rhs_from_tree (&si, other_op);
2514
2515 /* We should not have reallocated STMT. */
2516 gcc_assert (gsi_stmt (si) == stmt);
2517
2518 update_stmt (stmt);
2519 }
2520
/* Reassociates the expression in which NAME1 and NAME2 are used so that
   they are combined in a single statement, and returns this statement.
   Note the statement is inserted before INSERT_BEFORE if it's not NULL.  */
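
/* E.g. (an invented snippet; t_1 and w_1 stand for the "predreastmp"
   temporaries created below), if NAME1 and NAME2 are used as

     a_1 = NAME1 + x_1;
     b_1 = a_1 + NAME2;

   the expression is rewritten along the lines of

     a_1 = x_1;
     w_1 = NAME1 + NAME2;
     t_1 = a_1;
     b_1 = w_1 + t_1;

   and the statement computing w_1, in which NAME1 and NAME2 are combined,
   is returned.  */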
2524
2525 static gimple *
2526 reassociate_to_the_same_stmt (tree name1, tree name2, gimple *insert_before)
2527 {
2528 gimple *stmt1, *stmt2, *root1, *root2, *s1, *s2;
2529 gassign *new_stmt, *tmp_stmt;
2530 tree new_name, tmp_name, var, r1, r2;
2531 unsigned dist1, dist2;
2532 enum tree_code code;
2533 tree type = TREE_TYPE (name1);
2534 gimple_stmt_iterator bsi;
2535
2536 stmt1 = find_use_stmt (&name1);
2537 stmt2 = find_use_stmt (&name2);
2538 root1 = find_associative_operation_root (stmt1, &dist1);
2539 root2 = find_associative_operation_root (stmt2, &dist2);
2540 code = gimple_assign_rhs_code (stmt1);
2541
2542 gcc_assert (root1 && root2 && root1 == root2
2543 && code == gimple_assign_rhs_code (stmt2));
2544
  /* Find the root of the nearest expression in which both NAME1 and NAME2
     are used.  */
2547 r1 = name1;
2548 s1 = stmt1;
2549 r2 = name2;
2550 s2 = stmt2;
2551
2552 while (dist1 > dist2)
2553 {
2554 s1 = find_use_stmt (&r1);
2555 r1 = gimple_assign_lhs (s1);
2556 dist1--;
2557 }
2558 while (dist2 > dist1)
2559 {
2560 s2 = find_use_stmt (&r2);
2561 r2 = gimple_assign_lhs (s2);
2562 dist2--;
2563 }
2564
2565 while (s1 != s2)
2566 {
2567 s1 = find_use_stmt (&r1);
2568 r1 = gimple_assign_lhs (s1);
2569 s2 = find_use_stmt (&r2);
2570 r2 = gimple_assign_lhs (s2);
2571 }
2572
  /* Remove NAME1 and NAME2 from the statements in which they are currently
     used.  */
2575 remove_name_from_operation (stmt1, name1);
2576 remove_name_from_operation (stmt2, name2);
2577
2578 /* Insert the new statement combining NAME1 and NAME2 before S1, and
2579 combine it with the rhs of S1. */
2580 var = create_tmp_reg (type, "predreastmp");
2581 new_name = make_ssa_name (var);
2582 new_stmt = gimple_build_assign (new_name, code, name1, name2);
2583 if (insert_before && stmt_dominates_stmt_p (insert_before, s1))
2584 bsi = gsi_for_stmt (insert_before);
2585 else
2586 bsi = gsi_for_stmt (s1);
2587
2588 gsi_insert_before (&bsi, new_stmt, GSI_SAME_STMT);
2589
2590 var = create_tmp_reg (type, "predreastmp");
2591 tmp_name = make_ssa_name (var);
2592
  /* The rhs of S1 may now be either a binary expression with operation
     CODE, or a gimple_val (in case stmt1 == s1 or stmt2 == s1, so that
     name1 or name2 was removed from it).  */
2596 tmp_stmt = gimple_build_assign (tmp_name, gimple_assign_rhs_code (s1),
2597 gimple_assign_rhs1 (s1),
2598 gimple_assign_rhs2 (s1));
2599
2600 bsi = gsi_for_stmt (s1);
2601 gimple_assign_set_rhs_with_ops (&bsi, code, new_name, tmp_name);
2602 s1 = gsi_stmt (bsi);
2603 update_stmt (s1);
2604
2605 gsi_insert_before (&bsi, tmp_stmt, GSI_SAME_STMT);
2606
2607 return new_stmt;
2608 }
2609
2610 /* Returns the statement that combines references R1 and R2. In case R1
2611 and R2 are not used in the same statement, but they are used with an
2612 associative and commutative operation in the same expression, reassociate
2613 the expression so that they are used in the same statement. The combined
2614 statement is inserted before INSERT_BEFORE if it's not NULL. */
2615
2616 static gimple *
2617 stmt_combining_refs (dref r1, dref r2, gimple *insert_before)
2618 {
2619 gimple *stmt1, *stmt2;
2620 tree name1 = name_for_ref (r1);
2621 tree name2 = name_for_ref (r2);
2622
2623 stmt1 = find_use_stmt (&name1);
2624 stmt2 = find_use_stmt (&name2);
2625 if (stmt1 == stmt2)
2626 return stmt1;
2627
2628 return reassociate_to_the_same_stmt (name1, name2, insert_before);
2629 }
2630
2631 /* Tries to combine chains CH1 and CH2 together. If this succeeds, the
2632 description of the new chain is returned, otherwise we return NULL. */
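
/* E.g. (an invented example), in

     for (i = 0; i < n; i++)
       z[i] = (x[i] + y[i]) + (x[i + 1] + y[i + 1]);

   the chains for x and y can be combined into a single CT_COMBINATION
   chain whose elements are the sums x[j] + y[j]; the sum computed for
   x[i + 1] + y[i + 1] is then reused as the value of x[i] + y[i] in the
   next iteration.  */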
2633
2634 static chain_p
2635 combine_chains (chain_p ch1, chain_p ch2)
2636 {
2637 dref r1, r2, nw;
2638 enum tree_code op = ERROR_MARK;
2639 bool swap = false;
2640 chain_p new_chain;
2641 int i, j, num;
2642 gimple *root_stmt;
2643 tree rslt_type = NULL_TREE;
2644
2645 if (ch1 == ch2)
2646 return NULL;
2647 if (ch1->length != ch2->length)
2648 return NULL;
2649
2650 if (ch1->refs.length () != ch2->refs.length ())
2651 return NULL;
2652
2653 for (i = 0; (ch1->refs.iterate (i, &r1)
2654 && ch2->refs.iterate (i, &r2)); i++)
2655 {
2656 if (r1->distance != r2->distance)
2657 return NULL;
2658
2659 if (!combinable_refs_p (r1, r2, &op, &swap, &rslt_type))
2660 return NULL;
2661 }
2662
2663 ch1->combined = true;
2664 ch2->combined = true;
2665
2666 if (swap)
2667 std::swap (ch1, ch2);
2668
2669 new_chain = XCNEW (struct chain);
2670 new_chain->type = CT_COMBINATION;
2671 new_chain->op = op;
2672 new_chain->ch1 = ch1;
2673 new_chain->ch2 = ch2;
2674 new_chain->rslt_type = rslt_type;
2675 new_chain->length = ch1->length;
2676
2677 gimple *insert = NULL;
2678 num = ch1->refs.length ();
2679 i = (new_chain->length == 0) ? num - 1 : 0;
2680 j = (new_chain->length == 0) ? -1 : 1;
  /* For a ZERO length chain, process refs in reverse order so that the
     dominant position is ready when it comes to the root ref.
     For a chain of nonzero length, process refs in order.  See PR79663.  */
2684 for (; num > 0; num--, i += j)
2685 {
2686 r1 = ch1->refs[i];
2687 r2 = ch2->refs[i];
2688 nw = XCNEW (struct dref_d);
2689 nw->distance = r1->distance;
2690
      /* For a ZERO length chain, insert the combined stmt of the root ref
	 at the dominant position.  */
2693 nw->stmt = stmt_combining_refs (r1, r2, i == 0 ? insert : NULL);
      /* For a ZERO length chain, record the dominant position where the
	 combined stmt of the root ref should be inserted.  In this case,
	 though all root refs dominate the following ones, it's possible
	 that the combined stmt doesn't.  See PR70754.  */
2698 if (new_chain->length == 0
2699 && (insert == NULL || stmt_dominates_stmt_p (nw->stmt, insert)))
2700 insert = nw->stmt;
2701
2702 new_chain->refs.safe_push (nw);
2703 }
2704 if (new_chain->length == 0)
2705 {
      /* Restore the order of the ZERO length chain's refs.  */
2707 num = new_chain->refs.length () >> 1;
2708 for (i = 0, j = new_chain->refs.length () - 1; i < num; i++, j--)
2709 std::swap (new_chain->refs[i], new_chain->refs[j]);
2710
      /* For a ZERO length chain, has_max_use_after must be true since the
	 root combined stmt must dominate the others.  */
2713 new_chain->has_max_use_after = true;
2714 return new_chain;
2715 }
2716
2717 new_chain->has_max_use_after = false;
2718 root_stmt = get_chain_root (new_chain)->stmt;
2719 for (i = 1; new_chain->refs.iterate (i, &nw); i++)
2720 {
2721 if (nw->distance == new_chain->length
2722 && !stmt_dominates_stmt_p (nw->stmt, root_stmt))
2723 {
2724 new_chain->has_max_use_after = true;
2725 break;
2726 }
2727 }
2728
2729 return new_chain;
2730 }
2731
2732 /* Try to combine the CHAINS. */
2733
2734 static void
2735 try_combine_chains (vec<chain_p> *chains)
2736 {
2737 unsigned i, j;
2738 chain_p ch1, ch2, cch;
2739 auto_vec<chain_p> worklist;
2740
2741 FOR_EACH_VEC_ELT (*chains, i, ch1)
2742 if (chain_can_be_combined_p (ch1))
2743 worklist.safe_push (ch1);
2744
2745 while (!worklist.is_empty ())
2746 {
2747 ch1 = worklist.pop ();
2748 if (!chain_can_be_combined_p (ch1))
2749 continue;
2750
2751 FOR_EACH_VEC_ELT (*chains, j, ch2)
2752 {
2753 if (!chain_can_be_combined_p (ch2))
2754 continue;
2755
2756 cch = combine_chains (ch1, ch2);
2757 if (cch)
2758 {
2759 worklist.safe_push (cch);
2760 chains->safe_push (cch);
2761 break;
2762 }
2763 }
2764 }
2765 }
2766
2767 /* Prepare initializers for store elimination CHAIN in LOOP. Returns false
2768 if this is impossible because one of these initializers may trap, true
2769 otherwise. */
2770
2771 static bool
2772 prepare_initializers_chain_store_elim (struct loop *loop, chain_p chain)
2773 {
2774 unsigned i, n = chain->length;
2775
  /* For now we can't eliminate stores if some of them are conditionally
     executed.  */
2778 if (!chain->all_always_accessed)
2779 return false;
2780
  /* Nothing to initialize for intra-iteration store elimination.  */
2782 if (n == 0 && chain->type == CT_STORE_STORE)
2783 return true;
2784
2785 /* For store elimination chain, there is nothing to initialize if stores
2786 to be eliminated only store loop invariant values into memory. */
2787 if (chain->type == CT_STORE_STORE
2788 && is_inv_store_elimination_chain (loop, chain))
2789 {
2790 chain->inv_store_elimination = true;
2791 return true;
2792 }
2793
2794 chain->inits.create (n);
2795 chain->inits.safe_grow_cleared (n);
2796
  /* For a store elimination chain like the one below:

     for (i = 0; i < len; i++)
       {
	 a[i] = 1;
	 // a[i + 1] = ...
	 a[i + 2] = 3;
       }

     the store to a[i + 1] is missing from the loop body; it acts like a
     bubble.  The content of a[i + 1] remains the same if the loop iterates
     fewer times than chain->length.  We need to set up root variables for
     such stores by loading from memory before the loop.  Note we only need
     to load the bubble elements, because the loop body is guaranteed to be
     executed at least once after the loop's preheader edge.  */
2812 auto_vec<bool> bubbles;
2813 bubbles.safe_grow_cleared (n + 1);
2814 for (i = 0; i < chain->refs.length (); i++)
2815 bubbles[chain->refs[i]->distance] = true;
2816
2817 struct data_reference *dr = get_chain_root (chain)->ref;
2818 for (i = 0; i < n; i++)
2819 {
2820 if (bubbles[i])
2821 continue;
2822
2823 gimple_seq stmts = NULL;
2824
2825 tree init = ref_at_iteration (dr, (int) 0 - i, &stmts);
2826 if (stmts)
2827 gimple_seq_add_seq_without_update (&chain->init_seq, stmts);
2828
2829 chain->inits[i] = init;
2830 }
2831
2832 return true;
2833 }
2834
2835 /* Prepare initializers for CHAIN in LOOP. Returns false if this is
2836 impossible because one of these initializers may trap, true otherwise. */
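
/* E.g. (a sketch), for the load-load chain over x[i] and x[i + 1] of
   length 1 from the earlier example, the single initializer is the value
   of the root reference one iteration before the loop, i.e. a load of
   x[0] emitted on the preheader edge.  */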
2837
2838 static bool
2839 prepare_initializers_chain (struct loop *loop, chain_p chain)
2840 {
2841 unsigned i, n = (chain->type == CT_INVARIANT) ? 1 : chain->length;
2842 struct data_reference *dr = get_chain_root (chain)->ref;
2843 tree init;
2844 dref laref;
2845 edge entry = loop_preheader_edge (loop);
2846
2847 if (chain->type == CT_STORE_STORE)
2848 return prepare_initializers_chain_store_elim (loop, chain);
2849
2850 /* Find the initializers for the variables, and check that they cannot
2851 trap. */
2852 chain->inits.create (n);
2853 for (i = 0; i < n; i++)
2854 chain->inits.quick_push (NULL_TREE);
2855
2856 /* If we have replaced some looparound phi nodes, use their initializers
2857 instead of creating our own. */
2858 FOR_EACH_VEC_ELT (chain->refs, i, laref)
2859 {
2860 if (gimple_code (laref->stmt) != GIMPLE_PHI)
2861 continue;
2862
2863 gcc_assert (laref->distance > 0);
2864 chain->inits[n - laref->distance]
2865 = PHI_ARG_DEF_FROM_EDGE (laref->stmt, entry);
2866 }
2867
2868 for (i = 0; i < n; i++)
2869 {
2870 gimple_seq stmts = NULL;
2871
2872 if (chain->inits[i] != NULL_TREE)
2873 continue;
2874
2875 init = ref_at_iteration (dr, (int) i - n, &stmts);
2876 if (!chain->all_always_accessed && tree_could_trap_p (init))
2877 {
2878 gimple_seq_discard (stmts);
2879 return false;
2880 }
2881
2882 if (stmts)
2883 gimple_seq_add_seq_without_update (&chain->init_seq, stmts);
2884
2885 chain->inits[i] = init;
2886 }
2887
2888 return true;
2889 }
2890
2891 /* Prepare initializers for CHAINS in LOOP, and free chains that cannot
2892 be used because the initializers might trap. */
2893
2894 static void
2895 prepare_initializers (struct loop *loop, vec<chain_p> chains)
2896 {
2897 chain_p chain;
2898 unsigned i;
2899
2900 for (i = 0; i < chains.length (); )
2901 {
2902 chain = chains[i];
2903 if (prepare_initializers_chain (loop, chain))
2904 i++;
2905 else
2906 {
2907 release_chain (chain);
2908 chains.unordered_remove (i);
2909 }
2910 }
2911 }
2912
2913 /* Generates finalizer memory references for CHAIN in LOOP. Returns true
2914 if finalizer code for CHAIN can be generated, otherwise false. */
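
/* E.g. (a sketch, reusing the bubble example above), for a store-store
   chain of length 2 over a[i] and a[i + 2], the finalizers are the
   references the eliminated stores would have written in the loop's last
   iterations, expressed in terms of NITERS and emitted after the loop.  */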
2915
2916 static bool
2917 prepare_finalizers_chain (struct loop *loop, chain_p chain)
2918 {
2919 unsigned i, n = chain->length;
2920 struct data_reference *dr = get_chain_root (chain)->ref;
2921 tree fini, niters = number_of_latch_executions (loop);
2922
  /* For now we can't eliminate stores if some of them are conditionally
     executed.  */
2925 if (!chain->all_always_accessed)
2926 return false;
2927
2928 chain->finis.create (n);
2929 for (i = 0; i < n; i++)
2930 chain->finis.quick_push (NULL_TREE);
2931
  /* We never use looparound phi nodes for store elimination chains.  */
2933
  /* Find the finalizers for the variables; they cannot trap, since the
     chain's references are always accessed.  */
2936 for (i = 0; i < n; i++)
2937 {
2938 gimple_seq stmts = NULL;
2939 gcc_assert (chain->finis[i] == NULL_TREE);
2940
2941 if (TREE_CODE (niters) != INTEGER_CST && TREE_CODE (niters) != SSA_NAME)
2942 {
2943 niters = unshare_expr (niters);
2944 niters = force_gimple_operand (niters, &stmts, true, NULL);
2945 if (stmts)
2946 {
2947 gimple_seq_add_seq_without_update (&chain->fini_seq, stmts);
2948 stmts = NULL;
2949 }
2950 }
2951 fini = ref_at_iteration (dr, (int) 0 - i, &stmts, niters);
2952 if (stmts)
2953 gimple_seq_add_seq_without_update (&chain->fini_seq, stmts);
2954
2955 chain->finis[i] = fini;
2956 }
2957
2958 return true;
2959 }
2960
/* Generates finalizer memory references for CHAINS in LOOP.  Returns true
   if finalizer code generation for CHAINS breaks loop closed ssa form.  */
2963
2964 static bool
2965 prepare_finalizers (struct loop *loop, vec<chain_p> chains)
2966 {
2967 chain_p chain;
2968 unsigned i;
2969 bool loop_closed_ssa = false;
2970
2971 for (i = 0; i < chains.length ();)
2972 {
2973 chain = chains[i];
2974
2975 /* Finalizer is only necessary for inter-iteration store elimination
2976 chains. */
2977 if (chain->length == 0 || chain->type != CT_STORE_STORE)
2978 {
2979 i++;
2980 continue;
2981 }
2982
2983 if (prepare_finalizers_chain (loop, chain))
2984 {
2985 i++;
	  /* Be conservative and assume loop closed ssa form is corrupted
	     by the store-store chain, though this is not always the case
	     if the eliminated stores only store loop invariant values
	     into memory.  */
2990 loop_closed_ssa = true;
2991 }
2992 else
2993 {
2994 release_chain (chain);
2995 chains.unordered_remove (i);
2996 }
2997 }
2998 return loop_closed_ssa;
2999 }
3000
/* Insert all initializing gimple stmts into the loop's entry edge.  */
3002
3003 static void
3004 insert_init_seqs (struct loop *loop, vec<chain_p> chains)
3005 {
3006 unsigned i;
3007 edge entry = loop_preheader_edge (loop);
3008
3009 for (i = 0; i < chains.length (); ++i)
3010 if (chains[i]->init_seq)
3011 {
3012 gsi_insert_seq_on_edge_immediate (entry, chains[i]->init_seq);
3013 chains[i]->init_seq = NULL;
3014 }
3015 }
3016
/* Performs predictive commoning for LOOP.  Sets bit 1<<0 of the return value
   if LOOP was unrolled; sets bit 1<<1 of the return value if loop closed ssa
   form was corrupted.  */
3020
3021 static unsigned
3022 tree_predictive_commoning_loop (struct loop *loop)
3023 {
3024 vec<data_reference_p> datarefs;
3025 vec<ddr_p> dependences;
3026 struct component *components;
3027 vec<chain_p> chains = vNULL;
3028 unsigned unroll_factor;
3029 struct tree_niter_desc desc;
3030 bool unroll = false, loop_closed_ssa = false;
3031 edge exit;
3032
3033 if (dump_file && (dump_flags & TDF_DETAILS))
3034 fprintf (dump_file, "Processing loop %d\n", loop->num);
3035
  /* Nothing for predictive commoning if the loop iterates only once.  */
3037 if (get_max_loop_iterations_int (loop) == 0)
3038 {
3039 if (dump_file && (dump_flags & TDF_DETAILS))
3040 fprintf (dump_file, "Loop iterates only 1 time, nothing to do.\n");
3041
3042 return 0;
3043 }
3044
3045 /* Find the data references and split them into components according to their
3046 dependence relations. */
3047 auto_vec<loop_p, 3> loop_nest;
3048 dependences.create (10);
3049 datarefs.create (10);
3050 if (! compute_data_dependences_for_loop (loop, true, &loop_nest, &datarefs,
3051 &dependences))
3052 {
3053 if (dump_file && (dump_flags & TDF_DETAILS))
3054 fprintf (dump_file, "Cannot analyze data dependencies\n");
3055 free_data_refs (datarefs);
3056 free_dependence_relations (dependences);
3057 return 0;
3058 }
3059
3060 if (dump_file && (dump_flags & TDF_DETAILS))
3061 dump_data_dependence_relations (dump_file, dependences);
3062
3063 components = split_data_refs_to_components (loop, datarefs, dependences);
3064 loop_nest.release ();
3065 free_dependence_relations (dependences);
3066 if (!components)
3067 {
3068 free_data_refs (datarefs);
3069 free_affine_expand_cache (&name_expansions);
3070 return 0;
3071 }
3072
3073 if (dump_file && (dump_flags & TDF_DETAILS))
3074 {
3075 fprintf (dump_file, "Initial state:\n\n");
3076 dump_components (dump_file, components);
3077 }
3078
3079 /* Find the suitable components and split them into chains. */
3080 components = filter_suitable_components (loop, components);
3081
3082 auto_bitmap tmp_vars;
3083 looparound_phis = BITMAP_ALLOC (NULL);
3084 determine_roots (loop, components, &chains);
3085 release_components (components);
3086
3087 if (!chains.exists ())
3088 {
3089 if (dump_file && (dump_flags & TDF_DETAILS))
3090 fprintf (dump_file,
3091 "Predictive commoning failed: no suitable chains\n");
3092 goto end;
3093 }
3094 prepare_initializers (loop, chains);
3095 loop_closed_ssa = prepare_finalizers (loop, chains);
3096
  /* Try to combine the chains that are always accessed together.  */
3098 try_combine_chains (&chains);
3099
3100 insert_init_seqs (loop, chains);
3101
3102 if (dump_file && (dump_flags & TDF_DETAILS))
3103 {
3104 fprintf (dump_file, "Before commoning:\n\n");
3105 dump_chains (dump_file, chains);
3106 }
3107
3108 /* Determine the unroll factor, and if the loop should be unrolled, ensure
3109 that its number of iterations is divisible by the factor. */
3110 unroll_factor = determine_unroll_factor (chains);
3111 scev_reset ();
3112 unroll = (unroll_factor > 1
3113 && can_unroll_loop_p (loop, unroll_factor, &desc));
3114 exit = single_dom_exit (loop);
3115
3116 /* Execute the predictive commoning transformations, and possibly unroll the
3117 loop. */
3118 if (unroll)
3119 {
3120 struct epcc_data dta;
3121
3122 if (dump_file && (dump_flags & TDF_DETAILS))
3123 fprintf (dump_file, "Unrolling %u times.\n", unroll_factor);
3124
3125 dta.chains = chains;
3126 dta.tmp_vars = tmp_vars;
3127
3128 update_ssa (TODO_update_ssa_only_virtuals);
3129
3130 /* Cfg manipulations performed in tree_transform_and_unroll_loop before
3131 execute_pred_commoning_cbck is called may cause phi nodes to be
3132 reallocated, which is a problem since CHAINS may point to these
3133 statements. To fix this, we store the ssa names defined by the
3134 phi nodes here instead of the phi nodes themselves, and restore
3135 the phi nodes in execute_pred_commoning_cbck. A bit hacky. */
3136 replace_phis_by_defined_names (chains);
3137
3138 tree_transform_and_unroll_loop (loop, unroll_factor, exit, &desc,
3139 execute_pred_commoning_cbck, &dta);
3140 eliminate_temp_copies (loop, tmp_vars);
3141 }
3142 else
3143 {
3144 if (dump_file && (dump_flags & TDF_DETAILS))
3145 fprintf (dump_file,
3146 "Executing predictive commoning without unrolling.\n");
3147 execute_pred_commoning (loop, chains, tmp_vars);
3148 }
3149
3150 end: ;
3151 release_chains (chains);
3152 free_data_refs (datarefs);
3153 BITMAP_FREE (looparound_phis);
3154
3155 free_affine_expand_cache (&name_expansions);
3156
3157 return (unroll ? 1 : 0) | (loop_closed_ssa ? 2 : 0);
3158 }
3159
3160 /* Runs predictive commoning. */
3161
3162 unsigned
3163 tree_predictive_commoning (void)
3164 {
3165 struct loop *loop;
3166 unsigned ret = 0, changed = 0;
3167
3168 initialize_original_copy_tables ();
3169 FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
3170 if (optimize_loop_for_speed_p (loop))
3171 {
3172 changed |= tree_predictive_commoning_loop (loop);
3173 }
3174 free_original_copy_tables ();
3175
3176 if (changed > 0)
3177 {
3178 scev_reset ();
3179
3180 if (changed > 1)
3181 rewrite_into_loop_closed_ssa (NULL, TODO_update_ssa);
3182
3183 ret = TODO_cleanup_cfg;
3184 }
3185
3186 return ret;
3187 }
3188
/* Predictive commoning pass.  */
3190
3191 static unsigned
3192 run_tree_predictive_commoning (struct function *fun)
3193 {
3194 if (number_of_loops (fun) <= 1)
3195 return 0;
3196
3197 return tree_predictive_commoning ();
3198 }
3199
3200 namespace {
3201
3202 const pass_data pass_data_predcom =
3203 {
3204 GIMPLE_PASS, /* type */
3205 "pcom", /* name */
3206 OPTGROUP_LOOP, /* optinfo_flags */
3207 TV_PREDCOM, /* tv_id */
3208 PROP_cfg, /* properties_required */
3209 0, /* properties_provided */
3210 0, /* properties_destroyed */
3211 0, /* todo_flags_start */
3212 TODO_update_ssa_only_virtuals, /* todo_flags_finish */
3213 };
3214
3215 class pass_predcom : public gimple_opt_pass
3216 {
3217 public:
3218 pass_predcom (gcc::context *ctxt)
3219 : gimple_opt_pass (pass_data_predcom, ctxt)
3220 {}
3221
3222 /* opt_pass methods: */
3223 virtual bool gate (function *) { return flag_predictive_commoning != 0; }
3224 virtual unsigned int execute (function *fun)
3225 {
3226 return run_tree_predictive_commoning (fun);
3227 }
3228
3229 }; // class pass_predcom
3230
3231 } // anon namespace
3232
3233 gimple_opt_pass *
3234 make_pass_predcom (gcc::context *ctxt)
3235 {
3236 return new pass_predcom (ctxt);
3237 }
3238
3239