* tree-ssa-loop-ivcanon.c: New file.
* tree-ssa-loop-manip.c (create_iv): New function.
* Makefile.in (tree-ssa-loop-ivcanon.o): Add.
(tree-ssa-loop.o, tree-ssa-loop-manip.o): Add SCEV_H dependency.
* cfgloop.c (mark_single_exit_loops): New function.
(verify_loop_structure): Verify single-exit loops.
* cfgloop.h (struct loop): Add single_exit field.
(LOOPS_HAVE_MARKED_SINGLE_EXITS): New constant.
(mark_single_exit_loops): Declare.
(tree_num_loop_insns): Declare.
* cfgloopmanip.c (update_single_exits_after_duplication): New function.
(duplicate_loop_to_header_edge): Use it.
* common.opt (fivcanon): New flag.
* timevar.def (TV_TREE_LOOP_IVCANON, TV_COMPLETE_UNROLL): New timevars.
* tree-cfg.c (tree_find_edge_insert_loc): Return newly created block.
(bsi_commit_edge_inserts_1): Pass null to tree_find_edge_insert_loc.
(bsi_insert_on_edge_immediate): New function.
* tree-flow.h (bsi_insert_on_edge_immediate,
canonicalize_induction_variables, tree_unroll_loops_completely,
create_iv): Declare.
* tree-optimize.c (init_tree_optimization_passes): Add
pass_iv_canon and pass_complete_unroll.
* tree-pass.h (pass_iv_canon, pass_complete_unroll): Declare.
* tree-scalar-evolution.c (get_loop_exit_condition,
get_exit_conditions_rec, number_of_iterations_in_loop,
scev_initialize): Use single_exit information.
* tree-ssa-loop-niter.c (number_of_iterations_cond): Record
missing assumptions.
(loop_niter_by_eval): Return number of iterations as unsigned
int.
* tree-ssa-loop.c (tree_ssa_loop_init): Mark single exit loops.
(tree_ssa_loop_ivcanon, gate_tree_ssa_loop_ivcanon, pass_iv_canon,
tree_complete_unroll, gate_tree_complete_unroll, pass_complete_unroll):
New passes.
(tree_ssa_loop_done): Call free_numbers_of_iterations_estimates.
* tree-ssanames.c (make_ssa_name): Allow creating ssa name before
the defining statement is ready.
* tree-vectorizer.c (vect_create_iv_simple): Removed.
(vect_create_index_for_array_ref, vect_transform_loop_bound):
Use create_iv.
(vect_transform_loop_bound): Use single_exit information.
(vect_analyze_loop_form): Cleanup bogus tests.
(vectorize_loops): Do not call flow_loop_scan.
* tree.h (may_negate_without_overflow_p): Declare.
* fold-const.c (may_negate_without_overflow_p): Split out from ...
(negate_expr_p): ... this function.
(tree_expr_nonzero_p): Handle overflowed constants correctly.
* doc/invoke.texi (-fivcanon): Document.
* doc/passes.texi: Document canonical induction variable creation.
* gcc.dg/tree-ssa/loop-1.c: New test.
From-SVN: r86516
+2004-08-24 Zdenek Dvorak <rakdver@atrey.karlin.mff.cuni.cz>
+
+ * tree-ssa-loop-ivcanon.c: New file.
+ * tree-ssa-loop-manip.c (create_iv): New function.
+ * Makefile.in (tree-ssa-loop-ivcanon.o): Add.
+ (tree-ssa-loop.o, tree-ssa-loop-manip.o): Add SCEV_H dependency.
+ * cfgloop.c (mark_single_exit_loops): New function.
+ (verify_loop_structure): Verify single-exit loops.
+ * cfgloop.h (struct loop): Add single_exit field.
+ (LOOPS_HAVE_MARKED_SINGLE_EXITS): New constant.
+ (mark_single_exit_loops): Declare.
+ (tree_num_loop_insns): Declare.
+ * cfgloopmanip.c (update_single_exits_after_duplication): New function.
+ (duplicate_loop_to_header_edge): Use it.
+ * common.opt (fivcanon): New flag.
+ * timevar.def (TV_TREE_LOOP_IVCANON, TV_COMPLETE_UNROLL): New timevars.
+ * tree-cfg.c (tree_find_edge_insert_loc): Return newly created block.
+ (bsi_commit_edge_inserts_1): Pass null to tree_find_edge_insert_loc.
+ (bsi_insert_on_edge_immediate): New function.
+ * tree-flow.h (bsi_insert_on_edge_immediate,
+ canonicalize_induction_variables, tree_unroll_loops_completely,
+ create_iv): Declare.
+ * tree-optimize.c (init_tree_optimization_passes): Add
+ pass_iv_canon and pass_complete_unroll.
+ * tree-pass.h (pass_iv_canon, pass_complete_unroll): Declare.
+ * tree-scalar-evolution.c (get_loop_exit_condition,
+ get_exit_conditions_rec, number_of_iterations_in_loop,
+ scev_initialize): Use single_exit information.
+ * tree-ssa-loop-niter.c (number_of_iterations_cond): Record
+ missing assumptions.
+ (loop_niter_by_eval): Return number of iterations as unsigned
+ int.
+ * tree-ssa-loop.c (tree_ssa_loop_init): Mark single exit loops.
+ (tree_ssa_loop_ivcanon, gate_tree_ssa_loop_ivcanon, pass_iv_canon,
+ tree_complete_unroll, gate_tree_complete_unroll, pass_complete_unroll):
+ New passes.
+ (tree_ssa_loop_done): Call free_numbers_of_iterations_estimates.
+ * tree-ssanames.c (make_ssa_name): Allow creating ssa name before
+ the defining statement is ready.
+ * tree-vectorizer.c (vect_create_iv_simple): Removed.
+ (vect_create_index_for_array_ref, vect_transform_loop_bound):
+ Use create_iv.
+ (vect_transform_loop_bound): Use single_exit information.
+ (vect_analyze_loop_form): Cleanup bogus tests.
+ (vectorize_loops): Do not call flow_loop_scan.
+ * tree.h (may_negate_without_overflow_p): Declare.
+ * fold-const.c (may_negate_without_overflow_p): Split out from ...
+ (negate_expr_p): ... this function.
+ (tree_expr_nonzero_p): Handle overflowed constants correctly.
+ * doc/invoke.texi (-fivcanon): Document.
+ * doc/passes.texi: Document canonical induction variable creation.
+
2004-08-24 Richard Sandiford <rsandifo@redhat.com>
* config/mips/mips.h (ISA_HAS_INT_CONDMOVE): Delete.
tree-ssa-dom.o domwalk.o tree-tailcall.o gimple-low.o tree-iterator.o \
tree-phinodes.o tree-ssanames.o tree-sra.o tree-complex.o tree-ssa-loop.o \
tree-ssa-loop-niter.o tree-ssa-loop-manip.o tree-ssa-threadupdate.o \
- tree-vectorizer.o \
+ tree-vectorizer.o tree-ssa-loop-ivcanon.o \
alias.o bb-reorder.o bitmap.o builtins.o caller-save.o calls.o \
cfg.o cfganal.o cfgbuild.o cfgcleanup.o cfglayout.o cfgloop.o \
cfgloopanal.o cfgloopmanip.o loop-init.o loop-unswitch.o loop-unroll.o \
tree-ssa-loop.o : tree-ssa-loop.c $(TREE_FLOW_H) $(CONFIG_H) \
$(SYSTEM_H) $(RTL_H) $(TREE_H) $(TM_P_H) $(CFGLOOP_H) \
output.h diagnostic.h $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
- tree-pass.h $(FLAGS_H) tree-inline.h
+ tree-pass.h $(FLAGS_H) tree-inline.h $(SCEV_H)
tree-ssa-loop-niter.o : tree-ssa-loop-niter.c $(TREE_FLOW_H) $(CONFIG_H) \
$(SYSTEM_H) $(RTL_H) $(TREE_H) $(TM_P_H) cfgloop.h $(PARAMS_H) tree-inline.h \
output.h diagnostic.h $(TM_H) coretypes.h $(TREE_DUMP_H) flags.h \
tree-pass.h $(SCEV_H)
+tree-ssa-loop-ivcanon.o : tree-ssa-loop-ivcanon.c $(TREE_FLOW_H) $(CONFIG_H) \
+ $(SYSTEM_H) $(RTL_H) $(TREE_H) $(TM_P_H) $(CFGLOOP_H) $(PARAMS_H) tree-inline.h \
+ output.h diagnostic.h $(TM_H) coretypes.h $(TREE_DUMP_H) flags.h \
+ tree-pass.h $(SCEV_H)
tree-ssa-loop-ch.o : tree-ssa-loop-ch.c $(TREE_FLOW_H) $(CONFIG_H) \
$(SYSTEM_H) $(RTL_H) $(TREE_H) $(TM_P_H) $(CFGLOOP_H) tree-inline.h \
output.h diagnostic.h $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
tree-ssa-loop-manip.o : tree-ssa-loop-manip.c $(TREE_FLOW_H) $(CONFIG_H) \
$(SYSTEM_H) $(RTL_H) $(TREE_H) $(TM_P_H) $(CFGLOOP_H) \
output.h diagnostic.h $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
- tree-pass.h cfglayout.h
+ tree-pass.h cfglayout.h $(SCEV_H)
tree-ssa-loop-im.o : tree-ssa-loop-im.c $(TREE_FLOW_H) $(CONFIG_H) \
$(SYSTEM_H) $(RTL_H) $(TREE_H) $(TM_P_H) $(CFGLOOP_H) domwalk.h $(PARAMS_H)\
output.h diagnostic.h $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
return num_nodes;
}
+/* For each loop in the LOOPS tree that has just a single exit
+ record the exit edge. */
+
+void
+mark_single_exit_loops (struct loops *loops)
+{
+ basic_block bb;
+ edge e;
+ struct loop *loop;
+ unsigned i;
+
+ for (i = 1; i < loops->num; i++)
+ {
+ loop = loops->parray[i];
+ if (loop)
+ loop->single_exit = NULL;
+ }
+
+ FOR_EACH_BB (bb)
+ {
+ if (bb->loop_father == loops->tree_root)
+ continue;
+ for (e = bb->succ; e; e = e->succ_next)
+ {
+ if (e->dest == EXIT_BLOCK_PTR)
+ continue;
+
+ if (flow_bb_inside_loop_p (bb->loop_father, e->dest))
+ continue;
+
+ for (loop = bb->loop_father;
+ loop != e->dest->loop_father;
+ loop = loop->outer)
+ {
+ /* If we have already seen an exit, mark this by the edge that
+ surely does not occur as any exit. */
+ if (loop->single_exit)
+ loop->single_exit = ENTRY_BLOCK_PTR->succ;
+ else
+ loop->single_exit = e;
+ }
+ }
+ }
+
+ for (i = 1; i < loops->num; i++)
+ {
+ loop = loops->parray[i];
+ if (!loop)
+ continue;
+
+ if (loop->single_exit == ENTRY_BLOCK_PTR->succ)
+ loop->single_exit = NULL;
+ }
+
+ loops->state |= LOOPS_HAVE_MARKED_SINGLE_EXITS;
+}
+
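The sentinel technique used by mark_single_exit_loops can be sketched outside of GCC as follows. This is a minimal toy model, not the GCC data structures: `toy_loop`, `toy_edge`, `toy_mark_single_exits` and `toy_demo` are hypothetical names, an edge records only the innermost loop of its source and destination, and the destination loop of an exit edge is assumed to be an ancestor of the source loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model: an edge only remembers the innermost loop of its
   source block and of its destination block.  */
struct toy_loop;
struct toy_edge
{
  struct toy_loop *src_loop;
  struct toy_loop *dest_loop;
};

struct toy_loop
{
  struct toy_loop *outer;              /* NULL for the tree root.  */
  const struct toy_edge *single_exit;  /* NULL if zero or several exits.  */
};

/* Nonzero if INNER is OUTER or is nested inside it.  */
static int
toy_loop_contains (const struct toy_loop *outer, const struct toy_loop *inner)
{
  for (; inner; inner = inner->outer)
    if (inner == outer)
      return 1;
  return 0;
}

/* Records the unique exit edge of each loop in LOOPS, or NULL.  Assumes the
   destination loop of every exit edge is an ancestor of its source loop.  */
static void
toy_mark_single_exits (struct toy_loop **loops, size_t nloops,
                       const struct toy_edge *edges, size_t nedges)
{
  static const struct toy_edge several;  /* Sentinel, never a real exit.  */
  size_t i;
  struct toy_loop *loop;

  for (i = 0; i < nloops; i++)
    loops[i]->single_exit = NULL;

  for (i = 0; i < nedges; i++)
    {
      if (toy_loop_contains (edges[i].src_loop, edges[i].dest_loop))
        continue;  /* The edge stays inside its source loop.  */

      /* The edge leaves every loop between source and destination.  */
      for (loop = edges[i].src_loop; loop != edges[i].dest_loop;
           loop = loop->outer)
        loop->single_exit = loop->single_exit ? &several : &edges[i];
    }

  /* Loops that saw more than one exit end up with no recorded exit.  */
  for (i = 0; i < nloops; i++)
    if (loops[i]->single_exit == &several)
      loops[i]->single_exit = NULL;
}

/* Builds root > a > b with one exit per loop, plus an optional second exit
   from b into a.  Bit 0 of the result is set if b has a single exit, bit 1
   if a has one.  */
static int
toy_demo (int second_exit_from_b)
{
  struct toy_loop root = { NULL, NULL };
  struct toy_loop a = { &root, NULL };
  struct toy_loop b = { &a, NULL };
  struct toy_loop *loops[] = { &a, &b };
  struct toy_edge edges[3] = { { &b, &a }, { &a, &root }, { &b, &a } };
  size_t nedges = second_exit_from_b ? 3 : 2;

  toy_mark_single_exits (loops, 2, edges, nedges);
  return (b.single_exit != NULL) | ((a.single_exit != NULL) << 1);
}
```

In the patch the sentinel is ENTRY_BLOCK_PTR->succ, an edge that can never be a loop exit; the toy version uses a private static object for the same purpose.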
/* Find the root node of the loop pre-header extended basic block and
the edges along the trace from the root node to the loop header. */
}
}
- free (sizes);
-
/* Check get_loop_body. */
for (i = 1; i < loops->num; i++)
{
free (irreds);
}
+ /* Check the single_exit. */
+ if (loops->state & LOOPS_HAVE_MARKED_SINGLE_EXITS)
+ {
+ memset (sizes, 0, sizeof (unsigned) * loops->num);
+ FOR_EACH_BB (bb)
+ {
+ if (bb->loop_father == loops->tree_root)
+ continue;
+ for (e = bb->succ; e; e = e->succ_next)
+ {
+ if (e->dest == EXIT_BLOCK_PTR)
+ continue;
+
+ if (flow_bb_inside_loop_p (bb->loop_father, e->dest))
+ continue;
+
+ for (loop = bb->loop_father;
+ loop != e->dest->loop_father;
+ loop = loop->outer)
+ {
+ sizes[loop->num]++;
+ if (loop->single_exit
+ && loop->single_exit != e)
+ {
+ error ("Wrong single exit %d->%d recorded for loop %d.",
+ loop->single_exit->src->index,
+ loop->single_exit->dest->index,
+ loop->num);
+ error ("Right exit is %d->%d.",
+ e->src->index, e->dest->index);
+ err = 1;
+ }
+ }
+ }
+ }
+
+ for (i = 1; i < loops->num; i++)
+ {
+ loop = loops->parray[i];
+ if (!loop)
+ continue;
+
+ if (sizes[i] == 1
+ && !loop->single_exit)
+ {
+ error ("Single exit not recorded for loop %d.", loop->num);
+ err = 1;
+ }
+
+ if (sizes[i] != 1
+ && loop->single_exit)
+ {
+ error ("Loop %d should not have single exit (%d -> %d).",
+ loop->num,
+ loop->single_exit->src->index,
+ loop->single_exit->dest->index);
+ err = 1;
+ }
+ }
+ }
+
if (err)
abort ();
+
+ free (sizes);
}
/* Returns latch edge of LOOP. */
/* Upper bound on number of iterations of a loop. */
struct nb_iter_bound *bounds;
+
+ /* If not NULL, the loop has just a single exit edge, stored here (edges to
+ the EXIT_BLOCK_PTR do not count). */
+ edge single_exit;
};
/* Flags for state of loop structure. */
{
LOOPS_HAVE_PREHEADERS = 1,
LOOPS_HAVE_SIMPLE_LATCHES = 2,
- LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS = 4
+ LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS = 4,
+ LOOPS_HAVE_MARKED_SINGLE_EXITS = 8
};
/* Structure to hold CFG information about natural loops within a function. */
extern int flow_loop_scan (struct loop *, int);
extern void flow_loop_free (struct loop *);
void mark_irreducible_loops (struct loops *);
+void mark_single_exit_loops (struct loops *);
extern void create_loop_notes (void);
/* Loop data structure manipulation/querying. */
extern bool flow_bb_inside_loop_p (const struct loop *, const basic_block);
extern struct loop * find_common_loop (struct loop *, struct loop *);
struct loop *superloop_at_depth (struct loop *, unsigned);
+extern unsigned tree_num_loop_insns (struct loop *);
extern int num_loop_insns (struct loop *);
extern int average_num_loop_insns (struct loop *);
extern unsigned get_loop_level (const struct loop *);
return ret;
}
+/* The NBBS blocks in BBS will get duplicated and the copies will be placed
+ in LOOP. Update the single_exit information in superloops of LOOP. */
+
+static void
+update_single_exits_after_duplication (basic_block *bbs, unsigned nbbs,
+ struct loop *loop)
+{
+ unsigned i;
+
+ for (i = 0; i < nbbs; i++)
+ bbs[i]->rbi->duplicated = 1;
+
+ for (; loop->outer; loop = loop->outer)
+ {
+ if (!loop->single_exit)
+ continue;
+
+ if (loop->single_exit->src->rbi->duplicated)
+ loop->single_exit = NULL;
+ }
+
+ for (i = 0; i < nbbs; i++)
+ bbs[i]->rbi->duplicated = 0;
+}
+
/* Duplicates body of LOOP to given edge E NDUPL times. Takes care of updating
LOOPS structure and dominators. E's destination must be LOOP header for
first_active_latch = latch;
}
+ /* Update the information about single exits. */
+ if (loops->state & LOOPS_HAVE_MARKED_SINGLE_EXITS)
+ update_single_exits_after_duplication (bbs, n, target);
+
/* Record exit edge in original loop body. */
if (orig && TEST_BIT (wont_exit, 0))
to_remove[(*n_to_remove)++] = orig;
Common Report Var(flag_instrument_function_entry_exit)
Instrument function entry and exit with profiling calls
+fivcanon
+Common Report Var(flag_ivcanon)
+Create canonical induction variables in loops
+
fkeep-inline-functions
Common Report Var(flag_keep_inline_functions)
Generate code for functions even if they are fully inlined
-funroll-all-loops -funroll-loops -fpeel-loops @gol
-funswitch-loops -fold-unroll-loops -fold-unroll-all-loops @gol
-ftree-pre -ftree-ccp -ftree-dce -ftree-loop-optimize @gol
--ftree-lim @gol
+-ftree-lim -fivcanon @gol
-ftree-dominator-opts -ftree-dse -ftree-copyrename @gol
-ftree-ch -ftree-sra -ftree-ter -ftree-lrs -ftree-fre -ftree-vectorize @gol
--param @var{name}=@var{value}
just trivial invariantness analysis in loop unswitching. The pass also includes
store motion.
+@item -fivcanon
+Create a canonical counter for the number of iterations in loops for which
+determining the number of iterations requires complicated analysis. Later
+optimizations may then determine the number easily. Especially useful
+in connection with unrolling.
+
@item -ftree-sra
Perform scalar replacement of aggregates. This pass replaces structure
references with scalars to prevent committing structures to memory too
just trivial invariantness analysis in loop unswitching. The pass also includes
store motion. The pass is implemented in @file{tree-ssa-loop-im.c}.
+Canonical induction variable creation. This pass creates a simple counter
+for the number of iterations of the loop and replaces the exit condition of
+the loop with a test of it, in cases where a complicated analysis is necessary
+to determine the number of iterations. Later optimizations may then determine
+the number easily. The pass is implemented in @file{tree-ssa-loop-ivcanon.c}.
+
The optimizations also use various utility functions contained in
-@file{cfgloop.c}, @file{cfgloopanal.c} and @file{cfgloopmanip.c}.
+@file{tree-ssa-loop-manip.c}, @file{cfgloop.c}, @file{cfgloopanal.c} and
+@file{cfgloopmanip.c}.
@item Conditional constant propagation
return false;
}
+/* Check whether we may negate an integer constant T without causing
+ overflow. */
+
+bool
+may_negate_without_overflow_p (tree t)
+{
+ unsigned HOST_WIDE_INT val;
+ unsigned int prec;
+ tree type;
+
+ if (TREE_CODE (t) != INTEGER_CST)
+ abort ();
+
+ type = TREE_TYPE (t);
+ if (TYPE_UNSIGNED (type))
+ return false;
+
+ prec = TYPE_PRECISION (type);
+ if (prec > HOST_BITS_PER_WIDE_INT)
+ {
+ if (TREE_INT_CST_LOW (t) != 0)
+ return true;
+ prec -= HOST_BITS_PER_WIDE_INT;
+ val = TREE_INT_CST_HIGH (t);
+ }
+ else
+ val = TREE_INT_CST_LOW (t);
+ if (prec < HOST_BITS_PER_WIDE_INT)
+ val &= ((unsigned HOST_WIDE_INT) 1 << prec) - 1;
+ return val != ((unsigned HOST_WIDE_INT) 1 << (prec - 1));
+}
+
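The single-word case of may_negate_without_overflow_p can be illustrated with fixed-width integers. This is a sketch under stated assumptions: `toy_may_negate_without_overflow` is a hypothetical stand-in that takes the raw bits and the precision directly, and it ignores the unsigned-type check and the two-word path of the real function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the INTEGER_CST case: VAL holds the bits of a
   signed constant of precision PREC (1..64).  Returns true if -VAL is
   representable in PREC bits.  */
static bool
toy_may_negate_without_overflow (uint64_t val, unsigned prec)
{
  if (prec < 64)
    val &= ((uint64_t) 1 << prec) - 1;   /* Drop bits above the precision.  */

  /* Only the minimum value 100...0 has no positive counterpart.  */
  return val != ((uint64_t) 1 << (prec - 1));
}
```

For an 8-bit type, for example, only the bit pattern 0x80 (that is, -128) is rejected; every other value, including -1 (0xff), can be negated safely.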
/* Determine whether an expression T can be cheaply negated using
the function negate_expr. */
static bool
negate_expr_p (tree t)
{
- unsigned HOST_WIDE_INT val;
- unsigned int prec;
tree type;
if (t == 0)
return true;
/* Check that -CST will not overflow type. */
- prec = TYPE_PRECISION (type);
- if (prec > HOST_BITS_PER_WIDE_INT)
- {
- if (TREE_INT_CST_LOW (t) != 0)
- return true;
- prec -= HOST_BITS_PER_WIDE_INT;
- val = TREE_INT_CST_HIGH (t);
- }
- else
- val = TREE_INT_CST_LOW (t);
- if (prec < HOST_BITS_PER_WIDE_INT)
- val &= ((unsigned HOST_WIDE_INT) 1 << prec) - 1;
- return val != ((unsigned HOST_WIDE_INT) 1 << (prec - 1));
+ return may_negate_without_overflow_p (t);
case REAL_CST:
case NEGATE_EXPR:
return tree_expr_nonzero_p (TREE_OPERAND (t, 0));
case INTEGER_CST:
- return !integer_zerop (t);
+ /* We used to test for !integer_zerop here. This does not work correctly
+ if TREE_CONSTANT_OVERFLOW (t). */
+ return (TREE_INT_CST_LOW (t) != 0
+ || TREE_INT_CST_HIGH (t) != 0);
case PLUS_EXPR:
if (!TYPE_UNSIGNED (type) && !flag_wrapv)
+2004-08-24 Zdenek Dvorak <rakdver@atrey.karlin.mff.cuni.cz>
+
+ * gcc.dg/tree-ssa/loop-1.c: New test.
+
2004-08-24 Richard Sandiford <rsandifo@redhat.com>
* gcc.c-torture/compile/20040824-1.c: New test.
--- /dev/null
+/* { dg-do compile } */
+/* { dg-options "-O1 -fivcanon -funroll-loops -fdump-tree-ivcanon-details" } */
+
+void xxx(void)
+{
+ int x = 45;
+
+ while (x >>= 1)
+ foo ();
+}
+
+/* We should be able to find out that the loop iterates four times and unroll it completely. */
+
+/* { dg-final { scan-tree-dump-times "Added canonical iv to loop 1, 4 iterations" 1 "ivcanon"} } */
+/* { dg-final { scan-assembler-times "foo" 5} } */
+
+
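The numbers expected by the test can be checked by evaluating the loop directly, which is roughly the idea behind determining the iteration count by brute force. The helper below is a hypothetical sketch (`toy_count_body_executions` is not part of the patch) and assumes a non-negative starting value.

```c
#include <assert.h>

/* Number of times the body of `while (x >>= 1) foo ();' runs.  X is assumed
   to be non-negative.  */
static unsigned
toy_count_body_executions (int x)
{
  unsigned calls = 0;

  while (x >>= 1)
    calls++;   /* Stands in for the call to foo.  */
  return calls;
}
```

Starting from 45, x takes the values 22, 11, 5, 2, 1 and then 0, so the body runs five times, matching the five occurrences of foo expected in the assembler output. The "4 iterations" in the dump is presumably the number of latch executions once the first test has been copied in front of the loop, which is one less than the number of body executions.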
DEFTIMEVAR (TV_TREE_DSE , "tree DSE")
DEFTIMEVAR (TV_TREE_LOOP , "tree loop optimization")
DEFTIMEVAR (TV_LIM , "loop invariant motion")
+DEFTIMEVAR (TV_TREE_LOOP_IVCANON , "tree canonical iv creation")
+DEFTIMEVAR (TV_COMPLETE_UNROLL , "complete unrolling")
DEFTIMEVAR (TV_TREE_VECTORIZATION , "tree loop vectorization")
DEFTIMEVAR (TV_TREE_CH , "tree copy headers")
DEFTIMEVAR (TV_TREE_SSA_TO_NORMAL , "tree SSA to normal")
In all cases, the returned *BSI points to the correct location. The
return value is true if insertion should be done after the location,
- or false if it should be done before the location. */
+ or false if it should be done before the location. If a new basic block
+ has to be created, it is stored in *NEW_BB. */
static bool
-tree_find_edge_insert_loc (edge e, block_stmt_iterator *bsi)
+tree_find_edge_insert_loc (edge e, block_stmt_iterator *bsi,
+ basic_block *new_bb)
{
basic_block dest, src;
tree tmp;
/* Otherwise, create a new basic block, and split this edge. */
dest = split_edge (e);
+ if (new_bb)
+ *new_bb = dest;
e = dest->pred;
goto restart;
}
PENDING_STMT (e) = NULL_TREE;
- if (tree_find_edge_insert_loc (e, &bsi))
+ if (tree_find_edge_insert_loc (e, &bsi, NULL))
bsi_insert_after (&bsi, stmt, BSI_NEW_STMT);
else
bsi_insert_before (&bsi, stmt, BSI_NEW_STMT);
append_to_statement_list (stmt, &PENDING_STMT (e));
}
+/* Similar to bsi_insert_on_edge+bsi_commit_edge_inserts. If a new block has
+ to be created, it is returned. */
+
+basic_block
+bsi_insert_on_edge_immediate (edge e, tree stmt)
+{
+ block_stmt_iterator bsi;
+ basic_block new_bb = NULL;
+
+ if (PENDING_STMT (e))
+ abort ();
+
+ if (tree_find_edge_insert_loc (e, &bsi, &new_bb))
+ bsi_insert_after (&bsi, stmt, BSI_NEW_STMT);
+ else
+ bsi_insert_before (&bsi, stmt, BSI_NEW_STMT);
+
+ return new_bb;
+}
/*---------------------------------------------------------------------------
Tree specific functions for CFG manipulation
extern void tree_optimize_tail_calls (bool, enum tree_dump_index);
extern edge tree_block_forwards_to (basic_block bb);
extern void bsi_insert_on_edge (edge, tree);
+extern basic_block bsi_insert_on_edge_immediate (edge, tree);
extern void bsi_commit_edge_inserts (int *);
extern void notice_special_calls (tree);
extern void clear_special_calls (void);
/* In tree-ssa-loop*.c */
void tree_ssa_lim (struct loops *);
+void canonicalize_induction_variables (struct loops *);
+void tree_unroll_loops_completely (struct loops *);
void number_of_iterations_cond (tree, tree, tree, enum tree_code, tree, tree,
struct tree_niter_desc *);
void verify_loop_closed_ssa (void);
void loop_commit_inserts (void);
bool for_each_index (tree *, bool (*) (tree, tree *, void *), void *);
+void create_iv (tree, tree, tree, struct loop *, block_stmt_iterator *, bool,
+ tree *, tree *);
/* In tree-flow-inline.h */
static inline int phi_arg_from_edge (tree, edge);
p = &pass_loop.sub;
NEXT_PASS (pass_loop_init);
NEXT_PASS (pass_lim);
+ NEXT_PASS (pass_iv_canon);
NEXT_PASS (pass_vectorize);
+ NEXT_PASS (pass_complete_unroll);
NEXT_PASS (pass_loop_done);
*p = NULL;
extern struct tree_opt_pass pass_loop;
extern struct tree_opt_pass pass_loop_init;
extern struct tree_opt_pass pass_lim;
+extern struct tree_opt_pass pass_iv_canon;
extern struct tree_opt_pass pass_vectorize;
+extern struct tree_opt_pass pass_complete_unroll;
extern struct tree_opt_pass pass_loop_done;
extern struct tree_opt_pass pass_ch;
extern struct tree_opt_pass pass_ccp;
get_loop_exit_condition (struct loop *loop)
{
tree res = NULL_TREE;
+ edge exit_edge = loop->single_exit;
+
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, "(get_loop_exit_condition \n ");
- if (loop->exit_edges)
+ if (exit_edge)
{
- edge exit_edge;
tree expr;
- exit_edge = loop->exit_edges[0];
expr = last_stmt (exit_edge->src);
-
if (analyzable_condition (expr))
res = expr;
}
get_exit_conditions_rec (loop->inner, exit_conditions);
get_exit_conditions_rec (loop->next, exit_conditions);
- flow_loop_scan (loop, LOOP_EXIT_EDGES);
- if (loop->num_exits == 1)
+ if (loop->single_exit)
{
tree loop_condition = get_loop_exit_condition (loop);
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, "(number_of_iterations_in_loop\n");
- if (!loop->exit_edges)
+ exit = loop->single_exit;
+ if (!exit)
goto end;
- exit = loop->exit_edges[0];
if (!number_of_iterations_exit (loop, exit, &niter_desc))
goto end;
for (i = 1; i < loops->num; i++)
if (loops->parray[i])
- {
- flow_loop_scan (loops->parray[i], LOOP_EXIT_EDGES);
- loops->parray[i]->nb_iterations = NULL_TREE;
- }
+ loops->parray[i]->nb_iterations = NULL_TREE;
}
/* Cleans up the information cached by the scalar evolutions analysis. */
--- /dev/null
+/* Induction variable canonicalization.
+ Copyright (C) 2004 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by the
+Free Software Foundation; either version 2, or (at your option) any
+later version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT
+ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING. If not, write to the Free
+Software Foundation, 59 Temple Place - Suite 330, Boston, MA
+02111-1307, USA. */
+
+/* This pass detects the loops that iterate a constant number of times,
+ adds a canonical induction variable (step -1, tested against 0)
+ and replaces the exit test. This enables the less powerful rtl
+ level analysis to use this information.
+
+ This might spoil the code in some cases (by increasing register pressure).
+ Note that if the new variable is not needed, ivopts will get rid
+ of it, so it might only be a problem when there are no other linear induction
+ variables. In that case the created optimization possibilities are likely
+ to pay off.
+
+ Additionally in case we detect that it is beneficial to unroll the
+ loop completely, we do it right here to expose the optimization
+ possibilities to the following passes. */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "rtl.h"
+#include "tm_p.h"
+#include "hard-reg-set.h"
+#include "basic-block.h"
+#include "output.h"
+#include "diagnostic.h"
+#include "tree-flow.h"
+#include "tree-dump.h"
+#include "cfgloop.h"
+#include "tree-pass.h"
+#include "ggc.h"
+#include "tree-chrec.h"
+#include "tree-scalar-evolution.h"
+#include "params.h"
+#include "flags.h"
+#include "tree-inline.h"
+
+/* Adds a canonical induction variable to LOOP iterating NITER times. EXIT
+ is the exit edge whose condition is replaced. */
+
+static void
+create_canonical_iv (struct loop *loop, edge exit, tree niter)
+{
+ edge in;
+ tree cond, type, var;
+ block_stmt_iterator incr_at;
+ enum tree_code cmp;
+
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ {
+ fprintf (dump_file, "Added canonical iv to loop %d, ", loop->num);
+ print_generic_expr (dump_file, niter, TDF_SLIM);
+ fprintf (dump_file, " iterations.\n");
+ }
+
+ cond = last_stmt (exit->src);
+ in = exit->src->succ;
+ if (in == exit)
+ in = in->succ_next;
+
+ /* Note that we do not need to worry about overflows, since
+ type of niter is always unsigned and all comparisons are
+ just for equality/nonequality -- i.e. everything works
+ with modulo arithmetic. */
+
+ type = TREE_TYPE (niter);
+ niter = fold (build2 (PLUS_EXPR, type,
+ niter,
+ build_int_cst (type, 1, 0)));
+ incr_at = bsi_last (in->src);
+ create_iv (niter,
+ fold_convert (type, integer_minus_one_node),
+ NULL_TREE, loop,
+ &incr_at, false, NULL, &var);
+
+ cmp = (exit->flags & EDGE_TRUE_VALUE) ? EQ_EXPR : NE_EXPR;
+ COND_EXPR_COND (cond) = build2 (cmp, boolean_type_node,
+ var,
+ build_int_cst (type, 0, 0));
+ modify_stmt (cond);
+}
+
+/* Computes an estimated number of insns in LOOP. */
+
+unsigned
+tree_num_loop_insns (struct loop *loop)
+{
+ basic_block *body = get_loop_body (loop);
+ block_stmt_iterator bsi;
+ unsigned size = 1, i;
+
+ for (i = 0; i < loop->num_nodes; i++)
+ for (bsi = bsi_start (body[i]); !bsi_end_p (bsi); bsi_next (&bsi))
+ size += estimate_num_insns (bsi_stmt (bsi));
+ free (body);
+
+ return size;
+}
+
+/* Tries to unroll LOOP completely, i.e. NITER times. LOOPS is the
+ loop tree. COMPLETELY_UNROLL is true if we should unroll the loop
+ even if it may cause code growth. EXIT is the exit of the loop
+ that should be eliminated. */
+
+static bool
+try_unroll_loop_completely (struct loops *loops ATTRIBUTE_UNUSED,
+ struct loop *loop,
+ edge exit, tree niter,
+ bool completely_unroll)
+{
+ unsigned HOST_WIDE_INT n_unroll, ninsns, max_unroll;
+ tree old_cond, cond, dont_exit, do_exit;
+
+ if (loop->inner)
+ return false;
+
+ if (!host_integerp (niter, 1))
+ return false;
+ n_unroll = tree_low_cst (niter, 1);
+
+ max_unroll = PARAM_VALUE (PARAM_MAX_COMPLETELY_PEEL_TIMES);
+ if (n_unroll > max_unroll)
+ return false;
+
+ if (n_unroll)
+ {
+ if (!completely_unroll)
+ return false;
+
+ ninsns = tree_num_loop_insns (loop);
+
+ if (n_unroll * ninsns
+ > (unsigned) PARAM_VALUE (PARAM_MAX_COMPLETELY_PEELED_INSNS))
+ return false;
+ }
+
+ if (exit->flags & EDGE_TRUE_VALUE)
+ {
+ dont_exit = boolean_false_node;
+ do_exit = boolean_true_node;
+ }
+ else
+ {
+ dont_exit = boolean_true_node;
+ do_exit = boolean_false_node;
+ }
+ cond = last_stmt (exit->src);
+
+ if (n_unroll)
+ {
+ if (!flag_unroll_loops)
+ return false;
+
+ old_cond = COND_EXPR_COND (cond);
+ COND_EXPR_COND (cond) = dont_exit;
+ modify_stmt (cond);
+
+#if 0
+ /* The necessary infrastructure is not in yet. */
+ if (!tree_duplicate_loop_to_header_edge (loop, loop_preheader_edge (loop),
+ loops, n_unroll, NULL,
+ NULL, NULL, NULL, 0))
+#endif
+ {
+ COND_EXPR_COND (cond) = old_cond;
+ return false;
+ }
+ }
+
+ COND_EXPR_COND (cond) = do_exit;
+ modify_stmt (cond);
+
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ fprintf (dump_file, "Unrolled loop %d completely.\n", loop->num);
+
+ return true;
+}
+
+/* Adds a canonical induction variable to LOOP if suitable. LOOPS is the loops
+ tree. CREATE_IV is true if we may create a new iv. COMPLETELY_UNROLL is
+ true if we should do complete unrolling even if it may cause the code
+ growth. If TRY_EVAL is true, we try to determine the number of iterations
+ of a loop by direct evaluation. Returns true if cfg is changed. */
+
+static bool
+canonicalize_loop_induction_variables (struct loops *loops, struct loop *loop,
+ bool create_iv, bool completely_unroll,
+ bool try_eval)
+{
+ edge exit = NULL;
+ tree niter;
+
+ niter = number_of_iterations_in_loop (loop);
+ if (TREE_CODE (niter) == INTEGER_CST)
+ {
+ exit = loop->single_exit;
+ if (!just_once_each_iteration_p (loop, exit->src))
+ return false;
+
+ /* The result of number_of_iterations_in_loop is one higher than
+ we expect (i.e. it returns the number of executions of the exit
+ condition, not of the loop latch edge). */
+ niter = fold (build2 (MINUS_EXPR, TREE_TYPE (niter), niter,
+ build_int_cst (TREE_TYPE (niter), 1, 0)));
+ }
+ else if (try_eval)
+ niter = find_loop_niter_by_eval (loop, &exit);
+
+ if (chrec_contains_undetermined (niter)
+ || TREE_CODE (niter) != INTEGER_CST)
+ return false;
+
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ {
+ fprintf (dump_file, "Loop %d iterates ", loop->num);
+ print_generic_expr (dump_file, niter, TDF_SLIM);
+ fprintf (dump_file, " times.\n");
+ }
+
+ if (try_unroll_loop_completely (loops, loop, exit, niter, completely_unroll))
+ return true;
+
+ if (create_iv)
+ create_canonical_iv (loop, exit, niter);
+
+ return false;
+}
+
+/* The main entry point of the pass. Adds canonical induction variables
+ to the suitable LOOPS. */
+
+void
+canonicalize_induction_variables (struct loops *loops)
+{
+ unsigned i;
+ struct loop *loop;
+
+ for (i = 1; i < loops->num; i++)
+ {
+ loop = loops->parray[i];
+
+ if (loop)
+ canonicalize_loop_induction_variables (loops, loop, true, false, true);
+ }
+
+#if 0
+ /* The necessary infrastructure is not in yet. */
+ if (changed)
+ cleanup_tree_cfg_loop ();
+#endif
+}
+
+/* Unroll LOOPS completely if they iterate just a few times. */
+
+void
+tree_unroll_loops_completely (struct loops *loops)
+{
+ unsigned i;
+ struct loop *loop;
+ bool changed = false;
+
+ for (i = 1; i < loops->num; i++)
+ {
+ loop = loops->parray[i];
+
+ if (!loop)
+ continue;
+
+ changed |= canonicalize_loop_induction_variables (loops, loop,
+ false, true,
+ !flag_ivcanon);
+ }
+
+#if 0
+ /* The necessary infrastructure is not in yet. */
+ if (changed)
+ cleanup_tree_cfg_loop ();
+#endif
+}
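The effect of create_canonical_iv in this file can be pictured at the source level as follows. This is only a sketch of the resulting control flow, not of the tree manipulation itself; `toy_run_canonical_iv` is a hypothetical name, and the loop body is reduced to counting how often the latch edge is taken.

```c
#include <assert.h>

/* Returns how many times the latch (back edge) is taken when the exit test
   is driven by a counter that starts at NITER + 1, is decremented before
   the test, and exits on zero.  */
static unsigned
toy_run_canonical_iv (unsigned niter)
{
  unsigned ivtmp = niter + 1u;   /* Initial value fed to the header phi.  */
  unsigned latch_executions = 0;

  for (;;)
    {
      ivtmp -= 1u;               /* Step -1, placed before the exit test.  */
      if (ivtmp == 0u)           /* Replacement exit condition.  */
        break;
      latch_executions++;
    }
  return latch_executions;
}
```

Because the counter is unsigned and only compared for equality with zero, wrap-around in `niter + 1` does not need special handling, which is the point made in the comment inside create_canonical_iv.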
#include "cfglayout.h"
#include "tree-scalar-evolution.h"
+/* Creates an induction variable with value BASE + STEP * iteration in LOOP.
+ It is expected that neither BASE nor STEP are shared with other expressions
+ (unless the sharing rules allow this). Use VAR as a base var_decl for it
+ (if NULL, a new temporary will be created). The increment will occur at
+ INCR_POS (after it if AFTER is true, before it otherwise). The ssa versions
+ of the variable before and after increment will be stored in VAR_BEFORE and
+ VAR_AFTER (unless they are NULL). */
+
+void
+create_iv (tree base, tree step, tree var, struct loop *loop,
+ block_stmt_iterator *incr_pos, bool after,
+ tree *var_before, tree *var_after)
+{
+ tree stmt, initial, step1;
+ tree vb, va;
+ enum tree_code incr_op = PLUS_EXPR;
+
+ if (!var)
+ {
+ var = create_tmp_var (TREE_TYPE (base), "ivtmp");
+ add_referenced_tmp_var (var);
+ }
+
+ vb = make_ssa_name (var, NULL_TREE);
+ if (var_before)
+ *var_before = vb;
+ va = make_ssa_name (var, NULL_TREE);
+ if (var_after)
+ *var_after = va;
+
+ /* For easier readability of the created code, produce MINUS_EXPRs
+ when suitable. */
+ if (TREE_CODE (step) == INTEGER_CST)
+ {
+ if (TYPE_UNSIGNED (TREE_TYPE (step)))
+ {
+ step1 = fold (build1 (NEGATE_EXPR, TREE_TYPE (step), step));
+ if (tree_int_cst_lt (step1, step))
+ {
+ incr_op = MINUS_EXPR;
+ step = step1;
+ }
+ }
+ else
+ {
+ if (!tree_expr_nonnegative_p (step)
+ && may_negate_without_overflow_p (step))
+ {
+ incr_op = MINUS_EXPR;
+ step = fold (build1 (NEGATE_EXPR, TREE_TYPE (step), step));
+ }
+ }
+ }
+
+ stmt = build2 (MODIFY_EXPR, void_type_node, va,
+ build2 (incr_op, TREE_TYPE (base),
+ vb, step));
+ SSA_NAME_DEF_STMT (va) = stmt;
+ if (after)
+ bsi_insert_after (incr_pos, stmt, BSI_NEW_STMT);
+ else
+ bsi_insert_before (incr_pos, stmt, BSI_NEW_STMT);
+
+ initial = base;
+
+ stmt = create_phi_node (vb, loop->header);
+ SSA_NAME_DEF_STMT (vb) = stmt;
+ add_phi_arg (&stmt, initial, loop_preheader_edge (loop));
+ add_phi_arg (&stmt, va, loop_latch_edge (loop));
+}
+
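The choice between PLUS_EXPR and MINUS_EXPR for an unsigned constant step in create_iv can be sketched as below. The helper `toy_normalize_unsigned_step` is hypothetical and covers only the unsigned branch; the signed branch of the real function relies on may_negate_without_overflow_p instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Given an unsigned STEP, decides whether `iv - (-STEP)' reads better than
   `iv + STEP'.  Sets *USE_MINUS accordingly and returns the operand to use
   with the chosen operation.  */
static unsigned
toy_normalize_unsigned_step (unsigned step, bool *use_minus)
{
  unsigned negated = 0u - step;   /* Well-defined modulo 2^N.  */

  *use_minus = negated < step;
  return *use_minus ? negated : step;
}
```

A step of all ones, which is how -1 appears in an unsigned type, is therefore emitted as "minus 1", while a small step such as 3 stays "plus 3".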
/* Add exit phis for the USE on EXIT. */
static void
convert (niter_type, integer_one_node));
}
+ assumption = fold (build2 (FLOOR_MOD_EXPR, niter_type, base1, d));
+ assumption = fold (build2 (EQ_EXPR, boolean_type_node,
+ assumption,
+ build_int_cst (niter_type, 0, 0)));
+ assumptions = fold (build2 (TRUTH_AND_EXPR, boolean_type_node,
+ assumptions, assumption));
+
tmp = fold (build (EXACT_DIV_EXPR, niter_type, base1, d));
tmp = fold (build (MULT_EXPR, niter_type, tmp, inverse (s, bound)));
niter->niter = fold (build (BIT_AND_EXPR, niter_type, tmp, bound));
fprintf (dump_file,
"Proved that loop %d iterates %d times using brute force.\n",
loop->num, i);
- return build_int_cst (NULL_TREE, i, 0);
+ return build_int_cst (unsigned_type_node, i, 0);
}
for (j = 0; j < 2; j++)
current_loops = tree_loop_optimizer_init (dump_file);
if (!current_loops)
return;
+
+ /* Find the loops that are exited through exactly one edge. */
+ mark_single_exit_loops (current_loops);
+
scev_initialize (current_loops);
}
TODO_dump_func /* todo_flags_finish */
};
+/* Canonical induction variable creation pass. */
+
+static void
+tree_ssa_loop_ivcanon (void)
+{
+ if (!current_loops)
+ return;
+
+ canonicalize_induction_variables (current_loops);
+}
+
+static bool
+gate_tree_ssa_loop_ivcanon (void)
+{
+ return flag_ivcanon != 0;
+}
+
+struct tree_opt_pass pass_iv_canon =
+{
+ "ivcanon", /* name */
+ gate_tree_ssa_loop_ivcanon, /* gate */
+ tree_ssa_loop_ivcanon, /* execute */
+ NULL, /* sub */
+ NULL, /* next */
+ 0, /* static_pass_number */
+ TV_TREE_LOOP_IVCANON, /* tv_id */
+ PROP_cfg | PROP_ssa, /* properties_required */
+ 0, /* properties_provided */
+ 0, /* properties_destroyed */
+ 0, /* todo_flags_start */
+ TODO_dump_func /* todo_flags_finish */
+};
+
+/* Complete unrolling of loops. */
+
+static void
+tree_complete_unroll (void)
+{
+ if (!current_loops)
+ return;
+
+ tree_unroll_loops_completely (current_loops);
+}
+
+static bool
+gate_tree_complete_unroll (void)
+{
+ return flag_unroll_loops != 0;
+}
+
+struct tree_opt_pass pass_complete_unroll =
+{
+ "cunroll", /* name */
+ gate_tree_complete_unroll, /* gate */
+ tree_complete_unroll, /* execute */
+ NULL, /* sub */
+ NULL, /* next */
+ 0, /* static_pass_number */
+ TV_COMPLETE_UNROLL, /* tv_id */
+ PROP_cfg | PROP_ssa, /* properties_required */
+ 0, /* properties_provided */
+ 0, /* properties_destroyed */
+ 0, /* todo_flags_start */
+ TODO_dump_func /* todo_flags_finish */
+};
+
/* Loop optimizer finalization. */
static void
if (!current_loops)
return;
- scev_finalize ();
-
#ifdef ENABLE_CHECKING
verify_loop_closed_ssa ();
#endif
+ free_numbers_of_iterations_estimates (current_loops);
+ scev_finalize ();
loop_optimizer_finalize (current_loops,
(dump_flags & TDF_DETAILS ? dump_file : NULL));
current_loops = NULL;
#if defined ENABLE_CHECKING
if ((!DECL_P (var)
&& TREE_CODE (var) != INDIRECT_REF)
- || (!IS_EXPR_CODE_CLASS (TREE_CODE_CLASS (TREE_CODE (stmt)))
+ || (stmt
+ && !IS_EXPR_CODE_CLASS (TREE_CODE_CLASS (TREE_CODE (stmt)))
&& TREE_CODE (stmt) != PHI_NODE))
abort ();
#endif
return false;
}
-
-/* THIS IS A COPY OF THE FUNCTION IN TREE-SSA-IVOPTS.C, MODIFIED
- TO NOT USE FORCE_GIMPLE_OPERAND. When that function is accepted
- into he mainline, This function can go away and be replaced by it.
- Creates an induction variable with value BASE + STEP * iteration in
- LOOP. It is expected that neither BASE nor STEP are shared with
- other expressions (unless the sharing rules allow this). Use VAR
- as a base var_decl for it (if NULL, a new temporary will be
- created). The increment will occur at INCR_POS (after it if AFTER
- is true, before it otherwise). The ssa versions of the variable
- before and after increment will be stored in VAR_BEFORE and
- VAR_AFTER (unless they are NULL). */
-
-static void
-vect_create_iv_simple (tree base, tree step, tree var, struct loop *loop,
- block_stmt_iterator *incr_pos, bool after,
- tree *var_before, tree *var_after)
-{
- tree stmt, stmts, initial;
- tree vb, va;
- stmts = NULL;
-
- if (!var)
- {
- var = create_tmp_var (TREE_TYPE (base), "ivtmp");
- add_referenced_tmp_var (var);
- }
-
- vb = make_ssa_name (var, build_empty_stmt ());
- if (var_before)
- *var_before = vb;
- va = make_ssa_name (var, build_empty_stmt ());
- if (var_after)
- *var_after = va;
-
- stmt = build (MODIFY_EXPR, void_type_node, va,
- build (PLUS_EXPR, TREE_TYPE (base), vb, step));
- SSA_NAME_DEF_STMT (va) = stmt;
- if (after)
- bsi_insert_after (incr_pos, stmt, BSI_NEW_STMT);
- else
- bsi_insert_before (incr_pos, stmt, BSI_NEW_STMT);
-
- /* Our base is always a GIMPLE variable, thus, we don't need to
- force_gimple_operand it. */
- initial = base;
- if (stmts)
- {
- edge pe = loop_preheader_edge (loop);
- bsi_insert_on_edge (pe, stmts);
- }
-
- stmt = create_phi_node (vb, loop->header);
- SSA_NAME_DEF_STMT (vb) = stmt;
- add_phi_arg (&stmt, initial, loop_preheader_edge (loop));
- add_phi_arg (&stmt, va, loop_latch_edge (loop));
-}
-
-
/* Function vect_get_base_decl_and_bit_offset
Get the decl from which the data reference REF is based,
fprintf (dump_file, ")");
}
- /* both init and step are guaranted to be gimple expressions,
- so we can use vect_create_iv_simple. */
- vect_create_iv_simple (init, step, NULL, loop, bsi, false,
- &indx_before_incr, &indx_after_incr);
+ create_iv (init, step, NULL_TREE, loop, bsi, false,
+ &indx_before_incr, &indx_after_incr);
return indx_before_incr;
}
vect_transform_loop_bound (loop_vec_info loop_vinfo)
{
struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
- edge exit_edge = loop->exit_edges[0];
+ edge exit_edge = loop->single_exit;
block_stmt_iterator loop_exit_bsi = bsi_last (exit_edge->src);
tree indx_before_incr, indx_after_incr;
tree orig_cond_expr;
if (orig_cond_expr != bsi_stmt (loop_exit_bsi))
abort ();
- /* both init and step are guaranted to be gimple expressions,
- so we can use vect_create_iv_simple. */
- vect_create_iv_simple (integer_zero_node, integer_one_node, NULL_TREE, loop,
- &loop_exit_bsi, false, &indx_before_incr, &indx_after_incr);
+ create_iv (integer_zero_node, integer_one_node, NULL_TREE, loop,
+ &loop_exit_bsi, false, &indx_before_incr, &indx_after_incr);
/* bsi_insert is using BSI_NEW_STMT. We need to bump it back
to point to the exit condition. */
if (vect_debug_details (loop))
fprintf (dump_file, "\n<<vect_analyze_loop_form>>\n");
- if (loop->level > 1 /* FORNOW: inner-most loop */
- || loop->num_exits > 1 || loop->num_entries > 1 || loop->num_nodes != 2
- || !loop->pre_header || !loop->header || !loop->latch)
+ if (loop->inner
+ || !loop->single_exit
+ || loop->num_nodes != 2)
{
if (vect_debug_stats (loop) || vect_debug_details (loop))
{
fprintf (dump_file, "not vectorized: bad loop form. ");
- if (loop->level > 1)
+ if (loop->inner)
fprintf (dump_file, "nested loop.");
- else if (loop->num_exits > 1 || loop->num_entries > 1)
- fprintf (dump_file, "multiple entries or exits.");
- else if (loop->num_nodes != 2 || !loop->header || !loop->latch)
+ else if (!loop->single_exit)
+ fprintf (dump_file, "multiple exits.");
+ else if (loop->num_nodes != 2)
fprintf (dump_file, "too many BBs in loop.");
- else if (!loop->pre_header)
- fprintf (dump_file, "no pre-header BB for loop.");
}
return NULL;
if (!loop)
continue;
- flow_loop_scan (loop, LOOP_ALL);
-
loop_vinfo = vect_analyze_loop (loop);
loop->aux = loop_vinfo;
extern int tree_int_cst_msb (tree);
extern int tree_int_cst_sgn (tree);
extern int tree_expr_nonnegative_p (tree);
+extern bool may_negate_without_overflow_p (tree);
extern tree get_inner_array_type (tree);
/* From expmed.c. Since rtl.h is included after tree.h, we can't