From 8e8d51622f9c4aca782074532ee563f4c70f2e2f Mon Sep 17 00:00:00 2001 From: Caroline Tice Date: Wed, 25 Aug 2004 19:52:54 +0000 Subject: [PATCH] Add more details to hot/cold partitioning comments and documentation. 2004-08-25 Caroline Tice * bb-reorder.c (partition_hot_cold_basic_blocks): Add more details to comments at start of function. * cfgbuild.c (make_edges): Add more details to hot/cold partitioning comment. * cfgcleanup.c (try_simplify_condjump, try_forward_edges, merge_blocks_move_predecessor_nojumps, merge_blocks_move_successor_nojumps, merge_blocks_move, try_crossjump_to_edge, try_crossjump_bb): Likewise. * cfglayout.c (fixup_reorder_chain): Likewise. * cfgrtl.c (rtl_can_merge_blocks, try_redirect_by_replacing_jump, cfg_layout_can_merge_blocks_p): Likewise. * ifcvt.c (find_if_case_1, find_if_case_2): Likewise. * passes.c (rest_of_compilation): Update comments for calling optimization that partitions hot/cold basic blocks. * doc/invoke.texi: Update documentation of freorder-blocks-and-partition flag. From-SVN: r86570 --- gcc/ChangeLog | 19 +++++++++++++ gcc/bb-reorder.c | 59 ++++++++++++++++++++++++++++++++-------- gcc/cfgbuild.c | 5 +++- gcc/cfgcleanup.c | 65 +++++++++++++++++++++++++++++++++++++++------ gcc/cfglayout.c | 3 ++- gcc/cfgrtl.c | 28 ++++++++++++------- gcc/doc/invoke.texi | 5 ++++ gcc/ifcvt.c | 19 +++++++++++-- gcc/passes.c | 5 ++-- 9 files changed, 173 insertions(+), 35 deletions(-) diff --git a/gcc/ChangeLog b/gcc/ChangeLog index d75e804f61a..71935c9a9ff 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,22 @@ +2004-08-25 Caroline Tice + + * bb-reorder.c (partition_hot_cold_basic_blocks): Add more details + to comments at start of function. + * cfgbuild.c (make_edges): Add more details to hot/cold partitioning + comment. + * cfgcleanup.c (try_simplify_condjump, try_forward_edges, + merge_blocks_move_predecessor_nojumps, + merge_blocks_move_successor_nojumps, merge_blocks_move, + try_crossjump_to_edge, try_crossjump_bb): Likewise. 
+ * cfglayout.c (fixup_reorder_chain): Likewise. + * cfgrtl.c (rtl_can_merge_blocks, try_redirect_by_replacing_jump, + cfg_layout_can_merge_blocks_p): Likewise. + * ifcvt.c (find_if_case_1, find_if_case_2): Likewise. + * passes.c (rest_of_compilation): Update comments for calling + optimization that partitions hot/cold basic blocks. + * doc/invoke.texi: Update documentation of + freorder-blocks-and-partition flag. + 2004-08-25 Richard Sandiford * config/mips/mips.md (reg): Renamed mode attribute from ccreg. diff --git a/gcc/bb-reorder.c b/gcc/bb-reorder.c index ddf586c15d9..cf39ce0c988 100644 --- a/gcc/bb-reorder.c +++ b/gcc/bb-reorder.c @@ -2009,20 +2009,57 @@ reorder_basic_blocks (unsigned int flags) been called. However part of this optimization may introduce new register usage, so it must be called before register allocation has occurred. This means that this optimization is actually called - well before the optimization that reorders basic blocks (see function - above). + well before the optimization that reorders basic blocks (see + function above). This optimization checks the feedback information to determine - which basic blocks are hot/cold and adds - NOTE_INSN_UNLIKELY_EXECUTED_CODE to non-hot basic blocks. The + which basic blocks are hot/cold and causes reorder_basic_blocks to + add NOTE_INSN_UNLIKELY_EXECUTED_CODE to non-hot basic blocks. The presence or absence of this note is later used for writing out - sections in the .o file. This optimization must also modify the - CFG to make sure there are no fallthru edges between hot & cold - blocks, as those blocks will not necessarily be contiguous in the - .o (or assembly) file; and in those cases where the architecture - requires it, conditional and unconditional branches that cross - between sections are converted into unconditional or indirect - jumps, depending on what is appropriate. */ + sections in the .o file. 
Because hot and cold sections can be + arbitrarily large (within the bounds of memory), far beyond the + size of a single function, it is necessary to fix up all edges that + cross section boundaries, to make sure the instructions used can + actually span the required distance. The fixes are described + below. + + Fall-through edges must be changed into jumps; it is not safe or + legal to fall through across a section boundary. Whenever a + fall-through edge crossing a section boundary is encountered, a new + basic block is inserted (in the same section as the fall-through + source), and the fall through edge is redirected to the new basic + block. The new basic block contains an unconditional jump to the + original fall-through target. (If the unconditional jump is + insufficient to cross section boundaries, that is dealt with a + little later, see below). + + In order to deal with architectures that have short conditional + branches (which cannot span all of memory) we take any conditional + jump that attempts to cross a section boundary and add a level of + indirection: it becomes a conditional jump to a new basic block, in + the same section. The new basic block contains an unconditional + jump to the original target, in the other section. + + For those architectures whose unconditional branch is also + incapable of reaching all of memory, those unconditional jumps are + converted into indirect jumps, through a register. + + IMPORTANT NOTE: This optimization causes some messy interactions + with the cfg cleanup optimizations; those optimizations want to + merge blocks wherever possible, and to collapse indirect jump + sequences (change "A jumps to B jumps to C" directly into "A jumps + to C"). 
Those optimizations can undo the jump fixes that + partitioning is required to make (see above), in order to ensure + that jumps attempting to cross section boundaries are really able + to cover whatever distance the jump requires (on many architectures + conditional or unconditional jumps are not able to reach all of + memory). Therefore tests have to be inserted into each such + optimization to make sure that it does not undo stuff necessary to + cross partition boundaries. This would be much less of a problem + if we could perform this optimization later in the compilation, but + unfortunately the fact that we may need to create indirect jumps + (through registers) requires that this optimization be performed + before register allocation. */ void partition_hot_cold_basic_blocks (void) diff --git a/gcc/cfgbuild.c b/gcc/cfgbuild.c index cb451efe4f2..6c4a67ac572 100644 --- a/gcc/cfgbuild.c +++ b/gcc/cfgbuild.c @@ -232,7 +232,10 @@ make_edges (basic_block min, basic_block max, int update_p) current_function_has_computed_jump = 0; /* If we are partitioning hot and cold basic blocks into separate - sections, we cannot assume there is no computed jump. */ + sections, we cannot assume there is no computed jump (partitioning + sometimes requires the use of indirect jumps; see comments about + partitioning at the top of bb-reorder.c:partition_hot_cold_basic_blocks + for complete details). */ if (flag_reorder_blocks_and_partition) current_function_has_computed_jump = 1; diff --git a/gcc/cfgcleanup.c b/gcc/cfgcleanup.c index 22821f3a4e6..91412cf84d5 100644 --- a/gcc/cfgcleanup.c +++ b/gcc/cfgcleanup.c @@ -150,7 +150,13 @@ try_simplify_condjump (basic_block cbranch_block) /* If we are partitioning hot/cold basic blocks, we don't want to mess up unconditional or indirect jumps that cross between hot - and cold sections. */ + and cold sections. 
+ + Basic block partitioning may result in some jumps that appear to + be optimizable (or blocks that appear to be mergeable), but which really + must be left untouched (they are required to make it safely across + partition boundaries). See the comments at the top of + bb-reorder.c:partition_hot_cold_basic_blocks for complete details. */ if (flag_reorder_blocks_and_partition && (BB_PARTITION (jump_block) != BB_PARTITION (jump_dest_block) @@ -419,8 +425,14 @@ try_forward_edges (int mode, basic_block b) /* If we are partitioning hot/cold basic blocks, we don't want to mess up unconditional or indirect jumps that cross between hot - and cold sections. */ + and cold sections. + Basic block partitioning may result in some jumps that appear to + be optimizable (or blocks that appear to be mergeable), but which really + must be left untouched (they are required to make it safely across + partition boundaries). See the comments at the top of + bb-reorder.c:partition_hot_cold_basic_blocks for complete details. */ + if (flag_reorder_blocks_and_partition && find_reg_note (BB_END (b), REG_CROSSING_JUMP, NULL_RTX)) return false; @@ -447,7 +459,14 @@ try_forward_edges (int mode, basic_block b) counter = 0; /* If we are partitioning hot/cold basic_blocks, we don't want to mess - up jumps that cross between hot/cold sections. */ + up jumps that cross between hot/cold sections. + + Basic block partitioning may result in some jumps that appear + to be optimizable (or blocks that appear to be mergeable), but which + really must be left untouched (they are required to make it safely + across partition boundaries). See the comments at the top of + bb-reorder.c:partition_hot_cold_basic_blocks for complete + details. 
*/ if (flag_reorder_blocks_and_partition && first != EXIT_BLOCK_PTR @@ -670,8 +689,14 @@ merge_blocks_move_predecessor_nojumps (basic_block a, basic_block b) /* If we are partitioning hot/cold basic blocks, we don't want to mess up unconditional or indirect jumps that cross between hot - and cold sections. */ + and cold sections. + Basic block partitioning may result in some jumps that appear to + be optimizable (or blocks that appear to be mergeable), but which really + must be left untouched (they are required to make it safely across + partition boundaries). See the comments at the top of + bb-reorder.c:partition_hot_cold_basic_blocks for complete details. */ + if (flag_reorder_blocks_and_partition && (BB_PARTITION (a) != BB_PARTITION (b) || find_reg_note (BB_END (a), REG_CROSSING_JUMP, NULL_RTX))) @@ -722,8 +747,14 @@ merge_blocks_move_successor_nojumps (basic_block a, basic_block b) /* If we are partitioning hot/cold basic blocks, we don't want to mess up unconditional or indirect jumps that cross between hot - and cold sections. */ + and cold sections. + Basic block partitioning may result in some jumps that appear to + be optimizable (or blocks that appear to be mergeable), but which really + must be left untouched (they are required to make it safely across + partition boundaries). See the comments at the top of + bb-reorder.c:partition_hot_cold_basic_blocks for complete details. */ + if (flag_reorder_blocks_and_partition && (find_reg_note (BB_END (a), REG_CROSSING_JUMP, NULL_RTX) || BB_PARTITION (a) != BB_PARTITION (b))) @@ -787,8 +818,14 @@ merge_blocks_move (edge e, basic_block b, basic_block c, int mode) /* If we are partitioning hot/cold basic blocks, we don't want to mess up unconditional or indirect jumps that cross between hot - and cold sections. */ + and cold sections. 
+ Basic block partitioning may result in some jumps that appear to + be optimizable (or blocks that appear to be mergeable), but which really + must be left untouched (they are required to make it safely across + partition boundaries). See the comments at the top of + bb-reorder.c:partition_hot_cold_basic_blocks for complete details. */ + if (flag_reorder_blocks_and_partition && (find_reg_note (BB_END (b), REG_CROSSING_JUMP, NULL_RTX) || find_reg_note (BB_END (c), REG_CROSSING_JUMP, NULL_RTX) @@ -1471,7 +1508,13 @@ try_crossjump_to_edge (int mode, edge e1, edge e2) newpos1 = newpos2 = NULL_RTX; /* If we have partitioned hot/cold basic blocks, it is a bad idea - to try this optimization. */ + to try this optimization. + + Basic block partitioning may result in some jumps that appear to + be optimizable (or blocks that appear to be mergeable), but which really + must be left untouched (they are required to make it safely across + partition boundaries). See the comments at the top of + bb-reorder.c:partition_hot_cold_basic_blocks for complete details. */ if (flag_reorder_blocks_and_partition && no_new_pseudos) return false; @@ -1670,8 +1713,14 @@ try_crossjump_bb (int mode, basic_block bb) /* If we are partitioning hot/cold basic blocks, we don't want to mess up unconditional or indirect jumps that cross between hot - and cold sections. */ + and cold sections. + Basic block partitioning may result in some jumps that appear to + be optimizable (or blocks that appear to be mergeable), but which really + must be left untouched (they are required to make it safely across + partition boundaries). See the comments at the top of + bb-reorder.c:partition_hot_cold_basic_blocks for complete details. 
*/ + if (flag_reorder_blocks_and_partition && (BB_PARTITION (bb->pred->src) != BB_PARTITION (bb->pred->pred_next->src) || (bb->pred->flags & EDGE_CROSSING))) diff --git a/gcc/cfglayout.c b/gcc/cfglayout.c index 397180633bc..994ab45c491 100644 --- a/gcc/cfglayout.c +++ b/gcc/cfglayout.c @@ -794,7 +794,8 @@ fixup_reorder_chain (void) bb = nb; /* Make sure new bb is tagged for correct section (same as - fall-thru source). */ + fall-thru source, since you cannot fall-thru across + section boundaries). */ BB_COPY_PARTITION (e_fall->src, bb->pred->src); if (flag_reorder_blocks_and_partition && targetm.have_named_sections) diff --git a/gcc/cfgrtl.c b/gcc/cfgrtl.c index c4cb01214e4..6dba6c1dfa8 100644 --- a/gcc/cfgrtl.c +++ b/gcc/cfgrtl.c @@ -613,10 +613,11 @@ rtl_can_merge_blocks (basic_block a,basic_block b) mess up unconditional or indirect jumps that cross between hot and cold sections. - ??? If two basic blocks could otherwise be merged (which implies - that the jump between the two is unconditional), and one is in a - hot section and the other is in a cold section, surely that means - that one of the section choices is wrong. */ + Basic block partitioning may result in some jumps that appear to + be optimizable (or blocks that appear to be mergeable), but which really + must be left untouched (they are required to make it safely across + partition boundaries). See the comments at the top of + bb-reorder.c:partition_hot_cold_basic_blocks for complete details. */ if (flag_reorder_blocks_and_partition && (find_reg_note (BB_END (a), REG_CROSSING_JUMP, NULL_RTX) @@ -672,7 +673,13 @@ try_redirect_by_replacing_jump (edge e, basic_block target, bool in_cfglayout) /* If we are partitioning hot/cold basic blocks, we don't want to mess up unconditional or indirect jumps that cross between hot - and cold sections. */ + and cold sections. 
+ + Basic block partitioning may result in some jumps that appear to + be optimizable (or blocks that appear to be mergeable), but which really + must be left untouched (they are required to make it safely across + partition boundaries). See the comments at the top of + bb-reorder.c:partition_hot_cold_basic_blocks for complete details. */ if (flag_reorder_blocks_and_partition && (find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX) @@ -2663,11 +2670,12 @@ cfg_layout_can_merge_blocks_p (basic_block a, basic_block b) mess up unconditional or indirect jumps that cross between hot and cold sections. - ??? If two basic blocks could otherwise be merged (which implies - that the jump between the two is unconditional), and one is in a - hot section and the other is in a cold section, surely that means - that one of the section choices is wrong. */ - + Basic block partitioning may result in some jumps that appear to + be optimizable (or blocks that appear to be mergeable), but which really + must be left untouched (they are required to make it safely across + partition boundaries). See the comments at the top of + bb-reorder.c:partition_hot_cold_basic_blocks for complete details. */ + if (flag_reorder_blocks_and_partition && (find_reg_note (BB_END (a), REG_CROSSING_JUMP, NULL_RTX) || find_reg_note (BB_END (b), REG_CROSSING_JUMP, NULL_RTX) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 4437699ae8d..25acc613c8e 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -4617,6 +4617,11 @@ to reduce number of taken branches, partitions hot and cold basic blocks into separate sections of the assembly and .o files, to improve paging and cache locality performance. +This optimization is automatically turned off in the presence of +exception handling, for linkonce sections, for functions with a user-defined +section attribute and on any architecture that does not support named +sections. 
+ @item -freorder-functions @opindex freorder-functions Reorder basic blocks in the compiled function in order to reduce number of diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c index 2711f1b516d..7473054757f 100644 --- a/gcc/ifcvt.c +++ b/gcc/ifcvt.c @@ -2850,8 +2850,14 @@ find_if_case_1 (basic_block test_bb, edge then_edge, edge else_edge) /* If we are partitioning hot/cold basic blocks, we don't want to mess up unconditional or indirect jumps that cross between hot - and cold sections. */ + and cold sections. + Basic block partitioning may result in some jumps that appear to + be optimizable (or blocks that appear to be mergeable), but which really + must be left untouched (they are required to make it safely across + partition boundaries). See the comments at the top of + bb-reorder.c:partition_hot_cold_basic_blocks for complete details. */ + if (flag_reorder_blocks_and_partition && ((BB_END (then_bb) && find_reg_note (BB_END (then_bb), REG_CROSSING_JUMP, NULL_RTX)) @@ -2909,6 +2915,9 @@ find_if_case_1 (basic_block test_bb, edge then_edge, edge else_edge) { new_bb->index = then_bb_index; BASIC_BLOCK (then_bb_index) = new_bb; + /* Since the fallthru edge was redirected from test_bb to new_bb, + we need to ensure that new_bb is in the same partition as + test_bb (you cannot fall through across section boundaries). */ BB_COPY_PARTITION (new_bb, test_bb); } /* We've possibly created jump to next insn, cleanup_cfg will solve that @@ -2933,8 +2942,14 @@ find_if_case_2 (basic_block test_bb, edge then_edge, edge else_edge) /* If we are partitioning hot/cold basic blocks, we don't want to mess up unconditional or indirect jumps that cross between hot - and cold sections. */ + and cold sections. + Basic block partitioning may result in some jumps that appear to + be optimizable (or blocks that appear to be mergeable), but which really + must be left untouched (they are required to make it safely across + partition boundaries). 
See the comments at the top of + bb-reorder.c:partition_hot_cold_basic_blocks for complete details. */ + if (flag_reorder_blocks_and_partition && ((BB_END (then_bb) && find_reg_note (BB_END (then_bb), REG_CROSSING_JUMP, NULL_RTX)) diff --git a/gcc/passes.c b/gcc/passes.c index 7f0aeb6d138..06c2d8990f7 100644 --- a/gcc/passes.c +++ b/gcc/passes.c @@ -1856,8 +1856,9 @@ rest_of_compilation (void) rest_of_handle_if_after_combine (); /* The optimization to partition hot/cold basic blocks into separate - sections of the .o file does not work well with exception handling. - Don't call it if there are exceptions. */ + sections of the .o file does not work well with linkonce or with + user defined section attributes. Don't call it if either case + arises. */ if (flag_reorder_blocks_and_partition && !DECL_ONE_ONLY (current_function_decl) -- 2.30.2