From 9fb32434113c28e3f8f2346be2609a4f6fd136ea Mon Sep 17 00:00:00 2001
From: Caroline Tice
Date: Wed, 18 Aug 2004 09:22:08 -0700
Subject: [PATCH] Hot/cold partitioning update patch.

Hot/cold partitioning update patch.

The problems that this patch attempted to address/fix are:

- Fix places where adding in_unlikely_executed_text to the enum data
  type "in_section" threw off switch case statements.
- Make it work correctly (by turning it off) for functions where user
  specifies "__attribute__ section"
- Make it work correctly (by turning it off) for linkonce sections
- Make it work correctly with -ffunction-sections flag
- Make it output correct cold section labels
- Undo some changes to original assembly code generation
- Turn off hot/cold partitioning in the presence of DWARF debugging
  (for the moment)
- Turn off hot/cold partitioning for architectures that do not support
  named sections
- Use variables rather than constants for cold section labels and
  names (to work correctly with -ffunction-sections, among other
  things)

2004-08-18  Caroline Tice

	* Makefile.in (STAGEFEEDBACK_FLAGS_TO_PASS): Add
	"-freorder-blocks-and-partition" to the flags used in second stage
	of profiledbootstrap.
	* bb-reorder.c (push_to_next_round_p): Add new variable,
	next_round_is_last; set and use variable to make sure, when
	partitioning, that the last trace construction round consists of
	all (and only) cold basic blocks.
	(rotate_loop): Don't copy blocks that end in a section crossing jump.
	(copy_bb): Correctly initialize "partition" of duplicated bb.
	(add_unlikely_executed_notes): Add a comment.
	(find_rarely_executed_basic_blocks_and_crossing_edges): Modify to
	make sure, if function contains hot blocks, that the successors of
	ENTRY_BLOCK_PTR are hot; also, only look for crossing edges if the
	architecture supports named sections.
	(mark_bb_for_unlikely_executed_section): Modify to always insert
	the NOTE_INSN_UNLIKELY_EXECUTED_CODE immediately after the basic
	block note insn.
	(fix_crossing_unconditional_branches): Remove extra space.
	(fix_edges_for_rarely_executed_code): Modify to only do partitioning
	work if the architecture supports named sections.
	(reorder_basic_blocks): Modify to only add
	NOTE_INSN_UNLIKELY_EXECUTED_CODE notes if the architecture supports
	named sections.
	* c-common.c (handle_section_attribute): Initialize new global
	variable, user_defined_section_attribute, to true if user has
	specified one.
	* cfgcleanup.c (try_forward_edges): Modify to not attempt to forward
	edges that cross section boundaries.
	* cfglayout.c (fixup_reorder_chain): Modify to only fix up
	partitioning information if the architecture supports named sections.
	* cfgrtl.c (target.h): Add statement to include this.
	(rtl_split_block): Make sure newly created bb gets correct partition.
	(try_redirect_by_replacing_jump): Make sure redirection isn't
	attempting to cross section boundaries.
	(force_nonfallthru_and_redirect): Only do partition fix up if
	architecture supports named sections.
	(rtl_split_edge): Make sure newly created bb ends up in correct
	partition.
	(commit_one_edge_insertion): Remove code that incorrectly updated
	basic block partition; make sure partition fix up only happens if
	architecture supports named sections and it's not already done.
	(rtl_verify_flow_info_1): Fix if-condition on test/error condition
	that fallthru edges are not allowed to cross section boundaries.
	* defaults.h (NORMAL_TEXT_SECTION_NAME): Remove this.
	* final.c (final_scan_insn): Remove redundant test from if-statement;
	change calls to text_section into calls to function_section; add code
	to only do partitioning fix up if architecture supports named
	sections.
	* ifcvt.c (find_if_case_1): Make sure newly created bb has correct
	partition.
	(if_convert): Add targetm.have_named_sections to test.
	* output.h (unlikely_section_label): Extern declaration for new
	global variable.
	(unlikely_text_section_name): Likewise.
	* opts.c (decode_options): If both partitioning and DWARF debugging
	are turned on, issue a warning that this doesn't work, and change
	partitioning to basic block reordering (without hot/cold partitions).
	* passes.c (rest_of_handle_final): Re-set new global variable,
	user_defined_section_attribute, to false.
	(rest_of_compilation): Change options for calling partitioning
	function: Don't call if the user defined the section attribute, and
	don't call if DECL_ONE_ONLY is true for the current function.
	* predict.c (choose_function_section): Return immediately if we are
	doing hot/cold partitioning (i.e. let the basic block partitioning
	determine where the function belongs).
	* reg-stack.c (emit_swap_insn): Add condition to step over
	NOTE_INSN_UNLIKELY_EXECUTED_CODE notes.
	* toplev.c (user_defined_section_attribute): New global variable.
	* toplev.h (user_defined_section_attribute): Extern declaration for
	new global variable.
	* varasm.c (unlikely_section_label): New global variable.
	(unlikely_text_section_name): New global variable.
	(unlikely_text_section): Add code to initialize
	unlikely_text_section_name if necessary; modify to use
	unlikely_text_section_name and unlikely_section_label; also to use
	named_section properly.
	(in_unlikely_text_section): Modify to work correctly with
	named_section and to use unlikely_text_section_name.
	(named_section): Add code to work properly with cold section.
	(function_section): Clean up if-statement.
	* config/darwin.c (darwin_asm_named_section): Return to original
	code, removing use of SECTION_FORMAT_STRING.
	* config/arm/pe.h (switch_to_section): Add case for
	in_unlikely_executed_text to switch statement.
	* config/i386/cygming.h (switch_to_section): Likewise.
	* config/i386/darwin.h (NORMAL_TEXT_SECTION_NAME): Remove.
	(SECTION_FORMAT_STRING): Likewise.
	* config/mcore/mcore.h (switch_to_section): Add case for
	in_unlikely_executed_text to switch statement.
	* config/rs6000/darwin.h (NORMAL_TEXT_SECTION_NAME): Remove.
	(SECTION_FORMAT_STRING): Likewise.

From-SVN: r86189 --- gcc/Makefile.in | 2 +- gcc/bb-reorder.c | 151 +++++++++++++++++++++++-------------- gcc/c-common.c | 2 + gcc/cfgcleanup.c | 9 +++ gcc/cfglayout.c | 4 +- gcc/cfgrtl.c | 29 ++++--- gcc/config/arm/pe.h | 1 + gcc/config/darwin.c | 5 +- gcc/config/i386/cygming.h | 1 + gcc/config/i386/darwin.h | 2 - gcc/config/mcore/mcore.h | 1 + gcc/config/rs6000/darwin.h | 2 - gcc/defaults.h | 4 - gcc/final.c | 13 ++-- gcc/ifcvt.c | 4 +- gcc/opts.c | 15 ++++ gcc/output.h | 10 +++ gcc/passes.c | 6 +- gcc/predict.c | 7 ++ gcc/reg-stack.c | 2 + gcc/toplev.c | 5 ++ gcc/toplev.h | 5 ++ gcc/varasm.c | 75 +++++++++++++----- 23 files changed, 246 insertions(+), 109 deletions(-) diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 57939960d0d..5bafc596ff4 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -3593,7 +3593,7 @@ STAGEPROFILE_FLAGS_TO_PASS = \ # Files never linked into the final executable produces warnings about missing # profile. STAGEFEEDBACK_FLAGS_TO_PASS = \ - CFLAGS="$(BOOT_CFLAGS) -fprofile-use" + CFLAGS="$(BOOT_CFLAGS) -fprofile-use -freorder-blocks-and-partition" # Only build the C compiler for stage1, because that is the only one that # we can guarantee will build with the native compiler, and also it is the diff --git a/gcc/bb-reorder.c b/gcc/bb-reorder.c index 90c14547aa6..32234b103f5 100644 --- a/gcc/bb-reorder.c +++ b/gcc/bb-reorder.c @@ -197,8 +197,10 @@ push_to_next_round_p (basic_block bb, int round, int number_of_rounds, bool there_exists_another_round; bool cold_block; bool block_not_hot_enough; + bool next_round_is_last; there_exists_another_round = round < number_of_rounds - 1; + next_round_is_last = round + 1 == number_of_rounds - 1; cold_block = (flag_reorder_blocks_and_partition && bb->partition == COLD_PARTITION); @@ -207,7 +209,11 @@ push_to_next_round_p (basic_block bb, int round, int number_of_rounds, || bb->count < count_th || probably_never_executed_bb_p (bb)); - if (there_exists_another_round + if 
(flag_reorder_blocks_and_partition + && next_round_is_last + && bb->partition != COLD_PARTITION) + return false; + else if (there_exists_another_round && (cold_block || block_not_hot_enough)) return true; else @@ -383,7 +389,9 @@ rotate_loop (edge back_edge, struct trace *trace, int trace_n) /* Duplicate HEADER if it is a small block containing cond jump in the end. */ - if (any_condjump_p (BB_END (header)) && copy_bb_p (header, 0)) + if (any_condjump_p (BB_END (header)) && copy_bb_p (header, 0) + && !find_reg_note (BB_END (header), REG_CROSSING_JUMP, + NULL_RTX)) { copy_bb (header, prev_bb->succ, prev_bb, trace_n); } @@ -750,6 +758,8 @@ copy_bb (basic_block old_bb, edge e, basic_block bb, int trace) basic_block new_bb; new_bb = duplicate_block (old_bb, e); + new_bb->partition = old_bb->partition; + if (e->dest != new_bb) abort (); if (e->dest->rbi->visited) @@ -1236,6 +1246,8 @@ add_unlikely_executed_notes (void) { basic_block bb; + /* Add the UNLIKELY_EXECUTED_NOTES to each cold basic block. */ + FOR_EACH_BB (bb) if (bb->partition == COLD_PARTITION) mark_bb_for_unlikely_executed_section (bb); @@ -1251,6 +1263,7 @@ find_rarely_executed_basic_blocks_and_crossing_edges (edge *crossing_edges, int *max_idx) { basic_block bb; + bool has_hot_blocks = false; edge e; int i; @@ -1261,32 +1274,49 @@ find_rarely_executed_basic_blocks_and_crossing_edges (edge *crossing_edges, if (probably_never_executed_bb_p (bb)) bb->partition = COLD_PARTITION; else - bb->partition = HOT_PARTITION; + { + bb->partition = HOT_PARTITION; + has_hot_blocks = true; + } } + /* Since all "hot" basic blocks will eventually be scheduled before all + cold basic blocks, make *sure* the real function entry block is in + the hot partition (if there is one). */ + + if (has_hot_blocks) + for (e = ENTRY_BLOCK_PTR->succ; e; e = e->succ_next) + if (e->dest->index >= 0) + { + e->dest->partition = HOT_PARTITION; + break; + } + /* Mark every edge that crosses between sections. 
*/ i = 0; - FOR_EACH_BB (bb) - for (e = bb->succ; e; e = e->succ_next) - { - if (e->src != ENTRY_BLOCK_PTR - && e->dest != EXIT_BLOCK_PTR - && e->src->partition != e->dest->partition) + if (targetm.have_named_sections) + { + FOR_EACH_BB (bb) + for (e = bb->succ; e; e = e->succ_next) { - e->crossing_edge = true; - if (i == *max_idx) + if (e->src != ENTRY_BLOCK_PTR + && e->dest != EXIT_BLOCK_PTR + && e->src->partition != e->dest->partition) { - *max_idx *= 2; - crossing_edges = xrealloc (crossing_edges, - (*max_idx) * sizeof (edge)); + e->crossing_edge = true; + if (i == *max_idx) + { + *max_idx *= 2; + crossing_edges = xrealloc (crossing_edges, + (*max_idx) * sizeof (edge)); + } + crossing_edges[i++] = e; } - crossing_edges[i++] = e; + else + e->crossing_edge = false; } - else - e->crossing_edge = false; - } - + } *n_crossing_edges = i; } @@ -1301,32 +1331,28 @@ mark_bb_for_unlikely_executed_section (basic_block bb) rtx insert_insn = NULL; rtx new_note; - /* Find first non-note instruction and insert new NOTE before it (as - long as new NOTE is not first instruction in basic block). */ - - for (cur_insn = BB_HEAD (bb); cur_insn != NEXT_INSN (BB_END (bb)); + /* Insert new NOTE immediately after BASIC_BLOCK note. */ + + for (cur_insn = BB_HEAD (bb); cur_insn != NEXT_INSN (BB_END (bb)); cur_insn = NEXT_INSN (cur_insn)) - if (!NOTE_P (cur_insn) - && !LABEL_P (cur_insn)) + if (GET_CODE (cur_insn) == NOTE + && NOTE_LINE_NUMBER (cur_insn) == NOTE_INSN_BASIC_BLOCK) { insert_insn = cur_insn; break; } - + + /* If basic block does not contain a NOTE_INSN_BASIC_BLOCK, there is + a major problem. */ + + if (!insert_insn) + abort (); + /* Insert note and assign basic block number to it. 
*/ - if (insert_insn) - { - new_note = emit_note_before (NOTE_INSN_UNLIKELY_EXECUTED_CODE, - insert_insn); - NOTE_BASIC_BLOCK (new_note) = bb; - } - else - { - new_note = emit_note_after (NOTE_INSN_UNLIKELY_EXECUTED_CODE, - BB_END (bb)); - NOTE_BASIC_BLOCK (new_note) = bb; - } + new_note = emit_note_after (NOTE_INSN_UNLIKELY_EXECUTED_CODE, + insert_insn); + NOTE_BASIC_BLOCK (new_note) = bb; } /* If any destination of a crossing edge does not have a label, add label; @@ -1754,7 +1780,7 @@ fix_crossing_unconditional_branches (void) rtx new_reg; rtx cur_insn; edge succ; - + FOR_EACH_BB (cur_bb) { last_insn = BB_END (cur_bb); @@ -1886,26 +1912,36 @@ fix_edges_for_rarely_executed_code (edge *crossing_edges, fix_up_fall_thru_edges (); - /* If the architecture does not have conditional branches that can - span all of memory, convert crossing conditional branches into - crossing unconditional branches. */ - - if (!HAS_LONG_COND_BRANCH) - fix_crossing_conditional_branches (); + /* Only do the parts necessary for writing separate sections if + the target architecture has the ability to write separate sections + (i.e. it has named sections). Otherwise, the hot/cold partitioning + information will be used when reordering blocks to try to put all + the hot blocks together, then all the cold blocks, but no actual + section partitioning will be done. */ + + if (targetm.have_named_sections) + { + /* If the architecture does not have conditional branches that can + span all of memory, convert crossing conditional branches into + crossing unconditional branches. */ - /* If the architecture does not have unconditional branches that - can span all of memory, convert crossing unconditional branches - into indirect jumps. Since adding an indirect jump also adds - a new register usage, update the register usage information as - well. 
*/ + if (!HAS_LONG_COND_BRANCH) + fix_crossing_conditional_branches (); - if (!HAS_LONG_UNCOND_BRANCH) - { - fix_crossing_unconditional_branches (); - reg_scan (get_insns(), max_reg_num (), 1); - } + /* If the architecture does not have unconditional branches that + can span all of memory, convert crossing unconditional branches + into indirect jumps. Since adding an indirect jump also adds + a new register usage, update the register usage information as + well. */ + + if (!HAS_LONG_UNCOND_BRANCH) + { + fix_crossing_unconditional_branches (); + reg_scan (get_insns(), max_reg_num (), 1); + } - add_reg_crossing_jump_notes (); + add_reg_crossing_jump_notes (); + } } /* Reorder basic blocks. The main entry point to this file. FLAGS is @@ -1957,7 +1993,8 @@ reorder_basic_blocks (unsigned int flags) if (dump_file) dump_flow_info (dump_file); - if (flag_reorder_blocks_and_partition) + if (flag_reorder_blocks_and_partition + && targetm.have_named_sections) add_unlikely_executed_notes (); cfg_layout_finalize (); diff --git a/gcc/c-common.c b/gcc/c-common.c index 4c7eb21e049..6e07cfcbbdb 100644 --- a/gcc/c-common.c +++ b/gcc/c-common.c @@ -4357,6 +4357,8 @@ handle_section_attribute (tree *node, tree ARG_UNUSED (name), tree args, if (targetm.have_named_sections) { + user_defined_section_attribute = true; + if ((TREE_CODE (decl) == FUNCTION_DECL || TREE_CODE (decl) == VAR_DECL) && TREE_CODE (TREE_VALUE (args)) == STRING_CST) diff --git a/gcc/cfgcleanup.c b/gcc/cfgcleanup.c index 648c4a4f7c3..d13e6be9bab 100644 --- a/gcc/cfgcleanup.c +++ b/gcc/cfgcleanup.c @@ -446,6 +446,14 @@ try_forward_edges (int mode, basic_block b) target = first = e->dest; counter = 0; + /* If we are partitioning hot/cold basic_blocks, we don't want to mess + up jumps that cross between hot/cold sections. 
*/ + + if (flag_reorder_blocks_and_partition + && first != EXIT_BLOCK_PTR + && find_reg_note (BB_END (first), REG_CROSSING_JUMP, NULL_RTX)) + return false; + while (counter < n_basic_blocks) { basic_block new_target = NULL; @@ -453,6 +461,7 @@ try_forward_edges (int mode, basic_block b) may_thread |= target->flags & BB_DIRTY; if (FORWARDER_BLOCK_P (target) + && !target->succ->crossing_edge && target->succ->dest != EXIT_BLOCK_PTR) { /* Bypass trivial infinite loops. */ diff --git a/gcc/cfglayout.c b/gcc/cfglayout.c index b805ae5bda9..0cf7d8e5131 100644 --- a/gcc/cfglayout.c +++ b/gcc/cfglayout.c @@ -795,7 +795,8 @@ fixup_reorder_chain (void) /* Make sure new bb is tagged for correct section (same as fall-thru source). */ e_fall->src->partition = bb->pred->src->partition; - if (flag_reorder_blocks_and_partition) + if (flag_reorder_blocks_and_partition + && targetm.have_named_sections) { if (bb->pred->src->partition == COLD_PARTITION) { @@ -1107,6 +1108,7 @@ cfg_layout_duplicate_bb (basic_block bb) insn ? get_last_insn () : NULL, EXIT_BLOCK_PTR->prev_bb); + new_bb->partition = bb->partition; if (bb->rbi->header) { insn = bb->rbi->header; diff --git a/gcc/cfgrtl.c b/gcc/cfgrtl.c index a4ac8233c0f..f3618f07c97 100644 --- a/gcc/cfgrtl.c +++ b/gcc/cfgrtl.c @@ -56,6 +56,7 @@ Software Foundation, 59 Temple Place - Suite 330, Boston, MA #include "insn-config.h" #include "cfglayout.h" #include "expr.h" +#include "target.h" /* The labels mentioned in non-jump rtl. Valid during find_basic_blocks. */ @@ -488,6 +489,7 @@ rtl_split_block (basic_block bb, void *insnp) /* Create the new basic block. */ new_bb = create_basic_block (NEXT_INSN (insn), BB_END (bb), bb); + new_bb->partition = bb->partition; BB_END (bb) = insn; /* Redirect the outgoing edges. */ @@ -681,7 +683,8 @@ try_redirect_by_replacing_jump (edge e, basic_block target, bool in_cfglayout) and cold sections. 
*/ if (flag_reorder_blocks_and_partition - && find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX)) + && (find_reg_note (insn, REG_CROSSING_JUMP, NULL_RTX) + || (src->partition != target->partition))) return NULL; /* Verify that all targets will be TARGET. */ @@ -1092,7 +1095,8 @@ force_nonfallthru_and_redirect (edge e, basic_block target) /* Make sure new block ends up in correct hot/cold section. */ jump_block->partition = e->src->partition; - if (flag_reorder_blocks_and_partition) + if (flag_reorder_blocks_and_partition + && targetm.have_named_sections) { if (e->src->partition == COLD_PARTITION) { @@ -1350,9 +1354,13 @@ rtl_split_edge (edge edge_in) && NOTE_LINE_NUMBER (before) == NOTE_INSN_LOOP_END) before = NEXT_INSN (before); bb = create_basic_block (before, NULL, edge_in->src); + bb->partition = edge_in->src->partition; } else - bb = create_basic_block (before, NULL, edge_in->dest->prev_bb); + { + bb = create_basic_block (before, NULL, edge_in->dest->prev_bb); + bb->partition = edge_in->dest->partition; + } /* ??? This info is likely going to be out of date very soon. */ if (edge_in->dest->global_live_at_start) @@ -1590,13 +1598,11 @@ commit_one_edge_insertion (edge e, int watch_calls) bb = split_edge (e); after = BB_END (bb); - /* If we are partitioning hot/cold basic blocks, we must make sure - that the new basic block ends up in the correct section. 
*/ - - bb->partition = e->src->partition; if (flag_reorder_blocks_and_partition + && targetm.have_named_sections && e->src != ENTRY_BLOCK_PTR - && e->src->partition == COLD_PARTITION) + && e->src->partition == COLD_PARTITION + && !e->crossing_edge) { rtx bb_note, new_note, cur_insn; @@ -1980,8 +1986,11 @@ rtl_verify_flow_info_1 (void) if (e->flags & EDGE_FALLTHRU) { n_fallthru++, fallthru = e; - if (e->crossing_edge) - { + if (e->crossing_edge + || (e->src->partition != e->dest->partition + && e->src != ENTRY_BLOCK_PTR + && e->dest != EXIT_BLOCK_PTR)) + { error ("Fallthru edge crosses section boundary (bb %i)", e->src->index); err = 1; diff --git a/gcc/config/arm/pe.h b/gcc/config/arm/pe.h index aa78ff176c7..63c127c1a61 100644 --- a/gcc/config/arm/pe.h +++ b/gcc/config/arm/pe.h @@ -198,6 +198,7 @@ switch_to_section (enum in_section section, tree decl) \ switch (section) \ { \ case in_text: text_section (); break; \ + case in_unlikely_executed_text: unlikely_text_section (); break; \ case in_data: data_section (); break; \ case in_named: named_section (decl, NULL, 0); break; \ case in_readonly_data: readonly_data_section (); break; \ diff --git a/gcc/config/darwin.c b/gcc/config/darwin.c index e3a4cfd82e5..1c308bb5eac 100644 --- a/gcc/config/darwin.c +++ b/gcc/config/darwin.c @@ -1077,10 +1077,7 @@ darwin_globalize_label (FILE *stream, const char *name) void darwin_asm_named_section (const char *name, unsigned int flags ATTRIBUTE_UNUSED) { - if (flag_reorder_blocks_and_partition) - fprintf (asm_out_file, SECTION_FORMAT_STRING, name); - else - fprintf (asm_out_file, ".section %s\n", name); + fprintf (asm_out_file, ".section %s\n", name); } unsigned int diff --git a/gcc/config/i386/cygming.h b/gcc/config/i386/cygming.h index 6b9c722f9e8..cb0682010b8 100644 --- a/gcc/config/i386/cygming.h +++ b/gcc/config/i386/cygming.h @@ -162,6 +162,7 @@ switch_to_section (enum in_section section, tree decl) \ switch (section) \ { \ case in_text: text_section (); break; \ + case 
in_unlikely_text_section: unlikely_text_section (); break; \ case in_data: data_section (); break; \ case in_readonly_data: readonly_data_section (); break; \ case in_named: named_section (decl, NULL, 0); break; \ diff --git a/gcc/config/i386/darwin.h b/gcc/config/i386/darwin.h index 711722ab5ae..0b3db813e6c 100644 --- a/gcc/config/i386/darwin.h +++ b/gcc/config/i386/darwin.h @@ -98,10 +98,8 @@ Boston, MA 02111-1307, USA. */ /* These are used by -fbranch-probabilities */ #define HOT_TEXT_SECTION_NAME "__TEXT,__text,regular,pure_instructions" -#define NORMAL_TEXT_SECTION_NAME "__TEXT,__text,regular,pure_instructions" #define UNLIKELY_EXECUTED_TEXT_SECTION_NAME \ "__TEXT,__unlikely,regular,pure_instructions" -#define SECTION_FORMAT_STRING ".section %s\n\t.align 2\n" /* Assembler pseudos to introduce constants of various size. */ diff --git a/gcc/config/mcore/mcore.h b/gcc/config/mcore/mcore.h index 4e299e66c21..3ec2dd7c350 100644 --- a/gcc/config/mcore/mcore.h +++ b/gcc/config/mcore/mcore.h @@ -968,6 +968,7 @@ switch_to_section (enum in_section section, tree decl) \ switch (section) \ { \ case in_text: text_section (); break; \ + case in_unlikely_executed_text: unlikely_text_section (); break; \ case in_data: data_section (); break; \ case in_named: named_section (decl, NULL, 0); break; \ SUBTARGET_SWITCH_SECTIONS \ diff --git a/gcc/config/rs6000/darwin.h b/gcc/config/rs6000/darwin.h index 1f1924fe5b4..96e878bbcde 100644 --- a/gcc/config/rs6000/darwin.h +++ b/gcc/config/rs6000/darwin.h @@ -168,10 +168,8 @@ do { \ /* These are used by -fbranch-probabilities */ #define HOT_TEXT_SECTION_NAME "__TEXT,__text,regular,pure_instructions" -#define NORMAL_TEXT_SECTION_NAME "__TEXT,__text,regular,pure_instructions" #define UNLIKELY_EXECUTED_TEXT_SECTION_NAME \ "__TEXT,__unlikely,regular,pure_instructions" -#define SECTION_FORMAT_STRING ".section %s\n\t.align 2\n" /* Define cutoff for using external functions to save floating point. Currently on Darwin, always use inline stores. 
*/ diff --git a/gcc/defaults.h b/gcc/defaults.h index c9a705dab77..82fce205724 100644 --- a/gcc/defaults.h +++ b/gcc/defaults.h @@ -654,10 +654,6 @@ You Lose! You must define PREFERRED_DEBUGGING_TYPE! #define HOT_TEXT_SECTION_NAME ".text.hot" #endif -#ifndef NORMAL_TEXT_SECTION_NAME -#define NORMAL_TEXT_SECTION_NAME ".text" -#endif - #ifndef UNLIKELY_EXECUTED_TEXT_SECTION_NAME #define UNLIKELY_EXECUTED_TEXT_SECTION_NAME ".text.unlikely" #endif diff --git a/gcc/final.c b/gcc/final.c index d9c4f383e67..4f5d824fed1 100644 --- a/gcc/final.c +++ b/gcc/final.c @@ -1728,9 +1728,8 @@ final_scan_insn (rtx insn, FILE *file, int optimize ATTRIBUTE_UNUSED, are writing to appropriately. */ if (flag_reorder_blocks_and_partition - && in_unlikely_text_section() && !scan_ahead_for_unlikely_executed_note (insn)) - text_section (); + function_section (current_function_decl); #ifdef TARGET_UNWIND_INFO targetm.asm_out.unwind_emit (asm_out_file, insn); @@ -1923,7 +1922,8 @@ final_scan_insn (rtx insn, FILE *file, int optimize ATTRIBUTE_UNUSED, basic blocks into separate sections of the .o file, we need to ensure the jump table ends up in the correct section... 
*/ - if (flag_reorder_blocks_and_partition) + if (flag_reorder_blocks_and_partition + && targetm.have_named_sections) { rtx tmp_table, tmp_label; if (LABEL_P (insn) @@ -1933,11 +1933,8 @@ final_scan_insn (rtx insn, FILE *file, int optimize ATTRIBUTE_UNUSED, } else if (scan_ahead_for_unlikely_executed_note (insn)) unlikely_text_section (); - else - { - if (in_unlikely_text_section ()) - text_section (); - } + else if (in_unlikely_text_section ()) + function_section (current_function_decl); } if (app_on) diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c index 96833a54dda..4b150d0e661 100644 --- a/gcc/ifcvt.c +++ b/gcc/ifcvt.c @@ -2909,6 +2909,7 @@ find_if_case_1 (basic_block test_bb, edge then_edge, edge else_edge) { new_bb->index = then_bb_index; BASIC_BLOCK (then_bb_index) = new_bb; + new_bb->partition = test_bb->partition; } /* We've possibly created jump to next insn, cleanup_cfg will solve that later. */ @@ -3288,7 +3289,8 @@ if_convert (int x_life_data_ok) life_data_ok = (x_life_data_ok != 0); if ((! targetm.cannot_modify_jumps_p ()) - && (!flag_reorder_blocks_and_partition || !no_new_pseudos)) + && (!flag_reorder_blocks_and_partition || !no_new_pseudos + || !targetm.have_named_sections)) mark_loop_exit_edges (); /* Compute postdominators if we think we'll use them. */ diff --git a/gcc/opts.c b/gcc/opts.c index 01297b0ee53..a6a7c337b21 100644 --- a/gcc/opts.c +++ b/gcc/opts.c @@ -641,6 +641,21 @@ decode_options (unsigned int argc, const char **argv) flag_reorder_blocks_and_partition = 0; flag_reorder_blocks = 1; } + + /* The optimization to partition hot and cold basic blocks into + separate sections of the .o and executable files does not currently + work correctly with DWARF debugging turned on. Until this is fixed + we will disable the optimization when DWARF debugging is set. 
*/ + + if (flag_reorder_blocks_and_partition + && (write_symbols == DWARF_DEBUG + || write_symbols == DWARF2_DEBUG)) + { + warning + ("-freorder-blocks-and-partition does not work with -g (currently)"); + flag_reorder_blocks_and_partition = 0; + flag_reorder_blocks = 1; + } } /* Handle target- and language-independent options. Return zero to diff --git a/gcc/output.h b/gcc/output.h index 1e1d1defa31..e832eb3d706 100644 --- a/gcc/output.h +++ b/gcc/output.h @@ -392,6 +392,10 @@ extern const char *first_global_object_name; /* The first weak object in the file. */ extern const char *weak_global_object_name; +/* Label at start of unlikely section, when partitioning hot/cold basic + blocks. */ +extern char *unlikely_section_label; + /* Nonzero if function being compiled doesn't contain any calls (ignoring the prologue and epilogue). This is set prior to local register allocation and is valid for the remaining @@ -438,6 +442,12 @@ extern tree last_assemble_variable_decl; extern bool decl_readonly_section (tree, int); extern bool decl_readonly_section_1 (tree, int, int); +/* The following global variable indicates the section name to be used + for the current cold section, when partitioning hot and cold basic + blocks into separate sections. */ + +extern char *unlikely_text_section_name; + /* This can be used to compute RELOC for the function above, when given a constant expression. */ extern int compute_reloc_for_constant (tree); diff --git a/gcc/passes.c b/gcc/passes.c index 2f625c57a6d..e8a93222c32 100644 --- a/gcc/passes.c +++ b/gcc/passes.c @@ -454,6 +454,8 @@ rest_of_handle_final (void) output_function_exception_table (); #endif + user_defined_section_attribute = false; + if (! quiet_flag) fflush (asm_out_file); @@ -1857,7 +1859,9 @@ rest_of_compilation (void) sections of the .o file does not work well with exception handling. Don't call it if there are exceptions. 
*/ - if (optimize > 0 && flag_reorder_blocks_and_partition && !flag_exceptions) + if (flag_reorder_blocks_and_partition + && !DECL_ONE_ONLY (current_function_decl) + && !user_defined_section_attribute) rest_of_handle_partition_blocks (); if (optimize > 0 && (flag_regmove || flag_expensive_optimizations)) diff --git a/gcc/predict.c b/gcc/predict.c index d89282daa89..d0ab77015ac 100644 --- a/gcc/predict.c +++ b/gcc/predict.c @@ -1441,6 +1441,13 @@ choose_function_section (void) of all instances. For now just never set frequency for these. */ || DECL_ONE_ONLY (current_function_decl)) return; + + /* If we are doing the partitioning optimization, let the optimization + choose the correct section into which to put things. */ + + if (flag_reorder_blocks_and_partition) + return; + if (cfun->function_frequency == FUNCTION_FREQUENCY_HOT) DECL_SECTION_NAME (current_function_decl) = build_string (strlen (HOT_TEXT_SECTION_NAME), HOT_TEXT_SECTION_NAME); diff --git a/gcc/reg-stack.c b/gcc/reg-stack.c index e21f11bbcc4..04220bf0b2d 100644 --- a/gcc/reg-stack.c +++ b/gcc/reg-stack.c @@ -989,6 +989,8 @@ emit_swap_insn (rtx insn, stack regstack, rtx reg) if (LABEL_P (tmp) || CALL_P (tmp) || NOTE_INSN_BASIC_BLOCK_P (tmp) + || (NOTE_P (tmp) + && NOTE_LINE_NUMBER (tmp) == NOTE_INSN_UNLIKELY_EXECUTED_CODE) || (NONJUMP_INSN_P (tmp) && stack_regs_mentioned (tmp))) { diff --git a/gcc/toplev.c b/gcc/toplev.c index 80608400ccb..53f44185584 100644 --- a/gcc/toplev.c +++ b/gcc/toplev.c @@ -340,6 +340,11 @@ enum pta_type flag_tree_points_to = PTA_NONE; to optimize, debug_info_level and debug_hooks in process_options (). */ int flag_var_tracking = AUTODETECT_FLAG_VAR_TRACKING; +/* True if the user has tagged the function with the 'section' + attribute. */ + +bool user_defined_section_attribute = false; + /* Values of the -falign-* flags: how much to align labels in code. 0 means `use default', 1 means `don't align'. 
For each variable, there is an _log variant which is the power diff --git a/gcc/toplev.h b/gcc/toplev.h index 64983caaeb3..fe588762a7d 100644 --- a/gcc/toplev.h +++ b/gcc/toplev.h @@ -116,6 +116,11 @@ extern bool exit_after_options; extern int target_flags_explicit; +/* True if the user has tagged the function with the 'section' + attribute. */ + +extern bool user_defined_section_attribute; + /* See toplev.c. */ extern int flag_loop_optimize; extern int flag_crossjumping; diff --git a/gcc/varasm.c b/gcc/varasm.c index a8d23f20207..5d02c570ae3 100644 --- a/gcc/varasm.c +++ b/gcc/varasm.c @@ -222,18 +222,42 @@ text_section (void) void unlikely_text_section (void) { - if ((in_section != in_unlikely_executed_text) - && (in_section != in_named - || strcmp (in_named_name, UNLIKELY_EXECUTED_TEXT_SECTION_NAME) != 0)) + const char *name; + int len; + + if (! unlikely_text_section_name) { - if (targetm.have_named_sections) - named_section (NULL_TREE, UNLIKELY_EXECUTED_TEXT_SECTION_NAME, 0); + if (DECL_SECTION_NAME (current_function_decl) + && (strcmp (TREE_STRING_POINTER (DECL_SECTION_NAME + (current_function_decl)), + HOT_TEXT_SECTION_NAME) != 0) + && (strcmp (TREE_STRING_POINTER (DECL_SECTION_NAME + (current_function_decl)), + UNLIKELY_EXECUTED_TEXT_SECTION_NAME) != 0)) + { + name = TREE_STRING_POINTER (DECL_SECTION_NAME + (current_function_decl)); + len = strlen (name); + unlikely_text_section_name = xmalloc ((len + 10) * sizeof (char)); + strcpy (unlikely_text_section_name, name); + strcat (unlikely_text_section_name, "_unlikely"); + } else { - in_section = in_unlikely_executed_text; - fprintf (asm_out_file, "%s\n", TEXT_SECTION_ASM_OP); + len = strlen (UNLIKELY_EXECUTED_TEXT_SECTION_NAME); + unlikely_text_section_name = xmalloc (len+1 * sizeof (char)); + strcpy (unlikely_text_section_name, + UNLIKELY_EXECUTED_TEXT_SECTION_NAME); } - + } + + if ((in_section != in_unlikely_executed_text) + && (in_section != in_named + || strcmp (in_named_name, unlikely_text_section_name) 
!= 0)) + { + named_section (NULL_TREE, unlikely_text_section_name, 0); + in_section = in_unlikely_executed_text; + if (!unlikely_section_label_printed) { ASM_OUTPUT_LABEL (asm_out_file, unlikely_section_label); @@ -289,7 +313,14 @@ in_text_section (void) int in_unlikely_text_section (void) { - return in_section == in_unlikely_executed_text; + bool ret_val; + + ret_val = ((in_section == in_unlikely_executed_text) + || (in_section == in_named + && unlikely_text_section_name + && strcmp (in_named_name, unlikely_text_section_name) == 0)); + + return ret_val; } /* Determine if we're in the data section. */ @@ -423,6 +454,16 @@ named_section (tree decl, const char *name, int reloc) if (name == NULL) name = TREE_STRING_POINTER (DECL_SECTION_NAME (decl)); + if (strcmp (name, UNLIKELY_EXECUTED_TEXT_SECTION_NAME) == 0 + && !unlikely_text_section_name) + { + unlikely_text_section_name = xmalloc + (strlen (UNLIKELY_EXECUTED_TEXT_SECTION_NAME) + 1 + * sizeof (char)); + strcpy (unlikely_text_section_name, + UNLIKELY_EXECUTED_TEXT_SECTION_NAME); + } + flags = targetm.section_type_flags (decl, name, reloc); /* Sanity check user variables for flag changes. Non-user @@ -533,14 +574,11 @@ function_section (tree decl) { if (scan_ahead_for_unlikely_executed_note (get_insns())) unlikely_text_section (); + else if (decl != NULL_TREE + && DECL_SECTION_NAME (decl) != NULL_TREE) + named_section (decl, (char *) 0, 0); else - { - if (decl != NULL_TREE - && DECL_SECTION_NAME (decl) != NULL_TREE) - named_section (decl, (char *) 0, 0); - else - text_section (); - } + text_section (); } /* Switch to read-only data section associated with function DECL. */ @@ -1153,7 +1191,7 @@ assemble_start_function (tree decl, const char *fnname) free (unlikely_section_label); unlikely_section_label = xmalloc ((strlen (fnname) + 18) * sizeof (char)); sprintf (unlikely_section_label, "%s_unlikely_section", fnname); - + /* The following code does not need preprocessing in the assembler. 
*/ app_disable (); @@ -4481,7 +4519,8 @@ default_section_type_flags_1 (tree decl, const char *name, int reloc, flags = SECTION_CODE; else if (decl && decl_readonly_section_1 (decl, reloc, shlib)) flags = 0; - else if (strcmp (name, UNLIKELY_EXECUTED_TEXT_SECTION_NAME) == 0) + else if (unlikely_text_section_name + && strcmp (name, unlikely_text_section_name) == 0) flags = SECTION_CODE; else flags = SECTION_WRITE; -- 2.30.2
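
Note on cold section naming: the varasm.c change in unlikely_text_section replaces the fixed UNLIKELY_EXECUTED_TEXT_SECTION_NAME with a per-function name, which is what lets partitioning coexist with -ffunction-sections. The following is only a minimal standalone sketch of that naming rule, not the code from the patch. The helper name cold_section_name is hypothetical, plain malloc stands in for GCC's xmalloc, and the two macros simply repeat the defaults from gcc/defaults.h.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Default section names, repeated here from gcc/defaults.h so that the
   sketch is self-contained.  */
#define HOT_TEXT_SECTION_NAME ".text.hot"
#define UNLIKELY_EXECUTED_TEXT_SECTION_NAME ".text.unlikely"

/* Hypothetical helper (not part of GCC): return a freshly allocated cold
   section name for a function whose own section name is DECL_SECTION,
   or NULL if allocation fails.  DECL_SECTION may be NULL when the
   function has no explicit section.  The caller frees the result.  */
char *
cold_section_name (const char *decl_section)
{
  static const char suffix[] = "_unlikely";
  char *result;

  if (decl_section != NULL
      && strcmp (decl_section, HOT_TEXT_SECTION_NAME) != 0
      && strcmp (decl_section, UNLIKELY_EXECUTED_TEXT_SECTION_NAME) != 0)
    {
      /* The function lives in its own section (for example ".text.foo"
         under -ffunction-sections): derive the cold name from it.  */
      size_t size = strlen (decl_section) + sizeof suffix;
      result = malloc (size);
      if (result != NULL)
        snprintf (result, size, "%s%s", decl_section, suffix);
    }
  else
    {
      /* No section name of its own: fall back to the default cold
         section name.  */
      size_t size = strlen (UNLIKELY_EXECUTED_TEXT_SECTION_NAME) + 1;
      result = malloc (size);
      if (result != NULL)
        memcpy (result, UNLIKELY_EXECUTED_TEXT_SECTION_NAME, size);
    }

  return result;
}
```

Under these assumptions, a function placed in ".text.foo" would get the cold section ".text.foo_unlikely", while a function with no section name, or one already in ".text.hot", would fall back to ".text.unlikely". Sizing the buffer from both string lengths avoids depending on a hand-counted constant for the suffix.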