Optimize ODR enum streaming
It turns out that half of the global decl stream of a cc1 LTO build consists of
TREE_LISTs, identifiers and integer constants representing TYPE_VALUES of enums.
Those are streamed only to produce ODR warnings and are unused otherwise, so this
patch moves the info to a separate section that is represented and streamed
more efficiently.
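To illustrate why a flat record is cheaper than a chain of TREE_LIST nodes, here is a minimal sketch of a compact enum encoding. It is not the format used by the patch: EnumVal, EnumRecord, encode_enum and decode_enum are made-up names, values are assumed to fit into an unsigned 64-bit integer, and ULEB128 is just one possible choice of variable-length encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical flat records; these are not GCC's odr_enum_val/odr_enum.
struct EnumVal { std::string name; std::uint64_t value; };
struct EnumRecord { std::string mangled_name; std::vector<EnumVal> vals; };

using Buffer = std::vector<std::uint8_t>;

// Append an unsigned LEB128 integer: 7 payload bits per byte, high bit = "more".
static void write_uleb(Buffer &out, std::uint64_t v) {
  do {
    std::uint8_t byte = v & 0x7f;
    v >>= 7;
    if (v != 0) byte |= 0x80;
    out.push_back(byte);
  } while (v != 0);
}

static std::uint64_t read_uleb(const Buffer &in, std::size_t &pos) {
  std::uint64_t v = 0;
  unsigned shift = 0;
  for (;;) {
    assert(shift < 64);  // reject over-long encodings
    std::uint8_t byte = in.at(pos++);
    v |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
    if ((byte & 0x80) == 0) return v;
    shift += 7;
  }
}

// Strings are stored as a length followed by the raw bytes.
static void write_string(Buffer &out, const std::string &s) {
  write_uleb(out, s.size());
  out.insert(out.end(), s.begin(), s.end());
}

static std::string read_string(const Buffer &in, std::size_t &pos) {
  std::size_t len = read_uleb(in, pos);
  std::string s;
  for (std::size_t i = 0; i < len; ++i)
    s.push_back(static_cast<char>(in.at(pos++)));
  return s;
}

// One enum = type name, enumerator count, then (name, value) pairs.
static Buffer encode_enum(const EnumRecord &r) {
  Buffer out;
  write_string(out, r.mangled_name);
  write_uleb(out, r.vals.size());
  for (const EnumVal &v : r.vals) {
    write_string(out, v.name);
    write_uleb(out, v.value);
  }
  return out;
}

static EnumRecord decode_enum(const Buffer &in, std::size_t &pos) {
  EnumRecord r;
  r.mangled_name = read_string(in, pos);
  std::uint64_t n = read_uleb(in, pos);
  for (std::uint64_t i = 0; i < n; ++i) {
    std::string name = read_string(in, pos);
    std::uint64_t value = read_uleb(in, pos);
    r.vals.push_back({name, value});
  }
  return r;
}
```

Each enumerator costs only its name plus a few bytes here, and nothing in the record has to take part in SCC merging.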
This also adds a place for more info that may be used for ODR diagnostics
(e.g. at the moment we do not warn when the declarations differ in the
associated member functions and their types) and for the type inheritance graph,
rather than polluting the global stream.
I was a bit unsure which enums we want to store in the section. Storing all parsed
enums is probably too expensive, and only those enums streamed to represent the IL
are a bit hard to get, so I went for those seen by free lang data.
As a plus we now get a bit more precise warning because the location of the
mismatched enum CONST_DECL is streamed as well.
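As a rough sketch of what a per-enumerator location buys, the following compares two definitions of the same enum and points at the first enumerator that differs. Loc, Enumerator, first_mismatch and describe are hypothetical names for illustration only; GCC itself works with location_t and its own diagnostic machinery, and the message text here is made up.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical source position; GCC itself uses location_t.
struct Loc { std::string file; unsigned line; };

// Hypothetical enumerator record that carries its own location.
struct Enumerator { std::string name; long long value; Loc loc; };

enum class Mismatch { None, Name, Value, Count };

struct Result {
  Mismatch kind;
  std::size_t index;  // first differing position; meaningful unless kind == None
};

// Walk both definitions in declaration order and stop at the first difference.
static Result first_mismatch(const std::vector<Enumerator> &a,
                             const std::vector<Enumerator> &b) {
  std::size_t n = a.size() < b.size() ? a.size() : b.size();
  for (std::size_t i = 0; i < n; ++i) {
    if (a[i].name != b[i].name) return {Mismatch::Name, i};
    if (a[i].value != b[i].value) return {Mismatch::Value, i};
  }
  if (a.size() != b.size()) return {Mismatch::Count, n};
  return {Mismatch::None, 0};
}

// Point the note at the offending enumerator rather than at the enum itself.
static std::string describe(const Result &r,
                            const std::vector<Enumerator> &a,
                            const std::vector<Enumerator> &b) {
  assert(r.kind != Mismatch::None);
  // For a count mismatch only the longer definition has an entry at r.index.
  const Enumerator &e = r.index < a.size() ? a[r.index] : b[r.index];
  return e.loc.file + ":" + std::to_string(e.loc.line) +
         ": mismatching enumerator '" + e.name + "'";
}
```

Without a stored location per enumerator, the note could only point at the enum declaration as a whole.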
It changes:
[WPA] read 4608466 unshared trees
[WPA] read 2942094 mergeable SCCs of average size 1.365328
[WPA] 8625389 tree bodies read in total
[WPA] tree SCC table: size 524287, 247652 elements, collision ratio: 0.383702
[WPA] tree SCC max chain length 2 (size 1)
[WPA] Compared 2694442 SCCs, 228 collisions (0.000085)
[WPA] Merged 2694419 SCCs
[WPA] Merged 3731982 tree bodies
[WPA] Merged 633335 types
[WPA] 122077 types prevailed (155548 associated trees)
...
[WPA] Compression: 110593119 input bytes, 287696614 uncompressed bytes (ratio: 2.601397)
[WPA] Size of mmap'd section decls: 85628556 bytes
[WPA] Size of mmap'd section function_body: 13842928 bytes
to:
[WPA] read 1720989 unshared trees
[WPA] read 1252217 mergeable SCCs of average size 1.858507
[WPA] 4048243 tree bodies read in total
[WPA] tree SCC table: size 524287, 226524 elements, collision ratio: 0.491759
[WPA] tree SCC max chain length 2 (size 1)
[WPA] Compared 1025693 SCCs, 196 collisions (0.000191)
[WPA] Merged 1025670 SCCs
[WPA] Merged 2063373 tree bodies
[WPA] Merged 633497 types
[WPA] 122299 types prevailed (155827 associated trees)
...
[WPA] Compression: 103428770 input bytes, 281151423 uncompressed bytes (ratio: 2.718310)
[WPA] Size of mmap'd section decls: 49390917 bytes
[WPA] Size of mmap'd section function_body: 13858258 bytes
...
[WPA] Size of mmap'd section odr_types: 29054816 bytes
So the number of SCCs streamed drops to 38% and the number of unshared trees
(which are a bit misnamed, since they are mostly integer_cst) to 37%.
Things speed up correspondingly, but I did not save the time report from the
previous build.
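For reference, the percentages come from dividing the "Compared ... SCCs" and "unshared trees" counters of the second dump by those of the first. A small sketch of the arithmetic, with percent_remaining being a made-up helper name:

```cpp
#include <cassert>
#include <cmath>

// Percentage of `before` that remains in `after`, rounded to the nearest integer.
static long percent_remaining(double after, double before) {
  assert(before > 0);
  return std::lround(100.0 * after / before);
}

// Counters copied from the two WPA dumps in this message.
static const double sccs_before = 2694442, sccs_after = 1025693;
static const double trees_before = 4608466, trees_after = 1720989;
static const double decls_before = 85628556, decls_after = 49390917;
```

percent_remaining (sccs_after, sccs_before) gives 38 and percent_remaining (trees_after, trees_before) gives 37; by the same computation the decls section shrinks to about 58% of its previous size.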
The enum values are still quite surprisingly large. I may look into
ways of making them smaller incrementally, but they stream reasonably fast:
Time variable                       usr           sys          wall                GGC
phase opt and generate   : 25.20 ( 68%)  10.88 ( 72%)  36.13 ( 69%)   868060 kB ( 52%)
phase stream in          :  4.46 ( 12%)   0.90 (  6%)   5.38 ( 10%)   790724 kB ( 48%)
phase stream out         :  6.69 ( 18%)   3.32 ( 22%)  10.03 ( 19%)        8 kB (  0%)
ipa lto gimple in        :  0.79 (  2%)   1.86 ( 12%)   2.39 (  5%)   252612 kB ( 15%)
ipa lto gimple out       :  2.48 (  7%)   0.78 (  5%)   3.26 (  6%)        0 kB (  0%)
ipa lto decl in          :  1.71 (  5%)   0.46 (  3%)   2.34 (  4%)   417883 kB ( 25%)
ipa lto decl out         :  3.28 (  9%)   0.07 (  0%)   3.27 (  6%)        0 kB (  0%)
whopr wpa I/O            :  0.40 (  1%)   2.24 ( 15%)   2.77 (  5%)        8 kB (  0%)
lto stream decompression :  1.38 (  4%)   0.31 (  2%)   1.36 (  3%)        0 kB (  0%)
ipa ODR types            :  0.18 (  0%)   0.02 (  0%)   0.25 (  0%)        0 kB (  0%)
ipa inlining heuristics  : 11.64 ( 31%)   1.45 ( 10%)  13.12 ( 25%)   453160 kB ( 27%)
ipa pure const           :  1.74 (  5%)   0.00 (  0%)   1.76 (  3%)        0 kB (  0%)
ipa icf                  :  1.72 (  5%)   5.33 ( 35%)   7.06 ( 13%)    16593 kB (  1%)
whopr partitioning       :  2.22 (  6%)   0.01 (  0%)   2.23 (  4%)     5689 kB (  0%)
TOTAL                    : 37.17         15.20         52.46         1660886 kB
LTO-bootstrapped/regtested x86_64-linux, will commit it shortly.
gcc/ChangeLog:
2020-06-03 Jan Hubicka <hubicka@ucw.cz>
* ipa-devirt.c: Include data-streamer.h, lto-streamer.h and
streamer-hooks.h.
(odr_enums): New static var.
(struct odr_enum_val): New struct.
(class odr_enum): New class.
(odr_enum_map): New hashtable.
(odr_types_equivalent_p): Drop code testing TYPE_VALUES.
(add_type_duplicate): Likewise.
(free_odr_warning_data): Do not free TYPE_VALUES.
(register_odr_enum): New function.
(ipa_odr_summary_write): New function.
(ipa_odr_read_section): New function.
(ipa_odr_summary_read): New function.
(class pass_ipa_odr): New pass.
(make_pass_ipa_odr): New function.
* ipa-utils.h (register_odr_enum): Declare.
* lto-section-in.c (lto_section_name): Add odr_types section.
* lto-streamer.h (enum lto_section_type): Add odr_types section.
* passes.def: Add odr_types pass.
* lto-streamer-out.c (DFS::DFS_write_tree_body): Do not stream
TYPE_VALUES.
(hash_tree): Likewise.
* tree-streamer-in.c (lto_input_ts_type_non_common_tree_pointers):
Likewise.
* tree-streamer-out.c (write_ts_type_non_common_tree_pointers):
Likewise.
* timevar.def (TV_IPA_ODR): New timevar.
* tree-pass.h (make_pass_ipa_odr): Declare.
* tree.c (free_lang_data_in_type): Register ODR types.
gcc/lto/ChangeLog:
2020-06-03 Jan Hubicka <hubicka@ucw.cz>
* lto-common.c (compare_tree_sccs_1): Do not compare TYPE_VALUES.
gcc/testsuite/ChangeLog:
2020-06-03 Jan Hubicka <hubicka@ucw.cz>
* g++.dg/lto/pr84805_0.C: Update.