From: Wilco Dijkstra
Date: Thu, 26 May 2016 12:25:51 +0000 (+0000)
Subject: GCC expands switch statements in a very simplistic way and tries to use a table...
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=e79136e41af698ccdaec62ed842b4785162dde09;p=gcc.git

GCC expands switch statements in a very simplistic way and tries to use a
table expansion even when it is a bad idea for performance or code size.
GCC typically emits extremely sparse tables that contain mostly default
entries (something which currently cannot be tuned by backends).
Additionally, the computation of the minimum/maximum label offsets is too
simplistic, so the tables are often twice as large as necessary.

The cost of a table switch is significant due to the setup overhead, the
table lookup (which, being sparse and large, adds unnecessary cache misses)
and the hard-to-predict indirect jump.  Therefore it is best to avoid using
a table unless there are many real case labels.

This patch fixes that by changing the default returned by
aarch64_case_values_threshold so that tables are used only for more than
16 cases when the per-CPU tuning is not set.  On SPEC2006 this improves the
switch-heavy benchmarks GCC and perlbench both in performance (1-2%) and in
size (0.5-1% smaller).

gcc/
	* config/aarch64/aarch64.c (aarch64_case_values_threshold):
	Return a better case_values_threshold when optimizing.

From-SVN: r236771
---

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index b52e581d60a..8fe8a26f394 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2016-05-26  Wilco Dijkstra
+
+	* config/aarch64/aarch64.c (aarch64_case_values_threshold):
+	Return a better case_values_threshold when optimizing.
+
 2016-05-26  Wilco Dijkstra
 
 	* config/aarch64/aarch64-simd.md (aarch64_combinez):
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index bd45a7d0620..84dcb0be869 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -3572,7 +3572,12 @@ aarch64_cannot_force_const_mem (machine_mode mode ATTRIBUTE_UNUSED, rtx x)
   return aarch64_tls_referenced_p (x);
 }
 
-/* Implement TARGET_CASE_VALUES_THRESHOLD.  */
+/* Implement TARGET_CASE_VALUES_THRESHOLD.
+   The expansion for a table switch is quite expensive due to the number
+   of instructions, the table lookup and hard to predict indirect jump.
+   When optimizing for speed, and -O3 enabled, use the per-core tuning if
+   set, otherwise use tables for > 16 cases as a tradeoff between size and
+   performance.  When optimizing for size, use the default setting.  */
 
 static unsigned int
 aarch64_case_values_threshold (void)
@@ -3583,7 +3588,7 @@ aarch64_case_values_threshold (void)
       && selected_cpu->tune->max_case_values != 0)
     return selected_cpu->tune->max_case_values;
   else
-    return default_case_values_threshold ();
+    return optimize_size ? default_case_values_threshold () : 17;
 }
 
 /* Return true if register REGNO is a valid index register.
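
The selection logic described in the new comment can be sketched as a standalone C function, which may make the three outcomes easier to compare.  This is only an illustration under stated assumptions: struct sketch_tune, sketch_case_values_threshold and ASSUMED_GENERIC_DEFAULT are hypothetical names, not GCC declarations.  The "-O3 enabled" test is modelled as optimize_level > 2 based on the comment alone, since the hunk does not show that part of the condition, and the generic default is a placeholder value rather than whatever default_case_values_threshold () really returns.

```c
/* Hypothetical stand-in for the per-core tuning structure; only the one
   field that the patch reads is modelled here.  */
struct sketch_tune
{
  unsigned int max_case_values; /* 0 means "no per-core setting".  */
};

/* Assumed placeholder for the result of default_case_values_threshold ();
   the real value is decided elsewhere in GCC.  */
#define ASSUMED_GENERIC_DEFAULT 4u

/* Sketch of the threshold choice after the patch.  The returned value is
   the smallest number of case values for which a table is considered, so
   17 corresponds to "use tables for > 16 cases".  */
static unsigned int
sketch_case_values_threshold (int optimize_level, int optimize_size,
                              const struct sketch_tune *tune)
{
  /* Assumption: "-O3 enabled" in the comment means optimize_level > 2.  */
  if (optimize_level > 2 && tune->max_case_values != 0)
    return tune->max_case_values;

  /* Otherwise keep the generic default when optimizing for size and use
     17 when optimizing for speed.  */
  return optimize_size ? ASSUMED_GENERIC_DEFAULT : 17u;
}
```

With these stand-ins, a core that sets max_case_values only overrides the result at -O3; every other speed-optimized build gets 17, and size-optimized builds keep the generic default.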