From db4a1c18ceb5aede224c92ec4c86723f6fb93514 Mon Sep 17 00:00:00 2001
From: Wilco Dijkstra
Date: Mon, 14 Nov 2016 11:51:33 +0000
Subject: [PATCH] The existing vector costs stop some beneficial vectorization.

The existing vector costs stop some beneficial vectorization.  This is mostly
due to vector statement cost being set to 3 as well as vector loads having a
higher cost than scalar loads.  This means that even when we vectorize 4x, it
is possible that the cost of a vectorized loop is similar to the scalar
version, and we fail to vectorize.

Using a cost of 3 for a vector operation suggests they are 3 times as
expensive as scalar operations.  Since most vector operations have a similar
throughput as scalar operations, this is not correct.

Using slightly lower values for these heuristics now allows this loop and
many others to be vectorized.  On a proprietary benchmark the gain from
vectorizing this loop is around 15-30% which shows vectorizing it is indeed
beneficial.

	* config/aarch64/aarch64.c (cortexa57_vector_cost):
	Change vec_stmt_cost, vec_align_load_cost and vec_unalign_load_cost.

From-SVN: r242383
---
 gcc/ChangeLog                | 5 +++++
 gcc/config/aarch64/aarch64.c | 6 +++---
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index a4f6a34f8f1..b3967a245de 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2016-11-14  Wilco Dijkstra
+
+	* config/aarch64/aarch64.c (cortexa57_vector_cost):
+	Change vec_stmt_cost, vec_align_load_cost and vec_unalign_load_cost.
+
 2016-11-14  Richard Biener
 
 	PR tree-optimization/78312
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index b7d4640826a..bd97c5b701c 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -398,12 +398,12 @@ static const struct cpu_vector_cost cortexa57_vector_cost =
   1, /* scalar_stmt_cost */
   4, /* scalar_load_cost */
   1, /* scalar_store_cost */
-  3, /* vec_stmt_cost */
+  2, /* vec_stmt_cost */
   3, /* vec_permute_cost */
   8, /* vec_to_scalar_cost */
   8, /* scalar_to_vec_cost */
-  5, /* vec_align_load_cost */
-  5, /* vec_unalign_load_cost */
+  4, /* vec_align_load_cost */
+  4, /* vec_unalign_load_cost */
   1, /* vec_unalign_store_cost */
   1, /* vec_store_cost */
   1, /* cond_taken_branch_cost */
-- 
2.30.2
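
The effect of the changed values on a loop body estimate can be illustrated with a small sketch. It is not GCC's cost model: the real vectorizer also accounts for prologue and epilogue code, permutes, reductions and more, which is where a 4x-vectorized loop can end up looking no cheaper than the scalar one. The names `cost_table` and `loop_body_cost`, and the loop shape used in the note below (two loads, one arithmetic statement, one store), are assumptions made for illustration only. The numeric costs are the ones in the patch.

```c
/* Hypothetical, trimmed-down stand-in for the cost fields this patch touches.
   This is NOT GCC's struct cpu_vector_cost.  */
struct cost_table
{
  int stmt_cost;   /* cost of one arithmetic statement */
  int load_cost;   /* cost of one (aligned or unaligned) load */
  int store_cost;  /* cost of one store */
};

/* Cortex-A57 scalar costs, unchanged by the patch.  */
const struct cost_table scalar_costs = { 1, 4, 1 };

/* Cortex-A57 vector costs before and after the patch.  */
const struct cost_table old_vector_costs = { 3, 5, 1 };
const struct cost_table new_vector_costs = { 2, 4, 1 };

/* Simplified estimate for one iteration of a loop body: a plain sum of
   per-statement costs.  Assumed for illustration; the real analysis in
   GCC adds further terms.  */
int
loop_body_cost (const struct cost_table *t, int loads, int stmts, int stores)
{
  return loads * t->load_cost + stmts * t->stmt_cost + stores * t->store_cost;
}
```

For the assumed loop shape, `loop_body_cost (&old_vector_costs, 2, 1, 1)` is 14 and `loop_body_cost (&new_vector_costs, 2, 1, 1)` is 11, while the scalar body `loop_body_cost (&scalar_costs, 2, 1, 1)` is 10. Under the new table a vector iteration is estimated only slightly above a scalar one, which leaves more room for the remaining overheads before vectorization is rejected.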