Martin Sebor [Wed, 5 Feb 2020 23:55:26 +0000 (16:55 -0700)]
PR tree-optimization/92765 - wrong code for strcmp of a union member
gcc/ChangeLog:
PR tree-optimization/92765
* gimple-fold.c (get_range_strlen_tree): Handle MEM_REF and PARM_DECL.
* tree-ssa-strlen.c (compute_string_length): Remove.
(determine_min_objsize): Remove.
(get_len_or_size): Add an argument. Call get_range_strlen_dynamic.
Avoid using type size as the upper bound on string length.
(handle_builtin_string_cmp): Add an argument. Adjust.
(strlen_check_and_optimize_call): Pass additional argument to
handle_builtin_string_cmp.
gcc/testsuite/ChangeLog:
PR tree-optimization/92765
* g++.dg/tree-ssa/strlenopt-1.C: New test.
* g++.dg/tree-ssa/strlenopt-2.C: New test.
* gcc.dg/Warray-bounds-58.c: New test.
* gcc.dg/Wrestrict-20.c: Avoid a valid -Wformat-overflow.
* gcc.dg/Wstring-compare.c: Xfail a test.
* gcc.dg/strcmpopt_2.c: Disable tests.
* gcc.dg/strcmpopt_4.c: Adjust tests.
* gcc.dg/strcmpopt_10.c: New test.
* gcc.dg/strcmpopt_11.c: New test.
* gcc.dg/strlenopt-69.c: Disable tests.
* gcc.dg/strlenopt-92.c: New test.
* gcc.dg/strlenopt-93.c: New test.
* gcc.dg/strlenopt.h: Declare calloc.
* gcc.dg/tree-ssa/pr92056.c: Xfail tests until pr93518 is resolved.
* gcc.dg/tree-ssa/builtin-sprintf-warn-23.c: Correct test (pr93517).
Jason Merrill [Wed, 5 Feb 2020 22:59:28 +0000 (17:59 -0500)]
c++: Fix decltype of empty pack expansion of parm.
In unevaluated context, we only substitute a single PARM_DECL, not the
entire chain, but the handling of an empty pack expansion was missing that
check.
PR c++/93140
* pt.c (tsubst_decl) [PARM_DECL]: Check cp_unevaluated_operand in
handling of TREE_CHAIN for empty pack.
Uros Bizjak [Wed, 5 Feb 2020 23:13:00 +0000 (00:13 +0100)]
Simplify post epilogue_completed splitters.
Now that we have a post epilogue_completed split point for all
optimization levels, we can simplify post epilogue_completed splitters
considerably. If the corresponding define_peephole2 pattern fails to
allocate a temporary register (or if the peephole2 pass isn't run at all),
we can now always split the invalid RTX after epilogue_completed is set.
Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
* config/i386/i386.md (*pushdi2_rex64 peephole2): Remove.
(*pushdi2_rex64 peephole2): Unconditionally split after
epilogue_completed.
(*ashl<mode>3_doubleword): Ditto.
(*<shift_insn><mode>3_doubleword): Ditto.
Marek Polacek [Wed, 5 Feb 2020 22:45:16 +0000 (17:45 -0500)]
Move CL to the correct file.
Marek Polacek [Wed, 5 Feb 2020 22:44:06 +0000 (17:44 -0500)]
Add missing CL.
Jakub Jelinek [Wed, 5 Feb 2020 22:35:08 +0000 (23:35 +0100)]
c++: Mark __builtin_convertvector operand as read [PR93557]
In C++ we weren't calling mark_exp_read on the __builtin_convertvector first
argument. I guess it could misbehave even with lambda implicit captures.
Fixed by calling decay_conversion on the argument: we use the argument as
an rvalue, so we want the standard lvalue-to-rvalue conversions, but as the
argument must have a vector type, e.g. integral promotions aren't really
needed.
2020-02-05 Jakub Jelinek <jakub@redhat.com>
PR c++/93557
* semantics.c (cp_build_vec_convert): Call decay_conversion on arg
prior to passing it to c_build_vec_convert.
* c-c++-common/Wunused-var-17.c: New test.
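For context, a minimal sketch of the kind of code this concerns: a local vector variable whose only use is as the first argument of __builtin_convertvector. The typedef and function names are illustrative assumptions and are not taken from the new test; the block relies on the GNU vector extension.

```cpp
// Illustrative only: names are hypothetical, not from Wunused-var-17.c.
// Requires the GNU vector extension and __builtin_convertvector.
typedef int v4si __attribute__ ((vector_size (16)));
typedef float v4sf __attribute__ ((vector_size (16)));

v4sf
to_float (int a, int b, int c, int d)
{
  // 'v' is read only through __builtin_convertvector; presumably this is
  // the read that was not being recorded before the fix.
  v4si v = { a, b, c, d };
  return __builtin_convertvector (v, v4sf);
}
```

The conversion is element-wise, so each int lane becomes the float with the same value.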
Michael Meissner [Wed, 5 Feb 2020 21:45:05 +0000 (16:45 -0500)]
Fix PR 93568 (thinko)
2020-02-05 Michael Meissner <meissner@linux.ibm.com>
PR target/93568
* config/rs6000/rs6000.c (get_vector_offset): Fix
Marek Polacek [Wed, 5 Feb 2020 17:25:01 +0000 (12:25 -0500)]
c++: Fix ICE with CONSTRUCTOR flags verification [PR93559]
Since reshape_init_array_1 can now reuse a single constructor for
an array of non-aggregate type, we might run into a scenario where
we reuse a constructor with TREE_SIDE_EFFECTS. This broke this test
because we have something like { { expr } } and we try to reshape it,
so we recurse on the inner CONSTRUCTOR, reuse an existing CONSTRUCTOR
with TREE_SIDE_EFFECTS, and then ICE on the discrepancy because the
outermost CONSTRUCTOR doesn't have TREE_SIDE_EFFECTS. In this case
EXPR was a call to an operator function so TREE_SIDE_EFFECTS should
be set. Naturally one would want to fix this by calling
recompute_constructor_flags in an appropriate place so that the flags
on the CONSTRUCTORs match. The appropriate place would be at the end
of reshape_init, but this breaks initlist109.C: there we are dealing
with { { TARGET_EXPR <{}> } } where the outermost { } is TREE_CONSTANT
but the inner { } is not, so recompute_constructor_flags would clear
the constant flag in the outermost { }. Seems reasonable but it upsets
check_initializer which then complains about "non-constant in-class
initialization invalid for static member". TARGET_EXPRs are always
created with TREE_SIDE_EFFECTS on, but that is mutually exclusive
with TREE_CONSTANT. So we're in a bind.
Fixed by not reusing a CONSTRUCTOR that has TREE_SIDE_EFFECTS; in the
grand scheme of things it isn't measurable: it only affects ~3 tests
in the testsuite.
PR c++/93559 - ICE with CONSTRUCTOR flags verification.
* decl.c (reshape_init_array_1): Don't reuse a CONSTRUCTOR with
TREE_SIDE_EFFECTS.
* g++.dg/cpp0x/initlist119.C: New test.
* g++.dg/cpp0x/initlist120.C: New test.
Jason Merrill [Tue, 4 Feb 2020 23:49:16 +0000 (18:49 -0500)]
c++: Fix SEGV with malformed constructor decl.
In the testcase, since there's no declaration of T, ref_view(T) declares a
non-static data member T of type ref_view, the same type as its enclosing
class. Then when we try to do C++20 aggregate class template argument
deduction we recursively try to adjust the braced-init-list to match the
template class definition until we run out of stack.
Fixed by rejecting the template data member.
PR c++/92593
* decl.c (grokdeclarator): Reject field of current class type even
in a template.
Andrew Stubbs [Wed, 5 Feb 2020 16:58:41 +0000 (16:58 +0000)]
amdgcn: Remove redundant multilib
2020-02-05 Andrew Stubbs <ams@codesourcery.com>
gcc/
* config/gcn/t-gcn-hsa (MULTILIB_OPTIONS): Use / not space.
Jeff Law [Wed, 5 Feb 2020 17:00:48 +0000 (10:00 -0700)]
Fix testsuite "regression" on hppa after recent IRA changes.
* gcc.target/hppa/shadd-3.c: Disable delay slot filling and
adjust expected shadd insn count appropriately.
Tobias Burnus [Wed, 5 Feb 2020 16:40:48 +0000 (17:40 +0100)]
[libgomp] – Fix check_effective_target_offload_target_nvptx for remote execution
* testsuite/lib/libgomp.exp
(check_effective_target_offload_target_nvptx): Pass flags as 'options'
and not as 'source' argument to libgomp_target_compile.
Jonathan Wakely [Wed, 5 Feb 2020 10:35:19 +0000 (10:35 +0000)]
libstdc++: Remove workarounds for constraints on alias templates
The G++ bug has been fixed for a couple of months so we can remove these
workarounds that define alias templates in terms of constrained class
templates. We can just apply constraints directly to alias templates as
specified in the C++20 working draft.
* include/bits/iterator_concepts.h (iter_reference_t)
(iter_rvalue_reference_t, iter_common_reference_t, indirect_result_t):
Remove workarounds for PR c++/67704.
* testsuite/24_iterators/aliases.cc: New test.
David Malcolm [Tue, 4 Feb 2020 21:23:27 +0000 (16:23 -0500)]
analyzer: add enode status and revamp __analyzer_dump_exploded_nodes
The analyzer recognizes __analyzer_dump_exploded_nodes as a "magic"
function for use in DejaGnu tests: at the end of the pass, it issues
a warning at each such call, dumping the count of exploded nodes seen at
the call, which can be checked in test cases via dg-warning directives,
along with the IDs of the enodes (which is helpful when debugging).
My intent was to give a way of testing the results of the state-merging
code.
The state-merging code can generate duplicate exploded nodes at a point
when state merging occurs, taking a pair of enodes from the worklist
that share a program_point and sufficiently similar state. For these
cases it generates a merged state, and adds edges from those enodes to
the merged-state enode (potentially a new or a pre-existing enode); the
input enodes don't have process_node called on them.
This means that at a CFG join point there can be an unpredictable number
of enodes that we don't care about, where the precise number depends on
the details of the state-merger code, immediately followed by a more
predictable number that we do care about.
I've been papering over this in the analyzer DejaGnu tests somewhat
by adding pairs of __analyzer_dump_exploded_nodes calls at CFG join
points, where the output at the first call is somewhat arbitrary, and
the second has the number we care about; the first number tends to
change "at random" as I tweak the state merging code, in ways that
aren't interesting, but require the tests to be updated.
See e.g. gcc.dg/analyzer/paths-6.c which had:
__analyzer_dump_exploded_nodes (0); /* { dg-warning "2 exploded nodes" } */
// FIXME: the above can vary between 2 and 3 exploded nodes
__analyzer_dump_exploded_nodes (0); /* { dg-warning "1 exploded node" } */
This patch remedies this situation by tracking which enodes are
processed, and which are merely "merger" enodes. It updates the
output for __analyzer_dump_exploded_nodes so that the count of enodes
only includes the *processed* enodes, and that the IDs are split
into "processed" and "merger" enodes.
The patch simplifies the testsuite by eliminating the redundant calls
described above; the example above becomes:
__analyzer_dump_exploded_nodes (0); /* { dg-warning "1 processed enode" } */
where the output in question is now:
warning: 1 processed enode: [EN: 94] merger(s): [EN: 93]
The patch also adds various checks on the status of enodes, to ensure
e.g. that each enode is processed at most once.
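The status tracking described above can be pictured with a toy model. This is only a sketch: the STATUS_* names mirror the ChangeLog below, while toy_enode, process, mark_merger and count_processed are hypothetical and do not reflect the analyzer's real classes.

```cpp
#include <cassert>
#include <vector>

// Toy model, not the analyzer's real code: only the status names
// follow the ChangeLog; everything else is an assumption.
enum status { STATUS_WORKLIST, STATUS_PROCESSED, STATUS_MERGER };

struct toy_enode
{
  int m_id;
  status m_status;

  explicit toy_enode (int id) : m_id (id), m_status (STATUS_WORKLIST) { }
};

// Processing is only valid for a node still on the worklist, which
// enforces "processed at most once".
inline void
process (toy_enode &n)
{
  assert (n.m_status == STATUS_WORKLIST);
  n.m_status = STATUS_PROCESSED;
}

// A node consumed by state merging is never processed.
inline void
mark_merger (toy_enode &n)
{
  assert (n.m_status == STATUS_WORKLIST);
  n.m_status = STATUS_MERGER;
}

// Count only the processed nodes, ignoring mergers.
inline int
count_processed (const std::vector<toy_enode> &nodes)
{
  int n = 0;
  for (const toy_enode &e : nodes)
    if (e.m_status == STATUS_PROCESSED)
      ++n;
  return n;
}
```

With one merger node and one processed node at a call site, count_processed reports 1, which is the shape of the "1 processed enode" output quoted above.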
gcc/analyzer/ChangeLog:
* engine.cc (exploded_node::dump_dot): Show merger enodes.
(worklist::add_node): Assert that the node's m_status is
STATUS_WORKLIST.
(exploded_graph::process_worklist): Likewise for nodes from the
worklist. Set status of merged nodes to STATUS_MERGER.
(exploded_graph::process_node): Set status of node to
STATUS_PROCESSED.
(exploded_graph::dump_exploded_nodes): Rework handling of
"__analyzer_dump_exploded_nodes", splitting enodes by status into
"processed" and "merger", showing the count of just the processed
enodes at the call, rather than the count of all enodes.
* exploded-graph.h (exploded_node::status): New enum.
(exploded_node::exploded_node): Initialize m_status to
STATUS_WORKLIST.
(exploded_node::get_status): New getter.
(exploded_node::set_status): New setter.
(exploded_node::m_status): New field.
gcc/ChangeLog:
* doc/analyzer.texi
(Special Functions for Debugging the Analyzer): Update description
of __analyzer_dump_exploded_nodes.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/data-model-1.c: Update for changed output to
__analyzer_dump_exploded_nodes, dropping redundant call at merger.
* gcc.dg/analyzer/data-model-7.c: Likewise.
* gcc.dg/analyzer/loop-2.c: Update for changed output format.
* gcc.dg/analyzer/loop-2a.c: Likewise.
* gcc.dg/analyzer/loop-4.c: Likewise.
* gcc.dg/analyzer/loop.c: Likewise.
* gcc.dg/analyzer/malloc-paths-10.c: Likewise; drop redundant
call at merger.
* gcc.dg/analyzer/malloc-vs-local-1a.c: Likewise.
* gcc.dg/analyzer/malloc-vs-local-1b.c: Likewise.
* gcc.dg/analyzer/malloc-vs-local-2.c: Likewise.
* gcc.dg/analyzer/malloc-vs-local-3.c: Likewise.
* gcc.dg/analyzer/paths-1.c: Likewise.
* gcc.dg/analyzer/paths-1a.c: Likewise.
* gcc.dg/analyzer/paths-2.c: Likewise.
* gcc.dg/analyzer/paths-3.c: Likewise.
* gcc.dg/analyzer/paths-4.c: Update for changed output format.
* gcc.dg/analyzer/paths-5.c: Likewise.
* gcc.dg/analyzer/paths-6.c: Likewise; drop redundant calls
at merger.
* gcc.dg/analyzer/paths-7.c: Likewise.
* gcc.dg/analyzer/torture/conditionals-2.c: Update for changed
output format.
* gcc.dg/analyzer/zlib-1.c: Likewise; drop redundant calls.
* gcc.dg/analyzer/zlib-5.c: Update for changed output format.
Jakub Jelinek [Wed, 5 Feb 2020 14:38:49 +0000 (15:38 +0100)]
i386: Omit clobbers from vzeroupper until final [PR92190]
As mentioned in the PR, the CLOBBERs in vzeroupper are added there even for
registers that aren't ever live in the function before and break the
prologue/epilogue expansion with ms ABI (normal ABIs are fine, as they
consider all [xyz]mm registers call clobbered, but the ms ABI considers
xmm0-15 call used but the bits above low 128 ones call clobbered).
The following patch fixes it by not adding the clobbers during vzeroupper
pass (before pro_and_epilogue), but adding them for -fipa-ra purposes only
during the final output. Perhaps we could add some CLOBBERs early (say for
df_regs_ever_live_p regs that aren't live in the live_regs bitmap, or
depending on the ABI either add all of them immediately, or for ms ABI add
CLOBBERs for xmm0-xmm5 if they don't have a SET) and add the rest later.
And the addition could be perhaps done at other spots, e.g. in an
epilogue_completed guarded splitter.
2020-02-05 Jakub Jelinek <jakub@redhat.com>
PR target/92190
* config/i386/i386-features.c (ix86_add_reg_usage_to_vzeroupper): Only
include sets and not clobbers in the vzeroupper pattern.
* config/i386/sse.md (*avx_vzeroupper): Require in insn condition that
the parallel has 17 (64-bit) or 9 (32-bit) elts.
(*avx_vzeroupper_1): New define_insn_and_split.
* gcc.target/i386/pr92190.c: New test.
Jakub Jelinek [Wed, 5 Feb 2020 14:35:46 +0000 (15:35 +0100)]
i386: Schedule the only -O0 split pass on x86 after pro_and_epilogue/jump2 [PR92190]
The problem is that x86 is the only target that defines HAVE_ATTR_length and
doesn't schedule any splitting pass at -O0 after pro_and_epilogue.
So, either we go back to handling this during vzeroupper output
(unconditionally, rather than flag_ipa_ra guarded), or we need to tweak the
split* passes for x86.
Seems there are 5 split passes, split1 is run unconditionally before reload,
split2 is run for optimize > 0 or STACK_REGS (x86) after ra but before
epilogue_completed, split3 is run before regstack for STACK_REGS and
optimize and -fno-schedule-insns2, split4 is run before sched2 if sched2 is
run and split5 is run before shorten_branches if HAVE_ATTR_length and not
STACK_REGS.
2020-02-05 Jakub Jelinek <jakub@redhat.com>
PR target/92190
* recog.c (pass_split_after_reload::gate): For STACK_REGS targets,
don't run when !optimize.
(pass_split_before_regstack::gate): For STACK_REGS targets, run even
when !optimize.
Richard Biener [Wed, 5 Feb 2020 13:10:50 +0000 (14:10 +0100)]
testsuite/92177 fix for SLP build changes
We're now consistently building SLP operations with only
scalar defs from scalars which makes the testcase no longer
testing multiplication vectorization. The following smuggles
in a constant making the vector variant profitable for SLP build.
2020-02-05 Richard Biener <rguenther@suse.de>
PR testsuite/92177
* gcc.dg/vect/bb-slp-22.c: Adjust.
Richard Biener [Wed, 5 Feb 2020 13:04:29 +0000 (14:04 +0100)]
middle-end/90648 fend off builtin calls with not enough arguments from match
This adds guards to genmatch generated code before accessing call
expression or stmt arguments that might be out of bounds when
the user provided bogus prototypes for what we consider builtins.
2020-02-05 Richard Biener <rguenther@suse.de>
PR middle-end/90648
* genmatch.c (dt_node::gen_kids_1): Emit number of argument
checks before matching calls.
* gcc.dg/pr90648.c: New testcase.
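The guard can be sketched in isolation as "check the argument count before indexing". The types and function below are hypothetical stand-ins, not genmatch output or GCC internals.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a call statement; not a GCC type.
struct toy_call
{
  std::vector<int> args;
};

// Match "f (x, 0)" only if the call really has two arguments.  Without
// the size check, a call declared with a bogus one-argument prototype
// would lead to an out-of-bounds read of args[1].
inline bool
second_arg_is_zero (const toy_call &call)
{
  if (call.args.size () < 2)
    return false;
  return call.args[1] == 0;
}
```

A call with too few arguments simply fails to match instead of being inspected.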
Andrew Burgess [Mon, 27 Jan 2020 22:06:35 +0000 (22:06 +0000)]
libiberty/hashtab: More const parameters
Makes some parameters const in libiberty's hashtab library.
include/ChangeLog:
* hashtab.h (htab_remove_elt): Make a parameter const.
(htab_remove_elt_with_hash): Likewise.
libiberty/ChangeLog:
* hashtab.c (htab_remove_elt): Make a parameter const.
(htab_remove_elt_with_hash): Likewise.
Bin Cheng [Wed, 5 Feb 2020 10:45:08 +0000 (18:45 +0800)]
Increase index number for creating temp vars' name.
gcc/cp
* coroutines.cc (maybe_promote_captured_temps): Increase the index
number for temporary variables' name.
Jakub Jelinek [Wed, 5 Feb 2020 10:36:25 +0000 (11:36 +0100)]
Fix up comment typo.
2020-02-05 Jakub Jelinek <jakub@redhat.com>
* tree-ssa-alias.c (aliasing_matching_component_refs_p): Fix up
function comment typo.
Jakub Jelinek [Wed, 5 Feb 2020 10:32:37 +0000 (11:32 +0100)]
openmp: Avoid ICEs with declare simd; declare simd inbranch [PR93555]
The testcases ICE because when processing the declare simd inbranch,
we don't create the i == 0 clone as it already exists, which means
clone_info->nargs is not adjusted, but we then rely on it being adjusted
when trying other clones.
2020-02-05 Jakub Jelinek <jakub@redhat.com>
PR middle-end/93555
* omp-simd-clone.c (expand_simd_clones): If simd_clone_mangle or
simd_clone_create failed when i == 0, adjust clone->nargs by
clone->inbranch.
* c-c++-common/gomp/pr93555-1.c: New test.
* c-c++-common/gomp/pr93555-2.c: New test.
* gfortran.dg/gomp/pr93555.f90: New test.
Martin Liska [Wed, 5 Feb 2020 08:56:31 +0000 (09:56 +0100)]
Do not load body for alias symbols.
PR lto/93489
* lto-dump.c (dump_list_functions): Do not
load body for aliases.
(dump_body): Likewise here.
Martin Liska [Wed, 5 Feb 2020 08:55:09 +0000 (09:55 +0100)]
Document ASLR for Precompiled Headers.
PR c++/92717
* doc/invoke.texi: Document that one should
not combine ASLR and -fpch.
Patrick Palka [Wed, 22 Jan 2020 21:51:19 +0000 (16:51 -0500)]
libstdc++: Apply the move_iterator changes described in P1207R4
These changes are needed for some of the tests in the constrained algorithm
patch, because they use move_iterator with an uncopyable output_iterator. The
other changes described in the paper are already applied, it seems.
libstdc++-v3/ChangeLog:
* include/bits/stl_iterator.h (move_iterator::move_iterator): Move __i
when initializing _M_current.
(move_iterator::base): Split into two overloads differing in
ref-qualifiers as in P1207R4 for C++20.
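A minimal sketch of the two ideas in this change, moving the argument into the stored member and splitting base() by ref-qualifier. toy_wrapper and move_only are hypothetical names; this is not libstdc++'s move_iterator.

```cpp
#include <utility>

// Hypothetical wrapper, not libstdc++'s move_iterator.
template<typename Iter>
class toy_wrapper
{
  Iter m_current;

public:
  // Move the argument in, so move-only iterators are accepted.
  explicit toy_wrapper (Iter i) : m_current (std::move (i)) { }

  // Called on an lvalue: return a copy (needs a copyable Iter if used).
  Iter base () const & { return m_current; }

  // Called on an rvalue: move the stored iterator out.
  Iter base () && { return std::move (m_current); }
};

// A move-only stand-in for an uncopyable iterator.
struct move_only
{
  int value;

  explicit move_only (int v) : value (v) { }
  move_only (move_only &&) = default;
  move_only (const move_only &) = delete;
};
```

Because member functions of a class template are instantiated only when used, toy_wrapper<move_only> is usable as long as only the rvalue overload of base() is called.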
JunMa [Tue, 21 Jan 2020 10:18:09 +0000 (18:18 +0800)]
Handle type deduction of auto and decltype(auto) with reference expression
gcc/cp
* coroutines.cc (build_co_await): Call convert_from_reference
to wrap co_await_expr with indirect_ref which avoids
reference/non-reference type confusion.
(co_await_expander): Sink to call_expr if await_resume
is wrapped by indirect_ref.
gcc/testsuite
* g++.dg/coroutines/co-await-14-return-ref-to-auto.C: New test.
GCC Administrator [Wed, 5 Feb 2020 00:16:31 +0000 (00:16 +0000)]
Daily bump.
Jason Merrill [Tue, 4 Feb 2020 22:18:35 +0000 (17:18 -0500)]
c++: Fix error-recovery with concepts.
Here, push_tinst_level refused to push into the scope of Foo::Foo
because it was triggered from the ill-formed function fun. But we didn't
check the return value and tried to pop the un-pushed level.
PR c++/93551
* constraint.cc (satisfy_declaration_constraints): Check return
value of push_tinst_level.
Jason Merrill [Tue, 4 Feb 2020 20:54:17 +0000 (15:54 -0500)]
c++: Fix constexpr vs. omitted aggregate init.
Value-initialization is importantly different from {}-initialization for
this testcase, where the former calls the deleted S constructor and the
latter initializes S happily.
PR c++/90951
* constexpr.c (cxx_eval_array_reference): {}-initialize missing
elements instead of value-initializing them.
Jason Merrill [Tue, 4 Feb 2020 19:21:59 +0000 (14:21 -0500)]
c++: Fix ({ ... }) array mem-initializer.
Here, we were going down the wrong path in perform_member_init because of
the incorrect parens around the mem-initializer for the array. And then
cxx_eval_vec_init_1 didn't know what to do with a CONSTRUCTOR as the
initializer. The latter issue was a straightforward fix, but I also wanted
to fix us silently accepting the parens, which led to factoring out handling
of TREE_LIST and flexarrays. The latter led to adjusting the expected
behavior on flexary29.C: we should complain about the initializer, but not
complain about a missing initializer.
As I commented on PR 92812, in this process I noticed that we weren't
handling C++20 parenthesized aggregate initialization as a mem-initializer.
So my TREE_LIST handling includes a commented out section that should
probably be part of a future fix for that issue; with it uncommented we
continue to crash on the testcase in C++20 mode, but should instead complain
about the braced-init-list not being a valid initializer for an A.
PR c++/86917
* init.c (perform_member_init): Simplify.
* constexpr.c (cx_check_missing_mem_inits): Allow uninitialized
flexarray.
(cxx_eval_vec_init_1): Handle CONSTRUCTOR.
David Malcolm [Mon, 3 Feb 2020 20:39:50 +0000 (15:39 -0500)]
analyzer: fix testsuite assumption that sizeof(int) > 2
Fix some failures on xstormy16-elf:
gcc.dg/analyzer/data-model-1.c (test for warnings, line 595)
gcc.dg/analyzer/data-model-1.c (test for warnings, line 642)
gcc.dg/analyzer/data-model-1.c (test for warnings, line 690)
gcc.dg/analyzer/data-model-1.c (test for warnings, line 738)
due to:
warning: overflow in conversion from ‘long int’ to ‘int’ changes
value from ‘100024’ to ‘-31048’ [-Woverflow]
20 | p[0].x = 100024;
| ^~~~~~
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/data-model-1.c (struct coord): Convert fields
from int to long.
David Malcolm [Tue, 28 Jan 2020 21:31:01 +0000 (16:31 -0500)]
analyzer: fix build error with clang (PR 93543)
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=243681 reports a build
failure with clang 9.0.1:
gcc/analyzer/engine.cc:2971:13: error:
reinterpret_cast from 'nullptr_t' to 'function *' is not allowed
v.m_fun = reinterpret_cast<function *> (NULL);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
engine.cc:2983:21: error:
reinterpret_cast from 'nullptr_t' to 'function *' is not allowed
return v.m_fun == reinterpret_cast<function *> (NULL);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The casts appear to be unnecessary; eliminate them.
gcc/analyzer/ChangeLog:
PR analyzer/93543
* engine.cc (pod_hash_traits<function_call_string>::mark_empty):
Eliminate reinterpret_cast.
(pod_hash_traits<function_call_string>::is_empty): Likewise.
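As a sketch of why no cast is needed: a null pointer constant converts implicitly to any object pointer type. The types below are hypothetical stand-ins, and the use of nullptr is an assumption; the actual patch may spell the null value differently.

```cpp
// Hypothetical types; only the idea (no cast needed) is from the log.
struct toy_function;        // an incomplete type is enough for a pointer

struct toy_key
{
  toy_function *m_fun;
};

inline void
mark_empty (toy_key &v)
{
  // A null pointer constant converts implicitly to 'toy_function *'.
  v.m_fun = nullptr;
}

inline bool
is_empty (const toy_key &v)
{
  return v.m_fun == nullptr;
}
```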
Richard Biener [Tue, 4 Feb 2020 14:17:01 +0000 (15:17 +0100)]
tree-optimization/93538 - add missing comparison folding case
This adds back a folding that worked in the GCC 4.5 era by amending
the pattern that handles other cases of address vs. SSA name
comparisons.
2020-02-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/93538
* match.pd (addr EQ/NE ptr): Amend to handle &ptr->x EQ/NE ptr.
* gcc.dg/tree-ssa/forwprop-38.c: New testcase.
Jonathan Wakely [Tue, 4 Feb 2020 13:30:57 +0000 (13:30 +0000)]
libstdc++: Fix name of macro in #undef directive
The macro that is defined is _GLIBCXX_NOT_FN_CALL_OP but the macro that
was named in the #undef directive was _GLIBCXX_NOT_FN_CALL. This fixes
the #undef.
* include/std/functional (_GLIBCXX_NOT_FN_CALL_OP): Un-define after
use.
Jonathan Wakely [Tue, 4 Feb 2020 12:59:14 +0000 (12:59 +0000)]
libstdc++: Fix regressions in unique_ptr::swap (PR 93562)
The requirements for this function are only that the deleter is
swappable, but we incorrectly require that the element type is complete
and that the deleter can be swapped using std::swap (which requires it
to be move constructible and move assignable).
The fix is to add __uniq_ptr_impl::swap which swaps the pointer and
deleter individually, instead of using the generic std::swap on the
tuple containing them.
PR libstdc++/93562
* include/bits/unique_ptr.h (__uniq_ptr_impl::swap): Define.
(unique_ptr::swap, unique_ptr<T[], D>::swap): Call it.
* testsuite/20_util/unique_ptr/modifiers/93562.cc: New test.
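The approach, swapping the pointer and the deleter individually so that only swappability is required of the deleter, can be sketched as follows. toy_owner and tagged_deleter are hypothetical; this is not the libstdc++ implementation of __uniq_ptr_impl::swap.

```cpp
#include <utility>

// Hypothetical deleter: swappable through its own swap overload, but
// not assignable, so the generic std::swap cannot be used on it.
struct tagged_deleter
{
  int tag;

  explicit tagged_deleter (int t) : tag (t) { }
  tagged_deleter (const tagged_deleter &) = default;
  tagged_deleter &operator= (const tagged_deleter &) = delete;

  void operator() (int *p) const { delete p; }
};

inline void
swap (tagged_deleter &a, tagged_deleter &b) noexcept
{
  int t = a.tag;
  a.tag = b.tag;
  b.tag = t;
}

// Toy owner, not libstdc++'s __uniq_ptr_impl.
template<typename T, typename D>
struct toy_owner
{
  T *m_ptr;
  D m_del;

  toy_owner (T *p, D d) : m_ptr (p), m_del (d) { }
  toy_owner (const toy_owner &) = delete;
  ~toy_owner () { m_del (m_ptr); }

  void swap (toy_owner &rhs) noexcept
  {
    using std::swap;          // makes std::swap a candidate
    swap (m_ptr, rhs.m_ptr);  // raw pointers: resolves to std::swap
    swap (m_del, rhs.m_del);  // deleter: the overload found by ADL
  }
};
```

Swapping the two members one at a time never asks the deleter to be move constructible or move assignable, which is the requirement the generic std::swap would have imposed.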
Jakub Jelinek [Tue, 4 Feb 2020 12:40:56 +0000 (13:40 +0100)]
libcpp: Diagnose __has_include outside of preprocessor directives [PR93545]
Add forgotten gcc/testsuite/c-c++-common/gomp/has-include-1.c.
2020-02-04 Jakub Jelinek <jakub@redhat.com>
* macro.c (builtin_has_include): Diagnose __has_include* use outside
of preprocessing directives.
* c-c++-common/cpp/has-include-1.c: New test.
* c-c++-common/cpp/has-include-next-1.c: New test.
* c-c++-common/gomp/has-include-1.c: New test.
Jakub Jelinek [Tue, 4 Feb 2020 12:39:59 +0000 (13:39 +0100)]
libcpp: Diagnose __has_include outside of preprocessor directives [PR93545]
The standard says http://eel.is/c++draft/cpp.cond#7.sentence-2 that
__has_include can't appear at arbitrary places in the source. As we have
not recognized __has_include* outside of preprocessing directives in the
past, accepting it there now would be a regression. The patch does still
allow it in #define if it is then used in preprocessing directives, I guess
that use isn't strictly valid either, but clang seems to accept it.
2020-02-04 Jakub Jelinek <jakub@redhat.com>
* macro.c (builtin_has_include): Diagnose __has_include* use outside
of preprocessing directives.
* c-c++-common/cpp/has-include-1.c: New test.
* c-c++-common/cpp/has-include-next-1.c: New test.
* c-c++-common/gomp/has-include-1.c: New test.
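A short sketch of the accepted form, with the rejected form kept as a comment so the block still compiles. The macro HAVE_OPTIONAL_HEADER and the variable name are hypothetical, and the exact diagnostic wording is not reproduced here.

```cpp
// Valid: __has_include appears inside preprocessing directives.
#if defined (__has_include)
#  if __has_include (<optional>)
#    define HAVE_OPTIONAL_HEADER 1
#  endif
#endif

#ifndef HAVE_OPTIONAL_HEADER
#  define HAVE_OPTIONAL_HEADER 0
#endif

// Expected to be diagnosed after this change, since __has_include is
// used outside of a preprocessing directive:
//   int x = __has_include (<optional>);

constexpr int have_optional_header = HAVE_OPTIONAL_HEADER;
```

Capturing the result in an object-like macro inside the directive is one way to make it available to ordinary code.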
Jakub Jelinek [Tue, 4 Feb 2020 12:38:16 +0000 (13:38 +0100)]
libcpp: Fix ICEs on __has_include syntax errors [PR93545]
Some of the following testcases ICE, because one of the cpp_get_token
calls in builtin_has_include reads the CPP_EOF token but the caller isn't
aware that CPP_EOF has been reached and will do another cpp_get_token.
get_token_no_padding is something that is used by the
has_attribute/has_builtin callbacks, which will first peek and will not
consume CPP_EOF (but will consume other tokens). The !SEEN_EOL ()
check on the other side doesn't work anymore and isn't really needed,
as we don't consume the EOF. The change adds one further error to the
pr88974.c testcase, if we wanted just one error per __has_include,
we could add some boolean whether we've emitted errors already and
only emit the first one we encounter (not implemented).
2020-02-04 Jakub Jelinek <jakub@redhat.com>
PR preprocessor/93545
* macro.c (cpp_get_token_no_padding): New function.
(builtin_has_include): Use it instead of cpp_get_token. Don't check
SEEN_EOL.
* c-c++-common/cpp/pr88974.c: Expect another diagnostic during error
recovery.
* c-c++-common/cpp/pr93545-1.c: New test.
* c-c++-common/cpp/pr93545-2.c: New test.
* c-c++-common/cpp/pr93545-3.c: New test.
* c-c++-common/cpp/pr93545-4.c: New test.
Iain Sandoe [Tue, 4 Feb 2020 09:36:30 +0000 (09:36 +0000)]
coroutines: Prevent repeated error messages for missing promise.
If the user's coroutine return type omits the mandatory promise
type then we will currently restate that error each time we see
a coroutine keyword, which doesn't provide any new information.
This suppresses all but the first instance in each coroutine.
gcc/cp/ChangeLog:
2020-02-04 Iain Sandoe <iain@sandoe.co.uk>
* coroutines.cc (find_promise_type): Delete unused forward
declaration.
(struct coroutine_info): Add a bool for no promise type error.
(coro_promise_type_found_p): Only emit the error for a missing
promise once in each affected coroutine.
gcc/testsuite/ChangeLog:
2020-02-04 Iain Sandoe <iain@sandoe.co.uk>
* g++.dg/coroutines/coro-missing-promise.C: New test.
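The "report once per coroutine" behaviour amounts to a per-coroutine flag checked before emitting. The sketch below uses hypothetical names and a placeholder message; the real coroutine_info and diagnostic text differ.

```cpp
#include <string>
#include <vector>

// Hypothetical per-coroutine record; the real coroutine_info differs.
struct toy_coroutine_info
{
  bool missing_promise_reported = false;
};

// Record the diagnostic only the first time it is hit for this
// coroutine; later coroutine keywords in the same function are silent.
inline void
report_missing_promise (toy_coroutine_info &info,
                        std::vector<std::string> &diagnostics)
{
  if (info.missing_promise_reported)
    return;
  diagnostics.push_back ("placeholder: missing promise type");
  info.missing_promise_reported = true;
}
```

Keeping the flag in the per-coroutine record, rather than globally, means a second coroutine with the same mistake still gets its own error.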
Richard Biener [Fri, 31 Jan 2020 12:28:11 +0000 (13:28 +0100)]
tree-optimization/91123 - restore redundant store removal
Redundant store removal in FRE was restricted for correctness reasons.
The following extends correctness fixes required to memcpy/aggregate
copy translation. The main change is that we no longer insert
references rewritten to cover such aggregate copies into the hashtable
but the original one.
2020-02-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/91123
* tree-ssa-sccvn.c (vn_walk_cb_data::finish): New method.
(vn_walk_cb_data::last_vuse): New member.
(vn_walk_cb_data::saved_operands): Likewise.
(vn_walk_cb_data::~vn_walk_cb_data): Release saved_operands.
(vn_walk_cb_data::push_partial_def): Use finish.
(vn_reference_lookup_2): Update last_vuse and use finish if
we've saved operands.
(vn_reference_lookup_3): Use finish and update calls to
push_partial_defs everywhere. When translating through
memcpy or aggregate copies save off operands and alias-set.
(eliminate_dom_walker::eliminate_stmt): Restore VN_WALKREWRITE
operation for redundant store removal.
* gcc.dg/tree-ssa/ssa-fre-85.c: New testcase.
Richard Biener [Tue, 4 Feb 2020 09:03:03 +0000 (10:03 +0100)]
tree-optimization/92819 restrict new vector CTOR canonicalization
The PR shows that code generation ends up pessimized by the new
canonicalization rules that end up nailing do-not-care elements
to specific values making it hard to generate good code later.
The temporary solution is to avoid this for the cases we also
obviously know the canonicalization will create more GIMPLE stmts than
before.
2020-02-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/92819
* tree-ssa-forwprop.c (simplify_vector_constructor): Avoid
generating more stmts than before.
* gcc.target/i386/pr92819.c: New testcase.
* gcc.target/i386/pr92803.c: Adjust.
Martin Liska [Tue, 4 Feb 2020 08:23:22 +0000 (09:23 +0100)]
Fix release checking build of ARM.
* config/arm/arm.c (arm_gen_far_branch): Move the function
outside of selftests.
Ian Lance Taylor [Mon, 3 Feb 2020 20:29:45 +0000 (12:29 -0800)]
syscall: fix riscv64 GNU/Linux build
Make syscall_linux_riscv64.go, new in the 1.14beta1 release, look like
the other syscall_linux_GOARCH.go files.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/217577
Ian Lance Taylor [Tue, 4 Feb 2020 00:44:33 +0000 (16:44 -0800)]
libbacktrace: always pass -g when compiling test code
This approach required adding a few casts to ztest.c, as it is now
compiled with -Wall.
Fixes PR libbacktrace/90636
GCC Administrator [Tue, 4 Feb 2020 00:16:49 +0000 (00:16 +0000)]
Daily bump.
Michael Meissner [Mon, 3 Feb 2020 23:25:07 +0000 (18:25 -0500)]
Optimize vec_extract of vectors in memory with a PC-relative address.
2020-02-03 Michael Meissner <meissner@linux.ibm.com>
* config/rs6000/rs6000.c (adjust_vec_address_pcrel): New helper
function to adjust PC-relative vector addresses.
(rs6000_adjust_vec_address): Call adjust_vec_address_pcrel to
handle vectors with PC-relative addresses.
Michael Meissner [Mon, 3 Feb 2020 23:22:18 +0000 (18:22 -0500)]
Rewrite convoluted code to avoid adding r0.
2020-02-03 Michael Meissner <meissner@linux.ibm.com>
* config/rs6000/rs6000.c (reg_to_non_prefixed): Add forward
reference.
(hard_reg_and_mode_to_addr_mask): Delete.
(rs6000_adjust_vec_address): If the original vector address
was REG+REG or REG+OFFSET and the element is not zero, do the add
of the elements in the original address before adding the offset
for the vector element. Use address_to_insn_form to validate the
address using the register being loaded, rather than guessing
whether the address is a DS-FORM or DQ-FORM address.
Michael Meissner [Mon, 3 Feb 2020 22:57:57 +0000 (17:57 -0500)]
Adjust how variable vector extraction is done.
2020-02-03 Michael Meissner <meissner@linux.ibm.com>
* config/rs6000/rs6000.c (get_vector_offset): New helper function
to calculate the offset in memory from the start of a vector of a
particular element. Add code to keep the element number in
bounds if the element number is variable.
(rs6000_adjust_vec_address): Move calculation of offset of the
vector element to get_vector_offset.
(rs6000_split_vec_extract_var): Do not do the initial AND of
element here, move the code to get_vector_offset.
Jason Merrill [Mon, 3 Feb 2020 21:03:45 +0000 (16:03 -0500)]
c++: Fix constexpr vs. reference parameter.
[expr.const] specifically rules out mentioning a reference even if its
address is never used, because it implies indirection that is similarly
non-constant for a pointer variable.
PR c++/66477
* constexpr.c (cxx_eval_constant_expression) [PARM_DECL]: Don't
defer loading the value of a reference.
Jason Merrill [Mon, 3 Feb 2020 16:11:55 +0000 (11:11 -0500)]
c++: Allow parm of empty class type in constexpr.
Since copying a class object is defined in terms of the copy constructor,
copying an empty class is OK even if it would otherwise not be usable in a
constant expression. Relatedly, using a parameter as an lvalue is no more
problematic than a local variable, and calling a member function uses the
object as an lvalue.
PR c++/91953
* constexpr.c (potential_constant_expression_1) [PARM_DECL]: Allow
empty class type.
[COMPONENT_REF]: A member function reference doesn't use the object
as an rvalue.
Michael Meissner [Mon, 3 Feb 2020 20:50:39 +0000 (15:50 -0500)]
Add some gcc_asserts for vector extract processing.
2020-02-03 Michael Meissner <meissner@linux.ibm.com>
* config/rs6000/rs6000.c (rs6000_adjust_vec_address): Add some
gcc_asserts.
Iain Sandoe [Mon, 3 Feb 2020 19:15:31 +0000 (19:15 +0000)]
coroutines: Fix ICE on invalid (PR93458).
Since coroutine-ness is discovered lazily, we encounter the diagnostics
during each keyword parse. We were not gracefully handling the case where
user code failed to include fundamental information (e.g. the traits).
Once we've emitted an error for this kind of failure, we suppress
additional copies (otherwise the same thing would be reported for every
coroutine keyword seen).
gcc/cp/ChangeLog:
2020-02-03 Iain Sandoe <iain@sandoe.co.uk>
* coroutines.cc (struct coroutine_info): Add a bool flag to note
that we emitted an error for a bad function return type.
(get_coroutine_info): Tolerate an unset info table in case of
missing traits.
(find_coro_traits_template_decl): In case of error or if we didn't
find a type template, note we emitted the error and suppress
duplicates.
(find_coro_handle_template_decl): Likewise.
(instantiate_coro_traits): Only check for error_mark_node in the
return from lookup_qualified_name.
(coro_promise_type_found_p): Reorder initialization so that we check
for the traits and their usability before allocation of the info
table. Check for a suitable return type and emit a diagnostic
here instead of relying on the lookup machinery. This allows the
error to have a better location, and means we can suppress multiple
copies.
(coro_function_valid_p): Re-check for a valid promise (and thus the
traits) before proceeding. Tolerate missing info as a fatal error.
gcc/testsuite/ChangeLog:
2020-02-03 Iain Sandoe <iain@sandoe.co.uk>
* g++.dg/coroutines/pr93458-1-missing-traits.C: New test.
* g++.dg/coroutines/pr93458-2-bad-traits.C: New test.
* g++.dg/coroutines/pr93458-3-missing-handle.C: New test.
* g++.dg/coroutines/pr93458-4-bad-coro-handle.C: New test.
* g++.dg/coroutines/pr93458-5-bad-coro-type.C: New test.
David Malcolm [Thu, 30 Jan 2020 20:23:40 +0000 (15:23 -0500)]
analyzer: avoid use of fold_build2
Various places in the analyzer use fold_build2, test the result, then
discard it. It's more efficient to use fold_binary, which avoids
building and GC-ing a redundant tree for the cases where folding fails.
gcc/analyzer/ChangeLog:
* constraint-manager.cc (range::constrained_to_single_element):
Replace fold_build2 with fold_binary. Remove unnecessary newline.
(constraint_manager::get_or_add_equiv_class): Replace fold_build2
with fold_binary in two places, and remove out-of-date comment.
(constraint_manager::eval_condition): Replace fold_build2 with
fold_binary.
* region-model.cc (constant_svalue::eval_condition): Likewise.
(region_model::on_assignment): Likewise.
David Malcolm [Mon, 3 Feb 2020 16:23:09 +0000 (11:23 -0500)]
analyzer: detect zero-assignment in phis (PR 93544)
PR analyzer/93544 reports an ICE when attempting to report a double-free
within diagnostic_manager::prune_for_sm_diagnostic, in which the
variable of interest has become an INTEGER_CST. Additionally, it picks
a nonsensical path through the function in which the pointer being
double-freed is known to be NULL, which we shouldn't complain about.
The dump shows that it picks the INTEGER_CST when updating var at a phi
node:
considering event 4, with var: ‘iftmp.0_2’, state: ‘start’
updating from ‘iftmp.0_2’ to ‘0B’ based on phi node
phi: iftmp.0_2 = PHI <iftmp.0_6(3), 0B(2)>
considering event 3, with var: ‘0B’, state: ‘start’
and that it has picked the shortest path through the exploded graph,
and on this path the pointer has been assigned NULL.
The root cause is that the state machine's on_stmt isn't called for phi
nodes (and wouldn't make much sense, as we wouldn't know which arg to
choose). malloc_state_machine::on_stmt "sees" a GIMPLE_ASSIGN to NULL
and handles it by transitioning the lhs to the "null" state, but never
"sees" GIMPLE_PHI nodes.
This patch fixes the ICE by wiring up phi-handling with state machines,
so that state machines have an on_phi vfunc. It updates the only current
user of "is_zero_assignment" (the malloc sm) to implement equivalent
logic for phi nodes. Doing so ensures that the pointer is in a separate
sm-state for the NULL vs non-NULL cases, and so gets separate exploded
nodes, and hence the path-finding logic chooses the correct path, and
the correct non-NULL phi argument.
The patch also adds some bulletproofing to prune_for_sm_diagnostic to
avoid crashing in the event of a bad path.
gcc/analyzer/ChangeLog:
PR analyzer/93544
* diagnostic-manager.cc
(diagnostic_manager::prune_for_sm_diagnostic): Bulletproof
against bad choices due to bad paths.
* engine.cc (impl_region_model_context::on_phi): New.
* exploded-graph.h (impl_region_model_context::on_phi): New decl.
* region-model.cc (region_model::on_longjmp): Likewise.
(region_model::handle_phi): Add phi param. Call the ctxt's on_phi
vfunc.
(region_model::update_for_phis): Pass phi to handle_phi.
* region-model.h (region_model::handle_phi): Add phi param.
(region_model_context::on_phi): New vfunc.
(test_region_model_context::on_phi): New.
* sm-malloc.cc (malloc_state_machine::on_phi): New.
(malloc_state_machine::on_zero_assignment): New.
* sm.h (state_machine::on_phi): New vfunc.
gcc/testsuite/ChangeLog:
PR analyzer/93544
* gcc.dg/analyzer/torture/pr93544.c: New test.
David Malcolm [Mon, 3 Feb 2020 14:55:26 +0000 (09:55 -0500)]
analyzer: show BBs in .dot dumps
gcc/analyzer/ChangeLog:
* engine.cc (supernode_cluster::dump_dot): Show BB index as
well as SN index.
* supergraph.cc (supernode::dump_dot): Likewise.
David Malcolm [Mon, 3 Feb 2020 13:30:54 +0000 (08:30 -0500)]
analyzer: fix ICE merging models containing label pointers (PR 93546)
PR analyzer/93546 reports an ICE within region_model::add_region_for_type
when merging two region_models each containing a label pointer. The
two labels are stored as pointers to symbolic_regions, but these regions
were created with NULL type, leading to an assertion failure when a
merged copy is created.
The labels themselves have void (but not NULL) type.
This patch updates make_region_for_type to use the type of the decl when
creating such regions, rather than implicitly setting the region's type
to NULL, fixing the ICE.
gcc/analyzer/ChangeLog:
PR analyzer/93546
* region-model.cc (region_model::on_call_pre): Update for new
param of symbolic_region ctor.
(region_model::deref_rvalue): Likewise.
(region_model::add_new_malloc_region): Likewise.
(make_region_for_type): Likewise, preserving type.
* region-model.h (symbolic_region::symbolic_region): Add "type"
param and pass it to base class ctor.
gcc/testsuite/ChangeLog:
PR analyzer/93546
* gcc.dg/analyzer/pr93546.c: New test.
David Malcolm [Mon, 3 Feb 2020 11:34:20 +0000 (06:34 -0500)]
analyzer: fix ICE due to comparing int and real constants (PR 93547)
gcc/analyzer/ChangeLog:
PR analyzer/93547
* constraint-manager.cc
(constraint_manager::get_or_add_equiv_class): Ensure types are
compatible before comparing constants.
gcc/testsuite/ChangeLog:
PR analyzer/93547
* gcc.dg/analyzer/pr93547.c: New test.
Segher Boessenkool [Fri, 31 Jan 2020 00:07:53 +0000 (00:07 +0000)]
rs6000: Update constraint documentation
This un-documents constraints that cannot (or should not) be used in
inline assembler. It also improves markup, and presentation in general.
More work is needed, but gradual improvement is easier to do.
* config/rs6000/constraints.md: Improve documentation.
/
* doc/md.texi (PowerPC and IBM RS6000): Improve documentation.
Richard Earnshaw [Mon, 3 Feb 2020 17:40:55 +0000 (17:40 +0000)]
arm: Use move-if-change for updating regenerated files [PR93548]
The t-arm make fragment currently uses 'mv' to update some files that
are automatically regenerated, but this causes problems on read-only
filesystems if the date stamps are incorrect and the files have not
really changed. So use move-if-change instead.
PR target/93548
* config/arm/t-arm: ($(srcdir)/config/arm/arm-tune.md,
$(srcdir)/config/arm/arm-tables.opt): Use move-if-change.
Andrew Stubbs [Mon, 3 Feb 2020 15:02:22 +0000 (15:02 +0000)]
Remove gfx801 "carrizo" support
2020-02-03 Andrew Stubbs <ams@codesourcery.com>
gcc/
* config.gcc: Remove "carrizo" support.
* config/gcn/gcn-opts.h (processor_type): Likewise.
* config/gcn/gcn.c (gcn_omp_device_kind_arch_isa): Likewise.
* config/gcn/gcn.opt (gpu_type): Likewise.
* config/gcn/t-omp-device: Likewise.
libgomp/
* plugin/plugin-gcn.c (EF_AMDGPU_MACH_AMDGCN_GFX801): Remove.
(gcn_gfx801_s): Remove.
(isa_hsa_name): Remove gfx801.
(isa_gcc_name): Remove gfx801/carrizo.
(isa_code): Remove gfx801.
Jason Merrill [Sat, 1 Feb 2020 02:59:48 +0000 (21:59 -0500)]
c++: Fix cast to pointer to VLA.
The C front-end fixed this issue in r257620 by adding a DECL_EXPR from
grokdeclarator. We don't have an easy way to do that in the C++ front-end,
but it works fine to create and prepend a DECL_EXPR when we are genericizing
the NOP_EXPR for the cast.
The C patch wraps the DECL_EXPR in a BIND_EXPR, but that seems unnecessary
in C++; this is just a hook to run gimplify_type_sizes, we aren't actually
declaring anything that we need to worry about scoping for.
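For readers unfamiliar with the construct, here is a minimal C sketch of a cast to pointer to VLA, the kind of expression this fix concerns. It is not the PR testcase; the function sum_row and its parameters are made up for illustration, and it is written in C (where the construct is standard) rather than with the GNU C++ extension.

```c
#include <assert.h>

/* Sum one row of a matrix stored in a flat buffer.  The cast below is a
   cast to "pointer to array of COLS ints", i.e. a pointer to a VLA type,
   so the row stride is only known at run time and the size of the array
   type has to be evaluated before the cast result is used.  */
static int
sum_row (void *buf, int cols, int row)
{
  assert (cols > 0 && row >= 0);
  int (*m)[cols] = (int (*)[cols]) buf;
  int sum = 0;
  for (int i = 0; i < cols; i++)
    sum += m[row][i];
  return sum;
}
```

Indexing m[row] depends on the run-time size of the VLA type, which is why the type sizes need to be gimplified before the cast is used.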
PR c++/88256
* cp-gimplify.c (predeclare_vla): New.
(cp_genericize_r) [NOP_EXPR]: Call it.
Stam Markianos-Wright [Mon, 3 Feb 2020 10:25:46 +0000 (10:25 +0000)]
This patch is for PR target/91816
This is a patch for an issue where the compiler was generating a conditional
branch in Thumb2, which was too far for b{cond} to handle.
This was originally reported at binutils:
https://sourceware.org/bugzilla/show_bug.cgi?id=24991
And then raised for GCC:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91816
As can be seen here:
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489c/Cihfddaf.html
the range of a 32-bit Thumb B{cond} is +/-1MB.
This is now checked for in arm.md and an unconditional branch is generated if
the jump would be greater than 1MB.
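The range check can be pictured with the following C sketch. It is only a model of the decision, not the arm.md logic: the names cond_branch_reaches and choose_branch are hypothetical, the limit is simply the +/-1MB figure quoted above (the exact encodable bounds are not modelled), and offsets are assumed to be even because Thumb instructions are halfword aligned.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed limit, taken from the +/-1MB figure quoted in the commit
   message; the precise encoding limits are deliberately not modelled.  */
#define COND_BRANCH_RANGE (1024L * 1024L)

enum branch_kind { BRANCH_COND, BRANCH_FAR };

/* Hypothetical predicate: true if a displacement of OFFSET bytes is
   small enough for a conditional branch.  */
static bool
cond_branch_reaches (long offset)
{
  assert (offset % 2 == 0);	/* Halfword-aligned targets only.  */
  return offset > -COND_BRANCH_RANGE && offset < COND_BRANCH_RANGE;
}

/* Pick the conditional form when it reaches, otherwise fall back to a
   sequence built around an unconditional branch.  */
static enum branch_kind
choose_branch (long offset)
{
  return cond_branch_reaches (offset) ? BRANCH_COND : BRANCH_FAR;
}
```

A target 4KB away keeps the conditional form, while one 2MB away falls back to the far-branch sequence.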
gcc/ChangeLog
2020-02-03 Stam Markianos-Wright <stam.markianos-wright@arm.com>
PR target/91816
* config/arm/arm-protos.h: New function arm_gen_far_branch prototype.
* config/arm/arm.c (arm_gen_far_branch): New function
arm_gen_far_branch.
* config/arm/arm.md: Update b<cond> for Thumb2 range checks.
gcc/testsuite/ChangeLog
2020-02-03 Stam Markianos-Wright <stam.markianos-wright@arm.com>
PR target/91816
* gcc.target/arm/pr91816.c: New test.
Tobias Burnus [Mon, 3 Feb 2020 09:10:37 +0000 (10:10 +0100)]
[OpenACC] bump version for 2.6 plus libgomp.texi update
2020-02-03 Julian Brown <julian@codesourcery.com>
Tobias Burnus <tobias@codesourcery.com>
gcc/c-family/
* c-cppbuiltin.c (c_cpp_builtins): Update _OPENACC define to 201711.
gcc/
* doc/invoke.texi: Update mention of OpenACC version to 2.6.
gcc/fortran/
* cpp.c (cpp_define_builtins): Update _OPENACC define to 201711.
* intrinsic.texi: Update mentions of OpenACC version to 2.6.
* gfortran.texi: Likewise. Remove experimental disclaimer for OpenACC.
* invoke.texi: Remove experimental disclaimer for OpenACC.
gcc/testsuite/
* c-c++-common/cpp/openacc-define-3.c: Update expected value for
_OPENACC define.
* gfortran.dg/openacc-define-3.f90: Likewise.
libgomp/
* libgomp.texi (OpenACC Runtime Library Routines): Document *_async
and *_finalize variants; document acc_attach and acc_detach; update
references from OpenACC 2.0 to 2.6.
* openacc.f90 (openacc_version): Update to 201711.
* openacc_lib.h (openacc_version): Update to 201711.
* testsuite/libgomp.oacc-fortran/openacc_version-1.f: Update expected
openacc_version to 201711.
* testsuite/libgomp.oacc-fortran/openacc_version-2.f90: Likewise.
Tobias Burnus [Mon, 3 Feb 2020 09:02:47 +0000 (10:02 +0100)]
[OpenMP] Add missing parameters to omp_lib documentation (PR fortran/93541)
PR fortran/93541
* intrinsic.texi (OpenMP Modules OMP_LIB and OMP_LIB_KINDS):
Add undocumented parameters from omp_lib.f90.in.
Tobias Burnus [Mon, 3 Feb 2020 09:00:07 +0000 (10:00 +0100)]
[Fortran] Fix too strict associate check (PR93427)
PR fortran/93427
* resolve.c (resolve_assoc_var): Remove too strict check.
* gfortran.dg/associate_51.f90: Update test case.
PR fortran/93427
* gfortran.dg/associate_52.f90: New.
Jakub Jelinek [Mon, 3 Feb 2020 08:00:19 +0000 (09:00 +0100)]
s390x: Fix popcounthi2_z196 expander [PR93533]
The following testcase started to ICE when .POPCOUNT matching was added
to match.pd; we had __builtin_popcount* before, but nothing used the
popcounthi2 expander.
The problem is that the popcounthi2_z196 expander doesn't emit valid RTL:
error: unrecognizable insn:
(insn 138 137 139 27 (set (reg:SI 190)
(ashift:SI (reg:HI 95 [ _105 ])
(const_int 8 [0x8]))) -1
(nil))
during RTL pass: vregs
The following patch is an attempt to fix that. I've also tried to
simplify it slightly: it makes no sense to me to perform
(x + (x << 8)) >> 8 when we need to either zero extend or mask the result
at the end anyway, so that bits above HImode don't affect it, when we
can instead do
(x + (x >> 8)) & 0xff (or zero extension).
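To make the arithmetic concrete, here is a small C model of the simplified sequence. It assumes, as the ChangeLog entry below suggests, that the popcnt result holds a separate bit count in each byte; bytewise_popcount16 is a hypothetical stand-in for that instruction, not s390 code.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a popcnt that yields per-byte counts: each byte of the
   result holds the number of set bits in the corresponding input byte.  */
static uint16_t
bytewise_popcount16 (uint16_t x)
{
  uint16_t r = 0;
  for (int byte = 0; byte < 2; byte++)
    {
      unsigned b = (x >> (8 * byte)) & 0xff;
      unsigned n = 0;
      for (; b != 0; b >>= 1)
	n += b & 1;
      r |= (uint16_t) (n << (8 * byte));
    }
  return r;
}

/* Sum the lowest and second lowest bytes: (c + (c >> 8)) & 0xff.  */
static unsigned
popcount16 (uint16_t x)
{
  unsigned c = bytewise_popcount16 (x);
  unsigned sum = (c + (c >> 8)) & 0xff;
  assert (sum <= 16);
  return sum;
}
```

For x = 0xffff the per-byte result is 0x0808, and (0x0808 + 0x08) & 0xff gives 16; the final mask discards the leftover count in the upper byte.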
2020-02-03 Jakub Jelinek <jakub@redhat.com>
PR target/93533
* config/s390/s390.md (popcounthi2_z196): Fix up expander to emit
valid RTL to sum up the lowest and second lowest bytes of the popcnt
result.
* gcc.c-torture/compile/pr93533.c: New test.
* gcc.target/s390/pr93533.c: New test.
JunMa [Mon, 20 Jan 2020 09:46:32 +0000 (17:46 +0800)]
coroutines: Bind label_decl of original function to actor function
gcc/cp
* coroutines.cc (transform_await_wrapper): Set actor function as
new context of label_decl.
(build_actor_fn): Fill new field of await_xform_data.
gcc/testsuite
* g++.dg/coroutines/co-await-04-control-flow.C: Add label.
GCC Administrator [Mon, 3 Feb 2020 00:16:55 +0000 (00:16 +0000)]
Daily bump.
Marek Polacek [Sat, 1 Feb 2020 00:28:10 +0000 (19:28 -0500)]
c++: Fix ICE on invalid alignas in a template [PR93530]
This fixes an ICE taking place in cp_default_conversion because we got
a SCOPE_REF that doesn't have a type and so checking
INTEGRAL_OR_UNSCOPED_ENUMERATION_TYPE_P (TREE_TYPE (exp)) will crash.
This happens since Joseph's recent change in decl_attributes whereby
we don't skip C++11 attributes on types.
[dcl.align] is clear that alignas applied to a function is ill-formed.
That should be fixed, and we have PR90847 for that. But I think a more
appropriate fix at this stage would be the following: in a template we
want to splice dependent attributes and save them for later, and by
doing so avoid this crash.
PR c++/93530 - ICE on invalid alignas in a template.
* decl.c (grokdeclarator): Call cplus_decl_attributes instead of
decl_attributes.
* g++.dg/cpp0x/alignas18.C: New test.
Iain Sandoe [Sun, 2 Feb 2020 19:53:24 +0000 (19:53 +0000)]
testsuite,Darwin,PPC: Adjust darwin-abi-12.c for common section use.
This test explicitly tests for code generation that expects a
common section.
gcc/testsuite/ChangeLog:
2020-02-02 Iain Sandoe <iain@sandoe.co.uk>
* gcc.target/powerpc/darwin-abi-12.c: Add '-fcommon' to the
options.
Vladimir N. Makarov [Sun, 2 Feb 2020 16:23:25 +0000 (11:23 -0500)]
One more fix for PR 91333 - suboptimal register allocation for inline asm
2020-02-02 Vladimir Makarov <vmakarov@redhat.com>
PR rtl-optimization/91333
* ira-color.c (struct allocno_color_data): Add member
hard_reg_prefs.
(init_allocno_threads): Set the member up.
(bucket_allocno_compare_func): Add comparison of hard reg
prefs.
2020-02-02 Vladimir Makarov <vmakarov@redhat.com>
PR rtl-optimization/91333
* gcc.target/i386/pr91333.c: Add vmovsd to regexp. Set up count
to 3.
GCC Administrator [Sun, 2 Feb 2020 00:16:33 +0000 (00:16 +0000)]
Daily bump.
Jakub Jelinek [Sat, 1 Feb 2020 09:02:20 +0000 (10:02 +0100)]
fortran: Fix up TYPE_ARG_TYPES of procs with scalar VALUE optional args [PR92305]
The following patch fixes
-FAIL: libgomp.fortran/use_device_addr-1.f90 -O0 execution test
-FAIL: libgomp.fortran/use_device_addr-2.f90 -O0 execution test
that have been FAILing for several months on powerpc64le-linux.
The problem is in the Fortran FE, which adds the artificial arguments
for scalar VALUE OPTIONAL dummy args only to DECL_ARGUMENTS where the
current function can see them, but not to TYPE_ARG_TYPES; if those functions
aren't varargs, this confuses calls.c to pass the remaining arguments
(which aren't named (== not covered by TYPE_ARG_TYPES) and aren't varargs
either) in a different spot from what the callee (which has proper
DECL_ARGUMENTS for all args) expects. For the artificial length arguments
for character dummy args we already put them in both DECL_ARGUMENTS and
TYPE_ARG_TYPES.
2020-02-01 Jakub Jelinek <jakub@redhat.com>
PR fortran/92305
* trans-types.c (gfc_get_function_type): Also push boolean_type_node
types for non-character scalar VALUE optional dummy arguments.
* trans-decl.c (create_function_arglist): Skip those in
hidden_typelist. Formatting fix.
Sandra Loosemore [Sat, 1 Feb 2020 00:46:50 +0000 (16:46 -0800)]
nios2: Support for GOT-relative DW_EH_PE_datarel encoding.
On nios2-linux-gnu, there has been a long-standing bug in C++ exception
handling that sometimes resulted in link errors like
../nios2-linux-gnu/bin/ld: FDE encoding in /tmp/cccfpQ2l.o(.eh_frame) prevents .eh_frame_hdr table being created
when building some shared libraries or PIE executables. The root of
the problem is that GCC was incorrectly emitting an absolute encoding
in EH tables for PIC. This patch changes it to use either
DW_EH_PE_indirect (for global) or DW_EH_PE_datarel (for local), and
fixes libgcc so it can find the address of the GOT as the base address
for DW_EH_PE_datarel.
Complicating matters somewhat, GAS was missing support for
%gotoff(symbol) relocation syntax. I have just pushed a fix for that,
but I've added a configure check to test for presence of the binutils
support and fall back to the current absolute encoding (which works
most of the time) if it is not available. Once the fix makes it into
an official binutils release it might be appropriate to make this
error out instead.
Since this is a wrong-code bug and affects only nios2 target, I think
this is appropriate for Stage 4. I regression-tested on both
nios2-linux-gnu and nios2-elf, with and without the binutils support
present, before committing this.
2020-01-31 Sandra Loosemore <sandra@codesourcery.com>
gcc/
* configure.ac [nios2-*-*]: Check HAVE_AS_NIOS2_GOTOFF_RELOCATION.
* config.in: Regenerated.
* configure: Regenerated.
* config/nios2/nios2.h (ASM_PREFERRED_EH_DATA_FORMAT): Fix handling
for PIC when HAVE_AS_NIOS2_GOTOFF_RELOCATION.
(ASM_MAYBE_OUTPUT_ENCODED_ADDR_RTX): New.
gcc/testsuite/
* g++.target/nios2/hello-pie.C: New.
* g++.target/nios2/nios2.exp: New.
libgcc/
* config.host [nios2-*-linux*] (tmake_file, tm_file): Adjust.
* config/nios2-elf-lib.h: New.
* unwind-dw2-fde-dip.c (_Unwind_IteratePhdrCallback): Use existing
code for finding GOT base for nios2.
Andrew Burgess [Thu, 30 Jan 2020 12:18:13 +0000 (12:18 +0000)]
Fixes after recent configure changes relating to static libraries
This commit:
commit e7c26e04b2dd6266d62d5a5825ff7eb44d1cf14e (tjteru/master)
Date: Wed Jan 22 14:54:26 2020 +0000
gcc: Add new configure options to allow static libraries to be selected
contains a couple of issues. First, I failed to correctly regenerate
all of the configure files that should have been regenerated. Second,
there was a mistake in lib-link.m4: one of the conditions didn't use
pure sh syntax. I wrote this:
if x$lib_type = xauto || x$lib_type = xshared; then
When I should have written this:
if test "x$lib_type" = "xauto" || test "x$lib_type" = "xshared"; then
These issues were raised on the mailing list in these messages:
https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01827.html
https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01921.html
config/ChangeLog:
* lib-link.m4 (AC_LIB_LINKFLAGS_BODY): Update shell syntax.
gcc/ChangeLog:
* configure: Regenerate.
intl/ChangeLog:
* configure: Regenerate.
libcpp/ChangeLog:
* configure: Regenerate.
libstdc++-v3/ChangeLog:
* configure: Regenerate.
GCC Administrator [Sat, 1 Feb 2020 00:16:32 +0000 (00:16 +0000)]
Daily bump.
Jason Merrill [Fri, 31 Jan 2020 22:10:30 +0000 (17:10 -0500)]
c++: Fix sizeof VLA lambda capture.
sizeof a VLA type is not a constant in C or the GNU C++ extension, so we
need to capture the VLA even in unevaluated context. For PR60855 we stopped
looking through a previous capture, but we also need to capture the first
time the variable is mentioned.
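The premise that sizeof of a VLA is not a constant can be seen in plain C. The sketch below only illustrates that premise, not the lambda-capture fix itself; vla_bytes is a made-up name.

```c
#include <assert.h>
#include <stddef.h>

/* Return the size in bytes of a VLA with N int elements.  Because the
   array bound is a run-time value, sizeof has to be evaluated at run
   time too; it is not folded to a constant.  */
static size_t
vla_bytes (int n)
{
  assert (n > 0);
  int a[n];
  return sizeof a;
}
```

Since the result depends on n, an enclosing lambda that mentions sizeof of such a variable still needs access to it, even though sizeof is otherwise an unevaluated context.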
PR c++/86216
* semantics.c (process_outer_var_ref): Capture VLAs even in
unevaluated context.
Jason Merrill [Fri, 31 Jan 2020 05:21:44 +0000 (00:21 -0500)]
c++: Reduce memory consumption for arrays of non-aggregate type.
The remaining low-hanging fruit for improvement on memory consumption in the
14179 testcase was the duplication of the CONSTRUCTOR for the array by
reshape_init. This patch changes reshape_init to reuse a single constructor
for an array of non-aggregate type such as the one in the testcase.
PR c++/14179
* decl.c (reshape_init_array_1): Reuse a single CONSTRUCTOR with
non-aggregate elements.
(reshape_init_array): Add first_initializer_p parm.
(reshape_init_r): Change first_initializer_p from bool to tree.
(reshape_init): Pass init to it.
Jason Merrill [Thu, 30 Jan 2020 23:49:29 +0000 (18:49 -0500)]
c++: Reduce memory consumption for large static arrays.
PR14179 and the C counterpart PR12245 are about memory consumption of very
large file-scope arrays. Recently, location wrappers increased memory
consumption significantly: in an array of integer constants, each one will
have a location wrapper, which added up to over 500MB in the 14179
testcase. For this kind of testcase tracking these locations isn't worth
the cost, so this patch turns the wrappers off after 256 elements; any array
that size or larger isn't likely to be interested in the location of
individual integer constants.
PR c++/14179
* parser.c (cp_parser_initializer_list): Suppress location wrappers
after 256 elements.
David Malcolm [Fri, 31 Jan 2020 19:05:17 +0000 (14:05 -0500)]
analyzer: fix ICE with 'const void *' (PR 93457)
gcc/analyzer/ChangeLog:
PR analyzer/93457
* region-model.cc (make_region_for_type): Use VOID_TYPE_P rather
than checking against void_type_node.
gcc/testsuite/ChangeLog:
PR analyzer/93457
* gcc.dg/analyzer/pr93457.c: New test.
David Malcolm [Wed, 22 Jan 2020 18:08:26 +0000 (13:08 -0500)]
analyzer: fix ICE handling void-type (PR 93373)
gcc/analyzer/ChangeLog:
PR analyzer/93373
* region-model.cc (ASSERT_COMPAT_TYPES): Convert to...
(assert_compat_types): ...this, and bail when either type is NULL,
or when VOID_TYPE_P (dst_type).
(region_model::get_lvalue): Update for above conversion.
(region_model::get_rvalue): Likewise.
gcc/testsuite/ChangeLog:
PR analyzer/93373
* gcc.dg/analyzer/torture/pr93373.c: New test.
Vladimir N. Makarov [Fri, 31 Jan 2020 19:26:26 +0000 (14:26 -0500)]
Fix for PR 91333 - suboptimal register allocation for inline asm
2020-01-31 Vladimir Makarov <vmakarov@redhat.com>
PR rtl-optimization/91333
* ira-color.c (bucket_allocno_compare_func): Move conflict hard
reg preferences comparison up.
2020-01-31 Vladimir Makarov <vmakarov@redhat.com>
PR rtl-optimization/91333
* gcc.target/i386/pr91333.c: New.
David Malcolm [Fri, 31 Jan 2020 17:05:03 +0000 (12:05 -0500)]
analyzer: fix ICE getting void return value (PR 93379)
PR analyzer/93379 reports an ICE within
region_model::update_for_return_superedge when writing the
returned svalue_id to the lhs of the call_stmt.
The root cause is that this analyzer code assumed that for any call
with a non-NULL gimple_call_lhs, the called fndecl would have non-void
return type, and thus that a non-null svalue_id would be returned from
region_model::pop_frame. This isn't the case e.g. for a call with
conflicting types where the callee returns void but the caller assumes
int.
This patch fixes the ICE by moving the check for null result so that
it also guards setting the lhs.
gcc/analyzer/ChangeLog:
PR analyzer/93379
* region-model.cc (region_model::update_for_return_superedge):
Move check for null result so that it also guards setting the
lhs.
gcc/testsuite/ChangeLog:
PR analyzer/93379
* gcc.dg/analyzer/torture/pr93379-2.c: New test.
* gcc.dg/analyzer/torture/pr93379.c: New test.
David Malcolm [Fri, 31 Jan 2020 14:20:38 +0000 (09:20 -0500)]
analyzer: fix ICE with pointers between stack frames (PR 93438)
PR analyzer/93438 reports an ICE when merging two region_models
in which an older stack frame has a local pointing to a local in
a more recent stack frame.
stack
older frame
int *: "ow" --+
|
newer frame |
int: "pk" <---+
The root cause is that the state-merging code assumes that all frame
regions in the merged model have already been created.
stack_region::can_merge_p iterates through the frames, creating
and populating each merged frame in turn, so when it attempts to
populate the older frame, it attempts to reference the newer frame in
the merged model, which doesn't exist yet.
This patch reworks stack_region::can_merge_p to use a two-pass approach
in which all frames in the merged model are created first, and then
are all populated, fixing the bug.
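The two-pass idea can be sketched in C with a toy model. This is not the analyzer's region_model code: struct src_frame, struct merged_frame and merge_stack are hypothetical names, and each frame is reduced to a single local that may point into another frame.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy input frame: TARGET is the index of the frame that this frame's
   local points into, or -1 if it points nowhere.  */
struct src_frame { int target; };

/* Toy merged frame: the same link, but as a real pointer.  */
struct merged_frame { struct merged_frame *target; };

/* Build N merged frames in OUT.  Returns 0 on success, -1 if out of
   memory.  Pass 1 creates every frame; pass 2 fills in the links, which
   is safe even when an older frame refers to a newer one because all
   frames already exist by then.  */
static int
merge_stack (const struct src_frame *src, struct merged_frame **out,
	     size_t n)
{
  for (size_t i = 0; i < n; i++)	/* Pass 1: create.  */
    {
      out[i] = calloc (1, sizeof **out);
      if (out[i] == NULL)
	{
	  while (i > 0)
	    free (out[--i]);
	  return -1;
	}
    }
  for (size_t i = 0; i < n; i++)	/* Pass 2: populate.  */
    if (src[i].target >= 0)
      {
	assert ((size_t) src[i].target < n);
	out[i]->target = out[src[i].target];
      }
  return 0;
}
```

A single pass that created and populated frame 0 before frame 1 existed would have had nothing to point at, which corresponds to the situation in the diagram above.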
gcc/analyzer/ChangeLog:
PR analyzer/93438
* region-model.cc (stack_region::can_merge_p): Split into a two
pass approach, creating all stack regions first, then populating
them.
(selftest::test_state_merging): Add test coverage for (a) the case
of self-merging a model in which a local in an older stack frame
points to a local in a more recent stack frame (which previously
would ICE), and (b) the case of self-merging a model in which a
local points to a global (which previously worked OK).
gcc/testsuite/ChangeLog:
PR analyzer/93438
* gcc.dg/analyzer/torture/pr93438.c: New test.
* gcc.dg/analyzer/torture/pr93438-2.c: New test.
Jakub Jelinek [Fri, 31 Jan 2020 18:35:11 +0000 (19:35 +0100)]
testsuite: Fix up pr91838.C test [PR91838]
The test FAILs on i686-linux with:
FAIL: g++.dg/pr91838.C (test for excess errors)
Excess errors:
/home/jakub/src/gcc/gcc/testsuite/g++.dg/pr91838.C:7:8: warning: MMX vector return without MMX enabled changes the ABI [-Wpsabi]
/home/jakub/src/gcc/gcc/testsuite/g++.dg/pr91838.C:7:3: warning: MMX vector argument without MMX enabled changes the ABI [-Wpsabi]
and on x86_64-linux with -m32 testing with failure to match the
expected pattern in there (or both with e.g. -m32/-mno-mmx/-mno-sse testing).
The test is also in the wrong directory and has a non-standard specification
that it requires c++11 or later.
2020-01-31 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/91838
* g++.dg/pr91838.C: Moved to ...
* g++.dg/opt/pr91838.C: ... here. Require c++11 target instead of
dg-skip-if for c++98. Pass -Wno-psabi -w to avoid psabi style
warnings on vector arg passing or return. Add -masm=att on i?86/x86_64.
Only check for pxor %xmm0, %xmm0 on lp64 i?86/x86_64.
Richard Sandiford [Thu, 30 Jan 2020 15:46:28 +0000 (15:46 +0000)]
aarch64: Add Armv8.6 SVE bfloat16 support
This patch adds support for the SVE intrinsics that map to Armv8.6
bfloat16 instructions. This means that svcvtnt is now a base SVE
function for one type suffix combination; the others are still
SVE2-specific.
This relies on a binutils fix:
https://sourceware.org/ml/binutils/2020-01/msg00450.html
so anyone testing older binutils 2.34 or binutils master sources will
need to upgrade to get clean test results. (At the time of writing,
no released version of binutils has this bug.)
2020-01-31 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/aarch64/aarch64.h (TARGET_SVE_BF16): New macro.
* config/aarch64/aarch64-sve-builtins-sve2.h (svcvtnt): Move to
aarch64-sve-builtins-base.h.
* config/aarch64/aarch64-sve-builtins-sve2.cc (svcvtnt): Move to
aarch64-sve-builtins-base.cc.
* config/aarch64/aarch64-sve-builtins-base.h (svbfdot, svbfdot_lane)
(svbfmlalb, svbfmlalb_lane, svbfmlalt, svbfmlalt_lane, svbfmmla)
(svcvtnt): Declare.
* config/aarch64/aarch64-sve-builtins-base.cc (svbfdot, svbfdot_lane)
(svbfmlalb, svbfmlalb_lane, svbfmlalt, svbfmlalt_lane, svbfmmla)
(svcvtnt): New functions.
* config/aarch64/aarch64-sve-builtins-base.def (svbfdot, svbfdot_lane)
(svbfmlalb, svbfmlalb_lane, svbfmlalt, svbfmlalt_lane, svbfmmla)
(svcvtnt): New functions.
(svcvt): Add a form that converts f32 to bf16.
* config/aarch64/aarch64-sve-builtins-shapes.h (ternary_bfloat)
(ternary_bfloat_lane, ternary_bfloat_lanex2, ternary_bfloat_opt_n):
Declare.
* config/aarch64/aarch64-sve-builtins-shapes.cc (parse_element_type):
Treat B as bfloat16_t.
(ternary_bfloat_lane_base): New class.
(ternary_bfloat_def): Likewise.
(ternary_bfloat): New shape.
(ternary_bfloat_lane_def): New class.
(ternary_bfloat_lane): New shape.
(ternary_bfloat_lanex2_def): New class.
(ternary_bfloat_lanex2): New shape.
(ternary_bfloat_opt_n_def): New class.
(ternary_bfloat_opt_n): New shape.
* config/aarch64/aarch64-sve-builtins.cc (TYPES_cvt_bfloat): New macro.
* config/aarch64/aarch64-sve.md (@aarch64_sve_<sve_fp_op>vnx4sf)
(@aarch64_sve_<sve_fp_op>_lanevnx4sf): New patterns.
(@aarch64_sve_<optab>_trunc<VNx4SF_ONLY:mode><VNx8BF_ONLY:mode>)
(@cond_<optab>_trunc<VNx4SF_ONLY:mode><VNx8BF_ONLY:mode>): Likewise.
(*cond_<optab>_trunc<VNx4SF_ONLY:mode><VNx8BF_ONLY:mode>): Likewise.
(@aarch64_sve_cvtnt<VNx8BF_ONLY:mode>): Likewise.
* config/aarch64/aarch64-sve2.md (@aarch64_sve2_cvtnt<mode>): Key
the pattern off the narrow mode instead of the wider one.
* config/aarch64/iterators.md (VNx8BF_ONLY): New mode iterator.
(UNSPEC_BFMLALB, UNSPEC_BFMLALT, UNSPEC_BFMMLA): New unspecs.
(sve_fp_op): Handle them.
(SVE_BFLOAT_TERNARY_LONG): New int iterator.
(SVE_BFLOAT_TERNARY_LONG_LANE): Likewise.
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_aarch64_asm_bf16_ok):
New proc.
* gcc.target/aarch64/sve/acle/asm/bfdot_f32.c: New test.
* gcc.target/aarch64/sve/acle/asm/bfdot_lane_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/bfmlalb_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/bfmlalb_lane_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/bfmlalt_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/bfmlalt_lane_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/bfmmla_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/cvt_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/cvtnt_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/ternary_bfloat16_1.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/ternary_bfloat16_lane_1.c:
Likewise.
* gcc.target/aarch64/sve/acle/general-c/ternary_bfloat16_lanex2_1.c:
Likewise.
* gcc.target/aarch64/sve/acle/general-c/ternary_bfloat16_opt_n_1.c:
Likewise.
Richard Sandiford [Wed, 29 Jan 2020 16:06:58 +0000 (16:06 +0000)]
aarch64: Add svbfloat16_t support to arm_sve.h
This patch adds support for the bfloat16-related vectors to
arm_sve.h. It also adds support for functions that just treat
bfloat16_t as a bag of 16 bits; these functions are available
for bf16 whenever they're available for other 16-bit types.
Previously "all_data" was used for both data movement and for arithmetic
that happened to be defined for all data types. Adding bf16 means we
need to distinguish between the two cases.
The patch also reorders the mode definitions in aarch64-modes.def,
which means we no longer need separate VECTOR_MODE entries for BF
vectors.
2020-01-31 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/aarch64/arm_sve.h: Include arm_bf16.h.
* config/aarch64/aarch64-modes.def (BF): Move definition before
VECTOR_MODES. Remove separate VECTOR_MODES for V4BF and V8BF.
(SVE_MODES): Handle BF modes.
* config/aarch64/aarch64.c (aarch64_classify_vector_mode): Handle
BF modes.
(aarch64_full_sve_mode): Likewise.
* config/aarch64/iterators.md (SVE_STRUCT): Add VNx16BF, VNx24BF
and VNx32BF.
(SVE_FULL, SVE_FULL_HSD, SVE_ALL): Add VNx8BF.
(Vetype, Vesize, Vctype, VEL, Vel, VEL_INT, V128, v128, vwcore)
(V_INT_EQUIV, v_int_equiv, V_FP_EQUIV, v_fp_equiv, vector_count)
(insn_length, VSINGLE, vsingle, VPRED, vpred, VDOUBLE): Handle the
new SVE BF modes.
* config/aarch64/aarch64-sve-builtins.h (TYPE_bfloat): New
type_class_index.
* config/aarch64/aarch64-sve-builtins.cc (TYPES_all_arith): New macro.
(TYPES_all_data): Add bf16.
(TYPES_reinterpret1, TYPES_reinterpret): Likewise.
(register_tuple_type): Increase buffer size.
* config/aarch64/aarch64-sve-builtins.def (svbfloat16_t): New type.
(bf16): New type suffix.
* config/aarch64/aarch64-sve-builtins-base.def (svabd, svadd, svaddv)
(svcmpeq, svcmpge, svcmpgt, svcmple, svcmplt, svcmpne, svmad, svmax)
(svmaxv, svmin, svminv, svmla, svmls, svmsb, svmul, svsub, svsubr):
Change type from all_data to all_arith.
* config/aarch64/aarch64-sve-builtins-sve2.def (svaddp, svmaxp)
(svminp): Likewise.
gcc/testsuite/
* g++.target/aarch64/sve/acle/general-c++/mangle_1.C: Test mangling
of svbfloat16_t.
* g++.target/aarch64/sve/acle/general-c++/mangle_2.C: Likewise for
__SVBfloat16_t.
* gcc.target/aarch64/sve/acle/asm/clasta_bf16.c: New test.
* gcc.target/aarch64/sve/acle/asm/clastb_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/cnt_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/create2_1.c (create_bf16): Likewise.
* gcc.target/aarch64/sve/acle/asm/create3_1.c (create_bf16): Likewise.
* gcc.target/aarch64/sve/acle/asm/create4_1.c (create_bf16): Likewise.
* gcc.target/aarch64/sve/acle/asm/dup_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/dup_lane_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/dupq_lane_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ext_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/get2_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/get3_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/get4_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/insr_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lasta_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/lastb_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1rq_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld2_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld3_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld4_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ldff1_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ldnf1_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ldnt1_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/len_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/reinterpret_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/reinterpret_f16.c
(reinterpret_f16_bf16_tied1, reinterpret_f16_bf16_untied): Likewise.
* gcc.target/aarch64/sve/acle/asm/reinterpret_f32.c
(reinterpret_f32_bf16_tied1, reinterpret_f32_bf16_untied): Likewise.
* gcc.target/aarch64/sve/acle/asm/reinterpret_f64.c
(reinterpret_f64_bf16_tied1, reinterpret_f64_bf16_untied): Likewise.
* gcc.target/aarch64/sve/acle/asm/reinterpret_s16.c
(reinterpret_s16_bf16_tied1, reinterpret_s16_bf16_untied): Likewise.
* gcc.target/aarch64/sve/acle/asm/reinterpret_s32.c
(reinterpret_s32_bf16_tied1, reinterpret_s32_bf16_untied): Likewise.
* gcc.target/aarch64/sve/acle/asm/reinterpret_s64.c
(reinterpret_s64_bf16_tied1, reinterpret_s64_bf16_untied): Likewise.
* gcc.target/aarch64/sve/acle/asm/reinterpret_s8.c
(reinterpret_s8_bf16_tied1, reinterpret_s8_bf16_untied): Likewise.
* gcc.target/aarch64/sve/acle/asm/reinterpret_u16.c
(reinterpret_u16_bf16_tied1, reinterpret_u16_bf16_untied): Likewise.
* gcc.target/aarch64/sve/acle/asm/reinterpret_u32.c
(reinterpret_u32_bf16_tied1, reinterpret_u32_bf16_untied): Likewise.
* gcc.target/aarch64/sve/acle/asm/reinterpret_u64.c
(reinterpret_u64_bf16_tied1, reinterpret_u64_bf16_untied): Likewise.
* gcc.target/aarch64/sve/acle/asm/reinterpret_u8.c
(reinterpret_u8_bf16_tied1, reinterpret_u8_bf16_untied): Likewise.
* gcc.target/aarch64/sve/acle/asm/rev_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/sel_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/set2_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/set3_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/set4_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/splice_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/st1_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/st2_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/st3_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/st4_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/stnt1_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/tbl_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/trn1_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/trn1q_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/trn2_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/trn2q_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/undef2_1.c (bfloat16_t): Likewise.
* gcc.target/aarch64/sve/acle/asm/undef3_1.c (bfloat16_t): Likewise.
* gcc.target/aarch64/sve/acle/asm/undef4_1.c (bfloat16_t): Likewise.
* gcc.target/aarch64/sve/acle/asm/undef_1.c (bfloat16_t): Likewise.
* gcc.target/aarch64/sve/acle/asm/uzp1_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/uzp1q_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/uzp2_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/uzp2q_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/zip1_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/zip1q_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/zip2_bf16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/zip2q_bf16.c: Likewise.
* gcc.target/aarch64/sve/pcs/annotate_1.c (ret_bf16, ret_bf16x2)
(ret_bf16x3, ret_bf16x4): Likewise.
* gcc.target/aarch64/sve/pcs/annotate_2.c (fn_bf16, fn_bf16x2)
(fn_bf16x3, fn_bf16x4): Likewise.
* gcc.target/aarch64/sve/pcs/annotate_3.c (fn_bf16, fn_bf16x2)
(fn_bf16x3, fn_bf16x4): Likewise.
* gcc.target/aarch64/sve/pcs/annotate_4.c (fn_bf16, fn_bf16x2)
(fn_bf16x3, fn_bf16x4): Likewise.
* gcc.target/aarch64/sve/pcs/annotate_5.c (fn_bf16, fn_bf16x2)
(fn_bf16x3, fn_bf16x4): Likewise.
* gcc.target/aarch64/sve/pcs/annotate_6.c (fn_bf16, fn_bf16x2)
(fn_bf16x3, fn_bf16x4): Likewise.
* gcc.target/aarch64/sve/pcs/annotate_7.c (fn_bf16, fn_bf16x2)
(fn_bf16x3, fn_bf16x4): Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_bf16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_le_bf16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_be_bf16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_le_bf16.c: Likewise.
* gcc.target/aarch64/sve/pcs/gnu_vectors_1.c (bfloat16x16_t): New
typedef.
(bfloat16_callee, bfloat16_caller): New tests.
* gcc.target/aarch64/sve/pcs/gnu_vectors_2.c (bfloat16x16_t): New
typedef.
(bfloat16_callee, bfloat16_caller): New tests.
* gcc.target/aarch64/sve/pcs/return_4.c (CALLER_BF16): New macro.
(callee_bf16, caller_bf16): New tests.
* gcc.target/aarch64/sve/pcs/return_4_128.c (CALLER_BF16): New macro.
(callee_bf16, caller_bf16): New tests.
* gcc.target/aarch64/sve/pcs/return_4_256.c (CALLER_BF16): New macro.
(callee_bf16, caller_bf16): New tests.
* gcc.target/aarch64/sve/pcs/return_4_512.c (CALLER_BF16): New macro.
(callee_bf16, caller_bf16): New tests.
* gcc.target/aarch64/sve/pcs/return_4_1024.c (CALLER_BF16): New macro.
(callee_bf16, caller_bf16): New tests.
* gcc.target/aarch64/sve/pcs/return_4_2048.c (CALLER_BF16): New macro.
(callee_bf16, caller_bf16): New tests.
* gcc.target/aarch64/sve/pcs/return_5.c (CALLER_BF16): New macro.
(callee_bf16, caller_bf16): New tests.
* gcc.target/aarch64/sve/pcs/return_5_128.c (CALLER_BF16): New macro.
(callee_bf16, caller_bf16): New tests.
* gcc.target/aarch64/sve/pcs/return_5_256.c (CALLER_BF16): New macro.
(callee_bf16, caller_bf16): New tests.
* gcc.target/aarch64/sve/pcs/return_5_512.c (CALLER_BF16): New macro.
(callee_bf16, caller_bf16): New tests.
* gcc.target/aarch64/sve/pcs/return_5_1024.c (CALLER_BF16): New macro.
(callee_bf16, caller_bf16): New tests.
* gcc.target/aarch64/sve/pcs/return_5_2048.c (CALLER_BF16): New macro.
(callee_bf16, caller_bf16): New tests.
* gcc.target/aarch64/sve/pcs/return_6.c (bfloat16_t): New typedef.
(callee_bf16, caller_bf16): New tests.
* gcc.target/aarch64/sve/pcs/return_6_128.c (bfloat16_t): New typedef.
(callee_bf16, caller_bf16): New tests.
* gcc.target/aarch64/sve/pcs/return_6_256.c (bfloat16_t): New typedef.
(callee_bf16, caller_bf16): New tests.
* gcc.target/aarch64/sve/pcs/return_6_512.c (bfloat16_t): New typedef.
(callee_bf16, caller_bf16): New tests.
* gcc.target/aarch64/sve/pcs/return_6_1024.c (bfloat16_t): New typedef.
(callee_bf16, caller_bf16): New tests.
* gcc.target/aarch64/sve/pcs/return_6_2048.c (bfloat16_t): New typedef.
(callee_bf16, caller_bf16): New tests.
* gcc.target/aarch64/sve/pcs/return_7.c (callee_bf16): Likewise.
(caller_bf16): Likewise.
* gcc.target/aarch64/sve/pcs/return_8.c (callee_bf16): Likewise.
(caller_bf16): Likewise.
* gcc.target/aarch64/sve/pcs/return_9.c (callee_bf16): Likewise.
(caller_bf16): Likewise.
* gcc.target/aarch64/sve2/acle/asm/tbl2_bf16.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/tbx_bf16.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/whilerw_bf16.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/whilewr_bf16.c: Likewise.
Richard Sandiford [Tue, 28 Jan 2020 13:49:49 +0000 (13:49 +0000)]
aarch64: Add Armv8.6 SVE matrix multiply support
This mostly follows existing practice. Perhaps the only noteworthy
thing is that svmmla is split across three extensions (i8mm, f32mm
and f64mm), any of which can be enabled independently. The easiest
way of coping with this seemed to be to add a fourth svmmla entry
for base SVE, but with no type suffixes. This means that the
overloaded function is always available for C, but never successfully
resolves without the appropriate target feature.
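Because each svmmla flavour belongs to its own extension, user code has to test for each one separately. The following is a minimal sketch of such a test, assuming only the three feature macros named in the ChangeLog below; the helper function names are hypothetical and are not part of arm_sve.h.

```cpp
#include <cassert>
#include <string>

// Each helper reports whether one of the three matrix-multiply
// extensions is enabled for this translation unit.  The macro names are
// the ones this patch defines; the function names are illustrative only.
constexpr bool have_sve_i8mm()
{
#ifdef __ARM_FEATURE_SVE_MATMUL_INT8
  return true;
#else
  return false;
#endif
}

constexpr bool have_sve_f32mm()
{
#ifdef __ARM_FEATURE_SVE_MATMUL_FP32
  return true;
#else
  return false;
#endif
}

constexpr bool have_sve_f64mm()
{
#ifdef __ARM_FEATURE_SVE_MATMUL_FP64
  return true;
#else
  return false;
#endif
}

// Summarise which flavours could be used; "none" when no extension is
// enabled (for example when compiling for a non-AArch64 target).
std::string svmmla_flavours()
{
  std::string s;
  if (have_sve_i8mm())
    s += "i8mm ";
  if (have_sve_f32mm())
    s += "f32mm ";
  if (have_sve_f64mm())
    s += "f64mm ";
  return s.empty() ? "none" : s;
}
```

The three checks are deliberately independent, mirroring the fact that any of the extensions can be enabled without the others.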
2020-01-31 Dennis Zhang <dennis.zhang@arm.com>
Matthew Malcomson <matthew.malcomson@arm.com>
Richard Sandiford <richard.sandiford@arm.com>
gcc/
* doc/invoke.texi (f32mm): Document new AArch64 -march= extension.
* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Define
__ARM_FEATURE_SVE_MATMUL_INT8, __ARM_FEATURE_SVE_MATMUL_FP32 and
__ARM_FEATURE_SVE_MATMUL_FP64 as appropriate. Don't define
__ARM_FEATURE_MATMUL_FP64.
* config/aarch64/aarch64-option-extensions.def (fp, simd, fp16)
(sve): Add AARCH64_FL_F32MM to the list of extensions that should
be disabled at the same time.
(f32mm): New extension.
* config/aarch64/aarch64.h (AARCH64_FL_F32MM): New macro.
(AARCH64_FL_F64MM): Bump to the next bit up.
(AARCH64_ISA_F32MM, TARGET_SVE_I8MM, TARGET_F32MM, TARGET_SVE_F32MM)
(TARGET_SVE_F64MM): New macros.
* config/aarch64/iterators.md (SVE_MATMULF): New mode iterator.
(UNSPEC_FMMLA, UNSPEC_SMATMUL, UNSPEC_UMATMUL, UNSPEC_USMATMUL)
(UNSPEC_TRN1Q, UNSPEC_TRN2Q, UNSPEC_UZP1Q, UNSPEC_UZP2Q, UNSPEC_ZIP1Q)
(UNSPEC_ZIP2Q): New unspecs.
(DOTPROD_US_ONLY, PERMUTEQ, MATMUL, FMMLA): New int iterators.
(optab, sur, perm_insn): Handle the new unspecs.
(sve_fp_op): Handle UNSPEC_FMMLA. Re-sort.
* config/aarch64/aarch64-sve.md (@aarch64_sve_ld1ro<mode>): Use
TARGET_SVE_F64MM instead of separate tests.
(@aarch64_<DOTPROD_US_ONLY:sur>dot_prod<vsi2qi>): New pattern.
(@aarch64_<DOTPROD_US_ONLY:sur>dot_prod_lane<vsi2qi>): Likewise.
(@aarch64_sve_add_<MATMUL:optab><vsi2qi>): Likewise.
(@aarch64_sve_<FMMLA:sve_fp_op><mode>): Likewise.
(@aarch64_sve_<PERMUTEQ:optab><mode>): Likewise.
* config/aarch64/aarch64-sve-builtins.cc (TYPES_s_float): New macro.
(TYPES_s_float_hsd_integer, TYPES_s_float_sd_integer): Use it.
(TYPES_s_signed): New macro.
(TYPES_s_integer): Use it.
(TYPES_d_float): New macro.
(TYPES_d_data): Use it.
* config/aarch64/aarch64-sve-builtins-shapes.h (mmla): Declare.
(ternary_intq_uintq_lane, ternary_intq_uintq_opt_n, ternary_uintq_intq)
(ternary_uintq_intq_lane, ternary_uintq_intq_opt_n): Likewise.
* config/aarch64/aarch64-sve-builtins-shapes.cc (mmla_def): New class.
(svmmla): New shape.
(ternary_resize2_opt_n_base): Add TYPE_CLASS2 and TYPE_CLASS3
template parameters.
(ternary_resize2_lane_base): Likewise.
(ternary_resize2_base): New class.
(ternary_qq_lane_base): Likewise.
(ternary_intq_uintq_lane_def): Likewise.
(ternary_intq_uintq_lane): New shape.
(ternary_intq_uintq_opt_n_def): New class.
(ternary_intq_uintq_opt_n): New shape.
(ternary_qq_lane_def): Inherit from ternary_qq_lane_base.
(ternary_uintq_intq_def): New class.
(ternary_uintq_intq): New shape.
(ternary_uintq_intq_lane_def): New class.
(ternary_uintq_intq_lane): New shape.
(ternary_uintq_intq_opt_n_def): New class.
(ternary_uintq_intq_opt_n): New shape.
* config/aarch64/aarch64-sve-builtins-base.h (svmmla, svsudot)
(svsudot_lane, svtrn1q, svtrn2q, svusdot, svusdot_lane, svusmmla)
(svuzp1q, svuzp2q, svzip1q, svzip2q): Declare.
* config/aarch64/aarch64-sve-builtins-base.cc (svdot_lane_impl):
Generalize to...
(svdotprod_lane_impl): ...this new class.
(svmmla_impl, svusdot_impl): New classes.
(svdot_lane): Update to use svdotprod_lane_impl.
(svmmla, svsudot, svsudot_lane, svtrn1q, svtrn2q, svusdot)
(svusdot_lane, svusmmla, svuzp1q, svuzp2q, svzip1q, svzip2q): New
functions.
* config/aarch64/aarch64-sve-builtins-base.def (svmmla): New base
function, with no types defined.
(svmmla, svusmmla, svsudot, svsudot_lane, svusdot, svusdot_lane): New
AARCH64_FL_I8MM functions.
(svmmla): New AARCH64_FL_F32MM function.
(svld1ro): Depend only on AARCH64_FL_F64MM, not on AARCH64_FL_V8_6.
(svmmla, svtrn1q, svtrn2q, svuzp1q, svuzp2q, svzip1q, svzip2q): New
AARCH64_FL_F64MM functions.
(REQUIRED_EXTENSIONS):
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_aarch64_asm_i8mm_ok)
(check_effective_target_aarch64_asm_f32mm_ok): New target selectors.
* gcc.target/aarch64/pragma_cpp_predefs_2.c: Test handling of
__ARM_FEATURE_SVE_MATMUL_INT8, __ARM_FEATURE_SVE_MATMUL_FP32 and
__ARM_FEATURE_SVE_MATMUL_FP64.
* gcc.target/aarch64/sve/acle/asm/test_sve_acle.h (TEST_TRIPLE_Z)
(TEST_TRIPLE_Z_REV2, TEST_TRIPLE_Z_REV, TEST_TRIPLE_LANE_REG)
(TEST_TRIPLE_ZX): New macros.
* gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c: Remove +sve and
rely on +f64mm to enable it.
* gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1ro_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mmla_f32.c: New test.
* gcc.target/aarch64/sve/acle/asm/mmla_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mmla_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mmla_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/sudot_lane_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/sudot_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/trn1q_f16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/trn1q_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/trn1q_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/trn1q_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/trn1q_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/trn1q_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/trn1q_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/trn1q_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/trn1q_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/trn1q_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/trn1q_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/trn2q_f16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/trn2q_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/trn2q_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/trn2q_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/trn2q_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/trn2q_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/trn2q_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/trn2q_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/trn2q_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/trn2q_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/trn2q_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/usdot_lane_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/usdot_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/usmmla_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/uzp1q_f16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/uzp1q_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/uzp1q_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/uzp1q_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/uzp1q_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/uzp1q_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/uzp1q_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/uzp1q_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/uzp1q_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/uzp1q_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/uzp1q_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/uzp2q_f16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/uzp2q_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/uzp2q_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/uzp2q_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/uzp2q_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/uzp2q_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/uzp2q_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/uzp2q_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/uzp2q_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/uzp2q_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/uzp2q_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/zip1q_f16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/zip1q_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/zip1q_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/zip1q_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/zip1q_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/zip1q_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/zip1q_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/zip1q_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/zip1q_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/zip1q_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/zip1q_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/zip2q_f16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/zip2q_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/zip2q_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/zip2q_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/zip2q_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/zip2q_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/zip2q_s8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/zip2q_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/zip2q_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/zip2q_u64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/zip2q_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/mmla_1.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/mmla_2.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/mmla_3.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/mmla_4.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/mmla_5.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/mmla_6.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/mmla_7.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/ternary_intq_uintq_lane_1.c:
Likewise.
* gcc.target/aarch64/sve/acle/general-c/ternary_intq_uintq_opt_n_1.c:
Likewise.
* gcc.target/aarch64/sve/acle/general-c/ternary_uintq_intq_1.c:
Likewise.
* gcc.target/aarch64/sve/acle/general-c/ternary_uintq_intq_lane_1.c:
Likewise.
* gcc.target/aarch64/sve/acle/general-c/ternary_uintq_intq_opt_n_1.c:
Likewise.
Richard Sandiford [Fri, 31 Jan 2020 13:56:31 +0000 (13:56 +0000)]
aarch64: Fix SVE PCS failures for BE & ILP32
This patch should (finally!) give clean test results for
aarch64-sve-pcs.exp for all {be,le}{lp64,ilp32} combinations.
The *_128.c tests require aarch64_little_endian because they test for
fixed-length 128-bit code, whereas -msve-vector-bits=128 still generates
VLA code for big-endian.
Some tests require lp64 because they match (64-bit) pointer loads and
stores. Others require it because ilp32 adds extra zero extensions.
We still have a non-trivial amount of coverage for -mbig-endian -mabi=ilp32:
# of expected passes 663
# of unsupported tests 59
2020-01-31 Richard Sandiford <richard.sandiford@arm.com>
gcc/testsuite/
* gcc.target/aarch64/sve/pcs/args_1.c: Require lp64 for
check-function-bodies tests.
* gcc.target/aarch64/sve/pcs/args_2.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_3.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_4.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_1.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_1_256.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_1_512.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_1_1024.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_1_2048.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_2.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_3.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_4.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_4_256.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_4_512.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_4_1024.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_4_2048.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_5.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_5_256.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_5_512.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_5_1024.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_5_2048.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_6.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_6_256.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_6_512.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_6_1024.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_6_2048.c: Likewise.
* gcc.target/aarch64/sve/pcs/saves_2_be_nowrap.c: Likewise.
* gcc.target/aarch64/sve/pcs/saves_2_be_wrap.c: Likewise.
* gcc.target/aarch64/sve/pcs/saves_2_le_nowrap.c: Likewise.
* gcc.target/aarch64/sve/pcs/saves_2_le_wrap.c: Likewise.
* gcc.target/aarch64/sve/pcs/saves_3.c: Likewise.
* gcc.target/aarch64/sve/pcs/saves_4_be.c: Likewise.
* gcc.target/aarch64/sve/pcs/saves_4_le.c: Likewise.
* gcc.target/aarch64/sve/pcs/varargs_1.c: Likewise.
* gcc.target/aarch64/sve/pcs/varargs_2_f16.c: Likewise.
* gcc.target/aarch64/sve/pcs/varargs_2_f32.c: Likewise.
* gcc.target/aarch64/sve/pcs/varargs_2_f64.c: Likewise.
* gcc.target/aarch64/sve/pcs/varargs_2_s16.c: Likewise.
* gcc.target/aarch64/sve/pcs/varargs_2_s32.c: Likewise.
* gcc.target/aarch64/sve/pcs/varargs_2_s64.c: Likewise.
* gcc.target/aarch64/sve/pcs/varargs_2_s8.c: Likewise.
* gcc.target/aarch64/sve/pcs/varargs_2_u16.c: Likewise.
* gcc.target/aarch64/sve/pcs/varargs_2_u32.c: Likewise.
* gcc.target/aarch64/sve/pcs/varargs_2_u64.c: Likewise.
* gcc.target/aarch64/sve/pcs/varargs_2_u8.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_f16.c: Require lp64.
* gcc.target/aarch64/sve/pcs/args_5_be_f32.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_f64.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_s16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_s32.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_s64.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_s8.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_u16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_u32.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_u64.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_u8.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_le_f16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_le_f32.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_le_f64.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_le_s16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_le_s32.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_le_s64.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_le_s8.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_le_u16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_le_u32.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_le_u64.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_le_u8.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_be_f16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_be_f32.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_be_f64.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_be_s16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_be_s32.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_be_s64.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_be_s8.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_be_u16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_be_u32.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_be_u64.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_be_u8.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_le_f16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_le_f32.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_le_f64.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_le_s16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_le_s32.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_le_s64.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_le_s8.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_le_u16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_le_u32.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_le_u64.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_le_u8.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_7.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_8.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_9.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_4_128.c: Require lp64 and
aarch64_little_endian for check-function-bodies tests.
* gcc.target/aarch64/sve/pcs/return_5_128.c: Likewise.
* gcc.target/aarch64/sve/pcs/stack_clash_2_128.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_1_128.c: Likewise. Remove
target selector from dg-compile.
* gcc.target/aarch64/sve/pcs/return_6_128.c: Likewise.
Patrick Palka [Tue, 21 Jan 2020 22:00:43 +0000 (17:00 -0500)]
libstdc++: Always return a sentinel<I> from __gnu_test::test_range::end()
It seems that in practice std::sentinel_for<I, I> is always true, and so the
test_range container doesn't help us detect bugs in ranges code in which we
wrongly assume that a sentinel can be manipulated like an iterator. Make the
test_range range more strict by having end() unconditionally return a
sentinel<I>, and adjust some tests accordingly.
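The kind of bug a stricter range exposes can be sketched as follows. This is a minimal illustration and not the testsuite code: `int_iterator`, `int_sentinel`, `strict_range` and `count_steps` are hypothetical names. The sentinel supports nothing but comparison against an iterator, so code that tries to increment or dereference it will not compile.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical sentinel: it can only be compared against an iterator.
struct int_sentinel
{
  const int* end;
};

// Hypothetical minimal input iterator over an int array.
struct int_iterator
{
  const int* ptr;

  int operator*() const { return *ptr; }
  int_iterator& operator++() { ++ptr; return *this; }

  friend bool operator==(const int_iterator& i, const int_sentinel& s)
  { return i.ptr == s.end; }
  friend bool operator!=(const int_iterator& i, const int_sentinel& s)
  { return i.ptr != s.end; }
};

// A range whose end() never has the same type as begin().
struct strict_range
{
  const int* first;
  const int* last;

  int_iterator begin() const { return {first}; }
  int_sentinel end() const { return {last}; }
};

// Correct generic code: it advances the iterator and only ever compares
// it against the sentinel.  Writing "last - first" or "--last" here
// would fail to compile for strict_range.
template<typename I, typename S>
std::ptrdiff_t count_steps(I first, S last)
{
  std::ptrdiff_t n = 0;
  for (; first != last; ++first)
    ++n;
  return n;
}
```

With a range whose end() has the iterator type, the incorrect variants would compile silently, which is the gap this change closes.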
libstdc++-v3/ChangeLog:
* testsuite/24_iterators/range_operations/distance.cc: Do not assume
test_range::end() returns the same type as test_range::begin().
* testsuite/24_iterators/range_operations/next.cc: Likewise.
* testsuite/24_iterators/range_operations/prev.cc: Likewise.
* testsuite/util/testsuite_iterators.h (__gnu_test::test_range::end):
Always return a sentinel<I>.
Andrew Stubbs [Wed, 29 Jan 2020 16:59:08 +0000 (16:59 +0000)]
Fix conditional add LRA failure for amdgcn
Fix ICE in testcase gfortran.dg/assumed_rank_bounds_3.f90.
2020-01-31 Andrew Stubbs <ams@codesourcery.com>
gcc/
* config/gcn/gcn-valu.md (addv64di3_exec): Allow one '0' in each
alternative only.
Uros Bizjak [Fri, 31 Jan 2020 15:44:36 +0000 (16:44 +0100)]
Fix TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL handling.
The reason for TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL on AMD targets is
only insn size, as advised in e.g. the Software Optimization Guide for
AMD Family 15h Processors [1], section 7.1.2, where it is said:
--quote--
7.1.2 Reduce Instruction Size
Optimization
Reduce the size of instructions when possible.
Rationale
Using smaller instruction sizes improves instruction fetch throughput.
Specific examples include the following:
* In SIMD code, use the single-precision (PS) form of instructions
instead of the double-precision (PD) form. For example, for register
to register moves, MOVAPS achieves the same result as MOVAPD, but uses
one less byte to encode the instruction and has no prefix byte. Other
examples in which single-precision forms can be substituted for
double-precision forms include MOVUPS, MOVNTPS, XORPS, ORPS, ANDPS,
and SHUFPS.
...
--/quote--
Please note that this optimization applies only to non-AVX forms, as
demonstrated by:
0: 0f 28 c8 movaps %xmm0,%xmm1
3: 66 0f 28 c8 movapd %xmm0,%xmm1
7: c5 f8 28 d1 vmovaps %xmm1,%xmm2
b: c5 f9 28 d1 vmovapd %xmm1,%xmm2
Also note that MOVDQA is missing from the above list. It is
harmful to substitute MOVDQA with MOVAPS, as it can (and does)
introduce a +1 cycle forwarding penalty between FLT (FPA/FPM) and INT
(VALU) FP clusters.
[1] https://www.amd.com/system/files/TechDocs/47414_15h_sw_opt_guide.pdf
Kwok Cheung Yeung [Fri, 31 Jan 2020 14:53:30 +0000 (06:53 -0800)]
[amdgcn] Scale number of threads/workers with VGPR usage
2020-01-31 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* config/gcn/mkoffload.c (process_asm): Add sgpr_count and vgpr_count
to definition of hsa_kernel_description. Parse assembly to find SGPR
and VGPR count of kernel and store in hsa_kernel_description.
libgomp/
* plugin/plugin-gcn.c (struct hsa_kernel_description): Add sgpr_count
and vgpr_count fields.
(struct kernel_info): Add a field for a hsa_kernel_description.
(run_kernel): Reduce the number of threads/workers if the requested
number would require too many VGPRs.
(init_basic_kernel_info): Initialize description field with
the hsa_kernel_description entry for the kernel.
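The idea behind the run_kernel change can be sketched as below. This is a hedged illustration only: `clamp_threads` and its `vgpr_budget` parameter are hypothetical, and the actual limits and rounding used by plugin-gcn.c are not reproduced here.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical sketch: limit the number of threads/workers so that
// their combined VGPR usage fits into a fixed register budget.
//   requested   - thread count asked for by the program
//   vgpr_count  - VGPRs used by one thread of the kernel, as recorded
//                 in the kernel description
//   vgpr_budget - assumed total number of VGPRs available (illustrative)
int clamp_threads(int requested, int vgpr_count, int vgpr_budget)
{
  if (vgpr_count <= 0)
    return requested;  // no usage information: leave the request alone

  // Always allow at least one thread, even for very large kernels.
  int max_threads = std::max(1, vgpr_budget / vgpr_count);
  return std::min(requested, max_threads);
}
```

For example, under these assumptions a kernel using 64 VGPRs with a budget of 256 would be limited to 4 threads, while a request for 2 threads would pass through unchanged.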
Tobias Burnus [Fri, 31 Jan 2020 14:54:21 +0000 (15:54 +0100)]
[Fortran] Disable front-end optimization for OpenACC atomic (PR93462)
PR fortran/93462
* frontend-passes.c (gfc_code_walker): For EXEC_OACC_ATOMIC, set
in_omp_atomic to true to prevent front-end optimization.
PR fortran/93462
* gfortran.dg/goacc/atomic-1.f90: New.
Tamar Christina [Fri, 31 Jan 2020 14:39:38 +0000 (14:39 +0000)]
middle-end: Fix logical shift truncation (PR rtl-optimization/91838)
This fixes fallout from a patch I submitted two years ago, which started
allowing simplify-rtx to fold logical right shifts by offsets a followed by b
into >> (a + b).
However, this can generate inefficient code when the resulting shift count ends
up being the same as the size of the shift mode. A shift by that amount is
undefined behavior on most platforms.
This patch changes the code to truncate to 0 if the shift amount goes out of
range. Before my older patch this used to happen in combine when it saw the
two shifts. However, since we now combine them here, combine never gets a
chance to truncate them.
The issue mostly affects GCC 8 and 9, since on 10 the back-end knows how to deal
with this shift constant, but it's better to do the right thing in simplify-rtx.
Note that this doesn't take care of the arithmetic shift, where you could
replace the constant with MODE_BITS (mode) - 1, but that's not a regression so
I'm punting on it.
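The folding rule described above can be modelled in a few lines. This is a sketch on a 32-bit unsigned value and not the simplify-rtx code: `fold_lshiftrt` and `two_shifts` are hypothetical names, and each individual shift count is assumed to be in range (less than 32).

```cpp
#include <cassert>
#include <cstdint>

// Reference behaviour: two separate logical right shifts, each by an
// in-range amount.
std::uint32_t two_shifts(std::uint32_t x, unsigned a, unsigned b)
{
  return (x >> a) >> b;
}

// Folded form: one shift by a + b.  When the combined count reaches the
// width of the type, every bit has already been shifted out, so the
// result is truncated to 0 rather than emitting an out-of-range shift
// (which would be undefined behaviour in C++ as well).
std::uint32_t fold_lshiftrt(std::uint32_t x, unsigned a, unsigned b)
{
  const unsigned bits = 32;
  const unsigned total = a + b;
  if (total >= bits)
    return 0;
  return x >> total;
}
```

The same reasoning does not carry over unchanged to arithmetic shifts, where the sign bit is replicated, which is why that case is left alone here.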
gcc/ChangeLog:
PR rtl-optimization/91838
* simplify-rtx.c (simplify_binary_operation_1): Update LSHIFTRT case
to truncate if allowed or reject combination.
gcc/testsuite/ChangeLog:
PR rtl-optimization/91838
* g++.dg/pr91838.C: New test.
Andrew Stubbs [Thu, 30 Jan 2020 14:06:12 +0000 (14:06 +0000)]
Fix fast-math-pr55281.c ICE
2020-01-31 Andrew Stubbs <ams@codesourcery.com>
gcc/
* tree-ssa-loop-ivopts.c (get_iv): Use sizetype for zero-step.
(find_inv_vars_cb): Likewise.
David Malcolm [Sun, 26 Jan 2020 23:40:43 +0000 (18:40 -0500)]
calls.c: refactor special_function_p for use by analyzer (v2)
This patch refactors some code in special_function_p that checks for
the function being sane to match by name, splitting it out into a new
maybe_special_function_p, and using it in two places in the analyzer.
gcc/analyzer/ChangeLog:
* analyzer.cc (is_named_call_p): Replace tests for fndecl being
extern at file scope and having a non-NULL DECL_NAME with a call
to maybe_special_function_p.
* function-set.cc (function_set::contains_decl_p): Add call to
maybe_special_function_p.
gcc/ChangeLog:
* calls.c (special_function_p): Split out the check for DECL_NAME
being non-NULL and fndecl being extern at file scope into a
new maybe_special_function_p and call it. Drop check for fndecl
being non-NULL that was after a usage of DECL_NAME (fndecl).
* tree.h (maybe_special_function_p): New inline function.
David Malcolm [Thu, 30 Jan 2020 20:21:28 +0000 (15:21 -0500)]
analyzer: further fixes for comparisons between uncomparable types (PR 93450)
gcc/analyzer/ChangeLog:
PR analyzer/93450
* constraint-manager.cc
(constraint_manager::get_or_add_equiv_class): Only compare constants
if their types are compatible.
* region-model.cc (constant_svalue::eval_condition): Replace check
for identical types with call to types_compatible_p.
Andrew Stubbs [Wed, 29 Jan 2020 16:57:02 +0000 (16:57 +0000)]
Zero-initialise masked load destinations
Fixes an execution failure in testcase gfortran.dg/assumed_rank_1.f90.
2020-01-30 Andrew Stubbs <ams@codesourcery.com>
gcc/
* config/gcn/gcn-valu.md (gather<mode>_exec): Move contents ...
(mask_gather_load<mode>): ... here, and zero-initialize the
destination.
(maskload<mode>di): Zero-initialize the destination.
* config/gcn/gcn.c:
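The effect of zero-initializing a masked load destination can be sketched in scalar code. This is an illustration of the semantics only, not of the gcn-valu.md patterns; `masked_load` is a hypothetical name and the lane type is assumed to be int.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical scalar model of a masked vector load: lanes whose mask
// bit is set are loaded from src, and all other lanes read as zero
// instead of holding whatever was previously in the destination.
template<std::size_t N>
std::array<int, N> masked_load(const std::array<int, N>& src,
                               const std::array<bool, N>& mask)
{
  std::array<int, N> dest{};  // value-initialized: every lane starts at 0
  for (std::size_t i = 0; i < N; ++i)
    if (mask[i])
      dest[i] = src[i];
  return dest;
}
```

Without the initialization, the inactive lanes would be indeterminate, which is one plausible way for a later whole-vector use to produce wrong results.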
David Malcolm [Thu, 30 Jan 2020 21:59:15 +0000 (16:59 -0500)]
analyzer: add extrinsic_state::dump
gcc/analyzer/ChangeLog:
* program-state.cc (extrinsic_state::dump_to_pp): New.
(extrinsic_state::dump_to_file): New.
(extrinsic_state::dump): New.
* program-state.h (extrinsic_state::dump_to_pp): New decl.
(extrinsic_state::dump_to_file): New decl.
(extrinsic_state::dump): New decl.
* sm.cc: Include "pretty-print.h".
(state_machine::dump_to_pp): New.
* sm.h (state_machine::dump_to_pp): New decl.