gcc.git
Andrea Corallo [Tue, 1 Dec 2020 10:21:33 +0000 (11:21 +0100)]
arm: Improve documentation for effective target 'arm_softfloat'

gcc/ChangeLog

2020-12-01  Andrea Corallo  <andrea.corallo@arm.com>

* doc/sourcebuild.texi (arm_softfloat): Improve documentation.

gcc/testsuite/ChangeLog

2020-12-01  Andrea Corallo  <andrea.corallo@arm.com>

* lib/target-supports.exp (check_effective_target_arm_softfloat):
Improve documentation.

Andrea Corallo [Thu, 26 Nov 2020 11:33:18 +0000 (12:33 +0100)]
arm: [testsuite] fix lob tests for -mfloat-abi=hard

2020-11-26  Andrea Corallo  <andrea.corallo@arm.com>

* gcc.target/arm/lob2.c: Use '-march=armv8.1-m.main+fp'.
* gcc.target/arm/lob3.c: Skip with '-mfloat-abi=hard'.
* gcc.target/arm/lob4.c: Likewise.
* gcc.target/arm/lob5.c: Use '-march=armv8.1-m.main+fp'.

Richard Biener [Fri, 11 Dec 2020 12:45:55 +0000 (13:45 +0100)]
testsuite/98244 - amend gcc.dg/vect/vect-live-6.c

Committed.

2020-12-11  Richard Biener  <rguenther@suse.de>

PR testsuite/98244
* gcc.dg/vect/vect-live-6.c: Require vect_condition.

Richard Biener [Fri, 11 Dec 2020 12:31:24 +0000 (13:31 +0100)]
testsuite/98242 - amend gcc.dg/vect/bb-slp-subgroups-3.c

Committed.

2020-12-11  Richard Biener  <rguenther@suse.de>

PR testsuite/98242
* gcc.dg/vect/bb-slp-subgroups-3.c: Require vect_int_mult.

Richard Biener [Fri, 11 Dec 2020 12:23:21 +0000 (13:23 +0100)]
testsuite/98240 - amend gcc.dg/vect/pr97678.c

Committed.

2020-12-11  Richard Biener  <rguenther@suse.de>

PR testsuite/98240
* gcc.dg/vect/pr97678.c: Require vect_int_mult and
vect_pack_trunc.

Richard Biener [Fri, 11 Dec 2020 12:13:28 +0000 (13:13 +0100)]
testsuite/98239 - require vect_condition for gcc.dg/vect/bb-slp-69.c

Committed.

2020-12-11  Richard Biener  <rguenther@suse.de>

PR testsuite/98239
* gcc.dg/vect/bb-slp-69.c: Require vect_condition.

Jakub Jelinek [Fri, 11 Dec 2020 11:47:52 +0000 (12:47 +0100)]
expand: Fix up expand_doubleword_mod on 32-bit targets [PR98229]

As the testcase shows, for 32-bit word size we can end up with op1
up to 0xffffffff (0x100000000 % 0xffffffff == 1 and so we use bit == 32
for that), but the CONST_INT we got from caller is for DImode in that case
and not valid for SImode operations.

The following patch canonicalizes the two spots where the constant needs
canonicalization.
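
The arithmetic in the parenthetical above can be checked with a small
sketch.  Because 0x100000000 % 0xffffffff == 1, a doubleword value is
congruent to the sum of its two 32-bit words modulo 0xffffffff.  The helper
name mod_by_word_sum is hypothetical and only illustrates the identity; it
is not the RTL that expand_doubleword_mod emits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: computes x % d for a 64-bit x using only the two
// 32-bit words of x.  Valid only when 2^32 % d == 1 (e.g. d == 0xffffffff).
std::uint32_t mod_by_word_sum(std::uint64_t x, std::uint32_t d)
{
  assert((0x100000000ULL % d) == 1);   // precondition of the identity
  std::uint64_t hi = x >> 32;          // high 32-bit word
  std::uint64_t lo = x & 0xffffffffu;  // low 32-bit word
  // hi * 2^32 + lo is congruent to hi + lo (mod d); the sum fits in 33 bits.
  return static_cast<std::uint32_t>((hi + lo) % d);
}
```

Note that d == 0xffffffff does not fit in a signed 32-bit immediate, which
is why the constant has to be canonicalized for the narrower mode.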

2020-12-10  Jakub Jelinek  <jakub@redhat.com>

PR rtl-optimization/98229
* optabs.c (expand_doubleword_mod): Canonicalize op1 and
1 - INTVAL (op1) as word_mode constants when used in
word_mode arithmetics.

* gcc.c-torture/compile/pr98229.c: New test.

Richard Biener [Fri, 11 Dec 2020 09:52:58 +0000 (10:52 +0100)]
tree-optimization/98235 - limit SLP discovery

With following backedges and the SLP discovery cache not being
permute aware we have to put some discovery limits in place again.
That's also the opportunity to ditch the separate limit on the
number of permutes we try, so the patch limits the overall work
done (as in vect_build_slp_tree cache misses) to what we compute
as max_tree_size which is based on the number of scalar stmts in
the vectorized region.

Note the limit is global and there's no attempt to divide the
allowed work evenly amongst opportunities, so one degenerate
case can eat it all up.  That's probably only relevant for BB
vectorization where the limit is based on up to the size of the
whole function.
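
The budgeting scheme described above can be sketched roughly as follows.
This is a simplified model, not the code in tree-vect-slp.c: the names
discovery_state and build_tree, the integer node ids, and the children
table are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-in for the discovery state: a result cache plus one
// global work budget shared by every discovery attempt.
struct discovery_state
{
  std::map<int, bool> cache;
  unsigned limit;
};

// Returns true if the tree rooted at NODE could be built within the budget.
// CHILDREN[n] lists the operand nodes of n.
bool build_tree(discovery_state &st, int node,
                const std::vector<std::vector<int> > &children)
{
  std::map<int, bool>::iterator hit = st.cache.find(node);
  if (hit != st.cache.end())
    return hit->second;          // cache hit: costs nothing
  if (st.limit == 0)
    return false;                // budget exhausted: fail discovery
  --st.limit;                    // each cache miss consumes budget
  bool ok = true;
  for (std::size_t i = 0; i < children[node].size(); ++i)
    ok = build_tree(st, children[node][i], children) && ok;
  st.cache[node] = ok;
  return ok;
}
```

Because the budget is a single counter, the first expensive attempt can
drain it for all later ones, which is the caveat noted above.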

2020-12-11  Richard Biener  <rguenther@suse.de>

PR tree-optimization/98235
* tree-vect-slp.c (vect_build_slp_tree): Exchange npermutes
for limit.  Decrement that for each cache miss and fail
discovery when it reaches zero.
(vect_build_slp_tree_2): Remove npermutes handling and
simply pass down limit.
(vect_build_slp_instance): Pass down limit.
(vect_analyze_slp_instance): Likewise.
(vect_analyze_slp): Base the SLP discovery limit on
max_tree_size and pass it down.

* gcc.dg/torture/pr98235.c: New testcase.

Jakub Jelinek [Fri, 11 Dec 2020 10:10:17 +0000 (11:10 +0100)]
expansion: Sign or zero extend on MEM_REF stores into SUBREG with SUBREG_PROMOTED_VAR_P [PR98190]

Some targets decide to promote certain scalar variables to wider mode,
so their DECL_RTL is a SUBREG with SUBREG_PROMOTED_VAR_P.
When storing to such vars, store_expr takes care of sign or zero extending,
but if we store e.g. through MEM_REF into them, no sign or zero extension
happens and that leads to wrong-code e.g. on the following testcase on
aarch64-linux.

The following patch uses store_expr if we overwrite all the bits and it is
not reversed storage order, i.e. something that store_expr handles normally,
and otherwise (if the most significant bit is (or for pdp11 might be, but
pdp11 doesn't promote) being modified), the code extends manually.
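
As a rough model of the manual extension case, assume a 16-bit signed
variable that the target keeps sign-extended in a 32-bit register.  The
function names below are hypothetical and the sketch works on plain
integers rather than RTL; it only shows why a partial store that touches
the most significant bit must be followed by a re-extension.

```cpp
#include <cstdint>

// Sign-extend the low 16 bits of LOW16 to 32 bits without relying on
// implementation-defined signed conversions.
std::uint32_t sign_extend16(std::uint32_t low16)
{
  low16 &= 0xffffu;
  return (low16 ^ 0x8000u) - 0x8000u;
}

// Model of storing BYTE into the most significant byte of the 16-bit
// variable held in REG.  The upper register bits are recomputed so the
// promoted invariant still holds afterwards.
std::uint32_t store_high_byte(std::uint32_t reg, std::uint8_t byte)
{
  std::uint32_t low16 = (reg & 0x00ffu)
                        | (static_cast<std::uint32_t>(byte) << 8);
  return sign_extend16(low16);
}
```

Without the final sign_extend16 call the register would hold 0x00008001
after storing 0x80 over the value 1, which is not a valid promoted value.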

2020-12-11  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/98190
* expr.c (expand_assignment): If to_rtx is a promoted SUBREG,
ensure sign or zero extension either through use of store_expr
or by extending manually.

* gcc.dg/pr98190.c: New test.

Andrea Corallo [Wed, 9 Dec 2020 16:59:12 +0000 (17:59 +0100)]
ira.c: Fix ICE in ira-color [PR97092]

gcc/ChangeLog

2020-12-10  Andrea Corallo  <andrea.corallo@arm.com>

PR rtl-optimization/97092
* ira-color.c (update_costs_from_allocno): Do not carry over mode
between subsequent iterations.

gcc/testsuite/ChangeLog

2020-12-10  Andrea Corallo  <andrea.corallo@arm.com>

* gcc.target/aarch64/sve/pr97092.c: New test.

Richard Biener [Fri, 11 Dec 2020 09:07:10 +0000 (10:07 +0100)]
tree-optimization/95582 - fix vector pattern with bool conversions

The pattern recognizer fends off against recognizing conversions
from VECT_SCALAR_BOOLEAN_TYPE_P to precision one types but what
it really needs to fend off is conversions between
VECT_SCALAR_BOOLEAN_TYPE_P types - the Ada FE uses an 8 bit
boolean type that satisfies this predicate.

2020-12-11  Richard Biener  <rguenther@suse.de>

PR tree-optimization/95582
* tree-vect-patterns.c (vect_recog_bool_pattern): Check
for VECT_SCALAR_BOOLEAN_TYPE_P, not just precision one.

Hongyu [Wed, 9 Dec 2020 19:18:41 +0000 (19:18 +0000)]
Fix feature check for HRESET/AVX_VNNI/UINTR

gcc/ChangeLog:
* common/config/i386/cpuinfo.h (get_available_features):
Move check for HRESET/AVX_VNNI/UINTR out of avx512_usable.

Jakub Jelinek [Thu, 10 Dec 2020 23:36:21 +0000 (00:36 +0100)]
dojump: Fix up probabilities splitting in dojump.c comparison splitting [PR98212]

When compiling:
void foo (void);
void bar (float a, float b) { if (__builtin_expect (a != b, 1)) foo (); }
void baz (float a, float b) { if (__builtin_expect (a == b, 1)) foo (); }
void qux (float a, float b) { if (__builtin_expect (a != b, 0)) foo (); }
void corge (float a, float b) { if (__builtin_expect (a == b, 0)) foo (); }
on x86_64, we get (unimportant cruft removed):
bar:    ucomiss %xmm1, %xmm0
        jp      .L4
        je      .L1
.L4:    jmp     foo
.L1:    ret
baz:    ucomiss %xmm1, %xmm0
        jp      .L6
        jne     .L6
        jmp     foo
.L6:    ret
qux:    ucomiss %xmm1, %xmm0
        jp      .L13
        jne     .L13
        ret
.L13:   jmp     foo
corge:  ucomiss %xmm1, %xmm0
        jnp     .L18
.L14:   ret
.L18:   jne     .L14
        jmp     foo
(note for bar and qux that changed with a patch I've posted earlier today).
This is all reasonable, except for the last function.  There the overall
jump to the tail call is predicted unlikely (10%), so it is good that
jmp foo isn't on the straight line path, but NaNs are (or should be)
considered very unlikely in programs, so IMHO the right code (and the one
emitted with the following patch) is:
corge:  ucomiss %xmm1, %xmm0
        jp      .L14
        je      .L18
.L14:   ret
.L18:   jmp     foo

Let's discuss the probabilities in the above testcase:
for !and_them it looks all correct, so for
bar we split
if (a != b) goto t; // prob 90%
goto f;
into:
if (a unord b) goto t; // first_prob = prob * cprob = 90% * 1% = 0.9%
if (a ltgt b) goto t; // adjusted prob = (prob - first_prob) / (1 - first_prob) = (90% - 0.9%) / (1 - 0.9%) = 89.909%
and for qux we split
if (a != b) goto t; // prob 10%
goto f;
into:
if (a unord b) goto t; // first_prob = prob * cprob = 10% * 1% = 0.1%
if (a ltgt b) goto t; // adjusted prob = (prob - first_prob) / (1 - first_prob) = (10% - 0.1%) / (1 - 0.1%) = 9.910%
Now, the and_them cases should be probability wise exactly the same
if we swap the f and t labels, because baz
if (a == b) goto t; // prob 90%
goto f;
is equivalent to:
if (a != b) goto f; // prob 10%
goto t;
which is in qux.  This means we could expand baz as:
if (a unord b) goto f; // 0.1%
if (a ltgt b) goto f; // 9.910%
goto t;
However, we don't expand it exactly that way; instead (as the comment
says) we expand it as:
if (a ord b) ; else goto f; // first_prob as probability of ;
if (a uneq b) goto t; // adjusted prob
goto f;
So, first_prob.invert () should be 0.1% and adjusted prob should be
1 - 9.910%.
Thus, the right thing is 4 inverts:
prob = prob.invert (); // baz is equivalent to qux with swap(t, f) and thus inverted original prob
first_prob = prob.split (cprob.invert ()).invert ();
// cprob.invert because by doing if (cond) ; else goto f; we effectively invert the condition
// the second invert because first_prob is probability of ; rather than goto f
prob = prob.invert (); // lastly because adjusted prob we want is
// probability of goto t;, while the one from corresponding !and_them case
// would be if (...) goto f; goto t;
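
The percentages quoted above can be reproduced with a short sketch that
uses plain doubles instead of profile_probability.  The struct and function
names are hypothetical.  For the and_them case the sketch assumes that
cprob for the ORDERED first comparison is 99%, so that cprob.invert () is
the same 1% used in the !and_them case; that value is an assumption, not
something stated in this message.

```cpp
#include <cmath>

struct split_result
{
  double first;     // probability attached to the first jump
  double adjusted;  // probability attached to the second jump
};

// !and_them case: first_prob = prob * cprob and
// adjusted prob = (prob - first_prob) / (1 - first_prob).
split_result split(double prob, double cprob)
{
  split_result r;
  r.first = prob * cprob;
  r.adjusted = (prob - r.first) / (1.0 - r.first);
  return r;
}

// and_them case, following the four inverts described above.
split_result split_and_them(double prob, double cprob)
{
  double inverted = 1.0 - prob;                   // prob.invert ()
  split_result s = split(inverted, 1.0 - cprob);  // split (cprob.invert ())
  split_result r;
  r.first = 1.0 - s.first;        // .invert (): probability of falling through
  r.adjusted = 1.0 - s.adjusted;  // prob.invert (): probability of goto t
  return r;
}
```

With these definitions split (0.9, 0.01) gives roughly 0.9% and 89.909%,
split (0.1, 0.01) gives roughly 0.1% and 9.910%, and
split_and_them (0.9, 0.99) gives roughly 99.9% and 1 - 9.910%.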

2020-12-11  Jakub Jelinek  <jakub@redhat.com>

PR rtl-optimization/98212
* dojump.c (do_compare_rtx_and_jump): Change computation of
first_prob for and_them.  Add comment explaining and_them case.

* gcc.dg/predict-8.c: Adjust expected probability.

Jonathan Wakely [Thu, 10 Dec 2020 21:57:42 +0000 (21:57 +0000)]
libstdc++: Remove redundant branches in countl_one and countr_one [PR 98226]

There's no need to explicitly check for the maximum value, because the
function we call handles it correctly anyway.
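
A minimal sketch of why the branch is redundant, using hypothetical 32-bit
helpers rather than the libstdc++ internals: counting leading ones is the
same as counting leading zeros of the complement, and a count-leading-zeros
routine that already returns the full width for zero covers the all-ones
input with no special case.

```cpp
#include <cstdint>

// Counts leading zero bits; returns 32 when x == 0.
int countl_zero32(std::uint32_t x)
{
  int n = 0;
  for (std::uint32_t bit = 0x80000000u; bit != 0 && (x & bit) == 0; bit >>= 1)
    ++n;
  return n;
}

// Counts leading one bits.  No explicit check for the maximum value is
// needed: ~0xffffffff is 0, and countl_zero32 (0) already returns 32.
int countl_one32(std::uint32_t x)
{
  return countl_zero32(static_cast<std::uint32_t>(~x));
}
```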

libstdc++-v3/ChangeLog:

PR libstdc++/98226
* include/std/bit (__countl_one, __countr_one): Remove redundant
branches.

Andrew MacLeod [Thu, 10 Dec 2020 19:59:14 +0000 (14:59 -0500)]
Reduce memory requirements for ranger

Calculate block exit info upfront, and then any SSA_NAME which is never
used in an outgoing range calculation is a pure global and can bypass the
on-entry cache.

PR tree-optimization/98174
* gimple-range-cache.cc (ranger_cache::ssa_range_in_bb): Only push
poor values to be examined if it isn't a pure global.
(ranger_cache::block_range): Don't process pure globals.
(ranger_cache::fill_block_cache): Adjust has_edge_range call.
* gimple-range-gori.cc (gori_map::all_outgoing): New bitmap.
(gori_map::gori_map): Allocate all_outgoing.
(gori_map::is_export_p): No specified BB returns global context.
(gori_map::calculate_gori): Accumulate each block into global.
(gori_compute::gori_compute): Preprocess each block for exports.
(gori_compute::has_edge_range_p): No edge returns global context.
* gimple-range-gori.h (has_edge_range_p): Provide default parameter.

Ed Schonberg [Thu, 10 Dec 2020 21:26:57 +0000 (22:26 +0100)]
Fix PR ada/98230

It's a rather curious malfunction of the 'Mod attribute applied to the
variable of a loop whose upper bound is dynamic.

gcc/ada/ChangeLog:
PR ada/98230
* exp_attr.adb (Expand_N_Attribute_Reference, case Mod): Use base
type of argument to obtain static bound and required size.

gcc/testsuite/ChangeLog:
* gnat.dg/modular6.adb: New test.

Jason Merrill [Wed, 2 Sep 2020 20:47:37 +0000 (16:47 -0400)]
c++: Add make_temp_override generator functions

A common pattern before C++17 is the generator function, used to avoid
having to specify the type of a container element by using a function call
to get type deduction; for example, std::make_pair.  C++17 added class template
argument deduction, making generator functions unnecessary for many uses,
but GCC won't be written in C++17 for years yet.
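
The generator pattern might look roughly like the sketch below.  The names
type_identity and make_temp_override come from the ChangeLog entry, but the
bodies here are a simplified assumption, wrapped in a hypothetical
namespace sketch, and are not copied from cp-tree.h.

```cpp
#include <utility>

namespace sketch {

// RAII helper: overrides a variable and restores it on scope exit.
template <typename T>
class temp_override
{
  T &ref_;
  T saved_;
  bool active_;
public:
  explicit temp_override(T &ref) : ref_(ref), saved_(ref), active_(true) {}
  temp_override(T &ref, T value) : ref_(ref), saved_(ref), active_(true)
  { ref_ = value; }
  // Moving hands the restore duty to the new object, so returning by value
  // is safe even where copy elision is not guaranteed.
  temp_override(temp_override &&other)
    : ref_(other.ref_), saved_(std::move(other.saved_)),
      active_(other.active_)
  { other.active_ = false; }
  ~temp_override() { if (active_) ref_ = saved_; }
};

// Makes a parameter a non-deduced context, so T is deduced only from the
// variable being overridden.
template <typename T> struct type_identity { typedef T type; };

// Generator functions: the caller never spells out T.
template <typename T>
temp_override<T> make_temp_override(T &var)
{ return temp_override<T>(var); }

template <typename T>
temp_override<T> make_temp_override(T &var,
                                    typename type_identity<T>::type value)
{ return temp_override<T>(var, value); }

} // namespace sketch
```

A caller would write auto ov = sketch::make_temp_override (depth, 5); and
the previous value of depth comes back when ov goes out of scope.  The
type_identity wrapper lets an int literal override a long variable without
a deduction conflict.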

gcc/cp/ChangeLog:

* cp-tree.h (struct type_identity): New.
(make_temp_override): New.
* decl.c (grokdeclarator): Use it.
* except.c (maybe_noexcept_warning): Use it.
* parser.c (cp_parser_enum_specifier): Use it.
(cp_parser_parameter_declaration_clause): Use it.
(cp_parser_gnu_attributes_opt): Use it.
(cp_parser_std_attribute): Use it.

Jason Merrill [Thu, 10 Dec 2020 16:21:50 +0000 (11:21 -0500)]
c++: Update value of __cplusplus for C++20.

It's past time to update this macro to the specified value for C++20.
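
For reference, the value the standard specifies for C++20 is 202002L.  A
minimal sketch of how code can map the macro to a language revision
follows; the function name standard_year is hypothetical.

```cpp
// Maps a __cplusplus value to the year of the newest standard revision it
// covers.  Values below 201103L are reported as 1998.
int standard_year(long value)
{
  if (value >= 202002L) return 2020;
  if (value >= 201703L) return 2017;
  if (value >= 201402L) return 2014;
  if (value >= 201103L) return 2011;
  return 1998;
}
```

Calling standard_year (__cplusplus) in a translation unit compiled with
-std=c++20 is expected to return 2020 once the macro carries the final
value.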

libcpp/ChangeLog:

* init.c (cpp_init_builtins): Update __cplusplus for C++20.

Marek Polacek [Thu, 10 Dec 2020 20:34:19 +0000 (15:34 -0500)]
c++: Add fixed test [PR91506]

Pre-r11-557 we issued a bogus

  error: parameter may not have variably modified type 'double [x]'

but now we compile this, as we should.

gcc/testsuite/ChangeLog:

PR c++/91506
* g++.dg/init/array60.C: New test.

Nathan Sidwell [Thu, 10 Dec 2020 19:33:35 +0000 (11:33 -0800)]
c++: modules & using-decls

This extends using-decls to modules.  In modules you can export a
using decl, but the exported decl must have external linkage already.
One thing you can do is export something from the GMF.

The novel thing is that now 'export using foo::bar;' *in namespace
bar* can mean something significant (rather than be an obscure nop).

gcc/cp/
* name-lookup.c (do_nonmember_using_decl): Add INSERT_P parm.
Deal with exporting using decls.
(finish_nonmember_using_decl): Examine BINDING_VECTOR.

Nathan Sidwell [Thu, 10 Dec 2020 18:19:07 +0000 (10:19 -0800)]
c++: Name lookup for modules

This augments the name lookup with knowledge about the BINDING_VECTOR.
That holds per-module namespace bindings, and we need to collect the
bindings in visible imports when we do lookup.  We also need to do
some checking when we're pushing a new decl to check we're not
overriding an existing visible binding in some way.

To deal with the Global Module and Module Partitions, we reserve 1 or
2 slots in the BINDING_VECTOR to record those entities that may
legitimately appear in more than one module.

As mentioned before, the BINDING_VECTOR is created lazily, when
imported bindings appear.  The current TU's decls then appear on slot
zero.

gcc/cp/
* cp-tree.h (visible_instantiation_path): Renamed.
* module.cc (get_originating_module_decl, lazy_load_binding)
(lazy_load_members, visible_instantiation_path): Stubs.
* name-lookup.c (STAT_TYPE_VISIBLE_P, STAT_VISIBLE): New.
(search_imported_binding_slot, init_global_partition)
(get_fixed_binding_slot): New.
(name_lookup::process_module_binding): New.
(name_lookup::search_namespace_only): Search BINDING_VECTOR.
(name_lookup::adl_namespace_fns): Likewise.
(name_lookup::search_adl): Search visible instantiation path.
(maybe_lazily_declare): Maybe lazy load members.
(implicitly_export_namespace): New.
(maybe_record_mergeable_decl): New.
(check_module_override): New.
(do_pushdecl): Deal with BINDING_VECTOR, check override.
(add_mergeable_namespace_entity): New.
(get_namespace_binding): Deal with BINDING_VECTOR.
(do_namespace_alias): Call set_originating_module.
(lookup_elaborated_type_1): Deal with BINDING_VECTOR.
(do_pushtag): Call set_originating_module.
(reuse_namespace): New.
(make_namespace_finish): Add FROM_IMPORT parm.
(push_namespace): Deal with BINDING_VECTOR & namespace reuse.
(maybe_save_operator_binding): Save when module CMI in play.
* name-lookup.h (add_mergeable_namespace_entity): Declare.

Nathan Sidwell [Thu, 10 Dec 2020 16:28:31 +0000 (08:28 -0800)]
c++: modularize spelling suggestions

This augments the spelling suggestion code to understand about visible
imported modules.  Simply consider each visible binding in the
binding_vector, until we find one that has something of interest.

gcc/cp/
* name-lookup.c: Include bitmap.h.
(enum binding_slots): New.
(maybe_add_fuzzy_binding): Return bool true if found.
(consider_binding_level): Add module support.
* module.cc (get_import_bitmap): Stub.

Dennis Zhang [Thu, 10 Dec 2020 15:36:23 +0000 (15:36 +0000)]
arm: Fix typo in testcase mve-vsub_1.c

gcc/testsuite/
* gcc.target/arm/simd/mve-vsub_1.c: Fix typo.
Remove needless dg-additional-options.

Marek Polacek [Thu, 10 Dec 2020 14:56:40 +0000 (09:56 -0500)]
c++: Add fixed test [PR68451]

I was about to add this test with dg-ice but it turned out it had
already been fixed by the recent r11-3361!

gcc/testsuite/ChangeLog:

PR c++/68451
* g++.dg/cpp0x/friend6.C: New test.

Nathan Sidwell [Thu, 10 Dec 2020 14:54:37 +0000 (06:54 -0800)]
c++: name-lookup refactoring

Here are some refactorings to the name-lookup machinery.  Primarily they
break out worker functions that the modules patch will also use, and
fix a couple of comments on the way.

gcc/cp/
* name-lookup.c (pop_local_binding): Check for IDENTIFIER_ANON_P.
(update_binding): Level may be null, don't add namespaces to
level.
(newbinding_bookkeeping): New, broken out of ...
(do_pushdecl): ... here, call it.  Don't push anonymous decls.
(pushdecl, add_using_namespace): Correct comments.
(do_push_nested_namespace): Remove assert.
(make_namespace, make_namespace_finish): New, broken out of ...
(push_namespace): ... here.  Call them.  Add namespace to level
here.

Eric Botcazou [Thu, 10 Dec 2020 14:35:28 +0000 (15:35 +0100)]
Small fix to PLACEHOLDER_EXPR handling in loc_list_from_tree_1

This handles the discriminated record types of Ada: the PLACEHOLDER_EXPR is
the "template" expression for the discriminant in the type definition. Now
for some components, typically arrays whose upper bound is the discriminant,
the compiler creates a local subtype for the component, so the code needs to
be able to deal with this nested type.

gcc/ChangeLog:
* dwarf2out.c (loc_list_from_tree_1) <PLACEHOLDER_EXPR>: Deal with
a nested context type

Nathan Sidwell [Wed, 9 Dec 2020 20:18:06 +0000 (12:18 -0800)]
c++: Module-specific error and tree dumping

With modules, we need the ability to name 'foos' in different modules.
The idiom for that is a trailing '@modulename' suffix.  This adds that
to the error printing routines.  I also augment the tree dumping
machinery to show module-specific metadata.

gcc/cp/
* error.c (dump_module_suffix): New.
(dump_aggr_type, dump_simple_decl, dump_function_name): Call it.
* ptree.c (cxx_print_decl): Print module information.
* module.cc (module_name, get_importing_module): Stubs.

Nathan Sidwell [Wed, 9 Dec 2020 18:46:58 +0000 (10:46 -0800)]
c++: name-lookup cleanups

Name-lookup is the most changed piece of the front end for modules.
Here are some preparatory cleanups and API extensions.

gcc/cp/
* name-lookup.h (set_class_bindings): Return vector, take signed
'extra' parm.
* name-lookup.c (maybe_lazily_declare): Break out ...
(get_class_binding): ... of here, call it.
(find_member_slot): Adjust get_class_bindings call.
(set_class_bindings): Allow -ve extra.  Return the vector.
(set_identifier_type_value_with_scope): Remove checking assert.
(lookup_using_decl): Set decl's context.
(do_pushtag): Adjust set_identifier_type_value_with_scope handling.

Bernd Edlinger [Mon, 7 Dec 2020 11:00:00 +0000 (12:00 +0100)]
Remove misleading debug line entries

This removes gimple_debug_begin_stmts without block info which remain
after a gimple block originating from an inline function is unused.

The line numbers from these stmts are from the inline function,
but since the inline function is completely optimized away,
there will be no DW_TAG_inlined_subroutine so the debugger has
no callstack available at this point, and therefore those
line table entries are not helpful to the user.

2020-12-10  Bernd Edlinger  <bernd.edlinger@hotmail.de>

* cfgexpand.c (expand_gimple_basic_block): Remove special handling
of debug_inline_entries without block info.
* tree-inline.c (remap_gimple_stmt): Drop debug_nonbind_markers when
the call statement has no block info.
(copy_debug_stmt): Remove debug_nonbind_markers when inlining
and the block info is mapped to NULL.
* tree-ssa-live.c (clear_unused_block_pointer): Remove
debug_nonbind_markers originating from removed inline functions.

Richard Biener [Thu, 10 Dec 2020 12:33:12 +0000 (13:33 +0100)]
remove obsolete conversion handling from vectorizable_assignment

This removes an odd special-case of VECTOR_BOOLEAN_TYPE_P typed
conversions from vectorizable_assignment that was obsoleted by
making all integer mode VECTOR_BOOLEAN_TYPE_P types have 1-bit
precision bool components with 605c2a393d3a2db8

2020-12-10  Richard Biener  <rguenther@suse.de>

* tree-vect-stmts.c (vectorizable_assignment): Remove special
allowance of VECTOR_BOOLEAN_TYPE_P conversions.

Christophe Lyon [Thu, 12 Nov 2020 20:16:05 +0000 (20:16 +0000)]
arm: Auto-vectorization for MVE: vand

This patch enables MVE vandq instructions for auto-vectorization.  MVE
vandq insns in mve.md are modified to use 'and' instead of unspec
expression to support and<mode>3.  The and<mode>3 expander is added to
vec-common.md

2020-12-03  Christophe Lyon  <christophe.lyon@linaro.org>

gcc/
* config/arm/iterators.md (supf): Remove VANDQ_S and VANDQ_U.
(VANQ): Remove.
(VDQ): Add TARGET_HAVE_MVE condition where relevant.
* config/arm/mve.md (mve_vandq_u<mode>): New entry for vand
instruction using expression 'and'.
(mve_vandq_s<mode>): New expander.
(mve_vaddq_n_f<mode>): Use 'and' code instead of unspec.
* config/arm/neon.md (and<mode>3): Rename into and<mode>3_neon.
* config/arm/predicates.md (imm_for_neon_inv_logic_operand):
Enable for MVE.
* config/arm/unspecs.md (VANDQ_S, VANDQ_U, VANDQ_F): Remove.
* config/arm/vec-common.md (and<mode>3): New expander.

gcc/testsuite/
* gcc.target/arm/simd/mve-vand.c: New test.

Richard Sandiford [Thu, 10 Dec 2020 12:10:00 +0000 (12:10 +0000)]
data-ref: Rework integer handling in split_constant_offset [PR98069]

PR98069 is about a case in which split_constant_offset miscategorises
an expression of the form:

  int foo;
  …
  POINTER_PLUS_EXPR<base, (sizetype)(INT_MIN - foo) * size>

as:

  base: base
  offset: (sizetype) (-foo) * size
  init: INT_MIN * size

“-foo” overflows when “foo” is INT_MIN, whereas the original expression
didn't overflow in that case.

As discussed in the PR trail, we could simply ignore the fact that
int overflow is undefined and treat it as a wrapping type, but that
is likely to pessimise quite a few cases.

This patch instead reworks split_constant_offset so that:

- it treats integer operations as having an implicit cast to sizetype
- for integer operations, the returned VAR has type sizetype

In other words, the problem becomes to express:

  (sizetype) (OP0 CODE OP1)

as:

  VAR:sizetype + (sizetype) OFF:ssizetype

The top-level integer split_constant_offset will (usually) be a sizetype
POINTER_PLUS operand, so the extra cast to sizetype disappears.  But adding
the cast allows the conversion handling to defer a lot of the difficult
cases to the recursive split_constant_offset call, which can detect
overflow on individual operations.

The net effect is to analyse the access above as:

  base: base
  offset: -(sizetype) foo * size
  init: INT_MIN * size

See the comments in the patch for more details.
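
The difference between the two decompositions can be checked numerically.
The sketch below assumes a 64-bit sizetype modelled by std::uint64_t and
uses hypothetical function names.  It negates foo only after the cast to
the unsigned type, where wraparound is well defined, and compares the
result with the original expression.

```cpp
#include <climits>
#include <cstdint>

// Original expression: (sizetype) (INT_MIN - foo) * size.
// Only valid for foo <= 0, where INT_MIN - foo does not overflow.
std::uint64_t offset_direct(int foo, std::uint64_t size)
{
  return static_cast<std::uint64_t>(INT_MIN - foo) * size;
}

// Decomposition produced by the patch:
//   offset: -(sizetype) foo * size
//   init:   INT_MIN * size
// The negation happens in the unsigned type, so foo == INT_MIN is fine.
std::uint64_t offset_split(int foo, std::uint64_t size)
{
  std::uint64_t offset = (0 - static_cast<std::uint64_t>(foo)) * size;
  std::uint64_t init = static_cast<std::uint64_t>(INT_MIN) * size;
  return offset + init;
}
```

For foo == INT_MIN both functions return 0, whereas forming -foo in int
first, as the old decomposition did, would overflow.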

gcc/
PR tree-optimization/98069
* tree-data-ref.c (compute_distributive_range): New function.
(nop_conversion_for_offset_p): Likewise.
(split_constant_offset): In the internal overload, treat integer
expressions as having an implicit cast to sizetype and express
them accordingly.  Pass back the range of the original (uncast)
expression in a new range parameter.
(split_constant_offset_1): Likewise.  Rework the handling of
conversions to account for the implicit sizetype casts.

Joel Hutton [Thu, 10 Dec 2020 11:55:18 +0000 (11:55 +0000)]
[VECT] pr97929 fix

This addresses pr97929. The cases for WIDEN_PLUS and WIDEN_MINUS were
missing in vect_get_smallest_scalar_type.

gcc/ChangeLog:

PR tree-optimization/97929
* tree-vect-data-refs.c (vect_get_smallest_scalar_type): Add
WIDEN_PLUS/WIDEN_MINUS case.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/pr97929.c: New test.

Joel Hutton [Thu, 10 Dec 2020 11:54:03 +0000 (11:54 +0000)]
Add WIDEN_PLUS, WIDEN_MINUS pretty print

Add 'w+'/'w-' as WIDEN_PLUS/WIDEN_MINUS respectively.
Add VEC_WIDEN_PLUS/MINUS_HI/LO<...> for
VEC_WIDEN_PLUS/MINUS_HI/LO

gcc/ChangeLog:

* tree-pretty-print.c (dump_generic_node): Add case for
VEC_WIDEN_(PLUS/MINUS)_(HI/LO)_EXPR and WIDEN_(PLUS/MINUS)_EXPR.

Richard Biener [Thu, 10 Dec 2020 10:12:53 +0000 (11:12 +0100)]
tree-optimization/98211 - fix bogus vectorization of conversion

Pattern recog incompletely handles some bool cases, but as a result we
should fail to vectorize rather than miscompile.  Unfortunately
vectorizable_assignment lets invalid conversions (that
vectorizable_conversion rejects) slip through.  The following
rectifies that.

2020-12-10  Richard Biener  <rguenther@suse.de>

PR tree-optimization/98211
* tree-vect-stmts.c (vectorizable_assignment): Disallow
invalid conversions to bool vector types.

* gcc.dg/pr98211.c: New testcase.

Alexandre Oliva [Thu, 10 Dec 2020 09:23:36 +0000 (06:23 -0300)]
drop __builtin_ from __clear_cache libname

I made a cut&pasto in my previous patch for tree.c, causing platforms
that have CLEAR_INSN_CACHE defined, and none of the internal
__clear_cache expansion overriders, to issue calls to symbols named
__builtin___clear_cache rather than __clear_cache, on languages other
than those in the C family.  Oops.

This patch removes __builtin_ from the string used as the libname for
__builtin___clear_cache.

for  gcc/ChangeLog

* tree.c (build_common_builtin_nodes): Drop __builtin_ from
__clear_cache libname.

Jakub Jelinek [Thu, 10 Dec 2020 11:03:30 +0000 (12:03 +0100)]
dojump: Improve float != comparisons on x86 [PR98212]

The x86 backend doesn't have EQ or NE floating point comparisons,
so splits x != y into x unord y || x <> y.  The problem with that is
that unord comparison doesn't trap on qNaN operands but LTGT does.
The end effect is that it doesn't trap on qNaN operands, because x unord y
will be true for those and so LTGT will not be performed, but as the backend
is currently unable to merge signalling and non-signalling comparisons (and
after all, with this exact exception it shouldn't unless the first one is
signalling and the second one is non-signalling) it means we end up with:
        ucomiss %xmm1, %xmm0
        jp      .L4
        comiss  %xmm1, %xmm0
        jne     .L4
        ret
        .p2align 4,,10
        .p2align 3
.L4:
        xorl    %eax, %eax
        jmp     foo
where the comiss is the signalling comparison, even though the right flag
bits have already been computed by the ucomiss insn.

The following patch, if the target supports UNEQ comparisons, splits NE
as x unord y || !(x uneq y) instead, which in the end means we end up with
just:
        ucomiss %xmm1, %xmm0
        jp      .L4
        jne     .L4
        ret
        .p2align 4,,10
        .p2align 3
.L4:
        jmp     foo
because UNEQ, like UNORDERED, is non-signalling.
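
The logical equivalence of the two splits can be sketched with the
<cmath> comparison macros.  The function names are hypothetical, and the
sketch says nothing about trapping: std::islessgreater is itself a quiet
comparison, so only the truth values are modelled here.

```cpp
#include <cmath>

// Old split: x != y  ==  x unord y || x ltgt y.
bool ne_via_ltgt(float x, float y)
{
  return std::isunordered(x, y) || std::islessgreater(x, y);
}

// UNEQ: true when the operands are unordered or equal.
bool uneq(float x, float y)
{
  return std::isunordered(x, y) || x == y;
}

// New split: x != y  ==  x unord y || !(x uneq y).
bool ne_via_uneq(float x, float y)
{
  return std::isunordered(x, y) || !uneq(x, y);
}
```

Both forms agree with x != y for ordered and unordered inputs alike; the
patch prefers the second because it can reuse the flags of a single
non-signalling compare.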

2020-12-10  Jakub Jelinek  <jakub@redhat.com>

PR rtl-optimization/98212
* dojump.c (do_compare_rtx_and_jump): When splitting NE and backend
can do UNEQ, prefer splitting x != y into x unord y || !(x uneq y)
instead of into x unord y || x ltgt y.

* gcc.target/i386/pr98212.c: New test.

Jakub Jelinek [Thu, 10 Dec 2020 10:46:08 +0000 (11:46 +0100)]
dojump: Optimize a == a or a != a [PR98169]

If the backend doesn't have floating point EQ or NE comparison, dojump.c
splits it into ORDERED && UNEQ or UNORDERED || LTGT.  If both comparison
operands are the same, we know the result of the second comparison though,
a == a is equivalent to a ord a and a != a is equivalent to a unord a,
and thus can just use ORDERED or UNORDERED.
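
A minimal sketch of the simplification, with hypothetical function names:
when both operands are the same value, equality reduces to an ordered
check and inequality to an unordered check.

```cpp
#include <cmath>

// a == a holds exactly when a is ordered with itself, i.e. not a NaN.
bool self_eq(double a)
{
  return !std::isunordered(a, a);
}

// a != a holds exactly when a is unordered with itself, i.e. a NaN.
bool self_ne(double a)
{
  return std::isunordered(a, a);
}
```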

On the testcase, this changes f1:
- ucomiss %xmm0, %xmm0
- movl $1, %eax
- jp .L3
- jne .L3
- ret
- .p2align 4,,10
- .p2align 3
-.L3:
  xorl %eax, %eax
+ ucomiss %xmm0, %xmm0
+ setnp %al
and f3:
- ucomisd %xmm0, %xmm0
- movl $1, %eax
- jp .L8
- jne .L8
- ret
- .p2align 4,,10
- .p2align 3
-.L8:
  xorl %eax, %eax
+ ucomisd %xmm0, %xmm0
+ setnp %al
while keeping the same code for f2 and f4.

2020-12-10  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/98169
* dojump.c (do_compare_rtx_and_jump): Don't split self-EQ/NE
comparisons, just use ORDERED or UNORDERED.

* gcc.target/i386/pr98169.c: New test.

Jakub Jelinek [Thu, 10 Dec 2020 10:07:07 +0000 (11:07 +0100)]
openmp: Fix ICE with broken doacross loop [PR98205]

If the loop body doesn't ever continue, we don't have a bb to insert the
updates.  Fixed by not adding them at all in that case.

2020-12-10  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/98205
* omp-expand.c (expand_omp_for_generic): Fix up broken_loop handling.

* c-c++-common/gomp/doacross-4.c: New test.

Richard Biener [Thu, 10 Dec 2020 09:34:32 +0000 (10:34 +0100)]
Allow scalar fallback for pattern root stmt

This adjusts the SLP build to allow a pattern root stmt to be
built from scalars.  I've noticed this in PR98211 where we fail
to promote a SLP subtree to a simple splat operation and instead
emit a series of uniform vector operations.  The bb-slp-div-1.c
testcase is now vectorized on x86_64 but only the store so I
adjusted it to expect the load to be vectorized.

2020-12-10  Richard Biener  <rguenther@suse.de>

* tree-vect-slp.c (vect_get_and_check_slp_defs): Do
not mark the defs to occur in a pattern if it is the
pattern root and record the original stmt defs in that
case.

* gcc.dg/vect/bb-slp-div-1.c: Expect the load to be
vectorized.

Simon Cook [Wed, 9 Dec 2020 10:39:28 +0000 (10:39 +0000)]
RISC-V: Explicitly call python when using multilib generator

When building GCC for RISC-V with the --with-multilib-generator option,
it may not be possible to call arch-canonicalize as an executable when
building on Windows. Instead directly invoke the expected python
interpreter for this step.

gcc/ChangeLog:

* config/riscv/multilib-generator (arch_canonicalize): Invoke
python interpreter when calling arch-canonicalize script.

3 years ago-fdump-go-spec: ignore type ordering of incomplete types
Nikhil Benesch [Thu, 10 Dec 2020 02:46:02 +0000 (18:46 -0800)]
-fdump-go-spec: ignore type ordering of incomplete types

gcc/:
* godump.c (go_format_type): Don't consider whether a type has
been seen when determining whether to output a type by name.
Consider only the use_type_name parameter.
(go_output_typedef): When outputting a typedef, format the
declaration's original type, which contains the name of the
underlying type rather than the name of the typedef.
gcc/testsuite:
* gcc.misc-tests/godump-1.c: Add test case.

3 years agogo-test.exp: recognize errorcheckdir -n
Ian Lance Taylor [Thu, 10 Dec 2020 00:34:14 +0000 (16:34 -0800)]
go-test.exp: recognize errorcheckdir -n

* go.test/go-test.exp (go-gc-tests): Recognize errorcheckdir -n,
for bug345.go.

3 years agoDaily bump.
GCC Administrator [Thu, 10 Dec 2020 00:16:47 +0000 (00:16 +0000)]
Daily bump.

3 years agogo-test.exp: rewrite errchk regexp quoting
Ian Lance Taylor [Wed, 9 Dec 2020 23:43:44 +0000 (15:43 -0800)]
go-test.exp: rewrite errchk regexp quoting

* go.test/go-test.exp (errchk): Rewrite regexp quoting to use
curly braces, making it much simpler.

3 years agophiopt: Fix up two_value_replacement BOOLEAN_TYPE handling for Ada [PR98188]
Jakub Jelinek [Wed, 9 Dec 2020 22:52:25 +0000 (23:52 +0100)]
phiopt: Fix up two_value_replacement BOOLEAN_TYPE handling for Ada [PR98188]

For Ada with LTO, boolean_{false,true}_node can be 1-bit precision boolean,
while TREE_TYPE (lhs) can be 8-bit precision boolean and thus we can end up
with wide_int mismatches.

For non-VR_RANGE, this patch just uses VARYING min/max manually.
The min + 1 != max check will then do the rest.
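
As a rough sketch of why the min + 1 != max check is enough: a full ("varying") range has exactly two values only for a 1-bit type, so wider booleans are rejected by the check itself. The Range struct and helper names are assumptions for illustration; GCC uses wide_int rather than plain integers.

```cpp
#include <cstdint>

// Hypothetical stand-in for a value range; not GCC's wide_int API.
struct Range { std::int64_t min, max; };

// Full range of an unsigned type with the given precision in bits
// (precision is assumed to be well below 63 in this sketch).
Range varying_unsigned(unsigned precision) {
  return Range{0, (std::int64_t{1} << precision) - 1};
}

// True when the range holds exactly two adjacent values,
// i.e. the negation of the "min + 1 != max" rejection test.
bool has_two_values(Range r) { return r.min + 1 == r.max; }
```

With this, a 1-bit boolean passes and an 8-bit boolean without range information does not.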

2020-12-09  Jakub Jelinek  <jakub@redhat.com>

PR bootstrap/98188
* tree-ssa-phiopt.c (two_value_replacement): Don't special case
BOOLEAN_TYPEs for ranges, instead if get_range_info doesn't return
VR_RANGE, set min/max to wi::min/max_value.

3 years agoaarch64: Add +pauth to -march
Przemyslaw Wirkus [Wed, 9 Dec 2020 21:29:58 +0000 (21:29 +0000)]
aarch64: Add +pauth to -march

New +pauth (Pointer Authentication from Armv8.3-A) feature option for
-march command line option.

Please note that the majority of PAUTH instructions are implemented behind the HINT
instruction. PAUTH stays an Armv8.3-A feature but can now be assigned to other
architectures or CPUs.

gcc/ChangeLog:

* config/aarch64/aarch64-option-extensions.def
(AARCH64_OPT_EXTENSION): New +pauth option in -march for AArch64.
* config/aarch64/aarch64.h (AARCH64_FL_PAUTH): New pauth extension bitmask.
(AARCH64_ISA_PAUTH): New ISA bitmask for PAUTH.
(AARCH64_FL_FOR_ARCH8_3): Add PAUTH to Armv8.3-A.
(TARGET_PAUTH): New target mask to isolate PAUTH instructions.
* config/aarch64/aarch64.md (do_return): Condition set to TARGET_PAUTH.
* doc/invoke.texi: Update docs for +flagm and +pauth.

3 years agoi386: Remove REG_ALLOC_ORDER definition
Uros Bizjak [Wed, 9 Dec 2020 20:06:07 +0000 (21:06 +0100)]
i386: Remove REG_ALLOC_ORDER definition

REG_ALLOC_ORDER just defines what the default is set to.

2020-12-09  Uroš Bizjak  <ubizjak@gmail.com>

gcc/
* config/i386/i386.h (REG_ALLOC_ORDER): Remove.

3 years agolibstdc++: Fix build failure for target with no way to sleep
Jonathan Wakely [Wed, 9 Dec 2020 16:53:18 +0000 (16:53 +0000)]
libstdc++: Fix build failure for target with no way to sleep

In previous releases the std::this_thread::sleep_for function was only
declared if the target supports multiple threads. I changed that
recently in r11-2649-g5bbb1f3000c57fd4d95969b30fa0e35be6d54ffb so that
sleep_for could be used single-threaded. But that means that targets
using --disable-threads are now required to provide some way to sleep.
This breaks the build for (at least) AVR when trying to build a hosted
library.

This patch adds a new autoconf macro that is defined when no way to
sleep is available, and uses that to suppress the sleeping functions in
std::this_thread.

The #error in src/c++11/thread.cc is retained for the case where there
is no sleep function available but multiple threads are supported. This
is consistent with previous releases, but that #error could probably be
removed without any consequences.
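
For user code that must build on such targets, one possible pattern is sketched below. It assumes that testing _GLIBCXX_NO_SLEEP (the macro named in the ChangeLog that follows) is an acceptable way to detect the situation; the helper name is hypothetical.

```cpp
#include <chrono>
#include <thread>

// Sleeps for at least the requested time when the library provides
// sleep_for, and reports whether it did so.  The helper is hypothetical;
// only the macro name comes from this patch.
bool sleep_if_possible(std::chrono::milliseconds d) {
#ifdef _GLIBCXX_NO_SLEEP
  (void)d;
  return false;   // no nanosleep, sleep or Sleep on this target
#else
  std::this_thread::sleep_for(d);
  return true;
#endif
}
```

On other standard libraries the macro is simply undefined and the sleeping branch is used.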

libstdc++-v3/ChangeLog:

* acinclude.m4 (GLIBCXX_ENABLE_LIBSTDCXX_TIME): Define NO_SLEEP
if none of nanosleep, sleep and Sleep is available.
* config.h.in: Regenerate.
* configure: Regenerate.
* include/std/thread [_GLIBCXX_NO_SLEEP] (__sleep_for): Do
not declare.
[_GLIBCXX_NO_SLEEP] (sleep_for, sleep_until): Do not
define.
* src/c++11/thread.cc [_GLIBCXX_NO_SLEEP] (__sleep_for): Do
not define.

3 years agotree-optimization/98213 - cache PHI walking result in SM
Richard Biener [Wed, 9 Dec 2020 14:48:36 +0000 (15:48 +0100)]
tree-optimization/98213 - cache PHI walking result in SM

This avoids exponential work when walking PHIs in loop store motion.
Failures are propagated quickly and thus need no caching.
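
The idea can be sketched on a toy graph. This is not the code in tree-ssa-loop-im.c; the Node type and walk function are assumptions for illustration. Only successful nodes are cached, because a failure unwinds through every caller at once.

```cpp
#include <set>
#include <vector>

// Hypothetical DAG node standing in for a PHI; not GCC's data structures.
struct Node {
  bool ok;                  // whether this node itself is acceptable
  std::vector<int> preds;   // indices of predecessor nodes
};

// Returns true if node n and everything reachable from it is acceptable.
// Successes are cached in `done`; `visits` counts uncached visits.
bool walk(const std::vector<Node>& g, int n, std::set<int>& done, int& visits) {
  if (done.count(n)) return true;
  ++visits;
  if (!g[n].ok) return false;
  for (int p : g[n].preds)
    if (!walk(g, p, done, visits)) return false;
  done.insert(n);
  return true;
}
```

On a ladder where every node lists its predecessor twice, the uncached walk doubles its work at each level, while the cached walk visits each node once.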

2020-12-09  Richard Biener  <rguenther@suse.de>

PR tree-optimization/98213
* tree-ssa-loop-im.c (sm_seq_valid_bb): Cache successfully
processed PHIs.
(hoist_memory_references): Adjust.

* g++.dg/pr98213.C: New testcase.

3 years agoc++: Module parsing
Nathan Sidwell [Wed, 9 Dec 2020 15:18:23 +0000 (07:18 -0800)]
c++: Module parsing

This adds the module-declaration parsing and other logic.  We have two
new kinds of declaration -- module and import.  Plus the ability to
export other declarations.  The module processing can also divide the
TU into several portions -- GMF, Purview and PMF.

There are restrictions that some declarations must or must not appear
in a #include, so I needed to add a bit to indicate whether a token
came from the main source or not.  This seemed the least unpleasant
way of implementing such a check.

gcc/cp/
* parser.h (struct cp_token): Add main_source_p field.
* parser.c (cp_lexer_new_main): Pass through module token filter.
Check macros.
(cp_lexer_get_preprocessor_token): Set main_source_p.
(enum module_parse): New.
(cp_parser_diagnose_invalid_type_name): Deal with unrecognized
module-directives.
(cp_parser_skip_to_closing_parenthesis_1): Skip module-directives.
(cp_parser_skip_to_end_of_statement): Likewise.
(cp_parser_skip_to_end_of_block_or_statement): Likewise.
(cp_parser_translation_unit): Add module parsing calls.
(cp_parser_module_name, cp_parser_module_declaration): New.
(cp_parser_import_declaration, cp_parser_module_export): New.
(cp_parser_declaration): Add module export detection.
(cp_parser_template_declaration): Adjust 'export' error message.
(cp_parser_function_definition_after_declarator): Add
module-specific logic.
* module.cc (import_module, declare_module)
(maybe_check_all_macros): Stubs.

3 years agoc++: Fix printing of decltype(nullptr) [PR97517]
Marek Polacek [Tue, 8 Dec 2020 21:44:53 +0000 (16:44 -0500)]
c++: Fix printing of decltype(nullptr) [PR97517]

The C++ printer doesn't handle NULLPTR_TYPE, so we issue the ugly
"'nullptr_type' not supported by...".  Since NULLPTR_TYPE is
decltype(nullptr), it seemed reasonable to handle it where we
handle DECLTYPE_TYPE, that is, in the simple-type-specifier handler.
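
For context, the type in question is the one the standard library calls std::nullptr_t. The short sketch below shows that identity and a hypothetical overload set (classify is a made-up name) in which the type can appear in a diagnostic.

```cpp
#include <cstddef>
#include <type_traits>

// decltype(nullptr) is the type the standard library names std::nullptr_t.
static_assert(std::is_same<decltype(nullptr), std::nullptr_t>::value,
              "decltype(nullptr) and std::nullptr_t are the same type");

// Hypothetical overload set: the first overload is an exact match for nullptr.
int classify(decltype(nullptr)) { return 0; }
int classify(const void*) { return 1; }
```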

gcc/cp/ChangeLog:

PR c++/97517
* cxx-pretty-print.c (cxx_pretty_printer::simple_type_specifier): Handle
NULLPTR_TYPE.
(pp_cxx_type_specifier_seq): Likewise.
(cxx_pretty_printer::type_id): Likewise.

gcc/testsuite/ChangeLog:

PR c++/97517
* g++.dg/diagnostic/nullptr.C: New test.

3 years agotestsuite: fix 2 tests on aarch64
Martin Liska [Wed, 9 Dec 2020 14:24:36 +0000 (15:24 +0100)]
testsuite: fix 2 tests on aarch64

gcc/testsuite/ChangeLog:

PR tree-optimization/98182
* gcc.dg/tree-ssa/if-to-switch-1.c: Add case-values-threshold in
order to fix them for aarch64.
* gcc.dg/tree-ssa/if-to-switch-10.c: Likewise.

3 years agoaarch64: Add CPU-specific SVE vector costs struct
Kyrylo Tkachov [Tue, 1 Dec 2020 14:53:30 +0000 (14:53 +0000)]
aarch64: Add CPU-specific SVE vector costs struct

This patch extends the backend vector costs structures to allow for
separate Advanced SIMD and SVE costs.  The fields in the current
cpu_vector_cost that would vary between the ISAs are moved into a
simd_vec_cost struct and we have two typedefs of it: advsimd_vec_cost
and sve_vec_cost.  If, in the future, SVE needs some extra fields it
could inherit from simd_vec_cost.
The CPU vector cost tables in aarch64.c are updated for the struct
changes.  aarch64_builtin_vectorization_cost is updated to select either
the Advanced SIMD or SVE costs field depending on the mode and field
availability.
No change in codegen is intended with this patch.
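
The layout described above might look roughly like the following sketch. The field names and the selection helper are illustrative assumptions, not the contents of aarch64-protos.h or aarch64.c; only the struct and typedef names come from the ChangeLog.

```cpp
// Field names below are illustrative assumptions.
struct simd_vec_cost {
  int int_stmt_cost;
  int fp_stmt_cost;
};
typedef simd_vec_cost advsimd_vec_cost;
typedef simd_vec_cost sve_vec_cost;

struct cpu_vector_cost {
  int scalar_int_stmt_cost;          // an ISA-independent field stays here
  const advsimd_vec_cost* advsimd;   // assumed to be always present
  const sve_vec_cost* sve;           // may be null when a CPU has no SVE table
};

// Pick the SVE table for SVE modes when one exists, else Advanced SIMD.
const simd_vec_cost* select_costs(const cpu_vector_cost& c, bool sve_mode) {
  if (sve_mode && c.sve != nullptr) return c.sve;
  return c.advsimd;
}
```

Falling back to the Advanced SIMD table when no SVE table is present is one way to read "depending on the mode and field availability".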

gcc/
* config/aarch64/aarch64-protos.h (cpu_vector_cost): Move simd
fields to...
(simd_vec_cost): ... Here.  Define.
(advsimd_vec_cost): Define.
(sve_vec_cost): Define.
* config/aarch64/aarch64.c (generic_advsimd_vector_cost):
Define.
(generic_sve_vector_cost): Likewise.
(generic_vector_cost): Update.
(qdf24xx_advsimd_vector_cost): Define.
(qdf24xx_vector_cost): Update.
(thunderx_advsimd_vector_cost): Define.
(thunderx_vector_cost): Update.
(tsv110_advsimd_vector_cost): Define.
(tsv110_vector_cost): Likewise.
(cortexa57_advsimd_vector_cost): Define.
(cortexa57_vector_cost): Update.
(exynosm1_advsimd_vector_cost): Define.
(exynosm1_vector_cost): Update.
(xgene1_advsimd_vector_cost): Define.
(xgene1_vector_cost): Update.
(thunderx2t99_advsimd_vector_cost): Define.
(thunderx2t99_vector_cost): Update.
(thunderx3t110_advsimd_vector_cost): Define.
(thunderx3t110_vector_cost): Update.
(aarch64_builtin_vectorization_cost): Handle sve and advsimd
vector cost fields.

3 years agoc++: Decl module-specific semantic processing
Nathan Sidwell [Wed, 9 Dec 2020 12:52:51 +0000 (04:52 -0800)]
c++: Decl module-specific semantic processing

This adds the module-specific logic to the various declaration
processing routines in decl.c and semantics.c.  I also adjust the rtti
type creation, as those are all in the global module, so we need to
temporarily clear the module_kind, when they are being created.
Finally, I added init and fini module processing with the initializer
giving a fatal error if you try and turn it on (so don't do that yet).

gcc/cp/
* decl.c (duplicate_decls): Add module-specific redeclaration
logic.
(cxx_init_decl_processing): Export the global namespace, maybe
initialize modules.
(start_decl): Reject local-extern in a module, adjust linkage of
template var.
(xref_tag_1): Add module-specific redeclaration logic.
(start_enum): Likewise.
(finish_enum_value_list): Export unscoped members of an exported
enum.
(grokmethod): Implement p1779 linkage of in-class defined
functions.
* decl2.c (no_linkage_error): Imports are ok.
(c_parse_final_cleanups): Call fini_modules.
* lex.c (cxx_dup_lang_specific): Clear some module flags in the
copy.
* module.cc (module_kind): Define.
(module_may_redeclare, set_defining_module): Stubs.
(init_modules): Error on modules.
(fini_modules): Stub.
* rtti.c (push_abi_namespace): Save and reset module_kind.
(pop_abi_namespace): Restore module kind.
(build_dynamic_cast_1, tinfo_base_init): Adjust.
* semantics.c (begin_class_definition): Add module-specific logic.
(expand_or_defer_fn_1): Keep bodies of more fns when modules_p.

3 years agoIBM Z: Build autovec-*-signaling-eq.c tests with exceptions
Ilya Leoshkevich [Thu, 3 Dec 2020 01:02:20 +0000 (02:02 +0100)]
IBM Z: Build autovec-*-signaling-eq.c tests with exceptions

According to
https://gcc.gnu.org/pipermail/gcc/2020-November/234344.html, GCC is
allowed to perform optimizations that remove floating point traps,
since they do not affect the modeled control flow.  This interferes with
two signaling comparison tests, where (a <= b && a >= b) is turned into
(a <= b && a == b) by test_for_singularity, into ((a <= b) & (a == b))
by the vectorizer and then into (a == b) by eliminate_redundant_comparison.

Fix by making traps affect the control flow by turning them into
exceptions.
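
To make the value equivalence concrete, here is a small sketch with made-up function names. Both functions return the same result for every input, including NaNs; they differ only in that <= and >= are signaling comparisons that raise the invalid exception for a NaN operand, while == is quiet.

```cpp
#include <cmath>

// Hypothetical helpers, not the s390 testcases.
// Same value as a == b, but built from signaling comparisons.
bool signaling_eq(double a, double b) { return a <= b && a >= b; }

// Quiet comparison: no invalid exception for quiet NaN operands.
bool quiet_eq(double a, double b) { return a == b; }
```

Since the results agree, an optimizer that does not model traps may replace the first form with the second.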

gcc/testsuite/ChangeLog:

2020-12-03  Ilya Leoshkevich  <iii@linux.ibm.com>

* gcc.target/s390/zvector/autovec-double-signaling-eq.c: Build
with exceptions.
* gcc.target/s390/zvector/autovec-float-signaling-eq.c:
Likewise.

3 years agoOpenMP: C/C++ parse 'omp allocate'
Tobias Burnus [Wed, 9 Dec 2020 11:20:01 +0000 (12:20 +0100)]
OpenMP: C/C++ parse 'omp allocate'

gcc/c-family/ChangeLog:

* c-pragma.c (omp_pragmas): Add 'allocate'.
* c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_ALLOCATE.

gcc/c/ChangeLog:

* c-parser.c (c_parser_omp_allocate): New.
(c_parser_omp_construct): Call it.

gcc/cp/ChangeLog:

* parser.c (cp_parser_omp_allocate): New.
(cp_parser_omp_construct, cp_parser_pragma): Call it.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/allocate-5.c: New test.

3 years agoImport HSA header files from AMD
Andrew Stubbs [Mon, 23 Nov 2020 16:27:59 +0000 (16:27 +0000)]
Import HSA header files from AMD

These are the same header files that exist in the Radeon Open Compute Runtime
project (as of October 2020), but they have been specially relicensed by AMD
for use in GCC.

The header files retain AMD copyright.

include/ChangeLog:

* hsa.h: Replace whole file.
* hsa_ext_amd.h: New file.
* hsa_ext_image.h: New file.

libgomp/ChangeLog:

* plugin/plugin-gcn.c: Include hsa_ext_amd.h.
(HSA_AMD_AGENT_INFO_COMPUTE_UNIT_COUNT): Delete redundant definition.

3 years agoc/98200 - improve error recovery for GIMPLE FE
Richard Biener [Wed, 9 Dec 2020 08:56:59 +0000 (09:56 +0100)]
c/98200 - improve error recovery for GIMPLE FE

This avoids ICEing by making sure to propagate error early.

2020-12-09  Richard Biener  <rguenther@suse.de>

PR c/98200
gcc/c/
* gimple-parser.c (c_parser_gimple_postfix_expression): Return
early on error.

gcc/testsuite/
* gcc.dg/gimplefe-error-8.c: New testcase.

3 years agogfortran.dg/gomp/reduction4.f90: Fix testcase
Tobias Burnus [Wed, 9 Dec 2020 09:42:49 +0000 (10:42 +0100)]
gfortran.dg/gomp/reduction4.f90: Fix testcase

Fix to 'omp scan' commit 005cff4e2ecbd5c4e2ef978fe4842fa3c8c79f47

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/reduction4.f90: Update scan-trees, add
lost testcase; move test with FE error to ...
* gfortran.dg/gomp/reduction5.f90: ... here.

3 years agofold-const: Fix native_encode_initializer bitfield handling [PR98199]
Jakub Jelinek [Wed, 9 Dec 2020 08:36:11 +0000 (09:36 +0100)]
fold-const: Fix native_encode_initializer bitfield handling [PR98199]

With the bit_cast changes, I have added support for bitfields which don't
have scalar representatives.  For bit_cast it works fine, as when mask
is non-NULL, off is asserted to be 0.  But when native_encode_initializer
is called e.g. from sccvn with off > 0 (i.e. we are interested in encoding
just a few bytes out of it somewhere from the middle or at the end), the
following computations are incorrect.
pos is a byte position from the start of the constructor, repr_size is the
size in bytes of the bit-field representative and len is the length
of the buffer.  If the buffer is offset by a positive off, those numbers
are not comparable though; we need to add off to len to make both
count bytes from the start of the constructor.  o is a utility temporary
set to off != -1 ? off : 0 (because off -1 also means start at offset 0
and just force special behavior).
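
The arithmetic can be sketched as follows. This is an illustration of the comparison described above, not the code in fold-const.c; the helper name is hypothetical and the parameter names follow the message.

```cpp
// Hypothetical helper.  pos and repr_size count bytes from the start of
// the constructor; the buffer covers bytes [o, o + len) of it.
bool repr_ends_within_buffer(int pos, int repr_size, int off, int len) {
  int o = off != -1 ? off : 0;   // off == -1 also means "start at offset 0"
  return pos + repr_size <= o + len;
}
```

For example, with pos = 4, repr_size = 4, off = 4 and len = 4 the representative ends exactly at the end of the buffer; comparing against len alone would wrongly report that it does not fit.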

2020-12-09  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/98199
* fold-const.c (native_encode_initializer): Fix handling bit-fields
when off > 0.

* gcc.c-torture/compile/pr98199.c: New test.

3 years agofold-const: Fix up native_encode_initializer missing field handling [PR98193]
Jakub Jelinek [Wed, 9 Dec 2020 08:34:51 +0000 (09:34 +0100)]
fold-const: Fix up native_encode_initializer missing field handling [PR98193]

When native_encode_initializer is called with non-NULL mask (i.e. ATM
bit_cast only), it checks if the current index in the CONSTRUCTOR (if any)
is the next initializable FIELD_DECL, and if not, decrements cnt and
performs the iteration with that FIELD_DECL as field and val of zero
(so that it computes mask properly).  As the testcase shows, I forgot to
set pos to the byte position of the field though (as is done
for e.g. index-referenced FIELD_DECLs in the constructor).

2020-12-09  Jakub Jelinek  <jakub@redhat.com>

PR c++/98193
* fold-const.c (native_encode_initializer): Set pos to field's
byte position if iterating over a field with missing initializer.

* g++.dg/cpp2a/bit-cast7.C: New test.

3 years agoc++: Avoid [[nodiscard]] warning in requires-expr [PR98019]
Jason Merrill [Wed, 9 Dec 2020 02:47:11 +0000 (21:47 -0500)]
c++: Avoid [[nodiscard]] warning in requires-expr [PR98019]

If we aren't really evaluating the expression, it doesn't matter that the
return value is discarded.

gcc/cp/ChangeLog:

PR c++/98019
* cvt.c (maybe_warn_nodiscard): Check c_inhibit_evaluation_warnings.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-nodiscard1.C: Remove xfail.

3 years agoc++: Don't require accessible dtors for some forms of new [PR59238]
Jason Merrill [Wed, 9 Dec 2020 03:05:45 +0000 (22:05 -0500)]
c++: Don't require accessible dtors for some forms of new [PR59238]

Jakub noticed that in build_new_1 we needed to add tf_no_cleanup to avoid
building a cleanup for a TARGET_EXPR that we already know is going to be
used to initialize something, so the cleanup will never be run.  The best
place to add it is close to where we build the INIT_EXPR: adding it in
cp_build_modify_expr fixes single-object new, and adding it in
expand_default_init fixes array new.
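
A minimal sketch of the kind of code this affects, assuming the single-object case: a class whose destructor is private can still be created with a list-initialized new-expression, since the new-expression itself never runs the destructor. The class and function names are hypothetical and this is not the new4.C testcase.

```cpp
// Hypothetical class with an inaccessible destructor.
class Tracked {
public:
  explicit Tracked(int v) : value(v) {}
  int value;
  // Members may use the private destructor, so cleanup goes through here.
  static void destroy(Tracked* p) { delete p; }
private:
  ~Tracked() {}
};

// The list-initialized new-expression does not need an accessible destructor.
Tracked* make_tracked(int v) { return new Tracked{v}; }
```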

Co-authored-by: Jakub Jelinek <jakub@redhat.com>
gcc/cp/ChangeLog:

PR c++/59238
* init.c (expand_default_init): Pass tf_no_cleanup when building
a TARGET_EXPR to go on the RHS of an INIT_EXPR.
* typeck.c (cp_build_modify_expr): Likewise.

gcc/testsuite/ChangeLog:

PR c++/59238
* g++.dg/cpp0x/new4.C: New test.

3 years agoDaily bump.
GCC Administrator [Wed, 9 Dec 2020 00:16:50 +0000 (00:16 +0000)]
Daily bump.

3 years agotestsuite: Fix up testcase for ia32 [PR98191]
Jakub Jelinek [Tue, 8 Dec 2020 23:35:04 +0000 (00:35 +0100)]
testsuite: Fix up testcase for ia32 [PR98191]

2020-12-09  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/98191
* gcc.dg/torture/pr98191.c: Add dg-additional-options with
-w -Wno-psabi.

3 years agoc++: ICE with -fsanitize=vptr and constexpr dynamic_cast [PR98103]
Marek Polacek [Wed, 2 Dec 2020 19:33:13 +0000 (14:33 -0500)]
c++: ICE with -fsanitize=vptr and constexpr dynamic_cast [PR98103]

-fsanitize=vptr initializes all vtable pointers to null so that it can
catch invalid calls; see cp_ubsan_maybe_initialize_vtbl_ptrs.  That
means that evaluating a vtable reference can produce a null pointer
in this mode, so cxx_eval_dynamic_cast_fn should check that and give
an error.

gcc/cp/ChangeLog:

PR c++/98103
* constexpr.c (cxx_eval_dynamic_cast_fn): If the evaluating of vtable
yields a null pointer, give an error and return.  Use objtype.

gcc/testsuite/ChangeLog:

PR c++/98103
* g++.dg/ubsan/vptr-18.C: New test.

3 years agolibgo: update to 1.15.6 release
Ian Lance Taylor [Tue, 8 Dec 2020 18:57:05 +0000 (10:57 -0800)]
libgo: update to 1.15.6 release

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/276153

3 years agoc++: Originating and instantiating module
Nathan Sidwell [Tue, 8 Dec 2020 20:34:25 +0000 (12:34 -0800)]
c++: Originating and instantiating module

With modules, streamed entities have two new properties -- the module
that declares them and the module that instantiates them.  Here
'instantiate' applies to more than just templates -- for instance an
implicit member fn.  These may well be the same module.  This adds the
calls to places that need it.

gcc/cp/
* class.c (layout_class_type): Call set_instantiating_module.
(build_self_reference): Likewise.
* decl.c (grokfndecl): Call set_originating_module.
(grokvardecl): Likewise.
(grokdeclarator): Likewise.
* pt.c (maybe_new_partial_specialization): Call
set_instantiating_module, propagate DECL_MODULE_EXPORT_P.
(lookup_template_class_1): Likewise.
(tsubst_function_decl): Likewise.
(tsubst_decl, instantiate_template_1): Likewise.
(build_template_decl): Propagate module flags.
(tsubst_template_decl): Likewise.
(finish_concept_definition): Call set_originating_module.
* module.cc (set_instantiating_module, set_originating_module): Stubs.

3 years agoc++: Fix defaulted <=> fallback to < and == [PR96299]
Jason Merrill [Sat, 5 Dec 2020 02:48:43 +0000 (21:48 -0500)]
c++: Fix defaulted <=> fallback to < and == [PR96299]

I thought I had implemented P1186R3, but apparently I didn't read it closely
enough to understand the point of the paper, namely that for a defaulted
operator<=>, if a member type doesn't have a viable operator<=>, we will use
its operator< and operator== if the defaulted operator has a specific
comparison category as its return type; the compiler can't guess if it
should be strong_ordering or something else, but the user can make that
choice explicit.

The libstdc++ test change was necessary because of the change in
genericize_spaceship from op0 > op1 to op1 < op0; this should be equivalent,
but isn't because of PR88173.

gcc/cp/ChangeLog:

PR c++/96299
* cp-tree.h (build_new_op): Add overload that omits some parms.
(genericize_spaceship): Add location_t parm.
* constexpr.c (cxx_eval_binary_expression): Pass it.
* cp-gimplify.c (genericize_spaceship): Pass it.
* method.c (genericize_spaceship): Handle class-type arguments.
(build_comparison_op): Fall back to op</== when appropriate.

gcc/testsuite/ChangeLog:

PR c++/96299
* g++.dg/cpp2a/spaceship-synth-neg2.C: Move error.
* g++.dg/cpp2a/spaceship-p1186.C: New test.

libstdc++-v3/ChangeLog:

PR c++/96299
* testsuite/18_support/comparisons/algorithms/partial_order.cc:
One more line needs to use VERIFY instead of static_assert.

3 years agoc++: Distinguish ambiguity from no valid candidate
Jason Merrill [Mon, 7 Dec 2020 22:21:47 +0000 (17:21 -0500)]
c++: Distinguish ambiguity from no valid candidate

Several recent C++ features are specified to try overload resolution, and if
no viable candidate is found, do something else.  But our error return
doesn't distinguish between that situation and finding multiple viable
candidates that end up being ambiguous.  We're already trying to separately
return the single function we found even if it ends up being ill-formed for
some reason; for ambiguity let's pass back error_mark_node, to be
distinguished from NULL_TREE meaning no viable candidate.
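
The convention can be pictured with a miniature resolver. Everything here is a hypothetical analogy: nullptr plays the role of NULL_TREE (no viable candidate) and a distinguished sentinel object plays the role of error_mark_node (ambiguity).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical candidate; a lower rank means a better match.
struct Candidate { int rank; };

// Sentinel standing in for error_mark_node.
const Candidate ambiguous_marker = {-1};

const Candidate* resolve(const std::vector<Candidate>& viable) {
  if (viable.empty()) return nullptr;              // no viable candidate
  const Candidate* best = &viable[0];
  bool tie = false;
  for (std::size_t i = 1; i < viable.size(); ++i) {
    if (viable[i].rank < best->rank) { best = &viable[i]; tie = false; }
    else if (viable[i].rank == best->rank) tie = true;
  }
  return tie ? &ambiguous_marker : best;           // ambiguity vs. a winner
}
```

A caller that wants to "try overload resolution and otherwise do something else" can then fall back only on nullptr and still diagnose the ambiguous case.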

gcc/cp/ChangeLog:

* call.c (build_new_op_1): Set *overload for ambiguity.
(build_new_method_call_1): Likewise.

3 years agoAvoid atomic for guard acquire when that is expensive
Bernd Edlinger [Tue, 1 Dec 2020 17:54:48 +0000 (18:54 +0100)]
Avoid atomic for guard acquire when that is expensive

When the atomic access involves a call to __sync_synchronize
it is better to call __cxa_guard_acquire unconditionally,
since it handles the atomics too, or is a non-threaded
implementation when there is no gthread support for this target.

This also fixes a bug for the ARM EABI big-endian target,
where previously the wrong bit was checked.
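
For readers unfamiliar with the guard, the source-level construct involved is a function-local static with a dynamic initializer, as in this sketch with made-up names. The compiler guards the initialization so it runs once; per the message above, the guard check is either an inline atomic load followed by a __cxa_guard_acquire call, or an unconditional __cxa_guard_acquire call when the atomic is expensive.

```cpp
int init_count = 0;   // counts how often the initializer actually runs

// Hypothetical initializer with a visible side effect.
int expensive_init() { ++init_count; return 42; }

// The initialization of `value` is protected by a compiler-generated guard.
int get_value() {
  static int value = expensive_init();
  return value;
}
```

Which of the two code sequences is emitted depends on the target and is not visible at this level.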

2020-12-08  Bernd Edlinger  <bernd.edlinger@hotmail.de>

* decl2.c (is_atomic_expensive_p): New helper function.
(build_atomic_load_byte): Rename to...
(build_atomic_load_type): ... and add new parameter type.
(get_guard_cond): Skip the atomic here if that is expensive.
Use the correct type for the atomic load on certain targets.

3 years agoif-to-switch: fix matching of negative conditions
Martin Liska [Tue, 8 Dec 2020 12:18:37 +0000 (13:18 +0100)]
if-to-switch: fix matching of negative conditions

gcc/ChangeLog:

PR tree-optimization/98182
* gimple-if-to-switch.cc (pass_if_to_switch::execute): Request
chain linkage through false edges only.

gcc/testsuite/ChangeLog:

PR tree-optimization/98182
* gcc.dg/tree-ssa/if-to-switch-10.c: New test.
* gcc.dg/tree-ssa/pr98182.c: New test.

3 years agoc++: template and clone fns for modules
Nathan Sidwell [Tue, 8 Dec 2020 18:38:10 +0000 (10:38 -0800)]
c++: template and clone fns for modules

We need to expose build_cdtor_clones, it fortunately has the desired
API -- gosh, how did that happen? :) The template machinery will need
to cache path-of-instantiation information, so add two more fields to
the tinst_level struct.  I also had to adjust the
match_mergeable_specialization API since adding it, so including that
change too.

gcc/cp/
* cp-tree.h (struct tinst_level): Add path & visible fields.
(build_cdtor_clones): Declare.
(match_mergeable_specialization): Use a spec_entry, add insert parm.
* class.c (build_cdtor_clones): Externalize.
* pt.c (push_tinst_level_loc): Clear new fields.
(match_mergeable_specialization): Adjust API.

3 years agoRaw tree accessors
Nathan Sidwell [Tue, 8 Dec 2020 18:23:44 +0000 (10:23 -0800)]
Raw tree accessors

Here are the couple of raw accessors I make use of in the module streaming.

gcc/
* tree.h (DECL_ALIGN_RAW): New.
(DECL_ALIGN): Use it.
(DECL_WARN_IF_NOT_ALIGN_RAW): New.
(DECL_WARN_IF_NOT_ALIGN): Use it.
(SET_DECL_WARN_IF_NOT_ALIGN): Likewise.
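
The distinction between a raw accessor and the decoded one can be sketched as below. This assumes the alignment is stored in a compact encoded form (zero for "unset", otherwise log2 of the alignment plus one); the struct and function names are hypothetical and do not reproduce tree.h.

```cpp
// Hypothetical stand-in for the relevant part of a decl node.
struct decl_sketch { unsigned align_raw; };

// Raw accessor: returns the stored encoding unchanged, which is what a
// streamer wants to write out and read back.
unsigned get_align_raw(const decl_sketch& d) { return d.align_raw; }

// Decoded accessor built on top of the raw one.
unsigned get_align(const decl_sketch& d) {
  unsigned raw = get_align_raw(d);
  return raw ? 1u << (raw - 1) : 0u;
}
```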

3 years agocompiler: use correct location for iota errors
Ian Lance Taylor [Sun, 6 Dec 2020 05:16:13 +0000 (21:16 -0800)]
compiler: use correct location for iota errors

Also check for valid array length when reducing len/cap to a constant.

For golang/go#8183

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/275654

3 years agolibgcc: block signals when releasing split-stack memory
Ian Lance Taylor [Tue, 8 Dec 2020 18:17:16 +0000 (10:17 -0800)]
libgcc: block signals when releasing split-stack memory

* generic-morestack-thread.c (free_segments): Block signals during
thread exit.

3 years agoarm: Replace calls to __builtin_vmvn* by ~ in vmvn intrinsics in arm_neon.h [PR66791]
Prathamesh Kulkarni [Tue, 8 Dec 2020 17:52:11 +0000 (23:22 +0530)]
arm: Replace calls to __builtin_vmvn* by ~ in vmvn intrinsics in arm_neon.h [PR66791]

gcc/
2020-12-08  Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>

PR target/66791
* config/arm/arm_neon.h: Replace calls to __builtin_vmvn* by ~
in vmvn intrinsics.
* config/arm/arm_neon_builtins.def: Remove entry for vmvn.

3 years agoc++: Named module global initializers
Nathan Sidwell [Tue, 8 Dec 2020 17:05:32 +0000 (09:05 -0800)]
c++: Named module global initializers

C++20 modules add some new rules about when the global initializers
of imported modules run.  They must run no later than before any
initializers in the importer that appear after the import.  To provide
this, each named module emits an idempotent global initializer that
calls the global initializer functions of its imports (these of course
may call further import initializers).  This is the machinery in our
global-init emission to accomplish that, other than the actual
emission of calls, which is in the module file.  The naming of this
global init is a new piece of the ABI.

FWIW, the module's emitter does some optimization to avoid calling a
direct import's initializer when it can determine that import is also
indirect.
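
The scheme can be modelled in plain C++ as a sketch. The module names, function names and the log are all hypothetical; real initializers use the new mangled names and are emitted by the compiler.

```cpp
#include <string>
#include <vector>

std::vector<std::string> init_log;   // records the order initializers ran in

// Hypothetical module "base" with no imports.
void init_base() {
  static bool done = false;
  if (done) return;          // idempotent: later calls do nothing
  done = true;
  init_log.push_back("base");
}

// Hypothetical module "mid", which imports "base".
void init_mid() {
  static bool done = false;
  if (done) return;
  done = true;
  init_base();               // imports are initialized first
  init_log.push_back("mid");
}

// Hypothetical importer of both "base" and "mid".
void init_app() {
  static bool done = false;
  if (done) return;
  done = true;
  init_base();
  init_mid();
  init_log.push_back("app");
}
```

Here "base" is both a direct and an indirect import of "app", so the init_base call in init_app is the kind of call the optimization mentioned above could omit.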

gcc/cp/
* decl2.c (start_objects): Refactor and adjust for named module
initializers.
(finish_objects): Likewise.
(generate_ctor_or_dtor_function): Likewise.
* module.cc (module_initializer_kind)
(module_add_import_initializers): Stubs.

3 years agoFortran: Add 'omp scan' support of OpenMP 5.0
Tobias Burnus [Tue, 8 Dec 2020 15:49:46 +0000 (16:49 +0100)]
Fortran: Add 'omp scan' support of OpenMP 5.0

gcc/fortran/ChangeLog:

* dump-parse-tree.c (show_omp_clauses, show_omp_node,
show_code_node): Handle OMP SCAN.
* gfortran.h (enum gfc_statement): Add ST_OMP_SCAN.
(enum): Add OMP_LIST_SCAN_IN and OMP_LIST_SCAN_EX.
(enum gfc_exec_op): Add EXEC_OMP_SCAN.
* match.h (gfc_match_omp_scan): New prototype.
* openmp.c (gfc_match_omp_scan): New.
(gfc_match_omp_taskgroup): Cleanup.
(resolve_omp_clauses, gfc_resolve_omp_do_blocks,
omp_code_to_statement, gfc_resolve_omp_directive): Handle 'omp scan'.
* parse.c (decode_omp_directive, next_statement,
gfc_ascii_statement): Likewise.
* resolve.c (gfc_resolve_code): Handle EXEC_OMP_SCAN.
* st.c (gfc_free_statement): Likewise.
* trans-openmp.c (gfc_trans_omp_clauses, gfc_trans_omp_do,
gfc_split_omp_clauses): Handle 'omp scan'.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/scan-1.f90: New test.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/reduction4.f90: Update; move some FE tests to ...
* gfortran.dg/gomp/reduction6.f90: ... this new test and ...
* gfortran.dg/gomp/reduction7.f90: ... this new test.
* gfortran.dg/gomp/reduction5.f90: Add dg-error.
* gfortran.dg/gomp/scan-1.f90: New test.
* gfortran.dg/gomp/scan-2.f90: New test.
* gfortran.dg/gomp/scan-3.f90: New test.
* gfortran.dg/gomp/scan-4.f90: New test.
* gfortran.dg/gomp/scan-5.f90: New test.
* gfortran.dg/gomp/scan-6.f90: New test.
* gfortran.dg/gomp/scan-7.f90: New test.

3 years agoi386: Fix up X87_ENABLE_{FLOAT,ARITH} in conditions [PR94440]
Jakub Jelinek [Tue, 8 Dec 2020 14:44:10 +0000 (15:44 +0100)]
i386: Fix up X87_ENABLE_{FLOAT,ARITH} in conditions [PR94440]

The documentation says
     For a named pattern, the condition may not depend on the data in
     the insn being matched, but only the target-machine-type flags.
The i386 backend violates that by using flag_excess_precision and
flag_unsafe_math_optimizations in the conditions too, which is bad
when optimize attribute or pragmas are used.  The problem is that the
middle-end caches the enabled conditions for the optabs for a particular
switchable target, but multiple functions can share the same
TARGET_OPTION_NODE, but have different TREE_OPTIMIZATION_NODE with different
flag_excess_precision or flag_unsafe_math_optimizations, so the enabled
conditions then match only one of those.

I think it would be best to have a single options node for both the
generic and target options, then such problems wouldn't exist, but that
would be very risky at this point and quite a large change.

So, instead the following patch just shadows flag_excess_precision and
flag_unsafe_math_optimizations values for uses in the instruction conditions
in TargetVariable and during set_cfun artificially creates new
TARGET_OPTION_NODE if flag_excess_precision and/or
flag_unsafe_math_optimizations change from what is recorded in their
TARGET_OPTION_NODE.  The target nodes are hashed, so worst case we can get 4
times as many target option nodes if one would for each unique target option
try all the flag_excess_precision and flag_unsafe_math_optimizations values.

2020-12-08  Jakub Jelinek  <jakub@redhat.com>

PR target/94440
* config/i386/i386.opt (ix86_excess_precision,
ix86_unsafe_math_optimizations): New TargetVariables.
* config/i386/i386.h (X87_ENABLE_ARITH, X87_ENABLE_FLOAT): Use
ix86_unsafe_math_optimizations instead of
flag_unsafe_math_optimizations and ix86_excess_precision instead of
flag_excess_precision.
* config/i386/i386.c (ix86_excess_precision): Rename to ...
(ix86_get_excess_precision): ... this.
(TARGET_C_EXCESS_PRECISION): Define to ix86_get_excess_precision.
* config/i386/i386-options.c (ix86_valid_target_attribute_tree,
ix86_option_override_internal): Update ix86_unsafe_math_optimizations
from flag_unsafe_math_optimizations and ix86_excess_precision
from flag_excess_precision when constructing target option nodes.
(ix86_set_current_function): If flag_unsafe_math_optimizations
or flag_excess_precision is different from the one recorded
in TARGET_OPTION_NODE, create a new target option node for the
current function and switch to that.

3 years agoc++: Fix MODULE_VERSION breakage
Nathan Sidwell [Tue, 8 Dec 2020 14:31:31 +0000 (06:31 -0800)]
c++: Fix MODULE_VERSION breakage

Adding includes to module.cc triggered the kind of build failure I
wanted to check for.  In this case it was MODULE_VERSION not being
defined, and module.cc's internal #error triggering.  I've relaxed the
check in Make-lang, so we provide MODULE_VERSION when DEVPHASE is not
empty (rather than when it is 'experimental').  AFAICT devphase is
empty for release builds, and the #error will force us to decide
whether modules is sufficiently baked at that point.

gcc/cp
* Make-lang.in (MODULE_VERSION): Override when DEVPHASE not empty.
* module.cc: Comment.

3 years agoc++: Mangling for modules
Nathan Sidwell [Tue, 8 Dec 2020 14:07:19 +0000 (06:07 -0800)]
c++: Mangling for modules

These are the mangling changes for modules.  They were developed in
collaboration with clang, which also implements the same ABI (or
plans to; I do not think the global init is in clang).  The global
init mangling is captured in
https://github.com/itanium-cxx-abi/cxx-abi/issues/99

gcc/cp/
* cp-tree.h (mangle_module_substitution, mangle_identifier)
(mangle_module_global_init): Declare.
* mangle.c (struct globals): Add mod field.
(mangle_module_substitution, mangle_identifier)
(mangle_module_global_init): Define.
(write_module, maybe_write_module): New.
(write_name): Call it.
(start_mangling): Clear mod field.
(finish_mangling_internal): Adjust.
* module.cc (mangle_module, mangle_module_fini)
(get_originating_module): Stubs.

3 years agolibstdc++: Adjust whitespace in documentation
Jonathan Wakely [Tue, 8 Dec 2020 13:35:07 +0000 (13:35 +0000)]
libstdc++: Adjust whitespace in documentation

libstdc++-v3/ChangeLog:

* doc/xml/manual/appendix_contributing.xml: Use consistent
indentation.
* doc/html/manual/source_code_style.html: Regenerate.

3 years agoc++: module directive FSM
Nathan Sidwell [Tue, 8 Dec 2020 13:03:43 +0000 (05:03 -0800)]
c++: module directive FSM

As mentioned in the preprocessor patches, there's a new kind of
preprocessor directive for modules, and it interacts with the
compiler-proper, as that has to stream in header-unit macro
information (when the directive is an import that names a
header-unit).  This is that machinery.  It's an FSM that inspects the
token stream and does the minimal parsing to detect such imports.
This ends up being called from the C++ parser's tokenizer and from the
-E tokenizer (via a lang hook).  The actual module streaming is a stub
here.

gcc/cp/
* cp-tree.h (module_token_pre, module_token_cdtor)
(module_token_lang): Declare.
* lex.c: Include langhooks.
(struct module_token_filter): New.
* cp-tree.h (module_token_pre, module_token_cdtor)
(module_token_lang): Define.
* module.cc (get_module, preprocess_module, preprocessed_module):
Nop stubs.

3 years agoc++: Add module includes
Nathan Sidwell [Tue, 8 Dec 2020 12:59:09 +0000 (04:59 -0800)]
c++: Add module includes

gcc/cp/
* Make-lang.in (MODULE_VERSION): Define.
* module.cc: Add includes.

3 years agotestsuite: i386: Require avx512vpopcntdq in two tests
Rainer Orth [Tue, 8 Dec 2020 12:40:45 +0000 (13:40 +0100)]
testsuite: i386: Require avx512vpopcntdq in two tests

Two recent AVX512 tests FAIL on Solaris/x86 with /bin/as:

FAIL: gcc.target/i386/avx512vpopcntdq-pr97770-2.c (test for excess errors)

Excess errors:
Assembler: avx512vpopcntdq-pr97770-2.c
        "/var/tmp//ccM4Gt1a.s", line 171 : Illegal mnemonic
        Near line: "    vpopcntd        (%eax), %zmm0"
        "/var/tmp//ccM4Gt1a.s", line 171 : Syntax error
        Near line: "    vpopcntd        (%eax), %zmm0"

FAIL: gcc.target/i386/avx512vpopcntdqvl-pr97770-1.c (test for excess errors)

similarly.

Fixed as follows.

Tested on i386-pc-solaris2.11 with as and gas and x86_64-pc-linux-gnu.

2020-12-07  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:
* gcc.target/i386/avx512vpopcntdq-pr97770-2.c: Require
avx512vpopcntdq support.
* gcc.target/i386/avx512vpopcntdqvl-pr97770-1.c: Require
avx512vpopcntdq, avx512vl support.

3 years agotestsuite: i386: Require ifunc support in gcc.target/i386/pr98100.c
Rainer Orth [Tue, 8 Dec 2020 12:29:26 +0000 (13:29 +0100)]
testsuite: i386: Require ifunc support in gcc.target/i386/pr98100.c

The new gcc.target/i386/pr98100.c test FAILs on Solaris/x86:

FAIL: gcc.target/i386/pr98100.c (test for excess errors)

Excess errors:
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/pr98100.c:6:1: error: the call requires 'ifunc', which is not supported by this target

Fixed as follows.

Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu.

2020-12-07  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:
* gcc.target/i386/pr98100.c: Require ifunc support.

3 years agotree-optimization/98192 - fix double free in SLP
Richard Biener [Tue, 8 Dec 2020 11:54:48 +0000 (12:54 +0100)]
tree-optimization/98192 - fix double free in SLP

This makes sure to clear the vector pointer on release.
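
As a rough illustration of why the handle has to be passed by reference, here is a minimal C sketch.  The names (struct int_vec, int_vec_release, consume) are hypothetical stand-ins and not GCC's vec API; the point is only that a release which clears the pointer protects the caller only if the caller's own handle is the one being cleared.

```c
#include <stdlib.h>
#include <stddef.h>

/* Hypothetical heap-backed vector handle (not GCC's vec<>). */
struct int_vec {
    int *data;
    size_t len;
};

/* Release the storage and clear the pointer.  Returns 1 if memory was
   actually freed, 0 if the handle was already empty. */
static int int_vec_release(struct int_vec *v)
{
    int freed = 0;
    if (v->data != NULL) {
        free(v->data);
        freed = 1;
    }
    v->data = NULL;
    v->len = 0;
    return freed;
}

/* Takes the handle by pointer, so the caller observes the cleared
   pointer.  Had this taken 'struct int_vec v' by value, the caller's
   copy would still hold the stale pointer and a later release would
   free it a second time. */
static int consume(struct int_vec *v)
{
    return int_vec_release(v);
}

/* Counts how many times memory is really freed when both the callee
   and the caller release the same handle. */
static int demo_release_by_reference(void)
{
    struct int_vec v = { malloc(4 * sizeof(int)), 4 };
    int frees = consume(&v);
    frees += int_vec_release(&v);   /* no-op: data is already NULL */
    return frees;
}
```

With the by-pointer signature the second release sees a null pointer and does nothing, so the storage is freed at most once.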

2020-12-08  Richard Biener  <rguenther@suse.de>

PR tree-optimization/98192
* tree-vect-slp.c (vect_build_slp_instance): Get scalar_stmts
by reference.

3 years agotestsuite/95900 - fix gcc.dg/vect/bb-slp-pr95866.c target requirement
Richard Biener [Tue, 8 Dec 2020 10:44:35 +0000 (11:44 +0100)]
testsuite/95900 - fix gcc.dg/vect/bb-slp-pr95866.c target requirement

We require a vector-by-scalar shift; there is no appropriate target
selector, so use SSE2 for now.

2020-12-08  Richard Biener  <rguenther@suse.de>

PR testsuite/95900
* gcc.dg/vect/bb-slp-pr95866.c: Require sse2 for the
BIT_FIELD_REF match.

3 years agocontrib: filter more in filter-clang-warnings.py
Martin Liska [Tue, 8 Dec 2020 10:20:21 +0000 (11:20 +0100)]
contrib: filter more in filter-clang-warnings.py

contrib/ChangeLog:

* filter-clang-warnings.py: Filter more cases.

3 years agotestsuite: Avoid strict aliasing violations in some avx512 tests
Jakub Jelinek [Tue, 8 Dec 2020 10:19:49 +0000 (11:19 +0100)]
testsuite: Avoid strict aliasing violations in some avx512 tests

These tests violated strict aliasing; this is fixed by type punning
through a union instead.
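
For illustration only, a minimal C sketch of the union technique follows.  The helper names are hypothetical and this is not the tests' CALC macro; it assumes a 64-bit IEEE 754 double.  Reading the integer member after writing the double member is the form of punning GCC documents as supported, whereas dereferencing a cast pointer such as *(uint64_t *)&d is the pattern that breaks strict aliasing.

```c
#include <stdint.h>

/* Reinterpret the bits of a double as a 64-bit integer via a union. */
static uint64_t double_bits(double d)
{
    union { double d; uint64_t u; } pun;
    pun.d = d;
    return pun.u;
}

/* Reinterpret a 64-bit integer as a double via a union. */
static double bits_double(uint64_t u)
{
    union { double d; uint64_t u; } pun;
    pun.u = u;
    return pun.d;
}

/* Bitwise AND-NOT of two doubles, computed on their integer images:
   (~a) & b, the scalar analogue of one vandnpd lane. */
static double andn_double(double a, double b)
{
    return bits_double(~double_bits(a) & double_bits(b));
}
```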

2020-12-08  Jakub Jelinek  <jakub@redhat.com>

* gcc.target/i386/avx512dq-vandnpd-2.c (CALC): Use union
to avoid aliasing violations.
* gcc.target/i386/avx512dq-vandnps-2.c (CALC): Likewise.
* gcc.target/i386/avx512dq-vandpd-2.c (CALC): Likewise.
* gcc.target/i386/avx512dq-vandps-2.c (CALC): Likewise.
* gcc.target/i386/avx512dq-vorpd-2.c (CALC): Likewise.
* gcc.target/i386/avx512dq-vorps-2.c (CALC): Likewise.
* gcc.target/i386/avx512dq-vxorpd-2.c (CALC): Likewise.
* gcc.target/i386/avx512dq-vxorps-2.c (CALC): Likewise.

3 years agocontrib: modernize filter-clang-warnings.py
Martin Liska [Tue, 8 Dec 2020 10:07:25 +0000 (11:07 +0100)]
contrib: modernize filter-clang-warnings.py

contrib/ChangeLog:

* filter-clang-warnings.py: Modernize and filter 2 more
patterns.

3 years agoopenmp: -fopenmp-simd fixes [PR98187]
Jakub Jelinek [Tue, 8 Dec 2020 09:45:30 +0000 (10:45 +0100)]
openmp: -fopenmp-simd fixes [PR98187]

This patch fixes two bugs in the -fopenmp-simd support.  One is that
in C++ #pragma omp parallel master would actually create OMP_PARALLEL
in the IL, which is a big no-no for -fopenmp-simd; we should be creating
only the constructs -fopenmp-simd handles (mainly OMP_SIMD, OMP_LOOP which
is gimplified as simd in that case, declare simd/reduction and ordered simd).

The other bug was that #pragma omp master taskloop simd combined construct
contains simd and thus should be recognized as #pragma omp simd (with only
the simd applicable clauses), but as master wasn't included in
omp_pragmas_simd, we'd ignore it completely instead.
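
A small C sketch of the second case, for illustration (the function name is hypothetical and this is not the pr98187.c testcase).  The combined construct below contains simd, so under -fopenmp-simd it is expected to be handled like #pragma omp simd with the reduction clause; compiled without any OpenMP option the pragma is simply ignored, and the result is the same either way.

```c
/* Sum an array using a combined construct that contains simd. */
static long sum_array(const int *a, int n)
{
    long sum = 0;
    #pragma omp master taskloop simd reduction(+:sum)
    for (int i = 0; i < n; i++)
        sum += a[i];
    return sum;
}
```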

2020-12-08  Jakub Jelinek  <jakub@redhat.com>

PR c++/98187
* c-pragma.c (omp_pragmas): Remove "master".
(omp_pragmas_simd): Add "master".

* parser.c (cp_parser_omp_parallel): For parallel master with
-fopenmp-simd only, just call cp_parser_omp_master instead of
wrapping it in OMP_PARALLEL.

* c-c++-common/gomp/pr98187.c: New test.

3 years agotree-optimization/98191 - fix BIT_INSERT_EXPR sequence vectorization
Richard Biener [Tue, 8 Dec 2020 08:56:53 +0000 (09:56 +0100)]
tree-optimization/98191 - fix BIT_INSERT_EXPR sequence vectorization

This adds a missing check.

2020-12-08  Richard Biener  <rguenther@suse.de>

PR tree-optimization/98191
* tree-vect-slp.c (vect_slp_check_for_constructors): Do not
follow a non-SSA def chain.

* gcc.dg/torture/pr98191.c: New testcase.

3 years agotree-optimization/97559 - fix sinking in irreducible regions
Richard Biener [Tue, 8 Dec 2020 08:45:57 +0000 (09:45 +0100)]
tree-optimization/97559 - fix sinking in irreducible regions

This fixes sinking of loads when irreducible regions are involved
and the heuristic to find stores on the path along the sink
breaks down, since it uses dominator queries.

2020-12-08  Richard Biener  <rguenther@suse.de>

PR tree-optimization/97559
* tree-ssa-sink.c (statement_sink_location): Never ignore
PHIs on sink paths in irreducible regions.

* gcc.dg/torture/pr97559-1.c: New testcase.
* gcc.dg/torture/pr97559-2.c: Likewise.

3 years agogimple-isel: Fold x CMP y ? -1 : 0 to x CMP y [PR97872]
Prathamesh Kulkarni [Tue, 8 Dec 2020 09:00:04 +0000 (14:30 +0530)]
gimple-isel: Fold x CMP y ? -1 : 0 to x CMP y [PR97872]

gcc/
2020-12-08  Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>

PR target/97872
* gimple-isel.cc (gimple_expand_vec_cond_expr): Try to fold
x CMP y ? -1 : 0 to x CMP y.

gcc/testsuite/
2020-12-08  Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>

PR target/97872
* gcc.target/arm/pr97872.c: New test.
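
To see why the fold is valid, here is a scalar C model of one vector lane (a sketch with hypothetical helper names, not the code in gimple-isel.cc).  It assumes the usual convention that a vector comparison yields all-ones (-1) in a lane where the comparison holds and 0 otherwise, so selecting -1 or 0 on top of that mask reproduces the mask.

```c
#include <stdint.h>

/* One lane of a vector comparison: -1 (all bits set) if true, else 0. */
static int32_t lane_cmp_lt(int32_t x, int32_t y)
{
    return -(int32_t)(x < y);
}

/* One lane of the conditional form: x CMP y ? -1 : 0. */
static int32_t lane_cond_lt(int32_t x, int32_t y)
{
    return lane_cmp_lt(x, y) ? -1 : 0;
}

/* Returns 1 if both forms agree on every lane of a 4-lane input. */
static int forms_agree(const int32_t a[4], const int32_t b[4])
{
    for (int i = 0; i < 4; i++)
        if (lane_cmp_lt(a[i], b[i]) != lane_cond_lt(a[i], b[i]))
            return 0;
    return 1;
}
```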

3 years agotree-optimization/98180 - fix BIT_INSERT_EXPR sequence vectorization
Richard Biener [Tue, 8 Dec 2020 08:42:35 +0000 (09:42 +0100)]
tree-optimization/98180 - fix BIT_INSERT_EXPR sequence vectorization

This adds a missing check for the first inserted value.

2020-12-08  Richard Biener  <rguenther@suse.de>

PR tree-optimization/98180
* tree-vect-slp.c (vect_slp_check_for_constructors): Check the
first inserted value has a def.

3 years agoFix PR target/96470
Eric Botcazou [Tue, 8 Dec 2020 08:19:36 +0000 (09:19 +0100)]
Fix PR target/96470

This forces the scalarization of the testcase on PowerPC.

gcc/testsuite/ChangeLog:
PR target/96470
* gnat.dg/opt39.adb: Add dg-additional-options for PowerPC.

3 years agoPR tree-optimization/96344
Eric Botcazou [Tue, 8 Dec 2020 07:57:46 +0000 (08:57 +0100)]
PR tree-optimization/96344

The very recent addition of the if_to_switch pass has partially disabled
the optimization added back in June to optimize_range_tests_to_bit_test,
as witnessed by the 3 new failures in the gnat.dg testsuite.  It turns out
that both tree-ssa-reassoc.c and tree-switch-conversion.c can turn things
into bit tests so the optimization is added to bit_test_cluster::emit too.

The patch also contains a secondary optimization, whereby the full bit-test
sequence is sent to the folder before being gimplified in case there is only
one test, so that the optimal sequence (bt + jc on x86) can be emitted like
with optimize_range_tests_to_bit_test.
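
As a source-level illustration of the transformation being discussed, the sketch below writes the same membership test twice: once as a chain of comparisons and once as an entry test plus a single bit test.  The function names and the chosen constants are hypothetical, and this shows the idea only, not the GIMPLE that bit_test_cluster::emit produces.

```c
#include <stdint.h>

/* Range-test form: a chain of equality tests on small constants. */
static int is_member_chain(unsigned x)
{
    return x == 1 || x == 3 || x == 4 || x == 9;
}

/* Bit-test form: the entry test checks that x indexes into the mask,
   then a single bit test decides membership. */
static int is_member_bittest(unsigned x)
{
    const uint64_t mask = (UINT64_C(1) << 1) | (UINT64_C(1) << 3)
                        | (UINT64_C(1) << 4) | (UINT64_C(1) << 9);
    return x < 64 && ((mask >> x) & 1) != 0;
}
```

The entry test is what keeps the shift well defined; when the range of x is already known to fit in the mask, it is the part that can be dropped.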

gcc/ChangeLog:
PR tree-optimization/96344
* tree-switch-conversion.c (bit_test_cluster::emit): Compute the
range only if an entry test is necessary.  Merge the entry test in
the bit test when possible.  Use PREC local variable consistently.
When there is only one test, do a single gimplification at the end.