git.libre-soc.org Git

c++: DECL_LOCAL_FUNCTION_P -> DECL_LOCAL_DECL_P

Our handling of block-scope extern decls is insufficient for modern
C++, in particular modules (but also constexprs).  We mark such local
function decls, and this patch extends that to marking local var decls
too, so mainly a macro rename.  Also, we set this flag earlier, rather
than learning about it when pushing the decl.  This is a step towards
handling these properly.
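
For readers unfamiliar with the construct, a block-scope extern
declaration looks roughly like the following.  This is only a minimal
sketch: the names counter, bump and use_local_externs are made up for
illustration and do not come from the patch.

```cpp
#include <cassert>

int counter = 41;                  // namespace-scope definition
int bump () { return ++counter; }  // namespace-scope definition

int use_local_externs ()
{
  extern int counter;  // block-scope extern variable declaration
  int bump ();         // block-scope function declaration

  int bumped = bump ();     // counter becomes 42, bumped is 42
  return bumped + counter;  // 42 + 42
}
```

Both local declarations refer to the namespace-scope entities; these are
the kinds of decl that the renamed flag is meant to mark.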

gcc/cp/
* cp-tree.h (DECL_LOCAL_FUNCTION_P): Rename to ...
(DECL_LOCAL_DECL_P): ... here.  Accept both fns and vars.
* decl.c (start_decl): Set DECL_LOCAL_DECL_P for local externs.
(omp_declare_variant_finalize_one): Use DECL_LOCAL_DECL_P.
(local_variable_p): Simplify.
* name-lookup.c (set_decl_context_in_fn): Assert DECL_LOCAL_DECL_P
is as expected.  Simplify.
(do_pushdecl): Don't set decl_context_in_fn for friends.
(is_local_extern): Simplify.
* call.c (equal_functions): Use DECL_LOCAL_DECL_P.
* parser.c (cp_parser_postfix_expression): Likewise.
(cp_parser_omp_declare_reduction): Likewise.
* pt.c (check_default_tmpl_args): Likewise.
(tsubst_expr): Assert nested reduction function is local.
(type_dependent_expression_p): Use DECL_LOCAL_DECL_P.
* semantics.c (finish_call_expr): Likewise.
libcc1/
* libcp1plugin.cc (plugin_build_call_expr): Use DECL_LOCAL_DECL_P.

[testsuite] Add missing require-effective-target alloca

Add a missing require-effective-target alloca directive.

Tested on nvptx.

gcc/testsuite/ChangeLog:

* gcc.dg/analyzer/vla-1.c: Add require-effective-target alloca.

Cygwin/MinGW: Do not version lto plugins

GCC on Linux already uses liblto_plugin.so directly, without
the libtool version suffix; adjust Windows GCC to do the same.

gcc/ChangeLog:

* config.host: Adjust plugin name for Windows.

lto-plugin/ChangeLog:

* Makefile.am: Drop versioning from libtool completely.
* Makefile.in: Regenerate.

[tree-optimization] Don't clear ctrl-altering flag for IFN_UNIQUE

There's an invariant for IFN_UNIQUE, listed here in
gimple_call_initialize_ctrl_altering:
...
      /* IFN_UNIQUE should be the last insn, to make checking for it
         as cheap as possible.  */
      || (gimple_call_internal_p (stmt)
          && gimple_call_internal_unique_p (stmt)))
    gimple_call_set_ctrl_altering (stmt, true);
...

Recent commit fab77644842 "tree-optimization/96931 - clear ctrl-altering flag
more aggressively" breaks this invariant, causing an ICE triggered during
libgomp testing for x86_64 with nvptx accelerator:
...
during RTL pass: mach
asyncwait-1.f90: In function ‘MAIN__._omp_fn.0’:
asyncwait-1.f90:19: internal compiler error: in nvptx_find_par, at \
  config/nvptx/nvptx.c:3293
...

Fix this by listing IFN_UNIQUE as an exception in
cleanup_call_ctrl_altering_flag.
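
The shape of such an exception can be sketched with a toy model.  The
struct and function below are hypothetical stand-ins, not GCC's GIMPLE
data structures, and the can_alter_cfg field is an assumed
simplification of the real conditions:

```cpp
#include <cassert>

// Toy stand-in for a GIMPLE call statement; not GCC's real representation.
struct call_sketch
{
  bool ctrl_altering;    // models the ctrl-altering flag
  bool internal_unique;  // models "this is a call to IFN_UNIQUE"
  bool can_alter_cfg;    // models "noreturn, may throw, ..." (assumption)
};

// Clear the flag when nothing justifies it, but always leave it set for
// the IFN_UNIQUE stand-in so the invariant quoted above keeps holding.
void cleanup_flag_sketch (call_sketch &call)
{
  if (!call.ctrl_altering)
    return;
  if (call.internal_unique)
    return;  // the exception
  if (!call.can_alter_cfg)
    call.ctrl_altering = false;
}
```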

Built for x86_64 with nvptx accelerator, tested libgomp.

gcc/ChangeLog:

PR tree-optimization/97000
* tree-cfgcleanup.c (cleanup_call_ctrl_altering_flag): Don't clear
flag for IFN_UNIQUE.

lto: Stream current working directory for first streamed relative filename and adjust relative paths [PR93865]

If the gcc -c -flto ... commands that compile some or all objects are run in a
different directory (or in different directories) from the directory in
which the gcc -flto link line is invoked, then .debug_line will be
incorrect if there are any relative filenames: it will use those relative
filenames while .debug_info will contain a different DW_AT_comp_dir.

The following patch streams (at most once after each clear_line_info)
the current working directory (what we record in DW_AT_comp_dir) when
encountering the first relative pathname.  When reading the location info,
it reads the directory back and, if the current working directory at that
point differs from the saved one, adjusts relative paths by prepending a
relative prefix that leads from the current working directory to the
previously saved path (with a fallback to an absolute directory, e.g. for
a DOS e:\\foo vs. d:\\bar change).
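
The path adjustment can be illustrated with a small standalone sketch.
It assumes POSIX-style absolute directories and plain string handling;
split_path, relative_prefix_sketch and adjust_relative_file are
hypothetical names, not the functions added by the patch, and the
absolute-directory fallback is not modelled:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Split a POSIX-style path into its non-empty components.
std::vector<std::string> split_path (const std::string &path)
{
  std::vector<std::string> parts;
  std::string cur;
  for (char c : path)
    {
      if (c == '/')
        {
          if (!cur.empty ())
            parts.push_back (cur);
          cur.clear ();
        }
      else
        cur += c;
    }
  if (!cur.empty ())
    parts.push_back (cur);
  return parts;
}

// Prefix that leads from CURRENT_PWD to SAVED_PWD, e.g. "../objs/".
std::string relative_prefix_sketch (const std::string &saved_pwd,
                                    const std::string &current_pwd)
{
  std::vector<std::string> saved = split_path (saved_pwd);
  std::vector<std::string> cur = split_path (current_pwd);
  std::size_t common = 0;
  while (common < saved.size () && common < cur.size ()
         && saved[common] == cur[common])
    ++common;
  std::string prefix;
  for (std::size_t i = common; i < cur.size (); ++i)
    prefix += "../";  // climb out of the current directory
  for (std::size_t i = common; i < saved.size (); ++i)
    prefix += saved[i] + "/";  // descend into the saved directory
  return prefix;
}

// Adjust FILE, recorded relative to SAVED_PWD, for use from CURRENT_PWD.
std::string adjust_relative_file (const std::string &file,
                                  const std::string &saved_pwd,
                                  const std::string &current_pwd)
{
  if (!file.empty () && file[0] == '/')
    return file;  // absolute names need no adjustment
  return relative_prefix_sketch (saved_pwd, current_pwd) + file;
}
```

For instance, a file recorded as src/foo.c while compiling in
/build/objs would be referred to as ../objs/src/foo.c when linking from
/build/link.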

2020-09-10  Jakub Jelinek  <jakub@redhat.com>

PR debug/93865
* lto-streamer.h (struct output_block): Add emit_pwd member.
* lto-streamer-out.c: Include toplev.h.
(clear_line_info): Set emit_pwd.
(lto_output_location_1): Encode the ob->current_file != xloc.file
bit directly into the location number.  If changing file, emit
additionally a bit whether pwd is emitted and emit it before the
first relative pathname since clear_line_info.
(output_function, output_constructor): Don't call clear_line_info
here.
* lto-streamer-in.c (struct string_pair_map): New type.
(struct string_pair_map_hasher): New type.
(string_pair_map_hasher::hash): New method.
(string_pair_map_hasher::equal): New method.
(path_name_pair_hash_table, string_pair_map_allocator): New variables.
(relative_path_prefix, canon_relative_path_prefix,
canon_relative_file_name): New functions.
(canon_file_name): Add relative_prefix argument, if non-NULL
and string is a relative path, return canon_relative_file_name.
(lto_location_cache::input_location_and_block): Decode file change
bit from the location number.  If changing file, unpack bit whether
pwd is streamed and stream in pwd.  Adjust canon_file_name caller.
(lto_free_file_name_hash): Delete path_name_pair_hash_table
and string_pair_map_allocator.

tree-optimization/96043 - BB vectorization costing improvement

This makes the BB vectorizer cost independent SLP subgraphs
separately.  On pristine trunk and for x86_64 I failed to
distill a testcase where the vectorizer would think _any_
basic-block vectorization opportunity is not profitable, but I do
have pending work in which the cost savings of a profitable
opportunity would cause another, independently unprofitable
opportunity to be vectorized.
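
The effect of costing subgraphs separately can be shown with a toy cost
model.  The struct, the function names and the "vector cost below scalar
cost" criterion are assumptions made for this sketch, not the
vectorizer's actual cost hooks:

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-subgraph cost summary.
struct subgraph_cost_sketch
{
  int scalar_cost;
  int vector_cost;
};

// One decision for everything: costs of all subgraphs are summed.
bool profitable_combined (const std::vector<subgraph_cost_sketch> &graphs)
{
  int scalar = 0, vector = 0;
  for (const subgraph_cost_sketch &g : graphs)
    {
      scalar += g.scalar_cost;
      vector += g.vector_cost;
    }
  return vector < scalar;
}

// One decision per independent subgraph.
std::vector<bool>
profitable_per_subgraph (const std::vector<subgraph_cost_sketch> &graphs)
{
  std::vector<bool> keep;
  for (const subgraph_cost_sketch &g : graphs)
    keep.push_back (g.vector_cost < g.scalar_cost);
  return keep;
}
```

With subgraphs costing {10, 4} and {3, 6} (scalar, vector), the combined
check accepts both because 10 < 13, while the per-subgraph check keeps
only the first.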

2020-09-08 Richard Biener <rguenther@suse.de>

PR tree-optimization/96043
* tree-vectorizer.h (_slp_instance::cost_vec): New.
(_slp_instance::subgraph_entries): Likewise.
(BB_VINFO_TARGET_COST_DATA): Remove.
* tree-vect-slp.c (vect_free_slp_instance): Free
cost_vec and subgraph_entries.
(vect_analyze_slp_instance): Initialize them.
(vect_slp_analyze_operations): Defer passing costs to
the target, instead record them in the SLP graph entry.
(get_ultimate_leader): New helper for graph partitioning.
(vect_bb_partition_graph_r): Likewise.
(vect_bb_partition_graph): New function to partition the
SLP graph into independently costable parts.
(vect_bb_vectorization_profitable_p): Adjust to work on
a subgraph.
(vect_bb_vectorization_profitable_p): New wrapper,
discarding non-profitable vectorization of subgraphs.
(vect_slp_analyze_bb_1): Call vect_bb_partition_graph before
costing.

* gcc.dg/vect/costmodel/x86_64/costmodel-pr69297.c: Adjust.

Fixup config/ChangeLog.

Daily bump.

c++: Further tweaks for new-expression and paren-init [PR77841]

This patch corrects our handling of array new-expression with ()-init:

  new int[4](1, 2, 3, 4);

should work even with the explicit array bound, and

  new char[3]("so_sad");

should cause an error, but we weren't giving any.

Fixed by handling array new-expressions with ()-init in the same spot
where we deduce the array bound in array new-expression.  I'm now
always passing STRING_CSTs to build_new_1 wrapped in { } which allowed
me to remove the special handling of STRING_CSTs in build_new_1.  And
since the DIRECT_LIST_INIT_P block in build_new_1 calls digest_init, we
report errors about too short arrays. reshape_init now does the {"foo"}
-> "foo" transformation even for CONSTRUCTOR_IS_PAREN_INIT, so no need
to do it in build_new.

I took a stab at cp_complete_array_type's "FIXME: this code is duplicated
from reshape_init", but calling reshape_init there, I ran into issues
with has_designator_problem: when we reshape an already reshaped
CONSTRUCTOR again, d.cur.index has been filled, so we think that we
have a user-provided designator (though there was no designator in the
source code), and report an error.

gcc/cp/ChangeLog:

PR c++/77841
* decl.c (reshape_init): If we're initializing a char array from
a string-literal that is enclosed in braces, unwrap it.
* init.c (build_new_1): Don't handle string-initializers here.
(build_new): Handle new-expression with paren-init when the
array bound is known.  Always pass string constants to build_new_1
enclosed in braces.  Don't handle string-initializers in any
special way.

gcc/testsuite/ChangeLog:

PR c++/77841
* g++.old-deja/g++.ext/arrnew2.C: Expect the error only in C++17
and less.
* g++.old-deja/g++.robertl/eb58.C: Adjust dg-error.
* g++.old-deja/g++.robertl/eb63.C: Expect the error only in C++17
and less.
* g++.dg/cpp2a/new-array5.C: New test.
* g++.dg/cpp2a/paren-init36.C: New test.
* g++.dg/cpp2a/paren-init37.C: New test.
* g++.dg/pr84729.C: Adjust dg-error.

c++: Fix ICE in reshape_init with init-list [PR95164]

This patch fixes a long-standing bug in reshape_init_r.  Since r209314
we implement DR 1467 which handles list-initialization with a single
initializer of the same type as the target.  In this test this causes
a crash in reshape_init_r when we're processing a constructor that has
undergone the DR 1467 transformation.

Take e.g. the

  foo({{1, {H{k}}}});

line in the attached test.  {H{k}} initializes the field b of H in I.
H{k} is a functional cast, so has TREE_HAS_CONSTRUCTOR set, so is
COMPOUND_LITERAL_P.  We perform the DR 1467 transformation and turn
{H{k}} into H{k}.  Then we attempt to reshape H{k} again and since
first_initializer_p is null and it's COMPOUND_LITERAL_P, we go here:

           else if (COMPOUND_LITERAL_P (stripped_init))
             gcc_assert (!BRACE_ENCLOSED_INITIALIZER_P (stripped_init));

then complain about the missing braces, go to reshape_init_class and ICE
on
               gcc_checking_assert (d->cur->index
                                    == get_class_binding (type, id));

because due to the missing { } we're looking for 'b' in H, but that's
not found.

So we have to be prepared to handle an initializer whose outer braces
have been removed due to DR 1467.
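
As a reminder of what DR 1467 permits, here is a minimal sketch.  The
struct H below is a made-up one-field aggregate, not the H from the
attached test:

```cpp
#include <cassert>

struct H { int a; };  // hypothetical aggregate, for illustration only

int copy_via_single_element_list ()
{
  H k{7};
  // DR 1467: a braced list holding a single H initializes COPY from
  // that element rather than trying to initialize copy.a from k.
  H copy{k};
  return copy.a;
}
```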

gcc/cp/ChangeLog:

PR c++/95164
* decl.c (reshape_init_r): When initializing an aggregate member
with an initializer from an initializer-list, also consider
COMPOUND_LITERAL_P.

gcc/testsuite/ChangeLog:

PR c++/95164
* g++.dg/cpp0x/initlist123.C: New test.

analyzer: generalize sm-malloc to new/delete [PR94355]

This patch generalizes the state machine in sm-malloc.c to support
multiple allocator APIs, and adds just enough support for C++ new and
delete to demonstrate the feature, allowing for detection of code
paths where the result of new in C++ can leak - for some crude examples,
at least (bearing in mind that the analyzer doesn't yet know about
e.g. vfuncs, exceptions, inheritance, RTTI, etc.)

It also implements a new warning: -Wanalyzer-mismatching-deallocation.
For example:

demo.cc: In function 'void test()':
demo.cc:8:8: warning: 'f' should have been deallocated with 'delete'
  but was deallocated with 'free' [CWE-762] [-Wanalyzer-mismatching-deallocation]
    8 |   free (f);
      |   ~~~~~^~~
  'void test()': events 1-2
    |
    |    7 |   foo *f = new foo;
    |      |                ^~~
    |      |                |
    |      |                (1) allocated here (expects deallocation with 'delete')
    |    8 |   free (f);
    |      |   ~~~~~~~~
    |      |        |
    |      |        (2) deallocated with 'free' here; allocation at (1) expects deallocation with 'delete'
    |
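
The pairing check behind this warning can be sketched as a toy lookup.
Everything below (api_sketch, expected_deallocator, check_deallocation)
is hypothetical and far simpler than the state machine in sm-malloc.cc;
the message text only loosely follows the diagnostic above:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the three allocator APIs named in the patch.
enum class api_sketch { malloc_free, scalar_new, vector_new };

const char *expected_deallocator (api_sketch api)
{
  switch (api)
    {
    case api_sketch::malloc_free: return "free";
    case api_sketch::scalar_new:  return "delete";
    case api_sketch::vector_new:  return "delete[]";
    }
  return "";
}

// Return an empty string when the pair matches, else a message.
std::string check_deallocation (api_sketch allocated_with,
                                const std::string &dealloc_name)
{
  std::string expected = expected_deallocator (allocated_with);
  if (dealloc_name == expected)
    return "";
  return "should have been deallocated with '" + expected
         + "' but was deallocated with '" + dealloc_name + "'";
}
```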

The patch also adds just enough knowledge of exception-handling to
suppress a false positive from -Wanalyzer-malloc-leak on
g++.dg/analyzer/pr96723.C on the exception-handling CFG edge after
operator new.  It does this by adding a constraint that the result is
NULL if an exception was thrown from operator new, since the result from
operator new is lost when following that exception-handling CFG edge.

gcc/analyzer/ChangeLog:
PR analyzer/94355
* analyzer.opt (Wanalyzer-mismatching-deallocation): New warning.
* region-model-impl-calls.cc
(region_model::impl_call_operator_new): New.
(region_model::impl_call_operator_delete): New.
* region-model.cc (region_model::on_call_pre): Detect operator new
and operator delete.
(region_model::on_call_post): Likewise.
(region_model::maybe_update_for_edge): Detect EH edges and call...
(region_model::apply_constraints_for_exception): New function.
* region-model.h (region_model::impl_call_operator_new): New decl.
(region_model::impl_call_operator_delete): New decl.
(region_model::apply_constraints_for_exception): New decl.
* sm-malloc.cc (enum resource_state): New.
(struct allocation_state): New state subclass.
(enum wording): New.
(struct api): New.
(malloc_state_machine::custom_data_t): New typedef.
(malloc_state_machine::add_state): New decl.
(malloc_state_machine::m_unchecked)
(malloc_state_machine::m_nonnull)
(malloc_state_machine::m_freed): Delete these states in favor
of...
(malloc_state_machine::m_malloc)
(malloc_state_machine::m_scalar_new)
(malloc_state_machine::m_vector_new): ...these new api instances,
which own their own versions of these states.
(malloc_state_machine::on_allocator_call): New decl.
(malloc_state_machine::on_deallocator_call): New decl.
(api::api): New ctor.
(dyn_cast_allocation_state): New.
(as_a_allocation_state): New.
(get_rs): New.
(unchecked_p): New.
(nonnull_p): New.
(freed_p): New.
(malloc_diagnostic::describe_state_change): Use unchecked_p and
nonnull_p.
(class mismatching_deallocation): New.
(double_free::double_free): Add funcname param for initializing
m_funcname.
(double_free::emit): Use m_funcname in warning message rather
than hardcoding "free".
(double_free::describe_state_change): Likewise.  Use freed_p.
(double_free::describe_call_with_state): Use freed_p.
(double_free::describe_final_event): Use m_funcname in message
rather than hardcoding "free".
(double_free::m_funcname): New field.
(possible_null::describe_state_change): Use unchecked_p.
(possible_null::describe_return_of_state): Likewise.
(use_after_free::use_after_free): Add param for initializing m_api.
(use_after_free::emit): Use m_api->m_dealloc_funcname in message
rather than hardcoding "free".
(use_after_free::describe_state_change): Use freed_p.  Change the
wording of the message based on the API.
(use_after_free::describe_final_event): Use
m_api->m_dealloc_funcname in message rather than hardcoding
"free".  Change the wording of the message based on the API.
(use_after_free::m_api): New field.
(malloc_leak::describe_state_change): Use unchecked_p.  Update
for renaming of m_malloc_event to m_alloc_event.
(malloc_leak::describe_final_event): Update for renaming of
m_malloc_event to m_alloc_event.
(malloc_leak::m_malloc_event): Rename...
(malloc_leak::m_alloc_event): ...to this.
(free_of_non_heap::free_of_non_heap): Add param for initializing
m_funcname.
(free_of_non_heap::emit): Use m_funcname in message rather than
hardcoding "free".
(free_of_non_heap::describe_final_event): Likewise.
(free_of_non_heap::m_funcname): New field.
(allocation_state::dump_to_pp): New.
(allocation_state::get_nonnull): New.
(malloc_state_machine::malloc_state_machine): Update for changes
to state fields and new api fields.
(malloc_state_machine::add_state): New.
(malloc_state_machine::on_stmt): Move malloc/calloc handling to
on_allocator_call and call it, passing in the API pointer.
Likewise for free, moving it to on_deallocator_call.  Handle calls
to operator new and delete in an analogous way.  Use unchecked_p
when testing for possibly-null-arg and possibly-null-deref, and
transition to the non-null for the correct API.  Remove redundant
node param from call to on_zero_assignment.  Use freed_p for
use-after-free check, and pass in API.
(malloc_state_machine::on_allocator_call): New, based on code in
on_stmt.
(malloc_state_machine::on_deallocator_call): Likewise.
(malloc_state_machine::on_phi): Mark node param with
ATTRIBUTE_UNUSED; don't pass it to on_zero_assignment.
(malloc_state_machine::on_condition): Mark node param with
ATTRIBUTE_UNUSED.  Replace on_transition calls with get_state and
set_next_state pairs, transitioning to the non-null state for the
appropriate API.
(malloc_state_machine::can_purge_p): Port to new state approach.
(malloc_state_machine::on_zero_assignment): Replace on_transition
calls with get_state and set_next_state pairs.  Drop redundant
node param.
* sm.h (state_machine::add_custom_state): New.

gcc/ChangeLog:
PR analyzer/94355
* doc/invoke.texi: Document -Wanalyzer-mismatching-deallocation.

gcc/testsuite/ChangeLog:
PR analyzer/94355
* g++.dg/analyzer/new-1.C: New test.
* g++.dg/analyzer/new-vs-malloc.C: New test.

Update include/ChangeLog

ChangeLog entry did not get properly updated with previous commit.
Fix that.

2020-09-09 Caroline Tice <cmtice@google.com>

include/

* dwarf2.h (enum dwarf_sect_v5): A new enum section for the
sections in a DWARF 5 DWP file (DWP version 5).

Add codes for DWARF v5 .dwp sections to dwarf2.h.

(Note: This patch has already been accepted/committed in binutils/GDB.
This will bring the same change into the GCC tree.)

For DWARF v5 Dwarf Package Files (.dwp files), the section identifier encodings
have changed. This patch updates dwarf2.h to contain the new
encodings.  The table below shows the old & new encodings:
[ref http://dwarfstd.org/doc/DWARF5.pdf, section 7.3.5. ]

Val  DW4 section             DW4 section id       DW5 section             DW5 section id
---  ----------------------  -------------------  ----------------------  -------------------
1    .debug_info.dwo         DW_SECT_INFO         .debug_info.dwo         DW_SECT_INFO
2    .debug_types.dwo        DW_SECT_TYPES        --                      reserved
3    .debug_abbrev.dwo       DW_SECT_ABBREV       .debug_abbrev.dwo       DW_SECT_ABBREV
4    .debug_line.dwo         DW_SECT_LINE         .debug_line.dwo         DW_SECT_LINE
5    .debug_loc.dwo          DW_SECT_LOC          .debug_loclists.dwo     DW_SECT_LOCLISTS
6    .debug_str_offsets.dwo  DW_SECT_STR_OFFSETS  .debug_str_offsets.dwo  DW_SECT_STR_OFFSETS
7    .debug_macinfo.dwo      DW_SECT_MACINFO      .debug_macro.dwo        DW_SECT_MACRO
8    .debug_macro.dwo        DW_SECT_MACRO        .debug_rnglists.dwo     DW_SECT_RNGLISTS

2020-09-09  Caroline Tice  <cmtice@google.com>

include/

* dwarf2.h (enum dwarf_sect_v5): A new enum section for the
sections in a DWARF 5 DWP file (DWP version 5).

analyzer: eliminate sm_context::warn_for_state in favor of a new 'warn' vfunc

This patch is yet more preliminary work towards generalizing sm-malloc.cc
beyond just malloc/free.

It replaces sm_context::warn_for_state with a new sm_context::warn
vfunc, guarded by sm_context::get_state calls.

gcc/analyzer/ChangeLog:
* diagnostic-manager.cc
(null_assignment_sm_context::warn_for_state): Replace with...
(null_assignment_sm_context::warn): ...this.
* engine.cc (impl_sm_context::warn_for_state): Replace with...
(impl_sm_context::warn): ...this.
* sm-file.cc (fileptr_state_machine::on_stmt): Replace
warn_for_state and on_transition calls with a get_state
test guarding warn and set_next_state calls.
* sm-malloc.cc (malloc_state_machine::on_stmt): Likewise.
* sm-pattern-test.cc (pattern_test_state_machine::on_condition):
Replace warn_for_state call with warn call.
* sm-sensitive.cc
(sensitive_state_machine::warn_for_any_exposure): Replace
warn_for_state call with a get_state test guarding a warn call.
* sm-signal.cc (signal_state_machine::on_stmt): Likewise.
* sm-taint.cc (taint_state_machine::on_stmt): Replace
warn_for_state and on_transition calls with a get_state
test guarding warn and set_next_state calls.
* sm.h (sm_context::warn_for_state): Replace with...
(sm_context::warn): ...this.

analyzer: reimplement on_transition in terms of get_state/set_next_state

This patch is further preliminary work towards generalizing sm-malloc.cc
beyond just malloc/free.

Reimplement sm_context's on_transition vfunc in terms of new get_state
and set_next_state vfuncs, so that in followup patches we can implement
richer transitions (e.g. where the states are parametrized by
allocator).
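
The split can be pictured with a stripped-down sketch.  The classes
below are hypothetical: the real vfuncs take more parameters, and states
and variables are modelled here as plain ints:

```cpp
#include <cassert>

// Simplified model of the interface; not the declarations in sm.h.
class sm_context_sketch
{
public:
  virtual ~sm_context_sketch () {}

  // The two primitives that implementations must provide.
  virtual int get_state (int var) = 0;
  virtual void set_next_state (int var, int to) = 0;

  // No longer virtual: expressed through the two primitives above.
  void on_transition (int var, int from, int to)
  {
    if (get_state (var) == from)
      set_next_state (var, to);
  }
};

// Minimal concrete context that tracks the state of a single variable.
class single_var_context : public sm_context_sketch
{
public:
  explicit single_var_context (int start) : m_state (start) {}
  int get_state (int) override { return m_state; }
  void set_next_state (int, int to) override { m_state = to; }

private:
  int m_state;
};
```

Callers that need something richer than "if in FROM, go to TO" can then
call get_state and set_next_state directly.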

gcc/analyzer/ChangeLog:
* diagnostic-manager.cc
(null_assignment_sm_context::null_assignment_sm_context): Add old_state
and ext_state params, initializing m_old_state and m_ext_state.
(null_assignment_sm_context::on_transition): Split into...
(null_assignment_sm_context::get_state): ...this new vfunc
implementation and...
(null_assignment_sm_context::set_next_state): ...this new vfunc
implementation.
(null_assignment_sm_context::m_old_state): New field.
(null_assignment_sm_context::m_ext_state): New field.
(diagnostic_manager::add_events_for_eedge): Pass in old state and
ext_state when creating sm_ctxt.
* engine.cc (impl_sm_context::on_transition): Split into...
(impl_sm_context::get_state): ...this new vfunc
implementation and...
(impl_sm_context::set_next_state): ...this new vfunc
implementation.
* sm.h (sm_context::get_state): New pure virtual function.
(sm_context::set_next_state): Likewise.
(sm_context::on_transition): Convert from a pure virtual function
to a regular function implemented in terms of get_state and
set_next_state.

analyzer: use objects for state_machine::state_t

This patch is preliminary work towards generalizing sm-malloc.cc so that
it can check APIs other than just malloc/free (and e.g. detect
mismatching alloc/dealloc pairs).

Generalize states in state machines so that, rather than state_t being
just an "unsigned", it becomes a "const state *", where the underlying
state objects are immutable objects managed by the state machine in
question, and can e.g. have vfuncs and extra fields.  The start state
m_start becomes a member of the state_machine base class.
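
A rough sketch of the idea, using a hypothetical state_machine_sketch
class and std::unique_ptr for ownership (the real code in sm.h and sm.cc
differs in its details):

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

class state_machine_sketch
{
public:
  // Immutable state object; subclasses could add vfuncs and fields.
  class state
  {
  public:
    state (const std::string &name, unsigned id) : m_name (name), m_id (id) {}
    virtual ~state () {}
    const std::string &get_name () const { return m_name; }
    unsigned get_id () const { return m_id; }

  private:
    std::string m_name;
    unsigned m_id;
  };

  typedef const state *state_t;  // previously a bare "unsigned"

  // The start state lives in the base class and is created first.
  state_machine_sketch () : m_start (add_state ("start")) {}

  state_t add_state (const std::string &name)
  {
    unsigned id = static_cast<unsigned> (m_states.size ());
    m_states.push_back (std::unique_ptr<state> (new state (name, id)));
    return m_states.back ().get ();
  }

  state_t get_start_state () const { return m_start; }

private:
  std::vector<std::unique_ptr<state> > m_states;  // owns the state objects
  state_t m_start;  // declared after m_states, so initialized after it
};
```

Comparing two state_t values is then a pointer comparison, and nothing
needs to hardcode 0 as the start state.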

gcc/analyzer/ChangeLog:
* checker-path.cc (state_change_event::get_desc): Update
state_machine::get_state_name calls to state::get_name.
(warning_event::get_desc): Likewise.
* diagnostic-manager.cc
(null_assignment_sm_context::on_transition): Update comparison
against 0 with comparison with m_sm.get_start_state.
(diagnostic_manager::prune_for_sm_diagnostic): Update
state_machine::get_state_name calls to state::get_name.
* engine.cc (impl_sm_context::on_transition): Likewise.
(exploded_node::get_dot_fillcolor): Use get_id when summing
the sm states.
* program-state.cc (sm_state_map::sm_state_map): Don't hardcode
0 as the start state when initializing m_global_state.
(sm_state_map::print): Use dump_to_pp rather than get_state_name
when dumping states.
(sm_state_map::is_empty_p): Don't hardcode 0 as the start state
when examining m_global_state.
(sm_state_map::hash): Use get_id when hashing states.
(selftest::test_sm_state_map): Use state objects rather than
arbitrary hardcoded integers.
(selftest::test_program_state_merging): Likewise.
(selftest::test_program_state_merging_2): Likewise.
* sm-file.cc (fileptr_state_machine::m_start): Move to base class.
(file_diagnostic::describe_state_change): Use get_start_state.
(fileptr_state_machine::fileptr_state_machine): Drop m_start
initialization.
* sm-malloc.cc (malloc_state_machine::m_start): Move to base
class.
(malloc_diagnostic::describe_state_change): Use get_start_state.
(possible_null::describe_state_change): Likewise.
(malloc_state_machine::malloc_state_machine): Drop m_start
initialization.
* sm-pattern-test.cc (pattern_test_state_machine::m_start): Move
to base class.
(pattern_test_state_machine::pattern_test_state_machine): Drop
m_start initialization.
* sm-sensitive.cc (sensitive_state_machine::m_start): Move to base
class.
(sensitive_state_machine::sensitive_state_machine): Drop m_start
initialization.
* sm-signal.cc (signal_state_machine::m_start): Move to base
class.
(signal_state_machine::signal_state_machine): Drop m_start
initialization.
* sm-taint.cc (taint_state_machine::m_start): Move to base class.
(taint_state_machine::taint_state_machine): Drop m_start
initialization.
* sm.cc (state_machine::state::dump_to_pp): New.
(state_machine::state_machine): Move here from sm.h.  Initialize
m_next_state_id and m_start.
(state_machine::add_state): Reimplement in terms of state objects.
(state_machine::get_state_name): Delete.
(state_machine::get_state_by_name): Reimplement in terms of state
objects.  Make const.
(state_machine::validate): Delete.
(state_machine::dump_to_pp): Reimplement in terms of state
objects.
* sm.h (state_machine::state): New class.
(state_machine::state_t): Convert typedef from "unsigned" to
"const state_machine::state *".
(state_machine::state_machine): Move to sm.cc.
(state_machine::get_default_state): Use m_start rather than
hardcoding 0.
(state_machine::get_state_name): Delete.
(state_machine::get_state_by_name): Make const.
(state_machine::get_start_state): New accessor.
(state_machine::alloc_state_id): New.
(state_machine::m_state_names): Drop in favor of...
(state_machine::m_states): New field.
(state_machine::m_start): New field.
(start_start_p): Delete.

c++: omp reduction cleanups

omp reductions are modeled as nested functions, which is a thing C++
doesn't have.  That led to much confusion until I figured out what was
happening, not helped by some duplicate code and inconsistencies in
the dependent and non-dependent paths.  This patch removes the parser
duplication and fixes up some bookkeeping.  Added some asserts and
comments too.

gcc/cp/
* parser.c (cp_parser_omp_declare_reduction): Refactor to avoid
code duplication.  Update DECL_TI_TEMPLATE's context.
* pt.c (tsubst_expr): For OMP reduction function, set context to
global_namespace before pushing.
(tsubst_omp_udr): Assert current_function_decl, add comment about
decl context.

testsuite: Use C++14 in g++.dg/warn/Wnonnull6.C.

This test uses C++14 features so is failing with -std=c++11.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wnonnull6.C: Use target c++14.

testsuite: Move auto-96647.C to cpp1y/.

This test uses a C++14 feature so fails with -std=c++11. Therefore
I've moved it to cpp1y/ and used target c++14.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/auto-96647.C: Moved to...
* g++.dg/cpp1y/auto-96647.C: ...here. Use target c++14.

x32: Update gcc.target/i386/builtin_thread_pointer.c

Update gcc.target/i386/builtin_thread_pointer.c for x32.  For

int
foo3 (int i)
{
  int* p = (int*) __builtin_thread_pointer ();
  return p[i];
}

we can't generate:

movl %fs:0(,%edi,4), %eax
ret

for x32 since the address of %fs:0(,%edi,4) is %fs plus 0(,%edi,4)
zero-extended to 64 bits.  Instead, we generate:

movl %fs:0, %eax
movl (%eax,%edi,4), %eax
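
The difference between the two sequences shows up for a negative index.
The following sketch models the address arithmetic only; the function
names are made up, and it assumes the thread pointer value equals the
%fs base:

```cpp
#include <cassert>
#include <cstdint>

// Address used by "movl %fs:0(,%edi,4), %eax" in this model: the 32-bit
// value 0(,%edi,4) is zero-extended and then added to the %fs base.
std::uint64_t single_insn_address (std::uint64_t fs_base, std::int32_t i)
{
  std::uint32_t offset = static_cast<std::uint32_t> (i) * 4u;  // wraps mod 2^32
  return fs_base + static_cast<std::uint64_t> (offset);
}

// Address used by the two-instruction sequence: the thread pointer is
// loaded first and the sum is formed in 32 bits before zero-extension.
std::uint64_t two_insn_address (std::uint32_t thread_pointer, std::int32_t i)
{
  std::uint32_t sum = thread_pointer + static_cast<std::uint32_t> (i) * 4u;
  return sum;
}
```

With a thread pointer of 0x1000 and i == -1, the single-instruction form
yields 0x100000ffc, whereas the two-instruction form yields the intended
0xffc.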

PR target/96955
* gcc.target/i386/builtin_thread_pointer.c: Update scan-assembler
for x32.

libphobos: Include <cet.h> to generate the CET marker for -fcf-protection

Include <cet.h> to generate the CET marker for -fcf-protection to avoid

/bin/ld: ../libdruntime/.libs/libgdruntime_convenience.a(libgdruntime_convenience_la-switchcontext.o): error: missing IBT and SHSTK properties

when -z cet-report=error is passed to the linker to create libgphobos.so
and libgdruntime.so.

PR d/95680
* libdruntime/config/x86/switchcontext.S: Include <cet.h> to
generate the CET marker for -fcf-protection.

[nvptx, libgcc] Fix Wbuiltin-declaration-mismatch in atomic.c

When building for target nvptx, we get this and similar warnings for libgcc:
...
src/libgcc/config/nvptx/atomic.c:39:1: warning: conflicting types for \
  built-in function ‘__sync_val_compare_and_swap_1’; expected \
  ‘unsigned char(volatile void *, unsigned char,  unsigned char)’ \
  [-Wbuiltin-declaration-mismatch]
...

Fix this by making sure in atomic.c that the pointers used are of type
'volatile void *'.

Tested by rebuilding atomic.c.

libgcc/ChangeLog:

* config/nvptx/atomic.c (__SYNC_SUBWORD_COMPARE_AND_SWAP): Fix
Wbuiltin-declaration-mismatch.

bb-reorder: Remove a misfiring micro-optimization (PR96475)

When the compgotos pass copies the tail of blocks ending in an indirect
jump, there is a micro-optimization to not copy the last one, since the
original block will then just be deleted.  This does not work properly
if cleanup_cfg does not merge all pairs of blocks we expect it to.  It
also does not work if that last block can be merged into multiple
predecessors.

2020-09-09  Segher Boessenkool  <segher@kernel.crashing.org>

PR rtl-optimization/96475
* bb-reorder.c (maybe_duplicate_computed_goto): Remove single_pred_p
micro-optimization.

If the lto plugin encounters a file with multiple symbol sections, each of
which also has a v1 symbol extension section[1], then it will attempt to
read the extension data for *every* symbol from each of the extension
sections.  This results in reading off the end of a buffer, with the
memory corruption that entails.  This patch fixes that problem.

2020-09-09  Nick Clifton  <nickc@redhat.com>

* lto-plugin.c (struct plugin_symtab): Add last_sym field.
(parse_symtab_extension): Only read as many entries as are
available in the buffer.  Store the data read into the symbol
table indexed from last_sym.  Increment last_sym.
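
The bookkeeping described in the entry above can be sketched as follows.
This is a toy model with hypothetical names and a one-byte-per-symbol
layout chosen for simplicity; it does not reproduce the plugin's actual
on-disk format:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy symbol table: one extension byte per symbol.
struct symtab_sketch
{
  std::vector<unsigned char> sym_ext;  // extension data, one slot per symbol
  std::size_t last_sym;                // first symbol without extension data
};

// Consume one extension section holding N_ENTRIES bytes in DATA.  Only
// the entries actually present are read, and they are stored starting
// at last_sym, which advances as entries are consumed.
void parse_extension_sketch (symtab_sketch &tab, const unsigned char *data,
                             std::size_t n_entries)
{
  for (std::size_t i = 0;
       i < n_entries && tab.last_sym < tab.sym_ext.size (); ++i)
    tab.sym_ext[tab.last_sym++] = data[i];
}
```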

[nvptx] Fix Wformat in nvptx_assemble_decl_begin

I'm running into this warning:
...
src/gcc/config/nvptx/nvptx.c: In function \
  ‘void nvptx_assemble_decl_begin(FILE*, const char*, const char*, \
  const_tree, long int, unsigned int, bool)’:
src/gcc/config/nvptx/nvptx.c:2229:29: warning: format ‘%d’ expects argument \
  of type ‘int’, but argument 5 has type ‘long unsigned int’ [-Wformat=]
     elt_size * BITS_PER_UNIT);
                             ^
...
which I seem to have introduced in commit b9c7fe59f9f "[nvptx] Fix array
dimension in nvptx_assemble_decl_begin", but did not notice because I
configured with --disable-build-format-warnings.

Fix this by using the appropriate format.
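
The general shape of such a fix, as a standalone sketch: the function
name and the constant 8 (standing in for BITS_PER_UNIT) are assumptions,
and "%lu" is simply the standard conversion for unsigned long, not
necessarily the exact format the commit uses:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

std::string format_element_bits (unsigned long elt_size)
{
  const unsigned long bits_per_unit = 8;  // assumed stand-in for BITS_PER_UNIT
  char buf[32];
  // "%lu" matches unsigned long; "%d" here would trigger -Wformat.
  std::snprintf (buf, sizeof buf, "%lu", elt_size * bits_per_unit);
  return buf;
}
```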

Rebuilt cc1 on nvptx.

gcc/ChangeLog:

* config/nvptx/nvptx.c (nvptx_assemble_decl_begin): Fix Wformat
warning.

c++: Fix resolving the address of overloaded pmf [PR96647]

In resolve_address_of_overloaded_function, currently only the second
pass over the overload set (which considers just the function templates
in the overload set) checks constraints and performs return type
deduction when necessary. But as the testcases below show, we need to
do the same when considering non-template functions during the first
pass.
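
A case of the kind described, loosely modelled on the PR title (the
actual testcases are not shown in this log, so the names below are made
up), needs C++14 deduced return types:

```cpp
#include <cassert>

template <typename T>
struct Base
{
  auto f (int) { return 1; }     // return type deduced as int
  auto f (char) { return 'c'; }  // return type deduced as char
};

// Matching an overload against the target type requires its deduced
// return type to be known, although f is not a function template.
int (Base<void>::*select_int_overload) (int) = &Base<void>::f;

int call_selected ()
{
  Base<void> b;
  return (b.*select_int_overload) (0);
}
```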

gcc/cp/ChangeLog:

PR c++/96647
* class.c (resolve_address_of_overloaded_function): Check
constraints_satisfied_p and perform return-type deduction via
maybe_instantiate_decl when considering non-template functions
in the overload set.
* cp-tree.h (maybe_instantiate_decl): Declare.
* decl2.c (maybe_instantiate_decl): Remove static.

gcc/testsuite/ChangeLog:

PR c++/96647
* g++.dg/cpp0x/auto-96647.C: New test.
* g++.dg/cpp0x/error9.C: New test.
* g++.dg/cpp2a/concepts-fn6.C: New test.

fix useless unsharing of SLP tree

This avoids unsharing the SLP tree when optimizing load permutations
for reductions but there is no actual permute taking place.

2020-09-09 Richard Biener <rguenther@suse.de>

* tree-vect-slp.c (vect_attempt_slp_rearrange_stmts): Do
nothing when the permutation doesn't permute.
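
The early exit amounts to an identity check on the permutation.  A
minimal sketch with hypothetical names, assuming the permutation has the
same length as the lane vector and only holds valid indices:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

bool permutation_is_identity (const std::vector<unsigned> &perm)
{
  for (std::size_t i = 0; i < perm.size (); ++i)
    if (perm[i] != i)
      return false;
  return true;
}

// Rearrange LANES according to PERM.  Returns false, leaving LANES
// untouched, when the permutation would not move anything.
bool rearrange_sketch (std::vector<int> &lanes,
                       const std::vector<unsigned> &perm)
{
  if (permutation_is_identity (perm))
    return false;
  std::vector<int> out (lanes.size ());
  for (std::size_t i = 0; i < perm.size (); ++i)
    out[i] = lanes[perm[i]];
  lanes.swap (out);
  return true;
}
```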

[nvptx] Fix boolean type test in write_fn_proto

When running this libgomp testcase for nvptx accelerator:
...
/* { dg-do run } */
__uint128_t v;
int main () {
  #pragma omp target
  {
    __uint128_t exp = 2;
    __atomic_compare_exchange_n (&v, &exp, 7, false, __ATOMIC_RELEASE,
                                 __ATOMIC_ACQUIRE);
  }
}
...
we run into this assert in write_fn_proto:
...
913             gcc_assert (type == boolean_type_node);
...

This happens when doing some special-handling code for
__atomic_compare_exchange_1/2/4/8/16.  The function decls have a parameter
called weak of type bool, which is skipped when writing the decl because
the corresponding libatomic functions do not have that parameter.  The assert
is there to verify that we skip the correct parameter.

However, the assert triggers because we have two different boolean types:
...
(gdb) call debug_generic_expr (type)
_Bool
(gdb) call debug_generic_expr (global_trees[TI_BOOLEAN_TYPE])
bool
...

Fix this by checking for TREE_CODE (type) == BOOLEAN_TYPE instead.
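
Why the code-based check is more robust than the identity check can be
seen in a toy model.  The types and names below are hypothetical
stand-ins for GCC's tree nodes:

```cpp
#include <cassert>

// Toy stand-ins for tree codes and type nodes; not GCC's definitions.
enum tree_code_sketch { INTEGER_TYPE_SKETCH, BOOLEAN_TYPE_SKETCH };

struct type_sketch
{
  tree_code_sketch code;
  const char *name;
};

// Two distinct nodes that are both boolean types.
type_sketch c_bool = { BOOLEAN_TYPE_SKETCH, "_Bool" };
type_sketch global_bool = { BOOLEAN_TYPE_SKETCH, "bool" };

// Old style of check: only the one global node counts.
bool is_boolean_by_identity (const type_sketch *type)
{
  return type == &global_bool;
}

// New style of check: any node with the boolean code counts.
bool is_boolean_by_code (const type_sketch *type)
{
  return type->code == BOOLEAN_TYPE_SKETCH;
}
```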

Tested libgomp on x86_64-linux with nvptx accelerator.

Likewise, tested that the test-case above does not ICE anymore.

gcc/ChangeLog:

PR target/96991
* config/nvptx/nvptx.c (write_fn_proto): Fix boolean type check.

enable live comparison vectorization

This removes a check preventing vectorization of live results of
vectorized comparisons. I tested it with AVX512 mask registers
(inspecting assembly) and traditional vector masks.

2020-09-09 Richard Biener <rguenther@suse.de>

* tree-vect-stmts.c (vectorizable_comparison): Allow
STMT_VINFO_LIVE_P stmts.

* gcc.dg/vect/vect-live-6.c: New testcase.

gfortran.dg/gomp/combined-if.f90: Update nvptx tree-dump times

nvptx has additional omp simd lines with _simt_ with -O1 and higher.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/combined-if.f90: Update scan-tree-dump-times for
'omp simd.*if' for nvptx even more.

enable live condition vectorization

This removes a check preventing vectorization of live results of
vectorized conditions.

2020-09-09 Richard Biener <rguenther@suse.de>

* tree-vect-stmts.c (vectorizable_condition): Allow
STMT_VINFO_LIVE_P stmts.

* gcc.dg/vect/vect-cond-13.c: New testcase.
* gcc.target/i386/pr87007-4.c: Adjust.
* gcc.target/i386/pr87007-5.c: Likewise.

config: Sync largefile.m4 from binutils-gdb

The following patch improves handling of largefile support with procfs
on 32-bit Solaris.  It has already been approved and installed for
binutils-gdb in the thread starting at

[PATCH] Unify Solaris procfs and largefile handling
        https://sourceware.org/pipermail/gdb-patches/2020-June/169977.html

I'm syncing the config/largefile.m4 part to gcc now which is the master
for config.  Since ACX_LARGEFILE isn't used anywhere in the gcc tree,
I'm installing it as obvious.

2020-09-09  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

config:
* largefile.m4: Sync from binutils-gdb.

tree-optimization/96978 - fix fallout of BB vectorization of live stmts

This avoids looking at STMT_VINFO_LIVE_P when vectorizing BBs.

2020-09-09 Richard Biener <rguenther@suse.de>

PR tree-optimization/96978
* tree-vect-stmts.c (vectorizable_condition): Do not
look at STMT_VINFO_LIVE_P for BB vectorization.
(vectorizable_comparison): Likewise.

Implement __builtin_thread_pointer for x86 TLS.

gcc/ChangeLog:
PR target/96955
* config/i386/i386.md (get_thread_pointer<mode>): New
expander.

gcc/testsuite/ChangeLog:

* gcc.target/i386/builtin_thread_pointer.c: New test.

Fortran: Fixes for OpenMP loop-iter privatization (PRs 95109 + 94690)

This commit also fixes a gfortran.dg/gomp/target1.f90 regression;
target1.f90 tests the resolve.c and openmp.c changes.

gcc/fortran/ChangeLog:

PR fortran/95109
PR fortran/94690
* resolve.c (gfc_resolve_code): Also call
gfc_resolve_omp_parallel_blocks for 'distribute parallel do (simd)'.
* openmp.c (gfc_resolve_omp_parallel_blocks): Handle it.
(gfc_resolve_do_iterator): Remove special code for SIMD, which is
not needed.
* trans-openmp.c (gfc_trans_omp_target): For TARGET_PARALLEL_DO_SIMD,
call simd not do processing function.

gcc/testsuite/ChangeLog:

PR fortran/95109
PR fortran/94690
* gfortran.dg/gomp/combined-if.f90: Update scan-tree-dump-times for
'omp simd.*if'.
* gfortran.dg/gomp/openmp-simd-5.f90: New test.

libbacktrace: don't strip leading underscore on 64-bit PE

* pecoff.c (coff_initialize_syminfo): Add is_64 parameter.
(coff_add): Determine and pass is_64.

libbacktrace: fetch executable path on macOS

PR libbacktrace/96973
* fileline.c (macho_get_executable_path): New static function.
(fileline_initialize): Call macho_get_executable_path.

libbacktrace: avoid ambiguous binary search

Searching for a range match can cause the search order to not match
the sort order, which can cause libbacktrace to miss matching entries.
Allocate an extra entry at the end of function_addrs and unit_addrs vectors,
so that we can safely compare to the next entry when searching.
Adjust the matching code accordingly.
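
A minimal sketch of the idea, under simplifying assumptions: struct
addr_range, range_search and lookup_pc are hypothetical names, and the
real function_addrs/unit_addrs entries carry more fields.  The comparator
decides using the next entry's low address, which is only safe because the
array has one extra trailing entry; a backward walk then finds an entry
that really covers the address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical simplified entry: a half-open address range [low, high).  */
struct addr_range
{
  uintptr_t low;
  uintptr_t high;
};

/* bsearch comparator.  Entries are sorted by low address, so comparing
   against the next entry's low address keeps the search order consistent
   with the sort order.  Reading entry[1] relies on the trailing entry.  */
static int
range_search (const void *vkey, const void *ventry)
{
  const uintptr_t *pc = (const uintptr_t *) vkey;
  const struct addr_range *entry = (const struct addr_range *) ventry;

  if (*pc < entry->low)
    return -1;
  if (*pc >= (entry + 1)->low)
    return 1;
  return 0;
}

/* Return the nearest preceding range that contains PC, or NULL.  RANGES
   holds COUNT real entries followed by one sentinel whose low address is
   UINTPTR_MAX.  */
static const struct addr_range *
lookup_pc (const struct addr_range *ranges, size_t count, uintptr_t pc)
{
  assert (ranges[count].low == UINTPTR_MAX);

  const struct addr_range *hit
    = bsearch (&pc, ranges, count, sizeof ranges[0], range_search);
  if (hit == NULL)
    return NULL;

  /* The slot found is the last entry whose low address is <= PC; walk
     backward until an entry actually covers PC.  */
  for (;;)
    {
      if (pc >= hit->low && pc < hit->high)
        return hit;
      if (hit == ranges)
        return NULL;
      --hit;
    }
}
```

With nested ranges such as [10,50), [10,20), [30,40), an address of 25
lands on the second slot in the search and is resolved to the enclosing
[10,50) entry by the backward walk.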

Fixes https://github.com/ianlancetaylor/libbacktrace/issues/44.

* dwarf.c (function_addrs_search): Compare against the next entry
low address, not the high address.
(unit_addrs_search): Likewise.
(build_address_map): Add a trailing unit_addrs.
(read_function_entry): Add a trailing function_addrs.
(read_function_info): Likewise.
(report_inlined_functions): Search backward for function_addrs
match.
(dwarf_lookup_pc): Search backward for unit_addrs and
function_addrs matches.

Daily bump.

libbacktrace: fix tipo in comment

* simple.c (simple_unwind): Correct comment spelling.

libbacktrace: correct memory lengths in Mach-O dsym support

* macho.c (macho_add_dsym): Make space for '/' in dsym. Use
correct length when freeing diralc.

openacc: Fix atomic_capture-2.c iteration-ordering issues

The test case was written with assumptions about loop iteration ordering
that are not guaranteed by OpenACC and do not apply on all targets,
in particular AMD GCN. This patch removes those assumptions.

2020-09-08 Julian Brown <julian@codesourcery.com>

libgomp/
* testsuite/libgomp.oacc-c-c++-common/atomic_capture-2.c: Remove
iteration-ordering assumptions.

amdgcn: Add waitcnt after LDS write instructions

Data-share write (ds_write) instructions do not necessarily complete
the write to LDS immediately. When a write completes, LGKM_CNT is
decremented. For now, we wait until LGKM_CNT reaches zero after each
ds_write instruction.

This fixes a race condition in the case where LDS is read immediately
after being written. This can happen with broadcast operations.

2020-09-08 Julian Brown <julian@codesourcery.com>

gcc/
* config/gcn/gcn-valu.md (scatter<mode>_insn_1offset_ds<exec_scatter>):
Add waitcnt.
* config/gcn/gcn.md (*mov<mode>_insn, *movti_insn): Add waitcnt to
ds_write alternatives.

openacc: Fix mkoffload SGPR/VGPR count parsing for HSACO v3

If an offload kernel uses a large number of VGPRs, AMD GCN hardware may
need to limit the number of threads/workers launched for that kernel.
The number of SGPRs/VGPRs in use is detected by mkoffload and recorded in
the processed output.  The patterns emitted detailing SGPR/VGPR occupancy
changed between HSACO v2 and v3 though, so this patch updates parsing
to account for that.

2020-09-08  Julian Brown  <julian@codesourcery.com>

gcc/
* config/gcn/mkoffload.c (process_asm): Initialise regcount.  Update
scanning for SGPR/VGPR usage for HSACO v3.

openacc: Fix race condition in Fortran loop collapse tests

The gangs participating in a gang-partitioned loop are not all guaranteed
to complete before some given gang continues to execute beyond that loop.
This means that two existing test cases contain a race condition,
because a loop that may be gang-partitioned is followed immediately by
another loop. The fix is to place the loops in separate parallel regions.

2020-09-08 Julian Brown <julian@codesourcery.com>

libgomp/
* testsuite/libgomp.oacc-fortran/collapse-1.f90: Fix race condition.
* testsuite/libgomp.oacc-fortran/collapse-2.f90: Likewise.

libbacktrace: correctly swap Mach-O 32-bit file offset

libbacktrace/ChangeLog:
PR libbacktrace/96973
* macho.c (macho_add_fat): Correctly swap 32-bit file offset.

libbacktrace: only match magic number at start of line

libbacktrace/ChangeLog:
PR libbacktrace/96971
* filetype.awk: Only match magic number at start of line.

floatformat.h: Add bfloat16 support.

This change is motivated by a patchset that adds bfloat16 debugging
support for new avx512 instructions to GDB. The gdb thread can be found
here: https://sourceware.org/pipermail/gdb-patches/2020-July/170820.html

include:
2020-08-17 Felix Willgerodt <felix.willgerodt@intel.com>

* floatformat.h (floatformat_bfloat16_big): New.
(floatformat_bfloat16_little): New.

libiberty:
2020-08-17 Felix Willgerodt <felix.willgerodt@intel.com>

* floatformat.c (floatformat_bfloat16_big): New.
(floatformat_bfloat16_little): New.

analyzer: fix another ICE in constructor-handling [PR96949]

PR analyzer/96949 reports an ICE with
--param analyzer-max-svalue-depth=0, where the param value leads
to INTEGER_CST values in a RANGE_EXPR being treated as unknown
symbolic values.

This patch replaces implicit assumptions that these values are
concrete (and thus have concrete bit offsets), adding
error-handling for symbolic cases instead of assertions.

gcc/analyzer/ChangeLog:
PR analyzer/96949
* store.cc (binding_map::apply_ctor_val_to_range): Add
error-handling for the cases where we have symbolic offsets.

gcc/testsuite/ChangeLog:
PR analyzer/96949
* gfortran.dg/analyzer/pr96949.f90: New test.

analyzer: fix ICE on RANGE_EXPR with CONSTRUCTOR value [PR96950]

gcc/analyzer/ChangeLog:
PR analyzer/96950
* store.cc (binding_map::apply_ctor_to_region): Handle RANGE_EXPR
where min_index == max_index.
(binding_map::apply_ctor_val_to_range): Replace assertion that we
don't have a CONSTRUCTOR value with error-handling.

analyzer: fix ICE on machine-specific builtins [PR96962]

In g:ee7bfbe5eb70a23bbf3a2cedfdcbd2ea1a20c3f2 I added a
  switch (DECL_UNCHECKED_FUNCTION_CODE (callee_fndecl))
to region_model::on_call_pre guarded by
  fndecl_built_in_p (callee_fndecl).
I meant to handle only normal built-ins, whereas this
single-argument overload of fndecl_built_in_p returns true for any
kind of built-in.

PR analyzer/96962 reports a case where this matches for a
machine-specific builtin, leading to an ICE.  Fixed thusly.
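
The pitfall can be sketched with a stand-alone mock.  The types and names
below (struct mock_fndecl, enum mock_builtin_class, MOCK_CODE_MEMSET and
the helpers) are hypothetical and only imitate the shape of the problem:
a function code is meaningful only within its own class of built-in, so a
switch on the code needs a guard that checks the class as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the different kinds of built-in.  */
enum mock_builtin_class
{
  MOCK_NOT_BUILT_IN,
  MOCK_BUILT_IN_MD,      /* machine-specific */
  MOCK_BUILT_IN_NORMAL
};

/* Hypothetical function declaration: a class plus a per-class code.  */
struct mock_fndecl
{
  enum mock_builtin_class cls;
  unsigned code;
};

/* Hypothetical code number of a normal built-in.  A machine-specific
   built-in may reuse the same number for something unrelated.  */
enum { MOCK_CODE_MEMSET = 3 };

/* Too-wide guard: true for any kind of built-in.  */
static bool
handled_as_memset_buggy (const struct mock_fndecl *decl)
{
  assert (decl != NULL);
  return decl->cls != MOCK_NOT_BUILT_IN && decl->code == MOCK_CODE_MEMSET;
}

/* Narrow guard: only normal built-ins are considered.  */
static bool
handled_as_memset_fixed (const struct mock_fndecl *decl)
{
  assert (decl != NULL);
  return decl->cls == MOCK_BUILT_IN_NORMAL && decl->code == MOCK_CODE_MEMSET;
}
```

A machine-specific declaration that happens to carry code 3 is wrongly
matched by the first helper and correctly ignored by the second.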

gcc/analyzer/ChangeLog:
PR analyzer/96962
* region-model.cc (region_model::on_call_pre): Fix guard on switch
on built-ins to only consider BUILT_IN_NORMAL, rather than other
kinds of built-ins.

PR tree-optimization/96967 - cast label range to type of switch operand

PR tree-optimization/96967
* tree-vrp.c (find_case_label_range): Cast label range to
type of switch operand.

MSP430: Fix detection of assembler support for .mspabi_attribute

The assembly code ".mspabi_attribute 4,1" uses the object attribute
mechanism to indicate that the 430 ISA is in use. However, the default
ISA is 430X, so GAS fails to assemble this since the ISA wasn't also set
to 430 on the command line.

gcc/ChangeLog:

* config/msp430/msp430.c (msp430_file_end): Fix jumbled
HAVE_AS_MSPABI_ATTRIBUTE and HAVE_AS_GNU_ATTRIBUTE checks.
* configure: Regenerate.
* configure.ac: Use ".mspabi_attribute 4,2" to check for assembler
support for this object attribute directive.

libphobos: libdruntime doesn't support shadow stack (PR95680)

Rather than implementing support within D runtime itself, use libc
getcontext/swapcontext functions if CET is enabled.

Removes whatever CET support was in the switchContext routine for x86
D runtime, along with setting version AsmExternal, so that the fallback
ucontext_t implementation is used, which is capable of doing shadow
stack handling.

libphobos/ChangeLog:

PR d/95680
* Makefile.in: Regenerate.
* configure: Regenerate.
* configure.ac (DCFG_ENABLE_CET): Substitute.
* libdruntime/Makefile.in: Regenerate.
* libdruntime/config/x86/switchcontext.S: Remove CET support code.
* libdruntime/core/thread.d: Import gcc.config. Don't set version
AsmExternal when GNU_Enable_CET is true.
* libdruntime/gcc/config.d.in (GNU_Enable_CET): Define.
* src/Makefile.in: Regenerate.
* testsuite/Makefile.in: Regenerate.

MSP430: Use enums to handle -mcpu= values

The -mcpu= option accepts only a handful of string values.
Using enums instead of strings to handle the accepted values removes the
need to have specific processing of the strings in the backend, and
simplifies any comparisons which need to be performed on the value.

It also allows the default value to have semantic equivalence to a user
set value, whilst retaining the ability to differentiate between them.
Practically, this allows a user set -mcpu= value to override the ISA set by
-mmcu, whilst the default -mcpu= value can still have an explicit meaning.

gcc/ChangeLog:

* common/config/msp430/msp430-common.c (msp430_handle_option): Remove
OPT_mcpu_ handling.
Set target_cpu value to new enum values when parsing certain -mmcu=
values.
* config/msp430/msp430-opts.h (enum msp430_cpu_types): New.
* config/msp430/msp430.c (msp430_option_override): Handle new
target_cpu enum values.
Set target_cpu using extracted value for given MCU when -mcpu=
option is not passed by the user.
* config/msp430/msp430.opt: Handle -mcpu= values using enums.

gcc/testsuite/ChangeLog:

* gcc.target/msp430/mcpu-is-430.c: New test.
* gcc.target/msp430/mcpu-is-430x.c: New test.
* gcc.target/msp430/mcpu-is-430xv2.c: New test.

Fix description of FINDLOC result.

gcc/fortran/ChangeLog:

* intrinsic.texi: Fix description of FINDLOC result.

ubsan: d-demangle.c:214 signed integer overflow

Running the libiberty testsuite
./test-demangle < libiberty/testsuite/d-demangle-expected
libiberty/d-demangle.c:214:14: runtime error: signed integer overflow: 922337203 * 10 cannot be represented in type 'long int'

On looking at silencing ubsan, I found a real bug in dlang_number.
For a 32-bit long, some overflows won't be detected.  For example,
21474836480.  Why?  Well 214748364 * 10 is 0x7FFFFFF8 (no overflow so
far).  Adding 8 gives 0x80000000 (which does overflow but there is no
test for that overflow in the code).  Then multiplying 0x80000000 * 10
= 0x500000000 = 0 won't be caught by the multiplication overflow test.
The same holds for a 64-bit long using similarly crafted digit
sequences.
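
A minimal sketch of an overflow test that cannot be bypassed this way,
assuming a hypothetical helper parse_decimal (this is not the text of
dlang_number itself).  The check is made before val * 10 + digit is
computed, so neither the multiplication nor the addition can wrap, and
the result is written only on success.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Parse the leading decimal digits of S into *RET.  Returns a pointer
   past the digits, or NULL if there are no digits or the value would
   exceed UINT_MAX.  *RET is written only on success.  */
static const char *
parse_decimal (const char *s, unsigned long *ret)
{
  assert (s != NULL && ret != NULL);
  unsigned long val = 0;

  if (*s < '0' || *s > '9')
    return NULL;

  while (*s >= '0' && *s <= '9')
    {
      unsigned long digit = (unsigned long) (*s - '0');

      /* val * 10 + digit <= UINT_MAX holds exactly when
         val <= (UINT_MAX - digit) / 10, so test that first.  */
      if (val > (UINT_MAX - digit) / 10)
        return NULL;

      val = val * 10 + digit;
      s++;
    }

  *ret = val;
  return s;
}
```

With this test the digit sequence 21474836480 from the example above is
rejected when the last digit is reached, instead of wrapping to zero.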

* d-demangle.c: Include limits.h.
(ULONG_MAX, UINT_MAX): Provide fall-back definition.
(dlang_number): Simplify and correct overflow test.  Only
write *ret on returning non-NULL.  Make "ret" an unsigned long*.
Only succeed for result of [0,UINT_MAX].
(dlang_decode_backref): Simplify and correct overflow test.
Only write *ret on returning non-NULL.  Only succeed for
result [1,MAX_LONG].
(dlang_backref): Remove now unnecessary range check.
(dlang_symbol_name_p): Likewise.
(string_need): Take a size_t n arg, and use size_t tem.
(string_append): Use size_t n.
(string_appendn, string_prependn): Take a size_t n arg.
(TEMPLATE_LENGTH_UNKNOWN): Define as -1UL.
(dlang_lname, dlang_parse_template): Take an unsigned long len
arg.
(dlang_symbol_backref, dlang_identifier, dlang_parse_integer),
(dlang_parse_integer, dlang_parse_string),
(dlang_parse_arrayliteral, dlang_parse_assocarray),
(dlang_parse_structlit, dlang_parse_tuple),
(dlang_template_symbol_param, dlang_template_args): Use
unsigned long variables.
* testsuite/d-demangle-expected: Add new tests.

Daily bump.

PR fortran/96711 - ICE with NINT() for integer(16) result

When rounding a real to the nearest integer, temporarily convert the real
argument to a longer real kind when the result is of type/kind integer(16).

gcc/fortran/ChangeLog:

* trans-intrinsic.c (build_round_expr): Use temporary with
appropriate kind for conversion before rounding to nearest
integer when the result precision is 128 bits.

gcc/testsuite/ChangeLog:

* gfortran.dg/pr96711.f90: New test.

lra: Avoid cycling on certain subreg reloads [PR96796]

This PR is about LRA cycling for a reload of the form:

----------------------------------------------------------------------------
Changing pseudo 196 in operand 1 of insn 103 on equiv [r105:DI*0x8+r140:DI]
      Creating newreg=287, assigning class ALL_REGS to slow/invalid mem r287
      Creating newreg=288, assigning class ALL_REGS to slow/invalid mem r288
  103: r203:SI=r288:SI<<0x1+r196:DI#0
      REG_DEAD r196:DI
    Inserting slow/invalid mem reload before:
  316: r287:DI=[r105:DI*0x8+r140:DI]
  317: r288:SI=r287:DI#0
----------------------------------------------------------------------------

The problem is with r287.  We rightly give it a broad starting class of
POINTER_AND_FP_REGS (reduced from ALL_REGS by preferred_reload_class).
However, we never make forward progress towards narrowing it down to
a specific choice of class (POINTER_REGS or FP_REGS).

I think in practice we rely on two things to narrow a reload pseudo's
class down to a specific choice:

(1) a restricted class is specified when the pseudo is created

    This happens for input address reloads, where the class is taken
    from the target's chosen base register class.  It also happens
    for simple REG reloads, where the class is taken from the chosen
    alternative's constraints.

(2) uses of the reload pseudo as a direct input operand

    In this case get_reload_reg tries to reuse the existing register
    and narrow its class, instead of creating a new reload pseudo.

However, neither occurs here.  As described above, r287 rightly
starts out with a wide choice of class, ultimately derived from
ALL_REGS, so we don't get (1).  And as the comments in the PR
explain, r287 is never used as an input reload, only the subreg is,
so we don't get (2):

----------------------------------------------------------------------------
         Choosing alt 13 in insn 317:  (0) r  (1) w {*movsi_aarch64}
      Creating newreg=291, assigning class FP_REGS to r291
  317: r288:SI=r291:SI
    Inserting insn reload before:
  320: r291:SI=r287:DI#0
----------------------------------------------------------------------------

IMO, in this case we should rely on the reload of insn 316 to narrow
down the class of r287.  Currently we do:

----------------------------------------------------------------------------
         Choosing alt 7 in insn 316:  (0) r  (1) m {*movdi_aarch64}
      Creating newreg=289 from oldreg=287, assigning class GENERAL_REGS to r289
  316: r289:DI=[r105:DI*0x8+r140:DI]
    Inserting insn reload after:
  318: r287:DI=r289:DI
----------------------------------------------------------------------------

i.e. we create a new pseudo register r289 and give *that* pseudo
GENERAL_REGS instead.  This is because get_reload_reg only narrows
down the existing class for OP_IN and OP_INOUT, not OP_OUT.

But if we have a reload pseudo in a reload instruction and have chosen
a specific class for the reload pseudo, I think we should simply install
it for OP_OUT reloads too, if the class is a subset of the existing class.
We will need to pick such a register whatever happens (for r289 in the
example above).  And as explained in the PR, doing this actually avoids
an unnecessary move via the FP registers too.

The patch is quite aggressive in that it does this for all reload
pseudos in all reload instructions.  I wondered about reusing the
condition for a reload move in in_class_p:

          INSN_UID (curr_insn) >= new_insn_uid_start
          && curr_insn_set != NULL
          && ((OBJECT_P (SET_SRC (curr_insn_set))
               && ! CONSTANT_P (SET_SRC (curr_insn_set)))
              || (GET_CODE (SET_SRC (curr_insn_set)) == SUBREG
                  && OBJECT_P (SUBREG_REG (SET_SRC (curr_insn_set)))
                  && ! CONSTANT_P (SUBREG_REG (SET_SRC (curr_insn_set)))))))

but I can't really justify that on first principles.  I think we
should apply the rule consistently until we have a specific reason
for doing otherwise.

gcc/
PR rtl-optimization/96796
* lra-constraints.c (in_class_p): Add a default-false
allow_all_reload_class_changes_p parameter.  Do not treat
reload moves specially when the parameter is true.
(get_reload_reg): Try to narrow the class of an existing OP_OUT
reload if we're reloading a reload pseudo in a reload instruction.

gcc/testsuite/
PR rtl-optimization/96796
* gcc.c-torture/compile/pr96796.c: New test.

libstdc++: Simplify chrono::duration::_S_gcd

We can simplify this constexpr function further because we know that
period::num >= 1 and period::den >= 1 so only the remainder can ever be
zero.
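
The simplification can be sketched as follows.  This is a plain C
rendering under the stated invariant, with the hypothetical name
gcd_nonzero; the real function is a constexpr member in the C++ header.
Because both inputs are at least 1, the first division is always valid
and the loop only needs to test the remainder.

```c
#include <assert.h>
#include <stdint.h>

/* Greatest common divisor of M and N, both of which must be >= 1.  */
static intmax_t
gcd_nonzero (intmax_t m, intmax_t n)
{
  assert (m >= 1 && n >= 1);

  /* N is non-zero on entry, so no up-front zero checks are needed;
     only the remainder can become zero.  */
  do
    {
      intmax_t rem = m % n;
      m = n;
      n = rem;
    }
  while (n != 0);

  return m;
}
```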

libstdc++-v3/ChangeLog:

* include/std/chrono (duration::_S_gcd): Use invariant that
neither value is zero initially.

libstdc++: Simplify constraints for semiregular-box [LWG 3477]

libstdc++-v3/ChangeLog:

* include/std/ranges (__box): Simplify constraints as per LWG 3477.

vec: Revert "dead code removal in tree-vect-loop.c" and add a comment.

gcc/ChangeLog

2020-09-07 Andrea Corallo <andrea.corallo@arm.com>

* tree-vect-loop.c (vect_estimate_min_profitable_iters): Revert
dead-code removal introduced by 09fa6acd8d9 + add a comment to
clarify.

doc: Update documentation on MODE_PARTIAL_INT subregs

In d8487c949ad5, MODE_PARTIAL_INT modes were changed from having an
unknown number of undefined bits, to having a known number of undefined
bits, however the documentation on using SUBREG expressions with
MODE_PARTIAL_INT modes was not updated to reflect this.

gcc/ChangeLog:

* doc/rtl.texi (subreg): Fix documentation to state there is a known
number of undefined bits in regs and subregs of MODE_PARTIAL_INT modes.

MSP430: Don't override default ISA when MCU name is unrecognized

430X is the default ISA under normal operation, so even when the MCU name
passed to -mmcu= is unrecognized, it should not be overridden.

gcc/ChangeLog:

* config/msp430/msp430.c (msp430_option_override): Don't set the
ISA to 430 when the MCU is unrecognized.

gcc/testsuite/ChangeLog:

* gcc.target/msp430/430x-default-isa.c: New test.

Darwin, testsuite : Update pubtypes tests.

Recent changes in debug output have resulted in a change
in the length of the pub types info. This updates the tests to
reflect the new length.

gcc/testsuite/ChangeLog:

* gcc.dg/pubtypes-2.c: Amend Pub Info Length.
* gcc.dg/pubtypes-3.c: Likewise.
* gcc.dg/pubtypes-4.c: Likewise.

Darwin : Update libc function availability.

Darwin libc has sincos from 10.9 (darwin13) onwards.

gcc/ChangeLog:

* config/darwin.c (darwin_libc_has_function): Report sincos
available from 10.9.

aarch64: Remove redundant mult patterns

Following on from the previous commit to fix up the syntax for
add/sub/adds/subs and friends with a sign/zero-extended operand, this
patch removes the "mult" variants of these patterns which are all
redundant.

This patch removes the following patterns from the AArch64 backend:

*adds_mul_imm_<mode>
*subs_mul_imm_<mode>
*adds_<optab><mode>_multp2
*subs_<optab><mode>_multp2
*add_mul_imm_<mode>
*add_<optab><ALLX:mode>_mult_<GPI:mode>
*add_<optab><SHORT:mode>_mult_si_uxtw
*add_<optab><mode>_multp2
*add_<optab>si_multp2_uxtw
*add_uxt<mode>_multp2
*add_uxtsi_multp2_uxtw
*sub_mul_imm_<mode>
*sub_mul_imm_si_uxtw
*sub_<optab><mode>_multp2
*sub_<optab>si_multp2_uxtw
*sub_uxt<mode>_multp2
*sub_uxtsi_multp2_uxtw
*neg_mul_imm_<mode>2
*neg_mul_imm_si2_uxtw

Together with the following predicates which were used only by these
patterns:

  aarch64_pwr_imm3
  aarch64_pwr_2_si
  aarch64_pwr_2_di

These patterns are all redundant since multiplications by powers of two
should be represented as shifts outside a (mem).

---

gcc/ChangeLog:

* config/aarch64/aarch64.md (*adds_mul_imm_<mode>): Delete.
(*subs_mul_imm_<mode>): Delete.
(*adds_<optab><mode>_multp2): Delete.
(*subs_<optab><mode>_multp2): Delete.
(*add_mul_imm_<mode>): Delete.
(*add_<optab><ALLX:mode>_mult_<GPI:mode>): Delete.
(*add_<optab><SHORT:mode>_mult_si_uxtw): Delete.
(*add_<optab><mode>_multp2): Delete.
(*add_<optab>si_multp2_uxtw): Delete.
(*add_uxt<mode>_multp2): Delete.
(*add_uxtsi_multp2_uxtw): Delete.
(*sub_mul_imm_<mode>): Delete.
(*sub_mul_imm_si_uxtw): Delete.
(*sub_<optab><mode>_multp2): Delete.
(*sub_<optab>si_multp2_uxtw): Delete.
(*sub_uxt<mode>_multp2): Delete.
(*sub_uxtsi_multp2_uxtw): Delete.
(*neg_mul_imm_<mode>2): Delete.
(*neg_mul_imm_si2_uxtw): Delete.
* config/aarch64/predicates.md (aarch64_pwr_imm3): Delete.
(aarch64_pwr_2_si): Delete.
(aarch64_pwr_2_di): Delete.

aarch64: Don't emit invalid zero/sign-extend syntax

Given the following C function:

double *f(double *p, unsigned x)
{
    return p + x;
}

prior to this patch, GCC at -O2 would generate:

f:
        add     x0, x0, x1, uxtw 3
        ret

but this add instruction uses architecturally-invalid syntax: the width
of the third operand conflicts with the width of the extension
specifier. The third operand is only permitted to be an x register when
the extension specifier is (u|s)xtx.

This instruction, and analogous insns for adds, sub, subs, and cmp, are
rejected by clang, but accepted by binutils. Assembling and
disassembling such an insn with binutils gives the architecturally-valid
version in the disassembly:

   0:   8b214c00        add     x0, x0, w1, uxtw #3

This patch fixes several patterns in the AArch64 backend to use the
standard syntax as specified in the Arm ARM such that GCC's output can
be assembled by assemblers other than GAS.

---

gcc/ChangeLog:

* config/aarch64/aarch64.md
(*adds_<optab><ALLX:mode>_<GPI:mode>): Ensure extended operand
agrees with width of extension specifier.
(*subs_<optab><ALLX:mode>_<GPI:mode>): Likewise.
(*adds_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
(*subs_<optab><ALLX:mode>_shift_<GPI:mode>): Likewise.
(*add_<optab><ALLX:mode>_<GPI:mode>): Likewise.
(*add_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
(*add_uxt<mode>_shift2): Likewise.
(*sub_<optab><ALLX:mode>_<GPI:mode>): Likewise.
(*sub_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.
(*sub_uxt<mode>_shift2): Likewise.
(*cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>): Likewise.
(*cmp_swp_<optab><ALLX:mode>_shft_<GPI:mode>): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/adds3.c: Fix test w.r.t. new syntax.
* gcc.target/aarch64/cmp.c: Likewise.
* gcc.target/aarch64/subs3.c: Likewise.
* gcc.target/aarch64/subsp.c: Likewise.
* gcc.target/aarch64/extend-syntax.c: New test.

improve SLP vect dumping

This adds additional dumping helping in particular basic-block
vectorization SLP dump reading plus showing what we actually
generate code from.

2020-09-07 Richard Biener <rguenther@suse.de>

* tree-vect-slp.c (vect_analyze_slp_instance): Dump
stmts we start SLP analysis from, failure and splitting.
(vect_schedule_slp): Dump SLP graph entry and root stmt
we are about to emit code for.

gcc: Make strchr return value pointers const

This fixes compilation of codepaths for dos-like filesystems
with Clang. Clang treats C input files as C++ when the compiler
driver is invoked in C++ mode, which triggers errors when the
return value of strchr() on a pointer to const is assigned
to a pointer to non-const variable.

This matches similar variables outside of the ifdefs for dos-like
path handling.
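
A minimal sketch of the pattern, using a hypothetical helper
first_dir_separator rather than the code in file_name_acquire or
remap_filename.  In C, strchr returns char * even for a const char *
argument, so assigning the result to a plain char * compiles; in C++ the
overload taking const char * returns const char *, so only the
const-qualified form below is accepted in both languages.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return a pointer to the first '/' or '\\' in PATH, or NULL if there
   is none.  The results of strchr are kept in pointers to const.  */
static const char *
first_dir_separator (const char *path)
{
  assert (path != NULL);

  const char *fwd = strchr (path, '/');
  const char *back = strchr (path, '\\');

  if (fwd == NULL)
    return back;
  if (back == NULL)
    return fwd;
  return fwd < back ? fwd : back;
}
```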

2020-09-07 Martin Storsjö <martin@martin.st>

gcc/
* dwarf2out.c (file_name_acquire): Make a strchr return value
pointer to const.
libcpp/
* files.c (remap_filename): Make a strchr return value pointer
to const.

Fortran: Fixes for pointer function call as variable (PR96896)

gcc/fortran/ChangeLog:

PR fortran/96896
* resolve.c (get_temp_from_expr): Also reset proc_pointer +
use_assoc attribute.
(resolve_ptr_fcn_assign): Use information from the LHS.

gcc/testsuite/ChangeLog:

PR fortran/96896
* gfortran.dg/ptr_func_assign_4.f08: Update dg-error.
* gfortran.dg/ptr-func-3.f90: New test.

[libatomic, testsuite] Add missing include in atomic-generic.c

When compiling atomic-generic.c from the libatomic testsuite, we run into:
...
$ gcc src/libatomic/testsuite/libatomic.c/atomic-generic.c -latomic
src/libatomic/testsuite/libatomic.c/atomic-generic.c: In function ‘main’:
src/libatomic/testsuite/libatomic.c/atomic-generic.c:31:7: warning: \
  implicit declaration of function ‘memcmp’ [-Wimplicit-function-declaration]
   if (memcmp (&a, &zero, size))
       ^~~~~~
...

Fix this by adding the missing string.h include.

Tested on x86_64.

libatomic/ChangeLog:

* testsuite/libatomic.c/atomic-generic.c: Include string.h.

Adjust testcase.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/slp-46.c: Add --param vect-epilogues-nomask=0 to
avoid backend interference.

lto: Stream edge goto_locus [PR94235]

The following patch adds streaming of edge goto_locus (both LOCATION_LOCUS
and LOCATION_BLOCK from it), the PR shows a testcase (inappropriate for
gcc testsuite) where the lack of streaming of goto_locus results in worse
debug info.
An earlier version of the patch (without the output_function changes) failed
miserably because of the order mismatch - input_function would
first input_cfg, then input_eh_regions and then input_bb (all of which now
have locations), while output_function used output_eh_regions, then output_bb
and then output_cfg. *_cfg went to a separate stream...
Now, is there a reason why the order is different?

If the intent is that the cfg could be read separately from the rest of
function or vice versa, alternatively we'd need to clear_line_info ();
before output_eh_regions and before/after output_cfg to make them
independent.

2020-09-07 Jakub Jelinek <jakub@redhat.com>

PR debug/94235
* lto-streamer-out.c (output_cfg): Also stream goto_locus for edges.
Use bp_pack_var_len_unsigned instead of streamer_write_uhwi to stream
e->dest->index and e->flags.
(output_function): Call output_cfg before output_ssa_name, rather than
after streaming all bbs.
* lto-streamer-in.c (input_cfg): Stream in goto_locus for edges.
Use bp_unpack_var_len_unsigned instead of streamer_read_uhwi to stream
in dest_index and edge_flags.

code generate live lanes in basic-block vectorization

The following adds the capability to code-generate live lanes in
basic-block vectorization using lane extracts from vector stmts
rather than keeping the original scalar code around for those.
This may eventually make previously unprofitable vectorizations
profitable (the live scalar code was appropriately costed, and so
are the lane extracts now).  Cost model aside, this patch doesn't
add or remove any basic-block vectorization capabilities.

The patch re/ab-uses STMT_VINFO_LIVE_P in basic-block vectorization
mode to tell whether a live lane is vectorized or whether it is
provided by means of keeping the scalar code live.

The patch is a first step towards vectorizing sequences of
stmts that do not end up in stores or vector constructors though.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

2020-09-04 Richard Biener <rguenther@suse.de>

* tree-vectorizer.h (vectorizable_live_operation): Adjust.
* tree-vect-loop.c (vectorizable_live_operation): Vectorize
live lanes out of basic-block vectorization nodes.
* tree-vect-slp.c (vect_bb_slp_mark_live_stmts): New function.
(vect_slp_analyze_operations): Analyze live lanes and their
vectorization possibility after the whole SLP graph is final.
(vect_bb_slp_scalar_cost): Adjust for vectorized live lanes.
* tree-vect-stmts.c (can_vectorize_live_stmts): Adjust.
(vect_transform_stmt): Call can_vectorize_live_stmts also for
basic-block vectorization.

* gcc.dg/vect/bb-slp-46.c: New testcase.
* gcc.dg/vect/bb-slp-47.c: Likewise.
* gcc.dg/vect/bb-slp-32.c: Adjust.

fortran: Fix argument types in derived types procedures

gcc/fortran/ChangeLog

* trans-types.c (gfc_get_derived_type): Fix argument types.

fortran: Fix arg types of _gfortran_is_extension_of

gcc/fortran/ChangeLog

* resolve.c (resolve_select_type): Provide a formal arg list.

Adjust testcase.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr92658-avx512bw-trunc.c: Add
-mprefer-vector-width=512 to avoid impact of different default
tune which gcc is built with.

Daily bump.

fortran: Add comment about previous commit

gcc/fortran/ChangeLog

* trans-types.c (gfc_get_ppc_type): Add comment.

fortran: Fix function arg types for class objects

gcc/fortran/ChangeLog

* trans-types.c (gfc_get_ppc_type): Fix function arg types.

fortran: caf_fail_image expects no argument

gcc/fortran/ChangeLog

PR fortran/96947
* trans-stmt.c (gfc_trans_fail_image): caf_fail_image
expects no argument.

gcc/testsuite/ChangeLog

* gfortran.dg/coarray_fail_st.f90: Adjust test.

Daily bump.

d: Fix ICE in create_tmp_var, at gimple-expr.c:482

Array concatenate expressions were creating more SAVE_EXPRs than what
was necessary. The internal error itself was the result of a forced
temporary being made on a TREE_ADDRESSABLE type.

gcc/d/ChangeLog:

PR d/96924
* expr.cc (ExprVisitor::visit (CatAssignExp *)): Don't force
temporaries needlessly.

gcc/testsuite/ChangeLog:

PR d/96924
* gdc.dg/simd13927b.d: Removed.
* gdc.dg/pr96924.d: New test.

c++: Use iloc_sentinel in mark_use.

gcc/cp/ChangeLog:

* expr.c (mark_use): Use iloc_sentinel.

tree-optimization/96920 - another ICE when vectorizing nested cycles

This refines the previous fix for PR96698 by re-doing how and where
we arrange for setting vectorized cycle PHI backedge values.

2020-09-04 Richard Biener <rguenther@suse.de>

PR tree-optimization/96698
PR tree-optimization/96920
* tree-vectorizer.h (loop_vec_info::reduc_latch_defs): Remove.
(loop_vec_info::reduc_latch_slp_defs): Likewise.
* tree-vect-stmts.c (vect_transform_stmt): Remove vectorized
cycle PHI latch code.
* tree-vect-loop.c (maybe_set_vectorized_backedge_value): New
helper to set vectorized cycle PHI latch values.
(vect_transform_loop): Walk over all PHIs again after
vectorizing them, calling maybe_set_vectorized_backedge_value.
Call maybe_set_vectorized_backedge_value for each vectorized
stmt. Remove delayed update code.
* tree-vect-slp.c (vect_analyze_slp_instance): Initialize
SLP instance reduc_phis member.
(vect_schedule_slp): Set vectorized cycle PHI latch values.

* gfortran.dg/vect/pr96920.f90: New testcase.
* gcc.dg/vect/pr96920.c: Likewise.

vec: dead code removal in tree-vect-loop.c

gcc/ChangeLog

2020-09-04 Andrea Corallo <andrea.corallo@arm.com>

* tree-vect-loop.c (vect_estimate_min_profitable_iters): Remove
dead code as LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo) is
always verified.

arm: Improve immediate generation for thumb-1 with -mpure-code [PR96769]

This patch moves the move-immediate splitter after the regular ones so
that it has lower precedence, and updates its constraints.

For
int f3 (void) { return 0x11000000; }
int f3_2 (void) { return 0x12345678; }

we now generate:
* with -O2 -mcpu=cortex-m0 -mpure-code:
f3:
movs    r0, #136
lsls    r0, r0, #21
bx      lr
f3_2:
movs    r0, #18
lsls    r0, r0, #8
adds    r0, r0, #52
lsls    r0, r0, #8
adds    r0, r0, #86
lsls    r0, r0, #8
adds    r0, r0, #120
bx      lr

* with -O2 -mcpu=cortex-m23 -mpure-code:
f3:
movs    r0, #136
lsls    r0, r0, #21
bx      lr
f3_2:
movw    r0, #22136
movt    r0, 4660
bx      lr

2020-09-04  Christophe Lyon  <christophe.lyon@linaro.org>

PR target/96769
gcc/
* config/arm/thumb1.md: Move movsi splitter for
arm_disable_literal_pool after the other movsi splitters.

gcc/testsuite/
* gcc.target/arm/pure-code/pr96769.c: New test.
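
As a sanity check on the arithmetic behind the sequences above, here is a
minimal C sketch that models the movs/lsls/adds and movw/movt constructions.
The helper names are hypothetical and this is not the splitter code from
thumb1.md; it only reproduces the computation, assuming 8-bit immediates for
movs/adds and 16-bit immediates for movw/movt.

```c
#include <stdint.h>

/* Hypothetical models of the instructions: movs loads an 8-bit
   immediate, lsls shifts left, adds adds an 8-bit immediate.  */
static uint32_t movs (uint32_t imm8) { return imm8 & 0xff; }
static uint32_t lsls (uint32_t r, unsigned n) { return r << n; }
static uint32_t adds (uint32_t r, uint32_t imm8) { return r + (imm8 & 0xff); }

/* Build VAL one byte at a time, most significant byte first, as a
   movs followed by three lsls/adds pairs.  */
uint32_t
build_bytewise (uint32_t val)
{
  uint32_t r = movs (val >> 24);
  for (int shift = 16; shift >= 0; shift -= 8)
    {
      r = lsls (r, 8);
      r = adds (r, (val >> shift) & 0xff);
    }
  return r;
}

/* Build a value from two 16-bit halves, as a movw/movt pair would.  */
uint32_t
build_movw_movt (uint32_t low16, uint32_t high16)
{
  return (low16 & 0xffff) | ((high16 & 0xffff) << 16);
}
```

For example, build_bytewise (0x12345678) walks the bytes 18, 52, 86 and 120,
lsls (movs (136), 21) yields 0x11000000, and build_movw_movt (22136, 4660)
yields 0x12345678.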

rename widest_irange to int_range_max.

gcc/ChangeLog:

* range-op.cc (range_operator::fold_range): Rename widest_irange
to int_range_max.
(operator_div::wi_fold): Same.
(operator_lshift::op1_range): Same.
(operator_rshift::op1_range): Same.
(operator_cast::fold_range): Same.
(operator_cast::op1_range): Same.
(operator_bitwise_and::remove_impossible_ranges): Same.
(operator_bitwise_and::op1_range): Same.
(operator_abs::op1_range): Same.
(range_cast): Same.
(widest_irange_tests): Same.
(range3_tests): Rename irange3 to int_range3.
(int_range_max_tests): Rename from widest_irange_tests.
Rename widest_irange to int_range_max.
(operator_tests): Rename widest_irange to int_range_max.
(range_tests): Same.
* tree-vrp.c (find_case_label_range): Same.
* value-range.cc (irange::irange_intersect): Same.
(irange::invert): Same.
* value-range.h: Same.

tree-optimization/96931 - clear ctrl-altering flag more aggressively

The testcase shows that we fail to clear gimple_call_ctrl_altering_p
when the last abnormal edge goes away, causing an edge insert to
a loop header edge when we have preheaders to split the edge
unnecessarily.

The following addresses this by more aggressively clearing the
flag in cleanup_call_ctrl_altering_flag.

2020-09-04 Richard Biener <rguenther@suse.de>

PR tree-optimization/96931
* tree-cfgcleanup.c (cleanup_call_ctrl_altering_flag): If
there's a fallthru edge and no abnormal edge the call is
no longer control-altering.
(cleanup_control_flow_bb): Pass down the BB to
cleanup_call_ctrl_altering_flag.

* gcc.dg/pr96931.c: New testcase.

lto: Remove stream_input_location_now

As discussed yesterday, stream_input_location_now has been used in 3
remaining places. For ERT_MUST_NOT_THROW, I believe the failure_loc
location is stable at least until the apply_cache after the bbs are all
read, and the locations do not include BLOCK, so we can use normal
stream_input_location, and the two input_struct_function_base also
shouldn't include BLOCK and are stable at least until that same apply_cache
after reading all bbs, so again we can use the location cache.

2020-09-04 Jakub Jelinek <jakub@redhat.com>

* lto-streamer.h (stream_input_location_now): Remove declaration.
* lto-streamer-in.c (stream_input_location_now): Remove.
(input_eh_region, input_struct_function_base): Use
stream_input_location instead of stream_input_location_now.

lto: Ensure we force a change for file/line/column after clear_line_info

As discussed yesterday:
On the streamer out side, we call clear_line_info
in multiple spots which resets the current_* values to something, but on the
reader side, we don't have corresponding resets in the same location, just have
the stream_* static variables that keep the current values through the
entire stream in (so across all the clear_line_info spots in a single LTO
object but also across jumping from one LTO object to another one).
Now, in an earlier version of my patch it actually broke LTO bootstrap
(and a lot of LTO testcases), so for the BLOCK case I've solved it by
clear_line_info setting current_block to something that should never appear,
which means that in the LTO stream after the clear_line_info spots including
the start of the LTO stream we force the block change bit to be set and thus
BLOCK to be streamed and therefore stream_block from earlier to be
ignored.  But for the rest I think that is not the case, so I wonder if we
don't sometimes end up with wrong line/column info because of that, or
please tell me what prevents that.
clear_line_info does:
  ob->current_file = NULL;
  ob->current_line = 0;
  ob->current_col = 0;
  ob->current_sysp = false;
while I think NULL current_file is something that should likely be different
from expanded_location (...).file (UNKNOWN_LOCATION/BUILTINS_LOCATION are
handled separately and do not go through the caching), I think line number 0
can sometimes occur and especially column 0 occurs frequently if we ran out
of location_t with columns info.  But then we do:
      bp_pack_value (bp, ob->current_file != xloc.file, 1);
      bp_pack_value (bp, ob->current_line != xloc.line, 1);
      bp_pack_value (bp, ob->current_col != xloc.column, 1);
and stream the details only if the != is true.  If that happens immediately
after clear_line_info and e.g. xloc.column is 0, we would stream 0 bit and
not stream the actual value, so on read-in it would reuse whatever
stream_col etc. were before.  Shouldn't we set some ob->current_* new bit
that would signal we are immediately past clear_line_info which would force
all these != checks to non-zero?  Either by oring something into those
tests, or perhaps:
  if (ob->current_reset)
    {
      if (xloc.file == NULL)
        ob->current_file = "";
      if (xloc.line == 0)
        ob->current_line = 1;
      if (xloc.column == 0)
        ob->current_col = 1;
      ob->current_reset = false;
    }
before doing those bp_pack_value calls with a comment, effectively forcing
all 6 != comparisons to be true?

2020-09-04  Jakub Jelinek  <jakub@redhat.com>

* lto-streamer.h (struct output_block): Add reset_locus member.
* lto-streamer-out.c (clear_line_info): Set reset_locus to true.
(lto_output_location_1): If reset_locus, clear it and ensure
current_{file,line,col} is different from xloc members.

bpf: generate indirect calls for xBPF

This patch updates the BPF back end to generate indirect calls via
the 'call %reg' instruction when targeting xBPF.

Additionally, the BPF ASM_SPEC is updated to pass along -mxbpf to
gas, where it is now supported.

2020-09-03 David Faust <david.faust@oracle.com>

gcc/

* config/bpf/bpf.h (ASM_SPEC): Pass -mxbpf to gas, if specified.
* config/bpf/bpf.c (bpf_output_call): Support indirect calls in xBPF.

gcc/testsuite/

* gcc.target/bpf/xbpf-indirect-call-1.c: New test.

test/rs6000: Replace test targets p8 and p9+

This patch replaces the rs6000 test targets p8 and p9+ with the
existing has_arch_pwr8 and has_arch_pwr9 targets, used either in
combination or individually.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr92398.p9+.c: Replace p9+ with has_arch_pwr9.
* gcc.target/powerpc/pr92398.p9-.c: Replace p9+ with has_arch_pwr9,
and replace p8 with has_arch_pwr8 && !has_arch_pwr9.
* lib/target-supports.exp (check_effective_target_p8): Remove.
(check_effective_target_p9+): Remove.

Daily bump.

sra: Avoid SRAing if there is an out-of-bounds access (PR 96820)

The testcase causes an ICE in the SRA verifier on x86_64 when
compiling with -m32 because build_user_friendly_ref_for_offset looks
at an out-of-bounds array_ref within an array_ref which accesses an
offset which does not fit into a signed 32bit integer and turns it
into an array-ref with a negative index.

The best thing is probably to bail out early when encountering an out
of bounds access to a local stack-allocated aggregate (and let the DSE
just delete such statements) which is what the patch does.

I also glanced over to the initial candidate vetting routine to make
sure the size would fit into HWI and noticed that it uses unsigned
variants whereas the rest of SRA operates on signed offsets and
sizes (because get_ref_and_extent does) and so changed that for the
sake of consistency.  These ancient checks operate on sizes of types
as opposed to DECLs but I hope that any issues potentially arising
from that are basically hypothetical.

gcc/ChangeLog:

2020-08-28  Martin Jambor  <mjambor@suse.cz>

PR tree-optimization/96820
* tree-sra.c (create_access): Disqualify candidates with accesses
beyond the end of the original aggregate.
(maybe_add_sra_candidate): Check that candidate type size fits
signed HWI for the sake of consistency.

gcc/testsuite/ChangeLog:

2020-08-28  Martin Jambor  <mjambor@suse.cz>

PR tree-optimization/96820
* gcc.dg/tree-ssa/pr96820.c: New test.
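
The early bail-out can be pictured as a range check on the access.  The
following is a minimal sketch only, assuming signed 64-bit offsets and sizes
in the same unit; access_within_aggregate is a hypothetical helper, not the
code added to create_access.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if the access [OFFSET, OFFSET + SIZE) lies entirely
   inside an aggregate of AGGREGATE_SIZE.  A candidate with any access
   for which this is false would be disqualified.  The subtraction
   form avoids overflow in OFFSET + SIZE.  */
bool
access_within_aggregate (int64_t offset, int64_t size,
                         int64_t aggregate_size)
{
  if (offset < 0 || size < 0 || aggregate_size < 0)
    return false;
  if (offset > aggregate_size)
    return false;
  return size <= aggregate_size - offset;
}
```

With a 64-unit aggregate, an access at offset 0 of size 32 passes, while an
access at offset 48 of size 32 runs past the end and fails.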

[PATCH, rs6000] Fix vector long long subtype (PR96139)

Hi,
This corrects an issue with the powerpc vector long long subtypes.
As reported by SjMunroe, when building some code with -Wall, and
attempting to print an element of a "long long vector" with a
long long printf format string, we will report an error because
the vector sub-type was improperly defined as int.

When defining a V2DI_type_node we use a TARGET_POWERPC64 ternary to
define the V2DI_type_node with "vector long" or "vector long long".
We also need to specify the proper sub-type when we define the type.

PR target/96139

2020-09-03 Will Schmidt <will_schmidt@vnet.ibm.com>

gcc/ChangeLog:
* config/rs6000/rs6000-call.c (rs6000_init_builtin): Update V2DI_type_node
and unsigned_V2DI_type_node definitions.

gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr96139-a.c: New test.
* gcc.target/powerpc/pr96139-b.c: New test.
* gcc.target/powerpc/pr96139-c.c: New test.
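
For illustration, the kind of code affected looks roughly like the sketch
below.  It is an assumption-laden stand-in: it uses the generic GCC
vector_size extension rather than the powerpc "vector long long" type from
the report, and v2di and format_first_element are hypothetical names.  The
point is only that an element of a two-element long long vector is expected
to match a "%lld" format.

```c
#include <stdio.h>

/* Generic two-element vector of long long (GCC vector extension),
   standing in for the powerpc type in the report.  */
typedef long long v2di __attribute__ ((vector_size (16)));

/* Format element 0 of V with a long long conversion.  With a correct
   element type, -Wall should not complain about the format string.  */
int
format_first_element (char *buf, size_t len, v2di v)
{
  return snprintf (buf, len, "%lld", v[0]);
}
```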

c++: Fix another PCH hash_map issue [PR96901]

The recent libstdc++ changes caused lots of libstdc++-v3 test FAILs
on i686-linux, all of them in the same spot during constexpr evaluation
of a recursive _S_gcd call.
The problem is yet another hash_map that used the default hashing of
tree keys through pointer hashing which is preserved across PCH write/read.
During PCH handling, the addresses of GC objects are changed, which means
that the hash values of the keys in such hash tables change without those
hash tables being rehashed.  Which in the fundef_copies_table case usually
means we just don't find a copy of a FUNCTION_DECL body for recursive uses
and start from scratch.  But when the hash table keeps growing, the "dead"
elements in the hash table can sometimes reappear and break things.
In particular what I saw under the debugger is when the fundef_copies_table
hash map has been used on the outer _S_gcd call, it didn't find an entry for
it, so returned a slot with *slot == NULL, which is treated as meaning the
function itself is used directly (i.e. no recursion), but that addition of
a hash table slot caused the recursive _S_gcd call to actually find
something in the hash table, unfortunately not the new *slot == NULL spot,
but a different one from the pre-PCH streaming which contained the returned
toplevel (non-recursive) call entry for it, which means that for the
recursive _S_gcd call we actually used the same trees as for the outer ones
rather than a copy of those, which breaks constexpr evaluation.

2020-09-03  Jakub Jelinek  <jakub@redhat.com>

PR c++/96901
* tree.h (struct decl_tree_traits): New type.
(decl_tree_map): New typedef.

* constexpr.c (fundef_copies_table): Change type from
hash_map<tree, tree> * to decl_tree_map *.
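
The underlying hazard, a hash derived from an object's address going stale
once the object moves, can be shown in a few lines of C.  This is a sketch,
not GCC code: struct decl and the functions below are hypothetical, copying
to another array slot stands in for the address change across PCH
write/read, and it assumes the replacement map hashes decls by a stable
identifier such as a UID.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a decl: only a stable UID.  */
struct decl
{
  unsigned uid;
};

typedef size_t (*decl_hash_fn) (const struct decl *);

/* Hash derived from the address.  It changes when the object moves.  */
size_t
hash_by_pointer (const struct decl *d)
{
  return (size_t) (uintptr_t) d;
}

/* Hash derived from the UID.  It is independent of the address.  */
size_t
hash_by_uid (const struct decl *d)
{
  return d->uid;
}

/* Return 1 if HASH gives the same value before and after the decl is
   copied to a new address, 0 otherwise.  */
int
hash_survives_relocation (decl_hash_fn hash)
{
  struct decl slots[2] = { { 42 }, { 0 } };
  size_t before = hash (&slots[0]);
  slots[1] = slots[0];          /* "relocated" copy at a new address */
  return before == hash (&slots[1]);
}
```

A table keyed with hash_by_pointer would need rehashing after every
relocation to stay consistent; one keyed with hash_by_uid would not.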