git.libre-soc.org Git

Use symtab_node::order in LTO sections with body.

2019-10-30  Martin Liska  <mliska@suse.cz>

PR lto/91393
PR lto/88220
* cgraph.c (cgraph_node::get_create): Overwrite node->order
from a first_clone in order to get proper LTO section
in LTO stream.
(cgraph_node::get_untransformed_body):
Use lto_get_section_data where symtab_node::order
must be provided.
* cgraphclones.c (cgraph_node::find_replacement):
Update also symbol order.
* ipa-fnsummary.c (ipa_fn_summary_read):
Use new function lto_get_summary_section_data.
* ipa-hsa.c (ipa_hsa_read_summary): Likewise.
* ipa-icf.c (sem_item_optimizer::read_summary):
Likewise.
* ipa-prop.c (ipa_prop_read_jump_functions):
Likewise.
(ipcp_read_transformation_summaries): Likewise.
* ipa-sra.c (ipa_sra_read_summary): Likewise.
* lto-cgraph.c (input_node): Add also order_base.
(input_varpool_node): Likewise.
(input_cgraph_1): Assign the order_base.
(input_cgraph_opt_summary): Use new lto_get_summary_section_data.
* lto-opts.c (lto_write_options): Pass new argument.
* lto-section-in.c (lto_get_section_data): Add new argumente order.
(lto_get_summary_section_data): New.
(lto_get_raw_section_data): Add order argument.
(lto_create_simple_input_block): Likewise.
* lto-section-out.c (lto_destroy_simple_output_block):
Likewise.
* lto-streamer-in.c (lto_input_toplevel_asms):
Use lto_get_summary_section_data.
(lto_input_mode_table): Likewise.
* lto-streamer-out.c (produce_asm): Pass symtab_node::order.
(lto_output_toplevel_asms): Pass new argument.
(copy_function_or_variable): Likewise.
(produce_lto_section):Likewise.
(produce_symtab): Likewise.
(lto_write_mode_table): Likewise.
(produce_asm_for_decls): Likewise.
* lto-streamer.c (lto_get_section_name): Concat symbol name
and symbol order.
* lto-streamer.h (lto_get_section_data): Add order argument.
(lto_get_summary_section_data): New.
(lto_get_raw_section_data): Add order argument.
(lto_get_section_name): Likewise.
* varpool.c (varpool_node::get_constructor): Pass order argument.
2019-10-30  Martin Liska  <mliska@suse.cz>

PR lto/91393
PR lto/88220
* lto-common.c (lto_file_finalize): Use lto_get_summary_section_data.
(get_section_data): Add order argument.
2019-10-30  Martin Liska  <mliska@suse.cz>

PR lto/91393
PR lto/88220
* gcc.dg/lto/pr91393_0.c: New test.

From-SVN: r277607

libgomp/testsuite – use 'stop' and 'dg-do run'

        libgomp/
        * testsuite/libgomp.fortran/target-simd.f90: Use stop not abort.
        * testsuite/libgomp.fortran/use_device_ptr-optional-1.f90:
        Ditto; add 'dg-do run' for torture testing.
        * testsuite/libgomp.fortran/lastprivate1.f90:  Add 'dg-do run'.
        * testsuite/libgomp.fortran/lastprivate2.f90: Ditto.
        * testsuite/libgomp.fortran/nestedfn4.f90: Ditto.
        * testsuite/libgomp.fortran/pr25219.f90: Ditto.
        * testsuite/libgomp.fortran/pr28390.f: Ditto.
        * testsuite/libgomp.fortran/pr35130.f90: Ditto.
        * testsuite/libgomp.fortran/pr90779.f90: Ditto.
        * testsuite/libgomp.fortran/task2.f90: Ditto.
        * testsuite/libgomp.fortran/taskgroup1.f90: Ditto.
        * testsuite/libgomp.fortran/taskloop1.f90: Ditto.
        * testsuite/libgomp.fortran/use_device_addr-1.f90: Ditto.
        * testsuite/libgomp.fortran/use_device_addr-2.f90: Ditto.
        * testsuite/libgomp.fortran/workshare1.f90: Ditto.
        * testsuite/libgomp.fortran/workshare2.f90: Ditto.

From-SVN: r277606

re PR tree-optimization/92262 (ICE: verify_gimple failed (error: incorrect sharing of tree nodes))

PR tree-optimization/92262
* tree-ssa-loop-ivopts.c (get_debug_computation_at): Don't unshare
ubase or cbase here.
(remove_unused_ivs): Unshare comp before using it.

* g++.dg/opt/pr92262.C: New test.

From-SVN: r277605

ipa-prop.c (update_jump_functions_after_inlining): Watch for missing summaries.

* ipa-prop.c (update_jump_functions_after_inlining):
Watch for missing summaries.

From-SVN: r277604

re PR tree-optimization/65930 (Reduction with sign-change not handled)

2019-10-30 Richard Biener <rguenther@suse.de>

PR tree-optimization/65930
* tree-vect-loop.c (vect_is_simple_reduction): For reduction
chains also allow a leading and trailing conversion.
* tree-vect-slp.c (vect_get_and_check_slp_defs): Handle
intermediate reduction chains.
(vect_analyze_slp_instance): Likewise. Build a SLP
node for a trailing conversion manually.

* gcc.dg/vect/pr65930-2.c: New testcase.

From-SVN: r277603

Suppress warning with -Wno-overwrite-recursive.

The use of -fno-automatic with -frecursive results in a warning implying
that recursion will not work. If all relevant local variable have the
automatic attribute explicitly declared recursion does work and the warning
is redundant.

From-SVN: r277602

Remove cgraph_local_info structure.

2019-10-30 Martin Liska <mliska@suse.cz>

* cgraph.c (cgraph_node::local_info): Transform to ...
(cgraph_node::local_info_node): ... this.
(cgraph_node::dump): Remove cgraph_local_info and
put its fields directly into cgraph_node.
(cgraph_node::get_availability): Likewise.
(cgraph_node::make_local): Likewise.
(cgraph_node::verify_node): Likewise.
* cgraph.h (struct GTY): Likewise.
* cgraphclones.c (set_new_clone_decl_and_node_flags): Likewise.
(duplicate_thunk_for_node): Likewise.
(cgraph_node::create_clone): Likewise.
(cgraph_node::create_virtual_clone): Likewise.
(cgraph_node::create_version_clone): Likewise.
* cgraphunit.c (cgraph_node::reset): Likewise.
(cgraph_node::finalize_function): Likewise.
(cgraph_node::add_new_function): Likewise.
(analyze_functions): Likewise.
* combine.c (setup_incoming_promotions): Likewise.
* config/i386/i386.c (ix86_function_regparm): Likewise.
(ix86_function_sseregparm): Likewise.
(init_cumulative_args): Likewise.
* ipa-cp.c (determine_versionability): Likewise.
(count_callers): Likewise.
(set_single_call_flag): Likewise.
(initialize_node_lattices): Likewise.
(estimate_local_effects): Likewise.
(create_specialized_node): Likewise.
(identify_dead_nodes): Likewise.
* ipa-fnsummary.c (compute_fn_summary): Likewise.
(ipa_fn_summary_generate): Likewise.
* ipa-hsa.c (check_warn_node_versionable): Likewise.
(process_hsa_functions): Likewise.
* ipa-icf.c (set_local): Likewise.
* ipa-inline-analysis.c (initialize_inline_failed): Likewise.
* ipa-inline.c (speculation_useful_p): Likewise.
* ipa-profile.c (ipa_propagate_frequency): Likewise.
(ipa_profile): Likewise.
* ipa-split.c (split_function): Likewise.
(execute_split_functions): Likewise.
* ipa-sra.c (ipa_sra_preliminary_function_checks): Likewise.
(ipa_sra_ipa_function_checks): Likewise.
* ipa-visibility.c (function_and_variable_visibility): Likewise.
* ipa.c (symbol_table::remove_unreachable_nodes): Likewise.
* lto-cgraph.c (lto_output_node): Likewise.
(input_overwrite_node): Likewise.
* multiple_target.c (expand_target_clones): Likewise.
* omp-simd-clone.c (simd_clone_create): Likewise.
* trans-mem.c (expand_call_tm): Likewise.
(ipa_tm_mayenterirr_function): Likewise.
(ipa_tm_diagnose_tm_safe): Likewise.
(ipa_tm_diagnose_transaction): Likewise.
(ipa_tm_create_version): Likewise.
(ipa_tm_transform_calls_redirect): Likewise.
(ipa_tm_execute): Likewise.
* tree-inline.c (expand_call_inline): Likewise.

From-SVN: r277601

Remove cgraph_global_info.

From-SVN: r277600

Daily bump.

From-SVN: r277599

typeck.c (build_x_unary_op): Use the location_t argument in three error_at.

/cp
2019-10-29 Paolo Carlini <paolo.carlini@oracle.com>

* typeck.c (build_x_unary_op): Use the location_t argument in
three error_at.

/testsuite
2019-10-29 Paolo Carlini <paolo.carlini@oracle.com>

* g++.dg/other/ptrmem8.C: Test locations too.
* g++.dg/template/dtor6.C: Likewise.

From-SVN: r277595

PR c++/90998 - ICE with copy elision in init by ctor and -Wconversion.

After r269667 which introduced joust_maybe_elide_copy, in C++17 we can elide
a constructor if it uses a conversion function that returns a prvalue, and
use the conversion function in its stead.

This eliding means that if we have a candidate that previously didn't have
->second_conv, it can have it after the elision. This confused the
-Wconversion warning because it was assuming that if cand1->second_conv is
non-null, so is cand2->second_conv. Here cand1->second_conv was non-null
but cand2->second_conv remained null, so it crashed in compare_ics.

I checked with clang that both compilers call A::operator B() in C++17 and
B::B(A const &) otherwise.

* call.c (joust): Don't attempt to warn if ->second_conv is null.

* g++.dg/cpp0x/overload-conv-4.C: New test.

From-SVN: r277593

re PR c++/92201 (ICE: ‘verify_gimple’ failed with -std=c++2a)

PR c++/92201
* cp-gimplify.c (cp_gimplify_expr): If gimplify_to_rvalue changes the
function pointer type, re-add cast to the original one.

* g++.dg/other/pr92201.C: New test.

From-SVN: r277592

PR c++/91548 - fix detecting modifying const objects for ARRAY_REF.

This fixes a bogus "modifying a const object" error for an array that actually
isn't declared const. The problem was how I handled ARRAY_REFs here; we
shouldn't look at the ARRAY_REF itself, but at the array its accessing.

* constexpr.c (cxx_eval_store_expression): Don't call
modifying_const_object_p for ARRAY_REF.

* g++.dg/cpp1y/constexpr-tracking-const15.C: New test.
* g++.dg/cpp1y/constexpr-tracking-const16.C: New test.
* g++.dg/cpp1z/constexpr-tracking-const1.C: New test.

From-SVN: r277591

Fix compilation errors with Clang

* include/bits/range_access.h (ranges::disable_sized_range)
(ranges::begin, ranges::end, ranges::cbegin, ranges::cend)
(ranges::rbegin, ranges::rend, ranges::crbegin, ranges::crend)
(ranges::size, ranges::empty, ranges::data, ranges::cdata)
(ranges::range, ranges::sized_range, ranges::advance, ranges::distance)
(ranges::next, ranges::prev): Guard with __cpp_lib_concepts.
* include/bits/stl_iterator.h (disable_sized_sentinel): Likewise.

From-SVN: r277589

Fix compilation errors with Clang

* include/bits/alloc_traits.h (__cpp_lib_constexpr_dynamic_alloc):
Define.
(allocator_traits::_S_construct, allocator_traits::_S_destroy)
(__alloc_on_copy, __alloc_on_move, __alloc_on_swap): Use
_GLIBCXX14_CONSTEXPR instead of constexpr.
* include/bits/stl_construct.h (_Destroy): Likewise.

From-SVN: r277588

Add iterator concepts and range access customization points for C++20

This adds most of the new C++20 features to <iterator>, as well as a few
initial pieces of <ranges> (but no actual <ranges> header just yet).

* include/Makefile.am: Add new header.
* include/Makefile.in: Regenerate.
* include/bits/iterator_concepts.h: New header.
(contiguous_iterator_tag, iter_reference_t, ranges::iter_move)
(iter_rvalue_reference_t, incrementable_traits, iter_difference_t)
(readable_traits, iter_value_t, readable, iter_common_reference_t)
(writable, waekly_incrementable, incrementable)
(input_or_output_iterator, sentinel_for, disable_sized_sentinel)
(sized_sentinel_for, input_iterator, output_iterator)
(forward_iterator, bidirectional_iterator, random_access_iterator)
(contiguous_iterator, indirectly_unary_invocable)
(indirectly_regular_unary_invocable, indirect_unary_predicate)
(indirect_relation, indirect_strict_weak_order, indirect_result_t)
(projected, indirectly_movable, indirectly_movable_storable)
(indirectly_copyable, indirectly_copyable_storable, ranges::iter_swap)
(indirectly_swappable, indirectly_comparable, permutable, mergeable)
(sortable, unreachable_sentinel_t, unreachable_sentinel)
(default_sentinel_t, default_sentinel): Define.
(__detail::__cpp17_iterator, __detail::__cpp17_input_iterator)
(__detail::__cpp17_fwd_iterator, __detail::__cpp17_bidi_iterator)
(__detail::__cpp17_randacc_iterator): Define.
(__iterator_traits): Define constrained specializations.
* include/bits/move.h (move): Only use old concept check for C++98.
* include/bits/range_access.h (ranges::disable_sized_range)
(ranges::begin, ranges::end, ranges::cbegin, ranges::cend)
(ranges::rbegin, ranges::rend, ranges::crbegin, ranges::crend)
(ranges::size, ranges::empty, ranges::data, ranges::cdata): Define
new customization points for C++20.
(ranges::range, ranges::sized_range): Define new concepts for C++20.
(ranges::advance, ranges::distance, ranges::next, ranges::prev):
Define new functions for C++20.
(__adl_end, __adl_cdata, __adl_cbegin, __adl_cend, __adl_rbegin)
(__adl_rend, __adl_crbegin, __adl_crend, __adl_cdata, __adl_size)
(__adl_empty): Remove.
* include/bits/stl_iterator.h (disable_sized_sentinel): Specialize
for reverse_iterator.
* include/bits/stl_iterator_base_types.h (contiguous_iterator_tag):
Define new struct for C++20.
(iterator_traits<_Tp*>): Constrain partial specialization in C++20.
* include/std/concepts (__is_class_or_enum): Move to __detail
namespace.
* testsuite/20_util/forward/c_neg.cc: Adjust dg-error line number.
* testsuite/20_util/forward/f_neg.cc: Likewise.
* testsuite/24_iterators/associated_types/incrementable.traits.cc: New
test.
* testsuite/24_iterators/associated_types/readable.traits.cc: New test.
* testsuite/24_iterators/contiguous/concept.cc: New test.
* testsuite/24_iterators/contiguous/tag.cc: New test.
* testsuite/24_iterators/customization_points/iter_move.cc: New test.
* testsuite/24_iterators/customization_points/iter_swap.cc: New test.
* testsuite/24_iterators/headers/iterator/synopsis_c++20.cc: New test.
* testsuite/24_iterators/range_operations/advance.cc: New test.
* testsuite/24_iterators/range_operations/distance.cc: New test.
* testsuite/24_iterators/range_operations/next.cc: New test.
* testsuite/24_iterators/range_operations/prev.cc: New test.
* testsuite/26_numerics/adjacent_difference/requirements/
explicit_instantiation/2.cc: Rename types that conflict with C++20
concepts.
* testsuite/26_numerics/adjacent_difference/requirements/
explicit_instantiation/pod.cc: Likewise.
* testsuite/26_numerics/partial_sum/requirements/
explicit_instantiation/2.cc: Likewise.
* testsuite/26_numerics/partial_sum/requirements/
explicit_instantiation/pod.cc: Likewise.
* testsuite/experimental/iterator/requirements.cc: Likewise.
* testsuite/std/ranges/access/begin.cc: New test.
* testsuite/std/ranges/access/cbegin.cc: New test.
* testsuite/std/ranges/access/cdata.cc: New test.
* testsuite/std/ranges/access/cend.cc: New test.
* testsuite/std/ranges/access/crbegin.cc: New test.
* testsuite/std/ranges/access/crend.cc: New test.
* testsuite/std/ranges/access/data.cc: New test.
* testsuite/std/ranges/access/empty.cc: New test.
* testsuite/std/ranges/access/end.cc: New test.
* testsuite/std/ranges/access/rbegin.cc: New test.
* testsuite/std/ranges/access/rend.cc: New test.
* testsuite/std/ranges/access/size.cc: New test.
* testsuite/util/testsuite_iterators.h (contiguous_iterator_wrapper)
(test_range, test_sized_range): New test utilities.

From-SVN: r277579

Minor improvements to testsuite iterator utilities

* testsuite/util/testsuite_iterators.h (BoundsContainer::size()): Add
new member function.
(WritableObject::operator=): Constrain with enable_if when available.
(remove_cv): Use std::remove_if when available.
(test_container::it(int)): Use size().
(test_container::size()): Use BoundsContainer::size().

From-SVN: r277578

PR libstdc++/92267 fix ABI change in deque iterators

Defaulting the copy constructor on its first declaration made it change
from user-provided (and non-trivial) to implicitly-defined (and
trivial). This caused an ABI incompatibility between GCC 8 and GCC 9,
where functions taking a deque iterator disagree on the argument passing
convention.

PR libstdc++/92267
* include/bits/stl_deque.h (_Deque_iterator(const _Deque_iterator&)):
Do not define as defaulted.
* testsuite/23_containers/deque/types/92267.cc: New test.

From-SVN: r277577

re PR testsuite/92144 (c-c++-common/Warray-bounds-4.c still fails after r277080)

gcc/testsuite/ChangeLog:

PR testsuite/92144
* c-c++-common/Warray-bounds-4.c: Disable test to avoid failures
due to PR 83543.

From-SVN: r277576

cp-demangle.c (d_number): Avoid signed int overflow.

2019-10-29 Paul Pluzhnikov <ppluzhnikov@google.com>

* cp-demangle.c (d_number): Avoid signed int overflow.

From-SVN: r277575

Pass memory statistics for {symbol,call}_summary.

2019-10-29 Martin Liska <mliska@suse.cz>

* symbol-summary.h (function_summary): Pass memory location
to underlaying hash_map (or vec).
(V>::fast_function_summary): Likewise.

From-SVN: r277573

Release function and edge summaries allocated with GGC.

2019-10-29 Martin Liska <mliska@suse.cz>

* ggc.h (ggc_alloc_no_dtor): New function.
* ipa-fnsummary.c (ipa_free_fn_summary): Call
destructor and ggc_free.
(ipa_free_size_summary): Call delete instead
of release.
* ipa-fnsummary.h: Use new function ggc_alloc_no_dtor.
* ipa-prop.c (ipa_check_create_edge_args): Likewise.
(ipa_free_all_edge_args): Call destructor and ggc_free.
(ipa_free_all_node_params): Likewise.
(ipcp_free_transformation_sum): Likewise.
* ipa-prop.h (ipa_check_create_node_params):
Call new ggc_alloc_no_dtor.
* ipa-sra.c (ipa_sra_generate_summary): Likewise.
(ipa_sra_analysis): Call destructor and ggc_free.
Replace release with delete operator.
* symbol-summary.h (release): Remove ..
(V>::~fast_function_summary): and move logic here.
Likewise for other classes.

From-SVN: r277572

re PR tree-optimization/92260 (ICE in exact_div, at poly-int.h:2162)

2019-10-29 Richard Biener <rguenther@suse.de>

PR tree-optimization/92260
* tree-vect-slp.c (vect_get_constant_vectors): Special-case
lane-reducing ops.

* gcc.dg/pr92260.c: New testcase.

From-SVN: r277571

[vect]PR 88915: Vectorize epilogues when versioning loops

gcc/ChangeLog:
2019-10-29  Andre Vieira  <andre.simoesdiasvieira@arm.com>

PR 88915
* tree-ssa-loop-niter.h (simplify_replace_tree): Change declaration.
* tree-ssa-loop-niter.c (simplify_replace_tree): Add context parameter
and make the valueize function pointer also take a void pointer.
* gcc/tree-ssa-sccvn.c (vn_valueize_wrapper): New function to wrap
around vn_valueize, to call it without a context.
(process_bb): Use vn_valueize_wrapper instead of vn_valueize.
* tree-vect-loop.c (_loop_vec_info): Initialize epilogue_vinfos.
(~_loop_vec_info): Release epilogue_vinfos.
(vect_analyze_loop_costing): Use knowledge of main VF to estimate
number of iterations of epilogue.
(vect_analyze_loop_2): Adapt to analyse main loop for all supported
vector sizes when vect-epilogues-nomask=1.  Also keep track of lowest
versioning threshold needed for main loop.
(vect_analyze_loop): Likewise.
(find_in_mapping): New helper function.
(update_epilogue_loop_vinfo): New function.
(vect_transform_loop): When vectorizing epilogues re-use analysis done
on main loop and call update_epilogue_loop_vinfo to update it.
* tree-vect-loop-manip.c (vect_update_inits_of_drs): No longer insert
stmts on loop preheader edge.
(vect_do_peeling): Enable skip-vectors when doing loop versioning if
we decided to vectorize epilogues.  Update epilogues NITERS and
construct ADVANCE to update epilogues data references where needed.
* tree-vectorizer.h (_loop_vec_info): Add epilogue_vinfos.
(vect_do_peeling, vect_update_inits_of_drs,
determine_peel_for_niter, vect_analyze_loop): Add or update
declarations.
* tree-vectorizer.c (try_vectorize_loop_1): Make sure to use already
created loop_vec_info's for epilogues when available.  Otherwise analyse
epilogue separately.

From-SVN: r277569

tree-ssa.texi (Immediate Uses): Fix FOR_EACH_IMM_USE_STMT example.

2019-10-29 Richard Biener <rguenther@suse.de>

* doc/tree-ssa.texi (Immediate Uses): Fix FOR_EACH_IMM_USE_STMT
example.

From-SVN: r277568

Fix reduc_index calculation in vectorizable_condition

Fixes ICEs in gcc.target/aarch64/sve/clastb*.

2019-10-29 Richard Sandiford <richard.sandiford@arm.com>

gcc/
* tree-vect-stmts.c (vectorizable_condition): Get the reduction
index for the COND_EXPR from stmt_info rather than reduc_info.

From-SVN: r277567

re PR tree-optimization/65930 (Reduction with sign-change not handled)

2019-10-29 Richard Biener <rguenther@suse.de>

PR tree-optimization/65930
* tree-vect-loop.c (check_reduction_path): Relax single-use
check allowing out-of-loop uses.
(vect_is_simple_reduction): SLP reduction chains cannot have
intermediate stmts used outside of the loop.
(vect_create_epilog_for_reduction): The adjustment might need
to be converted.
(vectorizable_reduction): Annotate live stmts of the reduction
chain with STMT_VINFO_REDUC_DEF.
* tree-vect-stms.c (process_use): Remove no longer true asserts.

* gcc.dg/vect/pr65930-1.c: New testcase.

From-SVN: r277566

[AArch64] Add main SVE ACLE tests

Now that the PCS support is applied, this patch adds the main
SVE ACLE tests.  The idea is to test various combinations of operands
for each ACLE function, with each combination using a specific register
allocation and with each combination being wrapped its own test function.
We then compare the full assembly output of these test functions against
the expected/preferred sequences.  This provides both optimisation and
correctness testing, since ultimately the ACLE functions are defined in
terms of the underlying SVE instructions.

2019-10-29  Richard Sandiford  <richard.sandiford@arm.com>
    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
    Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>

gcc/testsuite/
* g++.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp: New file.
* gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp: New file.
* gcc.target/aarch64/sve/acle/asm: New test directory.

Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
Co-Authored-By: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
From-SVN: r277565

[AArch64] Add support for the SVE PCS

The AAPCS64 specifies that if a function takes arguments in SVE
registers or returns them in SVE registers, it must preserve all
of Z8-Z23 and all of P4-P11.  (Normal functions only preserve the
low 64 bits of Z8-Z15 and clobber all of the predicate registers.)

This variation is known informally as the "SVE PCS" and functions
that use it are known informally as "SVE functions".  The SVE PCS
is mutually interoperable with functions that follow the standard
AAPCS64 rules and those that use the aarch64_vector_pcs attribute.
(Note that it's an error to use the attribute for SVE functions.)

One complication -- although it's not really that complicated --
is that SVE registers need to be saved at a VL-dependent offset while
other registers need to be saved at a constant offset.  The easiest way
of handling this seemed to be to group the SVE registers together below
the hard frame pointer.  In common cases, the frame pointer is then
usually an easy-to-compute VL multiple above the stack pointer and a
constant amount below the incoming stack pointer.

A bigger complication is that, because the base AAPCS64 specifies that
only the low 64 bits of V8-V15 are preserved by calls, the associated
DWARF frame registers are also treated as 64 bits by the unwinder.
The 64 bits must also have the same layout as they would for a base
AAPCS64 function, otherwise unwinding won't work correctly.  (This is
actually a problem for the existing aarch64_vector_pcs support too,
but I'll fix that separately.)

This falls out naturally for little-endian targets but not for
big-endian targets.  The easiest way of meeting the requirement for them
was to use ST1D and LD1D to save and restore Z8-Z15, which also has the
nice property of storing the 64 bits at the start of the slot.  However,
using ST1D and LD1D requires a spare predicate register, and since all
of P0-P7 are either argument registers or call-preserved, we may need
to spill P4 in order to save the vector registers, even if P4 wouldn't
need to be saved otherwise.

Since Z16-Z23 are fully clobbered by base AAPCS64 functions, we don't
need to emit frame information for them at all.  This avoids having
to decide whether the registers should be treated as having 64 bits
(as for Z8-Z15), 128 bits (for Advanced SIMD) or the full SVE width.

There are two ways of dealing with stack-clash protection when
saving SVE registers:

(1) If the area between the hard frame pointer and the incoming stack
    pointer is allocated via a store with writeback (callee_adjust != 0),
    the SVE save area is allocated separately and becomes the "initial"
    allocation as far as stack-clash protection goes.  In this case
    the store with writeback acts as a probe at the hard frame pointer
    position.

(2) If the area between the hard frame pointer and the incoming stack
    pointer is allocated via aarch64_allocate_and_probe_stack_space,
    the SVE save area is added to this initial allocation, so that the
    SP ends up pointing at the SVE register saves.  It's then necessary
    to use a temporary base register to save the non-SVE registers.
    Setting up this temporary register requires a single instruction
    only and so should be more efficient than doing two allocations
    and probes.

When SVE registers need to be saved, saving them below the frame pointer
makes it harder to rely on the LR save as a stack probe, since the LR
register's offset won't usually be a compile-time constant.  The patch
copes with that by using the lowest SVE register save as a stack probe
too, and thus prevents the save from being shrink-wrapped if stack clash
protection is enabled.

The changelog describes the low-level details.

2019-10-29  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
* calls.c (pass_by_reference): Leave the target to decide whether
POLY_INT_CST-sized arguments should be passed by value or reference,
rather than forcing them to be passed by reference.
(must_pass_in_stack_var_size): Likewise.
* config/aarch64/aarch64.md (LAST_SAVED_REGNUM): Redefine from
V31_REGNUM to P15_REGNUM.
* config/aarch64/aarch64-protos.h (aarch64_init_cumulative_args):
Take an extra "silent_p" parameter, defaulting to false.
(aarch64_sve::svbool_type_p): Declare.
(aarch64_sve::nvectors_if_data_type): Likewise.
* config/aarch64/aarch64.h (NUM_PR_ARG_REGS): New macro.
(aarch64_frame::reg_offset): Turn into poly_int64s.
(aarch64_frame::save_regs_size): Likewise.
(aarch64_frame::below_hard_fp_saved_regs_size): New field.
(aarch64_frame::sve_callee_adjust): Likewise.
(aarch64_frame::spare_reg_reg): Likewise.
(ARM_PCS_SVE): New arm_pcs value.
(CUMULATIVE_ARGS::aapcs_nprn): New field.
(CUMULATIVE_ARGS::aapcs_nextnprn): Likewise.
(CUMULATIVE_ARGS::silent_p): Likewise.
(BITS_PER_SVE_PRED): New macro.
* config/aarch64/aarch64.c (handle_aarch64_vector_pcs_attribute): New
function.  Reject aarch64_vector_pcs attributes on SVE functions.
(aarch64_attribute_table): Use the above handler.
(aarch64_sve_abi): New function.
(aarch64_sve_argument_p): Likewise.
(aarch64_returns_value_in_sve_regs_p): Likewise.
(aarch64_takes_arguments_in_sve_regs_p): Likewise.
(aarch64_fntype_abi): Check for SVE functions and return the SVE PCS
descriptor for them.
(aarch64_simd_decl_p): Delete.
(aarch64_emit_cfi_for_reg_p): New function.
(aarch64_reg_save_mode): Remove the fndecl argument and instead use
crtl->abi to choose the mode for FP registers.  Handle the SVE PCS.
(aarch64_hard_regno_call_part_clobbered): Do not treat FP registers
as partly clobbered for the SVE PCS.
(aarch64_function_ok_for_sibcall): Check whether the two functions
use the same ABI, rather than checking specifically for whether
they're aarch64_vector_pcs functions.
(aarch64_pass_by_reference): Raise an error for attempts to pass
SVE arguments when SVE is disabled.  Pass SVE arguments by reference
if there are not enough free registers left, or if the argument is
variadic.
(aarch64_function_value): Handle SVE predicates, vectors and tuples.
(aarch64_return_in_memory): Do not return SVE predicates, vectors and
tuples in memory.
(aarch64_layout_arg): Take a function_arg_info rather than
individual properties.  Handle SVE predicates, vectors and tuples.
Raise an error if they are passed to unprototyped functions.
(aarch64_function_arg): If the silent_p flag is set, suppress the
usual error about using float registers without TARGET_FLOAT.
(aarch64_init_cumulative_args): Take a silent_p parameter and store
it in the cumulative_args structure.  Initialize aapcs_nprn and
aapcs_nextnprn.  If the silent_p flag is set, suppress the usual
error about using float registers without TARGET_FLOAT.
If the silent_p flag is not set, also raise an error about
using SVE functions when SVE is disabled.
(aarch64_function_arg_advance): Update the call to aarch64_layout_arg,
and call it for SVE functions too.  Update aapcs_nprn similarly
to the other register counts.
(aarch64_layout_frame): If a big-endian function needs to save
and restore Z8-Z15, search for a spare predicate that it can use.
Store SVE predicates at the bottom of the register save area,
followed by SVE vectors, then followed by the normal slots.
Keep pointing the hard frame pointer at the base of the normal slots,
above the SVE vectors.  Update the various frame creation and
tear-down strategies for the new layout, initializing the new
sve_callee_adjust field.  Add an additional layout for frames
whose saved registers are all SVE registers.
(aarch64_register_saved_on_entry): Cope with poly_int64 reg_offsets.
(aarch64_return_address_signing_enabled): Likewise.
(aarch64_push_regs, aarch64_pop_regs): Update calls to
aarch64_reg_save_mode.
(aarch64_adjust_sve_callee_save_base): New function.
(aarch64_add_cfa_expression): Move earlier in file.  Take the
saved register as an rtx rather than a register number and use
its mode for the MEM slot.
(aarch64_save_callee_saves): Remove the mode argument and instead
use aarch64_reg_save_mode to get the mode of each save slot.
Add a hard_fp_valid_p parameter.  Cope with poly_int64 register
offsets.  Allow GP offsets to be saved at a VL-based offset from
the stack, handling this case using the frame pointer if available
or a temporary register otherwise.  Use ST1D to save Z8-Z15 for
big-endian SVE functions; use normal moves for other SVE saves.
Only mark the save as frame-related if aarch64_emit_cfi_for_reg_p
returns true.  Add explicit CFA notes when not storing via the
stack pointer.  Do not try to pair SVE saves.
(aarch64_restore_callee_saves): Cope with poly_int64 register
offsets.  Use LD1D to restore Z8-Z15 for big-endian SVE functions;
use normal moves for other SVE restores.  Only add CFA restore notes
if aarch64_emit_cfi_for_reg_p returns true.  Do not try to pair
SVE restores.
(aarch64_get_separate_components): Always keep the first SVE save
in the prologue if we need to use it as a stack probe.  Don't allow
Z8-Z15 saves and loads to be shrink-wrapped for big-endian targets.
Likewise the spare predicate register that they need.  Update the
offset calculation to account for the SVE save area.  Use the
appropriate range check for SVE LDR and STR instructions.
(aarch64_components_for_bb): Cope with poly_int64 reg_offsets.
(aarch64_process_components): Likewise.  Update the offset
calculation to account for the SVE save area.  Only mark the
save as frame-related if aarch64_emit_cfi_for_reg_p returns true.
Do not try to pair SVE saves.
(aarch64_allocate_and_probe_stack_space): Cope with poly_int64
reg_offsets.  When handling the final allocation, expect the
first SVE register save to be part of the initial allocation
and for it to act as a probe at SP.  Account for the SVE callee
save area in the dump information.
(aarch64_expand_prologue): Update the frame diagram.  Fold the
SVE callee allocation into the initial allocation if stack clash
protection is enabled.  Use new variables to track the offset
of the frame chain (and hard frame pointer) from the current
stack pointer, and likewise the offset of the bottom of the
register save area.  Update calls to aarch64_save_callee_saves
and aarch64_add_cfa_expression.  Apply sve_callee_adjust before
saving the FP&SIMD registers.  Save the predicate registers.
(aarch64_expand_epilogue): Take below_hard_fp_saved_regs_size
into account when setting the stack pointer from the frame pointer,
and when deciding whether we can inherit the initial adjustment
amount from the prologue.  Restore the predicate registers after
the vector registers, then apply sve_callee_adjust, then restore
the general registers.
(aarch64_secondary_reload): Don't use secondary SVE reloads
for VNx16BImode.
(aapcs_vfp_sub_candidate): Assert that the type is not an SVE type.
(aarch64_short_vector_p): Return false for SVE types.
(aarch64_vfp_is_call_or_return_candidate): Initialize *is_ha
at the start of the function.  Return false for SVE types.
(aarch64_asm_output_variant_pcs): Output .variant_pcs for SVE
functions too.
(TARGET_STRICT_ARGUMENT_NAMING): Redefine to request strict naming.
* config/aarch64/aarch64-sve.md (*aarch64_sve_mov<mode>_le): Extend
to big-endian targets for bytewise moves.
(*aarch64_sve_mov<mode>_be): Exclude the bytewise case.

gcc/testsuite/
* gcc.target/aarch64/sve/pcs/aarch64-sve-pcs.exp: New file.
* gcc.target/aarch64/sve/pcs/annotate_1.c: New test.
* gcc.target/aarch64/sve/pcs/annotate_2.c: Likewise.
* gcc.target/aarch64/sve/pcs/annotate_3.c: Likewise.
* gcc.target/aarch64/sve/pcs/annotate_4.c: Likewise.
* gcc.target/aarch64/sve/pcs/annotate_5.c: Likewise.
* gcc.target/aarch64/sve/pcs/annotate_6.c: Likewise.
* gcc.target/aarch64/sve/pcs/annotate_7.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_1.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_10.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_11_nosc.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_11_sc.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_2.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_3.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_4.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_f16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_f32.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_f64.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_s16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_s32.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_s64.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_s8.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_u16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_u32.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_u64.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_be_u8.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_le_f16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_le_f32.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_le_f64.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_le_s16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_le_s32.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_le_s64.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_le_s8.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_le_u16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_le_u32.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_le_u64.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_5_le_u8.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_be_f16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_be_f32.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_be_f64.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_be_s16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_be_s32.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_be_s64.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_be_s8.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_be_u16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_be_u32.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_be_u64.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_be_u8.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_le_f16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_le_f32.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_le_f64.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_le_s16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_le_s32.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_le_s64.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_le_s8.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_le_u16.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_le_u32.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_le_u64.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_6_le_u8.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_7.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_8.c: Likewise.
* gcc.target/aarch64/sve/pcs/args_9.c: Likewise.
* gcc.target/aarch64/sve/pcs/nosve_1.c: Likewise.
* gcc.target/aarch64/sve/pcs/nosve_2.c: Likewise.
* gcc.target/aarch64/sve/pcs/nosve_3.c: Likewise.
* gcc.target/aarch64/sve/pcs/nosve_4.c: Likewise.
* gcc.target/aarch64/sve/pcs/nosve_5.c: Likewise.
* gcc.target/aarch64/sve/pcs/nosve_6.c: Likewise.
* gcc.target/aarch64/sve/pcs/nosve_7.c: Likewise.
* gcc.target/aarch64/sve/pcs/nosve_8.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_1.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_1_1024.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_1_2048.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_1_256.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_1_512.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_2.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_3.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_4.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_4_1024.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_4_2048.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_4_256.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_4_512.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_5.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_5_1024.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_5_2048.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_5_256.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_5_512.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_6.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_6_1024.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_6_2048.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_6_256.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_6_512.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_7.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_8.c: Likewise.
* gcc.target/aarch64/sve/pcs/return_9.c: Likewise.
* gcc.target/aarch64/sve/pcs/saves_1_be_nowrap.c: Likewise.
* gcc.target/aarch64/sve/pcs/saves_1_be_wrap.c: Likewise.
* gcc.target/aarch64/sve/pcs/saves_1_le_nowrap.c: Likewise.
* gcc.target/aarch64/sve/pcs/saves_1_le_wrap.c: Likewise.
* gcc.target/aarch64/sve/pcs/saves_2_be_nowrap.c: Likewise.
* gcc.target/aarch64/sve/pcs/saves_2_be_wrap.c: Likewise.
* gcc.target/aarch64/sve/pcs/saves_2_le_nowrap.c: Likewise.
* gcc.target/aarch64/sve/pcs/saves_2_le_wrap.c: Likewise.
* gcc.target/aarch64/sve/pcs/saves_3.c: Likewise.
* gcc.target/aarch64/sve/pcs/saves_4_be.c: Likewise.
* gcc.target/aarch64/sve/pcs/saves_4_le.c: Likewise.
* gcc.target/aarch64/sve/pcs/saves_5_be.c: Likewise.
* gcc.target/aarch64/sve/pcs/saves_5_le.c: Likewise.
* gcc.target/aarch64/sve/pcs/stack_clash_1.c: Likewise.
* gcc.target/aarch64/sve/pcs/stack_clash_1_256.c: Likewise.
* gcc.target/aarch64/sve/pcs/stack_clash_1_512.c: Likewise.
* gcc.target/aarch64/sve/pcs/stack_clash_1_1024.c: Likewise.
* gcc.target/aarch64/sve/pcs/stack_clash_1_2048.c: Likewise.
* gcc.target/aarch64/sve/pcs/stack_clash_2.c: Likewise.
* gcc.target/aarch64/sve/pcs/stack_clash_2_256.c: Likewise.
* gcc.target/aarch64/sve/pcs/stack_clash_2_512.c: Likewise.
* gcc.target/aarch64/sve/pcs/stack_clash_2_1024.c: Likewise.
* gcc.target/aarch64/sve/pcs/stack_clash_2_2048.c: Likewise.
* gcc.target/aarch64/sve/pcs/stack_clash_3.c: Likewise.
* gcc.target/aarch64/sve/pcs/unprototyped_1.c: Likewise.
* gcc.target/aarch64/sve/pcs/varargs_1.c: Likewise.
* gcc.target/aarch64/sve/pcs/varargs_2_f16.c: Likewise.
* gcc.target/aarch64/sve/pcs/varargs_2_f32.c: Likewise.
* gcc.target/aarch64/sve/pcs/varargs_2_f64.c: Likewise.
* gcc.target/aarch64/sve/pcs/varargs_2_s16.c: Likewise.
* gcc.target/aarch64/sve/pcs/varargs_2_s32.c: Likewise.
* gcc.target/aarch64/sve/pcs/varargs_2_s64.c: Likewise.
* gcc.target/aarch64/sve/pcs/varargs_2_s8.c: Likewise.
* gcc.target/aarch64/sve/pcs/varargs_2_u16.c: Likewise.
* gcc.target/aarch64/sve/pcs/varargs_2_u32.c: Likewise.
* gcc.target/aarch64/sve/pcs/varargs_2_u64.c: Likewise.
* gcc.target/aarch64/sve/pcs/varargs_2_u8.c: Likewise.
* gcc.target/aarch64/sve/pcs/varargs_3_nosc.c: Likewise.
* gcc.target/aarch64/sve/pcs/varargs_3_sc.c: Likewise.
* gcc.target/aarch64/sve/pcs/vpcs_1.c: Likewise.
* g++.target/aarch64/sve/catch_7.C: Likewise.

From-SVN: r277564

[AArch64] Add support for arm_sve.h

This patch adds support for arm_sve.h.  I've tried to split all the
groundwork out into separate patches, so this is mostly adding new code
rather than changing existing code.

The C++ frontend seems to handle correct ACLE code without modification,
even in length-agnostic mode.  The C frontend is close; the only correct
construct I know it doesn't handle is initialisation.  E.g.:

  svbool_t pg = svptrue_b8 ();

produces:

  variable-sized object may not be initialized

although:

  svbool_t pg; pg = svptrue_b8 ();

works fine.  This can be fixed by changing:

  {
    /* A complete type is ok if size is fixed.  */

-     if (TREE_CODE (TYPE_SIZE (TREE_TYPE (decl))) != INTEGER_CST
+     if (!poly_int_tree_p (TYPE_SIZE (TREE_TYPE (decl)))
|| C_DECL_VARIABLE_SIZE (decl))
      {
error ("variable-sized object may not be initialized");

in c/c-decl.c:start_decl.

Invalid code is likely to trigger ICEs, so this isn't ready for general
use yet.  However, it seemed better to apply the patch now and deal with
diagnosing invalid code as a follow-up.  For one thing, it means that
we'll be able to provide testcases for middle-end changes related
to SVE vectors, which has been a problem until now.  (I already have
a series of such patches lined up.)

The patch includes some tests, but the main ones need to wait until the
PCS support has been applied.

2019-10-29  Richard Sandiford  <richard.sandiford@arm.com>
    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
    Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>

gcc/
* config.gcc (aarch64*-*-*): Add arm_sve.h to extra_headers.
Add aarch64-sve-builtins.o, aarch64-sve-builtins-shapes.o and
aarch64-sve-builtins-base.o to extra_objs.  Add
aarch64-sve-builtins.h and aarch64-sve-builtins.cc to target_gtfiles.
* config/aarch64/t-aarch64 (aarch64-sve-builtins.o): New rule.
(aarch64-sve-builtins-shapes.o): Likewise.
(aarch64-sve-builtins-base.o): New rules.
* config/aarch64/aarch64-c.c (aarch64_pragma_aarch64): New function.
(aarch64_resolve_overloaded_builtin): Likewise.
(aarch64_check_builtin_call): Likewise.
(aarch64_register_pragmas): Install aarch64_resolve_overloaded_builtin
and aarch64_check_builtin_call in targetm.  Register the GCC aarch64
pragma.
* config/aarch64/aarch64-protos.h (AARCH64_FOR_SVPRFOP): New macro.
(aarch64_svprfop): New enum.
(AARCH64_BUILTIN_SVE): New aarch64_builtin_class enum value.
(aarch64_sve_int_mode, aarch64_sve_data_mode): Declare.
(aarch64_fold_sve_cnt_pat, aarch64_output_sve_prefetch): Likewise.
(aarch64_output_sve_cnt_pat_immediate): Likewise.
(aarch64_output_sve_ptrues, aarch64_sve_ptrue_svpattern_p): Likewise.
(aarch64_sve_sqadd_sqsub_immediate_p, aarch64_sve_ldff1_operand_p)
(aarch64_sve_ldnf1_operand_p, aarch64_sve_prefetch_operand_p)
(aarch64_ptrue_all_mode, aarch64_convert_sve_data_to_pred): Likewise.
(aarch64_expand_sve_dupq, aarch64_replace_reg_mode): Likewise.
(aarch64_sve::init_builtins, aarch64_sve::handle_arm_sve_h): Likewise.
(aarch64_sve::builtin_decl, aarch64_sve::builtin_type_p): Likewise.
(aarch64_sve::mangle_builtin_type): Likewise.
(aarch64_sve::resolve_overloaded_builtin): Likewise.
(aarch64_sve::check_builtin_call, aarch64_sve::gimple_fold_builtin)
(aarch64_sve::expand_builtin): Likewise.
* config/aarch64/aarch64.c (aarch64_sve_data_mode): Make public.
(aarch64_sve_int_mode): Likewise.
(aarch64_ptrue_all_mode): New function.
(aarch64_convert_sve_data_to_pred): Make public.
(svprfop_token): New function.
(aarch64_output_sve_prefetch): Likewise.
(aarch64_fold_sve_cnt_pat): Likewise.
(aarch64_output_sve_cnt_pat_immediate): Likewise.
(aarch64_sve_move_pred_via_while): Use gen_while with UNSPEC_WHILE_LO
instead of gen_while_ult.
(aarch64_replace_reg_mode): Make public.
(aarch64_init_builtins): Call aarch64_sve::init_builtins.
(aarch64_fold_builtin): Handle AARCH64_BUILTIN_SVE.
(aarch64_gimple_fold_builtin, aarch64_expand_builtin): Likewise.
(aarch64_builtin_decl, aarch64_builtin_reciprocal): Likewise.
(aarch64_mangle_type): Call aarch64_sve::mangle_type.
(aarch64_sve_sqadd_sqsub_immediate_p): New function.
(aarch64_sve_ptrue_svpattern_p): Likewise.
(aarch64_sve_pred_valid_immediate): Check
aarch64_sve_ptrue_svpattern_p.
(aarch64_sve_ldff1_operand_p, aarch64_sve_ldnf1_operand_p)
(aarch64_sve_prefetch_operand_p, aarch64_output_sve_ptrues): New
functions.
* config/aarch64/aarch64.md (UNSPEC_LDNT1_SVE, UNSPEC_STNT1_SVE)
(UNSPEC_LDFF1_GATHER, UNSPEC_PTRUE, UNSPEC_WHILE_LE, UNSPEC_WHILE_LS)
(UNSPEC_WHILE_LT, UNSPEC_CLASTA, UNSPEC_UPDATE_FFR)
(UNSPEC_UPDATE_FFRT, UNSPEC_RDFFR, UNSPEC_WRFFR)
(UNSPEC_SVE_LANE_SELECT, UNSPEC_SVE_CNT_PAT, UNSPEC_SVE_PREFETCH)
(UNSPEC_SVE_PREFETCH_GATHER, UNSPEC_SVE_COMPACT, UNSPEC_SVE_SPLICE):
New unspecs.
* config/aarch64/iterators.md (SI_ONLY, DI_ONLY, VNx8HI_ONLY)
(VNx2DI_ONLY, SVE_PARTIAL, VNx8_NARROW, VNx8_WIDE, VNx4_NARROW)
(VNx4_WIDE, VNx2_NARROW, VNx2_WIDE, PRED_HSD): New mode iterators.
(UNSPEC_ADR, UNSPEC_BRKA, UNSPEC_BRKB, UNSPEC_BRKN, UNSPEC_BRKPA)
(UNSPEC_BRKPB, UNSPEC_PFIRST, UNSPEC_PNEXT, UNSPEC_CNTP, UNSPEC_SADDV)
(UNSPEC_UADDV, UNSPEC_FMLA, UNSPEC_FMLS, UNSPEC_FEXPA, UNSPEC_FTMAD)
(UNSPEC_FTSMUL, UNSPEC_FTSSEL, UNSPEC_COND_CMPEQ_WIDE): New unspecs.
(UNSPEC_COND_CMPGE_WIDE, UNSPEC_COND_CMPGT_WIDE): Likewise.
(UNSPEC_COND_CMPHI_WIDE, UNSPEC_COND_CMPHS_WIDE): Likewise.
(UNSPEC_COND_CMPLE_WIDE, UNSPEC_COND_CMPLO_WIDE): Likewise.
(UNSPEC_COND_CMPLS_WIDE, UNSPEC_COND_CMPLT_WIDE): Likewise.
(UNSPEC_COND_CMPNE_WIDE, UNSPEC_COND_FCADD90, UNSPEC_COND_FCADD270)
(UNSPEC_COND_FCMLA, UNSPEC_COND_FCMLA90, UNSPEC_COND_FCMLA180)
(UNSPEC_COND_FCMLA270, UNSPEC_COND_FMAX, UNSPEC_COND_FMIN): Likewise.
(UNSPEC_COND_FMULX, UNSPEC_COND_FRECPX, UNSPEC_COND_FSCALE): Likewise.
(UNSPEC_LASTA, UNSPEC_ASHIFT_WIDE, UNSPEC_ASHIFTRT_WIDE): Likewise.
(UNSPEC_LSHIFTRT_WIDE, UNSPEC_LDFF1, UNSPEC_LDNF1): Likewise.
(Vesize): Handle partial vector modes.
(self_mask, narrower_mask, sve_lane_con, sve_lane_pair_con): New
mode attributes.
(UBINQOPS, ANY_PLUS, SAT_PLUS, ANY_MINUS, SAT_MINUS): New code
iterators.
(s, paired_extend, inc_dec): New code attributes.
(SVE_INT_ADDV, CLAST, LAST): New int iterators.
(SVE_INT_UNARY): Add UNSPEC_RBIT.
(SVE_FP_UNARY, SVE_FP_UNARY_INT): New int iterators.
(SVE_FP_BINARY, SVE_FP_BINARY_INT): Likewise.
(SVE_COND_FP_UNARY): Add UNSPEC_COND_FRECPX.
(SVE_COND_FP_BINARY): Add UNSPEC_COND_FMAX, UNSPEC_COND_FMIN and
UNSPEC_COND_FMULX.
(SVE_COND_FP_BINARY_INT, SVE_COND_FP_ADD): New int iterators.
(SVE_COND_FP_SUB, SVE_COND_FP_MUL): Likewise.
(SVE_COND_FP_BINARY_I1): Add UNSPEC_COND_FMAX and UNSPEC_COND_FMIN.
(SVE_COND_FP_BINARY_REG): Add UNSPEC_COND_FMULX.
(SVE_COND_FCADD, SVE_COND_FP_MAXMIN, SVE_COND_FCMLA)
(SVE_COND_INT_CMP_WIDE, SVE_FP_TERNARY_LANE, SVE_CFP_TERNARY_LANE)
(SVE_WHILE, SVE_SHIFT_WIDE, SVE_LDFF1_LDNF1, SVE_BRK_UNARY)
(SVE_BRK_BINARY, SVE_PITER): New int iterators.
(optab): Handle UNSPEC_SADDV, UNSPEC_UADDV, UNSPEC_FRECPE,
UNSPEC_FRECPS, UNSPEC_RSQRTE, UNSPEC_RSQRTS, UNSPEC_RBIT,
UNSPEC_SMUL_HIGHPART, UNSPEC_UMUL_HIGHPART, UNSPEC_FMLA, UNSPEC_FMLS,
UNSPEC_FCMLA, UNSPEC_FCMLA90, UNSPEC_FCMLA180, UNSPEC_FCMLA270,
UNSPEC_FEXPA, UNSPEC_FTSMUL, UNSPEC_FTSSEL, UNSPEC_COND_FCADD90,
UNSPEC_COND_FCADD270, UNSPEC_COND_FCMLA, UNSPEC_COND_FCMLA90,
UNSPEC_COND_FCMLA180, UNSPEC_COND_FCMLA270, UNSPEC_COND_FMAX,
UNSPEC_COND_FMIN, UNSPEC_COND_FMULX, UNSPEC_COND_FRECPX and
UNSPEC_COND_FSCALE.
(maxmin_uns): Handle UNSPEC_COND_FMAX and UNSPEC_COND_FMIN.
(binqops_op, binqops_op_rev, last_op): New int attributes.
(su): Handle UNSPEC_SADDV and UNSPEC_UADDV.
(fn, ab): New int attributes.
(cmp_op): Handle UNSPEC_COND_CMP*_WIDE and UNSPEC_WHILE_*.
(while_optab_cmp, brk_op, sve_pred_op): New int attributes.
(sve_int_op): Handle UNSPEC_SMUL_HIGHPART, UNSPEC_UMUL_HIGHPART,
UNSPEC_ASHIFT_WIDE, UNSPEC_ASHIFTRT_WIDE, UNSPEC_LSHIFTRT_WIDE and
UNSPEC_RBIT.
(sve_fp_op): Handle UNSPEC_FRECPE, UNSPEC_FRECPS, UNSPEC_RSQRTE,
UNSPEC_RSQRTS, UNSPEC_FMLA, UNSPEC_FMLS, UNSPEC_FEXPA, UNSPEC_FTSMUL,
UNSPEC_FTSSEL, UNSPEC_COND_FMAX, UNSPEC_COND_FMIN, UNSPEC_COND_FMULX,
UNSPEC_COND_FRECPX and UNSPEC_COND_FSCALE.
(sve_fp_op_rev): Handle UNSPEC_COND_FMAX, UNSPEC_COND_FMIN and
UNSPEC_COND_FMULX.
(rot): Handle UNSPEC_COND_FCADD* and UNSPEC_COND_FCMLA*.
(brk_reg_con, brk_reg_opno): New int attributes.
(sve_pred_fp_rhs1_operand, sve_pred_fp_rhs2_operand): Handle
UNSPEC_COND_FMAX, UNSPEC_COND_FMIN and UNSPEC_COND_FMULX.
(sve_pred_fp_rhs2_immediate): Handle UNSPEC_COND_FMAX and
UNSPEC_COND_FMIN.
(max_elem_bits): New int attribute.
(min_elem_bits): Handle UNSPEC_RBIT.
* config/aarch64/predicates.md (subreg_lowpart_operator): Handle
TRUNCATE as well as SUBREG.
(ascending_int_parallel, aarch64_simd_reg_or_minus_one)
(aarch64_sve_ldff1_operand, aarch64_sve_ldnf1_operand)
(aarch64_sve_prefetch_operand, aarch64_sve_ptrue_svpattern_immediate)
(aarch64_sve_qadd_immediate, aarch64_sve_qsub_immediate)
(aarch64_sve_gather_immediate_b, aarch64_sve_gather_immediate_h)
(aarch64_sve_gather_immediate_w, aarch64_sve_gather_immediate_d)
(aarch64_sve_sqadd_operand, aarch64_sve_gather_offset_b)
(aarch64_sve_gather_offset_h, aarch64_sve_gather_offset_w)
(aarch64_sve_gather_offset_d, aarch64_gather_scale_operand_b)
(aarch64_gather_scale_operand_h): New predicates.
* config/aarch64/constraints.md (UPb, UPd, UPh, UPw, Utf, Utn, vgb)
(vgd, vgh, vgw, vsQ, vsS): New constraints.
* config/aarch64/aarch64-sve.md: Add a note on the FFR handling.
(*aarch64_sve_reinterpret<mode>): Allow any source register
instead of requiring an exact match.
(*aarch64_sve_ptruevnx16bi_cc, *aarch64_sve_ptrue<mode>_cc)
(*aarch64_sve_ptruevnx16bi_ptest, *aarch64_sve_ptrue<mode>_ptest)
(aarch64_wrffr, aarch64_update_ffr_for_load, aarch64_copy_ffr_to_ffrt)
(aarch64_rdffr, aarch64_rdffr_z, *aarch64_rdffr_z_ptest)
(*aarch64_rdffr_ptest, *aarch64_rdffr_z_cc, *aarch64_rdffr_cc)
(aarch64_update_ffrt): New patterns.
(@aarch64_load_<ANY_EXTEND:optab><VNx8_WIDE:mode><VNx8_NARROW:mode>)
(@aarch64_load_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>)
(@aarch64_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>)
(@aarch64_ld<fn>f1<mode>): New patterns.
(@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx8_WIDE:mode><VNx8_NARROW:mode>)
(@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>)
(@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>)
(@aarch64_ldnt1<mode>): New patterns.
(gather_load<mode>): Use aarch64_sve_gather_offset_<Vesize> for
the scalar part of the address.
(mask_gather_load<SVE_S:mode>): Use aarch64_sve_gather_offset_w for the
scalar part of the addresse and add an alternative for handling
nonzero offsets.
(mask_gather_load<SVE_D:mode>): Likewise aarch64_sve_gather_offset_d.
(*mask_gather_load<mode>_sxtw, *mask_gather_load<mode>_uxtw)
(@aarch64_gather_load_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>)
(@aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>)
(*aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_sxtw)
(*aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_uxtw)
(@aarch64_ldff1_gather<SVE_S:mode>, @aarch64_ldff1_gather<SVE_D:mode>)
(*aarch64_ldff1_gather<mode>_sxtw, *aarch64_ldff1_gather<mode>_uxtw)
(@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>)
(@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>)
(*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_sxtw)
(*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_uxtw)
(@aarch64_sve_prefetch<mode>): New patterns.
(@aarch64_sve_gather_prefetch<SVE_I:mode><VNx4SI_ONLY:mode>)
(@aarch64_sve_gather_prefetch<SVE_I:mode><VNx2DI_ONLY:mode>)
(*aarch64_sve_gather_prefetch<SVE_I:mode><VNx2DI_ONLY:mode>_sxtw)
(*aarch64_sve_gather_prefetch<SVE_I:mode><VNx2DI_ONLY:mode>_uxtw)
(@aarch64_store_trunc<VNx8_NARROW:mode><VNx8_WIDE:mode>)
(@aarch64_store_trunc<VNx4_NARROW:mode><VNx4_WIDE:mode>)
(@aarch64_store_trunc<VNx2_NARROW:mode><VNx2_WIDE:mode>)
(@aarch64_stnt1<mode>): New patterns.
(scatter_store<mode>): Use aarch64_sve_gather_offset_<Vesize> for
the scalar part of the address.
(mask_scatter_store<SVE_S:mode>): Use aarch64_sve_gather_offset_w for
the scalar part of the addresse and add an alternative for handling
nonzero offsets.
(mask_scatter_store<SVE_D:mode>): Likewise aarch64_sve_gather_offset_d.
(*mask_scatter_store<mode>_sxtw, *mask_scatter_store<mode>_uxtw)
(@aarch64_scatter_store_trunc<VNx4_NARROW:mode><VNx4_WIDE:mode>)
(@aarch64_scatter_store_trunc<VNx2_NARROW:mode><VNx2_WIDE:mode>)
(*aarch64_scatter_store_trunc<VNx2_NARROW:mode><VNx2_WIDE:mode>_sxtw)
(*aarch64_scatter_store_trunc<VNx2_NARROW:mode><VNx2_WIDE:mode>_uxtw):
New patterns.
(vec_duplicate<mode>): Use QI as the mode of the input operand.
(extract_last_<mode>): Generalize to...
(@extract_<LAST:last_op>_<mode>): ...this.
(*<SVE_INT_UNARY:optab><mode>2): Rename to...
(@aarch64_pred_<SVE_INT_UNARY:optab><mode>): ...this.
(@cond_<SVE_INT_UNARY:optab><mode>): New expander.
(@aarch64_pred_sxt<SVE_HSDI:mode><SVE_PARTIAL:mode>): New pattern.
(@aarch64_cond_sxt<SVE_HSDI:mode><SVE_PARTIAL:mode>): Likewise.
(@aarch64_pred_cnot<mode>, @cond_cnot<mode>): New expanders.
(@aarch64_sve_<SVE_FP_UNARY_INT:optab><mode>): New pattern.
(@aarch64_sve_<SVE_FP_UNARY:optab><mode>): Likewise.
(*<SVE_COND_FP_UNARY:optab><mode>2): Rename to...
(@aarch64_pred_<SVE_COND_FP_UNARY:optab><mode>): ...this.
(@cond_<SVE_COND_FP_UNARY:optab><mode>): New expander.
(*<SVE_INT_BINARY_IMM:optab><mode>3): Rename to...
(@aarch64_pred_<SVE_INT_BINARY_IMM:optab><mode>): ...this.
(@aarch64_adr<mode>, *aarch64_adr_sxtw): New patterns.
(*aarch64_adr_uxtw_unspec): Likewise.
(*aarch64_adr_uxtw): Rename to...
(*aarch64_adr_uxtw_and): ...this.
(@aarch64_adr<mode>_shift): New expander.
(*aarch64_adr_shift_sxtw): New pattern.
(aarch64_<su>abd<mode>_3): Rename to...
(@aarch64_pred_<su>abd<mode>): ...this.
(<su>abd<mode>_3): Update accordingly.
(@aarch64_cond_<su>abd<mode>): New expander.
(@aarch64_<SBINQOPS:su_optab><optab><mode>): New pattern.
(@aarch64_<UBINQOPS:su_optab><optab><mode>): Likewise.
(*<su>mul<mode>3_highpart): Rename to...
(@aarch64_pred_<optab><mode>): ...this.
(@cond_<MUL_HIGHPART:optab><mode>): New expander.
(*cond_<MUL_HIGHPART:optab><mode>_2): New pattern.
(*cond_<MUL_HIGHPART:optab><mode>_z): Likewise.
(*<SVE_INT_BINARY_SD:optab><mode>3): Rename to...
(@aarch64_pred_<SVE_INT_BINARY_SD:optab><mode>): ...this.
(cond_<SVE_INT_BINARY_SD:optab><mode>): Add a "@" marker.
(@aarch64_bic<mode>, @cond_bic<mode>): New expanders.
(*v<ASHIFT:optab><mode>3): Rename to...
(@aarch64_pred_<ASHIFT:optab><mode>): ...this.
(@aarch64_sve_<SVE_SHIFT_WIDE:sve_int_op><mode>): New pattern.
(@cond_<SVE_SHIFT_WIDE:sve_int_op><mode>): New expander.
(*cond_<SVE_SHIFT_WIDE:sve_int_op><mode>_m): New pattern.
(*cond_<SVE_SHIFT_WIDE:sve_int_op><mode>_z): Likewise.
(@cond_asrd<mode>): New expander.
(*cond_asrd<mode>_2, *cond_asrd<mode>_z): New patterns.
(sdiv_pow2<mode>3): Expand to *cond_asrd<mode>_2.
(*sdiv_pow2<mode>3): Delete.
(@cond_<SVE_COND_FP_BINARY_INT:optab><mode>): New expander.
(*cond_<SVE_COND_FP_BINARY_INT:optab><mode>_2): New pattern.
(*cond_<SVE_COND_FP_BINARY_INT:optab><mode>_any): Likewise.
(@aarch64_sve_<SVE_FP_BINARY:optab><mode>): New pattern.
(@aarch64_sve_<SVE_FP_BINARY_INT:optab><mode>): Likewise.
(*<SVE_COND_FP_BINARY_REG:optab><mode>3): Rename to...
(@aarch64_pred_<SVE_COND_FP_BINARY_REG:optab><mode>): ...this.
(@aarch64_pred_<SVE_COND_FP_BINARY_INT:optab><mode>): New pattern.
(cond_<SVE_COND_FP_BINARY:optab><mode>): Add a "@" marker.
(*add<SVE_F:mode>3): Rename to...
(@aarch64_pred_add<SVE_F:mode>): ...this and add alternatives
for SVE_STRICT_GP.
(@aarch64_pred_<SVE_COND_FCADD:optab><mode>): New pattern.
(@cond_<SVE_COND_FCADD:optab><mode>): New expander.
(*cond_<SVE_COND_FCADD:optab><mode>_2): New pattern.
(*cond_<SVE_COND_FCADD:optab><mode>_any): Likewise.
(*sub<SVE_F:mode>3): Rename to...
(@aarch64_pred_sub<SVE_F:mode>): ...this and add alternatives
for SVE_STRICT_GP.
(@aarch64_pred_abd<SVE_F:mode>): New expander.
(*fabd<SVE_F:mode>3): Rename to...
(*aarch64_pred_abd<SVE_F:mode>): ...this.
(@aarch64_cond_abd<SVE_F:mode>): New expander.
(*mul<SVE_F:mode>3): Rename to...
(@aarch64_pred_<SVE_F:optab><mode>): ...this and add alternatives
for SVE_STRICT_GP.
(@aarch64_mul_lane_<SVE_F:mode>): New pattern.
(*<SVE_COND_FP_MAXMIN_PUBLIC:optab><mode>3): Rename and generalize
to...
(@aarch64_pred_<SVE_COND_FP_MAXMIN:optab><mode>): ...this.
(*<LOGICAL:optab><PRED_ALL:mode>3_ptest): New pattern.
(*<nlogical><PRED_ALL:mode>3): Rename to...
(aarch64_pred_<nlogical><PRED_ALL:mode>_z): ...this.
(*<nlogical><PRED_ALL:mode>3_cc): New pattern.
(*<nlogical><PRED_ALL:mode>3_ptest): Likewise.
(*<logical_nn><PRED_ALL:mode>3): Rename to...
(aarch64_pred_<logical_nn><mode>_z): ...this.
(*<logical_nn><PRED_ALL:mode>3_cc): New pattern.
(*<logical_nn><PRED_ALL:mode>3_ptest): Likewise.
(*fma<SVE_I:mode>4): Rename to...
(@aarch64_pred_fma<SVE_I:mode>): ...this.
(*fnma<SVE_I:mode>4): Rename to...
(@aarch64_pred_fnma<SVE_I:mode>): ...this.
(@aarch64_<sur>dot_prod_lane<vsi2qi>): New pattern.
(*<SVE_FP_TERNARY:optab><mode>4): Rename to...
(@aarch64_pred_<SVE_FP_TERNARY:optab><mode>): ...this.
(cond_<SVE_FP_TERNARY:optab><mode>): Add a "@" marker.
(@aarch64_<SVE_FP_TERNARY_LANE:optab>_lane_<mode>): New pattern.
(@aarch64_pred_<SVE_COND_FCMLA:optab><mode>): Likewise.
(@cond_<SVE_COND_FCMLA:optab><mode>): New expander.
(*cond_<SVE_COND_FCMLA:optab><mode>_4): New pattern.
(*cond_<SVE_COND_FCMLA:optab><mode>_any): Likewise.
(@aarch64_<FCMLA:optab>_lane_<mode>): Likewise.
(@aarch64_sve_tmad<mode>): Likewise.
(vcond_mask_<SVE_ALL:mode><vpred>): Add a "@" marker.
(*aarch64_sel_dup<mode>): Rename to...
(@aarch64_sel_dup<mode>): ...this.
(@aarch64_pred_cmp<cmp_op><SVE_I:mode>_wide): New pattern.
(*aarch64_pred_cmp<cmp_op><SVE_I:mode>_wide_cc): Likewise.
(*aarch64_pred_cmp<cmp_op><SVE_I:mode>_wide_ptest): Likewise.
(@while_ult<GPI:mode><PRED_ALL:mode>): Generalize to...
(@while_<while_optab_cmp><GPI:mode><PRED_ALL:mode>): ...this.
(*while_ult<GPI:mode><PRED_ALL:mode>_cc): Generalize to.
(*while_<while_optab_cmp><GPI:mode><PRED_ALL:mode>_cc): ...this.
(*while_<while_optab_cmp><GPI:mode><PRED_ALL:mode>_ptest): New pattern.
(*fcm<cmp_op><mode>): Rename to...
(@aarch64_pred_fcm<cmp_op><mode>): ...this.  Make operand order
match @aarch64_pred_cmp<cmp_op><SVE_I:mode>.
(*fcmuo<mode>): Rename to...
(@aarch64_pred_fcmuo<mode>): ...this.  Make operand order
match @aarch64_pred_cmp<cmp_op><SVE_I:mode>.
(@aarch64_pred_fac<cmp_op><mode>): New expander.
(@vcond_mask_<PRED_ALL:mode><mode>): New pattern.
(fold_extract_last_<mode>): Generalize to...
(@fold_extract_<last_op>_<mode>): ...this.
(@aarch64_fold_extract_vector_<last_op>_<mode>): New pattern.
(*reduc_plus_scal_<SVE_I:mode>): Replace with...
(@aarch64_pred_reduc_<optab>_<mode>): ...this pattern, making the
DImode result explicit.
(reduc_plus_scal_<mode>): Update accordingly.
(*reduc_<optab>_scal_<SVE_I:mode>): Rename to...
(@aarch64_pred_reduc_<optab>_<SVE_I:mode>): ...this.
(*reduc_<optab>_scal_<SVE_F:mode>): Rename to...
(@aarch64_pred_reduc_<optab>_<SVE_F:mode>): ...this.
(*aarch64_sve_tbl<mode>): Rename to...
(@aarch64_sve_tbl<mode>): ...this.
(@aarch64_sve_compact<mode>): New pattern.
(*aarch64_sve_dup_lane<mode>): Rename to...
(@aarch64_sve_dup_lane<mode>): ...this.
(@aarch64_sve_dupq_lane<mode>): New pattern.
(@aarch64_sve_splice<mode>): Likewise.
(aarch64_sve_<perm_insn><mode>): Rename to...
(@aarch64_sve_<perm_insn><mode>): ...this.
(*aarch64_sve_ext<mode>): Rename to...
(@aarch64_sve_ext<mode>): ...this.
(aarch64_sve_<su>unpk<perm_hilo>_<SVE_BHSI:mode>): Add a "@" marker.
(*aarch64_sve_<optab>_nontrunc<SVE_F:mode><SVE_HSDI:mode>): Rename
to...
(@aarch64_sve_<optab>_nontrunc<SVE_F:mode><SVE_HSDI:mode>): ...this.
(*aarch64_sve_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>):
Rename to...
(@aarch64_sve_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>):
...this.
(@cond_<optab>_nontrunc<SVE_F:mode><SVE_HSDI:mode>): New expander.
(@cond_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>): Likewise.
(*cond_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>): New pattern.
(*aarch64_sve_<optab>_nonextend<SVE_HSDI:mode><SVE_F:mode>): Rename
to...
(@aarch64_sve_<optab>_nonextend<SVE_HSDI:mode><SVE_F:mode>): ...this.
(aarch64_sve_<optab>_extend<VNx4SI_ONLY:mode><VNx2DF_ONLY:mode>): Add
a "@" marker.
(@cond_<optab>_nonextend<SVE_HSDI:mode><SVE_F:mode>): New expander.
(@cond_<optab>_extend<VNx4SI_ONLY:mode><VNx2DF_ONLY:mode>): Likewise.
(*cond_<optab>_extend<VNx4SI_ONLY:mode><VNx2DF_ONLY:mode>): New
pattern.
(*aarch64_sve_<optab>_trunc<SVE_SDF:mode><SVE_HSF:mode>): Rename to...
(@aarch64_sve_<optab>_trunc<SVE_SDF:mode><SVE_HSF:mode>): ...this.
(@cond_<optab>_trunc<SVE_SDF:mode><SVE_HSF:mode>): New expander.
(*cond_<optab>_trunc<SVE_SDF:mode><SVE_HSF:mode>): New pattern.
(aarch64_sve_<optab>_nontrunc<SVE_HSF:mode><SVE_SDF:mode>): Add a
"@" marker.
(@cond_<optab>_nontrunc<SVE_HSF:mode><SVE_SDF:mode>): New expander.
(*cond_<optab>_nontrunc<SVE_HSF:mode><SVE_SDF:mode>): New pattern.
(aarch64_sve_punpk<perm_hilo>_<mode>): Add a "@" marker.
(@aarch64_brk<SVE_BRK_UNARY:brk_op>): New pattern.
(*aarch64_brk<SVE_BRK_UNARY:brk_op>_cc): Likewise.
(*aarch64_brk<SVE_BRK_UNARY:brk_op>_ptest): Likewise.
(@aarch64_brk<SVE_BRK_BINARY:brk_op>): Likewise.
(*aarch64_brk<SVE_BRK_BINARY:brk_op>_cc): Likewise.
(*aarch64_brk<SVE_BRK_BINARY:brk_op>_ptest): Likewise.
(@aarch64_sve_<SVE_PITER:sve_pred_op><mode>): Likewise.
(*aarch64_sve_<SVE_PITER:sve_pred_op><mode>_cc): Likewise.
(*aarch64_sve_<SVE_PITER:sve_pred_op><mode>_ptest): Likewise.
(aarch64_sve_cnt_pat): Likewise.
(@aarch64_sve_<ANY_PLUS:inc_dec><DI_ONLY:mode>_pat): Likewise.
(*aarch64_sve_incsi_pat): Likewise.
(@aarch64_sve_<SAT_PLUS:inc_dec><SI_ONLY:mode>_pat): Likewise.
(@aarch64_sve_<ANY_PLUS:inc_dec><VNx2DI_ONLY:mode>_pat): Likewise.
(@aarch64_sve_<ANY_PLUS:inc_dec><VNx4SI_ONLY:mode>_pat): Likewise.
(@aarch64_sve_<ANY_PLUS:inc_dec><VNx8HI_ONLY:mode>_pat): New expander.
(*aarch64_sve_<ANY_PLUS:inc_dec><VNx8HI_ONLY:mode>_pat): New pattern.
(@aarch64_sve_<ANY_MINUS:inc_dec><DI_ONLY:mode>_pat): Likewise.
(*aarch64_sve_decsi_pat): Likewise.
(@aarch64_sve_<SAT_MINUS:inc_dec><SI_ONLY:mode>_pat): Likewise.
(@aarch64_sve_<ANY_MINUS:inc_dec><VNx2DI_ONLY:mode>_pat): Likewise.
(@aarch64_sve_<ANY_MINUS:inc_dec><VNx4SI_ONLY:mode>_pat): Likewise.
(@aarch64_sve_<ANY_MINUS:inc_dec><VNx8HI_ONLY:mode>_pat): New expander.
(*aarch64_sve_<ANY_MINUS:inc_dec><VNx8HI_ONLY:mode>_pat): New pattern.
(@aarch64_pred_cntp<mode>): Likewise.
(@aarch64_sve_<ANY_PLUS:inc_dec><DI_ONLY:mode><PRED_ALL:mode>_cntp):
New expander.
(*aarch64_sve_<ANY_PLUS:inc_dec><DI_ONLY:mode><PRED_ALL:mode>_cntp)
(*aarch64_incsi<PRED_ALL:mode>_cntp): New patterns.
(@aarch64_sve_<SAT_PLUS:inc_dec><SI_ONLY:mode><PRED_ALL:mode>_cntp):
New expander.
(*aarch64_sve_<SAT_PLUS:inc_dec><SI_ONLY:mode><PRED_ALL:mode>_cntp):
New pattern.
(@aarch64_sve_<ANY_PLUS:inc_dec><VNx2DI_ONLY:mode>_cntp): New expander.
(*aarch64_sve_<ANY_PLUS:inc_dec><VNx2DI_ONLY:mode>_cntp): New pattern.
(@aarch64_sve_<ANY_PLUS:inc_dec><VNx4SI_ONLY:mode>_cntp): New expander.
(*aarch64_sve_<ANY_PLUS:inc_dec><VNx4SI_ONLY:mode>_cntp): New pattern.
(@aarch64_sve_<ANY_PLUS:inc_dec><VNx8HI_ONLY:mode>_cntp): New expander.
(*aarch64_sve_<ANY_PLUS:inc_dec><VNx8HI_ONLY:mode>_cntp): New pattern.
(@aarch64_sve_<ANY_MINUS:inc_dec><DI_ONLY:mode><PRED_ALL:mode>_cntp):
New expander.
(*aarch64_sve_<ANY_MINUS:inc_dec><DI_ONLY:mode><PRED_ALL:mode>_cntp)
(*aarch64_incsi<PRED_ALL:mode>_cntp): New patterns.
(@aarch64_sve_<SAT_MINUS:inc_dec><SI_ONLY:mode><PRED_ALL:mode>_cntp):
New expander.
(*aarch64_sve_<SAT_MINUS:inc_dec><SI_ONLY:mode><PRED_ALL:mode>_cntp):
New pattern.
(@aarch64_sve_<ANY_MINUS:inc_dec><VNx2DI_ONLY:mode>_cntp): New
expander.
(*aarch64_sve_<ANY_MINUS:inc_dec><VNx2DI_ONLY:mode>_cntp): New pattern.
(@aarch64_sve_<ANY_MINUS:inc_dec><VNx4SI_ONLY:mode>_cntp): New
expander.
(*aarch64_sve_<ANY_MINUS:inc_dec><VNx4SI_ONLY:mode>_cntp): New pattern.
(@aarch64_sve_<ANY_MINUS:inc_dec><VNx8HI_ONLY:mode>_cntp): New
expander.
(*aarch64_sve_<ANY_MINUS:inc_dec><VNx8HI_ONLY:mode>_cntp): New pattern.
* config/aarch64/arm_sve.h: New file.
* config/aarch64/aarch64-sve-builtins.h: Likewise.
* config/aarch64/aarch64-sve-builtins.cc: Likewise.
* config/aarch64/aarch64-sve-builtins.def: Likewise.
* config/aarch64/aarch64-sve-builtins-base.h: Likewise.
* config/aarch64/aarch64-sve-builtins-base.cc: Likewise.
* config/aarch64/aarch64-sve-builtins-base.def: Likewise.
* config/aarch64/aarch64-sve-builtins-functions.h: Likewise.
* config/aarch64/aarch64-sve-builtins-shapes.h: Likewise.
* config/aarch64/aarch64-sve-builtins-shapes.cc: Likewise.

gcc/testsuite/
* g++.target/aarch64/sve/acle/aarch64-sve-acle.exp: New file.
* g++.target/aarch64/sve/acle/general-c++: New test directory.
* gcc.target/aarch64/sve/acle/aarch64-sve-acle.exp: New file.
* gcc.target/aarch64/sve/acle/general: New test directory.
* gcc.target/aarch64/sve/acle/general-c: Likewise.

Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
Co-Authored-By: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
From-SVN: r277563

[AArch64] Extend SVE reverse permutes to predicates

This is tested by the main SVE ACLE patches, but since it affects
the evpc routines, it seemed worth splitting out.

2019-10-29 Richard Sandiford <richard.sandiford@arm.com>

gcc/
* config/aarch64/aarch64-sve.md (@aarch64_sve_rev<PRED_ALL:mode>):
New pattern.
* config/aarch64/aarch64.c (aarch64_evpc_rev_global): Handle all
SVE modes.

From-SVN: r277562

[AArch64] Add FFR and FFRT registers

This patch adds the First Fault Register to the AArch64 port, as well
as a fake register known as the FFR Token or FFRT. The main ACLE
patch explains what the FFRT does and how it works.

2019-10-29 Richard Sandiford <richard.sandiford@arm.com>

gcc/
* config/aarch64/aarch64.md (FFR_REGNUM, FFRT_REGNUM): New constants.
* config/aarch64/aarch64.h (FIRST_PSEUDO_REGISTER): Bump to
FFRT_REGNUM + 1.
(FFR_REGS, PR_AND_FFR_REGS): New register classes.
(REG_CLASS_NAMES, REG_CLASS_CONTENTS): Add entries for them.
* config/aarch64/aarch64.c (pr_or_ffr_regnum_p): New function.
(aarch64_hard_regno_nregs): Handle the new register classes.
(aarch64_hard_regno_mode_ok): Likewise.
(aarch64_regno_regclass): Likewise.
(aarch64_class_max_nregs): Likewise.
(aarch64_register_move_cost): Likewise.
(aarch64_conditional_register_usage): Don't treat FFR and FFRT
as general register_operands.

From-SVN: r277561

Fix unsigned type overflow in memory report.

2019-10-29 Martin Liska <mliska@suse.cz>

* ggc-common.c: One can't subtract unsigned types
in compare function.

From-SVN: r277560

Print header in dump_memory_report.

2019-10-29  Martin Liska  <mliska@suse.cz>

* cgraphunit.c (symbol_table::compile): Pass
title as dump_memory_report argument.
* toplev.c (dump_memory_report):  New argument.
(finalize): Pass new argument.
* toplev.h (dump_memory_report): Add argument.
2019-10-29  Martin Liska  <mliska@suse.cz>

* lto.c (do_whole_program_analysis): Pass
title as dump_memory_report argument.

From-SVN: r277559

Move Leak in GCC memory report to the first column.

2019-10-29 Martin Liska <mliska@suse.cz>

* ggc-common.c: Move Leak to the first column.

From-SVN: r277558

Remove misleading sorting function in ggc memory report.

2019-10-29  Martin Liska  <mliska@suse.cz>

* cgraphunit.c (symbol_table::compile): Remove argument
for dump_memory_report.
* ggc-common.c (dump_ggc_loc_statistics): Likewise.
(compare_final): Remove in order to make report
better readable.
* ggc.h (dump_ggc_loc_statistics):  Remove argument.
* mem-stats.h (mem_alloc_description::get_list):
Do not pass cmp.
(mem_alloc_description::dump): Likewise here.
* toplev.c (dump_memory_report): Remove final
argument.
(finalize): Likewise.
* toplev.h (dump_memory_report): Remove argument.
2019-10-29  Martin Liska  <mliska@suse.cz>

* lto.c (do_whole_program_analysis): Remove argument.

From-SVN: r277557

[AArch64] Handle scalars in cmp and shift immediate queries

The SVE ACLE has convenience functions that take scalar arguments
instead of vectors.  This patch makes it easier to implement the shift
and compare functions by making the associated immediate queries work
for scalar immediates as well as vector duplicates of them.

The "const" codes in the predicates were a holdover from an early
version of the SVE port in which we used (const ...) wrappers for
variable-length vector constants.  I'll remove other instances
of them in a separate patch.

2019-10-29  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
* config/aarch64/aarch64.c (aarch64_sve_cmp_immediate_p)
(aarch64_simd_shift_imm_p): Accept scalars as well as vectors.
* config/aarch64/predicates.md (aarch64_sve_cmp_vsc_immediate)
(aarch64_sve_cmp_vsd_immediate): Accept "const_int", but don't
accept "const".

From-SVN: r277556

Add a simulate_enum_decl langhook

Similarly to the simulate_builtin_function_decl patch, this one
adds a hook for simulating an enum declaration in the source
language. Again, the main SVE ACLE patch has tests for various
error conditions.

2019-10-29 Richard Sandiford <richard.sandiford@arm.com>

gcc/
* coretypes.h (string_int_pair): New typedef.
* langhooks-def.h (LANG_HOOKS_SIMULATE_ENUM_DECL): Define.
(LANG_HOOKS_FOR_TYPES_INITIALIZER): Include it.
* langhooks.h (lang_hooks_for_types::simulate_enum_decl): New hook.

gcc/c/
* c-tree.h (c_simulate_enum_decl): Declare.
* c-decl.c (c_simulate_enum_decl): New function.
* c-objc-common.h (LANG_HOOKS_SIMULATE_ENUM_DECL): Define to the above.

gcc/cp/
* cp-objcp-common.h (cxx_simulate_enum_decl): Declare.
(LANG_HOOKS_SIMULATE_ENUM_DECL): Define to the above.
* decl.c (cxx_simulate_enum_decl): New function.

From-SVN: r277555

Add a simulate_builin_function_decl langhook

Although it's possible to define the SVE intrinsics in a normal header
file, it's much more convenient to define them directly in the compiler.
This also speeds up compilation and gives better error messages.

The idea is therefore for arm_sve.h (the main intrinsics header file)
to have the pragma:

    #pragma GCC aarch64 "arm_sve.h"

telling GCC to define (almost) everything arm_sve.h needs to define.
The target then needs a way of injecting new built-in function
declarations during compilation.

The main hook for defining built-in functions is add_builtin_function.
This is designed for use at start-up, and so has various features that
are correct in that context but not for the pragma above:

  (1) the location is always BUILTINS_LOCATION, whereas for arm_sve.h
      it ought to be the location of the pragma.

  (2) the function is only immediately visible if it's in the implementation
      namespace, whereas the pragma is deliberately injecting functions
      into the general namespace.

  (3) there's no attempt to emulate a normal function declaration in
      C or C++, whereas functions declared by the pragma should be
      checked in the same way as an open-coded declaration would be.
      E.g. we should get an error if there was a previous incompatible
      declaration.

  (4) in C++, the function is treated as extern "C" and so can't be
      overloaded, whereas SVE intrinsics do use function overloading.

This patch therefore adds a hook that targets can use to inject
the equivalent of a source-level function declaration, but bound
to a BUILT_IN_MD function.

The main SVE intrinsic patch has tests to make sure that we report an
error for conflicting definitions that appear either before or after
including arm_sve.h.

2019-10-29  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
* langhooks.h (lang_hooks::simulate_builtin_function_decl): New hook.
(simulate_builtin_function_decl): Declare.
* langhooks-def.h (LANG_HOOKS_SIMULATE_BUILTIN_FUNCTION_DECL): Define.
(LANG_HOOKS_INITIALIZER): Include it.
* langhooks.c (add_builtin_function_common): Rename to...
(build_builtin_function): ...this.  Add a location parameter and use
it instead of BUILTINS_LOCATION.  Remove the hook parameter and return
the decl instead.
(add_builtin_function): Update accordingly, passing the returned
decl to the lang hook.
(add_builtin_function_ext_scope): Likewise
(simulate_builtin_function_decl): New function.

gcc/c/
* c-tree.h (c_simulate_builtin_function_decl): Declare.
* c-decl.c (c_simulate_builtin_function_decl): New function.
* c-objc-common.h (LANG_HOOKS_SIMULATE_BUILTIN_FUNCTION_DECL): Define
to the above.

gcc/cp/
* cp-tree.h (cxx_simulate_builtin_function_decl): Declare.
* decl.c (cxx_simulate_builtin_function_decl): New function.
* cp-objcp-common.h (LANG_HOOKS_SIMULATE_BUILTIN_FUNCTION_DECL): Define
to the above.

From-SVN: r277554

re PR tree-optimization/92241 (ice in vect_mark_pattern_st mts, at tree-vect-patterns.c:5175)

2019-10-29 Richard Biener <rguenther@suse.de>

PR tree-optimization/92241
* gcc.dg/torture/pr92241-2.c: New testcase.

From-SVN: r277553

install.texi (--enable-offload-targets): Fix up a typo in the example, use actual names of supported offload targets.

* doc/install.texi (--enable-offload-targets): Fix up a typo in the
example, use actual names of supported offload targets.

From-SVN: r277552

re PR target/92258 (ICE: output_operand: invalid %-code)

PR target/92258
* config/i386/sse.md (iptr): Revert 2019-10-27 change.

* gcc.target/i386/pr92258.c: New test.

From-SVN: r277551

Daily bump.

From-SVN: r277550

tree-ssa-strlen.c (get_addr_stridx): Add argument and use it.

gcc/ChangeLog:

* tree-ssa-strlen.c (get_addr_stridx): Add argument and use it.
(handle_store): Pass argument to get_addr_stridx.

gcc/testsuite/ChangeLog:

* gcc.dg/strlenopt-89.c: New test.
* gcc.dg/strlenopt-90.c: New test.
* gcc.dg/Wstringop-overflow-20.c: New test.

From-SVN: r277546

PR tree-optimization/92226 - live nul char store to array eliminated

gcc/testsuite/ChangeLog:

PR tree-optimization/92226
* gcc.dg/strlenopt-88.c: New test.

gcc/ChangeLog:

PR tree-optimization/92226
* tree-ssa-strlen.c (compare_nonzero_chars): Return -1 also when
the offset is in the open range outlined by SI's length.

From-SVN: r277545

PR c/66970 - Add __has_builtin() macro

gcc/ChangeLog:

PR c/66970
* doc/cpp.texi (__has_builtin): Document.
* doc/extend.texi (__builtin_frob_return_addr): Correct spelling.

gcc/c/ChangeLog:

PR c/66970
* c-decl.c (names_builtin_p): Define a new function.

gcc/c-family/ChangeLog:

PR c/66970
* c-common.c (c_common_nodes_and_builtins): Call c_define_builtins
even when only preprocessing.
* c-common.h (names_builtin_p): Declare new function.
* c-lex.c (init_c_lex): Set has_builtin.
(c_common_has_builtin): Define a new function.
* c-ppoutput.c (init_pp_output): Set has_builtin.

gcc/cp/ChangeLog:

PR c/66970
* cp-objcp-common.c (names_builtin_p): Define new function.

gcc/testsuite/ChangeLog:

PR c/66970
* c-c++-common/cpp/has-builtin-2.c: New test.
* c-c++-common/cpp/has-builtin-3.c: New test.
* c-c++-common/cpp/has-builtin.c: New test.

From-SVN: r277544

re PR target/82981 (unnecessary __multi3 call for mips64r6 linux kernel)

PR target/82981
        * config/mips/mips.md (<u>mulditi3): Generate patterns for high
        doubleword and low doubleword result of multiplication on
        MIPS64R6.

        * gcc.target/mips/mips64r6-ti-mult.c: New test.

From-SVN: r277537

cp-demangle.c (d_print_mod): Add a space before printing `complex` and `imaginary`, as opposed to after.

* cp-demangle.c (d_print_mod): Add a space before printing `complex`
and `imaginary`, as opposed to after.
* testsuite/demangle-expected: Adjust test.

From-SVN: r277535

mips.c (DIRECT_BUILTIN_PURE): New macro.

        * config/mips/mips.c (DIRECT_BUILTIN_PURE): New macro. Add a
        pure qualifier to the built-in.
        (MSA_BUILTIN_PURE): New macro. Add a pure qualifier to the MSA
        built-ins.
        (struct mips_builtin_description): Add is_pure flag.
        (mips_init_builtins): Mark built-in as pure if the flag in the
        corresponding mips_builtin_description struct is set.

        * gcc.target/mips/mips-builtins-pure.c: New test.

From-SVN: r277534

mips-msa.md (msa_insert_<msaftm_f>): Add an alternative which covers the floating-point input value.

        * config/mips/mips-msa.md (msa_insert_<msaftm_f>): Add an
        alternative which covers the floating-point input value. Also
        forbid the split of insert.d pattern for floating-point values.

        * gcc.target/mips/msa-insert-split.c: New test.

From-SVN: r277533

gcc/riscv: Add a mechanism to remove some calls to _riscv_save_0

When using the -msave-restore flag we end up with calls to
_riscv_save_0 and _riscv_restore_0.  These functions adjust the stack
and save or restore the return address.  Due to grouping multiple
save/restore stub functions together the save/restore 0 calls actually
save s0, s1, s2, and the return address, but only the return address
actually matters.  Leaf functions don't call the save/restore stubs,
so whenever we do see a call to the save/restore stubs, the store of
the return address is required.

If we look in gcc/config/riscv/riscv.c at the function
riscv_expand_prologue and riscv_expand_epilogue we can see that it
would be reasonably easy to adjust these functions to avoid the calls
to the save/restore stubs for those cases where we are about to call
_riscv_save_0 and _riscv_restore_0, however, the actual code size
saving this would give is debatable, with linker relaxation, the calls
to save/restore are often just 4-bytes, and can sometimes even be
2-bytes, while leaving the stack adjust and return address save inline
is always going to be 4-bytes.

The interesting case is when we call _riscv_save_0 and
_riscv_restore_0, and also have a frame that would (without
save/restore) have resulted in a tail call.  In this case if we could
remove the save/restore calls, and restore the tail call then we would
get a real size saving.

The problem is that the choice of generating a tail call or not is
done during the gimple expand pass, at which point we don't know how
many registers we need to save (or restore).

The solution presented in this patch offers a partial solution to this
problem.  By using the TARGET_MACHINE_DEPENDENT_REORG pass to
implement a very limited pattern matching we identify functions that
call _riscv_save_0 and _riscv_restore_0, and which could be converted
to make use of a tail call.  These functions are then converted to the
non save/restore tail call form.

This should result in a code size reduction when compiling with -Os
and with the -msave-restore flag.

gcc/ChangeLog:

        * config.gcc: Add riscv-sr.o to extra_objs for riscv.
        * config/riscv/riscv-sr.c: New file.
        * config/riscv/riscv.c (riscv_reorg): New function.
        (TARGET_MACHINE_DEPENDENT_REORG): Define.
        * config/riscv/riscv.h (SIBCALL_REG_P): Define.
        (riscv_remove_unneeded_save_restore_calls): Declare.
        * config/riscv/t-riscv (riscv-sr.o): New build rule.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/save-restore-2.c: New file.
        * gcc.target/riscv/save-restore-3.c: New file.
        * gcc.target/riscv/save-restore-4.c: New file.
        * gcc.target/riscv/save-restore-5.c: New file.
        * gcc.target/riscv/save-restore-6.c: New file.
        * gcc.target/riscv/save-restore-7.c: New file.
        * gcc.target/riscv/save-restore-8.c: New file.

From-SVN: r277527

re PR tree-optimization/92163 (ICE: Segmentation fault (in bitmap_set_bit))

2019-10-28 Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>

PR tree-optimization/92163
* tree-ssa-dse.c (delete_dead_or_redundant_assignment): New param
need_eh_cleanup with default value NULL. Gate on need_eh_cleanup
before calling bitmap_set_bit.
(dse_optimize_redundant_stores): Pass global need_eh_cleanup to
delete_dead_or_redundant_assignment.
(dse_dom_walker::dse_optimize_stmt): Likewise.
* tree-ssa-dse.h (delete_dead_or_redundant_assignment): Adjust prototype.

testsuite/
* gcc.dg/tree-ssa/pr92163.c: New test.

From-SVN: r277525

re PR middle-end/91272 ([SVE] Use fully-masked loops for CLASTB reductions)

2019-10-28 Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>

PR middle-end/91272
* tree-vect-stmts.c (vectorizable_condition): Support
EXTRACT_LAST_REDUCTION with fully-masked loops.

testsuite/
* gcc.target/aarch64/sve/clastb_1.c: Add dg-scan.
* gcc.target/aarch64/sve/clastb_2.c: Likewise.
* gcc.target/aarch64/sve/clastb_3.c: Likewise.
* gcc.target/aarch64/sve/clastb_4.c: Likewise.
* gcc.target/aarch64/sve/clastb_5.c: Likewise.
* gcc.target/aarch64/sve/clastb_6.c: Likewise.
* gcc.target/aarch64/sve/clastb_7.c: Likewise.
* gcc.target/aarch64/sve/clastb_8.c: Likewise.

From-SVN: r277524

re PR tree-optimization/92252 (ICE: Segmentation fault (in vect_stmt_to_vectorize))

2019-10-28 Richard Biener <rguenther@suse.de>

PR tree-optimization/92252
* tree-vect-slp.c (vect_get_and_check_slp_defs): Adjust
STMT_VINFO_REDUC_IDX when swapping operands.

* gcc.dg/torture/pr92252.c: New testcase.

From-SVN: r277517

re PR tree-optimization/92241 (ice in vect_mark_pattern_st mts, at tree-vect-patterns.c:5175)

2019-10-28 Richard Biener <rguenther@suse.de>

PR tree-optimization/92241
* tree-vect-loop.c (vect_fixup_scalar_cycles_with_patterns): When
we failed to update the reduction index do not use the pattern
stmts for the reduction chain.
(vectorizable_reduction): When the reduction chain is corrupt,
fail.
* tree-vect-patterns.c (vect_mark_pattern_stmts): Stop when we
fail to update the reduction chain.

* gcc.dg/torture/pr92241.c: New testcase.

From-SVN: r277516

[C++ PATCH] simplify deferred parsing lexer

https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01962.html

We use an eof_token global variable as a sentinel on a deferred parse
(such as in-class function definitions, or default args). This
complicates retrieving the next token in certain places.

As such deferred parses always nest properly and completely before
resuming the outer lexer, we can simply morph the token after the
deferred buffer into a CPP_EOF token and restore it afterwards. I
finally got around to implementing it with this patch.

One complication is that we have to change the discriminator for when
the token's value is a tree. We can't look at the token's type because
it might have been overwritten. I add a bool flag to the token
(there's several spare bits), and use that. This does simplify the
discriminator because we just check a single bit, rather than a set of
token types.

* parser.h (struct cp_token): Drop {ENUM,BOOL}_BITFIELD C-ism.
Add tree_check_p flag, use as nested union discriminator.
(struct cp_lexer): Add saved_type & saved_keyword fields.
* parser.c (eof_token): Delete.
(cp_lexer_new_main): Always init last_token to last token of
buffer.
(cp_lexer_new_from_tokens): Overlay EOF token at end of range.
(cp_lexer_destroy): Restore token under the EOF.
(cp_lexer_previous_token_position): No check for eof_token here.
(cp_lexer_get_preprocessor_token): Clear tree_check_p.
(cp_lexer_peek_nth_token): Check CPP_EOF not eof_token.
(cp_lexer_consume_token): Assert not CPP_EOF, no check for
eof_token.
(cp_lexer_purge_token): Likewise.
(cp_lexer_purge_tokens_after): No check for EOF token.
(cp_parser_nested_name_specifier, cp_parser_decltype)
(cp_parser_template_id): Set tree_check_p.

From-SVN: r277514

tree-vect-loop.c (vect_create_epilog_for_reduction): Use STMT_VINFO_REDUC_IDX from the actual stmt.

2019-10-28 Richard Biener <rguenther@suse.de>

* tree-vect-loop.c (vect_create_epilog_for_reduction): Use
STMT_VINFO_REDUC_IDX from the actual stmt.
(vect_transform_reduction): Likewise.
(vectorizable_reduction): Compute the reduction chain length,
do not recompute the reduction operand index. Remove no longer
necessary restriction for condition reduction chains.

From-SVN: r277513

re PR c/92249 (ICE in c_parser_gimple_compound_statement w/ GIMPLE testcases)

2019-10-28 Richard Biener <rguenther@suse.de>

PR c/92249
* gimple-parser.c (c_parser_parse_gimple_body): Make
current_bb the entry block initially to easier recover
from errors.
(c_parser_gimple_compound_statement): Adjust.

From-SVN: r277512

re PR target/92225 (ice in gen_smaxv2di3, at config/i386/sse.md:12225)

PR target/92225
* config/i386/sse.md (REDUC_SSE_SMINMAX_MODE): Use TARGET_SSE4_2
condition for V2DImode.

testsuite/ChangeLog:

PR target/92225
* gcc.target/i386/pr92225.c: New test.

From-SVN: r277510

sse.md (sse_cvtss2si<rex64namesuffix>_2): Remove %k operand modifier.

* config/i386/sse.md (sse_cvtss2si<rex64namesuffix>_2):
Remove %k operand modifier.
(*vec_extractv2df_1_sse): Remove %q operand modifier.

From-SVN: r277509

Fix unroll-and-jam.c on 32bit

where LIM interacts with foo10. On 64bit LIM doesn't do the problematic
change for whatever reason, but it seems better to disable LIM
alltogether, which requires a minor change in the testcase.

From-SVN: r277508

Move jump threading before reload

r266734 has introduced a new instance of jump threading pass in order to
take advantage of opportunities that combine opens up.  It was perceived
back then that it was beneficial to delay it after reload, since that
might produce even more such opportunities.

Unfortunately jump threading interferes with hot/cold partitioning.  In
the code from PR92007, it converts the following

  +-------------------------- 2/HOT ------------------------+
  |                                                         |
  v                                                         v
3/HOT --> 5/HOT --> 8/HOT --> 11/COLD --> 6/HOT --EH--> 16/HOT
            |                               ^
            |                               |
            +-------------------------------+

into the following:

  +---------------------- 2/HOT ------------------+
  |                                               |
  v                                               v
3/HOT --> 8/HOT --> 11/COLD --> 6/COLD --EH--> 16/HOT

This makes hot bb 6 dominated by cold bb 11, and because of this
fixup_partitions makes bb 6 cold as well, which in turn makes EH edge
6->16 a crossing one.  Not only can't we have crossing EH edges, we are
also not allowed to introduce new crossing edges after reload in
general, since it might require extra registers on some targets.

Therefore, move the jump threading pass between combine and hot/cold
partitioning.  Building SPEC 2006 and SPEC 2017 with the old and the new
code indicates that:

* When doing jump threading right after reload, 3889 edges are threaded.
* When doing jump threading right after combine, 3918 edges are
  threaded.

This means this change will not introduce performance regressions.

gcc/ChangeLog:

2019-10-28  Ilya Leoshkevich  <iii@linux.ibm.com>

PR rtl-optimization/92007
* cfgcleanup.c (thread_jump): Add an assertion that we don't
call it after reload if hot/cold partitioning has been done.
(class pass_postreload_jump): Rename to
pass_jump_after_combine.
(make_pass_postreload_jump): Rename to
make_pass_jump_after_combine.
* passes.def(pass_postreload_jump): Move before reload, rename
to pass_jump_after_combine.
* tree-pass.h (make_pass_postreload_jump): Rename to
make_pass_jump_after_combine.

gcc/testsuite/ChangeLog:

2019-10-28  Ilya Leoshkevich  <iii@linux.ibm.com>

PR rtl-optimization/92007
* g++.dg/opt/pr92007.C: New test (from Arseny Solokha).

From-SVN: r277507

re PR ipa/92242 (LTO ICE in ipa_get_cs_argument_count ipa-prop.h:598)

PR ipa/92242
* ipa-fnsummary.c (ipa_merge_fn_summary_after_inlining): Check
for missing EDGE_REF
* ipa-prop.c (update_jump_functions_after_inlining): Likewise.

From-SVN: r277504

Fortran] OpenACC – libgomp/testsuite – use 'stop' and 'dg-do run'

        * testsuite/libgomp.oacc-fortran/abort-1.f90: Add 'dg-do run'.
        * testsuite/libgomp.oacc-fortran/abort-2.f90: Ditto.
        * testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90: Ditto.
        * testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f90: Ditto.
        * testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f90: Ditto.
        * testsuite/libgomp.oacc-fortran/lib-1.f90: Ditto.
        * testsuite/libgomp.oacc-fortran/common-block-1.f90:
        Use 'stop' not abort().
        * testsuite/libgomp.oacc-fortran/common-block-2.f90: Ditto.
        * testsuite/libgomp.oacc-fortran/common-block-3.f90: Ditto.
        * testsuite/libgomp.oacc-fortran/data-1.f90: Ditto.
        * testsuite/libgomp.oacc-fortran/data-2.f90: Ditto.
        * testsuite/libgomp.oacc-fortran/data-5.f90: Ditto.
        * testsuite/libgomp.oacc-fortran/dummy-array.f90: Ditto.
        * testsuite/libgomp.oacc-fortran/gemm-2.f90: Ditto.
        * testsuite/libgomp.oacc-fortran/gemm.f90: Ditto.
        * testsuite/libgomp.oacc-fortran/host_data-2.f90: Ditto.
        * testsuite/libgomp.oacc-fortran/host_data-3.f90: Ditto.
        * testsuite/libgomp.oacc-fortran/host_data-4.f90: Ditto.
        * testsuite/libgomp.oacc-fortran/kernels-collapse-3.f90: Ditto.
        * testsuite/libgomp.oacc-fortran/kernels-collapse-4.f90: Ditto.
        * testsuite/libgomp.oacc-fortran/kernels-independent.f90: Ditto.
        * testsuite/libgomp.oacc-fortran/kernels-loop-1.f90: Ditto.
        * testsuite/libgomp.oacc-fortran/kernels-map-1.f90: Ditto.
        * testsuite/libgomp.oacc-fortran/kernels-parallel-loop-data-enter-exit.f95:
        Ditto.
        * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-1.f90:
        Ditto.
        * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-2.f90:
        Ditto.
        * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-3.f90:
        Ditto.
        * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-6.f90:
        Ditto.
        * testsuite/libgomp.oacc-fortran/kernels-private-vars-vector-1.f90:
        Ditto.
        * testsuite/libgomp.oacc-fortran/kernels-private-vars-vector-2.f90:
        Ditto.
        * testsuite/libgomp.oacc-fortran/kernels-private-vars-worker-1.f90:
        Ditto.
        * testsuite/libgomp.oacc-fortran/kernels-private-vars-worker-2.f90:
        Ditto.
        * testsuite/libgomp.oacc-fortran/kernels-private-vars-worker-3.f90:
        Ditto.
        * testsuite/libgomp.oacc-fortran/kernels-private-vars-worker-4.f90:
        Ditto.
        * testsuite/libgomp.oacc-fortran/kernels-private-vars-worker-5.f90:
        Ditto.
        * testsuite/libgomp.oacc-fortran/kernels-private-vars-worker-6.f90:
        Ditto.
        * testsuite/libgomp.oacc-fortran/kernels-private-vars-worker-7.f90:
        Ditto.
        * testsuite/libgomp.oacc-fortran/kernels-reduction-1.f90: Ditto.
        * testsuite/libgomp.oacc-fortran/lib-12.f90: Ditto.
        * testsuite/libgomp.oacc-fortran/lib-13.f90: Ditto.
        * testsuite/libgomp.oacc-fortran/lib-14.f90: Ditto.
        * testsuite/libgomp.oacc-fortran/kernels-acc-loop-reduction-2.f90:
        Likewise and also add 'dg-do run'.
        * testsuite/libgomp.oacc-fortran/kernels-acc-loop-reduction.f90:
        Ditto.

From-SVN: r277503

Fortran] PR91863 - fix call to bind(C) with array descriptor

        PR fortran/91863
        * trans-expr.c (gfc_conv_gfc_desc_to_cfi_desc): Don't free data
        memory as that's done on the Fortran side.
        (gfc_conv_procedure_call): Handle void* pointers from
        gfc_conv_gfc_desc_to_cfi_desc.

        PR fortran/91863
        * gfortran.dg/bind-c-intent-out.f90: New.

From-SVN: r277502

rs6000: Enable limited unrolling at -O2

In PR88760, there are a few disscussion about improve or tune unroller for
targets. And we would agree to enable unroller for small loops at O2 first.
And we could see performance improvement(~10%) for below code:
```
  subroutine foo (i, i1, block)
    integer :: i, i1
    integer :: block(9, 9, 9)
    block(i:9,1,i1) = block(i:9,1,i1) - 10
  end subroutine foo

```
This kind of code occurs a few times in exchange2 benchmark.

Similar C code:
```
  for (i = 0; i < n; i++)
    arr[i] = arr[i] - 10;
```

On powerpcle, for O2 , enable -funroll-loops and limit
PARAM_MAX_UNROLL_TIMES=2 and PARAM_MAX_UNROLLED_INSNS=20, we can see >2%
overall improvement for SPEC2017.

This patch is only for rs6000 in which we see visible performance improvement.

gcc/
2019-10-25  Jiufu Guo  <guojiufu@linux.ibm.com>

PR tree-optimization/88760
* config/rs6000/rs6000-common.c (rs6000_option_optimization_table):
Enable -funroll-loops for -O2 and above.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Set
PARAM_MAX_UNROLL_TIMES to 2 and PARAM_MAX_UNROLLED_INSNS to 20, and
do not turn on web and rngreg implicitly, if the unroller is not
explicitly enabled.

gcc.testsuite/
2019-10-25  Jiufu Guo  <guojiufu@linux.ibm.com>

PR tree-optimization/88760
* gcc.target/powerpc/small-loop-unroll.c: New test.
* c-c++-common/tsan/thread_leak2.c: Update test.
* gcc.dg/pr59643.c: Update test.
* gcc.target/powerpc/loop_align.c: Update test.
* gcc.target/powerpc/ppc-fma-1.c: Update test.
* gcc.target/powerpc/ppc-fma-2.c: Update test.
* gcc.target/powerpc/ppc-fma-3.c: Update test.
* gcc.target/powerpc/ppc-fma-4.c: Update test.
* gcc.target/powerpc/pr78604.c: Update test.

From-SVN: r277501

Daily bump.

From-SVN: r277499

* locales.c (iso_3166): Add missing comma after "United-States".

From-SVN: r277492

fprintf-2.c: Silence a Free/NetBSD libc warning.

2019-10-27 Andreas Tobler <andreast@gcc.gnu.org>

* gcc.c-torture/execute/fprintf-2.c: Silence a Free/NetBSD libc warning.
* gcc.c-torture/execute/printf-2.c: Likewise.
* gcc.c-torture/execute/user-printf.c: Likewise.

From-SVN: r277491

re PR fortran/86248 (LEN_TRIM in specification expression causes link failure)

2019-10-27 Paul Thomas <pault@gcc.gnu.org>

PR fortran/86248
* resolve.c (flag_fn_result_spec): Correct a typo before the
function declaration.
* trans-decl.c (gfc_sym_identifier): Boost the length of 'name'
to allow for all variants. Simplify the code by using a pointer
to the symbol's proc_name and taking the return out of each of
the conditional branches. Allow symbols with fn_result_spec set
that do not come from a procedure namespace and have a module
name to go through the non-fn_result_spec branch.

2019-10-27 Paul Thomas <pault@gcc.gnu.org>

PR fortran/86248
* gfortran.dg/char_result_19.f90 : New test.
* gfortran.dg/char_result_mod_19.f90 : Module for the new test.

From-SVN: r277487

ipa-prop.c (ipa_propagate_indirect_call_infos): Do not remove jump functions.

* ipa-prop.c (ipa_propagate_indirect_call_infos): Do not remove
jump functions.

From-SVN: r277486

fix cgraph comment

This comment cut&pasto fix was split out of another patch I'm about to
contribute, as the current version of the patch no longer touches cgraph
data structures.

for gcc/ChangeLog

* cgraph.c (cgraph_node::rtl_info): Fix cut&pasto in comment.
* cgraph.h (cgraph_node::rtl_info): Likewise.

From-SVN: r277485

ipa-cp.c (propagate_constants_across_call): If args are not available just drop everything to varying.

* ipa-cp.c (propagate_constants_across_call): If args are not available
just drop everything to varying.
(find_aggregate_values_for_callers_subset): Watch for missing
edge summary.
(find_more_scalar_values_for_callers_subs): Likewise.
* ipa-prop.c (ipa_compute_jump_functions_for_edge,
update_jump_functions_after_inlining, propagate_controlled_uses):
Watch for missing summaries.
(ipa_propagate_indirect_call_infos): Remove summary after propagation
is finished.
(ipa_write_node_info): Watch for missing summaries.
(ipa_read_edge_info): Create new ref.
(ipa_edge_args_sum_t): Add remove.
(IPA_EDGE_REF_GET_CREATE): New macro.
* ipa-fnsummary.c (evaluate_properties_for_edge): Watch for missing
edge summary.
(remap_edge_change_prob): Likewise.

From-SVN: r277484

ipa-inline-transform.c (inline_call): update function summaries after expanidng thunk.

* ipa-inline-transform.c (inline_call): update function summaries
after expanidng thunk.

From-SVN: r277483

ipa-icf.c (sem_function::merge): Update function summaries.

* ipa-icf.c (sem_function::merge): Update function summaries.
* ipa-prop.h (ipa_get_param): Do not sanity check for WPA.

From-SVN: r277482

Remove redudant <iptr> when operand already has scalar mode.

gcc/
* config/i386/sse.md (*<sse>_vm<plusminus_insn><mode>3,
<sse>_vm<multdiv_mnemonic><mode>3): Remove <iptr> since
operand is already scalar mode.
(iptr): Remove SF/DF.

From-SVN: r277481

Daily bump.

From-SVN: r277480

codecvt.xml: Switch pubs.opengroup.org to https.

* doc/xml/manual/codecvt.xml: Switch pubs.opengroup.org to https.
* doc/xml/manual/locale.xml (LC_ALL): Ditto.
* doc/xml/manual/messages.xml: Ditto.

From-SVN: r277476

baseline_symbols.txt: Update.

* config/abi/post/hppa-linux-gnu/baseline_symbols.txt: Update.

From-SVN: r277475

rs6000: Fix allocate_stack in a corner case (PR91289)

When we have -fstack-limit-symbol with sysv we can end up with a non-
existing instruction (you cannot add an immediate to register 0).  Fix
this by using register 11 instead.  It might be used for something else
already though, so save and restore its value around this.  In
optimizing compiles these extra moves are usually removed again: the
restore by cprop_hardreg, and then the save by rtl_dce.

PR target/91289
* config/rs6000/rs6000-logue.c (rs6000_emit_allocate_stack): Don't add
an immediate to r0; use r11 instead.  Save and restore r11 to r0 around
this.

From-SVN: r277472

Adjust predicates and constraints of scalar insns.

Changelog

gcc/
* config/i386/sse.md
(<sse>_vm<plusminus_insn><mode>3<mask_scalar_name><round_scalar_name>,
<sse>_vm<multdiv_mnemonic><mode>3<mask_scalar_name><round_scalar_name>,
<sse>_vmsqrt<mode>2<mask_scalar_name><round_scalar_name>,
<sse>_vm<code><mode>3<mask_scalar_name><round_saeonly_scalar_name>,
<sse>_vmmaskcmp<mode>3):
Change predicates from vector_operand to nonimmediate_operand,
constraints xBm to xm, since scalar operations don't need
memory address alignment.
(avx512f_vmcmp<mode>3<round_saeonly_name>,
avx512f_vmcmp<mode>3_mask<round_saeonly_name>): Replace
round_saeonly_nimm_predicate with
round_saeonly_nimm_scalar_predicate.
(fmai_vmfmadd_<mode><round_name>, fmai_vmfmsub_<mode><round_name>,
fmai_vmfnmadd_<mode><round_name>,fmai_vmfnmsub_<mode><round_name>,
*fmai_fmadd_<mode>, *fmai_fmsub_<mode>,
*fmai_fnmadd_<mode><round_name>, *fmai_fnmsub_<mode><round_name>,
avx512f_vmfmadd_<mode>_mask3<round_name>,
avx512f_vmfmadd_<mode>_maskz_1<round_name>,
*avx512f_vmfmsub_<mode>_mask<round_name>,
avx512f_vmfmsub_<mode>_mask3<round_name>,
*avx512f_vmfmsub_<mode>_maskz_1<round_name>,
*avx512f_vmfnmadd_<mode>_mask<round_name>,
*avx512f_vmfnmadd_<mode>_mask3<round_name>,
*avx512f_vmfnmadd_<mode>_maskz_1<round_name>,
*avx512f_vmfnmsub_<mode>_mask<round_name>,
*avx512f_vmfnmsub_<mode>_mask3<round_name>,
*avx512f_vmfnmsub_<mode>_maskz_1<round_name>,
cvtusi2<ssescalarmodesuffix>32<round_name>,
cvtusi2<ssescalarmodesuffix>64<round_name>, ): Replace
round_nimm_predicate with round_nimm_scalr_predicate.
(avx512f_sfixupimm<mode><sd_maskz_name><round_saeonly_name>,
avx512f_sfixupimm<mode>_mask<round_saeonly_name>,
avx512er_vmrcp28<mode><round_saeonly_name>,
avx512er_vmrsqrt28<mode><round_saeonly_name>,
): Replace round_saeonly_nimm_predicate with
round_saeonly_nimm_scalar_predicate.
(avx512dq_vmfpclass<mode><mask_scalar_merge_name>): Replace
vector_operand with nonimmediate_operand.
* config/i386/subst.md (round_scalar_nimm_predicate,
round_saeonly_scalar_nimm_predicate): Replace
vector_operand with nonimmediate_operand.

From-SVN: r277470

Fix false dependence of scalar operation vrcp/vsqrt/vrsqrt/vrndscale
For instructions with xmm operand:

op %xmmN,%xmmQ,%xmmQ ----> op %xmmN, %xmmN, %xmmQ

for instruction with mem operand or gpr operand:

op mem/gpr, %xmmQ, %xmmQ

---> using pass rpad ---->

xorps %xmmN, %xmmN, %xxN
op mem/gpr, %xmmN, %xmmQ

Performance influence of SPEC2017 fprate which is tested on SKX
----
503.bwaves_r -0.03%
507.cactuBSSN_r -0.22%
508.namd_r -0.02%
510.parest_r 0.37%
511.povray_r 0.74%
519.lbm_r 0.24%
521.wrf_r 2.35%
526.blender_r 0.71%
527.cam4_r 0.65%
538.imagick_r 0.95%
544.nab_r -0.37
549.fotonik3d_r 0.24%
554.roms_r 0.90%
fprate geomean 0.50%
-----

Changelog
gcc/
* config/i386/i386.md (*rcpsf2_sse): Add
avx_partial_xmm_update, prefer m constraint for TARGET_AVX.
(*rsqrtsf2_sse): Ditto.
(*sqrt<mode>2_sse): Ditto.
(sse4_1_round<mode>2): separate constraint vm, add
avx_partail_xmm_update, prefer m constraint for TARGET_AVX.
* config/i386/sse.md (*sse_vmrcpv4sf2"): New define_insn used
by pass rpad.
(*<sse>_vmsqrt<mode>2<mask_scalar_name><round_scalar_name>*):
Ditto.
(*sse_vmrsqrtv4sf2): Ditto.
(*avx512f_rndscale<mode><round_saeonly_name>): Ditto.
(*sse4_1_round<ssescalarmodesuffix>): Ditto.
(sse4_1_round<ssescalarmodesuffix>): Add m constraint and
<iptr> pointer size modifier since vround support memory operand.

gcc/testsuite
* gcc.target/i386/pr87007-4.c: New test.
* gcc.target/i386/pr87007-5.c: Ditto.

From-SVN: r277469

Daily bump.

From-SVN: r277468

PR c++/91581 - ICE in exception-specification of defaulted ctor.

* g++.dg/cpp0x/noexcept55.C: New test.

From-SVN: r277462

Use implicitly-defined copy operations for test iterators

All of these special member functions do exactly what the compiler would
do anyway. By defining them as defaulted for C++11 and later we prevent
move constructors and move assignment operators being defined (which is
consistent with the previous semantics).

Also move default init of the input_iterator_wrapper members from the
derived constructor to the protected base constructor.

* testsuite/util/testsuite_iterators.h (output_iterator_wrapper)
(input_iterator_wrapper, forward_iterator_wrapper)
bidirectional_iterator_wrapper, random_access_iterator_wrapper): Remove
user-provided copy constructors and copy assignment operators so they
are defined implicitly.
(input_iterator_wrapper): Initialize members in default constructor.
(forward_iterator_wrapper): Remove assignments to members of base.

From-SVN: r277459

Fix compilation with Clang

The new constexpr destructor on std::allocator breaks compilation with
Clang in C++2a mode. This only makes it constexpr if the compiler
supports the P0784R7 features.

* include/bits/allocator.h: Check __cpp_constexpr_dynamic_alloc
before making the std::allocator destructor constexpr.
* testsuite/20_util/allocator/requirements/constexpr.cc: New test.

From-SVN: r277458

re PR target/85969 (avr/gen-avr-mmcu-specs.c:56: unused function ?)

PR target/85969
* config/avr/gen-avr-mmcu-specs.c (str_prefix_p): Remove unused
static function.

From-SVN: r277455

[Fortran] OpenACC – permit common blocks in some clauses

2019-10-25 Cesar Philippidis <cesar@codesourcery.com>
Tobias Burnus <tobias@codesourcery.com>

gcc/fortran/
* openmp.c (gfc_match_omp_map_clause): Add and pass allow_commons
argument.
(gfc_match_omp_clauses): Update calls to permit common blocks for
OpenACC's copy/copyin/copyout, create/delete, host,
pcopy/pcopy_in/pcopy_out, present_or_copy, present_or_copy_in,
present_or_copy_out, present_or_create and self.

gcc/
* gimplify.c (oacc_default_clause): Privatize fortran common blocks.
(omp_notice_variable): Defer the expansion of DECL_VALUE_EXPR for
common block decls.

gcc/testsuite/
* gfortran.dg/goacc/common-block-1.f90: New test.
* gfortran.dg/goacc/common-block-2.f90: New test.
* gfortran.dg/goacc/common-block-3.f90: New test.

libgomp/
* testsuite/libgomp.oacc-fortran/common-block-1.f90: New test.
* testsuite/libgomp.oacc-fortran/common-block-2.f90: New test.
* testsuite/libgomp.oacc-fortran/common-block-3.f90: New test.

Reviewed-by: Thomas Schwinge <thomas@codesourcery.com>
Co-Authored-By: Tobias Burnus <tobias@codesourcery.com>
From-SVN: r277451

pr70100.c: Add -mvsx.

* gcc.target/powerpc/pr70100.c: Add -mvsx.
Allow AIX ABI function name.

From-SVN: r277450

Guard use of concepts with feature test macro

This fixes a regression when using Clang.

* include/bits/range_cmp.h: Check __cpp_lib_concepts before defining
concepts. Fix comment.

From-SVN: r277449

re PR tree-optimization/92222 (ice in useless_type_conversion_p, at gimple-expr.c:86)

2019-10-25 Richard Biener <rguenther@suse.de>

PR tree-optimization/92222
* tree-vect-slp.c (_slp_oprnd_info::first_pattern): Remove.
(_slp_oprnd_info::second_pattern): Likewise.
(_slp_oprnd_info::any_pattern): New.
(vect_create_oprnd_info): Adjust.
(vect_get_and_check_slp_defs): Compute whether any stmt is
in a pattern.
(vect_build_slp_tree_2): Avoid building up a node from scalars
if any of the operand defs, not just the first, is in a pattern.

* gcc.dg/torture/pr92222.c: New testcase.

From-SVN: r277448

tree-vect-slp.c (vect_get_and_check_slp_defs): Only fail swapping if we actually have to modify the IL on a shared stmt.

2019-10-25 Richard Biener <rguenther@suse.de>

* tree-vect-slp.c (vect_get_and_check_slp_defs): Only fail
swapping if we actually have to modify the IL on a shared stmt.
(vect_build_slp_tree_2): Never fail swapping on shared stmts
because we no longer modify the IL.

From-SVN: r277446

Fix failure in gcc.target/sve/reduc_strict_3.c

Unwanted unrolling meant that we had more single-precision FADDAs
than expected.

2019-10-25 Richard Sandiford <richard.sandiford@arm.com>

gcc/testsuite/
* gcc.target/aarch64/sve/reduc_strict_3.c (double_reduc1): Prevent
the loop from being unrolled.

From-SVN: r277442

Update SVE tests for recent XPASSes

Recent target-independent patches mean that several SVE tests
now produce the code that we'd originally wanted them to produce.
Really nice to see :-)

This patch therefore updates the expected baseline, so that hopefully
we don't regress from this point in future.

2019-10-25 Richard Sandiford <richard.sandiford@arm.com>

gcc/testsuite/
* gcc.target/aarch64/sve/loop_add_5.c: Remove XFAILs for tests
that now pass.
* gcc.target/aarch64/sve/reduc_1.c: Likewise.
* gcc.target/aarch64/sve/reduc_2.c: Likewise.
* gcc.target/aarch64/sve/reduc_5.c: Likewise.
* gcc.target/aarch64/sve/reduc_8.c: Likewise.
* gcc.target/aarch64/sve/slp_13.c: Likewise.
* gcc.target/aarch64/sve/slp_5.c: Likewise. Update expected
WHILELO counts.
* gcc.target/aarch64/sve/slp_7.c: Likewise.

From-SVN: r277441

Fix typo in dump_tree_statistics.

2019-10-25 Martin Liska <mliska@suse.cz>

* tree.c (dump_tree_statistics): Use sorted index 'j' and not 'i'.

From-SVN: r277440

Fix reductions for fully-masked loops

Now that vectorizable_operation vectorises most loop stmts involved
in a reduction, it needs to be aware of reductions in fully-masked loops.
The LOOP_VINFO_CAN_FULLY_MASK_P parts of vectorizable_reduction now only
apply to cases that use vect_transform_reduction.

This new way of doing things is definitely an improvement for SVE though,
since it means we can lift the old restriction of not using fully-masked
loops for reduction chains.

2019-10-25 Richard Sandiford <richard.sandiford@arm.com>

gcc/
* tree-vect-loop.c (vectorizable_reduction): Restrict the
LOOP_VINFO_CAN_FULLY_MASK_P handling to cases that will be
handled by vect_transform_reduction. Allow fully-masked loops
to be used with reduction chains.
* tree-vect-stmts.c (vectorizable_operation): Handle reduction
operations in fully-masked loops.
(vectorizable_condition): Reject EXTRACT_LAST_REDUCTION
operations in fully-masked loops.

gcc/testsuite/
* gcc.dg/vect/pr65947-1.c: No longer expect doubled dump lines
for FOLD_EXTRACT_LAST reductions.
* gcc.dg/vect/pr65947-2.c: Likewise.
* gcc.dg/vect/pr65947-3.c: Likewise.
* gcc.dg/vect/pr65947-4.c: Likewise.
* gcc.dg/vect/pr65947-5.c: Likewise.
* gcc.dg/vect/pr65947-6.c: Likewise.
* gcc.dg/vect/pr65947-9.c: Likewise.
* gcc.dg/vect/pr65947-10.c: Likewise.
* gcc.dg/vect/pr65947-12.c: Likewise.
* gcc.dg/vect/pr65947-13.c: Likewise.
* gcc.dg/vect/pr65947-14.c: Likewise.
* gcc.dg/vect/pr80631-1.c: Likewise.
* gcc.dg/vect/pr80631-2.c: Likewise.
* gcc.dg/vect/vect-cond-reduc-3.c: Likewise.
* gcc.dg/vect/vect-cond-reduc-4.c: Likewise.

From-SVN: r277438

tree-vect-loop.c (vectorizable_reduction): Verify STMT_VINFO_REDUC_IDX on the to be vectorized stmts is set up correctly.

2019-10-25 Richard Biener <rguenther@suse.de>

* tree-vect-loop.c (vectorizable_reduction): Verify
STMT_VINFO_REDUC_IDX on the to be vectorized stmts is set up
correctly.
* tree-vect-patterns.c (vect_mark_pattern_stmts): Transfer
STMT_VINFO_REDUC_IDX from the original stmts to the pattern
stmts.

From-SVN: r277437

policy_data_structures_biblio.xml: Switch pubs.opengroup.org to https.

* doc/xml/manual/policy_data_structures_biblio.xml: Switch
pubs.opengroup.org to https.

From-SVN: r277436

* doc/xml/gnu/gpl-3.0.xml: Switch gnu.org to https.

From-SVN: r277435

status_cxx2020.xml: Add rows and update status.

2019-09-09 Edward Smith-Rowland <3dw4rd@verizon.net>

* doc/xml/manual/status_cxx2020.xml: Add rows and update status.

From-SVN: r277434