git.libre-soc.org Git

libstdc++: Check FE_TONEAREST is defined before using it

We need to test that FE_TONEAREST is defined before we may use it along
with fegetround/fesetround to adjust the floating-point rounding mode.
This fixes a build failure with older versions of newlib.
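
A minimal sketch of the guard pattern described above, assuming a hypothetical helper parse_round_to_nearest (not the libstdc++ code). The rounding mode is only saved and restored when <cfenv> actually defines FE_TONEAREST; otherwise the mode is left untouched.

```cpp
#include <cassert>
#include <cfenv>
#include <cstdlib>

// Hypothetical helper: parse a double, forcing round-to-nearest only on
// targets whose <cfenv> defines FE_TONEAREST.
double parse_round_to_nearest(const char* text)
{
#if defined(FE_TONEAREST)
  // Remember the caller's rounding mode so it can be restored afterwards.
  const int saved_mode = std::fegetround();
  if (saved_mode != FE_TONEAREST)
    std::fesetround(FE_TONEAREST);
#endif

  const double value = std::strtod(text, nullptr);

#if defined(FE_TONEAREST)
  if (saved_mode != FE_TONEAREST)
    std::fesetround(saved_mode);
#endif
  return value;
}
```

On a target without FE_TONEAREST both guarded regions compile away and the function reduces to a plain strtod call.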

libstdc++-v3/ChangeLog:

* src/c++17/floating_from_chars.cc (from_chars_impl)
[!defined(FE_TONEAREST)]: Don't adjust the rounding mode.
* src/c++17/floating_to_chars.cc (__floating_to_chars_precision):
Likewise.

openmp: Implicitly add 'declare target' directives for dynamic initializers in C++

2020-12-18 Kwok Cheung Yeung <kcy@codesourcery.com>

gcc/
* langhooks-def.h (lhd_get_decl_init): New.
(lhd_finish_decl_inits): New.
(LANG_HOOKS_GET_DECL_INIT): New.
(LANG_HOOKS_OMP_FINISH_DECL_INITS): New.
(LANG_HOOKS_DECLS): Add LANG_HOOKS_GET_DECL_INIT and
LANG_HOOKS_OMP_FINISH_DECL_INITS.
* langhooks.c (lhd_omp_get_decl_init): New.
(lhd_omp_finish_decl_inits): New.
* langhooks.h (struct lang_hooks_for_decls): Add omp_get_decl_init
and omp_finish_decl_inits.
* omp-offload.c (omp_discover_declare_target_var_r): Use
get_decl_init langhook in place of DECL_INITIAL. Call
omp_finish_decl_inits langhook at end of function.

gcc/cp/
* cp-lang.c (cxx_get_decl_init): New.
(cxx_omp_finish_decl_inits): New.
(LANG_HOOKS_GET_DECL_INIT): New.
(LANG_HOOKS_OMP_FINISH_DECL_INITS): New.
* cp-tree.h (dynamic_initializers): New.
* decl.c (dynamic_initializers): New.
* decl2.c (c_parse_final_cleanups): Add initializer entries
from vars to dynamic_initializers.

gcc/testsuite/
* g++.dg/gomp/declare-target-3.C: New.

aarch64: Extend aarch64-autovec-preference==2 to 128-bit SVE

When compiling with -msve-vector-bits=128, aarch64_preferred_simd_mode
would pass the same vector width to aarch64_simd_container_mode for
both SVE and Advanced SIMD, and so Advanced SIMD would always “win”.
This patch instead makes it choose directly between SVE and Advanced
SIMD modes, so that aarch64-autovec-preference==2 and
aarch64-autovec-preference==4 work for this configuration.

(aarch64-autovec-preference shouldn't affect aarch64_simd_container_mode
because that would have an ABI impact for things like GNU vectors.)

gcc/
* config/aarch64/aarch64.c (aarch64_preferred_simd_mode): Use
aarch64_full_sve_mode and aarch64_vq_mode directly, instead of
going via aarch64_simd_container_mode.

Arm: MVE: Add missing complex mul iterators

It seems that when I split the patch I forgot to include these in the rot iterator.
The uncommitted hunks were still in my local tree, so I didn't notice.

gcc/ChangeLog:

* config/arm/iterators.md (rot): Add UNSPEC_VCMUL, UNSPEC_VCMUL90,
UNSPEC_VCMUL180, UNSPEC_VCMUL270.

c++: Fix windows binary files [PR 98362]

Windows has unique and special needs for open(2).

gcc/cp/
* module.cc (O_CLOEXEC, O_BINARY): Add Windows support.
(elf_in::defrost, module_state::do_import)
(finish_module_processing): Use O_BINARY.
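
One common way to handle such platform differences is sketched below. This is an assumption about the general technique, not a copy of module.cc: flags the platform lacks are defined to 0 so they become no-ops in the open call. The function name open_binary_readonly is hypothetical.

```cpp
#include <cassert>
#include <fcntl.h>

// Fall back to 0 when the platform does not provide a flag, so that
// OR-ing it into the open() flags has no effect.
#ifndef O_CLOEXEC
#define O_CLOEXEC 0
#endif
#ifndef O_BINARY
#define O_BINARY 0
#endif

// Hypothetical helper: open a file read-only, in binary mode where the
// platform distinguishes text and binary files.
int open_binary_readonly(const char* path)
{
  return open(path, O_RDONLY | O_CLOEXEC | O_BINARY);
}
```

On POSIX systems O_BINARY is normally absent and becomes 0; on Windows it is O_CLOEXEC that is typically missing.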

As well as the PR this patch fixes problems in handling class objects

2020-12-18 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran
PR fortran/83118
PR fortran/96012
* resolve.c (resolve_ordinary_assign): Generate a vtable if
necessary for scalar non-polymorphic rhs's to unlimited lhs's.
* trans-array.c (get_class_info_from_ss): New function.
(gfc_trans_allocate_array_storage): Defer obtaining class
element type until all sources of class exprs are tried. Use
class API rather than TREE_OPERAND. Look for class expressions
in ss->info by calling get_class_info_from_ss. After, obtain
the element size for class descriptors. Where the element type
is unknown, cast the data as character(len=size) to overcome
unlimited polymorphic problems.
(gfc_conv_ss_descriptor): Do not fix class variable refs.
(build_class_array_ref, structure_alloc_comps): Replace code
replicating the new function gfc_resize_class_size_with_len.
(gfc_alloc_allocatable_for_assignment): Obtain element size
for lhs in cases of deferred characters and class entities.
Move code for the element size of rhs to start of block. Clean
up extraction of class parameters throughout this function.
After the shape check test whether or not the lhs and rhs
element sizes are the same. Use earlier evaluation of
'cond_null'. Reallocation of lhs only to happen if size changes
or element size changes.
* trans-expr.c (gfc_resize_class_size_with_len): New function.
(gfc_get_class_from_expr): If a constant expression is
encountered, return NULL_TREE.
(trans_scalar_class_assign): New function.
(gfc_conv_procedure_call): Ensure the vtable is present for
passing a non-class actual to an unlimited formal.
(trans_class_vptr_len_assignment): For expressions of type
BT_CLASS, extract the class expression if necessary. Use a
statement block outside the loop body. Ensure that 'rhs' is
of the correct type. Obtain rhs vptr in all circumstances.
(gfc_trans_scalar_assign): Call trans_scalar_class_assign to
make maximum use of the vptr copy in place of assignment.
(trans_class_assignment): Actually do reallocation if needed.
(gfc_trans_assignment_1): Simplify some of the logic with
'realloc_flag'. Set 'vptr_copy' for all array assignments to
unlimited polymorphic lhs.
* trans.c (gfc_build_array_ref): Call gfc_resize_class_size_
with_len to correct span for unlimited polymorphic decls.
* trans.h : Add prototype for gfc_resize_class_size_with_len.

gcc/testsuite/
PR fortran/83118
PR fortran/96012
* gfortran.dg/dependency_60.f90: New test.
* gfortran.dg/class_allocate_25.f90: New test.
* gfortran.dg/class_assign_4.f90: New test.
* gfortran.dg/unlimited_polymorphic_32.f03: New test.

c++: Fix PCH ICE with __builtin_source_location [PR98343]

Seems the ggc_remove pch_nx 3 operand member relies on the hash tables to
contain pointers in the first element, which is not the case for
source_location_table* hash table, which has location_t and unsigned as
first two members and pointer somewhere else.
I've tried to change:
   static void
   pch_nx (T &p, gt_pointer_operator op, void *cookie)
   {
-    op (&p, cookie);
+    extern void gt_pch_nx (T *, gt_pointer_operator, void *);
+    gt_pch_nx (&p, op, cookie);
   }
in hash-traits.h, but that failed miserably.
So, this patch instead overrides the two pch_nx overloads (only the second
one is needed, the former one is identical to the ggc_remove one) but I need
to override both.

2020-12-18  Jakub Jelinek  <jakub@redhat.com>

PR c++/98343
* cp-gimplify.c (source_location_table_entry_hash::pch_nx): Override
static member functions from ggc_remove.

* g++.dg/pch/pr98343.C: New test.
* g++.dg/pch/pr98343.Hs: New file.

Go testsuite: handle +build lines correctly

Update the Go testsuite driver to handle +build lines as is done in
the upstream repo, and update some tests to the upstream repo copy
using +build lines with "gc" and "!gccgo" as appropriate.

* go.test/go-test.exp (go-set-goos): New procedure.
(go-gc-match): New procedure.
(go-gc-tests): Call go-set-goos. Use go-gc-match to handle +build
lines. Look for +build lines beyond first line of file.

libstdc++: Import MSVC floating-point std::to_chars testcases

The testcases are imported almost verbatim, with the only change being
to the -double_nan and -float_nan testcases. We expect these values to
be formatted as "-nan" instead of "-nan(ind)".

libstdc++-v3/ChangeLog:

* testsuite/20_util/to_chars/double.cc: New test, consisting of
testcases imported from the MSVC STL testsuite.
* testsuite/20_util/to_chars/float.cc: Likewise.

libstdc++: Add floating-point std::to_chars implementation

This implements the floating-point std::to_chars overloads for float,
double and long double.  We use the Ryu library to compute the shortest
round-trippable fixed and scientific forms for float, double and long
double.  We also use Ryu for performing explicit-precision fixed and
scientific formatting for float and double. For explicit-precision
formatting for long double we fall back to using printf.  Hexadecimal
formatting for float, double and long double is implemented from
scratch.
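
A short usage sketch of the overloads described above, assuming a standard library that provides the floating-point std::to_chars overloads (C++17). The helper names shortest_form, fixed_form and overflows_tiny_buffer are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <charconv>
#include <string>
#include <system_error>

// Shortest representation that round-trips back to the same double.
std::string shortest_form(double value)
{
  char buf[64];
  const std::to_chars_result r = std::to_chars(buf, buf + sizeof buf, value);
  return r.ec == std::errc{} ? std::string(buf, r.ptr) : std::string{};
}

// Explicit-precision fixed formatting.
std::string fixed_form(double value, int precision)
{
  char buf[128];
  const std::to_chars_result r =
    std::to_chars(buf, buf + sizeof buf, value,
                  std::chars_format::fixed, precision);
  return r.ec == std::errc{} ? std::string(buf, r.ptr) : std::string{};
}

// to_chars never writes past the end of the range it is given; it
// reports errc::value_too_large instead.
bool overflows_tiny_buffer(double value)
{
  char buf[2];
  return std::to_chars(buf, buf + sizeof buf, value).ec
         == std::errc::value_too_large;
}
```

Callers are expected to check the returned error code, since the output is not null-terminated and the end pointer is the only length information.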

The supported long double binary formats are binary64, binary80 (x86
80-bit extended precision), binary128 and ibm128.

Much of the complexity of the implementation is in computing the exact
output length before handing it off to Ryu (which doesn't do bounds
checking).  In some cases it's hard to compute the output length
beforehand, so in these cases we instead compute an upper bound on the
output length and use a sufficiently-sized intermediate buffer only if
necessary.

Another source of complexity is in the general-with-precision formatting
mode, where we need to do zero-trimming of the string returned by Ryu,
and where we also take care to avoid having to format the number through
Ryu a second time when the general formatting mode resolves to fixed
(which we determine by doing a scientific formatting first and
inspecting the scientific exponent).  We avoid going through Ryu twice
by instead transforming the scientific form to the corresponding fixed
form via in-place string manipulation.

This implementation is non-conforming in a few ways:

1. For the shortest hexadecimal formatting, we currently follow the
   Microsoft implementation's decision to be consistent with the
   output of printf's '%a' specifier at the expense of sometimes not
   printing the shortest representation.  For example, the shortest hex
   form for the number 1.08p+0 is 2.1p-1, but we output the former
   instead of the latter, as does printf.

2. The Ryu routine generic_binary_to_decimal that we use for performing
   shortest formatting for large floating point types is implemented
   using the __int128 type, but some targets with a large long double
   type lack __int128 (e.g. i686), so we can't perform shortest
   formatting of long double on such targets through Ryu.  As a
   temporary stopgap this patch makes the long double to_chars overloads
   just dispatch to the double overloads on these targets, which means
   we lose precision in the output.  (We could potentially fix this by
   writing a specialized version of Ryu's generic_binary_to_decimal
   routine that uses uint64_t instead of __int128.)  [Though I wonder if
   there's a better way to work around the lack of __int128 on i686
   specifically?]

3. Our shortest formatting for __ibm128 doesn't guarantee the round-trip
   property if the difference between the high- and low-order exponent
   is large.  This is because we treat __ibm128 as if it has a
   contiguous 105-bit mantissa by merging the mantissas of the high-
   and low-order parts (using code extracted from glibc), so we
   potentially lose precision from the low-order part.  This seems to be
   consistent with how glibc printf formats __ibm128.
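
The hexadecimal example in item 1 can be checked with C++17 hexadecimal floating-point literals. This is only an illustration of the arithmetic: both spellings denote 1.03125. The helper printf_hex is hypothetical, and the exact text printf produces for '%a' may vary between C libraries.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// 0x1.08p+0 = (1 + 8/256) * 2^0  = 1.03125
// 0x2.1p-1  = (2 + 1/16)  * 2^-1 = 1.03125
constexpr double printf_style_form = 0x1.08p+0;
constexpr double shorter_form = 0x2.1p-1;
static_assert(printf_style_form == shorter_form,
              "both hexadecimal forms denote the same value");

// Hypothetical helper: format a double with printf's '%a' specifier.
std::string printf_hex(double value)
{
  char buf[64];
  const int n = std::snprintf(buf, sizeof buf, "%a", value);
  return n > 0 ? std::string(buf, static_cast<std::size_t>(n))
               : std::string{};
}
```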

libstdc++-v3/ChangeLog:

* config/abi/pre/gnu.ver: Add new exports.
* include/std/charconv (to_chars): Declare the floating-point
overloads for float, double and long double.
* src/c++17/Makefile.am (sources): Add floating_to_chars.cc.
* src/c++17/Makefile.in: Regenerate.
* src/c++17/floating_to_chars.cc: New file.
(to_chars): Define for float, double and long double.
* testsuite/20_util/to_chars/long_double.cc: New test.

libstdc++: Apply modifications to our local copy of Ryu

This performs the following modifications to our local copy of Ryu in
order to make it more readily usable for our std::to_chars
implementation:

  * Remove all #includes
  * Remove copy_special_str routines
  * Adjust the exponent formatting to match printf
  * Remove some functions we're not going to use
  * Add an out-parameter to d2exp_buffered_n for the scientific exponent
  * Store the sign bit inside struct floating_decimal_[32|64]
  * Rename [df]2s_buffered_n and change their return type
  * Make generic_binary_to_decimal take the bit representation in parts

libstdc++-v3/ChangeLog:

* src/c++17/ryu/common.h, src/c++17/ryu/d2fixed.c,
src/c++17/ryu/d2fixed_full_table.h, src/c++17/ryu/d2s.c,
src/c++17/ryu/d2s_intrinsics.h, src/c++17/ryu/f2s.c,
src/c++17/ryu/f2s_intrinsics.h, src/c++17/ryu/generic_128.c:
Apply local modifications.

libstdc++: Import parts of the Ryu library

This imports the source files from the Ryu library that define
d2s_buffered_n, f2s_buffered_n, d2fixed_buffered_n, d2exp_buffered_n and
generic_binary_to_decimal, which we're going to use as the base of our
std::to_chars implementation.

libstdc++-v3/ChangeLog:

* src/c++17/ryu/MERGE: New file.
* src/c++17/ryu/common.h, src/c++17/ryu/d2fixed.c,
src/c++17/ryu/d2fixed_full_table.h, src/c++17/ryu/d2s.c,
src/c++17/ryu/d2s_full_table.h, src/c++17/ryu/d2s_intrinsics.h,
src/c++17/ryu/digit_table.h, src/c++17/ryu/f2s.c,
src/c++17/ryu/f2s_intrinsics.h, src/c++17/ryu/generic_128.c,
src/c++17/ryu/generic_128.h, src/c++17/ryu/ryu_generic_128.h:
Import these files from the Ryu library.

c++: More precise tracking of potentially unstable satisfaction

This makes tracking of potentially unstable satisfaction results more
precise by recording the specific types for which completion failed
during satisfaction.  We now recompute a satisfaction result only if one
of these types has been completed since the last time we computed the
satisfaction result.  Thus the number of times that we recompute a
satisfaction result is now bounded by the number of such incomplete
types, rather than being effectively unbounded.  This allows us to
remove the invalid assumption in note_ftc_for_satisfaction that was
added to avoid a compile time performance regression in cmcstl2 due to
repeated recomputation of a satisfaction result that depended on
completion of a permanently incomplete class template specialization.

In order to continue to detect the instability in concepts-complete3.C,
we also need to explicitly keep track of return type deduction failure
alongside type completion failure.  So this patch also adds a call to
note_ftc_for_satisfaction in require_deduced_type.

gcc/cp/ChangeLog:

* constraint.cc (satisfying_constraint): Move up definition
and give it bool type.
(failed_type_completion_count): Replace with ...
(failed_type_completions): ... this.
(note_failed_type_completion_for_satisfaction): Append the
supplied argument to failed_type_completions.
(some_type_complete_p): Define.
(sat_entry::maybe_unstable): Replace with ...
(sat_entry::ftc_begin, sat_entry::ftc_end): ... these.
(satisfaction_cache::ftc_count): Replace with ...
(satisfaction_cache::ftc_begin): ... this.
(satisfaction_cache::satisfaction_cache): Adjust accordingly.
(satisfaction_cache::get): Adjust accordingly, using
some_type_complete_p.
(satisfaction_cache::save): Adjust accordingly.
(satisfying_constraint_p): Remove unused function.
(satisfy_constraint): Set satisfying_constraint.
(satisfy_declaration_constraints): Likewise.
* decl.c (require_deduced_type): Call
note_failed_type_completion_for_satisfaction.

c++: Diagnose self-recursive satisfaction

This patch further extends the satisfaction_cache class to diagnose
self-recursive satisfaction.
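
The general technique can be sketched with a toy memoizing cache. This is not GCC's satisfaction_cache; the names cache_sketch and entry_sketch are hypothetical, and throwing an exception stands in for issuing a diagnostic. An entry is marked as evaluating while its result is being computed, so a nested request for the same entry is detected as self-recursion.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical cache entry: 'evaluating' is set while the result is
// still being computed.
struct entry_sketch {
  bool evaluating = false;
  bool has_result = false;
  bool result = false;
};

class cache_sketch {
  std::map<std::string, entry_sketch> entries_;

public:
  using compute_fn = std::function<bool(cache_sketch&)>;

  bool get_or_compute(const std::string& key, const compute_fn& compute)
  {
    // std::map references stay valid across later insertions.
    entry_sketch& e = entries_[key];
    if (e.has_result)
      return e.result;              // reuse the cached result
    if (e.evaluating)
      throw std::logic_error("self-recursive evaluation of " + key);

    e.evaluating = true;
    bool r = false;
    try {
      r = compute(*this);
    } catch (...) {
      e.evaluating = false;         // clear the flag on failure too
      throw;
    }
    e.evaluating = false;
    e.has_result = true;
    e.result = r;
    return r;
  }
};
```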

gcc/cp/ChangeLog:

* constraint.cc (sat_entry::evaluating): New member.
(satisfaction_cache::get): If entry->evaluating, diagnose
self-recursive satisfaction. Otherwise, set entry->evaluating
if we're not reusing a cached satisfaction result.
(satisfaction_cache::save): Clear entry->evaluating.
(satisfy_atom): Set up diagnosing_failed_constraint before the
first call to get().

gcc/testsuite/ChangeLog:

PR c++/96840
* g++.dg/cpp2a/concepts-pr88395.C: Adjust to expect the
self-recursive satisfaction to get directly diagnosed.
* g++.dg/cpp2a/concepts-recursive-sat2.C: Likewise.
* g++.dg/cpp2a/concepts-recursive-sat4.C: New test.

c++: Diagnose unstable satisfaction

This implements lightweight heuristic detection and diagnosing of
satisfaction whose result changes at different points in the program,
which renders the program ill-formed NDR as of P2104.  We've recently
started to more aggressively cache satisfaction results, and so the goal
with this patch is to make this caching behavior more transparent to
the user.

A satisfaction result is flagged as "potentially unstable" (at the atom
granularity) if during its computation, some type completion failure
occurs.  This is detected by making complete_type_or_maybe_complain
increment a counter upon failure and comparing the value of the counter
before and after satisfaction.  (We don't instrument complete_type
directly because it's used "opportunistically" in many spots where type
completion failure doesn't necessarily lead to substitution failure.)

Such flagged satisfaction results are always recomputed from scratch,
even when performing satisfaction quietly.  When saving a satisfaction
result, we now compare the computed result with the cached result, and
if they differ, proceed with diagnosing the instability.

Most of the implementation is confined to the satisfaction_cache class,
which has been completely rewritten.

gcc/cp/ChangeLog:

* constraint.cc (failed_type_completion_count): New.
(note_failed_type_completion_for_satisfaction): New.
(sat_entry::constr): Rename to ...
(sat_entry::atom): ... this.
(sat_entry::location): New member.
(sat_entry::maybe_unstable): New member.
(sat_entry::diagnose_instability): New member.
(struct sat_hasher): Adjust after the above renaming.
(get_satisfaction, save_satisfaction): Remove.
(satisfaction_cache): Rewrite completely.
(satisfy_atom): When instantiation of the parameter mapping
fails, set diagnose_instability.  Propagate location from
inst_cache.entry to cache.entry if the secondary lookup
succeeded.
(satisfy_declaration_constraints): When
failed_type_completion_count differs before and after
satisfaction, then don't cache the satisfaction result.
* cp-tree.h (note_failed_type_completion_for_satisfaction):
Declare.
* pt.c (tsubst) <case TYPENAME_TYPE>: Use
complete_type_or_maybe_complain instead of open-coding it.
* typeck.c (complete_type_or_maybe_complain): Call
note_failed_type_completion_for_satisfaction when type
completion fails.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-complete1.C: New test.
* g++.dg/cpp2a/concepts-complete2.C: New test.
* g++.dg/cpp2a/concepts-complete3.C: New test.

Daily bump.

arm: Add support for Cortex-A78C

This patch adds support for -mcpu=cortex-a78c command line option.
For more information about this processor, see [0]:

[0] https://developer.arm.com/ip-products/processors/cortex-a/cortex-a78c

gcc/ChangeLog:

* config/arm/arm-cpus.in: Add Cortex-A78C core.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm-tune.md: Regenerate.
* doc/invoke.texi: Update docs.

rtl-ssa: Fix reg_raw_mode thinko [PR98347]

I'd used reg_raw_mode[regno] for general registers, even though
the array is only valid for hard registers. This patch uses
regno_reg_rtx instead.

gcc/
PR rtl-optimization/98347
* rtl-ssa/access-utils.h (full_register): Use regno_reg_rtx
instead of reg_raw_mode.

Update default_estimated_poly_value prototype in targhooks.h

commit 64432b680eab0bddbe9a4ad4798457cf6a14ad60
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Thu Dec 17 18:02:37 2020 +0000

    vect, aarch64: Extend SVE vs Advanced SIMD costing decisions in vect_better_loop_vinfo_p

changed default_estimated_poly_value to

HOST_WIDE_INT
default_estimated_poly_value (poly_int64 x, poly_value_estimate_kind)
{
  return x.coeffs[0];
}

Update default_estimated_poly_value prototype in targhooks.h to match it.

* targhooks.h (default_estimated_poly_value): Updated.

doc: Standard library header units

It seems users are confused by the lack of standard library header
units.

gcc/
* doc/invoke.texi (C++ Modules): Document lack of std
library header units.

vect, aarch64: Extend SVE vs Advanced SIMD costing decisions in vect_better_loop_vinfo_p

While experimenting with some backend costs for Advanced SIMD and SVE I
hit many cases where GCC would pick SVE for VLA auto-vectorisation even when
the backend very clearly presented cheaper costs for Advanced SIMD.
For a simple float addition loop the SVE costs were:

vec.c:9:21: note:  Cost model analysis:
  Vector inside of loop cost: 28
  Vector prologue cost: 2
  Vector epilogue cost: 0
  Scalar iteration cost: 10
  Scalar outside cost: 0
  Vector outside cost: 2
  prologue iterations: 0
  epilogue iterations: 0
  Minimum number of vector iterations: 1
  Calculated minimum iters for profitability: 4

and for Advanced SIMD (Neon) they're:

vec.c:9:21: note:  Cost model analysis:
  Vector inside of loop cost: 11
  Vector prologue cost: 0
  Vector epilogue cost: 0
  Scalar iteration cost: 10
  Scalar outside cost: 0
  Vector outside cost: 0
  prologue iterations: 0
  epilogue iterations: 0
  Calculated minimum iters for profitability: 0
vec.c:9:21: note:    Runtime profitability threshold = 4

yet the SVE one was always picked. With guidance from Richard this seems
to be due to the vinfo comparisons in vect_better_loop_vinfo_p, in
particular the part with the big comment explaining the
estimated_rel_new * 2 <= estimated_rel_old heuristic.

This patch extends the comparisons by introducing a three-way estimate
kind for poly_int values that the backend can distinguish.
This allows vect_better_loop_vinfo_p to ask for minimum, maximum and
likely estimates and pick Advanced SIMD over SVE when it is clearly cheaper.
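
A toy model of a three-way estimate, assuming a value of the form c0 + c1 * x where x is a non-negative quantity known only at run time and c1 is non-negative. The types poly_sketch and target_sketch, the enumerator names and the function estimate are hypothetical; they are not GCC's poly_int64 or its target hook.

```cpp
#include <cassert>

// Hypothetical mirror of a min / max / likely estimate kind.
enum class estimate_kind { min, max, likely };

// Toy stand-in for a polynomial value c0 + c1 * x with x >= 0 unknown
// at compile time (assumes c1 >= 0).
struct poly_sketch {
  long c0;
  long c1;
};

// Toy description of what the target knows about x.
struct target_sketch {
  long x_max;     // largest x the hardware allows
  long x_likely;  // x expected on the tuned-for hardware
};

long estimate(poly_sketch p, estimate_kind kind, target_sketch t)
{
  switch (kind) {
  case estimate_kind::min:
    return p.c0;                      // x = 0
  case estimate_kind::max:
    return p.c0 + p.c1 * t.x_max;
  case estimate_kind::likely:
    return p.c0 + p.c1 * t.x_likely;
  }
  return p.c0;
}
```

With all three estimates available, a caller can compare two candidates over the whole range of x rather than at a single guessed value.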

gcc/
* target.h (enum poly_value_estimate_kind): Define.
(estimated_poly_value): Take an estimate kind argument.
* target.def (estimated_poly_value): Update definition for the
above.
* doc/tm.texi: Regenerate.
* targhooks.c (estimated_poly_value): Update prototype.
* tree-vect-loop.c (vect_better_loop_vinfo_p): Use min, max and
likely estimates of VF to pick between vinfos.
* config/aarch64/aarch64.c (aarch64_cmp_autovec_modes): Use
estimated_poly_value instead of aarch64_estimated_poly_value.
(aarch64_estimated_poly_value): Take a kind argument and handle
it.

c++: Fix clang problem [PR 98340]

Clang didn't like sizeof (uintset::value) in a templated context. Not sure
where the problem lies -- ambiguous std, gcc erroneous accept or clang erroneous
reject. Anyway, this avoids that construct.

PR c++/98340
gcc/cp/
* module.cc (uintset<T>::hash::add): Use uintset (0u).MEMBER,
rather than uintset::MEMBER.

libcody: Allow PIC [PR 98324]

While this doesn't fix 98324, it was an omission. Cribbed code from
libcpp to build libcody as PIC.

libcody/
* configure.ac: Add --enable-host-shared.
* Makefile.in: Add FLAGPIC.
* configure: Regenerated.

libstdc++: Test errno macros directly for all targets [PR 93151]

This applies the same changes to the djgpp and mingw versions of
error_constants.h as r11-6137 did for the generic version.

All of these constants are defined as macros by <errno.h> on these
targets, so we can just test the macro directly instead of checking for
it at configure time.
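
A minimal sketch of testing an errno macro directly, assuming a hypothetical enumeration errc_sketch rather than the real error_constants.h. An enumerator is only declared when <cerrno> defines the corresponding macro, so no configure-time check is needed.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical enumeration: each optional enumerator is guarded by the
// errno macro itself.
enum class errc_sketch {
  invalid_argument = EINVAL,
  result_out_of_range = ERANGE,
#ifdef EOWNERDEAD
  owner_dead = EOWNERDEAD,
#endif
#ifdef ENOTRECOVERABLE
  state_not_recoverable = ENOTRECOVERABLE,
#endif
};
```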

libstdc++-v3/ChangeLog:

* config/os/djgpp/error_constants.h: Test POSIX errno macros
directly, instead of corresponding _GLIBCXX_HAVE_EXXX macros.
* config/os/mingw32-w64/error_constants.h: Likewise.
* config/os/mingw32/error_constants.h: Likewise.

libstdc++: Fix condition for gthreads-timed effective-target

The refactoring in r11-5500 altered the condition for the gthreads-timed
test from #if to #ifdef. For some reason that macro is always defined,
rather than being defined to 1 or undefined like most of our autoconf
macros. That means the test always passes now, even for targets where
the macro is defined to 0 (specifically, Darwin). That causes some tests
to FAIL when they should have been UNSUPPORTED.

This restores the previous behaviour.
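
The difference between the two conditions can be shown with a macro that is always defined, sometimes to 0. SKETCH_USE_TIMEDLOCK is a hypothetical stand-in for such a macro, not the real one.

```cpp
#include <cassert>

// Hypothetical macro that is always defined, here to 0 ("unsupported").
#define SKETCH_USE_TIMEDLOCK 0

// #ifdef only asks whether the macro exists, so it is true even for 0.
#ifdef SKETCH_USE_TIMEDLOCK
constexpr bool enabled_per_ifdef = true;
#else
constexpr bool enabled_per_ifdef = false;
#endif

// #if evaluates the macro's value, so 0 correctly disables the feature.
#if SKETCH_USE_TIMEDLOCK
constexpr bool enabled_per_if = true;
#else
constexpr bool enabled_per_if = false;
#endif
```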

libstdc++-v3/ChangeLog:

* testsuite/lib/libstdc++.exp (check_v3_target_gthreads_timed):
Fix condition for _GTHREAD_USE_MUTEX_TIMEDLOCK test.

arm: Fix bootstrap

gcc/ChangeLog

2020-12-17 Andrea Corallo <andrea.corallo@arm.com>

* config/arm/arm_neon.h (vcreate_p64): Remove call to
'__builtin_neon_vcreatedi'.

Fix trap in pointer conversion in op1_range.

Processing op1_range for conversion between a non-pointer and pointer
shouldn't do any fancy math.

gcc/
PR tree-optimization/97750
* range-op.cc (operator_cast::op1_range): Handle pointers better.
gcc/testsuite/
* gcc.dg/pr97750.c: New.

rtl-ssa: Include memmodel.h before tm_p.h

The RTL SSA merge broke SPARC bootstrap:

In file included from ./tm_p.h:4,
                 from /vol/gcc/src/hg/master/local/gcc/rtl-ssa.h:54,
                 from /vol/gcc/src/hg/master/local/gcc/fwprop.c:29:
/vol/gcc/src/hg/master/local/gcc/config/sparc/sparc-protos.h:45:47: error: use of enum 'memmodel' without previous declaration
extern void sparc_emit_membar_for_model (enum memmodel, int, int);
                                               ^~~~~~~~

and similarly in rtl-ssa/functions.cc, rtl-ssa/changes.cc, and
rtl-ssa/insns.cc.

Fixed by moving the memmodel.h include in rtl-ssa.h before tm_p.h.

Tested on sparc-sun-solaris2.11 and i386-pc-solaris2.11.

2020-12-17  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

gcc:
* rtl-ssa.h: Include memmodel.h before tm_p.h.

bootstrap: Don't use strsignal [PR 98300]

Sadly strsignal is nonportable, so signal numbers it is then.

c++tools/
* server.cc (crash_signal): Don't use strsignal.
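
A portable alternative can be sketched as follows, assuming a hypothetical helper describe_signal (not the code in server.cc): report the bare signal number instead of asking the C library for a name.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: describe a signal by number only, avoiding the
// nonportable strsignal.
std::string describe_signal(int signo)
{
  char buf[32];
  std::snprintf(buf, sizeof buf, "signal %d", signo);
  return buf;
}
```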

libstdc++: Fix -Wunused warning

As noted in PR 66146 comment 35, there is a new warning in the new
std::call_once implementation.

libstdc++-v3/ChangeLog:

* src/c++11/mutex.cc (std::once_flag::_M_finish): Add
maybe_unused attribute to variable used in assertion.
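
The pattern can be sketched as below, with a hypothetical function finish_sketch that is unrelated to the real _M_finish. A variable read only inside assert becomes unused when NDEBUG is defined, and [[maybe_unused]] silences the resulting warning.

```cpp
#include <cassert>

// Hypothetical example: 'previous' is only read by the assertion, so it
// would trigger -Wunused-variable in builds that define NDEBUG.
int finish_sketch(int state)
{
  [[maybe_unused]] const int previous = state;
  assert(previous >= 0);
  return state + 1;
}
```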

libstdc++: Fix preprocessor condition [PR 98344]

libstdc++-v3/ChangeLog:

PR libstdc++/98344
* include/bits/semaphore_base.h: Fix preprocessor condition.

libstdc++: Move std::hash<std::thread::id> to <bits/std_thread.h>

This makes the hash function available without including the whole of
<thread>, which is needed for <barrier>.

libstdc++-v3/ChangeLog:

* include/bits/std_thread.h (hash<thread::id>): Move here,
from ...
* include/std/thread (hash<thread::id>): ... here.

libstdc++: Regenerate autoconf files

I forgot to regenerate these files in r11-6137.

libstdc++-v3/ChangeLog:

* config.h.in: Regenerate.
* configure: Regenerate.

bootstrap: Fix some windows issues [PR 98300]

When breaking out the sample server from the gcc/cp directory, it lost
its check for mmap, and the sample resolver just assumed it was there.
Fixed thusly.  The non-mapping paths in module.cc weren't (recently)
exercised, and led to a signedness warning.  Finally I'd missed
c++tools's config.h.in in the gcc_update script.  There I took the
opportunity of adding a 'tools' segment of the dependency lists.

PR bootstrap/98300
contrib/
* gcc_update: Add c++tools/config.h.in.
c++tools/
* configure.ac: Check for sys/mman.h.
* resolver.cc: Don't assume mmap, O_CLOEXEC are available.  Use
xmalloc.
* config.h.in: Regenerated.
* configure: Regenerated.
gcc/cp/
* module.cc: Fix ::read, ::write result signedness comparisons.

libcody: Remove nop asm

This asm was a useful place for gdb to drop a breakpoint and make it
clear where you were when debugging. I took a punt that 'surely every
arch has a nop instruction'. Well, no, some apparently have nops with
operands (what, do nothing harder? :)

libcody/
* fatal.cc (HCF): Remove nop breakpoint lander.

c++tools: Fix up c++tools for --with-gcc-major-version-only

Seems c++tools doesn't honor --with-gcc-major-version-only.
Our distro uses that flag and so everything is installed in
/usr/lib/gcc/<target>/11/...
/usr/libexec/gcc/<target>/11/...
except
/usr/libexec/gcc/<target>/11.0.0/g++-mapper-server

The following patch should fix that.

2020-12-17 Jakub Jelinek <jakub@redhat.com>

* configure.ac: Add GCC_BASE_VER.
* Makefile.in (version): Remove variable.
(gcc_version): New variable.
(libexecsubdir): Use $(gcc_version) instead of $(version).
* configure: Regenerated.

shrink-wrap: Don't punt on incoming EDGE_CROSSING [PR98289]

As mentioned in the PR, shrink-wrapping disqualifies for prologue
placement basic blocks that have EDGE_CROSSING incoming edge.
I don't see why that is necessary, those edges seem to be redirected
just fine, both on x86_64 and powerpc64.  In the former case, they
are usually conditional jumps that patch_jump_insn can handle just fine,
after all, they were previously crossing and will be crossing after
the redirection too, just to a different label.  And in the powerpc64
case, it is a simple_jump instead that again seems to be handled by
patch_jump_insn just fine.
Sure, redirecting an edge that was previously not crossing to be crossing or
vice versa can fail, but that is not what shrink-wrapping needs.
Also tested in GCC 8 with this patch and don't see ICEs there either
(though, of course, I'm not suggesting we should backport this to release
branches).
The old ICEs could have been fixed by PR87475 fix or some other one
years ago.

2020-12-17  Jakub Jelinek  <jakub@redhat.com>

PR rtl-optimization/98289
* shrink-wrap.c (can_get_prologue): Don't punt on EDGE_CROSSING
incoming edges.

* gcc.target/i386/pr98289.c: New test.
* gcc.dg/torture/pr98289.c: New test.

[Ada] Performance of CW_Membership

gcc/ada/

* libgnat/a-tags.ads, libgnat/a-tags.adb (CW_Membership): Move
to spec to allow inlining.

gcc/testsuite/

* gnat.dg/debug15.adb: Remove fragile testcase.

[Ada] Remove unused subprograms in validsw

gcc/ada/

* checks.adb: Remove, not used.
* checks.ads: Likewise.
* exp_ch6.adb: Likewise.
* exp_ch7.adb: Likewise.
* exp_ch7.ads: Likewise.
* exp_fixd.adb: Likewise.
* exp_tss.adb: Likewise.
* exp_tss.ads: Likewise.
* exp_util.adb: Likewise.
* exp_util.ads: Likewise.
* gnat1drv.adb: Likewise.
* libgnat/s-finmas.adb: Likewise.
* libgnat/s-finmas.ads: Likewise.
* libgnat/system-aix.ads: Likewise.
* libgnat/system-darwin-arm.ads: Likewise.
* libgnat/system-darwin-ppc.ads: Likewise.
* libgnat/system-darwin-x86.ads: Likewise.
* libgnat/system-djgpp.ads: Likewise.
* libgnat/system-dragonfly-x86_64.ads: Likewise.
* libgnat/system-freebsd.ads: Likewise.
* libgnat/system-hpux-ia64.ads: Likewise.
* libgnat/system-hpux.ads: Likewise.
* libgnat/system-linux-alpha.ads: Likewise.
* libgnat/system-linux-arm.ads: Likewise.
* libgnat/system-linux-hppa.ads: Likewise.
* libgnat/system-linux-ia64.ads: Likewise.
* libgnat/system-linux-m68k.ads: Likewise.
* libgnat/system-linux-mips.ads: Likewise.
* libgnat/system-linux-ppc.ads: Likewise.
* libgnat/system-linux-riscv.ads: Likewise.
* libgnat/system-linux-s390.ads: Likewise.
* libgnat/system-linux-sh4.ads: Likewise.
* libgnat/system-linux-sparc.ads: Likewise.
* libgnat/system-linux-x86.ads: Likewise.
* libgnat/system-lynxos178-ppc.ads: Likewise.
* libgnat/system-lynxos178-x86.ads: Likewise.
* libgnat/system-mingw.ads: Likewise.
* libgnat/system-qnx-aarch64.ads: Likewise.
* libgnat/system-rtems.ads: Likewise.
* libgnat/system-solaris-sparc.ads: Likewise.
* libgnat/system-solaris-x86.ads: Likewise.
* libgnat/system-vxworks-arm-rtp-smp.ads: Likewise.
* libgnat/system-vxworks-arm-rtp.ads: Likewise.
* libgnat/system-vxworks-arm.ads: Likewise.
* libgnat/system-vxworks-e500-kernel.ads: Likewise.
* libgnat/system-vxworks-e500-rtp-smp.ads: Likewise.
* libgnat/system-vxworks-e500-rtp.ads: Likewise.
* libgnat/system-vxworks-e500-vthread.ads: Likewise.
* libgnat/system-vxworks-ppc-kernel.ads: Likewise.
* libgnat/system-vxworks-ppc-ravenscar.ads: Likewise.
* libgnat/system-vxworks-ppc-rtp-smp.ads: Likewise.
* libgnat/system-vxworks-ppc-rtp.ads: Likewise.
* libgnat/system-vxworks-ppc-vthread.ads: Likewise.
* libgnat/system-vxworks-ppc.ads: Likewise.
* libgnat/system-vxworks-x86-kernel.ads: Likewise.
* libgnat/system-vxworks-x86-rtp-smp.ads: Likewise.
* libgnat/system-vxworks-x86-rtp.ads: Likewise.
* libgnat/system-vxworks-x86-vthread.ads: Likewise.
* libgnat/system-vxworks-x86.ads: Likewise.
* libgnat/system-vxworks7-aarch64-rtp-smp.ads: Likewise.
* libgnat/system-vxworks7-aarch64.ads: Likewise.
* libgnat/system-vxworks7-arm-rtp-smp.ads: Likewise.
* libgnat/system-vxworks7-arm.ads: Likewise.
* libgnat/system-vxworks7-e500-kernel.ads: Likewise.
* libgnat/system-vxworks7-e500-rtp-smp.ads: Likewise.
* libgnat/system-vxworks7-e500-rtp.ads: Likewise.
* libgnat/system-vxworks7-ppc-kernel.ads: Likewise.
* libgnat/system-vxworks7-ppc-rtp-smp.ads: Likewise.
* libgnat/system-vxworks7-ppc-rtp.ads: Likewise.
* libgnat/system-vxworks7-ppc64-kernel.ads: Likewise.
* libgnat/system-vxworks7-ppc64-rtp-smp.ads: Likewise.
* libgnat/system-vxworks7-x86-kernel.ads: Likewise.
* libgnat/system-vxworks7-x86-rtp-smp.ads: Likewise.
* libgnat/system-vxworks7-x86-rtp.ads: Likewise.
* libgnat/system-vxworks7-x86_64-kernel.ads: Likewise.
* libgnat/system-vxworks7-x86_64-rtp-smp.ads: Likewise.
* repinfo.adb: Likewise.
* repinfo.ads: Likewise.
* rtsfind.ads: Likewise.
* sem_aux.adb: Likewise.
* sem_aux.ads: Likewise.
* sem_ch13.adb: Likewise.
* sem_ch13.ads: Likewise.
* sem_util.adb (Validity_Checks_Suppressed, TSS,
Is_All_Null_Statements, Known_Non_Negative,
Non_Limited_Designated_Type, Get_Binary_Nkind, Get_Unary_Nkind,
Is_Protected_Operation, Number_Components, Package_Body,
Validate_Independence, Independence_Checks): Likewise; update
comments.
* targparm.adb: Likewise.
* targparm.ads (AAM, AAM_Str, Fractional_Fixed_Ops,
Frontend_Layout, Make_Detach_Call, Target_Has_Fixed_Ops, Detach,
Back_End_Layout, Create_Dynamic_SO_Ref, Get_Dynamic_SO_Entity,
Is_Dynamic_SO_Ref, Is_Static_SO_Ref,
Fractional_Fixed_Ops_On_Target): Likewise.
* validsw.adb (Save_Validity_Check_Options,
Set_Default_Validity_Check_Options): Likewise.
* validsw.ads: Likewise.

[Ada] Remove unused files

gcc/ada/

* symbols.ads, symbols.adb: Removed, no longer used.

[Ada] Code cleanup: remove Old_Requires_Transient_Scope

gcc/ada/

* sem_util.adb (New_Requires_Transient_Scope): Renamed
Requires_Transient_Scope.
(Requires_Transient_Scope, Old_Requires_Transient_Scope,
Results_Differ): Removed.
* debug.adb: Remove -gnatdQ.

[Ada] Minor comment fix in System.Val_Real

gcc/ada/

* libgnat/s-valrea.adb (Need_Extra): Fix comment.

[Ada] Prevent early exits without restoring a global variable

gcc/ada/

* sem_ch5.adb (Analyze_Case_Statement): Move modification of
Unblocked_Exit_Count after early return statements; fix typo in
comment.

[Ada] Reduce scopes of local variables for case and if statements

gcc/ada/

* sem_ch5.adb (Analyze_Case_Statement): Change local variable
Exp to constant; remove unreferenced Last_Choice variable;
reduce scope of other variables.
(Analyze_If_Statement): Reduce scope of a local variable; add
comment.

[Ada] Refine type of a multi unit index number

gcc/ada/

* opt.ads (Multiple_Unit_Index): Refine type from Int to Nat.

[Ada] Prevent In_Check_Node routine from going too far in the parent chain

gcc/ada/

* sem_util.adb (In_Check_Node): Add guard and rename Node to
Par, just like it is done in surrounding routines, e.g.
In_Assertion_Expression_Pragma and In_Generic_Formal_Package.

[Ada] Ada2020: AI12-0400 Ambiguities associated with Vector

gcc/ada/

* libgnat/a-cbdlli.adb, libgnat/a-cbdlli.ads,
libgnat/a-cdlili.adb, libgnat/a-cdlili.ads,
libgnat/a-cidlli.adb, libgnat/a-cidlli.ads,
libgnat/a-cobove.adb, libgnat/a-cobove.ads,
libgnat/a-coinve.adb, libgnat/a-coinve.ads,
libgnat/a-convec.adb, libgnat/a-convec.ads: Add *_Vector
operations, remove default for Count, rename Append_One to be
Append.

[Ada] Crash on if expression inside declare expression

gcc/ada/

* sem_res.adb (Resolve_Declare_Expression): Need to establish a
transient scope in case Expression (N) requires actions to be
wrapped. Code cleanup.
* exp_ch7.adb, exp_ch11.adb: Code cleanup.

[Ada] Consistent wording for missing -gnat2020 switch

gcc/ada/

* par-ch3.adb (P_Identifier_Declarations): Reuse
Error_Msg_Ada_2020_Feature for object renaming without subtype.
* par-ch4.adb (P_Primary): Likewise for target name.
(P_Iterated_Component_Association): Likewise for iterated
component.
(P_Declare_Expression): Likewise for declare expression.
* par-ch6.adb (P_Formal_Part): Likewise for aspect on formal
parameter.
* sem_aggr.adb (Resolve_Delta_Aggregate): Ditto.
* sem_ch8.adb (Analyze_Object_Renaming): Reuse
Error_Msg_Ada_2020_Feature.
* sem_ch13.adb (Validate_Aspect_Aggregate): Reuse
Error_Msg_Ada_2020_Feature; use lower case for "aspect" and
don't use underscore for "Ada_2020"; don't give up on analysis
in Ada 2012 mode.
(Validate_Aspect_Stable_Properties): Reuse
Error_Msg_Ada_2020_Feature; use lower case for "aspect"; minor
style fixes.

[Ada] Remove discriminant checks processing in gigi

gcc/ada/

* sem_ch4.adb (Analyze_Selected_Component): Request a compile
time error replacement in Apply_Compile_Time_Constraint_Error
in case of an invalid field.
* sem_ch3.adb (Create_Constrained_Components): Take advantage of
Gather_Components also in the case of a record extension and
also constrain records in the case of compile time known discriminant
values, as already done in gigi.
* sem_util.ads, sem_util.adb (Gather_Components): New parameter
Allow_Compile_Time to allow compile time known (but non static)
discriminant values, needed by Create_Constrained_Components,
and new parameter Include_Interface_Tag.
(Is_Dependent_Component_Of_Mutable_Object): Use Original_Node to
perform check on the original tree.
(Is_Object_Reference): Likewise. Only call Original_Node when
relevant via a new function Safe_Prefix.
(Is_Static_Discriminant_Component, In_Check_Node): New.
(Is_Actual_Out_Or_In_Out_Parameter): New.
* exp_ch4.adb (Expand_N_Selected_Component): Remove no longer needed
code preventing evaluating statically discriminants in more cases.
* exp_ch5.adb (Expand_N_Loop_Statement): Simplify expansion of loops
with an N_Raise_xxx_Error node to avoid confusing the code generator.
(Make_Component_List_Assign): Try to find a constrained type to
extract discriminant values from, so that the case statement
built gets an opportunity to be folded by
Expand_N_Case_Statement.
(Expand_Assign_Record): Update comments, code cleanups.
* sem_attr.adb (Analyze_Attribute): Perform most of the analysis
on the original prefix node to deal properly with a prefix rewritten
as a N_Raise_xxx_Error.
* sem_ch5.adb (Analyze_Loop_Parameter_Specification): Handle properly
a discrete subtype definition being rewritten as N_Raise_xxx_Error.
* sem_ch8.adb (Analyze_Object_Renaming): Handle N_Raise_xxx_Error
nodes as part of the expression being renamed.
* sem_eval.ads, sem_eval.adb (Fold, Eval_Selected_Component): New.
(Compile_Time_Known_Value, Expr_Value, Expr_Rep_Value): Evaluate
static discriminant component values.
* sem_res.adb (Resolve_Selected_Component): Call
Eval_Selected_Component.

[Ada] Move folding of unchecked conversions from expansion to evaluation

gcc/ada/

* exp_ch4.adb (Expand_N_Unchecked_Type_Conversion): Remove
folding of discrete values.
* exp_intr.adb (Expand_Unc_Conversion): Analyze, resolve and
evaluate (if possible) calls to instances of
Ada.Unchecked_Conversion after they have been expanded into
N_Unchecked_Type_Conversion.
* sem_eval.adb (Eval_Unchecked_Conversion): Add folding of
discrete values.

[Ada] Do not use exponentiation for common bases in floating-point Value

gcc/ada/

* Makefile.rtl (GNATRTL_NONTASKING_OBJS): Likewise.
* exp_imgv.adb (Expand_Value_Attribute): Use RE_Value_Long_Float in
lieu of RE_Value_Long_Long_Float as fallback for fixed-point types.
Also use it for Long_Long_Float if it has same size as Long_Float.
* libgnat/s-imgrea.adb: Replace Powten_Table with Powten_LLF.
* libgnat/s-powflt.ads: New file.
* libgnat/s-powlfl.ads: Likewise.
* libgnat/s-powtab.ads: Rename to...
* libgnat/s-powllf.ads: ...this.
* libgnat/s-valflt.ads: Add with clause for System.Powten_Flt and
pass its table as actual parameter to System.Val_Real.
* libgnat/s-vallfl.ads: Likewise for System.Powten_LFlt.
* libgnat/s-valllf.ads: Likewise for System.Powten_LLF.
* libgnat/s-valrea.ads: Add Maxpow and Powten_Address parameters.
* libgnat/s-valrea.adb: Add pragma Warnings (Off).
(Need_Extra): New boolean constant.
(Precision_Limit): Set it according to Need_Extra.
(Impl): Adjust actual parameter.
(Integer_to_Rea): Add assertion on the machine radix. Take into
account the extra digit only if Need_Extra is true. Reimplement
the computation of the final value for bases 2, 4, 8, 10 and 16.
* libgnat/s-valued.adb (Impl): Adjust actual parameter.
(Scan_Decimal): Add pragma Unreferenced.
(Value_Decimal): Likewise.
* libgnat/s-valuef.adb (Impl): Adjust actual parameter.
* libgnat/s-valuer.ads (Floating): Remove.
(Round): New formal parameter.
* libgnat/s-valuer.adb (Round_Extra): New procedure.
(Scan_Decimal_Digits): Use it to round the extra digit if Round
is set to True in the instantiation.
(Scan_Integral_Digits): Likewise.

[Ada] Fix small typo in comments.

gcc/ada/

* libgnat/system-lynxos178-ppc.ads,
libgnat/system-lynxos178-x86.ads: Fix small typo in comments.

[Ada] Do not generate encodings for fixed-point types by default

gcc/ada/

* exp_dbug.adb (Get_Encoded_Name): Generate encodings for fixed
point types only if -fgnat-encodings=all is specified.

[Ada] Crash on discriminant check with current instance

gcc/ada/

* checks.adb (Build_Discriminant_Checks): Add condition to
replace references to the current instance of the type when we
are within an Init_Proc.
(Replace_Current_Instance): Examine a given node and replace the
current instance of the type with the corresponding _init
formal.
(Search_And_Replace_Current_Instance): Traverse proc which calls
Replace_Current_Instance in order to replace all references
within a given expression.

[Ada] Better diagnostic for new language features

gcc/ada/

* par-ch12.adb (P_Formal_Derived_Type_Definition): Complain
about formal type with aspect specification, which only becomes
legal in Ada 2020.
* par-ch9.adb (P_Protected_Operation_Declaration_Opt): Reuse
Error_Msg_Ada_2005_Extension.
(P_Entry_Declaration): Likewise.
* scng.adb (Scan): Improve diagnostics for target_name; emit
error, but otherwise continue in earlier than Ada 2020 modes.

[Ada] Spurious discriminant check on bounded synchronized queue

gcc/ada/

* libgnat/a-cbsyqu.ads (Implementation): Provide a box
initialization for the element array used internally to
represent the queue, so that its components are properly
initialized if the given element type has default
initialization. Suppress warnings on the rest of the package in
case the element type has no default or discriminant, because it
is bound to be confusing to the user.

[Ada] Assert failure on b38105a in -gnat95 mode

gcc/ada/

* sem_util.adb (Inherit_Predicate_Flags): No-op before Ada 2012.

[Ada] Compiler crash on protected component of controlled type

gcc/ada/

* exp_ch7.adb (Make_Final_Call, Make_Init_Call): Take protected
types into account.
* sem_util.ads: Fix typo.

[Ada] Fixes for GNAT error/warning messages

gcc/ada/

* checks.adb: Rework error messages.
* exp_ch3.adb: Likewise.
* freeze.adb: Likewise.
* lib-load.adb: Likewise.
* par-ch12.adb: Likewise.
* par-ch3.adb: Likewise.
* par-ch4.adb: Likewise.
* par-ch9.adb: Likewise.
* sem_aggr.adb: Likewise.
* sem_attr.adb: Likewise.
* sem_cat.adb: Likewise.
* sem_ch10.adb: Likewise.
* sem_ch12.adb: Likewise.
(Instantiate_Type): Fix CODEFIX comment, applicable only on
continuation message, and identify the second message as a
continuation.
* sem_ch13.adb: Rework error messages.
* sem_ch3.adb: Likewise.
* sem_ch4.adb: Likewise.
* sem_ch5.adb: Likewise.
* sem_ch6.adb: Likewise.
* sem_ch8.adb: Likewise.
* sem_ch9.adb: Likewise.
* sem_prag.adb: Likewise.
* sem_res.adb: Likewise.
* sem_util.adb: Likewise.
(Wrong_Type): Fix CODEFIX comment, applicable only on
continuation message, and identify the second message as a
continuation.
* symbols.adb: Rework error messages.

gcc/testsuite/

* gnat.dg/interface6.adb, gnat.dg/not_null.adb,
gnat.dg/protected_func.adb: Adjust error messages.

[Ada] Spurious error on Type'Access and <>

gcc/ada/

* sem_attr.adb (OK_Self_Reference): Return True if node does not
come from source (e.g. a rewritten aggregate).

[Ada] Style cleanups in Parse_Aspect_Stable_Properties

gcc/ada/

* sem_ch13.adb (Parse_Aspect_Stable_Properties): Fix style;
limit the scope of local variables; remove extra assignment in
Extract_Entity.
(Validate_Aspect_Stable_Properties): Simplify with procedural
Next.

IBM Z: Detect libc's float_t behavior on cross compiles

When cross-compiling GCC with target libc headers available and
configure option --enable-s390-excess-float-precision has been omitted,
identify whether they clamp float_t to double or respect
__FLT_EVAL_METHOD__ via a compile test that coerces the build-system
compiler to use the target headers. Then derive the setting from that.
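
For illustration only: the kind of property such a compile test looks at can be sketched in standalone C++. This is not the test in configure.ac; the names float_t_kind and classify_float_t are hypothetical, and the sketch only inspects whatever headers the compiler happens to use.

```cpp
#include <cassert>
#include <cfloat>
#include <math.h>
#include <type_traits>

// Hypothetical classification of how the libc headers define float_t.
enum class float_t_kind
{
  follows_eval_method,  // FLT_EVAL_METHOD is 0 and float_t is float.
  clamped_to_double,    // FLT_EVAL_METHOD is 0 but float_t is double.
  undetermined          // FLT_EVAL_METHOD is nonzero; this check says nothing.
};

// Classify the headers in use.  With FLT_EVAL_METHOD == 0, headers that
// respect the evaluation method define float_t as float, whereas headers
// that clamp it define float_t as double.
inline float_t_kind classify_float_t ()
{
  if (FLT_EVAL_METHOD != 0)
    return float_t_kind::undetermined;
  if (std::is_same<float_t, float>::value)
    return float_t_kind::follows_eval_method;
  if (std::is_same<float_t, double>::value)
    return float_t_kind::clamped_to_double;
  return float_t_kind::undetermined;
}
```

A configure-time check would normally be a compile-only test rather than a function call, since a cross-compiled program cannot be run on the build system.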

gcc/ChangeLog:

2020-12-16 Marius Hillenbrand <mhillen@linux.ibm.com>

* configure.ac: Change --enable-s390-excess-float-precision
default behavior for cross compiles with target headers.
* configure: Regenerate.
* doc/install.texi: Adjust documentation.

MAINTAINERS: Add myself for write after approval.

ChangeLog:

2020-12-17 Marius Hillenbrand <mhillen@linux.ibm.com>

* MAINTAINERS (Write After Approval): Add myself.

Fortran: Delay vtab generation until after parsing [PR92587]

gcc/fortran/ChangeLog:

PR fortran/92587
* match.c (gfc_match_assignment): Move gfc_find_vtab call from here ...
* resolve.c (gfc_resolve_code): ... to here.

gcc/testsuite/ChangeLog:

PR fortran/92587
* gfortran.dg/finalize_37.f90: New test.

PR fortran/98307 - Dependency check fails when using "allocatable"

The dependency check for FORALL constructs already handled pointer
components to derived types, but missed allocatables. Fix that.

gcc/fortran/ChangeLog:

PR fortran/98307
* trans-stmt.c (check_forall_dependencies): Extend dependency
check to allocatable components of derived types.

gcc/testsuite/ChangeLog:

PR fortran/98307
* gfortran.dg/forall_19.f90: New test.

test: add new Go tests from source repo

gcc: xtensa: add optimizations for shift operations

2020-12-16 Takayuki 'January June' Suwa <jjsuwa_sys3175@yahoo.co.jp>
gcc/
* config/xtensa/xtensa.md (*ashlsi3_1, *ashlsi3_3x, *ashrsi3_3x)
(*lshrsi3_3x): New patterns.

gcc/testsuite/
* gcc.target/xtensa/shifts.c: New test.

Daily bump.

fwprop: Rewrite to use RTL SSA

This patch rewrites fwprop.c to use the RTL SSA framework.  It tries
as far as possible to mimic the old behaviour, even in cases where
that doesn't fit naturally with the new framework.  I've added ???
comments to mark those places, but I think “fixing” them should
be done separately to make bisection easier.

In particular:

* The old implementation iterated over uses, and after a successful
  substitution, the new insn's uses were added to the end of the list.
  The pass still processed those uses, but because it processed them at
  the end, it didn't fully optimise one instruction before propagating
  it into the next.

  The new version follows the same approach for comparison purposes,
  but I'd like to drop that as a follow-on patch.

* The old implementation operated on single use sites (DF_REF_LOCs).
  This doesn't work well for instructions with match_dups, where it's
  necessary to update both an operand and its dups at the same time.
  For example, attempting to substitute into a divmod instruction would
  fail because only the div or the mod side would be updated.

  The new version again follows this to some extent for comparison
  purposes (although not exactly).  Again I'd like to drop it as a
  follow-on patch.

  One difference is that if a register occurs in multiple MEM addresses
  in a set, the new version will try to update them all at once.  This is
  what causes the SVE ACLE st4* output to improve.

Also, the old version didn't naturally guarantee termination (PR79405),
whereas the new one does.

gcc/
* fwprop.c: Rewrite to use the RTL SSA framework.

gcc/testsuite/
* gcc.dg/rtl/x86_64/test-return-const.c.before-fwprop.c: Don't
expect insn updates to be deferred.
* gcc.target/aarch64/sve/acle/asm/st4_s8.c: Expect the addition
to be folded into the address.
* gcc.target/aarch64/sve/acle/asm/st4_u8.c: Likewise.

Add rtl-ssa

This patch adds the RTL SSA infrastructure itself. The following
fwprop.c patch will make use of it.

gcc/
* configure.ac: Add rtl-ssa to the list of dependence directories.
* configure: Regenerate.
* Makefile.in (rtl-ssa-warn): New variable.
(OBJS): Add the rtl-ssa object files.
* emit-rtl.h (rtl_data::ssa): New field.
* rtl-ssa.h: New file.
* system.h: Include <functional> when INCLUDE_FUNCTIONAL is defined.
* rtl-ssa/access-utils.h: Likewise.
* rtl-ssa/accesses.h: New file.
* rtl-ssa/accesses.cc: Likewise.
* rtl-ssa/blocks.h: New file.
* rtl-ssa/blocks.cc: Likewise.
* rtl-ssa/change-utils.h: Likewise.
* rtl-ssa/changes.h: New file.
* rtl-ssa/changes.cc: Likewise.
* rtl-ssa/functions.h: New file.
* rtl-ssa/functions.cc: Likewise.
* rtl-ssa/insn-utils.h: Likewise.
* rtl-ssa/insns.h: New file.
* rtl-ssa/insns.cc: Likewise.
* rtl-ssa/internals.inl: Likewise.
* rtl-ssa/is-a.inl: Likewise.
* rtl-ssa/member-fns.inl: Likewise.
* rtl-ssa/movement.h: Likewise.

doc: Add documentation for rtl-ssa

This patch adds some documentation to rtl.texi about the SSA form.
It only really describes the high-level structure -- I think for
API-level stuff it's better to rely on function comments instead.

gcc/
* doc/rtl.texi (RTL SSA): New node.

rtlanal: Add simple_regno_set

This patch adds a routine for finding a “simple” SET for a register
definition. See the comment in the patch for details.

gcc/
* rtl.h (simple_regno_set): Declare.
* rtlanal.c (simple_regno_set): New function.

rtlanal: Add some new helper classes

This patch adds some classes for gathering the list of registers
and memory that are read and written by an instruction, along
with various properties about the accesses.  In some ways it's
similar to the information that DF collects for registers,
but extended to memory.  The main reason for using it instead
of DF is that it can analyse tentative changes to instructions
before they've been committed.

The classes also collect general information about the instruction,
since it's cheap to do and helps to avoid multiple walks of the same
RTL pattern.

I've tried to optimise the code quite a bit, since with later patches
it becomes relatively performance-sensitive.  See the discussion in
the comments for the trade-offs involved.

I put the declarations in a new rtlanal.h header file since it
seemed a bit excessive to put so much new inline stuff in rtl.h.

gcc/
* rtlanal.h: New file.
(MEM_REGNO): New constant.
(rtx_obj_flags): New namespace.
(rtx_obj_reference, rtx_properties): New classes.
(growing_rtx_properties, vec_rtx_properties_base): Likewise.
(vec_rtx_properties): New alias.
* rtlanal.c: Include it.
(rtx_properties::try_to_add_reg): New function.
(rtx_properties::try_to_add_dest): Likewise.
(rtx_properties::try_to_add_src): Likewise.
(rtx_properties::try_to_add_pattern): Likewise.
(rtx_properties::try_to_add_insn): Likewise.
(vec_rtx_properties_base::grow): Likewise.

recog: Add an RAII class for undoing insn changes

When using validate_change to make a group of changes, you have
to remember to cancel them if something goes wrong. This patch
adds an RAII class to make that easier. See the comments in the
patch for details and examples.
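
For illustration only, the general shape of such a guard can be sketched in standalone C++. This is not the class added to recog.h: pending_change, tentative_change, cancel_changes_from, change_watermark and try_update are hypothetical stand-ins, and plain ints take the place of rtl.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a pending change group: each entry remembers
// where a value lives and what it held before the tentative change.
struct pending_change { int *loc; int old_value; };

static std::vector<pending_change> pending_changes;

// Tentatively store NEW_VALUE in *LOC, recording how to undo it.
inline void tentative_change (int *loc, int new_value)
{
  pending_changes.push_back ({loc, *loc});
  *loc = new_value;
}

// Undo every pending change made after the first NUM entries, newest first.
inline void cancel_changes_from (std::size_t num)
{
  while (pending_changes.size () > num)
    {
      pending_change c = pending_changes.back ();
      pending_changes.pop_back ();
      *c.loc = c.old_value;
    }
}

// RAII guard: records the number of pending changes on construction and
// rolls back anything newer on destruction, unless keep () was called.
class change_watermark
{
public:
  change_watermark () : m_old_num (pending_changes.size ()) {}
  ~change_watermark () { cancel_changes_from (m_old_num); }
  change_watermark (const change_watermark &) = delete;
  change_watermark &operator= (const change_watermark &) = delete;

  // Accept the changes made so far, so that the destructor keeps them.
  void keep () { m_old_num = pending_changes.size (); }

private:
  std::size_t m_old_num;
};

// Example: make two tentative changes, but give up if VALID is false.
inline bool try_update (int &a, int &b, bool valid)
{
  change_watermark watermark;
  tentative_change (&a, 10);
  tentative_change (&b, 20);
  if (!valid)
    return false;  // The destructor restores A and B.
  watermark.keep ();
  return true;
}
```

The early return in try_update needs no explicit cleanup, which is the point of putting the rollback in a destructor.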

gcc/
* recog.h (insn_change_watermark): New class.

recog: Add a class for propagating into insns

This patch adds yet another way of propagating into an instruction and
simplifying the result.  (The net effect of the series is to keep the
total number of propagation approaches the same though, since a later
patch removes the fwprop.c routines.)

One of the drawbacks of the validate_replace_* routines is that
they only do simple simplifications, mostly canonicalisations:

  /* Do changes needed to keep rtx consistent.  Don't do any other
     simplifications, as it is not our job.  */
  if (simplify)
    simplify_while_replacing (loc, to, object, op0_mode);

But substituting can often lead to real simplification opportunities.
simplify-rtx.c:simplify_replace_rtx does fully simplify the result,
but it only operates on specific rvalues rather than full instruction
patterns.  It is also nondestructive, which means that it returns a
new rtx whenever a substitution or simplification was possible.
This can create quite a bit of garbage rtl in the context of a
speculative recog, where changing the contents of a pointer is
often enough.

The new routines are therefore supposed to provide simplify_replace_rtx-
style substitution in recog.  They go to some effort to prevent garbage
rtl from being created.

At the moment, the new routines fail if the pattern would still refer
to the old "from" value in some way.  That might be unnecessary in
some contexts; if so, it could be put behind a configuration parameter.

gcc/
* recog.h (insn_propagation): New class.
* recog.c (insn_propagation::apply_to_mem_1): New function.
(insn_propagation::apply_to_rvalue_1): Likewise.
(insn_propagation::apply_to_lvalue_1): Likewise.
(insn_propagation::apply_to_pattern_1): Likewise.
(insn_propagation::apply_to_pattern): Likewise.
(insn_propagation::apply_to_rvalue): Likewise.

recog: Add a way of temporarily undoing changes

In some cases, it can be convenient to roll back the changes that
have been made by validate_change to see how things looked before,
then reroll the changes. For example, this makes it possible
to defer calculating the cost of an instruction until we know that
the result is actually needed. It can also make dumps easier to read.

This patch adds a couple of helper functions for doing that.
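
For illustration only, one way to get this roll-back and reroll behaviour is to swap each location with its saved value, so that the same record serves both directions. The sketch below is standalone C++ under that assumption and is not the recog.c code; change and change_group are hypothetical names, with ints standing in for rtl.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical change record: a location and the value it held before.
struct change { int *loc; int old_value; };

class change_group
{
public:
  // Tentatively store NEW_VALUE in *LOC.  Not allowed while changes are
  // temporarily undone.
  void add (int *loc, int new_value)
  {
    assert (m_undone == 0);
    m_changes.push_back ({loc, *loc});
    *loc = new_value;
  }

  // Swap the current and saved values of changes [NUM, size), newest
  // first, so that the locations show their earlier contents.
  void temporarily_undo (std::size_t num = 0)
  {
    assert (m_undone == 0 && num <= m_changes.size ());
    for (std::size_t i = m_changes.size (); i > num; --i)
      std::swap (*m_changes[i - 1].loc, m_changes[i - 1].old_value);
    m_undone = m_changes.size () - num;
  }

  // Reapply the changes that temporarily_undo rolled back, oldest first.
  void redo ()
  {
    std::size_t first = m_changes.size () - m_undone;
    for (std::size_t i = first; i < m_changes.size (); ++i)
      std::swap (*m_changes[i].loc, m_changes[i].old_value);
    m_undone = 0;
  }

private:
  std::vector<change> m_changes;
  std::size_t m_undone = 0;  // Number of changes currently rolled back.
};
```

Undoing newest-first and redoing oldest-first matters when the same location is changed more than once in a group.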

gcc/
* recog.h (temporarily_undo_changes, redo_changes): Declare.
* recog.c (temporarily_undone_changes): New variable.
(validate_change_1, confirm_change_group): Check that it's zero.
(cancel_changes): Likewise.
(swap_change, temporarily_undo_changes): New functions.
(redo_changes): Likewise.

recog: Add a validate_change_xveclen function

A later patch wants to be able to use the validate_change machinery
to reduce the XVECLEN of a PARALLEL.  This should be more efficient
than allocating a separate PARALLEL at a possibly distant memory
location, especially since the new PARALLEL would be garbage rtl if
the new pattern turns out not to match.  Combine already pulls this
trick with SUBST_INT.

This patch adds a general helper for doing that.

gcc/
* recog.h (validate_change_xveclen): Declare.
* recog.c (change_t::old_len): New field.
(validate_change_1): Add a new_len parameter.  Conditionally
replace the XVECLEN of an rtx, avoiding single-element PARALLELs.
(validate_change_xveclen): New function.
(cancel_changes): Undo changes made by validate_change_xveclen.

simplify-rtx: Put simplify routines into a class

One of the recurring warts of RTL is that multiplication by a power
of 2 is represented as a MULT inside a MEM but as an ASHIFT outside
a MEM.  It would obviously be better if we didn't have this kind of
context sensitivity, but it would be difficult to remove.

Currently the simplify-rtx.c routines are hard-coded for the
ASHIFT form.  This means that some callers have to convert the
ASHIFTs “back” into MULTs after calling the simplify-rtx.c
routines; see fwprop.c:canonicalize_address for an example.

I think we can relieve some of the pain by wrapping the simplify-rtx.c
routines in a simple class that tracks whether the expression occurs
in a MEM or not, so that no post-processing is needed.

An obvious concern is whether passing the “this” pointer around
will slow things down or bloat the code.  I can't measure any
increase in compile time after applying the patch.  Sizewise,
simplify-rtx.o text increases by 2.3% in default-checking builds
and 4.1% in release-checking builds.

I realise the MULT/ASHIFT thing isn't the most palatable
reason for doing this, but I think it might be useful for
other things in future, such as using local nonzero_bits
hooks/virtual functions instead of the global hooks.

The obvious alternative would be to add a static variable
and hope that it is always updated correctly.

Later patches make use of this.
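
For illustration only, the idea of a context object that knows whether it is inside a MEM can be sketched in standalone C++. This is a toy, not the simplify_context class from the patch: simplify_ctx, scale_by_pow2 and mem are hypothetical, and strings stand in for rtxes. Only the mem_depth name is taken from the ChangeLog below.

```cpp
#include <cassert>
#include <string>

// Hypothetical miniature context: it records how many MEMs enclose the
// expression currently being simplified.
class simplify_ctx
{
public:
  // Number of enclosing MEMs; zero means "not inside an address".
  unsigned int mem_depth = 0;

  // Return the canonical form of REG multiplied by 2**SHIFT: a MULT
  // inside a MEM and an ASHIFT outside one.
  std::string scale_by_pow2 (const std::string &reg, unsigned int shift) const
  {
    assert (shift < 32);
    if (mem_depth > 0)
      return "(mult " + reg + " " + std::to_string (1u << shift) + ")";
    return "(ashift " + reg + " " + std::to_string (shift) + ")";
  }

  // Build a MEM whose address is REG scaled by 2**SHIFT, simplifying the
  // address with mem_depth raised for the duration.
  std::string mem (const std::string &reg, unsigned int shift)
  {
    ++mem_depth;
    std::string addr = scale_by_pow2 (reg, shift);
    --mem_depth;
    return "(mem " + addr + ")";
  }
};
```

Because the depth lives in the object rather than in a static variable, a caller cannot forget to reset it for an unrelated simplification.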

gcc/
* rtl.h (simplify_context): New class.
(simplify_unary_operation, simplify_binary_operation): Use it.
(simplify_ternary_operation, simplify_relational_operation): Likewise.
(simplify_subreg, simplify_gen_unary, simplify_gen_binary): Likewise.
(simplify_gen_ternary, simplify_gen_relational): Likewise.
(simplify_gen_subreg, lowpart_subreg): Likewise.
* simplify-rtx.c (simplify_gen_binary): Turn into a member function
of simplify_context.
(simplify_gen_unary, simplify_gen_ternary, simplify_gen_relational)
(simplify_truncation, simplify_unary_operation): Likewise.
(simplify_unary_operation_1, simplify_byte_swapping_operation)
(simplify_associative_operation, simplify_logical_relational_operation)
(simplify_binary_operation, simplify_binary_operation_series)
(simplify_distributive_operation, simplify_plus_minus): Likewise.
(simplify_relational_operation, simplify_relational_operation_1)
(simplify_cond_clz_ctz, simplify_merge_mask): Likewise.
(simplify_ternary_operation, simplify_subreg, simplify_gen_subreg)
(lowpart_subreg): Likewise.
(simplify_binary_operation_1): Likewise.  Test mem_depth when
deciding whether the ASHIFT or MULT form is canonical.
(simplify_merge_mask): Use simplify_context.

recog: Split out a register_asm_p function

verify_changes has a test for whether a particular hard register
is a user-defined register asm. A later patch needs to test the
same thing, so this patch splits it out into a helper.

gcc/
* rtl.h (register_asm_p): Declare.
* recog.c (verify_changes): Split out the test for whether
a hard register is a register asm to...
* rtlanal.c (register_asm_p): ...this new function.

Export print-rtl.c:print_insn_with_notes

Later patches want to use print_insn_with_notes (printing to
a pretty_printer). This patch exports it from print-rtl.c.

The non-notes version is already public.

gcc/
* print-rtl.h (print_insn_with_notes): Declare.
* print-rtl.c (print_insn_with_notes): Make non-static.

Split update_cfg_for_uncondjump out of combine

Later patches want to reuse combine's update_cfg_for_uncondjump,
so this patch makes it a public cfgrtl.c function.

gcc/
* cfgrtl.h (update_cfg_for_uncondjump): Declare.
* combine.c (update_cfg_for_uncondjump): Move to...
* cfgrtl.c: ...here.

Add a cut-down version of std::span (array_slice)

A later patch wants to be able to pass around subarray views of an
existing array. The standard class to do that is std::span, but it's
a C++20 thing. This patch just adds a cut-down version of it.

The intention is just to provide what's currently needed.
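
For illustration only, a cut-down subarray view of this kind might look like the standalone C++ sketch below. It is not the array_slice class added to vec.h; slice, subslice and sum_of are hypothetical names, and the interface is an assumption about what "what's currently needed" could cover.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical non-owning view over a contiguous run of elements.  The
// underlying array must outlive the view.
template<typename T>
class slice
{
public:
  slice () : m_base (nullptr), m_size (0) {}
  slice (T *base, std::size_t size) : m_base (base), m_size (size) {}

  // View a whole built-in array.
  template<std::size_t N>
  slice (T (&array)[N]) : m_base (array), m_size (N) {}

  T *begin () const { return m_base; }
  T *end () const { return m_base + m_size; }
  std::size_t size () const { return m_size; }
  bool empty () const { return m_size == 0; }

  T &operator[] (std::size_t i) const
  {
    assert (i < m_size);
    return m_base[i];
  }

  // Return the subview of LEN elements starting at index START.
  slice subslice (std::size_t start, std::size_t len) const
  {
    assert (start <= m_size && len <= m_size - start);
    return slice (m_base + start, len);
  }

private:
  T *m_base;
  std::size_t m_size;
};

// Example consumer: add up the elements of a read-only view.
inline int sum_of (slice<const int> values)
{
  int total = 0;
  for (int v : values)
    total += v;
  return total;
}
```

Passing the view by value is cheap, since it is only a pointer and a length.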

gcc/
* vec.h (array_slice): New class.

Add an alternative splay tree implementation

We already have two splay tree implementations: the old C one in
libiberty and a templated reimplementation of it in typed-splay-tree.h.
However, they have some drawbacks:

- They hard-code the assumption that nodes should have both a key and
  a value, which isn't always true.

- They use the two-phase method of lookup, and so nodes need to store
  a temporary back pointer.  We can avoid that overhead by using the
  top-down method (as e.g. the bitmap tree code already does).

- The tree node has to own the key and the value.  For some use cases
  it's more convenient to embed the tree links in the value instead.

Also, a later patch wants to use splay trees to represent an
adaptive total order: the splay tree itself records whether node N1
is less than node N2, and (in the worst case) comparing nodes is
a splay operation.

This patch therefore adds an alternative implementation.  The main
features are:

- Nodes can optionally point back to their parents.

- An Accessors class abstracts accessing child nodes and (where
  applicable) parent nodes, so that the information can be embedded
  in larger data structures.

- There is no fixed comparison function at the class level.  Instead,
  individual functions that do comparisons take a comparison function
  argument.

- There are two styles of comparison function, optimised for different
  use cases.  (See the comments in the patch for details.)

- It's possible to do some operations directly on a given node,
  without knowing whether it's the root.  This includes the comparison
  use case described above.

This of course has its own set of drawbacks.  It's really providing
splay utility functions rather than a true ADT, and so is more low-level
than the existing routines.  It's mostly geared for cases in which the
client code wants to participate in the splay operations to some extent.

gcc/
* Makefile.in (OBJS): Add splay-tree-utils.o.
* system.h: Include <array> when INCLUDE_ARRAY is defined.
* selftest.h (splay_tree_cc_tests): Declare.
* selftest-run-tests.c (selftest::run_tests): Run splay_tree_cc_tests.
* splay-tree-utils.h: New file.
* splay-tree-utils.tcc: Likewise.
* splay-tree-utils.cc: Likewise.

Add a class that multiplexes two pointer types

This patch adds a pointer_mux<T1, T2> class that provides similar
functionality to:

    union { T1 *a; T2 *b; };
    ...
    bool is_b_rather_than_a;

except that the is_b_rather_than_a tag is stored in the low bit
of the pointer.  See the comments in the patch for a comparison
between the two approaches and why this one can be more efficient.

I've tried to microoptimise the class a fair bit, since a later
patch uses it extensively in order to keep the sizes of data
structures down.
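
For illustration only, the low-bit tagging idea can be sketched in standalone C++. This is not the pointer_mux class from mux-utils.h: tagged_ptr_pair and its member names are hypothetical, and the sketch assumes that std::uintptr_t round-trips pointers and that both pointee types are at least 2-byte aligned.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: holds either a T1 * or a T2 *, using bit 0 of the
// stored address as the discriminator.
template<typename T1, typename T2>
class tagged_ptr_pair
{
  static_assert (alignof (T1) >= 2 && alignof (T2) >= 2,
                 "bit 0 of a valid pointer must always be clear");

public:
  // Wrap a pointer to the first type; the tag bit stays clear.
  static tagged_ptr_pair first (T1 *p)
  {
    return tagged_ptr_pair (reinterpret_cast<std::uintptr_t> (p));
  }

  // Wrap a pointer to the second type; the tag bit is set.
  static tagged_ptr_pair second (T2 *p)
  {
    return tagged_ptr_pair (reinterpret_cast<std::uintptr_t> (p) | 1);
  }

  bool is_first () const { return (m_bits & 1) == 0; }
  bool is_second () const { return (m_bits & 1) != 0; }

  // Return the pointer, given that the caller knows which type is stored.
  T1 *known_first () const
  {
    assert (is_first ());
    return reinterpret_cast<T1 *> (m_bits);
  }

  T2 *known_second () const
  {
    assert (is_second ());
    return reinterpret_cast<T2 *> (m_bits & ~std::uintptr_t (1));
  }

private:
  explicit tagged_ptr_pair (std::uintptr_t bits) : m_bits (bits) {}

  std::uintptr_t m_bits;
};
```

Compared with the union plus bool shown above, the whole object fits in one pointer-sized word on common targets, at the cost of a mask operation when reading the second type.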

gcc/
* mux-utils.h: New file.

Add an RAII class for managing obstacks

This patch adds an RAII class for managing the lifetimes of objects
on an obstack. See the comments in the patch for more details and
example usage.

gcc/
* obstack-utils.h: New file.

Add more iterator utilities

This patch adds some more iterator helper classes.  They really fall
into two groups, but there didn't seem much value in separating them:

- A later patch has a class hierarchy of the form:

     Base
      +- Derived1
      +- Derived2

  A class wants to store an array A1 of Derived1 pointers and an
  array A2 of Derived2 pointers.  However, for compactness reasons,
  it was convenient to have a single array of Base pointers,
  with A1 and A2 being slices of this array.  This reduces the
  overhead from two pointers and two ints (3 LP64 words) to one
  pointer and two ints (2 LP64 words).

  But consumers of the class shouldn't be aware of this: they should
  see A1 as containing Derived1 pointers rather than Base pointers
  and A2 as containing Derived2 pointers rather than Base pointers.
  This patch adds derived_iterator and const_derived_container
  classes to support this use case.

- A later patch also adds various linked lists.  This patch adds
  wrapper_iterator and list_iterator classes to make it easier
  to create iterators for these linked lists.  For example:

    // Iterators for lists of definitions.
    using def_iterator = list_iterator<def_info, &def_info::next_def>;
    using reverse_def_iterator
      = list_iterator<def_info, &def_info::prev_def>;

  This in turn makes it possible to use range-based for loops
  on the lists.

The patch just adds the things that the later patches need; it doesn't
try to make the classes as functionally complete as possible.  I think
we should add extra functionality when needed rather than ahead of time.
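
For illustration only, an iterator parameterised on a "next" member, as in the def_iterator example above, can be sketched in standalone C++. This is not the list_iterator class from iterator-utils.h: linked_iterator, iter_range, node and digits_from are hypothetical, and taking a pointer to a const member function as the template argument is an assumption of the sketch.

```cpp
#include <cassert>

// Hypothetical iterator over an intrusive linked list.  NEXT is the member
// function that returns the following node, or null at the end of the list.
template<typename T, T *(T::*Next) () const>
class linked_iterator
{
public:
  explicit linked_iterator (T *node = nullptr) : m_node (node) {}

  T *operator* () const { return m_node; }
  linked_iterator &operator++ ()
  {
    m_node = (m_node->*Next) ();
    return *this;
  }
  bool operator== (const linked_iterator &other) const
  {
    return m_node == other.m_node;
  }
  bool operator!= (const linked_iterator &other) const
  {
    return m_node != other.m_node;
  }

private:
  T *m_node;
};

// Hypothetical range wrapper so that a pair of iterators can be used in a
// range-based for loop.
template<typename Iter>
struct iter_range
{
  Iter first, last;
  Iter begin () const { return first; }
  Iter end () const { return last; }
};

// Example node type with links in both directions.
struct node
{
  int value;
  node *prev;
  node *next;
  node *next_node () const { return next; }
  node *prev_node () const { return prev; }
};

// The same template gives forward and reverse iterators.
using forward_iter = linked_iterator<node, &node::next_node>;
using reverse_iter = linked_iterator<node, &node::prev_node>;

// Walk the list from START with iterator type ITER, treating each node's
// value as the next decimal digit of the result.
template<typename Iter>
int digits_from (node *start)
{
  int result = 0;
  for (node *n : iter_range<Iter> {Iter (start), Iter ()})
    result = result * 10 + n->value;
  return result;
}
```

A default-constructed iterator holds a null node, so it doubles as the end marker for either direction.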

gcc/
* iterator-utils.h (derived_iterator): New class.
(const_derived_container, wrapper_iterator): Likewise.
(list_iterator): Likewise.

reginfo: Add a global_reg_set

A later patch wants to use the set of global registers as a HARD_REG_SET
rather than a bool/char array. Most other arrays already have a
HARD_REG_SET counterpart, but this one didn't.

gcc/
* hard-reg-set.h (global_reg_set): Declare.
* reginfo.c (global_reg_set): New variable.
(init_reg_sets_1, globalize_reg): Update it when globalizing
registers.

libstdc++: Add C++ runtime support for new 128-bit long double format

This adds support for the new __ieee128 long double format on
powerpc64le targets.

Most of the complexity comes from wanting a single libstdc++.so library
that contains the symbols needed by code compiled with both
-mabi=ibmlongdouble and -mabi=ieeelongdouble (and not forgetting
-mlong-double-64 as well!)

In a few places this just requires an extra overload, for example
std::from_chars has to be overloaded for both forms of long double.
That can be done in a single translation unit that defines overloads
for 'long double' and also '__ieee128', so that user code including
<charconv> will be able to link to a definition for either type of long
double. Those are the easy cases.

The difficult parts are (as for the std::string ABI transition) the I/O
and locale facets. In order to be able to write either form of long
double to an ostream such as std::cout we need the locale to contain a
std::num_put facet that can handle both forms. The same approach is
taken as was already done for supporting 64-bit long double and 128-bit
long double: adding extra overloads of do_put to the facet class. On
targets where the new long double code is enabled, the facets that are
registered in the locale at program startup have additional overloads so
that they can work with any long double type. Where this fails to work
is if user code installs its own facet, which will probably not have the
additional overloads and so will only be able to output one or the other
type. In practice the number of users expecting to be able to use their
own locale facets in code using a mix of -mabi=ibmlongdouble and
-mabi=ieeelongdouble is probably close to zero.
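
The limitation for user-installed facets can be pictured with the
following minimal sketch.  It uses only standard C++ and none of the
new libstdc++ internals; tagging_num_put and the two format helpers are
hypothetical names for this example.  The facet overrides a single
do_put overload, so only the type handled by that overload sees the
custom behaviour.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical user facet that overrides only the long double overload
// of do_put, prefixing the formatted value with '~'.
struct tagging_num_put : std::num_put<char>
{
protected:
  iter_type
  do_put (iter_type out, std::ios_base &io, char fill,
	  long double v) const override
  {
    *out++ = '~';
    return std::num_put<char>::do_put (out, io, fill, v);
  }
};

// Format V through a stream whose locale has the user facet installed.
inline std::string
format_long_double (long double v)
{
  std::ostringstream os;
  os.imbue (std::locale (os.getloc (), new tagging_num_put));
  os << v;
  return os.str ();
}

// Same stream setup, but the value goes through the double overload,
// which the user facet does not override.
inline std::string
format_double (double v)
{
  std::ostringstream os;
  os.imbue (std::locale (os.getloc (), new tagging_num_put));
  os << v;
  return os.str ();
}
```

A facet written like this knows nothing about any additional long
double overloads, which is the situation the paragraph above describes.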

libstdc++-v3/ChangeLog:

* Makefile.in: Regenerate.
* config.h.in: Regenerate.
* config/abi/pre/gnu.ver: Make patterns less greedy.
* config/os/gnu-linux/ldbl-ieee128-extra.ver: New file with patterns
for IEEE128 long double symbols.
* configure: Regenerate.
* configure.ac: Enable alternative 128-bit long double format on
powerpc64*-*-linux*.
* doc/Makefile.in: Regenerate.
* fragment.am: Regenerate.
* include/Makefile.am: Set _GLIBCXX_LONG_DOUBLE_ALT128_COMPAT.
* include/Makefile.in: Regenerate.
* include/bits/c++config: Define inline namespace for new long
double symbols. Don't define _GLIBCXX_USE_FLOAT128 when it's the
same type as long double.
* include/bits/locale_classes.h [_GLIBCXX_LONG_DOUBLE_ALT128_COMPAT]
(locale::_Impl::_M_init_extra_ldbl128): Declare new member function.
* include/bits/locale_facets.h (_GLIBCXX_NUM_FACETS): Simplify by
only counting narrow character facets.
(_GLIBCXX_NUM_CXX11_FACETS): Likewise.
(_GLIBCXX_NUM_LBDL_ALT128_FACETS): New.
[_GLIBCXX_LONG_DOUBLE_ALT128_COMPAT] (num_get::__do_get): Define
vtable placeholder for __ibm128 long double type.
[_GLIBCXX_LONG_DOUBLE_ALT128_COMPAT && __LONG_DOUBLE_IEEE128__]
(num_get::__do_get): Declare vtable placeholder for __ibm128 long
double type.
[_GLIBCXX_LONG_DOUBLE_ALT128_COMPAT && __LONG_DOUBLE_IEEE128__]
(num_put::__do_put): Likewise.
* include/bits/locale_facets.tcc
[_GLIBCXX_LONG_DOUBLE_ALT128_COMPAT && __LONG_DOUBLE_IEEE128__]
(num_get::__do_get, num_put::__do_put): Define.
* include/bits/locale_facets_nonio.h
[_GLIBCXX_LONG_DOUBLE_ALT128_COMPAT && __LONG_DOUBLE_IEEE128__]
(money_get::__do_get): Declare vtable placeholder for __ibm128 long
double type.
[_GLIBCXX_LONG_DOUBLE_ALT128_COMPAT && __LONG_DOUBLE_IEEE128__]
(money_put::__do_put): Likewise.
* include/bits/locale_facets_nonio.tcc
[_GLIBCXX_LONG_DOUBLE_ALT128_COMPAT && __LONG_DOUBLE_IEEE128__]
(money_get::__do_get, money_put::__do_put): Define.
* include/ext/numeric_traits.h [_GLIBCXX_LONG_DOUBLE_ALT128_COMPAT]
(__numeric_traits<__ibm128>, __numeric_traits<__ieee128>): Define.
* libsupc++/Makefile.in: Regenerate.
* po/Makefile.in: Regenerate.
* python/Makefile.in: Regenerate.
* src/Makefile.am: Add compatibility-ldbl-alt128.cc and
compatibility-ldbl-alt128-cxx11.cc sources and recipes for objects.
* src/Makefile.in: Regenerate.
* src/c++11/Makefile.in: Regenerate.
* src/c++11/compatibility-ldbl-alt128-cxx11.cc: New file defining
symbols using the old 128-bit long double format, for the cxx11 ABI.
* src/c++11/compatibility-ldbl-alt128.cc: Likewise, for the
gcc4-compatible ABI.
* src/c++11/compatibility-ldbl-facets-aliases.h: New header for long
double compat aliases.
* src/c++11/cow-locale_init.cc: Add comment.
* src/c++11/cxx11-locale-inst.cc: Define C and C_is_char
unconditionally.
* src/c++11/cxx11-wlocale-inst.cc: Add sanity check. Include
locale-inst.cc directly, not via cxx11-locale-inst.cc.
* src/c++11/locale-inst-monetary.h: New header for monetary
category instantiations.
* src/c++11/locale-inst-numeric.h: New header for numeric category
instantiations.
* src/c++11/locale-inst.cc: Include new headers for monetary,
numeric, and long double definitions.
* src/c++11/wlocale-inst.cc: Remove long double compat aliases that
are defined in new header now.
* src/c++17/Makefile.am: Use -mabi=ibmlongdouble for
floating_from_chars.cc.
* src/c++17/Makefile.in: Regenerate.
* src/c++17/floating_from_chars.cc (from_chars_impl): Add
if-constexpr branch for __ieee128.
(from_chars): Overload for __ieee128.
* src/c++20/Makefile.in: Regenerate.
* src/c++98/Makefile.in: Regenerate.
* src/c++98/locale_init.cc (num_facets): Adjust calculation.
(locale::_Impl::_Impl(size_t)): Call _M_init_extra_ldbl128.
* src/c++98/localename.cc (num_facets): Adjust calculation.
(locale::_Impl::_Impl(const char*, size_t)): Call
_M_init_extra_ldbl128.
* src/filesystem/Makefile.in: Regenerate.
* testsuite/Makefile.in: Regenerate.
* testsuite/util/testsuite_abi.cc: Add new symbol versions.
Allow new symbols to be added to GLIBCXX_IEEE128_3.4.29 and
CXXABI_IEEE128_1.3.13 too.
* testsuite/26_numerics/complex/abi_tag.cc: Add u9__ieee128 to
regex matching expected symbols.

maintainer-scripts: Use /sourceware/snapshot-tmp/gcc as temp directory if possible

> https://gcc.gnu.org/pipermail/gccadmin/2020q4/017037.html
>
> OSError: [Errno 28] No space left on device:
> '/tmp/tmp.Zq3p6D4MxS/gcc/.git/objects/objn31xpefh' ->
> '/tmp/tmp.Zq3p6D4MxS/gcc/.git/objects/db/ffb02a4bcdd4ec04af3db75d86b8cc2e52bdff'
>
> Maybe change the script to use /sourceware/snapshot-tmp/gcc (which has
> rather more space) instead of /tmp?

This patch implements that.

2020-12-17 Jakub Jelinek <jakub@redhat.com>

* update_version_git: Put BASEDIR into /sourceware/snapshot-tmp/gcc
if it exists.

rs6000: Add support for powerpc64le-unknown-freebsd

This implements support for the powerpc64le architecture on FreeBSD.
Since FreeBSD does not have a powerpcle (32-bit) port, I did not add
support for powerpcle here. This can be revisited if powerpcle support
is added in the future.

2020-12-15 Piotr Kubaj <pkubaj@FreeBSD.org>

gcc/
* config.gcc (powerpc*le-*-freebsd*): Add.
* configure.ac (powerpc*le-*-freebsd*): Ditto.
* configure: Regenerate.
* config/rs6000/freebsd64.h (ASM_SPEC_COMMON): Use ENDIAN_SELECT.
(DEFAULT_ASM_ENDIAN): Add little endian support.
(LINK_OS_FREEBSD_SPEC64): Ditto.

test: add new Go tests from source repo

C: Drop qualifiers of assignment expressions. [PR98047]

ISO C17 6.5.16 (paragraph 3) specifies that the type of an assignment
expression is the type the LHS would have after lvalue conversion.

2020-12-16 Martin Uecker <muecker@gwdg.de>

gcc/c/
PR c/98047
* c-typeck.c (build_modify_expr): Drop qualifiers.

gcc/testsuite/
PR c/98047
* gcc.dg/qual-assign-7.c: New test.

C: Avoid incorrect warning for volatile in compound expressions [PR98260]

2020-12-16 Martin Uecker <muecker@gwdg.de>

gcc/c/
PR c/98260
* c-parser.c (c_parser_expression): Look into
nop expression when marking expressions as read.

gcc/testsuite/
PR c/98260
* gcc.dg/unused-9.c: New test.

gcc: xtensa: rearrange DI mode constant loading

2020-12-16 Takayuki 'January June' Suwa <jjsuwa_sys3175@yahoo.co.jp>

gcc/
* config/xtensa/xtensa.c (xtensa_emit_move_sequence): Try to
replace 'l32r' with 'movi' + 'slli' when optimizing for size.
* config/xtensa/xtensa.md (movdi): Split loading DI mode constant
into register pair into two loads of SI mode constants.

Arm: MVE: Split refactoring of remaining complex intrinsics

This refactors the complex-number parts of MVE to go through the same unspecs
as the NEON variant.

This is pre-work to allow code to be shared between NEON and MVE for the complex
vectorization patches.

gcc/ChangeLog:

* config/arm/arm_mve.h (__arm_vcmulq_rot90_f16,
__arm_vcmulq_rot270_f16, __arm_vcmulq_rot180_f16, __arm_vcmulq_f16,
__arm_vcmulq_rot90_f32, __arm_vcmulq_rot270_f32,
__arm_vcmulq_rot180_f32, __arm_vcmulq_f32, __arm_vcmlaq_f16,
__arm_vcmlaq_rot180_f16, __arm_vcmlaq_rot270_f16,
__arm_vcmlaq_rot90_f16, __arm_vcmlaq_f32, __arm_vcmlaq_rot180_f32,
__arm_vcmlaq_rot270_f32, __arm_vcmlaq_rot90_f32): Update builtin calls.
* config/arm/arm_mve_builtins.def (vcmulq_f, vcmulq_rot90_f,
vcmulq_rot180_f, vcmulq_rot270_f, vcmlaq_f, vcmlaq_rot90_f,
vcmlaq_rot180_f, vcmlaq_rot270_f): Removed.
(vcmulq, vcmulq_rot90, vcmulq_rot180, vcmulq_rot270, vcmlaq,
vcmlaq_rot90, vcmlaq_rot180, vcmlaq_rot270): New.
* config/arm/iterators.md (mve_rot): Add UNSPEC_VCMLA, UNSPEC_VCMLA90,
UNSPEC_VCMLA180, UNSPEC_VCMLA270, UNSPEC_VCMUL, UNSPEC_VCMUL90,
UNSPEC_VCMUL180, UNSPEC_VCMUL270.
(VCMUL): New.
* config/arm/mve.md (mve_vcmulq_f<mode>, mve_vcmulq_rot180_f<mode>,
mve_vcmulq_rot270_f<mode>, mve_vcmulq_rot90_f<mode>, mve_vcmlaq_f<mode>,
mve_vcmlaq_rot180_f<mode>, mve_vcmlaq_rot270_f<mode>,
mve_vcmlaq_rot90_f<mode>): Removed.
(mve_vcmlaq<mve_rot><mode>, mve_vcmulq<mve_rot><mode>,
mve_vcaddq<mve_rot><mode>, cadd<rot><mode>3, mve_vcaddq<mve_rot><mode>):
New.
* config/arm/unspecs.md (UNSPEC_VCMUL90, UNSPEC_VCMUL270, UNSPEC_VCMUL,
UNSPEC_VCMUL180): New.
(VCMULQ_F, VCMULQ_ROT180_F, VCMULQ_ROT270_F, VCMULQ_ROT90_F,
VCMLAQ_F, VCMLAQ_ROT180_F, VCMLAQ_ROT90_F, VCMLAQ_ROT270_F): Removed.

Arm: Add NEON and MVE RTL patterns for Complex Addition.

This adds an implementation of the optabs for complex addition.  With this
the following C code:

  void f90 (float complex a[restrict N], float complex b[restrict N],
    float complex c[restrict N])
  {
    for (int i=0; i < N; i++)
      c[i] = a[i] + (b[i] * I);
  }

generates

  f90:
  add     r3, r2, #1600
  .L2:
  vld1.32 {q8}, [r0]!
  vld1.32 {q9}, [r1]!
  vcadd.f32       q8, q8, q9, #90
  vst1.32 {q8}, [r2]!
  cmp     r3, r2
  bne     .L2
  bx      lr

instead of

  f90:
  add     r3, r2, #1600
  .L2:
  vld2.32 {d24-d27}, [r0]!
  vld2.32 {d20-d23}, [r1]!
  vsub.f32 q8, q12, q11
  vadd.f32 q9, q13, q10
  vst2.32 {d16-d19}, [r2]!
  cmp     r3, r2
  bne     .L2
  bx      lr
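
The arithmetic behind the two sequences can be checked with a small
sketch.  It is an illustration in C++ with hypothetical function names,
not code from the patch.  Adding b * i to a is the same as rotating b
by 90 degrees before the addition, which is what the single vcadd with
#90 computes per element; the older sequence de-interleaves real and
imaginary parts and uses a separate vsub and vadd.

```cpp
#include <cassert>
#include <complex>

// Reference form: c = a + b * i, as in the body of the f90 loop.
inline std::complex<float>
add_rot90 (std::complex<float> a, std::complex<float> b)
{
  return a + b * std::complex<float> (0.0f, 1.0f);
}

// Split form, matching the vsub/vadd pair of the older sequence:
//   real (c) = real (a) - imag (b)
//   imag (c) = imag (a) + real (b)
inline std::complex<float>
add_rot90_split (std::complex<float> a, std::complex<float> b)
{
  return std::complex<float> (a.real () - b.imag (),
			      a.imag () + b.real ());
}
```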

gcc/ChangeLog:

* config/arm/arm_mve.h (__arm_vcaddq_rot90_u8, __arm_vcaddq_rot270_u8,
__arm_vcaddq_rot90_s8, __arm_vcaddq_rot270_s8,
__arm_vcaddq_rot90_u16, __arm_vcaddq_rot270_u16,
__arm_vcaddq_rot90_s16, __arm_vcaddq_rot270_s16,
__arm_vcaddq_rot90_u32, __arm_vcaddq_rot270_u32,
__arm_vcaddq_rot90_s32, __arm_vcaddq_rot270_s32,
__arm_vcaddq_rot90_f16, __arm_vcaddq_rot270_f16,
__arm_vcaddq_rot90_f32, __arm_vcaddq_rot270_f32):  Update builtin calls.
* config/arm/arm_mve_builtins.def (vcaddq_rot90_u, vcaddq_rot270_u,
vcaddq_rot90_s, vcaddq_rot270_s, vcaddq_rot90_f, vcaddq_rot270_f):
Removed.
(vcaddq_rot90, vcaddq_rot270): New.
* config/arm/constraints.md (Dz): Include MVE.
* config/arm/iterators.md (mve_rot): New.
(supf): Remove VCADDQ_ROT270_S, VCADDQ_ROT270_U, VCADDQ_ROT90_S,
VCADDQ_ROT90_U.
(VCADDQ_ROT270, VCADDQ_ROT90): Removed.
* config/arm/mve.md (mve_vcaddq_rot270_<supf><mode>,
mve_vcaddq_rot90_<supf><mode>, mve_vcaddq_rot270_f<mode>,
mve_vcaddq_rot90_f<mode>): Removed.
(mve_vcaddq<mve_rot><mode>, mve_vcaddq<mve_rot><mode>): New.
* config/arm/unspecs.md (VCADDQ_ROT270_S, VCADDQ_ROT90_S,
VCADDQ_ROT270_U, VCADDQ_ROT90_U, VCADDQ_ROT270_F,
VCADDQ_ROT90_F): Removed.
* config/arm/vec-common.md (cadd<rot><mode>3): New.

AArch64: Add NEON, SVE and SVE2 RTL patterns for Complex Addition.

This adds an implementation of the optabs for complex addition.  With this
the following C code:

  void f90 (float complex a[restrict N], float complex b[restrict N],
    float complex c[restrict N])
  {
    for (int i=0; i < N; i++)
      c[i] = a[i] + (b[i] * I);
  }

generates

  f90:
  mov     x3, 0
  .p2align 3,,7
  .L2:
  ldr     q0, [x0, x3]
  ldr     q1, [x1, x3]
  fcadd   v0.4s, v0.4s, v1.4s, #90
  str     q0, [x2, x3]
  add     x3, x3, 16
  cmp     x3, 1600
  bne     .L2
  ret

instead of

  f90:
  add     x3, x1, 1600
  .p2align 3,,7
  .L2:
  ld2     {v4.4s - v5.4s}, [x0], 32
  ld2     {v2.4s - v3.4s}, [x1], 32
  fsub    v0.4s, v4.4s, v3.4s
  fadd    v1.4s, v5.4s, v2.4s
  st2     {v0.4s - v1.4s}, [x2], 32
  cmp     x3, x1
  bne     .L2
  ret

gcc/ChangeLog:

* config/aarch64/aarch64-simd.md (cadd<rot><mode>3): New.
* config/aarch64/iterators.md (SVE2_INT_CADD_OP): New.
* config/aarch64/aarch64-sve.md (cadd<rot><mode>3): New.
* config/aarch64/aarch64-sve2.md (cadd<rot><mode>3): New.

testsuite: Adjust expected instruction count for PPC fold testcases.

Commit r11-5958 changed the code generation for the vector logical fold
tests. This patch updates the expected instruction counts for different
instructions.

gcc/testsuite/ChangeLog:

2020-12-16 David Edelsohn <dje.gcc@gmail.com>

PR target/98280
* gcc.target/powerpc/fold-vec-logical-ors-char.c: Adjust count.
* gcc.target/powerpc/fold-vec-logical-ors-int.c: Adjust count.
* gcc.target/powerpc/fold-vec-logical-ors-longlong.c: Adjust count.
* gcc.target/powerpc/fold-vec-logical-ors-short.c: Adjust count.
* gcc.target/powerpc/fold-vec-logical-other-char.c: Adjust count.
* gcc.target/powerpc/fold-vec-logical-other-int.c: Adjust count.
* gcc.target/powerpc/fold-vec-logical-other-longlong.c: Adjust count.
* gcc.target/powerpc/fold-vec-logical-other-short.c: Adjust count.