tree-optimization/97769 - fix assert in peeling for alignment

The following removes an assert that cannot easily be adjusted to
cover the additional cases we now handle after the removal of
the same-align DRs vector.

2020-11-10 Richard Biener <rguenther@suse.de>

PR tree-optimization/97769
* tree-vect-data-refs.c (vect_update_misalignment_for_peel):
Remove assert.

* gcc.dg/vect/pr97769.c: New testcase.

tree-optimization/97780 - fix ICE in fini_pre

This deals with basic blocks added by elimination.

2020-11-10 Richard Biener <rguenther@suse.de>

PR tree-optimization/97780
* tree-ssa-pre.c (fini_pre): Deal with added basic blocks
when freeing PHI_TRANS_TABLE.

AArch64: Add FLAG for tbl/tbx intrinsics [PR94442]

2020-11-10 Zhiheng Xie <xiezhiheng@huawei.com>
Nannan Zheng <zhengnannan@huawei.com>

gcc/ChangeLog:

* config/aarch64/aarch64-simd-builtins.def: Add proper FLAG
for tbl/tbx intrinsics.

openmp: Implement OpenMP 5.0 base-pointer attachment and clause ordering

This patch implements some parts of the target variable mapping changes
specified in OpenMP 5.0, including base-pointer attachment/detachment
behavior for array section list-items in map clauses, and ordering of
map clauses according to map kind.
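
As a rough illustration of the kind of map clause this affects, the sketch below maps a struct together with an array section based on one of its pointer members. The struct `buffer` and the function `sum_through_member` are hypothetical names made up for this example and do not come from the patch or its testcases; whether a given compiler accepts the clause depends on its OpenMP 5.0 support.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical struct for illustration; it does not come from the patch.
struct buffer
{
  int *data;
};

// Sums n ints reached through a struct member pointer inside a target region.
// When OpenMP support is not enabled, the pragma is ignored and the loop
// simply runs on the host.
inline long sum_through_member (int *values, std::size_t n)
{
  buffer b = { values };
  long total = 0;
  // Map the struct itself and an array section based on its pointer member.
  #pragma omp target map(to: b, b.data[0:n]) map(tofrom: total)
  for (std::size_t i = 0; i < n; ++i)
    total += b.data[i];
  return total;
}
```

With attach semantics, the device copy of `b.data` is expected to point at the device copy of the mapped section, so the loop body can dereference it on the device.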

2020-11-10 Chung-Lin Tang <cltang@codesourcery.com>

gcc/c-family/ChangeLog:

* c-common.h (c_omp_adjust_map_clauses): New declaration.
* c-omp.c (struct map_clause): Helper type for c_omp_adjust_map_clauses.
(c_omp_adjust_map_clauses): New function.

gcc/c/ChangeLog:

* c-parser.c (c_parser_omp_target_data): Add use of
new c_omp_adjust_map_clauses function. Add GOMP_MAP_ATTACH_DETACH as
handled map clause kind.
(c_parser_omp_target_enter_data): Likewise.
(c_parser_omp_target_exit_data): Likewise.
(c_parser_omp_target): Likewise.
* c-typeck.c (handle_omp_array_sections): Adjust COMPONENT_REF case to
use GOMP_MAP_ATTACH_DETACH map kind for C_ORT_OMP region type.
(c_finish_omp_clauses): Adjust bitmap checks to allow struct decl and
same struct field access to co-exist on OpenMP construct.

gcc/cp/ChangeLog:

* parser.c (cp_parser_omp_target_data): Add use of
new c_omp_adjust_map_clauses function. Add GOMP_MAP_ATTACH_DETACH as
handled map clause kind.
(cp_parser_omp_target_enter_data): Likewise.
(cp_parser_omp_target_exit_data): Likewise.
(cp_parser_omp_target): Likewise.
* semantics.c (handle_omp_array_sections): Adjust COMPONENT_REF case to
use GOMP_MAP_ATTACH_DETACH map kind for C_ORT_OMP region type. Fix
interaction between reference case and attach/detach.
(finish_omp_clauses): Adjust bitmap checks to allow struct decl and
same struct field access to co-exist on OpenMP construct.

gcc/ChangeLog:

* gimplify.c (is_or_contains_p): New static helper function.
(omp_target_reorder_clauses): New function.
(gimplify_scan_omp_clauses): Add use of omp_target_reorder_clauses to
reorder clause list according to OpenMP 5.0 rules. Add handling of
GOMP_MAP_ATTACH_DETACH for OpenMP cases.
* omp-low.c (is_omp_target): New static helper function.
(scan_sharing_clauses): Add scan phase handling of GOMP_MAP_ATTACH/DETACH
for OpenMP cases.
(lower_omp_target): Add lowering handling of GOMP_MAP_ATTACH/DETACH for
OpenMP cases.

gcc/testsuite/ChangeLog:

* c-c++-common/gomp/clauses-2.c: Remove dg-error cases now valid.
* gfortran.dg/gomp/map-2.f90: Likewise.
* c-c++-common/gomp/map-5.c: New testcase.

libgomp/ChangeLog:

* libgomp.h (enum gomp_map_vars_kind): Adjust enum values to be bit-flag
usable.
* oacc-mem.c (acc_map_data): Adjust gomp_map_vars argument flags to
'GOMP_MAP_VARS_OPENACC | GOMP_MAP_VARS_ENTER_DATA'.
(goacc_enter_datum): Likewise for call to gomp_map_vars_async.
(goacc_enter_data_internal): Likewise.
* target.c (gomp_map_vars_internal):
Change checks of GOMP_MAP_VARS_ENTER_DATA to use bit-and (&). Adjust use
of gomp_attach_pointer for OpenMP cases.
(gomp_exit_data): Add handling of GOMP_MAP_DETACH.
(GOMP_target_enter_exit_data): Add handling of GOMP_MAP_ATTACH.
* testsuite/libgomp.c-c++-common/ptr-attach-1.c: New testcase.
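
The switch from equality checks to bit-and checks can be sketched as follows. The enumerator names and values here are hypothetical, chosen only to show the bit-flag layout; they are not copied from libgomp.h.

```cpp
#include <cassert>

// Hypothetical flag names and values, used only to illustrate the idea.
enum map_vars_kind : unsigned
{
  MAP_VARS_OPENACC    = 1u << 0,
  MAP_VARS_TARGET     = 1u << 1,
  MAP_VARS_DATA       = 1u << 2,
  MAP_VARS_ENTER_DATA = 1u << 3
};

// Once kinds can be OR-ed together, an equality test against a single
// enumerator no longer works; test the bit instead.
inline bool is_enter_data (unsigned kind)
{
  return (kind & MAP_VARS_ENTER_DATA) != 0;
}
```

A combined kind such as `MAP_VARS_OPENACC | MAP_VARS_ENTER_DATA` still satisfies `is_enter_data`, while comparing it with `==` against `MAP_VARS_ENTER_DATA` would fail.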

IBM Z: Test long doubles in vector registers

gcc/testsuite/ChangeLog:

2020-11-05 Ilya Leoshkevich <iii@linux.ibm.com>

* gcc.target/s390/vector/long-double-callee-abi-scan.c: New test.
* gcc.target/s390/vector/long-double-caller-abi-run.c: New test.
* gcc.target/s390/vector/long-double-caller-abi-scan.c: New test.
* gcc.target/s390/vector/long-double-copysign.c: New test.
* gcc.target/s390/vector/long-double-fprx2-constant.c: New test.
* gcc.target/s390/vector/long-double-from-double.c: New test.
* gcc.target/s390/vector/long-double-from-float.c: New test.
* gcc.target/s390/vector/long-double-from-i16.c: New test.
* gcc.target/s390/vector/long-double-from-i32.c: New test.
* gcc.target/s390/vector/long-double-from-i64.c: New test.
* gcc.target/s390/vector/long-double-from-i8.c: New test.
* gcc.target/s390/vector/long-double-from-u16.c: New test.
* gcc.target/s390/vector/long-double-from-u32.c: New test.
* gcc.target/s390/vector/long-double-from-u64.c: New test.
* gcc.target/s390/vector/long-double-from-u8.c: New test.
* gcc.target/s390/vector/long-double-to-double.c: New test.
* gcc.target/s390/vector/long-double-to-float.c: New test.
* gcc.target/s390/vector/long-double-to-i16.c: New test.
* gcc.target/s390/vector/long-double-to-i32.c: New test.
* gcc.target/s390/vector/long-double-to-i64.c: New test.
* gcc.target/s390/vector/long-double-to-i8.c: New test.
* gcc.target/s390/vector/long-double-to-u16.c: New test.
* gcc.target/s390/vector/long-double-to-u32.c: New test.
* gcc.target/s390/vector/long-double-to-u64.c: New test.
* gcc.target/s390/vector/long-double-to-u8.c: New test.
* gcc.target/s390/vector/long-double-vec-duplicate.c: New test.
* gcc.target/s390/vector/long-double-wf.h: New test.
* gcc.target/s390/vector/long-double-wfaxb.c: New test.
* gcc.target/s390/vector/long-double-wfcxb-0001.c: New test.
* gcc.target/s390/vector/long-double-wfcxb-0111.c: New test.
* gcc.target/s390/vector/long-double-wfcxb-1011.c: New test.
* gcc.target/s390/vector/long-double-wfcxb-1101.c: New test.
* gcc.target/s390/vector/long-double-wfdxb.c: New test.
* gcc.target/s390/vector/long-double-wfixb.c: New test.
* gcc.target/s390/vector/long-double-wfkxb-0111.c: New test.
* gcc.target/s390/vector/long-double-wfkxb-1011.c: New test.
* gcc.target/s390/vector/long-double-wfkxb-1101.c: New test.
* gcc.target/s390/vector/long-double-wflcxb.c: New test.
* gcc.target/s390/vector/long-double-wflpxb.c: New test.
* gcc.target/s390/vector/long-double-wfmaxb-2.c: New test.
* gcc.target/s390/vector/long-double-wfmaxb-3.c: New test.
* gcc.target/s390/vector/long-double-wfmaxb-disabled.c: New test.
* gcc.target/s390/vector/long-double-wfmaxb.c: New test.
* gcc.target/s390/vector/long-double-wfmsxb-disabled.c: New test.
* gcc.target/s390/vector/long-double-wfmsxb.c: New test.
* gcc.target/s390/vector/long-double-wfmxb.c: New test.
* gcc.target/s390/vector/long-double-wfnmaxb-disabled.c: New test.
* gcc.target/s390/vector/long-double-wfnmaxb.c: New test.
* gcc.target/s390/vector/long-double-wfnmsxb-disabled.c: New test.
* gcc.target/s390/vector/long-double-wfnmsxb.c: New test.
* gcc.target/s390/vector/long-double-wfsqxb.c: New test.
* gcc.target/s390/vector/long-double-wfsxb-1.c: New test.
* gcc.target/s390/vector/long-double-wfsxb.c: New test.
* gcc.target/s390/vector/long-double-wftcixb-1.c: New test.
* gcc.target/s390/vector/long-double-wftcixb.c: New test.

IBM Z: Store long doubles in vector registers when possible

On z14+, there are instructions for working with 128-bit floats (long
doubles) in vector registers.  It's beneficial to use them instead of
instructions that operate on floating point register pairs, because it
allows storing 4 times more data in registers at a time, relieving
register pressure.  The raw performance of the new instructions is
almost the same as that of the old ones.

Implement by storing TFmode values in vector registers on z14+.  Since
not all operations are available with the new instructions, keep the
old ones available using the new FPRX2 mode, and convert between it and
TFmode when necessary (the expanders doing this are called "forwarder"
expanders below).
Change the existing TFmode expanders to call either new- or old-style
ones depending on whether we are on z14+ or older machines
("dispatcher" expanders).

gcc/ChangeLog:

2020-11-03  Ilya Leoshkevich  <iii@linux.ibm.com>

* config/s390/s390-modes.def (FPRX2): New mode.
* config/s390/s390-protos.h (s390_fma_allowed_p): New function.
* config/s390/s390.c (s390_fma_allowed_p): Likewise.
(s390_build_signbit_mask): Support 128-bit masks.
(print_operand): Support printing the second word of a TFmode
operand as vector register.
(constant_modes): Add FPRX2mode.
(s390_class_max_nregs): Return 1 for TFmode on z14+.
(s390_is_fpr128): New function.
(s390_is_vr128): Likewise.
(s390_can_change_mode_class): Use s390_is_fpr128 and
s390_is_vr128 in order to determine whether mode refers to a FPR
pair or to a VR.
(s390_emit_compare): Force TFmode operands into registers on
z14+.
* config/s390/s390.h (HAVE_TF): New macro.
(EXPAND_MOVTF): New macro.
(EXPAND_TF): Likewise.
* config/s390/s390.md (PFPO_OP_TYPE_FPRX2): PFPO_OP_TYPE_TF
alias.
(ALL): Add FPRX2.
(FP_ALL): Add FPRX2 for z14+, restrict TFmode to z13-.
(FP): Likewise.
(FP_ANYTF): New mode iterator.
(BFP): Add FPRX2 for z14+, restrict TFmode to z13-.
(TD_TF): Likewise.
(xde): Add FPRX2.
(nBFP): Likewise.
(nDFP): Likewise.
(DSF): Likewise.
(DFDI): Likewise.
(SFSI): Likewise.
(DF): Likewise.
(SF): Likewise.
(fT0): Likewise.
(bt): Likewise.
(_d): Likewise.
(HALF_TMODE): Likewise.
(tf_fpr): New mode_attr.
(type): New mode_attr.
(*cmp<mode>_ccz_0): Use type instead of mode with fsimp.
(*cmp<mode>_ccs_0_fastmath): Likewise.
(*cmptf_ccs): New pattern for wfcxb.
(*cmptf_ccsfps): New pattern for wfkxb.
(mov<mode>): Rename to mov<mode><tf_fpr>.
(signbit<mode>2): Rename to signbit<mode>2<tf_fpr>.
(isinf<mode>2): Rename to isinf<mode>2<tf_fpr>.
(*TDC_insn_<mode>): Use type instead of mode with fsimp.
(fixuns_trunc<FP:mode><GPR:mode>2): Rename to
fixuns_trunc<FP:mode><GPR:mode>2<FP:tf_fpr>.
(fix_trunctf<mode>2): Rename to fix_trunctf<mode>2_fpr.
(floatdi<mode>2): Rename to floatdi<mode>2<tf_fpr>, use type
instead of mode with itof.
(floatsi<mode>2): Rename to floatsi<mode>2<tf_fpr>, use type
instead of mode with itof.
(*floatuns<GPR:mode><FP:mode>2): Use type instead of mode for
itof.
(floatuns<GPR:mode><FP:mode>2): Rename to
floatuns<GPR:mode><FP:mode>2<tf_fpr>.
(trunctf<mode>2): Rename to trunctf<mode>2_fpr, use type instead
of mode with fsimp.
(extend<DSF:mode><BFP:mode>2): Rename to
extend<DSF:mode><BFP:mode>2<BFP:tf_fpr>.
(<FPINT:fpint_name><BFP:mode>2): Rename to
<FPINT:fpint_name><BFP:mode>2<BFP:tf_fpr>, use type instead of
mode with fsimp.
(rint<BFP:mode>2): Rename to rint<BFP:mode>2<BFP:tf_fpr>, use
type instead of mode with fsimp.
(<FPINT:fpint_name><DFP:mode>2): Use type instead of mode for
fsimp.
(rint<DFP:mode>2): Likewise.
(trunc<BFP:mode><DFP_ALL:mode>2): Rename to
trunc<BFP:mode><DFP_ALL:mode>2<BFP:tf_fpr>.
(trunc<DFP_ALL:mode><BFP:mode>2): Rename to
trunc<DFP_ALL:mode><BFP:mode>2<BFP:tf_fpr>.
(extend<BFP:mode><DFP_ALL:mode>2): Rename to
extend<BFP:mode><DFP_ALL:mode>2<BFP:tf_fpr>.
(extend<DFP_ALL:mode><BFP:mode>2): Rename to
extend<DFP_ALL:mode><BFP:mode>2<BFP:tf_fpr>.
(add<mode>3): Rename to add<mode>3<tf_fpr>, use type instead of
mode with fsimp.
(*add<mode>3_cc): Use type instead of mode with fsimp.
(*add<mode>3_cconly): Likewise.
(sub<mode>3): Rename to sub<mode>3<tf_fpr>, use type instead of
mode with fsimp.
(*sub<mode>3_cc): Use type instead of mode with fsimp.
(*sub<mode>3_cconly): Likewise.
(mul<mode>3): Rename to mul<mode>3<tf_fpr>, use type instead of
mode with fsimp.
(fma<mode>4): Restrict using s390_fma_allowed_p.
(fms<mode>4): Restrict using s390_fma_allowed_p.
(div<mode>3): Rename to div<mode>3<tf_fpr>, use type instead of
mode with fdiv.
(neg<mode>2): Rename to neg<mode>2<tf_fpr>.
(*neg<mode>2_cc): Use type instead of mode with fsimp.
(*neg<mode>2_cconly): Likewise.
(*neg<mode>2_nocc): Likewise.
(*neg<mode>2): Likewise.
(abs<mode>2): Rename to abs<mode>2<tf_fpr>, use type instead of
mode with fdiv.
(*abs<mode>2_cc): Use type instead of mode with fsimp.
(*abs<mode>2_cconly): Likewise.
(*abs<mode>2_nocc): Likewise.
(*abs<mode>2): Likewise.
(*negabs<mode>2_cc): Likewise.
(*negabs<mode>2_cconly): Likewise.
(*negabs<mode>2_nocc): Likewise.
(*negabs<mode>2): Likewise.
(sqrt<mode>2): Rename to sqrt<mode>2<tf_fpr>, use type instead
of mode with fsqrt.
(cbranch<mode>4): Use FP_ANYTF instead of FP.
(copysign<mode>3): Rename to copysign<mode>3<tf_fpr>, use type
instead of mode with fsimp.
* config/s390/s390.opt (flag_vx_long_double_fma): New
undocumented option.
* config/s390/vector.md (V_HW): Add TF for z14+.
(V_HW2): Likewise.
(VFT): Likewise.
(VF_HW): Likewise.
(V_128): Likewise.
(tf_vr): New mode_attr.
(tointvec): Add TF.
(mov<mode>): Rename to mov<mode><tf_vr>.
(movtf): New dispatcher.
(*vec_tf_to_v1tf): Rename to *vec_tf_to_v1tf_fpr, restrict to
z13-.
(*vec_tf_to_v1tf_vr): New pattern for z14+.
(*fprx2_to_tf): Likewise.
(*mov_tf_to_fprx2_0): Likewise.
(*mov_tf_to_fprx2_1): Likewise.
(add<mode>3): Rename to add<mode>3<tf_vr>.
(addtf3): New dispatcher.
(sub<mode>3): Rename to sub<mode>3<tf_vr>.
(subtf3): New dispatcher.
(mul<mode>3): Rename to mul<mode>3<tf_vr>.
(multf3): New dispatcher.
(div<mode>3): Rename to div<mode>3<tf_vr>.
(divtf3): New dispatcher.
(sqrt<mode>2): Rename to sqrt<mode>2<tf_vr>.
(sqrttf2): New dispatcher.
(fma<mode>4): Restrict using s390_fma_allowed_p.
(fms<mode>4): Likewise.
(neg_fma<mode>4): Likewise.
(neg_fms<mode>4): Likewise.
(neg<mode>2): Rename to neg<mode>2<tf_vr>.
(negtf2): New dispatcher.
(abs<mode>2): Rename to abs<mode>2<tf_vr>.
(abstf2): New dispatcher.
(float<mode>tf2_vr): New forwarder.
(float<mode>tf2): New dispatcher.
(floatuns<mode>tf2_vr): New forwarder.
(floatuns<mode>tf2): New dispatcher.
(fix_trunctf<mode>2_vr): New forwarder.
(fix_trunctf<mode>2): New dispatcher.
(fixuns_trunctf<mode>2_vr): New forwarder.
(fixuns_trunctf<mode>2): New dispatcher.
(<FPINT:fpint_name><VF_HW:mode>2<VF_HW:tf_vr>): New pattern.
(<FPINT:fpint_name>tf2): New forwarder.
(rint<mode>2<tf_vr>): New pattern.
(rinttf2): New forwarder.
(*trunctfdf2_vr): New pattern.
(trunctfdf2_vr): New forwarder.
(trunctfdf2): New dispatcher.
(trunctfsf2_vr): New forwarder.
(trunctfsf2): New dispatcher.
(extenddftf2_vr): New pattern.
(extenddftf2): New dispatcher.
(extendsftf2_vr): New forwarder.
(extendsftf2): New dispatcher.
(signbittf2_vr): New forwarder.
(signbittf2): New dispatcher.
(isinftf2_vr): New forwarder.
(isinftf2): New dispatcher.
* config/s390/vx-builtins.md (*vftci<mode>_cconly): Use VF_HW
instead of VECF_HW, add missing constraint, add vw support.
(vftci<mode>_intcconly): Use VF_HW instead of VECF_HW.
(*vftci<mode>): Rename to vftci<mode>, use VF_HW instead of
VECF_HW, add vw support.
(vftci<mode>_intcc): Use VF_HW instead of VECF_HW.

Fix wrong code for boolean negation in condition at -O2

The problem is the bitwise/logical dichotomy for operators and the
transition from the former to the latter for boolean types: if they
are 1-bit, that's straightforward but, if they are larger, then you
need to be careful because you cannot, on the one hand, turn a bitwise
AND into a logical AND and, on the other hand, *not* turn e.g. a
bitwise NOT into a logical NOT if they occur in the same computation,
as the first change will drop the masking that may need to be applied
after the bitwise NOT if it is not also changed.

Given that the ranger turns bitwise AND/OR into logical AND/OR for
booleans, the patch does the same for bitwise NOT.
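
The hazard can be reproduced by hand with an 8-bit value standing in for a wide boolean. This is only a sketch of the arithmetic, not of the ranger code; the three function names are made up for the illustration.

```cpp
#include <cassert>
#include <cstdint>

// Bitwise NOT followed by a bitwise AND mask: correct for an 8-bit
// boolean that holds 0 or 1.
inline bool not_then_mask (std::uint8_t b)
{
  return ((~b) & 1) != 0;
}

// Inconsistent rewrite: the AND became logical (dropping the mask) while
// the NOT stayed bitwise, so 0xFE is treated as "true".
inline bool mixed_rewrite (std::uint8_t b)
{
  return static_cast<std::uint8_t> (~b) && true;
}

// Consistent rewrite: both operators are logical.
inline bool logical_rewrite (std::uint8_t b)
{
  return !b && true;
}
```

For `b == 1`, `not_then_mask` and `logical_rewrite` both yield false, whereas `mixed_rewrite` yields true.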

gcc/ChangeLog:
* range-op.cc (operator_logical_not::fold_range): Tidy up.
(operator_logical_not::op1_range): Call above method.
(operator_bitwise_not::fold_range): If the type is compatible
with boolean, call op_logical_not.fold_range.
(operator_bitwise_not::op1_range): If the type is compatible
with boolean, call op_logical_not.op1_range.

gcc/testsuite/ChangeLog:
* gnat.dg/opt88.adb: New test.

More PRE TLC

This makes get_expr_value_id cheap and completes the
constant value-id simplification by turning the constant_value_expressions
into a direct map instead of a set of pre_exprs for the value.

2020-11-10 Richard Biener <rguenther@suse.de>

* tree-ssa-pre.c (pre_expr_d::value_id): Add.
(constant_value_expressions): Turn into an array of pre_expr.
(get_or_alloc_expr_for_nary): New function.
(get_or_alloc_expr_for_reference): Likewise.
(add_to_value): For constant values only ever add a single
CONSTANT.
(get_expr_value_id): Return the new value_id member.
(vn_valnum_from_value_id): Split out and simplify constant
value id handling.
(get_or_alloc_expr_for_constant): Set the value_id member.
(phi_translate_1): Use get_or_alloc_expr_for_*.
(compute_avail): Likewise.
(bitmap_find_leader): Simplify constant value id handling.

aarch64: Skip arm targets in vq*shr*n_high_n intrinsic tests

These tests should be skipped for arm targets as the intrinsics
are only supported on aarch64.

gcc/testsuite/ChangeLog

2020-11-10 David Candler <david.candler@arm.com>

* gcc.target/aarch64/advsimd-intrinsics/vqrshrn_high_n.c: Added skip
directive.
* gcc.target/aarch64/advsimd-intrinsics/vqrshrun_high_n.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vqshrn_high_n.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vqshrun_high_n.c: Likewise.

doc: Fix grammar in description of earlyclobber

gcc/ChangeLog:

* doc/md.texi (Modifiers): Fix grammar in description of
earlyclobber constraint modifier.

sccvn: Fix up push_partial_def little-endian bitfield handling [PR97764]

This patch fixes a thinko in the little-endian push_partial_def path.
As the testcase shows, we have 3 bitfields in the struct,
bitoff  bitsize
0       3
3       28
31      1
the corresponding read is the byte at offset 3 (i.e. 24 bits)
and push_partial_def first handles the full store ({}) to all bits
and then is processing the store to the middle bitfield with value of -1.
Here are the interesting spots:
  pd.offset -= offseti;
this adjusts the pd to { -21, 28 }, the (for little-endian lowest) 21
bits aren't interesting to us, we only care about the upper 7.
          len = native_encode_expr (pd.rhs, this_buffer, bufsize,
                                    MAX (0, -pd.offset) / BITS_PER_UNIT);
native_encode_expr has the offset parameter in bytes and we tell it
that we aren't interested in the first (lowest) two bytes of the number.
It encodes 0xff, 0xff with len == 2 then.
      HOST_WIDE_INT size = pd.size;
      if (pd.offset < 0)
        size -= ROUND_DOWN (-pd.offset, BITS_PER_UNIT);
we get 28 - 16, i.e. 12 - the 16 is subtracting those 2 bytes that we
omitted in native_encode_expr.
          size = MIN (size, (HOST_WIDE_INT) needed_len * BITS_PER_UNIT);
needed_len is how many bytes the read at most needs, and that is 1,
so we get size 8 and copy all 8 bits (i.e. a single byte plus nothing)
from the native_encode_expr filled this_buffer; this incorrectly sets
the byte to 0xff when we want 0x7f.  The above line is correct for the
pd.offset >= 0 case when we don't skip anything, but for the pd.offset < 0
case we need to subtract also the remainder of the bits we aren't interested
in (the code shifts the bytes by that number of bits).
If it weren't for the big-endian path, we could as well do
      if (pd.offset < 0)
        size += pd.offset;
but the big-endian path needs it differently.
With the following patch, amnt is 3 and we subtract from 12 the (8 - 3)
bits and thus get the 7 which is the value we want.
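
The arithmetic above can be replayed in isolation. The sketch below is a standalone model, not the code from tree-ssa-sccvn.c: the function names, the constant `kBitsPerUnit`, the way `amnt` is derived from the offset, and the placement of the adjustment before the MIN are all assumptions made for the illustration.

```cpp
#include <algorithm>
#include <cassert>

constexpr long kBitsPerUnit = 8;  // stands in for BITS_PER_UNIT

constexpr long round_down (long v, long align)
{
  return v / align * align;
}

// Size computation as described before the fix.
inline long bits_to_copy_old (long pd_offset, long pd_size, long needed_len)
{
  long size = pd_size;
  if (pd_offset < 0)
    size -= round_down (-pd_offset, kBitsPerUnit);
  return std::min (size, needed_len * kBitsPerUnit);
}

// Size computation with the extra subtraction for the skipped bits.
inline long bits_to_copy_fixed (long pd_offset, long pd_size, long needed_len)
{
  long size = pd_size;
  if (pd_offset < 0)
    {
      size -= round_down (-pd_offset, kBitsPerUnit);
      // Assumption: amnt is the offset reduced modulo the unit size,
      // normalized to be non-negative (3 for an offset of -21).
      long amnt = ((pd_offset % kBitsPerUnit) + kBitsPerUnit) % kBitsPerUnit;
      if (amnt != 0)
        size -= kBitsPerUnit - amnt;
    }
  return std::min (size, needed_len * kBitsPerUnit);
}
```

For the pd of { -21, 28 } and a one-byte read, the old computation gives 8 bits and the adjusted one gives 7, matching the walkthrough.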

2020-11-10  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/97764
* tree-ssa-sccvn.c (vn_walk_cb_data::push_partial_def): For
little-endian stores with negative pd.offset, subtract
BITS_PER_UNIT - amnt from size if amnt is non-zero.

* gcc.c-torture/execute/pr97764.c: New test.

Fortran: Fix function decl's location [PR95847]

gcc/fortran/ChangeLog:

PR fortran/95847
* trans-decl.c (gfc_get_symbol_decl): Do not (re)set the location
of an external procedure.
(build_entry_thunks, generate_coarray_init, create_main_function,
gfc_generate_function_code): Use fndecl's location in BIND_EXPR.

gcc/testsuite/ChangeLog:

PR fortran/95847
* gfortran.dg/coverage.f90: New test.

tree-optimization/97760 - reduction paths with unhandled live stmt

This makes sure we reject reduction paths with a live stmt that
is not the last one altering the value. This is because we do not
handle this in the epilogue unless there's a scalar epilogue loop.

2020-11-09 Richard Biener <rguenther@suse.de>

PR tree-optimization/97760
* tree-vect-loop.c (check_reduction_path): Reject
reduction paths we do not handle in epilogue generation.

* gcc.dg/vect/pr97760.c: New testcase.

Normalize VARYING for -fstrict-enums.

The problem here is that the representation for VARYING in
-fstrict-enums is different between value_range and irange.

The helper function irange::normalize_min_max() will normalize to
VARYING only if setting the range to the entire domain of the
underlying type.  That is, [0, 0xff..ff], not the domain as defined by
-fstrict-enums.  This causes problems because the multi-range version
of varying_p() will return true if the range is the domain as defined
by -fstrict-enums.  Thus, normalize_min_max and varying_p have
different concepts of varying for multi-ranges.

(BTW, legacy ranges are different because they never look at the
extremes of a range to determine varying-ness.  They only look at the
kind field.)

One approach is to change all the code to limit ranges to the domain
in the -fstrict-enums world, but this won't work because there are
various instances of gimple where the values assigned or compared are
beyond the limits of TYPE_{MIN,MAX}_VALUE.  One example is the
addition of 0xffffffff to represent subtraction.

This patch fixes multi-range varying_p() and set_varying() to agree
with the normalization code, using the extremes of the underlying type,
to represent varying.
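
A toy model of the convention the patch settles on, assuming a 32-bit unsigned underlying type and an enum whose enumerators span [0, 3]. The type `toy_range` and the free functions are hypothetical stand-ins, not the irange API.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Toy single-pair range over a 32-bit unsigned underlying type.
struct toy_range
{
  std::uint32_t lo;
  std::uint32_t hi;
};

// Domain of a hypothetical enum as -fstrict-enums would define it.
inline toy_range strict_enum_domain ()
{
  return toy_range{ 0u, 3u };
}

// Varying is represented by the extremes of the underlying type.
inline toy_range set_varying ()
{
  return toy_range{ std::numeric_limits<std::uint32_t>::min (),
                    std::numeric_limits<std::uint32_t>::max () };
}

// Agrees with set_varying: only the full underlying domain is varying.
inline bool varying_p (const toy_range &r)
{
  return r.lo == std::numeric_limits<std::uint32_t>::min ()
         && r.hi == std::numeric_limits<std::uint32_t>::max ();
}
```

Under this convention the strict-enum domain [0, 3] is an ordinary range, not varying, so setting and querying varying-ness use the same definition.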

gcc/ChangeLog:

PR tree-optimization/97767
* value-range.cc (dump_bound_with_infinite_markers): Use
wi::min_value and wi::max_value.
(range_tests_strict_enum): New.
(range_tests): Call range_tests_strict_enum.
* value-range.h (irange::varying_p): Use wi::min_value
and wi::max_value.
(irange::set_varying): Same.
(irange::normalize_min_max): Remove comment.

gcc/testsuite/ChangeLog:

* g++.dg/opt/pr97767.C: New test.

Adjust Keylocker regex pattern for darwin, and add missing aesenc256kl test.

gcc/testsuite/ChangeLog

* gcc.target/i386/keylocker-aesdec128kl.c: Adjust regex patterns.
* gcc.target/i386/keylocker-aesdec256kl.c: Likewise.
* gcc.target/i386/keylocker-aesdecwide128kl.c: Likewise.
* gcc.target/i386/keylocker-aesdecwide256kl.c: Likewise.
* gcc.target/i386/keylocker-aesenc128kl.c: Likewise.
* gcc.target/i386/keylocker-aesencwide128kl.c: Likewise.
* gcc.target/i386/keylocker-aesencwide256kl.c: Likewise.
* gcc.target/i386/keylocker-encodekey128.c: Likewise.
* gcc.target/i386/keylocker-encodekey256.c: Likewise.
* gcc.target/i386/keylocker-aesenc256kl.c: New test.

Fix logical_combine OR operation. Again.

The original fix was incorrect and resulted in a loss of opportunities.
Revert the original fix. When processing logical chains, do not
follow chains outside of the current basic block. Use the import
value instead.

gcc/
PR tree-optimization/97567
* gimple-range-gori.cc (gori_compute::logical_combine): False
OR operations should intersect the 2 results.
(gori_compute::compute_logical_operands_in_chain): If def chains
are outside the current basic block, don't follow them.
gcc/testsuite/
* gcc.dg/pr97567-2.c: New.

Daily bump.

c++: DR 1914 - Allow duplicate standard attributes.

Following Joseph's change for C to allow duplicate C2x standard attributes
<https://gcc.gnu.org/pipermail/gcc-patches/2020-October/557272.html>,
this patch does a similar thing for C++.  This is DR 1914, to be resolved by
<wg21.link/p2156>, which is not part of the standard yet, but has wide
support so looks like a shoo-in.  The duplications now produce warnings
instead, but only if the attribute wasn't specified via a macro.

gcc/c-family/ChangeLog:

DR 1914
* c-common.c (attribute_fallthrough_p): Tweak the warning
message.

gcc/cp/ChangeLog:

DR 1914
* parser.c (cp_parser_check_std_attribute): Return bool.  Add a
location_t parameter.  Return true if the attribute wasn't duplicated.
Give a warning instead of an error.  Check more attributes.
(cp_parser_std_attribute_list): Don't add duplicated attributes to
the list.  Pass location to cp_parser_check_std_attribute.

gcc/testsuite/ChangeLog:

DR 1914
* c-c++-common/attr-fallthrough-2.c: Adjust dg-warning.
* g++.dg/cpp0x/fallthrough2.C: Likewise.
* g++.dg/cpp0x/gen-attrs-60.C: Turn dg-error into dg-warning.
* g++.dg/cpp1y/attr-deprecated-2.C: Likewise.
* g++.dg/cpp2a/attr-likely2.C: Adjust dg-warning.
* g++.dg/cpp2a/nodiscard-once.C: Turn dg-error into dg-warning.
* g++.dg/cpp0x/gen-attrs-72.C: New test.

c++: Consider only relevant template arguments in sat_hasher

A large source of cache misses in satisfy_atom is caused by the identity
of an (atom,args) pair within the satisfaction cache being determined by
the entire set of supplied template arguments rather than by the subset
of template arguments that the atom actually depends on.  For instance,
consider

  template <class T> concept range = range_v<T>;
  template <class U> void foo () requires range<U>;
  template <class U, class V> void bar () requires range<U>;

The associated constraints of foo and bar are equivalent: they both
consist of the atom range_v<T> (with mapping T -> U).  But the sat_cache
currently will never reuse a satisfaction value between the two atoms
because foo has one template parameter and bar has two, and the
satisfaction cache conservatively assumes that all template parameters
of the constrained decl are relevant to a satisfaction value of one of
its atoms.

This patch eliminates this assumption and makes the sat_cache instead
care about just the subset of args of an (atom,args) pair that is
relevant to satisfaction.
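
The effect on the cache key can be modelled with a small table. This is a toy model only: `toy_sat_cache`, its string-based key, and the placeholder evaluation are assumptions for the illustration and bear no relation to the data structures in constraint.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy satisfaction cache keyed on the atom plus only the arguments the
// atom's parameter mapping actually uses.
struct toy_sat_cache
{
  typedef std::pair<std::string, std::vector<std::string> > key;
  std::map<key, bool> entries;
  int misses = 0;

  // 'relevant' holds indices into 'args' that the atom depends on.
  bool satisfy (const std::string &atom,
                const std::vector<std::string> &args,
                const std::vector<std::size_t> &relevant)
  {
    std::vector<std::string> used;
    for (std::size_t i : relevant)
      used.push_back (args.at (i));
    key k (atom, used);
    std::map<key, bool>::iterator it = entries.find (k);
    if (it != entries.end ())
      return it->second;
    ++misses;
    bool value = !used.empty ();  // placeholder for the real evaluation
    entries.insert (std::make_pair (k, value));
    return value;
  }
};
```

Querying the same atom for foo with arguments (int) and for bar with arguments (int, char), both with only index 0 marked relevant, costs a single miss; keying on the full argument list would cost two.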

This patch additionally fixes a seemingly latent bug that was found when
testing against range-v3.  In the testcase concepts-decltype2.C below,
during normalization of f's constraints we end up forming a TARGET_EXPR
whose _SLOT has a DECL_CONTEXT that points to g instead of f because
current_function_decl is not updated before we start normalizing.
This patch fixes this accordingly, and also adds a sanity check to
keep_template_parm to verify each found parameter has a valid index.

With this patch, compile time and memory usage for the cmcstl2 test
test/algorithm/set_symmetric_difference4.cpp drop from 8.5s/1.2GB to
3.5s/0.4GB.

gcc/cp/ChangeLog:

* constraint.cc (norm_info::norm_info): Initialize orig_decl.
(norm_info::orig_decl): New data member.
(normalize_atom): When caching an atom for the first time,
compute a list of template parameters used in the targets of the
parameter mapping and store it in the TREE_TYPE of the mapping.
(get_normalized_constraints_from_decl): Set current_function_decl
appropriately when normalizing.  As an optimization, don't
set up a push_nested_class_guard when decl has no constraints.
(sat_hasher::hash): Use this list to hash only the template
arguments that are relevant to the atom.
(satisfy_atom): Use this list to compare only the template
arguments that are relevant to the atom.
* pt.c (keep_template_parm): Do a sanity check on the parameter's
index when flag_checking.

c++: Use two levels of caching in satisfy_atom

This improves the effectiveness of caching in satisfy_atom by querying
the cache again after we've instantiated the atom's parameter mapping.

Before instantiating its mapping, the identity of an (atom,args) pair
within the satisfaction cache is determined by idiosyncratic things like
the level and index of each template parameter used in targets of the
parameter mapping.  For example, the associated constraints of foo in

  template <class T> concept range = range_v<T>;
  template <class U, class V> void foo () requires range<U> && range<V>;

are range_v<T> (with mapping T -> U) /\ range_v<T> (with mapping T -> V).
If during satisfaction the template arguments supplied for U and V are
the same, then the satisfaction value of these two atoms will be the
same (despite their uninstantiated parameter mappings being different).

But sat_cache doesn't see this because it compares the uninstantiated
parameter mapping and the supplied template arguments of sat_entries
independently.  So satisfy_atom on this latter atom will end up fully
evaluating it instead of reusing the satisfaction value of the former.

But there is a point when the two atoms do look the same to sat_cache,
and that's after instantiating their parameter mappings.  By querying
the cache again at this point, we can avoid substituting the same
instantiated parameter mapping into the same expression a second time
around.

With this patch, compile time and memory usage for the cmcstl2 test
test/algorithm/set_symmetric_difference4.cpp drop from 11s/1.4GB to
8.5s/1.2GB with an --enable-checking=release compiler.

gcc/cp/ChangeLog:

* cp-tree.h (ATOMIC_CONSTR_MAP_INSTANTIATED_P): Define this flag
for ATOMIC_CONSTRs.
* constraint.cc (sat_hasher::hash): Use hash_atomic_constraint
if the flag is set, otherwise keep using a pointer hash.
(sat_hasher::equal): Return false if the flag's setting differs
on two atoms.  Call atomic_constraints_identical_p if the flag
is set, otherwise keep using a pointer equality test.
(satisfy_atom): After instantiating the parameter mapping, form
another ATOMIC_CONSTR using the instantiated mapping and query
the cache again.  Cache the satisfaction value of both atoms.
(diagnose_atomic_constraint): Simplify now that the supplied
atom has an instantiated mapping.

c++: Reuse identical ATOMIC_CONSTRs during normalization

Profiling revealed that sat_hasher::equal accounts for nearly 40% of
compile time in some cmcstl2 tests.

This patch eliminates this bottleneck by caching the ATOMIC_CONSTRs
returned by normalize_atom. This in turn allows us to replace the
expensive atomic_constraints_identical_p check in sat_hasher::equal
with cheap pointer equality, with no loss in cache hit rate.
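
The general technique here is interning: if structurally identical objects are always handed out as the same pointer, equality reduces to a pointer comparison. A minimal sketch under that assumption follows; `toy_atom` and `atom_interner` are hypothetical names, and the string key stands in for whatever structural hash the real cache uses.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// Toy atom; the string stands in for the atom's structure.
struct toy_atom
{
  std::string expr;
};

// Hands out one canonical object per distinct structure.
class atom_interner
{
  std::unordered_map<std::string, std::unique_ptr<toy_atom> > table_;

public:
  const toy_atom *intern (const std::string &expr)
  {
    std::unique_ptr<toy_atom> &slot = table_[expr];
    if (!slot)
      slot.reset (new toy_atom{ expr });
    return slot.get ();
  }
};
```

The structural comparison is paid once, at interning time; later lookups that compare two interned atoms only compare addresses.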

With this patch, compile time for the cmcstl2 test
test/algorithm/set_symmetric_difference4.cpp drops from 19s to 11s with
an --enable-checking=release compiler.

gcc/cp/ChangeLog:

* constraint.cc (atom_cache): Define this deletable hash_table.
(normalize_atom): Use it to cache ATOMIC_CONSTRs when not
generating diagnostics.
(sat_hasher::hash): Use htab_hash_pointer instead of
hash_atomic_constraint.
(sat_hasher::equal): Test for pointer equality instead of
atomic_constraints_identical_p.
* cp-tree.h (struct atom_hasher): Moved and renamed from ...
* logic.cc (struct constraint_hash): ... here.
(clause::m_set): Adjust accordingly.

c++: Fix ICE with variadic concepts and aliases [PR93907]

This patch (naively) extends the PR93907 fix to also apply to variadic
concepts invoked with a type argument pack. Without this, we ICE on
the below testcase (a variadic version of concepts-using2.C) in the same
manner as we used to on concepts-using2.C before r10-7133.

gcc/cp/ChangeLog:

PR c++/93907
* constraint.cc (tsubst_parameter_mapping): Also canonicalize
the type arguments of a TYPE_ARGUMENT_PACK.

gcc/testsuite/ChangeLog:

PR c++/93907
* g++.dg/cpp2a/concepts-using3.C: New test, based off of
concepts-using2.C.

MAINTAINERS: Add myself for write after approval

2020-11-09 Pat Bernardi <bernardi@adacore.com>

* MAINTAINERS (Write After Approval): Add myself.

libstdc++: Remove <debug/array>

Add _GLIBCXX_ASSERTIONS assertions to the normal std::array and remove the
__gnu_debug::array implementation.

libstdc++-v3/ChangeLog:

* include/debug/array: Remove.
* include/Makefile.am: Remove <debug/array>.
* include/Makefile.in: Regenerate.
* include/experimental/functional: Adapt.
* include/std/array: Move to _GLIBCXX_INLINE_VERSION namespace.
* include/std/functional: Adapt.
* include/std/span: Adapt.
* testsuite/23_containers/array/debug/back1_neg.cc:
Remove dg-require-debug-mode. Add -D_GLIBCXX_ASSERTIONS option.
* testsuite/23_containers/array/debug/back2_neg.cc: Likewise.
* testsuite/23_containers/array/debug/front1_neg.cc: Likewise.
* testsuite/23_containers/array/debug/front2_neg.cc: Likewise.
* testsuite/23_containers/array/debug/square_brackets_operator1_neg.cc:
Likewise.
* testsuite/23_containers/array/debug/square_brackets_operator2_neg.cc:
Likewise.
* testsuite/23_containers/array/element_access/60497.cc
* testsuite/23_containers/array/tuple_interface/get_debug_neg.cc:
Remove.
* testsuite/23_containers/array/tuple_interface/get_neg.cc
* testsuite/23_containers/array/tuple_interface/tuple_element_debug_neg.cc
* testsuite/23_containers/array/tuple_interface/tuple_element_neg.cc

c++: Call tsubst_argument_pack from tsubst.

This was unnecessary (and incomplete) code duplication.

gcc/cp/ChangeLog:

* pt.c (tsubst): Replace *_ARGUMENT_PACK code with
a call to tsubst_argument_pack.

c++: Improve error location for class using-decl.

We should use the location of the using-declaration, not the location of the
class.

gcc/cp/ChangeLog:

* class.c (handle_using_decl): Add an iloc_sentinel.

gcc/testsuite/ChangeLog:

* g++.dg/lookup/using26.C: Adjust location.
* g++.old-deja/g++.other/using1.C: Adjust location.

libstdc++: Make _GLIBCXX_DEBUG checks constexpr compatible

libstdc++-v3/ChangeLog:

* include/debug/assertions.h (__glibcxx_requires_non_empty_range):
Remove __builtin_expect.
(__glibcxx_requires_subscript): Likewise.
(__glibcxx_requires_nonempty): Likewise.
* include/debug/formatter.h (__check_singular): Add C++11 constexpr
qualification.
* include/debug/helper_functions.h (__check_singular): Likewise. Skip
check if constant evaluated.
(__valid_range): Do not skip check if constant evaluated.
* include/debug/macros.h (_GLIBCXX_DEBUG_VERIFY_COND_AT): Add
__builtin_expect.
(_GLIBCXX_DEBUG_VERIFY_AT_F): Use __glibcxx_assert_1.
* testsuite/21_strings/basic_string_view/element_access/char/back_constexpr_neg.cc:
New test.
* testsuite/21_strings/basic_string_view/element_access/char/constexpr.cc: New test.
* testsuite/21_strings/basic_string_view/element_access/char/constexpr_neg.cc: New test.
* testsuite/21_strings/basic_string_view/element_access/char/front_back_constexpr.cc:
New test.
* testsuite/21_strings/basic_string_view/element_access/char/front_constexpr_neg.cc:
New test.
* testsuite/21_strings/basic_string_view/element_access/wchar_t/back_constexpr_neg.cc:
New test.
* testsuite/21_strings/basic_string_view/element_access/wchar_t/constexpr.cc: New test.
* testsuite/21_strings/basic_string_view/element_access/wchar_t/constexpr_neg.cc: New test.
* testsuite/21_strings/basic_string_view/element_access/wchar_t/front_constexpr_neg.cc:
New test.
* testsuite/25_algorithms/lower_bound/debug/constexpr_partitioned_neg.cc: New test.
* testsuite/25_algorithms/lower_bound/debug/constexpr_partitioned_pred_neg.cc: New test.
* testsuite/25_algorithms/lower_bound/debug/constexpr_valid_range_neg.cc: New test.
* testsuite/25_algorithms/lower_bound/debug/partitioned_neg.cc: New test.
* testsuite/25_algorithms/lower_bound/debug/partitioned_pred_neg.cc: New test.
* testsuite/25_algorithms/upper_bound/debug/constexpr_partitioned_neg.cc: New test.
* testsuite/25_algorithms/upper_bound/debug/constexpr_partitioned_pred_neg.cc: New test.
* testsuite/25_algorithms/upper_bound/debug/constexpr_valid_range_neg.cc: New test.
* testsuite/25_algorithms/upper_bound/debug/partitioned_neg.cc: New test.
* testsuite/25_algorithms/upper_bound/debug/partitioned_pred_neg.cc: New test.

c++: Fix -Wvexing-parse ICE with omitted int [PR97762]

For declarations like

long f();

decl_specifiers->type will be NULL, but I neglected to handle this case,
therefore we ICE. So handle this case by pretending we've seen 'int',
which is good enough for -Wvexing-parse's purposes.

gcc/cp/ChangeLog:

PR c++/97762
* parser.c (warn_about_ambiguous_parse): Handle the case when
there is no type in the decl-specifiers.

gcc/testsuite/ChangeLog:

PR c++/97762
* g++.dg/warn/Wvexing-parse8.C: New test.

c-family: Avoid unnecessary work when -Wpragmas is being ignored

This speeds up handle_pragma_diagnostic by avoiding computing a spelling
suggestion for an unrecognized option inside a #pragma directive when
-Wpragmas warnings are being suppressed.

In the range-v3 library, which contains many instances of

  #pragma GCC diagnostic push
  #pragma GCC diagnostic ignored "-Wpragmas"
  #pragma GCC diagnostic ignored "-Wfoo"
  ...
  #pragma GCC diagnostic pop

(where -Wfoo stands for a warning option we don't recognize), this
reduces compile time by 33% for some of its tests.
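
A minimal sketch of the short-circuit described above, assuming a hypothetical suggest_option helper with a toy similarity test (GCC's real spelling-suggestion logic is different) and a counter added only so the effect is observable:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical counter so the effect of the short-circuit can be observed.
int suggestion_lookups = 0;

// Hypothetical stand-in for an expensive spelling-suggestion search.
// Toy rule: same length and at most one differing character.
std::string suggest_option(const std::string &unknown,
                           const std::vector<std::string> &known) {
  ++suggestion_lookups;
  for (const std::string &k : known) {
    if (k.size() != unknown.size())
      continue;
    int mismatches = 0;
    for (std::size_t i = 0; i < k.size(); ++i)
      if (k[i] != unknown[i])
        ++mismatches;
    if (mismatches <= 1)
      return k;
  }
  return "";
}

// Returns the diagnostic text, or an empty string when the warning is
// suppressed.
std::string diagnose_unknown_option(const std::string &opt,
                                    bool wpragmas_enabled,
                                    const std::vector<std::string> &known) {
  if (!wpragmas_enabled)
    return "";  // Nothing will be emitted, so skip the suggestion lookup.
  std::string msg = "unknown option '" + opt + "'";
  std::string hint = suggest_option(opt, known);
  if (!hint.empty())
    msg += "; did you mean '" + hint + "'?";
  return msg;
}
```

The point is only the ordering: the cheap "is the warning enabled" test runs before the expensive search, so suppressed pragmas cost almost nothing.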

gcc/c-family/ChangeLog:

* c-pragma.c (handle_pragma_diagnostic): Split the
unknown-option -Wpragmas diagnostic into a warning and a
subsequent note containing a spelling suggestion.  Avoid
computing the suggestion if -Wpragmas warnings are being
suppressed.

gcc/testsuite/ChangeLog:

* gcc.dg/pragma-diag-6.c: Adjust expected diagnostics
accordingly.

c-family: Fix regression in location-overflow-test-1.c [PR97117]

The r11-3266 patch that added macro support to -Wmisleading-indentation
accidentally suppressed the column-tracking diagnostic in
get_visual_column in some cases, e.g. in the location-overflow-test-1.c
testcase.

More generally, when all three tokens are on the same line and we've run
out of locations with column info, then their location_t values will be
equal, and we exit early from should_warn_for_misleading_indentation due
to the new check

  /* Give up if the loci are not all distinct.  */
  if (guard_loc == body_loc || body_loc == next_stmt_loc)
    return false;

before we ever call get_visual_column.

[ This new check is needed to detect and give up on analyzing code
  fragments where exactly two out of the three tokens come from the same
  macro expansion, e.g.

    #define MACRO \
      if (a)      \
        foo ();

    MACRO; bar ();

  Here, guard_loc and body_loc will be equal and point to the macro
  expansion point (and next_stmt_loc will point to 'bar').  The heuristics
  that the warning uses are not really valid in scenarios like these.  ]

In order to restore the column-tracking diagnostic, this patch moves the
diagnostic code out from get_visual_column to earlier in
should_warn_for_misleading_indentation.  Moreover, it tests the three
locations for a zero column all at once, which I suppose should make us
issue the diagnostic more consistently.
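
The reordering can be sketched as follows. This is a simplified model, not the code in c-indentation.c: Loc, same_loc, Action and classify are hypothetical, and a column of 0 is assumed to mean that column information is unavailable.

```cpp
#include <cassert>

// Hypothetical simplified location: column 0 means "column info unavailable".
struct Loc {
  unsigned line;
  unsigned column;
};

bool same_loc(Loc a, Loc b) { return a.line == b.line && a.column == b.column; }

enum class Action { NoteNoColumns, GiveUp, Analyze };

// Checks run in the order the commit message describes: the zero-column
// test comes first, so it is not masked by the "loci not distinct" exit.
Action classify(Loc guard, Loc body, Loc next_stmt) {
  if (guard.column == 0 || body.column == 0 || next_stmt.column == 0)
    return Action::NoteNoColumns;
  if (same_loc(guard, body) || same_loc(body, next_stmt))
    return Action::GiveUp;
  return Action::Analyze;
}
```

With the checks in the opposite order, three tokens on one line with no column info would compare equal and hit the early exit before the column-tracking diagnostic could be issued.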

gcc/c-family/ChangeLog:

PR testsuite/97117
* c-indentation.c (get_visual_column): Remove location_t
parameter.  Move the column-tracking diagnostic code from here
to ...
(should_warn_for_misleading_indentation): ... here, before the
early exit for when the loci are not all distinct.  Don't pass a
location_t argument to get_visual_column.
(assert_get_visual_column_succeeds): Don't pass a location_t
argument to get_visual_column.
(assert_get_visual_column_fails): Likewise.

arc: Improve/add instruction patterns to better use MAC instructions.

ARC MYP7+ adds MAC instructions for both vector and scalar data
types. This patch adds a madd pattern for 16-bit data using the
32-bit MAC instruction, and dot_prod patterns for v4hi vector
types. The 64-bit moves are also upgraded by using the vadd2 instruction.
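
For orientation, the source-level shapes that such madd and dot_prod patterns are meant to match look roughly like this. The function names are hypothetical, and whether a given compiler actually selects a MAC instruction for them depends on target and options.

```cpp
#include <cassert>
#include <cstdint>

// Scalar multiply-accumulate on 16-bit inputs with a 32-bit accumulator,
// the shape a maddhisi4-style pattern is meant to match.
int32_t madd_hi_si(int16_t a, int16_t b, int32_t acc) {
  return acc + static_cast<int32_t>(a) * static_cast<int32_t>(b);
}

// Dot product over four 16-bit lanes accumulated into 32 bits, the shape an
// sdot_prodv4hi-style pattern is meant to match.
int32_t sdot_prod_v4hi(const int16_t a[4], const int16_t b[4], int32_t acc) {
  for (int i = 0; i < 4; ++i)
    acc = madd_hi_si(a[i], b[i], acc);
  return acc;
}
```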

2020-11-09 Claudiu Zissulescu <claziss@synopsys.com>

gcc/

* config/arc/arc.c (arc_split_move): Recognize vadd2 instructions.
* config/arc/arc.md (movdi_insn): Update pattern to use vadd2
instructions.
(movdf_insn): Likewise.
(maddhisi4): New pattern.
(umaddhisi4): Likewise.
* config/arc/simdext.md (mov<mode>_int): Update pattern to use
vadd2.
(sdot_prodv4hi): New pattern.
(udot_prodv4hi): Likewise.
(arc_vec_<V_US>mac_hi_v4hi): Update/renamed to
arc_vec_<V_US>mac_v2hiv2si.
(arc_vec_<V_US>mac_v2hiv2si_zero): New pattern.
* config/arc/constraints.md (Ral): Accumulator register
constraint.

Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com>

Clean up irange self tests.

Currently we have all the irange and range-op tests in range-op.cc.
This patch splits them up into the appropriate file (irange
tests in value-range.cc and range-op tests in range-op.cc). The patch
also splits up the tests themselves by functionality. It's not perfect,
but significantly better than the mess we had.

gcc/ChangeLog:

* function-tests.c (test_ranges): Call range_op_tests.
* range-op.cc (build_range3): Move to value-range.cc.
(range3_tests): Same.
(int_range_max_tests): Same.
(multi_precision_range_tests): Same.
(range_tests): Same.
(operator_tests): Split up...
(range_op_tests): Split up...
(range_op_cast_tests): ...here.
(range_op_lshift_tests): ...here.
(range_op_rshift_tests): ...here.
(range_op_bitwise_and_tests): ...here.
* selftest.h (range_op_tests): New.
* value-range.cc (build_range3): New.
(range_tests_irange3): New.
(range_tests_int_range_max): New.
(range_tests_legacy): New.
(range_tests_misc): New.
(range_tests): New.

Fortran: Fix OpenACC in specification-part checks [PR90111]

OpenACC's routine and declare directives can appear anywhere in the
specification part, i.e. before/after use-stmts, import-stmt, implicit-part,
or declaration-constructs.

gcc/fortran/ChangeLog:

PR fortran/90111
* parse.c (case_decl): Move ST_OACC_ROUTINE and ST_OACC_DECLARE to ...
(case_omp_decl): ... here.
(verify_st_order): Update comment.

gcc/testsuite/ChangeLog:

PR fortran/90111
* gfortran.dg/goacc/specification-part.f90: New test.

libstdc++: Improve comment on _Power_of_2 helper function

libstdc++-v3/ChangeLog:

* include/bits/uniform_int_dist.h (__detail::_Power_of_2):
Document that true result for zero is intentional.
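
A sketch of the usual bit trick behind such a helper, assuming the check has the common ((x - 1) & x) == 0 form; the function name here is hypothetical and this is not the libstdc++ source.

```cpp
#include <cassert>

// Classic bit trick: true for powers of two and, as a side effect of
// unsigned wraparound in x - 1, also true for zero.
constexpr bool power_of_2_or_zero(unsigned long x) {
  return ((x - 1) & x) == 0;
}
```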

libstdc++: Remove redundant check for zero in std::__popcount

The popcount built-ins work fine for zero, so there's no need to check
for it.

libstdc++-v3/ChangeLog:

* include/std/bit (__popcount): Remove redundant check for zero.
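
A quick illustration of the claim, using the GCC/Clang __builtin_popcount built-in directly; the wrapper name is hypothetical and this is not the std::__popcount implementation.

```cpp
#include <cassert>

// __builtin_popcount is well defined for zero, so no guard is needed.
int popcount_no_guard(unsigned x) { return __builtin_popcount(x); }
```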

tree-optimization/97761 - fix SLP live calculation

This removes a premature end of the DFS walk.

2020-11-09 Richard Biener <rguenther@suse.de>

PR tree-optimization/97761
* tree-vect-slp.c (vect_bb_slp_mark_live_stmts): Remove
premature end of DFS walk.

* gfortran.dg/vect/pr97761.f90: New testcase.

Cleanup irange::set.

[This is part of a larger patch that actually changes
behavior, but I thought I'd commit the non-invasive cleanups first
which will simplify the upcoming work.]

irange::set was doing more work than it should for legacy ranges.
I cleaned up various unnecessary calls to swap_out_of_order_endpoints,
as well as some duplicate code that could be done with normalize_min_max.

I also removed an obsolete comment wrt sticky infinite overflows.
Not only did the -INF/+INF(OVF) code get removed in 2017,
but normalize_min_max() uses wide ints, which ignore overflows
altogether.

gcc/ChangeLog:

* value-range.cc (irange::swap_out_of_order_endpoints): Rewrite
into static function.
(irange::set): Cleanup redundant manipulations.
* value-range.h (irange::normalize_min_max): Modify object
in-place instead of modifying arguments.

aarch64: Do not alter force_reg returned register expanding fcmla

2020-11-06 Andrea Corallo <andrea.corallo@arm.com>

* config/aarch64/aarch64-builtins.c
(aarch64_expand_fcmla_builtin): Do not alter force_reg returned
register.

libstdc++: Use 'inline' consistently in std::exception_ptr [PR 97729]

With PR c++/67453 fixed we can rely on the 'used' attribute to emit
inline constructors and destructors in libsupc++/eh_ptr.cc. This means
we don't need to suppress the 'inline' keyword on them in that file, and
don't need to force 'always_inline' on them in other files.

libstdc++-v3/ChangeLog:

PR libstdc++/97729
* libsupc++/exception_ptr.h (exception_ptr::exception_ptr())
(exception_ptr::exception_ptr(const exception_ptr&))
(exception_ptr::~exception_ptr()): Remove 'always_inline'
attributes. Use 'inline' unconditionally.

libstdc++: Include <typeinfo> even for -fno-rtti [PR 97758]

The std::function code now uses std::type_info* even when RTTI is
disabled, so it should include <typeinfo> unconditionally. Without this,
Clang can't compile <functional> with -fno-rtti (it works with GCC
because std::type_info gets declared automatically by the compiler).

libstdc++-v3/ChangeLog:

PR libstdc++/97758
* include/bits/std_function.h [!__cpp_rtti]: Include <typeinfo>.

config-ml.in: Suppress output from multi-do recipes

The FIXME comments saying "Leave out until this is tested a bit more"
are from 1997. I think they've been sufficiently tested.

ChangeLog:

* config-ml.in (multi-do, multi-clean): Add @ to silence recipes.
Remove FIXME comments.

tree-optimization/97753 - fix SLP induction vect

This fixes updating of the step vectors when filling up to group_size.

2020-11-09 Richard Biener <rguenther@suse.de>

PR tree-optimization/97753
* tree-vect-loop.c (vectorizable_induction): Fill vec_steps
when CSEing inside the group.

* gcc.dg/vect/pr97753.c: New testcase.

tree-optimization/97746 - fix order of mask precision computes

This fixes the order of walking PHIs and stmts for BB mask
precision compute.

2020-11-09 Richard Biener <rguenther@suse.de>

PR tree-optimization/97746
* tree-vect-patterns.c (vect_determine_precisions): First walk PHIs.

* gcc.dg/vect/bb-slp-pr97746.c: New testcase.

c++: ADL refactor

This refactors the ADL lookup. It just so happens the refactoring
makes dropping modules in simpler :) We break apart the namespace and
class fn processing, and move scope iteration to an outer function.
It'll also become possible to find the same enum in multiple places, so
we need to handle that idempotently.

gcc/cp/
* cp-tree.h (LOOKUP_FOUND_P): Add ENUMERAL_TYPE.
* name-lookup.c (class name_lookup): Add comments.
(name_lookup::adl_namespace_only): Replace with ...
(name_lookup::adl_class_fns): ... this and ...
(name_lookup::adl_namespace_fns): ... this.
(name_lookup::adl_namespace): Deal with inline nests here.
(name_lookup::adl_class): Complete the type here.
(name_lookup::adl_type): Call broken-out enum ...
(name_lookup::adl_enum): New. No need to call the namespace adl
if it is class-scope.
(name_lookup::search_adl): Iterate over collected scopes here.

c++: Consistently expose singleton overloads

This is a patch from my name-lookup overhaul. I noticed the parser
and one path in name-lookup looked through an overload of a single
known decl. It seems more consistent to do that in both paths through
name-lookup, and not in the parser itself.

gcc/cp/
* name-lookup.c (lookup_qualified_name): Expose an overload of a
singleton with known type.
(lookup_name_1): Just check the overload's type to expose it.
* parser.c (cp_parser_lookup_name): Do not do that check here.

CSE VN_INFO calls in PRE and VN

The following CSEs VN_INFO calls which nowadays are hashtable queries.

2020-11-09 Richard Biener <rguenther@suse.de>

* tree-ssa-pre.c (get_representative_for): CSE VN_INFO calls.
(create_expression_by_pieces): Likewise.
(insert_into_preds_of_block): Likewise.
(do_pre_regular_insertion): Likewise.
* tree-ssa-sccvn.c (eliminate_dom_walker::eliminate_insert):
Likewise.
(eliminate_dom_walker::eliminate_stmt): Likewise.

Use a per-edge PRE PHI translation cache

This changes the phi translation cache to be per edge which
pushes it off the profiling radar.  For larger testcases the
combined hashtable causes a load of cache misses and making it
per edge allows the entry to be shrunk further.
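
The data-structure change can be sketched like this. It is an illustration only: ExprId, EdgeId and PerEdgePhiCache are hypothetical, and the real PHI_TRANS_TABLE in tree-ssa-pre.c is a GCC hash_table hung off per-block data rather than a std::unordered_map.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical ids standing in for PRE expressions and CFG edges.
using ExprId = unsigned;
using EdgeId = unsigned;

// One small table per edge: the key is just the expression, so entries no
// longer need to record which predecessor they belong to.
class PerEdgePhiCache {
public:
  explicit PerEdgePhiCache(unsigned n_edges) : tables_(n_edges) {}

  void add(EdgeId e, ExprId from, ExprId to) { tables_[e][from] = to; }

  // Returns true and sets OUT when a translation of FROM is cached on edge E.
  bool lookup(EdgeId e, ExprId from, ExprId &out) const {
    const auto &t = tables_[e];
    auto it = t.find(from);
    if (it == t.end())
      return false;
    out = it->second;
    return true;
  }

private:
  std::vector<std::unordered_map<ExprId, ExprId>> tables_;
};
```

Compared with one global table keyed on (expression, predecessor), each per-edge table stays small and its entries drop the predecessor field.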

2020-11-09  Richard Biener  <rguenther@suse.de>

PR tree-optimization/97765
* tree-ssa-pre.c (bb_bitmap_sets::phi_translate_table): Add.
(PHI_TRANS_TABLE): New macro.
(phi_translate_table): Remove.
(expr_pred_trans_d::pred): Remove.
(expr_pred_trans_d::hash): Simplify.
(expr_pred_trans_d::equal): Likewise.
(phi_trans_add): Adjust.
(phi_translate): Likewise.  Remove hash-table expansion
detection and optimization.
(phi_translate_set): Allocate PHI_TRANS_TABLE here.
(init_pre): Adjust.
(fini_pre): Free PHI_TRANS_TABLE.

arm: [testcase] Better narrow some bfloat16 testcase

2020-11-05 Andrea Corallo <andrea.corallo@arm.com>

* gcc.target/arm/simd/vld1_lane_bf16_1.c: Require target to
support and add -mfloat-abi=hard flag.
* gcc.target/arm/simd/vld1_lane_bf16_indices_1.c: Likewise.
* gcc.target/arm/simd/vld1q_lane_bf16_indices_1.c: Likewise.
* gcc.target/arm/simd/vst1_lane_bf16_1.c: Likewise.
* gcc.target/arm/simd/vst1_lane_bf16_indices_1.c: Likewise.
* gcc.target/arm/simd/vstq1_lane_bf16_indices_1.c: Likewise.

Enable MOVDIRI, MOVDIR64B, CLDEMOTE and WAITPKG for march=tremont

1. Enable MOVDIRI, MOVDIR64B, CLDEMOTE and WAITPKG for march=tremont
2. Move PREFETCHW from march=broadwell to march=silvermont.
3. Add PREFETCHWT1 to march=knl

gcc/ChangeLog:

2020-11-09 Lili Cui <lili.cui@intel.com>

PR target/97685
* config/i386/i386.h:
(PTA_BROADWELL): Delete PTA_PRFCHW.
(PTA_SILVERMONT): Add PTA_PRFCHW.
(PTA_KNL): Add PTA_PREFETCHWT1.
(PTA_TREMONT): Add PTA_MOVDIRI, PTA_MOVDIR64B, PTA_CLDEMOTE and PTA_WAITPKG.
* doc/invoke.texi: Delete PREFETCHW for broadwell, skylake, knl, knm,
skylake-avx512, cannonlake, icelake-client, icelake-server, cascadelake,
cooperlake, tigerlake and sapphirerapids.
Add PREFETCHW for silvermont, goldmont, goldmont-plus and tremont.
Add XSAVEC and XSAVES for goldmont, goldmont-plus and tremont.
Add MOVDIRI, MOVDIR64B, CLDEMOTE and WAITPKG for tremont.
Add KEYLOCKER and HREST for alderlake.
Add AMX-BF16, AMX-TILE, AMX-INT8 and UINTR for sapphirerapids.
Add KEYLOCKER for tigerlake.

libiberty/pex-win32.c: Initialize orig_err

Initializing orig_err avoids a warning: "may be used uninitialized".
See 97108.

2020-09-14 Torbjörn SVENSSON <torbjorn.svensson@st.com>
Christophe Lyon <christophe.lyon@linaro.org>

libiberty/
* pex-win32.c (pex_win32_exec_child): Initialize orig_err.

ira: Recompute regstat as max_regno changes [PR97705]

As PR97705 shows, commit r11-4637 caused dump comparison
difference errors in the ira pass. It exposed an issue with
the newly introduced function remove_scratches, which can
increase the largest pseudo reg number when it succeeds.
Later functions call max_reg_num() to get the latest
max_regno, and when iterating over the register numbers
they can access data structures that were allocated for
the previous max_regno. This causes out-of-bounds array
accesses, and the failure can be random since the values
beyond the array are unpredictable.

This patch frees, reinitializes and recomputes the relevant data
structures, namely regstat_n_sets_and_refs and reg_info_p,
to ensure we won't access beyond the array bounds.
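
The hazard and the fix can be modelled in a few lines. Everything here is hypothetical (RegStats, add_scratch_pseudos, run_step); it only mirrors the idea of resizing per-register tables as soon as the register count grows, not the actual regstat code in ira.c.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-register statistics table sized by the register count.
struct RegStats {
  std::vector<int> n_refs;
  void recompute(int max_regno) {
    n_refs.assign(static_cast<std::size_t>(max_regno), 0);
  }
};

// Hypothetical pass step that may create new pseudo registers.
// Returns the (possibly larger) register count.
int add_scratch_pseudos(int max_regno, int n_new) { return max_regno + n_new; }

// Resize the table whenever the register count grew, before anything
// indexes it with the new, larger register numbers.
int run_step(RegStats &stats, int max_regno, int n_new) {
  int new_max = add_scratch_pseudos(max_regno, n_new);
  if (new_max > max_regno)
    stats.recompute(new_max);
  return new_max;
}
```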

Bootstrapped/regtested on powerpc64le-linux-gnu P9 and
powerpc64-linux-gnu P8.

gcc/ChangeLog:

PR rtl-optimization/97705
* ira.c (ira): Refactor some regstat free/init/compute invocation
into lambda function regstat_recompute_for_max_regno, and call it
when max_regno increases as remove_scratches succeeds.

Daily bump.

Objective-C/C++ : Handle parsing @property 'class' attribute.

This attribute states that a property is one manipulated by class
methods (it requires a static variable and the setter and getter
must be provided explicitly, they cannot be @synthesized).

gcc/c-family/ChangeLog:

* c-common.h (OBJC_IS_PATTR_KEYWORD): Add class to the list
of keywords accepted in @property attribute contexts.
* c-objc.h (enum objc_property_attribute_group): Add
OBJC_PROPATTR_GROUP_CLASS.
(enum objc_property_attribute_kind): Add
OBJC_PROPERTY_ATTR_CLASS.

gcc/cp/ChangeLog:

* parser.c (cp_parser_objc_at_property_declaration): Handle
class keywords in @property attribute context.

gcc/objc/ChangeLog:

* objc-act.c (objc_prop_attr_kind_for_rid): Handle class
attribute.
(objc_add_property_declaration): Likewise.
* objc-act.h (PROPERTY_CLASS): Record class attribute state.

gcc/testsuite/ChangeLog:

* obj-c++.dg/property/at-property-4.mm: Test handling class
attributes.
* objc.dg/property/at-property-4.m: Likewise.

testsuite, Darwin, PPC : Skip zero scratch regs tests.

XFAIL-ing these is not sufficient, unfortunately; we need to
skip them completely.

gcc/testsuite/ChangeLog:

* c-c++-common/zero-scratch-regs-10.c: Skip for powerpc
Darwin.
* c-c++-common/zero-scratch-regs-11.c: Likewise.
* c-c++-common/zero-scratch-regs-8.c: Likewise.
* c-c++-common/zero-scratch-regs-9.c: Likewise.

testsuite, Darwin, X86 : Add target requires native tls to test.

The builtin_thread_pointer test does not work for emulated TLS.
Add a target requires to cover this.

gcc/testsuite/ChangeLog:

* gcc.target/i386/builtin_thread_pointer.c: Require native TLS.

rs6000: Fix bootstrap after r11-4793.

The patch omitted a change for rs6000.c, fixed thus.

gcc/ChangeLog:

* config/rs6000/rs6000.c (rs6000_mangle_decl_assembler_name): Change
DECL_IS_BUILTIN -> DECL_IS_UNDECLARED_BUILTIN.

Daily bump.

testsuite: Fix Wimplicit-fallthrough-20.c.

The r11-4813 patch removed "ignored" from the dg-warnings in this test,
causing this test to fail when compiled as C++.

gcc/testsuite/ChangeLog:

* c-c++-common/Wimplicit-fallthrough-20.c: Adjust dg-warning.

libcpp: Update cpp_wcwidth() to Unicode 13.0.0

generated_cpp_wcwidth.h was regenerated using Unicode 13.0.0 data files. No
material changes to the parsing scripts (either GCC- or glibc-sourced) were
necessary; glibc's utf8_gen.py was tweaked slightly by glibc and matched here.

contrib/ChangeLog:

* unicode/EastAsianWidth.txt: Update to Unicode 13.0.0.
* unicode/PropList.txt: Likewise.
* unicode/README: Likewise.
* unicode/UnicodeData.txt: Likewise.
* unicode/from_glibc/unicode_utils.py: Update to latest glibc version.
* unicode/from_glibc/utf8_gen.py: Likewise.

libcpp/ChangeLog:

* generated_cpp_wcwidth.h: Regenerated from Unicode 13.0.0 data.

Objective-C/C++ (C-family) : Add missing 'atomic' property attribute.

This is the default, but it is still legal in user code and therefore
we should handle it in parsing. Fix whitespace issues in the lines
affected.

gcc/c-family/ChangeLog:

* c-common.c (c_common_reswords): Add 'atomic' property
attribute.
* c-common.h (enum rid): Add RID_PROPATOMIC for atomic
property attributes.

gcc/objc/ChangeLog:

* objc-act.c (objc_prop_attr_kind_for_rid): Handle
RID_PROPATOMIC.

gcc/testsuite/ChangeLog:

* obj-c++.dg/property/at-property-4.mm: Test atomic property
attribute.
* objc.dg/property/at-property-4.m: Likewise.

Objective-C : Implement NSObject attribute.

This attribute allows pointers to be marked as pointers to
an NSObject-compatible object. This allows for additional
checking of assignment etc. when referring to pointers to
opaque types.

gcc/c-family/ChangeLog:

* c-attribs.c (handle_nsobject_attribute): New.
* c.opt: Add WNSObject-attribute.

gcc/objc/ChangeLog:

* objc-act.c (objc_compare_types): Handle NSObject type
attributes.
(objc_type_valid_for_messaging): Likewise.

gcc/testsuite/ChangeLog:

* obj-c++.dg/attributes/nsobject-01.mm: New test.
* objc.dg/attributes/nsobject-01.m: New test.

Fix Ada build failure for the SuSE PowerPC64/Linux compiler

gcc/ada/ChangeLog:
* gcc-interface/Makefile.in: Force target_cpu to powerpc if the
nominal target is powerpc64-suse-linux.

testsuite, Darwin, PPC : XFAIL zero-scratch-regs tests.

These tests fail because of an unimplemented 'sorry'; there
is no plan to implement this in the short term, so XFAIL
the tests to reduce noise.

gcc/testsuite/ChangeLog:

* c-c++-common/zero-scratch-regs-10.c: XFAIL for
powerpc-darwin.
* c-c++-common/zero-scratch-regs-11.c: Likewise.
* c-c++-common/zero-scratch-regs-8.c: Likewise.
* c-c++-common/zero-scratch-regs-9.c: Likewise.

Ada : Fix bootstrap after r11-4793.

The patch omitted a change for Ada, fixed thus.

gcc/ada/ChangeLog:

* gcc-interface/misc.c (gnat_printable_name): Change
DECL_IS_BUILTIN -> DECL_IS_UNDECLARED_BUILTIN.

C Parser: Implement mixing of labels and code.

Implement mixing of labels and code as adopted for C2X
and process some std-attributes on labels.

2020-11-06 Martin Uecker <muecker@gwdg.de>

gcc/
* doc/extend.texi: Document mixing labels and code.
* doc/invoke.texi: Likewise.

gcc/c/
* c-parser.c (c_parser_label): Implement mixing of labels and code.
(c_parser_all_labels): Likewise.

gcc/testsuite/
* c-c++-common/attr-fallthrough-2.c: Update compiler flags.
* c-c++-common/Wimplicit-fallthrough-20.c: Adapt test.
* gcc.dg/20031223-1.c: Update compiler flags and adapt test.
* gcc.dg/c11-labels-1.c: New test.
* gcc.dg/c11-labels-2.c: New test.
* gcc.dg/c11-labels-3.c: New test.
* gcc.dg/c2x-attr-syntax-3.c: Adapt test.
* gcc.dg/c2x-labels-1.c: New test.
* gcc.dg/c2x-labels-2.c: New test.
* gcc.dg/c2x-labels-3.c: New test.
* gcc.dg/decl-9.c: Update compiler flags and add error.
* gcc.dg/gomp/barrier-2.c: Update compiler flags and add warning.
* gcc.dg/gomp/declare-simd-5.c: Update compiler flags and adapt test.
* gcc.dg/gomp/declare-variant-2.c: Update compiler flags and add error.
* gcc.dg/label-compound-stmt-1.c: Update compiler flags.
* gcc.dg/parse-decl-after-label.c: Update compiler flags.

libsupc++: Make the destructor parameter to `__cxa_thread_atexit()` use the `__thiscall` calling convention for i686-w64-mingw32

The mingw-w64 implementations of `__cxa_thread_atexit()` and `__cxa_atexit()` have been
using `__thiscall` since two years ago. Using the default calling convention (which is
`__cdecl`) causes crashes as explained in PR83562.

Calling conventions have no effect on x86_64-w64-mingw32.

Reference: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83562
Reference: https://sourceforge.net/p/mingw-w64/mingw-w64/ci/master/tree/mingw-w64-crt/crt/cxa_thread_atexit.c
Reference: https://sourceforge.net/p/mingw-w64/mingw-w64/ci/f3e0fbb40cbc9f8821db8bd8a0c4dae8ff671e9f/
Reference: https://github.com/msys2/MINGW-packages/issues/7071
Signed-off-by: Liu Hao <lh_mouse@126.com>
2020-10-08 Liu Hao <lh_mouse@126.com>

libstdc++-v3:
* libsupc++/cxxabi.h: (__cxa_atexit): mark with _GLIBCXX_CDTOR_CALLABI
(__cxa_thread_atexit): ditto
* libsupc++/atexit_thread.cc: (__cxa_atexit): mark with
_GLIBCXX_CDTOR_CALLABI
(__cxa_thread_atexit): ditto
(elt): ditto

Daily bump.

rs6000: Don't use operands[] for temporaries in define_expand

In ac001f5ce604 Alan fixed my thinko using operands that do not refer
to anything mentioned in the RTL pattern.  Instead, it just uses fresh
new local rtxes for those.

This patch takes that a tiny bit further: it uses local rtx for all
temporaries used in the expanders.  As a bonus that simplifies the code
a tiny bit as well.

2020-11-06  Segher Boessenkool  <segher@kernel.crashing.org>

* config/rs6000/rs6000.md (@tablejump<mode>_normal): Don't abuse
operands[].
(@tablejump<mode>_nospec): Ditto.

MAINTAINERS: Update my email address.

2020-11-07 Martin Uecker <muecker@gwdg.de>

* MAINTAINERS: Update my email address.

rs6000: Use the correct minimized testcase

Use the correct minimized test case source rather than the large test
source.

gcc/testsuite/
* gcc.target/powerpc/pr64505.c: Run everywhere. Use correct minimized
test case.

rs6000: Fix default alignment ABI break caused by MMA base support

As part of the MMA base support, we incremented BIGGEST_ALIGNMENT in
order to align the __vector_pair and __vector_quad types to 256 and 512
bits respectively. This had the unintended effect of changing the
default alignment used by __attribute__ ((__aligned__)) which causes
an ABI break because of some dodgy code in GLIBC's struct pthread.
The fix is to revert the BIGGEST_ALIGNMENT change and to force the
alignment on the type itself rather than the mode used by the type.
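
The difference between raising a target-wide maximum and aligning individual types can be illustrated in portable C++. PairLike, QuadLike and Plain are hypothetical stand-ins with illustrative alignment values; the actual fix sets the alignment on the built-in types inside rs6000_init_builtins.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical over-aligned stand-ins for opaque vector types. The alignment
// is attached to each type, not to a target-wide maximum.
struct alignas(32) PairLike { unsigned char bytes[32]; };
struct alignas(64) QuadLike { unsigned char bytes[64]; };

// An ordinary struct keeps its natural alignment, unaffected by the above.
struct Plain { long value; };
```

Because only the two special types carry the larger alignment, anything that relies on the default alignment keeps its previous layout.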

2020-11-06 Peter Bergner <bergner@linux.ibm.com>

gcc/
* config/rs6000/rs6000.h (BIGGEST_ALIGNMENT): Revert previous commit
so as not to break the ABI.
* config/rs6000/rs6000-call.c (rs6000_init_builtins): Set the ABI
mandated alignment for __vector_pair and __vector_quad types.

gcc/testsuite/
* gcc.target/powerpc/mma-alignment.c: New test.

Fix stack pointer handling in ms_hook_prologue functions for i386 target.

gcc/
PR target/91489
* config/i386/i386.md (simple_return): Also check
for ms_hook_prologue function attribute.
* config/i386/i386.c (ix86_can_use_return_insn_p):
Also check for ms_hook_prologue function attribute.
* config/i386/i386-protos.h (ix86_function_ms_hook_prologue): Declare.

gcc/testsuite
PR target/91489
* gcc.target/i386/ms_hook_prologue.c: Expand testcase
to reproduce PR target/91489 issue.

rs6000: Fix TARGET_POWERPC64 vs. TARGET_64BIT confusion

I gave Ke Wen bad advice, luckily David corrected me: it is true that we
cannot use TARGET_POWERPC64 on many 32-bit OSes, since either the kernel
or userland does not save the top half of the 64-bit integer registers,
but we do not have to care about that in separate patterns or related
code. The flag is automatically not enabled by default on targets that
do not handle this correctly.

This patch fixes it.

Segher

2020-11-06 Segher Boessenkool <segher@kernel.crashing.org>

PR target/96933
* config/rs6000/rs6000.c (rs6000_expand_vector_init): Use
TARGET_POWERPC64 instead of TARGET_64BIT.

builtins: Add DFP signaling NaN built-in functions

Add built-in functions __builtin_nansd32, __builtin_nansd64 and
__builtin_nansd128 to return signaling NaNs of decimal floating-point
types, analogous to the functions already present for binary
floating-point types.

This patch, independent of
<https://gcc.gnu.org/pipermail/gcc-patches/2020-October/557136.html>
(pending review), is in preparation for adding the <float.h> macros
for such signaling NaNs that are in C2x, analogous to the macros for
other types that are in that patch.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.  Also ran
the new tests for powerpc64le-linux-gnu to confirm they do work in the
case (hardware DFP) where floating-point exceptions are supported for
DFP.

gcc/
2020-11-06  Joseph Myers  <joseph@codesourcery.com>

* builtins.def (BUILT_IN_NANSD32, BUILT_IN_NANSD64)
(BUILT_IN_NANSD128): New built-in functions.
* fold-const-call.c (fold_const_call): Handle the new built-in
functions.
* doc/extend.texi (__builtin_nansd32, __builtin_nansd64)
(__builtin_nansd128): Document.
* doc/sourcebuild.texi (Effective-Target Keywords): Document
fenv_exceptions_dfp.

gcc/testsuite/
2020-11-06  Joseph Myers  <joseph@codesourcery.com>

* lib/target-supports.exp
(check_effective_target_fenv_exceptions_dfp): New.
* gcc.dg/dfp/builtin-snan-1.c, gcc.dg/dfp/builtin-snan-2.c: New
tests.

c++: Small tweak to can_convert_eh [PR81660]

While messing with check_handlers_1, I spotted this bug report which
complains that we don't warn about the case when we have two duplicated
handlers of type int.  can_convert_eh implements [except.handle] and
that says: A handler is a match for an exception object of type E if
- The handler is of type cv T or cv T& and E and T are the same type
   (ignoring the top-level cv-qualifiers), or [...]

but we don't implement this bullet properly for non-class types.  The
fix therefore seems pretty obvious.  Also change the return type to
bool when we're only returning yes/no.
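
The quoted bullet can be expressed as a compile-time check. This sketch covers only that first bullet of [except.handle], with a hypothetical helper name; it is not how can_convert_eh is written.

```cpp
#include <cassert>
#include <type_traits>

// First bullet of the rule only: a handler of type "cv T" or "cv T&" matches
// an exception object of type E when T and E are the same type, ignoring
// top-level cv-qualifiers.
template <typename Handler, typename E>
constexpr bool same_type_match() {
  using T = std::remove_cv_t<std::remove_reference_t<Handler>>;
  return std::is_same<T, std::remove_cv_t<E>>::value;
}
```

Under this rule two consecutive handlers of type int both match a thrown int, so the second one can never be reached, which is the situation the new warning reports.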

gcc/cp/ChangeLog:

PR c++/81660
* except.c (can_convert_eh): Change the return type to bool.  If
the type TO and FROM are the same, return true.

gcc/testsuite/ChangeLog:

PR c++/81660
* g++.dg/warn/Wexceptions3.C: New test.
* g++.dg/eh/pr42859.C: Add dg-warning.
* g++.dg/torture/pr81659.C: Likewise.

Improve uninitialized warning with value range info

Function use_pred_not_overlap_with_undef_path_pred of
pass_late_warn_uninitialized checks whether the predicate of a variable
use overlaps with the predicate of an undefined control flow path.
For now, it only handles an ssa_var compared against a constant. This can
be improved to handle an ssa_var compared against another ssa_var with
value range info, as described in the comment:

+         /* Check value range info of rhs, do following transforms:
+              flag_var < [min, max]  ->  flag_var < max
+              flag_var > [min, max]  ->  flag_var > min
+
+            We can also transform LE_EXPR/GE_EXPR to LT_EXPR/GT_EXPR:
+              flag_var <= [min, max] ->  flag_var < [min, max+1]
+              flag_var >= [min, max] ->  flag_var > [min-1, max]
+            if no overflow/wrap.  */
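
The transforms in the comment above can be sketched as a small standalone
function.  This is only an illustration under assumptions: the types Cmp,
Range and ConstCmp and the function reduce_to_const_cmp are hypothetical and
do not appear in tree-ssa-uninit.c, and plain long stands in for the
compiler's wide integers.

```cpp
#include <cassert>
#include <climits>

enum class Cmp { LT, GT, LE, GE };

struct Range { long min; long max; };      // inclusive bounds known for the rhs
struct ConstCmp { Cmp op; long bound; };   // represents "flag_var <op> bound"

// Reduce "flag_var <op> rhs", where rhs is only known to lie in R, to a
// comparison against one constant.  Returns false when the LE/GE rewrite
// would overflow, in which case no transform is done.
bool reduce_to_const_cmp(Cmp op, Range r, ConstCmp *out)
{
  assert(r.min <= r.max);
  switch (op)
    {
    case Cmp::LE:
      // flag_var <= [min, max] -> flag_var < [min, max+1] -> flag_var < max+1
      if (r.max == LONG_MAX)
        return false;
      *out = ConstCmp{Cmp::LT, r.max + 1};
      return true;
    case Cmp::GE:
      // flag_var >= [min, max] -> flag_var > [min-1, max] -> flag_var > min-1
      if (r.min == LONG_MIN)
        return false;
      *out = ConstCmp{Cmp::GT, r.min - 1};
      return true;
    case Cmp::LT:
      // flag_var < [min, max] -> flag_var < max
      *out = ConstCmp{Cmp::LT, r.max};
      return true;
    case Cmp::GT:
      // flag_var > [min, max] -> flag_var > min
      *out = ConstCmp{Cmp::GT, r.min};
      return true;
    }
  return false;
}
```

For example, with rhs known to be in [3, 9], "flag_var >= rhs" becomes
"flag_var > 2", which is the form the existing constant check can handle.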

gcc/

* tree-ssa-uninit.c (find_var_cmp_const): New function.
(use_pred_not_overlap_with_undef_path_pred): Call above.

libstdc++: Fix symbol version conflict in linker script

The change in r11-4748-50b840ac5e1d6534e345c3fee9a97ae45ced6bc7 causes
a build error on Solaris, due to the new explicit instantiation matching
patterns for two different symbol versions.

libstdc++-v3/ChangeLog:

* config/abi/pre/gnu.ver (GLIBCXX_3.4.21): Tighten up patterns
for basic_stringbuf that refer to __xfer_bufptrs.

Objective-C/C++ : Allow visibility prefix attributes on interfaces.

This passes visibility through without warning (so that, for example,
__attribute__((__visibility__("default"))) does not result in any
diagnostic).

gcc/objc/ChangeLog:

* objc-act.c (start_class): Accept visibility attributes
without warning.

Objective-C/C++ (parsers) : Update @property attribute parsing.

At present, we are missing parsing and checking for around
half of the property attributes in use.  The existing ad hoc scheme
for the parser's communication with the Objective C validation
is not suitable for extending to cover all the missing cases.

Additionally:

1/ We were declaring errors in two cases where the reference
   implementation warns (or is silent).

   I've elected to warn for both those cases (Wattributes); it
   could be that we should implement Wobjc-xxx-property warning
   masks (TODO).

2/ We were emitting spurious complaints about missing property
   attributes when these were not being parsed because we gave
   up on the first syntax error.

3/ The quality of the diagnostic locations was poor (that's
   true for much of Objective-C, we will have to improve it as
   we modernise areas).

We continue to attempt to keep the code, warning and error output
similar (preferably identical output) between the C and C++ front
ends.

The interface to the Objective-C-specific parts of the parsing is
simplified to a vector of parsed (but not fully-checked) property
attributes; this will simplify the addition of new attributes.

gcc/c-family/ChangeLog:

* c-objc.h (enum objc_property_attribute_group): New.
(enum objc_property_attribute_kind): New.
(OBJC_PROPATTR_GROUP_MASK): New.
(struct property_attribute_info): Small class encapsulating
parser output from property attributes.
(objc_prop_attr_kind_for_rid): New.
(objc_add_property_declaration): Simplify interface.
* stub-objc.c (enum rid): Dummy type.
(objc_add_property_declaration): Simplify interface.
(objc_prop_attr_kind_for_rid): New.

gcc/c/ChangeLog:

* c-parser.c (c_parser_objc_at_property_declaration):
Improve parsing fidelity. Associate better location info
with @property attributes.  Clean up the interface to
objc_add_property_declaration ().

gcc/cp/ChangeLog:

* parser.c (cp_parser_objc_at_property_declaration):
Improve parsing fidelity. Associate better location info
with @property attributes.  Clean up the interface to
objc_add_property_declaration ().

gcc/objc/ChangeLog:

* objc-act.c (objc_prop_attr_kind_for_rid): New.
(objc_add_property_declaration): Adjust to consume the
parser output using a vector of parsed attributes.

gcc/testsuite/ChangeLog:

* obj-c++.dg/property/at-property-1.mm: Adjust expected
diagnostics.
* obj-c++.dg/property/at-property-29.mm: Likewise.
* obj-c++.dg/property/at-property-4.mm: Likewise.
* obj-c++.dg/property/property-neg-2.mm: Likewise.
* objc.dg/property/at-property-1.m: Likewise.
* objc.dg/property/at-property-29.m: Likewise.
* objc.dg/property/at-property-4.m: Likewise.
* objc.dg/property/at-property-5.m: Likewise.
* objc.dg/property/property-neg-2.m: Likewise.

c++: Propagate attributes to clones in duplicate_decls [PR67453]

On the following testcase where the cdtor attributes aren't on the
in-class declaration but on an out-of-class definition, the cdtors
have their clones created from the in-class declaration, and later on
duplicate_decls updates attributes on the abstract cdtors, but nothing
propagates them to the clones.

2020-11-06 Jakub Jelinek <jakub@redhat.com>

PR c++/67453
* decl.c (duplicate_decls): Propagate DECL_ATTRIBUTES and
DECL_PRESERVE_P from olddecl to its clones if any.

* g++.dg/ext/attr-used-2.C: New test.

Darwin: Darwin 20 is to be macOS 11 (Big Sur).

As per Nigel Tufnel's assertion "... this one goes to 11".

The various parts of the code that deal with mapping Darwin versions
to macOS (X) versions need updating to deal with a major version of
11.

So now we have, for example:

Darwin  4 => macOS (X) 10.0
...
Darwin 14 => macOS (X) 10.10
...
Darwin 19 => macOS (X) 10.15

Darwin 20 => macOS  11.0

Because of the historical duplication of the "10" in macOSX 10.xx and
the number of tools that expect this, it is likely that system tools will
allow macos11.0 and/or macosx11.0 (even though the latter makes little
sense).

Update the link test to cover Catalina (Darwin19/10.15) and
Big Sur (Darwin20/11.0).
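
The mapping listed above can be sketched as follows.  This is not the code
in darwin-c.c or darwin-driver.c; the function name is hypothetical, and the
behaviour for Darwin versions after 20 is an assumption, since the text only
states the mapping up to Darwin 20.

```cpp
#include <cassert>
#include <string>

// Map a Darwin kernel major version to the corresponding macOS version
// string, following the table in the commit message.
std::string macos_version_for_darwin(int darwin_major)
{
  assert(darwin_major >= 4);   // Darwin 4 is the first entry (10.0)
  if (darwin_major < 20)
    return "10." + std::to_string(darwin_major - 4);
  // Darwin 20 => 11.0.  Assumption: later Darwin majors keep bumping the
  // macOS major version by one.
  return std::to_string(darwin_major - 9) + ".0";
}
```

With this sketch, Darwin 14 maps to "10.10", Darwin 19 to "10.15" and
Darwin 20 to "11.0".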

gcc/ChangeLog:

* config/darwin-c.c: Allow for Darwin20 to correspond to macOS 11.
* config/darwin-driver.c: Likewise.

gcc/testsuite/ChangeLog:

* gcc.dg/darwin-minversion-link.c: Allow for Darwin19 (macOS 10.15)
and Darwin20 (macOS 11.0).

rework PRE PHI translation cache

Turns out its size and time requirements can be stripped down
dramatically.

2020-11-06  Richard Biener  <rguenther@suse.de>

* tree-ssa-pre.c (expr_pred_trans_d): Modify so elements
are embedded rather than allocated.  Remove hashval member,
make all members integers.
(phi_trans_add): Adjust accordingly.
(phi_translate): Likewise.  Deal with re-allocation
of the table.

Combine new calculated ranges with existing range.

When a range is recalculated, retain what was previously known, as IL changes
can produce different results from un-executed code.  This also paves
the way for external injection of ranges.
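
As a rough sketch of the idea, assuming a range is a single closed interval
(real ranger ranges are more general): the recalculated range is intersected
with what was already known, so the stored result can only get narrower.
The Interval type and both function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

struct Interval
{
  long lo;
  long hi;
  bool empty() const { return lo > hi; }
};

// Intersection of two closed intervals; the result is empty when lo > hi.
Interval intersect(Interval a, Interval b)
{
  return Interval{std::max(a.lo, b.lo), std::min(a.hi, b.hi)};
}

// Hypothetical update step: retain what was known and narrow it with the
// newly calculated range, instead of overwriting the old value.
Interval update_global_range(Interval known, Interval recalculated)
{
  assert(!known.empty());   // assumption of this sketch: cached range is valid
  return intersect(known, recalculated);
}
```

For example, if [0, 100] was known and a recalculation yields [50, 200],
the stored range becomes [50, 100].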

gcc/
PR tree-optimization/97737
PR tree-optimization/97741
* gimple-range.cc (gimple_ranger::range_of_stmt): Intersect newly
calculated ranges with the existing known global range.
gcc/testsuite/
* gcc.dg/pr97737.c: New.
* gcc.dg/pr97741.c: New.

Add PC as control register

gcc/
* config/rx/rx.md (CTRLREG_PC): Add.
* config/rx/rx.c (CTRLREG_PC): Add.
(rx_expand_builtin_mvtc): Add warning: PC register cannot
be used as dest.

core: Rename DECL_IS_BUILTIN -> DECL_IS_UNDECLARED_BUILTIN

In cleaning up C++'s handling of hidden decls, I renamed its
DECL_BUILTIN_P, which checks for loc == BUILTINS_LOCATION to
DECL_UNDECLARED_BUILTIN_P, because the location gets updated, if user
source declares the builtin, and the predicate no longer holds.  The
original name was confusing me.  (The builtin may still retain builtin
properties in the redeclaration, and other predicates can still detect
that.)

I discovered that tree.h had its own variant 'DECL_IS_BUILTIN', which
behaves in (almost) the same manner.  And therefore has the same
mutating behaviour.

This patch deletes the C++ one, and renames tree.h's to
DECL_IS_UNDECLARED_BUILTIN, to emphasize its non-constantness.  I
guess _IS_ wins over _P.

gcc/
* tree.h (DECL_IS_BUILTIN): Rename to ...
(DECL_IS_UNDECLARED_BUILTIN): ... here.  No need to use SOURCE_LOCUS.
* calls.c (maybe_warn_alloc_args_overflow): Adjust for rename.
* cfgexpand.c (pass_expand::execute): Likewise.
* dwarf2out.c (base_type_die, is_naming_typedef_decl): Likewise.
* godump.c (go_decl, go_type_decl): Likewise.
* print-tree.c (print_decl_identifier): Likewise.
* tree-pretty-print.c (dump_generic_node): Likewise.
* tree-ssa-ccp.c (pass_post_ipa_warn::execute): Likewise.
* xcoffout.c (xcoff_assign_fundamental_type_number): Likewise.
gcc/c-family/
* c-ada-spec.c (collect_ada_nodes): Rename
DECL_IS_BUILTIN->DECL_IS_UNDECLARED_BUILTIN.
(collect_ada_node): Likewise.
(dump_forward_type): Likewise.
* c-common.c (set_underlying_type): Rename
DECL_IS_BUILTIN->DECL_IS_UNDECLARED_BUILTIN.
(user_facing_original_type, c_common_finalize_early_debug): Likewise.
gcc/c/
* c-decl.c (diagnose_mismatched_decls): Rename
DECL_IS_BUILTIN->DECL_IS_UNDECLARED_BUILTIN.
(warn_if_shadowing, implicitly_declare, names_builtin_p)
(collect_source_refs): Likewise.
* c-typeck.c (inform_declaration, inform_for_arg)
(convert_for_assignment): Likewise.
gcc/cp/
* cp-tree.h (DECL_UNDECLARED_BUILTIN_P): Delete.
* cp-objcp-common.c (names_builtin_p): Rename
DECL_IS_BUILTIN->DECL_IS_UNDECLARED_BUILTIN.
* decl.c (decls_match): Likewise.  Replace
DECL_UNDECLARED_BUILTIN_P with DECL_IS_UNDECLARED_BUILTIN.
(duplicate_decls): Likewise.
* decl2.c (collect_source_refs): Likewise.
* name-lookup.c (anticipated_builtin_p, print_binding_level)
(do_nonmember_using_decl): Likewise.
* pt.c (builtin_pack_fn_p): Likewise.
* typeck.c (error_args_num): Likewise.
gcc/lto/
* lto-symtab.c (lto_symtab_merge_decls_1): Rename
DECL_IS_BUILTIN->DECL_IS_UNDECLARED_BUILTIN.
gcc/go/
* go-gcc.cc (Gcc_backend::call_expression): Rename
DECL_IS_BUILTIN->DECL_IS_UNDECLARED_BUILTIN.
libcc1/
* libcc1plugin.cc (address_rewriter): Rename
DECL_IS_BUILTIN->DECL_IS_UNDECLARED_BUILTIN.
* libcp1plugin.cc (supplement_binding): Likewise.

aarch64: Use intrinsics for upper saturating shift right

The use of vqshrn_high_n_s32 was triggering an unneeded register move, because
sqshrn2 is destructive but was declared as inline assembly in arm_neon.h. This
patch implements sqshrn2 and uqshrn2 as actual intrinsics which do not trigger
the unnecessary move, along with new tests to cover them.
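
To show why sqshrn2 is destructive, here is a portable scalar model of what
vqshrn_high_n_s32 computes.  It is only a sketch for illustration, not the
arm_neon.h implementation; the function names are hypothetical, and it
assumes that >> on a negative int32_t is an arithmetic shift, as it is on
the usual targets.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Saturate a 32-bit value to the int16_t range.
static int16_t saturate_s16(int32_t v)
{
  if (v > INT16_MAX)
    return INT16_MAX;
  if (v < INT16_MIN)
    return INT16_MIN;
  return static_cast<int16_t>(v);
}

// Scalar model: the low four lanes of the result come from LOW unchanged
// (which is why the instruction both reads and writes its destination), and
// the high four lanes are A shifted right by N and saturated to 16 bits.
std::array<int16_t, 8>
model_vqshrn_high_n_s32(const std::array<int16_t, 4> &low,
                        const std::array<int32_t, 4> &a, int n)
{
  assert(n >= 1 && n <= 16);   // immediate range for a 32->16 bit narrowing
  std::array<int16_t, 8> out{};
  for (int i = 0; i < 4; ++i)
    {
      out[i] = low[i];
      out[i + 4] = saturate_s16(a[i] >> n);
    }
  return out;
}
```

Because the low half has to survive, the destination register is also an
input, which is the constraint an intrinsic pattern can express to the
register allocator.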

gcc/ChangeLog

2020-11-06 David Candler <david.candler@arm.com>

* config/aarch64/aarch64-builtins.c
(TYPES_SHIFT2IMM): Add define.
(TYPES_SHIFT2IMM_UUSS): Add define.
(TYPES_USHIFT2IMM): Add define.
* config/aarch64/aarch64-simd.md
(aarch64_<sur>q<r>shr<u>n2_n<mode>): Add new insn for upper saturating shift right.
* config/aarch64/aarch64-simd-builtins.def: Add intrinsics.
* config/aarch64/arm_neon.h:
(vqrshrn_high_n_s16): Expand using intrinsic rather than inline asm.
(vqrshrn_high_n_s32): Likewise.
(vqrshrn_high_n_s64): Likewise.
(vqrshrn_high_n_u16): Likewise.
(vqrshrn_high_n_u32): Likewise.
(vqrshrn_high_n_u64): Likewise.
(vqrshrun_high_n_s16): Likewise.
(vqrshrun_high_n_s32): Likewise.
(vqrshrun_high_n_s64): Likewise.
(vqshrn_high_n_s16): Likewise.
(vqshrn_high_n_s32): Likewise.
(vqshrn_high_n_s64): Likewise.
(vqshrn_high_n_u16): Likewise.
(vqshrn_high_n_u32): Likewise.
(vqshrn_high_n_u64): Likewise.
(vqshrun_high_n_s16): Likewise.
(vqshrun_high_n_s32): Likewise.
(vqshrun_high_n_s64): Likewise.

gcc/testsuite/ChangeLog

2020-11-06 David Candler <david.candler@arm.com>

* gcc.target/aarch64/advsimd-intrinsics/vqrshrn_high_n.c: New testcase.
* gcc.target/aarch64/advsimd-intrinsics/vqrshrun_high_n.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vqshrn_high_n.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vqshrun_high_n.c: Likewise.
* gcc.target/aarch64/narrow_high-intrinsics.c: Update expected assembler
for sqshrun2, sqrshrun2, sqshrn2, uqshrn2, sqrshrn2 and uqrshrn2.

libcpp: Provide date routine

Joseph pointed me at cb_get_source_date_epoch, which allows repeatable
builds and solves a FIXME I had on the modules branch.  Unfortunately
it's used exclusively to generate __DATE__ and __TIME__ values, which
fall back to using a time(2) call.  It'd be nicer if the preprocessor
made whatever time value it determined available to the rest of the
compiler.  So this patch adds a new cpp_get_date function, which
abstracts the call to the get_source_date_epoch hook, or uses time
directly.  The value is cached.  Thus the timestamp I end up putting
on CMI files matches __DATE__ and __TIME__ expansions.  That seems
worthwhile.
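
The caching behaviour described above might look roughly like this.  It is a
sketch, not the libcpp code: TimeKind, DateCache and get_date are
hypothetical names, and reading the SOURCE_DATE_EPOCH environment variable
directly stands in for the get_source_date_epoch hook.

```cpp
#include <cassert>
#include <cstdlib>
#include <ctime>

enum class TimeKind { Unknown, Fixed, Dynamic };   // hypothetical names

struct DateCache
{
  TimeKind kind = TimeKind::Unknown;
  std::time_t stamp = 0;
};

// Determine the timestamp once, then keep returning the cached value so all
// users (e.g. __DATE__, __TIME__ and any file stamps) agree with each other.
TimeKind get_date(DateCache &cache, std::time_t *result)
{
  assert(result != nullptr);
  if (cache.kind == TimeKind::Unknown)
    {
      const char *epoch = std::getenv("SOURCE_DATE_EPOCH");
      char *end = nullptr;
      long long value = epoch ? std::strtoll(epoch, &end, 10) : 0;
      if (epoch && end != epoch && *end == '\0' && value >= 0)
        {
          // Repeatable build: a fixed time was supplied from outside.
          cache.stamp = static_cast<std::time_t>(value);
          cache.kind = TimeKind::Fixed;
        }
      else
        {
          // Otherwise fall back to the current time.
          cache.stamp = std::time(nullptr);
          cache.kind = TimeKind::Dynamic;
        }
    }
  *result = cache.stamp;
  return cache.kind;
}
```

The point of the cache is that a second call returns the same value even if
the environment or the clock has changed in between.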

libcpp/
* include/cpplib.h (enum class CPP_time_kind): New.
(cpp_get_date): Declare.
* internal.h (struct cpp_reader): Replace source_date_epoch with
time_stamp and time_stamp_kind.
* init.c (cpp_create_reader): Initialize them.
* macro.c (_cpp_builtin_macro_text): Use cpp_get_date.
(cpp_get_date): Broken out from _cpp_builtin_macro_text and
genericized.

aarch64: Support permutes on unpacked SVE vectors

This patch adds support for permuting unpacked SVE vectors using:

- DUP
- EXT
- REV[BHW]
- REV
- TRN[12]
- UZP[12]
- ZIP[12]

This involves rewriting the REV[BHW] permute code so that the inputs
and outputs of the insn pattern have the same mode as the vectors
being permuted.  This is different from the ACLE form, where the
reversal happens within individual elements rather than within
groups of multiple elements.

The patch does not add a conditional version of REV[BHW].  I'll come
back to that once we have partial-vector comparisons and selects.

The patch is really just enablement, adding an extra tool to the
toolbox.  It doesn't bring any significant vectorisation opportunities
on its own.  However, the patch does have one artificial example that
is now vectorised in a better way than before.

gcc/
* config/aarch64/aarch64-modes.def (VNx2BF, VNx4BF): Adjust nunits
and alignment based on the current VG.
* config/aarch64/iterators.md (SVE_ALL, SVE_24, SVE_2, SVE_4): Add
partial SVE BF modes.
(UNSPEC_REVBHW): New unspec.
(Vetype, Vesize, Vctype, VEL, Vel, vwcore, V_INT_CONTAINER)
(v_int_container, VPRED, vpred): Handle partial SVE BF modes.
(container_bits, Vcwtype): New mode attributes.
* config/aarch64/aarch64-sve.md
(@aarch64_sve_revbhw_<SVE_ALL:mode><PRED_HSD:mode>): New pattern.
(@aarch64_sve_dup_lane<mode>): Extended from SVE_FULL to SVE_ALL.
(@aarch64_sve_rev<mode>, @aarch64_sve_<perm_insn><mode>): Likewise.
(@aarch64_sve_ext<mode>): Likewise.
* config/aarch64/aarch64.c (aarch64_classify_vector_mode): Handle
E_VNx2BFmode and E_VNx4BFmode.
(aarch64_evpc_rev_local): Base the analysis on the container size
instead of the element size.  Use the new aarch64_sve_revbhw
patterns for SVE.
(aarch64_evpc_dup): Handle partial SVE data modes.  Use the
container size instead of the element size when applying the
SVE immediate limit.  Fix a previously incorrect bounds check.
(aarch64_expand_vec_perm_const_1): Handle partial SVE data modes.

gcc/testsuite/
* gcc.target/aarch64/sve/dup_lane_2.c: New test.
* gcc.target/aarch64/sve/dup_lane_3.c: Likewise.
* gcc.target/aarch64/sve/ext_4.c: Likewise.
* gcc.target/aarch64/sve/rev_2.c: Likewise.
* gcc.target/aarch64/sve/revhw_1.c: Likewise.
* gcc.target/aarch64/sve/revhw_2.c: Likewise.
* gcc.target/aarch64/sve/slp_perm_8.c: Likewise.
* gcc.target/aarch64/sve/trn1_2.c: Likewise.
* gcc.target/aarch64/sve/trn2_2.c: Likewise.
* gcc.target/aarch64/sve/uzp1_2.c: Likewise.
* gcc.target/aarch64/sve/uzp2_2.c: Likewise.
* gcc.target/aarch64/sve/zip1_2.c: Likewise.
* gcc.target/aarch64/sve/zip2_2.c: Likewise.

Add -fbit-tests option.

gcc/ChangeLog:

* common.opt: Add new -fbit-tests option.
* doc/invoke.texi: Document the option.
* tree-switch-conversion.c (bit_test_cluster::find_bit_tests):
Use the option.
* tree-switch-conversion.h (is_enabled): New function.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/switch-4.c: New test.

make PRE constant value IDs negative

This separates constant and non-constant value-ids to allow for
a more efficient constant_value_id_p and for more efficient bit-packing
inside the bitmap sets which never contain any constant values.
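
A minimal sketch of the numbering scheme, assuming constant ids count down
from -1 and non-constant ids count up from 1 (the starting points and the
ValueIdAllocator type are assumptions for illustration; only the function
names come from the ChangeLog below):

```cpp
#include <cassert>
#include <climits>

struct ValueIdAllocator   // hypothetical container for the two counters
{
  int next_value_id = 1;             // non-constant ids: 1, 2, 3, ...
  int next_constant_value_id = -1;   // constant ids: -1, -2, -3, ...

  int get_next_value_id()
  {
    assert(next_value_id < INT_MAX);            // guard against overflow
    return next_value_id++;
  }

  int get_next_constant_value_id()
  {
    assert(next_constant_value_id > INT_MIN);   // guard against overflow
    return next_constant_value_id--;
  }
};

// With disjoint sign ranges the test is a single comparison; no side table
// of constant ids needs to be consulted.
inline bool value_id_constant_p(int id)
{
  return id < 0;
}
```

Since the non-constant ids stay dense and positive, a bitmap indexed by
value id wastes no bits on constants.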

There are further optimization opportunities but at this stage
I'll do small refactorings.

2020-11-06  Richard Biener  <rguenther@suse.de>

* tree-ssa-sccvn.h (get_max_constant_value_id): Declare.
(get_next_constant_value_id): Likewise.
(value_id_constant_p): Inline and simplify.
* tree-ssa-sccvn.c (constant_value_ids): Remove.
(next_constant_value_id): Add.
(get_or_alloc_constant_value_id): Adjust.
(value_id_constant_p): Remove definition.
(get_max_constant_value_id): Define.
(get_next_value_id): Add assert for overflow.
(get_next_constant_value_id): Define.
(run_rpo_vn): Adjust.
(free_rpo_vn): Likewise.
(do_rpo_vn): Initialize next_constant_value_id.
* tree-ssa-pre.c (constant_value_expressions): New.
(add_to_value): Split into constant/non-constant value
handling.  Avoid exact re-allocation.
(vn_valnum_from_value_id): Adjust.
(phi_translate_1): Remove spurious exact re-allocation.
(bitmap_find_leader): Adjust.  Make sure we return
a CONSTANT value for a constant value id.
(do_pre_regular_insertion): Use 2 auto-elements for avail.
(do_pre_partial_partial_insertion): Likewise.
(init_pre): Allocate constant_value_expressions.
(fini_pre): Release constant_value_expressions.

tree-optimization/97706 - handle PHIs in pattern recog mask precision

This adds handling of PHIs to mask precision compute which is
eventually needed to detect a bool pattern when the def chain
contains such a PHI node.

2020-11-06 Richard Biener <rguenther@suse.de>

PR tree-optimization/97706
* tree-vect-patterns.c (possible_vector_mask_operation_p):
PHIs are possible mask operations.
(vect_determine_mask_precision): Handle PHIs.
(vect_determine_precisions): Walk PHIs in BB analysis.

* gcc.dg/vect/bb-slp-pr97706.c: New testcase.

c++: Parser tweaks

We need to adjust the wording for 'export'.  Between c++11 and c++20
it is deprecated.  Outside those ranges it is unsupported (at the
moment).  While here, there's also an unneeded setting of a bool --
it's inside an if block that just checked it was true.

gcc/cp/
* parser.c (cp_parser_template_declaration): Adjust 'export' warning.
(cp_parser_explicit_specialization): Remove unneeded bool setting.

testsuite: fix malloc alignment in test

gcc/testsuite/ChangeLog:

PR gcov-profile/97461
* gcc.dg/tree-prof/pr97461.c: Return aligned memory.

[Fortran] Remove OpenACC 'loop' inside 'parallel' special-case code

Instead, use the generic middle-end code, like already used for Fortran OpenACC
'loop' inside other compute constructs, orphaned 'loop' constructs, and C, C++
generally.

gcc/fortran/
* openmp.c (oacc_is_parallel, resolve_oacc_params_in_parallel):
Remove.
(resolve_oacc_loop_blocks): Don't call the former.
gcc/testsuite/
* gfortran.dg/goacc/loop-2-parallel-3.f95: Adjust.

Remove 'gfortran.dg/goacc/loop-6.f95'

What it's testing is adequately covered in other
'gfortran.dg/goacc/loop-2-parallel-*.f95' testcases.

gcc/testsuite/
* gfortran.dg/goacc/loop-6.f95: Remove.

Remove 'gfortran.dg/goacc/loop-5.f95'

What it's testing is adequately covered in other
'gfortran.dg/goacc/loop-2-*-tile.f95' testcases.

gcc/testsuite/
* gfortran.dg/goacc/loop-5.f95: Remove.

gcc-changelog: prevent double cherry-pick line

contrib/ChangeLog:

* gcc-changelog/git_commit.py: Add new check.
* gcc-changelog/test_email.py: Test it.
* gcc-changelog/test_patches.txt: Add new patch.

refactor SLP analysis

This passes the graph entry kind down to vect_analyze_slp_instance,
which simplifies it and makes it a shallow wrapper around
vect_build_slp_instance.

2020-11-06 Richard Biener <rguenther@suse.de>

* tree-vect-slp.c (vect_analyze_slp): Pass down the
SLP graph entry kind.
(vect_analyze_slp_instance): Simplify.
(vect_build_slp_instance): Adjust.
(vect_slp_check_for_constructors): Perform more
eligibility checks here.

Move ipa-refs from ggc to heap.

gcc/ChangeLog:

* ipa-ref.h (enum ipa_ref_use): Remove GTY marker.
(struct ipa_ref): Remove GTY marker; reorder for better packing.
(struct ipa_ref_list): Remove GTY marker; turn references
and referring to va_heap, vl_ptr vectors; update accessors.
* cgraph.h (symtab_node::iterate_reference): Update.
* ipa-ref.c (ipa_ref::remove_reference): Update.
* symtab.c (symtab_node::create_reference): Update.
(symtab_node::remove_all_references): Update.
(symtab_node::resolve_alias): Update.

gcc/cp/ChangeLog:

* tree.c (cp_fix_function_decl_p): Do not access ipa_ref_list directly.

ipa-modref: Fix comment typos

2020-11-06 Jakub Jelinek <jakub@redhat.com>

* ipa-modref-tree.h: Fix comment typos.
* ipa-modref.c: Likewise.