gcc.git
Bernd Edlinger [Sun, 1 Nov 2020 06:32:20 +0000 (07:32 +0100)]
Fix PR97205

This makes sure that stack allocated SSA_NAMEs are
at least MODE_ALIGNED.  Also increase the MEM_ALIGN
for the corresponding rtl objects.

gcc:
2020-11-03  Bernd Edlinger  <bernd.edlinger@hotmail.de>

PR target/97205
* cfgexpand.c (align_local_variable): Make SSA_NAMEs
at least MODE_ALIGNED.
(expand_one_stack_var_at): Increase MEM_ALIGN for SSA_NAMEs.

gcc/testsuite:
2020-11-03  Bernd Edlinger  <bernd.edlinger@hotmail.de>

PR target/97205
* gcc.c-torture/compile/pr97205.c: New test.

Nathan Sidwell [Tue, 3 Nov 2020 14:02:06 +0000 (06:02 -0800)]
libcpp: unbreak bootstrap

This fixes the bootstrap breakage I caused.  Sorry about that.

libcpp/
* init.c (cpp_read_main_file): Use cpp_get_deps result.

zhengnannan [Tue, 3 Nov 2020 13:56:39 +0000 (13:56 +0000)]
AArch64: Add FLAG for AES/SHA/SM3/SM4 intrinsics [PR94442]

2020-11-03  Zhiheng Xie  <xiezhiheng@huawei.com>
    Nannan Zheng  <zhengnannan@huawei.com>

gcc/ChangeLog:

* config/aarch64/aarch64-simd-builtins.def: Add proper FLAG
for AES/SHA/SM3/SM4 intrinsics.

zhengnannan [Tue, 3 Nov 2020 13:56:36 +0000 (13:56 +0000)]
AArch64: Add FLAG for compare intrinsics [PR94442]

2020-11-03  Zhiheng Xie  <xiezhiheng@huawei.com>
    Nannan Zheng  <zhengnannan@huawei.com>

gcc/ChangeLog:

* config/aarch64/aarch64-simd-builtins.def: Add proper FLAG
for compare intrinsics.

Richard Biener [Tue, 3 Nov 2020 11:28:03 +0000 (12:28 +0100)]
Save some memory at debug stream-in time

This allows us to release references to BLOCKs by not keeping
them rooted in the external_die_map but instead removing them from
there as soon as we have created the corresponding stub DIE.  For
decls it doesn't help since we still keep the decl_die_table.

2020-11-03  Richard Biener  <rguenther@suse.de>

* dwarf2out.c (maybe_create_die_with_external_ref): Remove
hashtable entry.

Andrea Corallo [Thu, 29 Oct 2020 10:20:23 +0000 (11:20 +0100)]
arm: Add vstN_lane_bf16 + vstNq_lane_bf16 intrinsics

gcc/ChangeLog

2020-10-29  Andrea Corallo  <andrea.corallo@arm.com>

* config/arm/arm_neon.h (vst2_lane_bf16, vst2q_lane_bf16)
(vst3_lane_bf16, vst3q_lane_bf16, vst4_lane_bf16)
(vst4q_lane_bf16): New intrinsics.
* config/arm/arm_neon_builtins.def: Touch it for:
__builtin_neon_vst2_lanev4bf, __builtin_neon_vst2_lanev8bf,
__builtin_neon_vst3_lanev4bf, __builtin_neon_vst3_lanev8bf,
__builtin_neon_vst4_lanev4bf, __builtin_neon_vst4_lanev8bf.

gcc/testsuite/ChangeLog

2020-10-29  Andrea Corallo  <andrea.corallo@arm.com>

* gcc.target/aarch64/advsimd-intrinsics/vst2_lane_bf16_indices_1.c:
Run it also for arm-*-*.
* gcc.target/aarch64/advsimd-intrinsics/vst2q_lane_bf16_indices_1.c:
Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vst3_lane_bf16_indices_1.c:
Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vst3q_lane_bf16_indices_1.c:
Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vst4_lane_bf16_indices_1.c:
Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vst4q_lane_bf16_indices_1.c:
Likewise.
* gcc.target/arm/simd/vstn_lane_bf16_1.c: New test.

Andrea Corallo [Mon, 26 Oct 2020 17:31:19 +0000 (18:31 +0100)]
arm: Add vldN_lane_bf16 + vldNq_lane_bf16 intrinsics

gcc/ChangeLog

2020-10-29  Andrea Corallo  <andrea.corallo@arm.com>

* config/arm/arm_neon.h (vld2_lane_bf16, vld2q_lane_bf16)
(vld3_lane_bf16, vld3q_lane_bf16, vld4_lane_bf16)
(vld4q_lane_bf16): Add intrinsics.
* config/arm/arm_neon_builtins.def: Touch for:
__builtin_neon_vld2_lanev4bf, __builtin_neon_vld2_lanev8bf,
__builtin_neon_vld3_lanev4bf, __builtin_neon_vld3_lanev8bf,
__builtin_neon_vld4_lanev4bf, __builtin_neon_vld4_lanev8bf.
* config/arm/iterators.md (VQ_HS): Add V8BF to the iterator.

gcc/testsuite/ChangeLog

2020-10-29  Andrea Corallo  <andrea.corallo@arm.com>

* gcc.target/aarch64/advsimd-intrinsics/vld2_lane_bf16_indices_1.c:
Run it also for the arm backend.
* gcc.target/aarch64/advsimd-intrinsics/vld2q_lane_bf16_indices_1.c:
Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vld3_lane_bf16_indices_1.c:
Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vld3q_lane_bf16_indices_1.c:
Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vld4_lane_bf16_indices_1.c:
Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vld4q_lane_bf16_indices_1.c:
Likewise.
* gcc.target/arm/simd/vldn_lane_bf16_1.c: New test.

Andrea Corallo [Thu, 29 Oct 2020 14:11:37 +0000 (15:11 +0100)]
arm: Add vst1_bf16 + vst1q_bf16 intrinsics

gcc/ChangeLog

2020-10-29  Andrea Corallo  <andrea.corallo@arm.com>

* config/arm/arm_neon.h (vst1_bf16, vst1q_bf16): Add intrinsics.
* config/arm/arm_neon_builtins.def: Touch for:
__builtin_neon_vst1v4bf, __builtin_neon_vst1v8bf.

gcc/testsuite/ChangeLog

2020-10-29  Andrea Corallo  <andrea.corallo@arm.com>

* gcc.target/arm/simd/vst1_bf16_1.c: New test.

Andrea Corallo [Thu, 29 Oct 2020 12:56:17 +0000 (13:56 +0100)]
arm: Add vld1_bf16 + vld1q_bf16 intrinsics

gcc/ChangeLog

2020-10-29  Andrea Corallo  <andrea.corallo@arm.com>

* config/arm/arm-builtins.c (VAR14): Define macro.
* config/arm/arm_neon_builtins.def: Touch for:
__builtin_neon_vld1v4bf, __builtin_neon_vld1v8bf.
* config/arm/arm_neon.h (vld1_bf16, vld1q_bf16): Add intrinsics.

gcc/testsuite/ChangeLog

2020-10-29  Andrea Corallo  <andrea.corallo@arm.com>

* gcc.target/arm/simd/vld1_bf16_1.c: New test.

Andrea Corallo [Fri, 23 Oct 2020 12:21:56 +0000 (14:21 +0200)]
arm: Add vst1_lane_bf16 + vst1q_lane_bf16 intrinsics

gcc/ChangeLog

2020-10-23  Andrea Corallo  <andrea.corallo@arm.com>

* config/arm/arm_neon.h (vst1_lane_bf16, vst1q_lane_bf16): Add
intrinsics.
* config/arm/arm_neon_builtins.def (STORE1LANE): Add v4bf, v8bf.

gcc/testsuite/ChangeLog

2020-10-23  Andrea Corallo  <andrea.corallo@arm.com>

* gcc.target/arm/simd/vst1_lane_bf16_1.c: New testcase.
* gcc.target/arm/simd/vstq1_lane_bf16_indices_1.c: Likewise.
* gcc.target/arm/simd/vst1_lane_bf16_indices_1.c: Likewise.

Andrea Corallo [Wed, 21 Oct 2020 09:16:01 +0000 (11:16 +0200)]
arm: Add vld1_lane_bf16 + vld1q_lane_bf16 intrinsics

gcc/ChangeLog

2020-10-21  Andrea Corallo  <andrea.corallo@arm.com>

* config/arm/arm_neon_builtins.def: Add to LOAD1LANE v4bf, v8bf.
* config/arm/arm_neon.h (vld1_lane_bf16, vld1q_lane_bf16): Add
intrinsics.

gcc/testsuite/ChangeLog

2020-10-21  Andrea Corallo  <andrea.corallo@arm.com>

* gcc.target/arm/simd/vld1_lane_bf16_1.c: New testcase.
* gcc.target/arm/simd/vld1_lane_bf16_indices_1.c: Likewise.
* gcc.target/arm/simd/vld1q_lane_bf16_indices_1.c: Likewise.

Nathan Sidwell [Tue, 3 Nov 2020 13:11:42 +0000 (05:11 -0800)]
c++: cp_tree_equal cleanups

A couple of small fixes.  I noticed bind_template_template_parms was
not marking the parm a template parm (this broke some module
handling).  Debugging CALL_EXPR comparisons led me to refactor
cp_tree_equal's CALL_EXPR code (and my recent fix to debug printing of
same).  Finally, TREE_VECs are best compared by comp_template_args.  I
recall that last piece being a leftover from fixes during gcc-10.
I've been using it on the modules branch since then.

gcc/cp/
* tree.c (bind_template_template_parm): Mark the parm as a
template parm.
(cp_tree_equal): Refactor CALL_EXPR.  Use comp_template_args for
TREE_VECs.

Nathan Sidwell [Tue, 3 Nov 2020 13:08:18 +0000 (05:08 -0800)]
c++: rtti cleanups

Here are a few cleanups from the modules branch.  Generally some RAII,
and a bit of lazy namespace pushing.

gcc/cp/
* rtti.c (init_rtti_processing): Move var decl to its init.
(get_tinfo_decl): Likewise.  Break out creation to called helper
...
(get_tinfo_decl_direct): ... here.
(build_dynamic_cast_1): Move var decls to their initializers.
(tinfo_base_init): Set decl's location to BUILTINS_LOCATION.
(get_tinfo_desc): Only push ABI namespace when needed.  Set type's
context.

Nathan Sidwell [Tue, 3 Nov 2020 12:59:48 +0000 (04:59 -0800)]
libcpp: dependency emission tidying

This patch cleans up the interface to the dependency generation a
little.  We now only check the option in one place, and the
cpp_get_deps function returns nullptr if there are no dependencies.  I
also reworded the -MT and -MQ help text to be make-agnostic -- as
there are ideas about emitting, say, JSON.

libcpp/
* include/mkdeps.h: Include cpplib.h
(deps_write): Adjust first parm type.
* mkdeps.c: Include internal.h
(make_write): Adjust first parm type.  Check phony option
directly.
(deps_write): Adjust first parm type.
* init.c (cpp_read_main_file): Use get_deps.
* directives.c (cpp_get_deps): Check option before initializing.
gcc/c-family/
* c.opt (MQ,MT): Reword description to be make-agnostic.
gcc/fortran/
* cpp.c (gfc_cpp_add_dep): Only add dependency if we're recording
them.
(gfc_cpp_init): Likewise for target.

Dennis Zhang [Tue, 3 Nov 2020 13:00:51 +0000 (13:00 +0000)]
aarch64: ACLE intrinsics convert BF16 to Float32

This patch enables intrinsics to convert BFloat16 scalar and vector
operands to Float32 modes.  The intrinsics are implemented by shifting
each BFloat16 item 16 bits to the left using shl/shll/shll2 instructions.
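
The shift described above can be sketched in portable C.  This is only
an illustration of the bit-level idea, not the code the intrinsics
expand to; the helper name bf16_bits_to_f32 is hypothetical, and it
assumes float is IEEE binary32.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of GCC: widen one BFloat16 bit pattern
   to a binary32 value by moving it into the upper 16 bits.  */
static float
bf16_bits_to_f32 (uint16_t bits)
{
  uint32_t wide = (uint32_t) bits << 16;   /* Low 16 mantissa bits are zero.  */
  float result;
  memcpy (&result, &wide, sizeof result);  /* Reinterpret without aliasing issues.  */
  return result;
}
```

For example, the BFloat16 pattern 0x3F80 becomes 0x3F800000, which is
1.0f in binary32.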

gcc/ChangeLog:

2020-11-03  Dennis Zhang  <dennis.zhang@arm.com>

* config/aarch64/aarch64-simd-builtins.def (vbfcvt): New entry.
(vbfcvt_high, bfcvt): Likewise.
* config/aarch64/aarch64-simd.md (aarch64_vbfcvt<mode>): New entry.
(aarch64_vbfcvt_highv8bf, aarch64_bfcvtsf): Likewise.
* config/aarch64/arm_bf16.h (vcvtah_f32_bf16): New intrinsic.
* config/aarch64/arm_neon.h (vcvt_f32_bf16): Likewise.
(vcvtq_low_f32_bf16, vcvtq_high_f32_bf16): Likewise.

gcc/testsuite/ChangeLog

* gcc.target/aarch64/advsimd-intrinsics/bfcvt-compile.c
(test_vcvt_f32_bf16, test_vcvtq_low_f32_bf16): New tests.
(test_vcvtq_high_f32_bf16, test_vcvth_f32_bf16): Likewise.

Richard Biener [Tue, 3 Nov 2020 11:06:19 +0000 (12:06 +0100)]
bootstrap/97666 - fix array of bool allocation

This fixes the bad assumption that sizeof (bool) == 1.
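
As a generic illustration of the pitfall (not the code in
tree-vect-slp.c), a byte count for an array of bool should be scaled
by sizeof (bool).  The helper name alloc_flags is hypothetical.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: allocate N bool flags, all initially false.
   The element size is passed as sizeof (bool) rather than assumed
   to be 1.  */
static bool *
alloc_flags (size_t n)
{
  return calloc (n, sizeof (bool));
}
```

The caller owns the returned buffer and releases it with free.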

2020-11-03  Richard Biener  <rguenther@suse.de>

PR bootstrap/97666
* tree-vect-slp.c (vect_build_slp_tree_2): Scale
allocation of skip_args by sizeof (bool).

Richard Biener [Tue, 3 Nov 2020 10:52:47 +0000 (11:52 +0100)]
tree-optimization/80928 - SLP vectorize nested loop induction

This adds SLP vectorization of nested inductions.

2020-11-03  Richard Biener <rguenther@suse.de>

PR tree-optimization/80928
* tree-vect-loop.c (vectorizable_induction): SLP vectorize
nested inductions.

* gcc.dg/vect/vect-outer-slp-2.c: New testcase.
* gcc.dg/vect/vect-outer-slp-3.c: Likewise.

Uros Bizjak [Tue, 3 Nov 2020 12:06:42 +0000 (13:06 +0100)]
testsuite: Fix gcc.target/i386/zero-scratch-regs-*.c scan-asm directives

Improve zero-scratch-regs-*.c scan-asm regexps
and add target selectors for 32bit targets.

2020-11-03  Uroš Bizjak  <ubizjak@gmail.com>

gcc/testsuite/ChangeLog:

* gcc.target/i386/zero-scratch-regs-1.c: Add ia32 target
selector where appropriate.  Improve scan-assembler regexp.
* gcc.target/i386/zero-scratch-regs-2.c: Ditto.
* gcc.target/i386/zero-scratch-regs-3.c: Ditto.
* gcc.target/i386/zero-scratch-regs-4.c: Ditto.
* gcc.target/i386/zero-scratch-regs-5.c: Ditto.
* gcc.target/i386/zero-scratch-regs-6.c: Ditto.
* gcc.target/i386/zero-scratch-regs-7.c: Ditto.
* gcc.target/i386/zero-scratch-regs-8.c: Ditto.
* gcc.target/i386/zero-scratch-regs-9.c: Ditto.
* gcc.target/i386/zero-scratch-regs-10.c: Ditto.
* gcc.target/i386/zero-scratch-regs-13.c: Ditto.
* gcc.target/i386/zero-scratch-regs-14.c: Ditto.
* gcc.target/i386/zero-scratch-regs-15.c: Ditto.
* gcc.target/i386/zero-scratch-regs-16.c: Ditto.
* gcc.target/i386/zero-scratch-regs-17.c: Ditto.
* gcc.target/i386/zero-scratch-regs-18.c: Ditto.
* gcc.target/i386/zero-scratch-regs-19.c: Ditto.
* gcc.target/i386/zero-scratch-regs-20.c: Ditto.
* gcc.target/i386/zero-scratch-regs-21.c: Ditto.
* gcc.target/i386/zero-scratch-regs-22.c: Ditto.
* gcc.target/i386/zero-scratch-regs-23.c: Ditto.
* gcc.target/i386/zero-scratch-regs-24.c: Ditto.
* gcc.target/i386/zero-scratch-regs-25.c: Ditto.
* gcc.target/i386/zero-scratch-regs-26.c: Ditto.
* gcc.target/i386/zero-scratch-regs-27.c: Ditto.
* gcc.target/i386/zero-scratch-regs-28.c: Ditto.
* gcc.target/i386/zero-scratch-regs-29.c: Ditto.
* gcc.target/i386/zero-scratch-regs-30.c: Ditto.
* gcc.target/i386/zero-scratch-regs-31.c: Ditto.

Olivier Hainque [Fri, 28 Feb 2020 16:44:57 +0000 (16:44 +0000)]
Add missing require-effective-target lto

This prevents failure of an lto test in configurations
missing LTO support, such as VxWorks for kernel mode.

2020-11-02  Olivier Hainque  <hainque@adacore.com>

gcc/testsuite/
* gcc.dg/tree-ssa/pr71077.c: Add
dg-require-effective-target lto.

Olivier Hainque [Tue, 3 Nov 2020 09:51:43 +0000 (09:51 +0000)]
Add dg-require-effective-target fpic to gcc i386 tests

This change adds

 /* { dg-require-effective-target fpic } */

to tests in gcc.target/i386 that do use -fpic or -fPIC
but don't currently query the target support.

This corresponds to what many other fpic tests do
and helps the VxWorks ports at least, as -fpic is
typically not supported in at least one of the two
major modes of such a port (kernel vs RTP).
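
A test using the directive might start as sketched below.  The options
and the body are hypothetical and not taken from any of the listed
tests; only the dg-require-effective-target line is what this change
adds.

```c
/* { dg-do compile } */
/* { dg-require-effective-target fpic } */
/* { dg-options "-O2 -fpic" } */

/* Hypothetical test body; any code that is built with -fpic would do.  */
extern int counter;
int counter = 0;

int
bump_counter (void)
{
  return ++counter;
}
```

On a target without -fpic support the test is then reported as
unsupported instead of failing.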

2020-11-03  Olivier Hainque  <hainque@adacore.com>

gcc/testsuite/

* gcc.target/i386/pr45352-1.c: Add dg-require-effective-target fpic.
* gcc.target/i386/pr47602.c: Likewise.
* gcc.target/i386/pr55151.c: Likewise.
* gcc.target/i386/pr55458.c: Likewise.
* gcc.target/i386/pr56348.c: Likewise.
* gcc.target/i386/pr57097.c: Likewise.
* gcc.target/i386/pr65753.c: Likewise.
* gcc.target/i386/pr65915.c: Likewise.
* gcc.target/i386/pr66232-5.c: Likewise.
* gcc.target/i386/pr66334.c: Likewise.
* gcc.target/i386/pr66819-2.c: Likewise.
* gcc.target/i386/pr67265.c: Likewise.
* gcc.target/i386/pr81481.c: Likewise.
* gcc.target/i386/pr83994.c: Likewise.

Jan Hubicka [Tue, 3 Nov 2020 10:56:05 +0000 (11:56 +0100)]
Avoid recursion in tree-inline

gcc/ChangeLog:

2020-11-03  Jan Hubicka  <hubicka@ucw.cz>

PR ipa/97578
* ipa-inline-transform.c (maybe_materialize_called_clones): New
function.
(inline_transform): Use it.

gcc/testsuite/ChangeLog:

2020-11-03  Jan Hubicka  <hubicka@ucw.cz>

* gcc.c-torture/compile/pr97578.c: New test.

Richard Biener [Tue, 3 Nov 2020 09:24:02 +0000 (10:24 +0100)]
testsuite/97688 - fix check_vect () with __AVX2__

This fixes the cpuid check to always specify a subleaf zero
which is required to detect AVX2 and doesn't hurt for level one.
Without this fix we get zero runtime coverage when -mavx2 is
specified.
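
A minimal sketch of a leaf 7 query with an explicit subleaf, assuming
the __get_cpuid_count helper and the bit_AVX2 mask from <cpuid.h>.
The function name cpu_has_avx2 is hypothetical and this is not the
code in tree-vect.h; it only checks the CPUID feature bit and does not
verify operating system support for the AVX state.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Hypothetical helper: return 1 if CPUID reports AVX2, 0 otherwise
   (always 0 on non-x86 hosts).  */
static int
cpu_has_avx2 (void)
{
#if defined(__x86_64__) || defined(__i386__)
  unsigned int eax, ebx, ecx, edx;
  /* Leaf 7 is indexed by a subleaf in ECX; request subleaf 0 explicitly.  */
  if (!__get_cpuid_count (7, 0, &eax, &ebx, &ecx, &edx))
    return 0;
  return (ebx & bit_AVX2) != 0;
#else
  return 0;
#endif
}
```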

2020-11-03  Richard Biener  <rguenther@suse.de>

PR testsuite/97688
* gcc.dg/vect/tree-vect.h (check_vect): Fix the x86 cpuid
check to always specify subleaf zero.

Richard Biener [Tue, 3 Nov 2020 08:53:11 +0000 (09:53 +0100)]
tree-optimization/97678 - fix SLP induction epilogue vectorization

This goes back to not tracking SLP nodes for induction initial values
in non-nested contexts, because tracking them interferes with peeling
and epilogue vectorization.

2020-11-03  Richard Biener  <rguenther@suse.de>

PR tree-optimization/97678
* tree-vect-slp.c (vect_build_slp_tree_2): Do not track
the initial values of inductions when not nested.
* tree-vect-loop.c (vectorizable_induction): Look at
PHI node initial values again for SLP and not nested
inductions.  Handle LOOP_VINFO_MASK_SKIP_NITERS and cost
invariants.

* gcc.dg/vect/pr97678.c: New testcase.

Tobias Burnus [Tue, 3 Nov 2020 08:55:58 +0000 (09:55 +0100)]
Fortran: Add !GCC$ attributes DEPRECATED

gcc/fortran/ChangeLog:

* decl.c (ext_attr_list): Add EXT_ATTR_DEPRECATED.
* gfortran.h (ext_attr_id_t): Ditto.
* gfortran.texi (GCC$ ATTRIBUTES): Document it.
* resolve.c (resolve_variable, resolve_function,
resolve_call, resolve_values): Show -Wdeprecated-declarations warning.
* trans-decl.c (add_attributes_to_decl): Skip those
with no middle_end_name.

gcc/testsuite/ChangeLog:

* gfortran.dg/attr_deprecated.f90: New test.

Uros Bizjak [Tue, 3 Nov 2020 08:51:01 +0000 (09:51 +0100)]
x86: Optimize aes<aeswideklvariant>u8 a bit, fix whitespace

2020-11-03  Uroš Bizjak  <ubizjak@gmail.com>

gcc/

* config/i386/sse.md (aes<aeswideklvariant>u8):
Do not use xmm_regs array.  Fix whitespace.

Uros Bizjak [Tue, 3 Nov 2020 08:46:59 +0000 (09:46 +0100)]
x86: Fix comment in ix86_expand_builtin

2020-11-03  Uroš Bizjak  <ubizjak@gmail.com>

gcc/

* config/i386/i386-expand.c (ix86_expand_builtin): Fix comment.

Thomas Schwinge [Thu, 22 Oct 2020 09:04:22 +0000 (11:04 +0200)]
[OpenACC] Enable inconsistent nested 'reduction' clauses checking for OpenACC 'kernels'

gcc/
* omp-low.c (scan_omp_for) <OpenACC>: Move earlier inconsistent
nested 'reduction' clauses checking.
gcc/testsuite/
* c-c++-common/goacc/nested-reductions-1-kernels.c: Extend.
* c-c++-common/goacc/nested-reductions-2-kernels.c: Likewise.
* gfortran.dg/goacc/nested-reductions-1-kernels.f90: Likewise.
* gfortran.dg/goacc/nested-reductions-2-kernels.f90: Likewise.

Thomas Schwinge [Thu, 22 Oct 2020 07:45:31 +0000 (09:45 +0200)]
[OpenACC] Split up testcases for inconsistent nested 'reduction' clauses checking

gcc/testsuite/
* c-c++-common/goacc/nested-reductions.c: Split file into...
* c-c++-common/goacc/nested-reductions-1-kernels.c: ... this...
* c-c++-common/goacc/nested-reductions-1-parallel.c: ..., this...
* c-c++-common/goacc/nested-reductions-1-routine.c: ..., and this.
* c-c++-common/goacc/nested-reductions-warn.c: Split file into...
* c-c++-common/goacc/nested-reductions-2-kernels.c: ... this...
* c-c++-common/goacc/nested-reductions-2-parallel.c: ..., this...
* c-c++-common/goacc/nested-reductions-2-routine.c: ..., and this.
* gfortran.dg/goacc/nested-reductions.f90: Split file into...
* gfortran.dg/goacc/nested-reductions-1-kernels.f90: ... this...
* gfortran.dg/goacc/nested-reductions-1-parallel.f90: ..., this...
* gfortran.dg/goacc/nested-reductions-1-routine.f90: ..., and
this.
* gfortran.dg/goacc/nested-reductions-warn.f90: Split file into...
* gfortran.dg/goacc/nested-reductions-2-kernels.f90: ... this...
* gfortran.dg/goacc/nested-reductions-2-parallel.f90: ..., this...
* gfortran.dg/goacc/nested-reductions-2-routine.f90: ..., and
this.

Jonathan Yong [Tue, 3 Nov 2020 07:47:12 +0000 (07:47 +0000)]
libstdc++: use lt_host_flags for libstdc++.la

On platforms like MinGW and Cygwin, the shared library is not generated
unless -no-undefined is used.

This patch makes sure the right flags are used, since libtool is
already used to link libstdc++.

libstdc++-v3/ChangeLog:

* src/Makefile.am (libstdc___la_LINK): Add lt_host_flags.
* src/Makefile.in: Regenerate.

Thomas Schwinge [Tue, 27 Oct 2020 16:14:10 +0000 (17:14 +0100)]
[Fortran] More precise location information for OpenACC 'gang', 'worker', 'vector' clauses with argument [PR92793]

gcc/fortran/
PR fortran/92793
* trans-openmp.c (gfc_trans_omp_clauses): More precise location
information for OpenACC 'gang', 'worker', 'vector' clauses with
argument.
gcc/testsuite/
PR fortran/92793
* gfortran.dg/goacc/pr92793-1.f90: Adjust.

Thomas Schwinge [Tue, 27 Oct 2020 16:13:16 +0000 (17:13 +0100)]
[OpenACC] More precise diagnostics for 'gang', 'worker', 'vector' clauses with arguments on 'loop' only allowed in 'kernels' regions

Instead of at the location of the 'loop' directive, 'error_at' the location of
the improper clause, and 'inform' at the location of the enclosing parent
compute construct/routine.

The Fortran testcases come with some XFAILing, to be resolved later.

gcc/
* omp-low.c (scan_omp_for) <OpenACC>: More precise diagnostics for
'gang', 'worker', 'vector' clauses with arguments only allowed in
'kernels' regions.
gcc/testsuite/
* c-c++-common/goacc/pr92793-1.c: Extend.
* gfortran.dg/goacc/pr92793-1.f90: Likewise.

Kewen Lin [Tue, 3 Nov 2020 02:51:47 +0000 (02:51 +0000)]
pass: Run cleanup passes before SLP [PR96789]

As discussed in PR96789, some scalar stmts can be eliminated by
passes that run after SLP, but we still model their costs when
trying to SLP, which can affect the vectorizer's decision.  One
typical case is the one in PR96789 on Power.

As Richard suggested there, this patch introduces a pass called
pre_slp_scalar_cleanup which runs some secondary cleanup passes,
for now FRE and DSE.  It also introduces a new group of TODO flags
called pending TODO flags.  Unlike normal TODO flags, pending TODO
flags are passed down the pipeline until one of their consumers
can perform the requested action.  Consumers should then clear the
flags for the actions that they have taken.
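
The pending-flag protocol can be modelled in a few lines of C.  This
is a sketch of the idea only; every name in it (fn_state,
PENDING_CLEANUP, producer_pass, unrelated_pass, cleanup_pass) is
hypothetical and none of it is the code added to passes.c.

```c
/* Hypothetical names throughout; a model of the idea, not GCC's code.  */
#define PENDING_CLEANUP (1u << 0)

struct fn_state
{
  unsigned pending;   /* Requests travelling down the pipeline.  */
  int cleanups_run;   /* How often the consumer actually acted.  */
};

/* Producer: request a cleanup from some later pass.  */
static void
producer_pass (struct fn_state *fn, int did_unroll)
{
  if (did_unroll)
    fn->pending |= PENDING_CLEANUP;
}

/* A pass that does not consume the flag leaves it untouched.  */
static void
unrelated_pass (struct fn_state *fn)
{
  (void) fn;
}

/* Consumer: act only when requested, then clear the flag.  */
static void
cleanup_pass (struct fn_state *fn)
{
  if (fn->pending & PENDING_CLEANUP)
    {
      fn->cleanups_run++;
      fn->pending &= ~PENDING_CLEANUP;
    }
}
```

The request survives any number of passes that ignore it, and the
consumer does no work at all when nothing was requested.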

Some compilation time statistics on all SPEC2017 INT benchmarks were
collected on one Power9 machine for several option sets below:
  A1: -Ofast -funroll-loops
  A2: -O1
  A3: -O1 -funroll-loops
  A4: -O2
  A5: -O2 -funroll-loops

The corresponding compile-time increase is negligible:
  A1       A2       A3        A4        A5
  0.08%    0.00%    -0.38%    -0.10%    -0.05%

Bootstrapped/regtested on powerpc64le-linux-gnu P8.

gcc/ChangeLog:

PR tree-optimization/96789
* function.h (struct function): New member unsigned pending_TODOs.
* passes.c (class pass_pre_slp_scalar_cleanup): New class.
(make_pass_pre_slp_scalar_cleanup): New function.
(pass_data_pre_slp_scalar_cleanup): New pass data.
* passes.def: (pass_pre_slp_scalar_cleanup): New pass, add
pass_fre and pass_dse as its children.
* timevar.def (TV_SCALAR_CLEANUP): New timevar.
* tree-pass.h (PENDING_TODO_force_next_scalar_cleanup): New
pending TODO flag.
(make_pass_pre_slp_scalar_cleanup): New declare.
* tree-ssa-loop-ivcanon.c (tree_unroll_loops_completely_1):
Once any outermost loop gets unrolled, flag cfun pending_TODOs
PENDING_TODO_force_next_scalar_cleanup on.

gcc/testsuite/ChangeLog:

PR tree-optimization/96789
* gcc.dg/tree-ssa/ssa-dse-28.c: Adjust.
* gcc.dg/tree-ssa/ssa-dse-29.c: Likewise.
* gcc.dg/vect/bb-slp-41.c: Likewise.
* gcc.dg/tree-ssa/pr96789.c: New test.

Martin Storsjö [Tue, 8 Sep 2020 12:21:51 +0000 (15:21 +0300)]
libgcc: Expose the instruction pointer and stack pointer in SEH _Unwind_Backtrace

Previously, the SEH version of _Unwind_Backtrace did unwind
the stack and call the provided callback function as intended,
but there was little the caller could do within the callback to
actually get any info about that particular level in the unwind.

Set the ra and cfa pointers, which are used by _Unwind_GetIP
and _Unwind_GetCFA, to allow using these functions from the
callback to inspect the state at each stack frame.
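
A callback of the kind described above might look like the following
sketch.  It uses the generic <unwind.h> interface; the struct
frame_stats and the functions record_frame and collect_backtrace are
hypothetical names, and the example is not specific to SEH targets.

```c
#include <stdint.h>
#include <unwind.h>

/* Hypothetical record filled in by the callback.  */
struct frame_stats
{
  int frames;           /* Number of frames visited.  */
  uintptr_t first_ip;   /* Instruction pointer of the first frame.  */
  uintptr_t first_cfa;  /* Canonical frame address of the first frame.  */
};

/* Called once per frame; inspects the frame through the context.  */
static _Unwind_Reason_Code
record_frame (struct _Unwind_Context *ctx, void *arg)
{
  struct frame_stats *stats = arg;
  if (stats->frames == 0)
    {
      stats->first_ip = (uintptr_t) _Unwind_GetIP (ctx);
      stats->first_cfa = (uintptr_t) _Unwind_GetCFA (ctx);
    }
  stats->frames++;
  return _URC_NO_REASON;   /* Keep walking towards the outermost frame.  */
}

/* Walk the current call stack and return what the callback saw.  */
static struct frame_stats
collect_backtrace (void)
{
  struct frame_stats stats = { 0, 0, 0 };
  _Unwind_Backtrace (record_frame, &stats);
  return stats;
}
```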

2020-09-08  Martin Storsjö  <martin@martin.st>

libgcc/
* unwind-seh.c (_Unwind_Backtrace): Set the ra and cfa pointers
before calling the callback.

GCC Administrator [Tue, 3 Nov 2020 00:16:23 +0000 (00:16 +0000)]
Daily bump.

Alan Modra [Sun, 27 Sep 2020 09:41:58 +0000 (19:11 +0930)]
can_implement_as_sibling_call_p REG_PARM_STACK_SPACE check

This moves an #ifdef block of code from calls.c to
targetm.function_ok_for_sibcall.  Only two targets, x86 and rs6000,
define REG_PARM_STACK_SPACE or OUTGOING_REG_PARM_STACK_SPACE macros
that might vary depending on the called function.  Macros like
UNITS_PER_WORD don't change over a function boundary, nor does the
MIPS ABI, nor does TARGET_64BIT on PA-RISC.  Other targets are even
more trivially proven to not need the calls.c code.

Besides cleaning up a small piece of #ifdef code, the motivation for
this patch is to allow tail calls on PowerPC for functions that
require less reg_parm_stack_space than their caller.  The original
code in calls.c only permitted tail calls when exactly equal, but on
PowerPC we can tail call if the callee has less or equal
REG_PARM_STACK_SPACE than the caller, as demonstrated by the
testcase.  So we should use

  /* If reg parm stack space increases, we cannot sibcall.  */
  if (REG_PARM_STACK_SPACE (decl ? decl : fntype)
      > INCOMING_REG_PARM_STACK_SPACE (current_function_decl))

and note the change to use INCOMING_REG_PARM_STACK_SPACE.
REG_PARM_STACK_SPACE has always been wrong there for PowerPC.  See
https://gcc.gnu.org/pipermail/gcc-patches/2014-May/389867.html for why
if you're curious.  Not that it matters, because PowerPC can do
without this check entirely, relying on a stack slot test in generic
code.

a) The generic code checks that arg passing stack in the callee is not
   greater than that in the caller, and,
b) ELFv2 only allocates reg_parm_stack_space when some parameter is
   passed on the stack.
Point (b) means that zero reg_parm_stack_space implies zero stack
space, and non-zero reg_parm_stack_space implies non-zero stack
space.  So the case of 0 reg_parm_stack_space in the caller and 64 in
the callee will be caught by (a).
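
The difference between the old and the relaxed check can be modelled
as two predicates on the sizes in bytes.  This is a simplified sketch
with hypothetical function names, not the code in calls.c or i386.c.

```c
#include <stdbool.h>

/* Hypothetical model of the two checks, with sizes in bytes.  */

/* Old generic check: sibcall only when both sizes match exactly.  */
static bool
sibcall_ok_exact (unsigned callee_space, unsigned caller_space)
{
  return callee_space == caller_space;
}

/* Relaxed check: sibcall unless reg parm stack space increases.  */
static bool
sibcall_ok_relaxed (unsigned callee_space, unsigned caller_incoming_space)
{
  return !(callee_space > caller_incoming_space);
}
```

With 64 bytes in the caller and 0 in the callee, the exact check
rejects the sibcall while the relaxed one allows it; the reverse case
(0 in the caller, 64 in the callee) is rejected by both.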

gcc/
PR middle-end/97267
* calls.h (maybe_complain_about_tail_call): Declare.
* calls.c (maybe_complain_about_tail_call): Make global.
(can_implement_as_sibling_call_p): Delete reg_parm_stack_space
param.  Adjust caller.  Move REG_PARM_STACK_SPACE check to..
* config/i386/i386.c (ix86_function_ok_for_sibcall): ..here.

gcc/testsuite/
PR middle-end/97267
* gcc.target/powerpc/pr97267.c: New test.

Vladimir N. Makarov [Mon, 2 Nov 2020 21:52:17 +0000 (16:52 -0500)]
Expand reg_equiv when scratches are removed.

gcc/ChangeLog:

* ira.c (ira_remove_scratches): Rename to remove_scratches.  Make
it static and returning flag of any change.
(ira.c): Call ira_expand_reg_equiv in case of removing scratches.

H.J. Lu [Mon, 21 Sep 2020 12:33:46 +0000 (05:33 -0700)]
x86: Also require MMX for __builtin_ia32_maskmovq

MMX emulation with SSE is implemented at MMX intrinsic level, not at MMX
instruction level.  The _mm_maskmove_si64 intrinsic for "MASKMOVQ mm1, mm2"
is emulated with __builtin_ia32_maskmovdqu.  Since the SSE "MASKMOVQ mm1, mm2"
builtin function, __builtin_ia32_maskmovq, can't be emulated with XMM
registers, make __builtin_ia32_maskmovq also require MMX instead of SSE
only.

gcc/

PR target/97140
* config/i386/i386-expand.c (ix86_expand_builtin): Require MMX
for __builtin_ia32_maskmovq.

gcc/testsuite/

PR target/97140
* gcc.target/i386/pr97140.c: New test.

GCC Administrator [Mon, 2 Nov 2020 20:53:00 +0000 (20:53 +0000)]
Daily bump.

Martin Sebor [Mon, 2 Nov 2020 20:47:45 +0000 (13:47 -0700)]
Correct -Wstringop-overflow and -Wstringop-overread.

gcc/ChangeLog:
* doc/invoke.texi (-Wstringop-overflow): Correct default setting.
(-Wstringop-overread): Move past -Wstringop-overflow.

François-Xavier Coudert [Mon, 2 Nov 2020 20:15:10 +0000 (21:15 +0100)]
gcc: quote characters in texi source

gcc/ChangeLog:

PR bootstrap/57076
* Makefile.in (gcc-vers.texi): Quote @, { and }.

Thomas Rodgers [Mon, 2 Nov 2020 18:06:06 +0000 (10:06 -0800)]
libstdc++: Add c++2a <syncstream>

libstdc++-v3/ChangeLog:
* doc/doxygen/user.cfg.in (INPUT): Add new header.
* include/Makefile.am (std_headers): Add new header.
* include/Makefile.in: Regenerate.
* include/precompiled/stdc++.h: Include new header.
* include/std/syncstream: New header.
* include/std/version: Add __cpp_lib_syncbuf.
* testsuite/27_io/basic_syncbuf/1.cc: New test.
* testsuite/27_io/basic_syncbuf/2.cc: Likewise.
* testsuite/27_io/basic_syncbuf/basic_ops/1.cc:
Likewise.
* testsuite/27_io/basic_syncbuf/requirements/types.cc:
Likewise.
* testsuite/27_io/basic_syncbuf/sync_ops/1.cc:
Likewise.
* testsuite/27_io/basic_syncstream/1.cc: Likewise.
* testsuite/27_io/basic_syncstream/2.cc: Likewise.
* testsuite/27_io/basic_syncstream/basic_ops/1.cc:
Likewise.
* testsuite/27_io/basic_syncstream/requirements/types.cc:
Likewise.

Nathan Sidwell [Mon, 2 Nov 2020 18:28:52 +0000 (10:28 -0800)]
c++: Fixup some vardecls and whitespace

Move some var decls to their initializers.  Correct some whitespace.

gcc/cp/
* decl.c (start_decl_1): Refactor declarations.  Fixup some
whitespace.
(lookup_and_check_tag): Fixup some whitespace.

Nathan Sidwell [Mon, 2 Nov 2020 18:24:16 +0000 (10:24 -0800)]
c++: refactor duplicate decls

A couple of paths in duplicate decls dealing with templates and
builtins were overly complicated.  Fixing thusly.

gcc/cp/
* decl.c (duplicate_decls): Refactor some template & builtin
handling.

Nathan Sidwell [Mon, 2 Nov 2020 17:29:14 +0000 (09:29 -0800)]
c++: Delete unused hash type

Since I redid block-scope extern decls, the need for a uid->decl
hasher has gone away.  Deleting thusly.

gcc/cp/
* cp-tree.h (struct cxx_int_tree_map): Delete.
(struct cxx_int_tree_map_hasher): Delete.
* cp-gimplify.c (cxx_int_tree_map_hasher::equal): Delete.
(cxx_int_tree_map_hasher::hash): Delete.

Patrick Palka [Mon, 2 Nov 2020 18:19:29 +0000 (13:19 -0500)]
c++: Don't purge the satisfaction caches

The adoption of P2104 ("Disallow changing concept values") means we can
memoize the result of satisfaction indefinitely and no longer have to
clear the satisfaction caches on various events that would affect
satisfaction.  To that end, this patch removes the invalidation routine
clear_satisfaction_cache and adjusts its callers appropriately.

This provides a large reduction in compile time and memory use in some
cases.  For example, on the libstdc++ test std/ranges/adaptor/join.cc,
compile time and memory usage drop by nearly 75%, from 7.5s/770MB to
2s/230MB, with a --enable-checking=release compiler.

gcc/cp/ChangeLog:

* class.c (finish_struct_1): Don't call clear_satisfaction_cache.
* constexpr.c (clear_cv_and_fold_caches): Likewise.  Remove bool
parameter.
* constraint.cc (clear_satisfaction_cache): Remove definition.
* cp-tree.h (clear_satisfaction_cache): Remove declaration.
(clear_cv_and_fold_caches): Remove bool parameter.
* typeck2.c (store_init_value): Remove argument to
clear_cv_and_fold_caches.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-complete1.C: Delete test that became
ill-formed after P2104.

4 years agoAdd bcd builtins listed in appendix B of the ABI
Carl Love [Mon, 31 Aug 2020 21:12:31 +0000 (16:12 -0500)]
Add bcd builtins listed in appendix B of the ABI

2020-10-29  Carl Love  <cel@us.ibm.com>

gcc/
PR target/93449
* config/rs6000/altivec.h (__builtin_bcdadd, __builtin_bcdadd_lt,
__builtin_bcdadd_eq, __builtin_bcdadd_gt, __builtin_bcdadd_ofl,
__builtin_bcdadd_ov, __builtin_bcdsub, __builtin_bcdsub_lt,
__builtin_bcdsub_eq, __builtin_bcdsub_gt, __builtin_bcdsub_ofl,
__builtin_bcdsub_ov, __builtin_bcdinvalid, __builtin_bcdmul10,
__builtin_bcddiv10, __builtin_bcd2dfp, __builtin_bcdcmpeq,
__builtin_bcdcmpgt, __builtin_bcdcmplt, __builtin_bcdcmpge,
__builtin_bcdcmple): Add defines.
* config/rs6000/altivec.md: Add UNSPEC_BCDSHIFT.
(BCD_TEST): Add le, ge to code iterator.
Add VBCD mode iterator.
(bcd<bcd_add_sub>_test, *bcd<bcd_add_sub>_test2,
bcd<bcd_add_sub>_<code>, bcd<bcd_add_sub>_<code>): Add mode to name.
Change iterator from V1TI to VBCD.
(*bcdinvalid_<mode>, bcdshift_v16qi): New define_insn.
(bcdinvalid_<mode>, bcdmul10_v16qi, bcddiv10_v16qi): New define.
* config/rs6000/dfp.md (dfp_denbcd_v16qi_inst): New define_insn.
(dfp_denbcd_v16qi): New define_expand.
* config/rs6000/rs6000-builtin.def (BU_P8V_MISC_1): New define.
(BCDADD): Replaced with BCDADD_V1TI and BCDADD_V16QI.
(BCDADD_LT): Replaced with BCDADD_LT_V1TI and BCDADD_LT_V16QI.
(BCDADD_EQ): Replaced with BCDADD_EQ_V1TI and BCDADD_EQ_V16QI.
(BCDADD_GT): Replaced with BCDADD_GT_V1TI and BCDADD_GT_V16QI.
(BCDADD_OV): Replaced with BCDADD_OV_V1TI and BCDADD_OV_V16QI.
(BCDSUB_V1TI, BCDSUB_V16QI, BCDSUB_LT_V1TI, BCDSUB_LT_V16QI,
BCDSUB_LE_V1TI, BCDSUB_LE_V16QI, BCDSUB_EQ_V1TI, BCDSUB_EQ_V16QI,
BCDSUB_GT_V1TI, BCDSUB_GT_V16QI, BCDSUB_GE_V1TI, BCDSUB_GE_V16QI,
BCDSUB_OV_V1TI, BCDSUB_OV_V16QI, BCDINVALID_V1TI, BCDINVALID_V16QI,
BCDMUL10_V16QI, BCDDIV10_V16QI, DENBCD_V16QI): New builtin definitions.
(BCDADD, BCDADD_LT, BCDADD_EQ, BCDADD_GT, BCDADD_OV, BCDSUB, BCDSUB_LT,
BCDSUB_LE, BCDSUB_EQ, BCDSUB_GT, BCDSUB_GE, BCDSUB_OV, BCDINVALID,
BCDMUL10, BCDDIV10, DENBCD): New overload definitions.
* config/rs6000/rs6000-call.c (P8V_BUILTIN_VEC_BCDADD, P8V_BUILTIN_VEC_BCDADD_LT,
P8V_BUILTIN_VEC_BCDADD_EQ, P8V_BUILTIN_VEC_BCDADD_GT, P8V_BUILTIN_VEC_BCDADD_OV,
P8V_BUILTIN_VEC_BCDINVALID, P9V_BUILTIN_VEC_BCDMUL10, P8V_BUILTIN_VEC_DENBCD,
P8V_BUILTIN_VEC_BCDSUB, P8V_BUILTIN_VEC_BCDSUB_LT, P8V_BUILTIN_VEC_BCDSUB_LE,
P8V_BUILTIN_VEC_BCDSUB_EQ, P8V_BUILTIN_VEC_BCDSUB_GT, P8V_BUILTIN_VEC_BCDSUB_GE,
P8V_BUILTIN_VEC_BCDSUB_OV): New overloaded specifications.
(CODE_FOR_bcdadd): Replaced with CODE_FOR_bcdadd_v16qi and CODE_FOR_bcdadd_v1ti.
(CODE_FOR_bcdadd_lt): Replaced with CODE_FOR_bcdadd_lt_v16qi and CODE_FOR_bcdadd_lt_v1ti.
(CODE_FOR_bcdadd_eq): Replaced with CODE_FOR_bcdadd_eq_v16qi and CODE_FOR_bcdadd_eq_v1ti.
(CODE_FOR_bcdadd_gt): Replaced with CODE_FOR_bcdadd_gt_v16qi and CODE_FOR_bcdadd_gt_v1ti.
(CODE_FOR_bcdsub): Replaced with CODE_FOR_bcdsub_v16qi and CODE_FOR_bcdsub_v1ti.
(CODE_FOR_bcdsub_lt): Replaced with CODE_FOR_bcdsub_lt_v16qi and CODE_FOR_bcdsub_lt_v1ti.
(CODE_FOR_bcdsub_eq): Replaced with CODE_FOR_bcdsub_eq_v16qi and CODE_FOR_bcdsub_eq_v1ti.
(CODE_FOR_bcdsub_gt): Replaced with CODE_FOR_bcdsub_gt_v16qi and CODE_FOR_bcdsub_gt_v1ti.
(rs6000_expand_ternop_builtin):  Add CODE_FOR_dfp_denbcd_v16qi to else if.
* doc/extend.texi: Add documentation for new builtins.

gcc/testsuite/
* gcc.target/powerpc/bcd-2.c: Add include altivec.h.
* gcc.target/powerpc/bcd-3.c: Add include altivec.h.
* gcc.target/powerpc/bcd-4.c: New test.

4 years agoc++: Some additional tests
Nathan Sidwell [Mon, 2 Nov 2020 16:54:16 +0000 (08:54 -0800)]
c++: Some additional tests

I created a few tests on the modules branch that are not actually
module-related.  Here they are.

gcc/testsuite/
* g++.dg/concepts/pack-1.C: New.
* g++.dg/lookup/using53.C: Add an enum.
* g++.dg/template/error25.C: Relax 'export' error check.

4 years agooptions: Tiny refactor
Nathan Sidwell [Mon, 2 Nov 2020 16:50:42 +0000 (08:50 -0800)]
options:  Tiny refactor

This changes more on the modules branch, but let's move the
declaration to the initializer now.

gcc/c-family/
* c-opts.c (c_common_post_options): Move var decl to its
initialization point.

4 years agocore: Synchronize tree-cst & wide-int caching expectations
Nathan Sidwell [Mon, 2 Nov 2020 16:46:16 +0000 (08:46 -0800)]
core: Synchronize tree-cst & wide-int caching expectations

I fell over an ICE where wide_int_to_type_1's expectations of pointer
value caching didn't match that of cache_integer_cst's behaviour.  I
don't know why it only exhibited on the modules branch, but it seems
pretty wrong.  This patch matches up the behaviours and adds a comment
about that.

gcc/
* tree.c (cache_integer_cst): Fixup pointer caching to match
wide_int_to_type_1's expectations.  Add comment.

4 years agocore: id_equal should forward
Nathan Sidwell [Mon, 2 Nov 2020 16:43:17 +0000 (08:43 -0800)]
core: id_equal should forward

I noticed the two id_equal functions directly called strcmp.  This
changes one of them to call the other with args swapped.
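
A minimal sketch of the forwarding described above. The `Ident` struct is a hypothetical stand-in for GCC's identifier node (the real functions operate on `tree`); only the name `id_equal` comes from the commit.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for an identifier node; GCC's real type is `tree`.
struct Ident {
  const char *str;
};

// Primary predicate: the only overload that calls strcmp directly.
inline bool id_equal(const Ident &id, const char *str) {
  assert(id.str && str);
  return std::strcmp(id.str, str) == 0;
}

// Symmetric overload: forwards to the primary one with arguments swapped.
inline bool id_equal(const char *str, const Ident &id) {
  return id_equal(id, str);
}
```

With this shape the comparison logic lives in one place, so a later change to how identifiers are compared only has to touch the primary overload.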

gcc/
* tree.h (id_equal): Call the symmetric predicate with swapped
arguments.

4 years agocore: debug-print whole call expr
Nathan Sidwell [Mon, 2 Nov 2020 16:38:30 +0000 (08:38 -0800)]
core: debug-print whole call expr

In debugging some call-expr handling, I got confused because the debug
printer elided NULL call operands.  This changes the printer to display
them as NULL.

gcc/
* print-tree.c (print_node): Display all the operands of a call
expr.

4 years agocpplib: Macro use location and comparison
Nathan Sidwell [Mon, 2 Nov 2020 16:29:58 +0000 (08:29 -0800)]
cpplib:  Macro use location and comparison

Our macro use hook passes a location, but doesn't receive it from the
using location.  This patch adds the extra location_t parameter and
passes it through.

A second cleanup is breaking out the macro comparison code from the
redefinition warning.  That'll turn out useful for modules.

Finally, there's a filename comparison needed for the location
optimization of rewinding from line 2 (occurs during the emission of
builtin macros).

libcpp/
* internal.h (_cpp_notify_macro_use): Add location parm.
(_cpp_maybe_notify_macro_use): Likewise.
* directives.c (_cpp_do_file_change): Check we've not changed file
when optimizing a rewind.
(do_ifdef): Pass location to _cpp_maybe_notify_macro_use.
(do_ifndef): Likewise.  Delete obsolete comment about powerpc.
* expr.c (parse_defined): Pass location to
_cpp_maybe_notify_macro_use.
* macro.c (enter_macro_context): Likewise.
(warn_of_redefinition): Break out helper function.  Call it.
(compare_macros): New function broken out of warn_of_redefinition.
(_cpp_new_macro): Zero all fields.
(_cpp_notify_macro_use): Add location parameter.

4 years agoAdd hint * to 2nd alternative of the 1st scratch in *vsx_extract_<mode>_store_p9.
Vladimir N. Makarov [Mon, 2 Nov 2020 16:03:54 +0000 (11:03 -0500)]
Add hint * to 2nd alternative of the 1st scratch in *vsx_extract_<mode>_store_p9.

gcc/ChangeLog:

* config/rs6000/vsx.md (*vsx_extract_<mode>_store_p9): Add hint *
to 2nd alternative of the 1st scratch.

4 years ago[PATCH] aarch64: Fix PR97638
Sudakshina Das [Mon, 2 Nov 2020 15:52:22 +0000 (15:52 +0000)]
[PATCH] aarch64: Fix PR97638

Currently the testcase in the patch was failing to produce
a 'bti c' at the beginning of the function. This was because
in aarch64_pac_insn_p, we were wrongly returning at the first
check!

2020-10-30  Sudakshina Das  <sudi.das@arm.com>

gcc/ChangeLog:

PR target/97638
* config/aarch64/aarch64-bti-insert.c (aarch64_pac_insn_p): Update
return value on INSN_P check.

gcc/testsuite/ChangeLog:

PR target/97638
* gcc.target/aarch64/pr97638.c: New test.

4 years agoRewrite SLP induction vectorization
Richard Biener [Mon, 2 Nov 2020 11:38:04 +0000 (12:38 +0100)]
Rewrite SLP induction vectorization

This rewrites SLP induction vectorization to handle different
inductions in the different SLP lanes.  It also changes SLP
build to represent the initial value (but not the cycle) so
it can be enhanced to handle outer loop vectorization later.

Note this FAILs gcc.dg/vect/costmodel/x86_64/costmodel-pr30843.c
because it removes one CSE optimization that no longer works
with non-uniform initial value and step.  I'll see to recover
from this after outer loop vectorization of inductions works.

It might be a bit friendlier to variable-size vectors now
but then we're now building the step vector from scalars ...

2020-11-02  Richard Biener  <rguenther@suse.de>

* tree.h (build_real_from_wide): Declare.
* tree.c (build_real_from_wide): New function.
* tree-vect-slp.c (vect_build_slp_tree_2): Remove
restriction on induction vectorization, represent
the initial value.
* tree-vect-loop.c (vect_model_induction_cost): Inline ...
(vectorizable_induction): ... here.  Rewrite SLP
code generation.

* gcc.dg/vect/slp-49.c: New testcase.

4 years agoipa-cp: New debug counters for IPA-CP
Martin Jambor [Mon, 2 Nov 2020 14:43:28 +0000 (15:43 +0100)]
ipa-cp: New debug counters for IPA-CP

Martin Liška has been asking me to add debug counters to the IPA-CP pass so
that testcase reductions are easier.  The pass already has one for the bit
value propagation, so this patch adds one for value_range propagation
and one for the actual constant propagation.

gcc/ChangeLog:

2020-10-30  Martin Jambor  <mjambor@suse.cz>

* dbgcnt.def (ipa_cp_values): New counter.
(ipa_cp_vr): Likewise.
* ipa-cp.c (decide_about_value): Check and bump ipa_cp_values debug
counter.
(decide_whether_version_node): Likewise.
(ipcp_store_vr_results): Check and bump ipa_cp_vr debug counter.

4 years agoarm: Fix multiple inheritance thunks for thumb-1 with -mpure-code
Christophe Lyon [Mon, 2 Nov 2020 14:40:10 +0000 (14:40 +0000)]
arm: Fix multiple inheritance thunks for thumb-1 with -mpure-code

When -mpure-code is used, we cannot load delta from code memory (like
we do without -mpure-code).

This patch builds the value of mi_delta into r3 with a series of
movs/adds/lsls.

We also do some cleanup by not emitting the function address and delta
via .word directives at the end of the thunk since we don't use them
with -mpure-code.

No need for new testcases, this bug was already identified by:
g++.dg/ipa/pr46287-3.C
g++.dg/ipa/pr46984.C
g++.dg/opt/thunk1.C
g++.dg/torture/pr46287.C
g++.dg/torture/pr45699.C

2020-11-02  Christophe Lyon  <christophe.lyon@linaro.org>

gcc/
* config/arm/arm.c (arm_thumb1_mi_thunk): Build mi_delta in r3 and
do not emit function address and delta when -mpure-code is used.

4 years agoarm: Call thumb1_gen_const_int from thumb1_movsi_insn
Christophe Lyon [Mon, 2 Nov 2020 14:39:52 +0000 (14:39 +0000)]
arm: Call thumb1_gen_const_int from thumb1_movsi_insn

thumb1_movsi_insn used the same algorithm to build a constant in asm
as thumb1_gen_const_int_1 does in RTL. Since the previous patch added
support for asm generation in thumb1_gen_const_int_1, this patch calls
it from thumb1_movsi_insn to avoid duplication.

We need to introduce a new proxy function, thumb1_gen_const_int_print
to select the right template.

This patch also adds a new testcase as the updated alternative is only
used by thumb-1 processors that also support movt/movw.

2020-11-02  Christophe Lyon  <christophe.lyon@linaro.org>

gcc/
* config/arm/thumb1.md (thumb1_movsi_insn): Call
thumb1_gen_const_int_print.
* config/arm/arm-protos.h (thumb1_gen_const_int_print): Add
prototype.
* config/arm/arm.c (thumb1_gen_const_int_print): New.

gcc/testsuite/
* gcc.target/arm/pure-code/no-literal-pool-m23.c: New.

4 years agoarm: Improve thumb1_gen_const_int
Christophe Lyon [Mon, 2 Nov 2020 14:39:24 +0000 (14:39 +0000)]
arm: Improve thumb1_gen_const_int

Enable thumb1_gen_const_int to generate RTL or asm depending on the
context, so that we avoid duplicating code to handle constants in
Thumb-1 with -mpure-code.

Use a template so that the algorithm is effectively shared, and
rely on two classes to handle the actual emission as RTL or asm.

The generated sequence is improved to handle right-shiftable and small
values with fewer instructions. We now generate:

128:
        movs    r0, r0, #128
264:
        movs    r3, #33
        lsls    r3, #3
510:
        movs    r3, #255
        lsls    r3, #1
512:
        movs    r3, #1
        lsls    r3, #9
764:
        movs    r3, #191
        lsls    r3, #2
65536:
        movs    r3, #1
        lsls    r3, #16
0x123456:
        movs    r3, #18 ;0x12
        lsls    r3, #8
        adds    r3, #52 ;0x34
        lsls    r3, #8
        adds    r3, #86 ;0x56
0x1123456:
        movs    r3, #137 ;0x89
        lsls    r3, #8
        adds    r3, #26 ;0x1a
        lsls    r3, #8
        adds    r3, #43 ;0x2b
        lsls    r3, #1
0x1000010:
        movs    r3, #16
        lsls    r3, #16
        adds    r3, #1
        lsls    r3, #4
0x1000011:
        movs    r3, #1
        lsls    r3, #24
        adds    r3, #17
-8192:
        movs    r3, #1
        lsls    r3, #13
        rsbs    r3, #0

The patch adds a testcase which does not fully exercise
thumb1_gen_const_int, as other existing patterns already catch small
constants.  These parts of thumb1_gen_const_int are used by
arm_thumb1_mi_thunk.
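
The byte-at-a-time construction visible in the sequences above can be sketched as follows. This is a simplified illustration, not GCC's thumb1_gen_const_int_1: the names `emit_with_shift` and `build_const` are hypothetical, the target register is fixed to r3, only non-negative 32-bit values are handled (no rsbs path), and the output is not guaranteed to match GCC's for every value.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

using Insns = std::vector<std::string>;

// Emit movs/lsls/adds for `value`, treating it as (value >> shift) << shift.
// Precondition: value != 0 and the low `shift` bits of value are zero.
static Insns emit_with_shift(std::uint32_t value, unsigned shift) {
  assert(value != 0 && shift < 32);
  const std::uint32_t w = value >> shift;
  int top = 3;  // index of the most significant non-zero byte of w
  while (((w >> (8 * top)) & 0xffu) == 0)
    --top;
  Insns insns{"movs r3, #" + std::to_string((w >> (8 * top)) & 0xffu)};
  unsigned pending = 0;  // shift amount accumulated but not yet emitted
  for (int i = top - 1; i >= 0; --i) {
    pending += 8;
    const std::uint32_t byte = (w >> (8 * i)) & 0xffu;
    if (byte != 0) {
      insns.push_back("lsls r3, #" + std::to_string(pending));
      insns.push_back("adds r3, #" + std::to_string(byte));
      pending = 0;
    }
  }
  pending += shift;
  if (pending != 0)
    insns.push_back("lsls r3, #" + std::to_string(pending));
  return insns;
}

// Try every final shift that only drops zero bits and keep the shortest
// sequence, preferring the larger shift on ties.
Insns build_const(std::uint32_t value) {
  if (value == 0)
    return Insns{"movs r3, #0"};
  Insns best = emit_with_shift(value, 0);
  for (unsigned s = 1; s < 32 && ((value >> (s - 1)) & 1u) == 0; ++s) {
    Insns candidate = emit_with_shift(value, s);
    if (candidate.size() <= best.size())
      best = candidate;
  }
  return best;
}
```

Returning the instructions as strings keeps the sketch independent of any emission backend; the commit itself shares one algorithm between RTL and asm emission through a template and two emitter classes.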

2020-11-02  Christophe Lyon  <christophe.lyon@linaro.org>

gcc/
* config/arm/arm.c (thumb1_const_rtl, thumb1_const_print): New
classes.
(thumb1_gen_const_int): Rename to ...
(thumb1_gen_const_int_1): ... New helper function. Add capability
to emit either RTL or asm, improve generated code.
(thumb1_gen_const_int_rtl): New function.
* config/arm/arm-protos.h (thumb1_gen_const_int): Rename to
thumb1_gen_const_int_rtl.
* config/arm/thumb1.md: Call thumb1_gen_const_int_rtl instead
of thumb1_gen_const_int.

gcc/testsuite/
* gcc.target/arm/pure-code/no-literal-pool-m0.c: New.

4 years agoSimplify and enhance 'libgomp.oacc-c-c++-common/pr85486*.c' [PR85486]
Thomas Schwinge [Wed, 28 Oct 2020 09:56:20 +0000 (10:56 +0100)]
Simplify and enhance 'libgomp.oacc-c-c++-common/pr85486*.c' [PR85486]

Avoid code duplication, and better test what we expect to happen.

libgomp/
PR target/85486
* testsuite/libgomp.oacc-c-c++-common/pr85486-2.c: Simplify and enhance.
* testsuite/libgomp.oacc-c-c++-common/pr85486-3.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/pr85486.c: Likewise.

4 years agoFurther improve Fortran column location information [PR92793]
Thomas Schwinge [Fri, 30 Oct 2020 12:13:51 +0000 (13:13 +0100)]
Further improve Fortran column location information [PR92793]

Building on top of commit 9c81750c5bedd7883182ee2684a012c6210ebe1d "Fortran] PR
92793 - fix column used for error diagnostic", there is another place where we
have to use 'gfc_get_location' returning column-corrected locations.

For example, this improves column location information for OMP constructs.

gcc/fortran/
PR fortran/92793
* trans.c (gfc_set_backend_locus): Use 'gfc_get_location'.
(gfc_restore_backend_locus): Adjust.
gcc/testsuite/
PR fortran/92793
* gfortran.dg/goacc/pr92793-1.f90: Adjust.

4 years agolibgomp testsuite: tell warning from error diagnostics, etc. [PR80219, PR85303]
Thomas Schwinge [Thu, 29 Oct 2020 09:29:19 +0000 (10:29 +0100)]
libgomp testsuite: tell warning from error diagnostics, etc. [PR80219, PR85303]

This change makes 'dg-warning', 'dg-error', 'dg-bogus', 'dg-message' behave as
expected, and also enables use of relative line numbers as well as 'dg-line'.

libgomp/
PR testsuite/80219
PR testsuite/85303
* testsuite/lib/libgomp.exp (libgomp_init): Set
'gcc_warning_prefix', 'gcc_error_prefix'.

4 years agoFortran: OpenMP - fixes for omp atomic [PR97655]
Tobias Burnus [Mon, 2 Nov 2020 12:07:17 +0000 (13:07 +0100)]
Fortran: OpenMP - fixes for omp atomic [PR97655]

gcc/fortran/ChangeLog:

PR fortran/97655
* openmp.c (gfc_match_omp_atomic): Fix mem-order handling;
reject specifying update + capture together.

gcc/testsuite/ChangeLog:

PR fortran/97655
* gfortran.dg/gomp/atomic.f90: Update tree-dump counts; move
invalid OMP 5.0 code to ...
* gfortran.dg/gomp/atomic-2.f90: ... here; update dg-error.
* gfortran.dg/gomp/requires-9.f90: Update tree dump scan.

4 years agotree-optimization/97558 - compute vectype for SLP nested cycles
Richard Biener [Mon, 2 Nov 2020 10:09:56 +0000 (11:09 +0100)]
tree-optimization/97558 - compute vectype for SLP nested cycles

This makes sure to compute the vector type for invariant SLP children
of nested cycles.

2020-11-02  Richard Biener  <rguenther@suse.de>

PR tree-optimization/97558
* tree-vect-loop.c (vectorizable_reduction): For nested SLP
cycles compute invariant operands vector type.

* gcc.dg/vect/pr97558-2.c: New testcase.

4 years agoAdd test for PR97505.
Aldy Hernandez [Mon, 2 Nov 2020 10:34:47 +0000 (11:34 +0100)]
Add test for PR97505.

gcc/testsuite/ChangeLog:

PR tree-optimization/97505
* gcc.dg/pr97505.c: New test.

4 years agotree-optimization/97558 - avoid SLP analyzing irrelevant stmts
Richard Biener [Mon, 2 Nov 2020 08:38:09 +0000 (09:38 +0100)]
tree-optimization/97558 - avoid SLP analyzing irrelevant stmts

This avoids analyzing reductions that are not relevant (thus dead)
which eventually will lead to crashes because the participating
stmts meta is not analyzed.  For this to work the patch also
properly removes reduction groups that are not uniformly recognized
as patterns.

2020-11-02  Richard Biener  <rguenther@suse.de>

PR tree-optimization/97558
* tree-vect-loop.c (vect_fixup_scalar_cycles_with_patterns):
Check for any mismatch in pattern vs. non-pattern and dissolve
the group if there is one.
* tree-vect-slp.c (vect_analyze_slp_instance): Avoid
analyzing not relevant reductions.
(vect_analyze_slp): Avoid analyzing not relevant reduction
groups.

* gcc.dg/vect/pr97558.c: New testcase.

4 years agotree-optimization/97650 - fix ICE in vect_get_and_check_slp_defs
Richard Biener [Mon, 2 Nov 2020 07:59:02 +0000 (08:59 +0100)]
tree-optimization/97650 - fix ICE in vect_get_and_check_slp_defs

I was mistaken to treat vect_external_def as only applying to
SSA_NAME defs, so check for that.

2020-11-02  Richard Biener  <rguenther@suse.de>

PR tree-optimization/97650
* tree-vect-slp.c (vect_get_and_check_slp_defs): Check
for SSA_NAME before checking SSA_NAME_IS_DEFAULT_DEF.

* gcc.dg/vect/bb-slp-pr97650.c: New testcase.

4 years agoRISC-V: Check multiletter extension has more than 1 letter
Kito Cheng [Thu, 20 Aug 2020 09:19:41 +0000 (17:19 +0800)]
RISC-V: Check multiletter extension has more than 1 letter

gcc/ChangeLog:

* common/config/riscv/riscv-common.c
(riscv_subset_list::parse_multiletter_ext): Check that a multiletter
extension has more than 1 letter.

gcc/testsuite/ChangeLog

* gcc.target/riscv/arch-7.c: New.
* gcc.target/riscv/attribute-10.c: Update test arch string.

4 years agoRISC-V: Add configure option: --with-multilib-generator to flexible config multi...
Kito Cheng [Fri, 19 Jun 2020 07:36:23 +0000 (00:36 -0700)]
RISC-V: Add configure option: --with-multilib-generator to flexible config multi-lib settings.

 - Able to configure complex multi-lib rules at configure time, without
   modifying any in-tree source.

 - I considered implementing this in the `--with-multilib-list` option,
   but I am not sure who is using that with riscv*-*-elf*, so I decided to
   use another option name for it.

 - --with-multilib-generator passes its arguments to multilib-generator, and
   then uses the generated multi-lib config file to build the toolchain.

   e.g. Build riscv gcc, default arch/abi is rv64gc/lp64, and build multilib
       for rv32imafd/ilp32 and rv32i/ilp32; rv32ic/ilp32 will reuse
       rv32i/ilp32.
    $ <GCC-SRC>/configure \
       --target=riscv64-elf \
       --with-arch=rv64gc --with-abi=lp64 \
       --with-multilib-generator="rv32i-ilp32--c;rv32imafd-ilp32--"

V3 Changes:

 - Rename --with-multilib-config to --with-multilib-generator
 - Check that --with-multilib-generator and --with-multilib-list are not
   used at the same time.

V2 Changes:

 - Fix --with-multilib-config handling on non riscv*-*-elf* triple.

gcc/ChangeLog:

* config.gcc (riscv*-*-*): Handle --with-multilib-generator.
* configure: Regen.
* configure.ac: Add --with-multilib-generator.
* config/riscv/multilib-generator: Exit when parsing arch string error.
* config/riscv/t-withmultilib-generator: New.
* doc/install.texi: Document --with-multilib-generator.

4 years agoarm: Improve handling of relocations with small offsets with -mpure-code on v6m ...
Christophe Lyon [Mon, 2 Nov 2020 07:34:50 +0000 (07:34 +0000)]
arm: Improve handling of relocations with small offsets with -mpure-code on v6m (PR96770)

With -mpure-code on v6m (thumb-1), we can use small offsets with
upper/lower relocations to avoid the extra addition of the
offset.

This patch accepts expressions symbol+offset as legitimate constants
when the literal pool is disabled, making sure that the offset is
within the range supported by thumb-1 [0..255] as described in the
AAELF32 documentation.

It also makes sure that thumb1_movsi_insn emits an error in case we
try to use it with an unsupported RTL construct.

2020-09-28  Christophe Lyon  <christophe.lyon@linaro.org>

gcc/
PR target/96770
* config/arm/arm.c (thumb_legitimate_constant_p): Accept
(symbol_ref + addend) when literal pool is disabled.
(arm_valid_symbolic_address_p): Add support for thumb-1 without
MOVT/MOVW.
* config/arm/thumb1.md (*thumb1_movsi_insn): Accept (symbol_ref +
addend) in the pure-code alternative.

gcc/testsuite/
PR target/96770
* gcc.target/arm/pure-code/pr96770.c: New test.

4 years agoarm: Avoid indirection with -mpure-code on v6m (PR96967)
Christophe Lyon [Mon, 2 Nov 2020 07:31:22 +0000 (07:31 +0000)]
arm: Avoid indirection with -mpure-code on v6m (PR96967)

With -mpure-code on v6m (thumb-1), to avoid a useless indirection when
building the address of a symbol, we want to consider SYMBOL_REF as a
legitimate constant. This way, we build the address using a series of
upper/lower relocations instead of loading the address from memory.

This patch also fixes a missing "clob" conds attribute for
thumb1_movsi_insn, needed because that alternative clobbers the flags.

2020-11-02  Christophe Lyon  <christophe.lyon@linaro.org>

gcc/
PR target/96967
* config/arm/arm.c (thumb_legitimate_constant_p): Add support for
disabled literal pool in thumb-1.
* config/arm/thumb1.md (thumb1_movsi_symbol_ref): Remove.
(*thumb1_movsi_insn): Add support for SYMBOL_REF with -mpure-code.

gcc/testsuite
PR target/96967
* gcc.target/arm/pure-code/pr96767.c: New test.

4 years agoDarwin: Adjust the PCH area to allow for 16384byte page size.
Iain Sandoe [Sat, 8 Aug 2020 11:15:09 +0000 (12:15 +0100)]
Darwin: Adjust the PCH area to allow for 16384byte page size.

Newer versions of Darwin report pagesize 20 which means that we
need to adjust the alignment of the PCH area.

gcc/ChangeLog:

* config/host-darwin.c: Align pch_address_space to 16384.

4 years agoObjective-C : Implement SEL as a built-in typedef.
Iain Sandoe [Sat, 24 Oct 2020 08:48:44 +0000 (09:48 +0100)]
Objective-C : Implement SEL as a built-in typedef.

The reference implementation for Objective-C provides the SEL
typedef (although it is also available from <objc/objc.h>).

gcc/objc/ChangeLog:

* objc-act.c (synth_module_prologue): Get the SEL identifier.
* objc-act.h (enum objc_tree_index): Add OCTI_SEL_NAME.
(objc_selector_name): New.
(SEL_TYPEDEF_NAME): New.
* objc-gnu-runtime-abi-01.c
(gnu_runtime_01_initialize): Initialize SEL typedef.
* objc-next-runtime-abi-01.c
(next_runtime_01_initialize): Likewise.
* objc-next-runtime-abi-02.c

gcc/testsuite/ChangeLog:

* obj-c++.dg/SEL-typedef.mm: New test.
* objc.dg/SEL-typedef.m: New test.

4 years agoObjective-C/C++ : Improve '@' keyword locations.
Iain Sandoe [Fri, 30 Oct 2020 19:06:58 +0000 (19:06 +0000)]
Objective-C/C++ : Improve '@' keyword locations.

When we are lexing tokens for Objective-C, we combine '@' tokens
with a following keyword (when that keyword is a valid Objective-C
one or, for Objective-C++, one of the C++ keywords that can appear in
this position).  The responsibility is passed on to the parser to
validate the resulting combination.

The combination of tokens was being done without applying the rule
to their locations - so that we get:

@property
^

instead of what the user might expect:

@property
^~~~~~~~~

This patch combines the source range of the keyword with that of the
'@' sign - which improves diagnostics.
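
The range combination can be illustrated with a small sketch. The `Range` struct and the `combine` and `underline` helpers are hypothetical and use 0-based inclusive columns; GCC's own location machinery is not shown here.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a source range; columns are 0-based, inclusive.
struct Range {
  unsigned start;
  unsigned finish;
};

// Range covering two adjacent tokens: first token's start to second's finish.
inline Range combine(const Range &first, const Range &second) {
  assert(first.start <= second.finish);
  return Range{first.start, second.finish};
}

// Render the diagnostic underline: a caret at the start column, then tildes
// up to and including the finish column.
inline std::string underline(const Range &r) {
  return std::string(r.start, ' ') + '^' + std::string(r.finish - r.start, '~');
}
```

For `@property` starting at column 0, the '@' token alone yields a bare caret, while the combined range underlines the whole keyword.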

gcc/c-family/ChangeLog:

* c-lex.c (c_lex_with_flags): When combining '@' with a
keyword for Objective-C, combine the location ranges too.

4 years agoObjective-C++ : Address a FIXME.
Iain Sandoe [Fri, 30 Oct 2020 19:24:07 +0000 (19:24 +0000)]
Objective-C++ : Address a FIXME.

We can avoid the spurious additional complaint about a closing
')' by short-circuiting the test in the case we know there's a
syntax error already reported.

gcc/cp/ChangeLog:

* parser.c (cp_parser_objc_at_property_declaration): Use any
existing syntax error to suppress complaints about a missing
closing parenthesis in parsing property attributes.

gcc/testsuite/ChangeLog:

* obj-c++.dg/property/at-property-1.mm: Adjust test after
fixing spurious error output.

4 years agoi386: Set the stack usage to 0 for naked functions
Pat Bernardi [Sun, 1 Nov 2020 17:51:08 +0000 (18:51 +0100)]
i386: Set the stack usage to 0 for naked functions

gcc/ChangeLog

* config/i386/i386.c (ix86_expand_prologue): Set the stack usage to 0
for naked functions.

4 years agoipa: Fix segmentation fault in function_summary<clone_info*>::get(cgraph_node*)
Iain Buclaw [Sun, 1 Nov 2020 15:39:10 +0000 (16:39 +0100)]
ipa: Fix segmentation fault in function_summary<clone_info*>::get(cgraph_node*)

PR 97660 occurs when cgraph_node::get returns NULL, and this NULL
cgraph_node is then passed to clone_info::get.  As the original assert
prior to the regressing change in r11-4587 allowed for the cgraph_node
to be NULL, clone_info::get is now only called when cgraph_node::get
returns a nonnull value.

gcc/ChangeLog:

PR ipa/97660
* cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Don't call
clone_info::get when cgraph_node::get returns NULL.

4 years agotestsuite, X86 : Add target requires masm_intel to three tests.
Iain Sandoe [Sun, 1 Nov 2020 16:27:54 +0000 (16:27 +0000)]
testsuite, X86 : Add target requires masm_intel to three tests.

These tests currently fail on targets without Intel assembler support.

gcc/testsuite/ChangeLog:

* gcc.target/i386/amxbf16-asmintel-1.c: Require masm_intel.
* gcc.target/i386/amxint8-asmintel-1.c: Likewise.
* gcc.target/i386/amxtile-asmintel-1.c: Likewise.

4 years agolibstdc++: Define type traits for wchar_t even when libc support missing
Jonathan Wakely [Sun, 1 Nov 2020 10:56:36 +0000 (10:56 +0000)]
libstdc++: Define type traits for wchar_t even when libc support missing

This meets the requirement that std::is_integral_v<wchar_t> is true,
even when full library support for wchar_t via specializations of
char_traits etc. is not provided. This is done by checking
__WCHAR_TYPE__ to see if the compiler knows about the type, rather than
checking the library's own _GLIBCXX_USE_WCHAR_T autoconf macro.

This assumes that the C++ compiler correctly defines wchar_t as a
distinct type, not a typedef for one of the other integral types. This
is always true for G++ and should be true for any supported non-GNU
compilers.

Similarly, the std::make_unsigned and std::make_signed traits and the
internal helpers std::__is_integer and std::__is_char are also changed
to depend on the same macro.
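
A minimal sketch of the idea, using a hypothetical trait `is_integral_sketch` rather than the real libstdc++ definitions. The wchar_t specialization is guarded by the compiler-provided `__WCHAR_TYPE__` macro instead of a library configuration macro.

```cpp
#include <type_traits>

// Hypothetical trait, not libstdc++'s actual std::is_integral implementation.
template <typename T> struct is_integral_sketch : std::false_type {};
template <> struct is_integral_sketch<int> : std::true_type {};
template <> struct is_integral_sketch<char> : std::true_type {};

// Key off the compiler's predefined macro, so the specialization exists
// whenever the compiler knows the type, regardless of libc support.
#ifdef __WCHAR_TYPE__
template <> struct is_integral_sketch<wchar_t> : std::true_type {};
#endif
```

The specialization is only valid because wchar_t is a distinct type in C++; if it were a typedef for int, it would collide with the `int` specialization above.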

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_integral<wchar_t>)
(make_unsigned<wchar_t>, make_signed<wchar_t>): Define based
on #ifdef __WCHAR_TYPE__ instead of _GLIBCXX_USE_WCHAR_T.
* include/bits/cpp_type_traits.h (__is_integer<wchar_t>)
(__is_char<wchar_t>): Likewise.

4 years agolibstdc++: Fix gnu-version-namespace build
François Dumont [Fri, 30 Oct 2020 12:11:49 +0000 (13:11 +0100)]
libstdc++: Fix gnu-version-namespace build

Co-authored-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog

* src/c++17/floating_from_chars.cc (_GLIBCXX_USE_CXX11_ABI): Add define.
(buffering_string): New.
[!_GLIBCXX_USE_CXX11_ABI](reserve_string): New.
(from_chars): Adapt.
* src/c++20/sstream-inst.cc: Limit instantiations to
_GLIBCXX_USE_CXX11_ABI.

4 years agolibstdc++: Prefer double to long double in std::shuffle_order_engine
Jonathan Wakely [Sat, 31 Oct 2020 07:16:47 +0000 (07:16 +0000)]
libstdc++: Prefer double to long double in std::shuffle_order_engine

The transition algorithm for std::shuffle_order_engine uses long double
to ensure that the value (max() - min() + 1) can be accurately
represented, to avoid bias in the shuffling. However, when the base
engine's range is small enough we can avoid slower long double
arithmetic by using double. For example, long double is unnecessary for
any base engine returning 32-bit values.

This makes std::knuth_b::operator() about 15% faster on x86_64, and
probably even more on targets where long double uses soft-float.
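
The representability test can be sketched as below. The helper `plus_one_fits_in_double` is hypothetical and deliberately conservative (it may return false for some values that are in fact representable); it is not the libstdc++ code. Here `x` stands for max() - min().

```cpp
#include <cstdint>
#include <limits>

// Hypothetical, conservative check: is x + 1 exactly representable as a
// double?  Returns true only for cases that are certainly exact.
constexpr bool plus_one_fits_in_double(std::uint64_t x) {
  constexpr int mantissa_bits = std::numeric_limits<double>::digits;
  static_assert(mantissa_bits < 64, "sketch assumes a mantissa under 64 bits");
  // x < 2^mantissa_bits, so x + 1 <= 2^mantissa_bits, which is exact.
  if ((x >> mantissa_bits) == 0)
    return true;
  // x is all ones in its low bits, so x + 1 is a power of two (or wraps to
  // 2^64), and powers of two in this range are exact.
  return (x & (x + 1)) == 0;
}
```

An engine returning 32-bit values has x = 0xFFFFFFFF, which passes the first test, so the cheaper double arithmetic is safe in that case.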

libstdc++-v3/ChangeLog:

* include/bits/random.h (independent_bit_engine): Fix typo
in comment.
(shuffle_order_engine): Fix incorrect description in comment.
* include/bits/random.tcc (__representable_as_double)
(__p1_representable_as_double): New helper functions.
(shuffle_order_engine::operator()): Use double for calculation
if (max() - min() + 1) is representable as double.
* testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error
line number.

4 years agoMove clone_info to summary
Jan Hubicka [Sat, 31 Oct 2020 09:18:06 +0000 (10:18 +0100)]
Move clone_info to summary

* Makefile.in (OBJS): Add symtab-clones.o.
(GTFILES): Add symtab-clones.h.
* cgraph.c: Include symtab-clones.h.
(cgraph_edge::resolve_speculation): Fix formatting.
(cgraph_edge::redirect_call_stmt_to_callee): Update.
(cgraph_update_edges_for_call_stmt): Update.
(release_function_body): Fix formatting.
(cgraph_node::remove): Fix formatting.
(cgraph_node::dump): Fix formatting.
(cgraph_node::get_availability): Fix formatting.
(cgraph_node::call_for_symbol_thunks_and_aliases): Fix formatting.
(set_const_flag_1): Fix formatting.
(set_pure_flag_1): Fix formatting.
(cgraph_node::can_remove_if_no_direct_calls_p): Fix formatting.
(collect_callers_of_node_1): Fix formating.
(clone_of_p): Update.
(cgraph_node::verify_node): Update.
(cgraph_c_finalize): Call clone_info::release ().
* cgraph.h (struct cgraph_clone_info): Move to symtab-clones.h.
(cgraph_node): Remove clone_info.
(symbol_table): Add m_clones.
* cgraphclones.c: Include symtab-clones.h.
(duplicate_thunk_for_node): Update.
(cgraph_node::create_clone): Update.
(cgraph_node::create_virtual_clone): Update.
(cgraph_node::find_replacement): Update.
(cgraph_node::materialize_clone): Update.
* gengtype.c (open_base_files): Include symtab-clones.h.
* ipa-cp.c: Include symtab-clones.h.
(initialize_node_lattices): Update.
(want_remove_some_param_p): Update.
(create_specialized_node): Update.
* ipa-fnsummary.c: Include symtab-clones.h.
(ipa_fn_summary_t::duplicate): Update.
* ipa-modref.c: Include symtab-clones.h.
(update_signature): Update.
* ipa-param-manipulation.c: Include symtab-clones.h.
(ipa_param_body_adjustments::common_initialization): Update.
* ipa-prop.c: Include symtab-clones.h.
(adjust_agg_replacement_values): Update.
(ipcp_get_parm_bits): Update.
(ipcp_update_bits): Update.
(ipcp_update_vr): Update.
* ipa-sra.c: Include symtab-clones.h.
(process_isra_node_results): Update.
(disable_unavailable_parameters): Update.
* lto-cgraph.c: Include symtab-clones.h.
(output_cgraph_opt_summary_p): Update.
(output_node_opt_summary): Update.
(input_node_opt_summary): Update.
* symtab-clones.cc: New file.
* symtab-clones.h: New file.
* tree-inline.c (expand_call_inline): Update.
(update_clone_info): Update.
(tree_function_versioning): Update.

4 years agoHandle fnspec in local ipa-modref
Jan Hubicka [Sat, 31 Oct 2020 07:56:40 +0000 (08:56 +0100)]
Handle fnspec in local ipa-modref

* ipa-modref.c (modref_summary::dump): Dump writes_errno.
(parm_map_for_arg): Break out from ...
(merge_call_side_effects): ... here.
(get_access_for_fnspec): New function.
(process_fnspec): New function.
(analyze_call): Use it.
(analyze_stmt): Update.
(analyze_function): Initialize writes_errno.
(modref_summaries::duplicate): Duplicate writes_errno.
* ipa-modref.h (struct modref_summary): Add writes_errno.
* tree-ssa-alias.c (call_may_clobber_ref_p_1): Check errno.

4 years agolibstdc++: Use double for unordered container load factors [PR 96958]
Jonathan Wakely [Sat, 31 Oct 2020 00:52:57 +0000 (00:52 +0000)]
libstdc++: Use double for unordered container load factors [PR 96958]

My previous commit for this PR changed the types from long double to
double, but didn't change the uses of __builtin_ceill and
__builtin_floorl. It also failed to change the non-inline functions in
src/c++11/hashtable_c++0x.cc. This should fix it properly now.

libstdc++-v3/ChangeLog:

PR libstdc++/96958
* include/bits/hashtable_policy.h (_Prime_rehash_policy)
(_Power2_rehash_policy): Use ceil and floor instead of ceill and
floorl.
* src/c++11/hashtable_c++0x.cc (_Prime_rehash_policy): Likewise.
Use double instead of long double.

4 years agolibstdc++: Don't initialize from *this inside some views [PR97600]
Patrick Palka [Sat, 31 Oct 2020 00:33:19 +0000 (20:33 -0400)]
libstdc++: Don't initialize from *this inside some views [PR97600]

This works around a subtle issue where instantiating the begin()/end()
member of some views (as part of return type deduction) inadvertently
requires computing the satisfaction value of range<foo_view>.

This is problematic because the constraint range<foo_view> requires the
begin()/end() member to be callable.  But it's not callable until we've
deduced its return type, so evaluation of range<foo_view> yields false
at this point.  And if after both members are instantiated (and their
return types deduced) we evaluate range<foo_view> again, this time it
will yield true since the begin()/end() members are now both callable.
This makes the program ill-formed according to [temp.constr.atomic]/3:

  If, at different points in the program, the satisfaction result is
  different for identical atomic constraints and template arguments, the
  program is ill-formed, no diagnostic required.

The views affected by this issue are those whose begin()/end() member
has a placeholder return type and that member initializes an _Iterator
or _Sentinel object from a reference to *this.  The second condition is
relevant because it means explicit conversion functions are considered
during overload resolution (as per [over.match.copy], I think), and
therefore it causes g++ to check the constraints of the conversion
function view_interface<foo_view>::operator bool().  And this conversion
function's constraints indirectly require range<foo_view>.

This issue is observable on trunk only with basic_istream_view (as in
the testcase in the PR).  But a pending patch that makes g++ memoize
constraint satisfaction values indefinitely (it currently invalidates
the satisfaction cache on various events) causes many existing tests for
the other affected views to fail, because range<foo_view> then remains
false for the whole compilation.

This patch works around this issue by adjusting the constructors of the
_Iterator and _Sentinel types of the affected views to take their
foo_view argument by pointer instead of by reference, so that g++ no
longer considers explicit conversion functions when resolving the
direct-initialization inside these views' begin()/end() members.

libstdc++-v3/ChangeLog:

PR libstdc++/97600
* include/std/ranges (basic_istream_view::begin): Initialize
_Iterator from 'this' instead of '*this'.
(basic_istream_view::_Iterator::_Iterator): Adjust constructor
accordingly.
(filter_view::_Iterator::_Iterator): Take a filter_view*
argument instead of a filter_view& argument.
(filter_view::_Sentinel::_Sentinel): Likewise.
(filter_view::begin): Initialize _Iterator from 'this' instead
of '*this'.
(filter_view::end): Likewise.
(transform_view::_Iterator::_Iterator): Take a _Parent* instead
of a _Parent&.
(transform_view::_Iterator::operator+): Adjust accordingly.
(transform_view::_Iterator::operator-): Likewise.
(transform_view::begin): Initialize _Iterator from 'this' instead
of '*this'.
(transform_view::end): Likewise.
(join_view::_Iterator): Take a _Parent* instead of a _Parent&.
(join_view::_Sentinel): Likewise.
(join_view::begin): Initialize _Iterator from 'this' instead of
'*this'.
(join_view::end): Initialize _Sentinel from 'this' instead of
'*this'.
(split_view::_OuterIter): Take a _Parent* instead of a _Parent&.
(split_view::begin): Initialize _OuterIter from 'this' instead
of '*this'.
(split_view::end): Likewise.
* testsuite/std/ranges/97600.cc: New test.
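
The shape of the workaround described above can be sketched as follows.
This is a minimal illustration only: toy_view and its members are
hypothetical names, not libstdc++ code.  The iterator's constructor takes
its parent by pointer, and begin()/end() (which have placeholder return
types) pass 'this'.  Since the argument is a pointer rather than a class
object, the view's conversion functions play no part in resolving the
constructor call.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a view; not the libstdc++ implementation.
struct toy_view
{
  int data[3] = {1, 2, 3};

  struct iterator
  {
    toy_view* parent;
    std::size_t index;

    // Takes the parent by pointer rather than by reference.
    iterator(toy_view* p, std::size_t i) : parent(p), index(i) { }

    int& operator*() const { return parent->data[index]; }
  };

  // An explicit conversion function, similar in spirit to
  // view_interface<foo_view>::operator bool().
  explicit operator bool() const { return true; }

  // Placeholder return types; the iterator is initialized from 'this'.
  auto begin() { return iterator(this, 0); }
  auto end() { return iterator(this, 3); }
};
```

With this layout, *v.begin() on a toy_view v refers to v.data[0], and the
iterator keeps a pointer back to v.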

4 years agolibstdc++: Implement P2017R1 "Conditionally borrowed ranges"
Jonathan Wakely [Fri, 30 Oct 2020 18:39:43 +0000 (18:39 +0000)]
libstdc++: Implement P2017R1 "Conditionally borrowed ranges"

This makes some range adaptors model the borrowed_range concept if they
are adapting a borrowed range. This hasn't been added to the C++23
working paper yet, but it has been approved by LWG, and the
recommendation is to treat it as a defect report for C++20 as well.

libstdc++-v3/ChangeLog:

* include/std/ranges (enable_borrowed_range<take_view<T>>)
(enable_borrowed_range<drop_view<T>>)
(enable_borrowed_range<drop_while_view<T>>)
(enable_borrowed_range<reverse_view<T>>)
(enable_borrowed_range<common_view<T>>)
(enable_borrowed_range<elements_view<T>>): Add partial
specializations as per P2017R1.
* testsuite/std/ranges/adaptors/conditionally_borrowed.cc:
New test.

4 years agoPowerPC: Don't assume all targets have GLIBC.
Michael Meissner [Fri, 30 Oct 2020 22:36:25 +0000 (18:36 -0400)]
PowerPC: Don't assume all targets have GLIBC.

gcc/
2020-10-30  Michael Meissner  <meissner@linux.ibm.com>

* config/rs6000/rs6000.c (glibc_supports_ieee_128bit): New helper
function.
(rs6000_option_override_internal): Call it.

4 years agolibstdc++: Use double for unordered container load factors [PR 96958]
Jonathan Wakely [Fri, 30 Oct 2020 15:14:33 +0000 (15:14 +0000)]
libstdc++: Use double for unordered container load factors [PR 96958]

These calculations were changed to use long double nearly ten years ago
in order to get more precision than float:
https://gcc.gnu.org/pipermail/libstdc++/2011-September/036420.html

However, double should be sufficient, while being potentially faster
than long double, and not requiring soft FP calculations for targets
without native long double support.

libstdc++-v3/ChangeLog:

PR libstdc++/96958
* include/bits/hashtable_policy.h (_Prime_rehash_policy)
(_Power2_rehash_policy): Use double instead of long double.
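
As a rough illustration of the kind of calculation involved, the sketch
below computes a bucket count from an element count and a maximum load
factor in double, rounding up with the double overload of ceil.  The
helper min_buckets_for is a hypothetical name and is not the code in
hashtable_policy.h.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper: smallest bucket count that keeps
// n_elements / n_buckets at or below max_load_factor.
inline std::size_t
min_buckets_for(std::size_t n_elements, float max_load_factor)
{
  // Compute in double (not long double) and round up with the double
  // overload of ceil rather than ceill.
  const double needed = std::ceil(n_elements / double(max_load_factor));
  return static_cast<std::size_t>(needed);
}
```

For example, 7 elements with a maximum load factor of 2.0 need 3.5
buckets, which rounds up to 4.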

4 years agolibstdc++: Fix some more warnings in test
Jonathan Wakely [Fri, 30 Oct 2020 10:47:25 +0000 (10:47 +0000)]
libstdc++: Fix some more warnings in test

libstdc++-v3/ChangeLog:

* testsuite/23_containers/vector/bool/modifiers/insert/31370.cc:
Avoid -Wcatch-value warnings.

4 years agoPR libfortran/97581 - clean up size calculation of random generator state
Harald Anlauf [Fri, 30 Oct 2020 19:49:32 +0000 (20:49 +0100)]
PR libfortran/97581 - clean up size calculation of random generator state

The random number generator internal state may be saved to/restored from
an array of integers.  Clean up calculation of needed number of elements
to avoid redefinition of auxiliary macro SZ.

libgfortran/ChangeLog:

* intrinsics/random.c (SZ_IN_INT_4): Define size of state in int32_t.
(SZ_IN_INT_8): Define size of state in int64_t.
(SZ): Remove.
(random_seed_i4): Use size SZ_IN_INT_4 instead of SZ.
(random_seed_i8): Use size SZ_IN_INT_8 instead of SZ.
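
The idea of one size constant per integer kind can be sketched as below.
The state layout (four 64-bit words) and all names are assumptions made
for illustration; the real definitions live in intrinsics/random.c.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical generator state: four 64-bit words (an assumption).
struct prng_state_sketch
{
  std::uint64_t s[4];
};

// Number of array elements needed to hold the state, with one constant
// per integer kind instead of a single macro that gets redefined.
constexpr std::size_t sz_in_int_4
  = sizeof(prng_state_sketch::s) / sizeof(std::int32_t);
constexpr std::size_t sz_in_int_8
  = sizeof(prng_state_sketch::s) / sizeof(std::int64_t);

static_assert(sz_in_int_4 == 2 * sz_in_int_8,
              "the 32-bit view needs twice as many elements");
```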

4 years agoAdd -fzero-call-used-regs option and zero_call_used_regs function attributes.
qing zhao [Fri, 30 Oct 2020 19:41:38 +0000 (20:41 +0100)]
Add -fzero-call-used-regs option and zero_call_used_regs function attributes.

This new feature causes the compiler to zero a subset of all call-used
registers at function return.  This is used to increase program security
by either mitigating Return-Oriented Programming (ROP) attacks or
preventing information leakage through registers.

gcc/ChangeLog:

2020-10-30  Qing Zhao  <qing.zhao@oracle.com>
    H.J.Lu  <hjl.tools@gmail.com>

* common.opt: Add new option -fzero-call-used-regs
* config/i386/i386.c (zero_call_used_regno_p): New function.
(zero_call_used_regno_mode): Likewise.
(zero_all_vector_registers): Likewise.
(zero_all_st_registers): Likewise.
(zero_all_mm_registers): Likewise.
(ix86_zero_call_used_regs): Likewise.
(TARGET_ZERO_CALL_USED_REGS): Define.
* df-scan.c (df_epilogue_uses_p): New function.
(df_get_exit_block_use_set): Replace EPILOGUE_USES with
df_epilogue_uses_p.
* df.h (df_epilogue_uses_p): Declare.
* doc/extend.texi: Document the new zero_call_used_regs attribute.
* doc/invoke.texi: Document the new -fzero-call-used-regs option.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (TARGET_ZERO_CALL_USED_REGS): New hook.
* emit-rtl.h (struct rtl_data): New field must_be_zero_on_return.
* flag-types.h (namespace zero_regs_flags): New namespace.
* function.c (gen_call_used_regs_seq): New function.
(class pass_zero_call_used_regs): New class.
(pass_zero_call_used_regs::execute): New function.
(make_pass_zero_call_used_regs): New function.
* optabs.c (expand_asm_reg_clobber_mem_blockage): New function.
* optabs.h (expand_asm_reg_clobber_mem_blockage): Declare.
* opts.c (zero_call_used_regs_opts): New structure array
initialization.
(parse_zero_call_used_regs_options): New function.
(common_handle_option): Handle -fzero-call-used-regs.
* opts.h (zero_call_used_regs_opts): New structure array.
* passes.def: Add new pass pass_zero_call_used_regs.
* recog.c (valid_insn_p): New function.
* recog.h (valid_insn_p): Declare.
* resource.c (init_resource_info): Replace EPILOGUE_USES with
df_epilogue_uses_p.
* target.def (zero_call_used_regs): New hook.
* targhooks.c (default_zero_call_used_regs): New function.
* targhooks.h (default_zero_call_used_regs): Declare.
* tree-pass.h (make_pass_zero_call_used_regs): Declare.

gcc/c-family/ChangeLog:

2020-10-30  Qing Zhao  <qing.zhao@oracle.com>
    H.J.Lu  <hjl.tools@gmail.com>

* c-attribs.c (c_common_attribute_table): Add new attribute
zero_call_used_regs.
(handle_zero_call_used_regs_attribute): New function.

gcc/testsuite/ChangeLog:

2020-10-30  Qing Zhao  <qing.zhao@oracle.com>
    H.J.Lu  <hjl.tools@gmail.com>

* c-c++-common/zero-scratch-regs-1.c: New test.
* c-c++-common/zero-scratch-regs-10.c: New test.
* c-c++-common/zero-scratch-regs-11.c: New test.
* c-c++-common/zero-scratch-regs-2.c: New test.
* c-c++-common/zero-scratch-regs-3.c: New test.
* c-c++-common/zero-scratch-regs-4.c: New test.
* c-c++-common/zero-scratch-regs-5.c: New test.
* c-c++-common/zero-scratch-regs-6.c: New test.
* c-c++-common/zero-scratch-regs-7.c: New test.
* c-c++-common/zero-scratch-regs-8.c: New test.
* c-c++-common/zero-scratch-regs-9.c: New test.
* c-c++-common/zero-scratch-regs-attr-usages.c: New test.
* gcc.target/i386/zero-scratch-regs-1.c: New test.
* gcc.target/i386/zero-scratch-regs-10.c: New test.
* gcc.target/i386/zero-scratch-regs-11.c: New test.
* gcc.target/i386/zero-scratch-regs-12.c: New test.
* gcc.target/i386/zero-scratch-regs-13.c: New test.
* gcc.target/i386/zero-scratch-regs-14.c: New test.
* gcc.target/i386/zero-scratch-regs-15.c: New test.
* gcc.target/i386/zero-scratch-regs-16.c: New test.
* gcc.target/i386/zero-scratch-regs-17.c: New test.
* gcc.target/i386/zero-scratch-regs-18.c: New test.
* gcc.target/i386/zero-scratch-regs-19.c: New test.
* gcc.target/i386/zero-scratch-regs-2.c: New test.
* gcc.target/i386/zero-scratch-regs-20.c: New test.
* gcc.target/i386/zero-scratch-regs-21.c: New test.
* gcc.target/i386/zero-scratch-regs-22.c: New test.
* gcc.target/i386/zero-scratch-regs-23.c: New test.
* gcc.target/i386/zero-scratch-regs-24.c: New test.
* gcc.target/i386/zero-scratch-regs-25.c: New test.
* gcc.target/i386/zero-scratch-regs-26.c: New test.
* gcc.target/i386/zero-scratch-regs-27.c: New test.
* gcc.target/i386/zero-scratch-regs-28.c: New test.
* gcc.target/i386/zero-scratch-regs-29.c: New test.
* gcc.target/i386/zero-scratch-regs-30.c: New test.
* gcc.target/i386/zero-scratch-regs-31.c: New test.
* gcc.target/i386/zero-scratch-regs-3.c: New test.
* gcc.target/i386/zero-scratch-regs-4.c: New test.
* gcc.target/i386/zero-scratch-regs-5.c: New test.
* gcc.target/i386/zero-scratch-regs-6.c: New test.
* gcc.target/i386/zero-scratch-regs-7.c: New test.
* gcc.target/i386/zero-scratch-regs-8.c: New test.
* gcc.target/i386/zero-scratch-regs-9.c: New test.
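
A small usage sketch of the new function attribute follows.  The function
check_pin and the macro ZERO_USED_GPRS are hypothetical; the "used-gpr"
choice is one of the documented values, and the guard makes the attribute
a no-op on compilers that do not know it.

```cpp
#include <cassert>

// Fall back to a no-op when the compiler does not know the attribute.
#if defined(__has_attribute)
# if __has_attribute(zero_call_used_regs)
#  define ZERO_USED_GPRS __attribute__((zero_call_used_regs("used-gpr")))
# endif
#endif
#ifndef ZERO_USED_GPRS
# define ZERO_USED_GPRS
#endif

// Hypothetical function that touches sensitive data.  On return, the
// call-used general-purpose registers it used are zeroed (when the
// attribute is supported).
ZERO_USED_GPRS int
check_pin(int entered, int expected)
{
  return entered == expected;
}
```

The same policy can be requested for a whole translation unit with
-fzero-call-used-regs=used-gpr; the attribute overrides the option for
an individual function.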

4 years agoTake insn scratch RA requirements into account in IRA.
Vladimir N. Makarov [Fri, 30 Oct 2020 19:05:22 +0000 (15:05 -0400)]
Take insn scratch RA requirements into account in IRA.

  This patch changes how insn scratches are turned into pseudos.
Scratches that require a register in every insn alternative (in other
words, those without an X constraint in the scratch constraint string)
are now changed into pseudos before IRA starts its work.  LRA still
changes the remaining scratches (those with an X constraint and those in
insns created during IRA) into pseudos.  As before the patch, at the end
of LRA's work, spilled scratch pseudos (for which the X constraint was
chosen) are changed back into scratches.

gcc/ChangeLog:

* lra.c (get_scratch_reg): New function.
(remove_scratches_1): Rename to remove_insn_scratches.  Use
ira_remove_insn_scratches and get_scratch_reg.
(remove_scratches): Do not initialize scratches, scratch_bitmap,
and scratch_operand_bitmap.
(lra): Call ira_restore_scratches instead of restore_scratches.
(struct sloc, sloc_t, scratches, scratch_bitmap)
(scratch_operand_bitmap, lra_former_scratch_p)
(lra_former_scratch_operand_p, lra_register_new_scratch_op)
(restore_scratches): Move them to ...
* ira.c: ... here.
(former_scratch_p, former_scratch_operand_p): Rename to
ira_former_scratch_p and ira_former_scratch_operand_p.
(contains_X_constraint_p): New function.
(register_new_scratch_op): Rename to ira_register_new_scratch_op.
Change it to work for IRA and LRA.
(restore_scratches): Rename to ira_restore_scratches.
(get_scratch_reg, ira_remove_insn_scratches): New functions.
(ira): Call ira_remove_scratches if we use LRA.
* ira.h (ira_former_scratch_p, ira_former_scratch_operand_p): New
prototypes.
(ira_register_new_scratch_op, ira_restore_scratches): New prototypes.
(ira_remove_insn_scratches): New prototype.
* lra-int.h (lra_former_scratch_p, lra_former_scratch_operand_p):
Remove prototypes.
(lra_register_new_scratch_op): Ditto.
* lra-constraints.c: Rename lra_former_scratch_p and
lra_former_scratch_operand_p to ira_former_scratch_p and
ira_former_scratch_operand_p.
* lra-remat.c: Ditto.
* lra-spills.c: Rename lra_former_scratch_p to ira_former_scratch_p.
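
The distinction drawn above depends on whether a scratch's constraint
string contains X in any alternative.  A simplified sketch of such a
check is shown below; has_x_constraint is a hypothetical name, and the
code assumes every constraint is a single character, which does not hold
for all real constraint strings.

```cpp
#include <cassert>

// Simplified sketch: scans a constraint string such as "=&r,X" and
// reports whether any alternative uses the X constraint.  Assumes
// single-character constraints only.
inline bool
has_x_constraint(const char* constraints)
{
  for (const char* p = constraints; *p != '\0'; ++p)
    if (*p == 'X')
      return true;
  return false;
}
```

Under this sketch, a scratch with constraints "=&r" needs a register in
every alternative, while one with "=&r,X" does not.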

4 years agoPR middle-end/97556 - ICE on excessively large index into a multidimensional array
Martin Sebor [Fri, 30 Oct 2020 19:04:29 +0000 (13:04 -0600)]
PR middle-end/97556 - ICE on excessively large index into a multidimensional array

gcc/ChangeLog:

PR middle-end/97556
* builtins.c (access_ref::add_offset): Cap offset lower bound
to at most the upper bound.

gcc/testsuite/ChangeLog:

PR middle-end/97556
* gcc.dg/Warray-bounds-70.c: New test.

4 years agolibstdc++: Fix the default constructor of ranges::__detail::__box
Patrick Palka [Fri, 30 Oct 2020 16:33:13 +0000 (12:33 -0400)]
libstdc++: Fix the default constructor of ranges::__detail::__box

The class template semiregular-box<T> of [range.semi.wrap] is specified
to value-initialize the underlying object whenever its type is default
initializable.  Our primary template for __detail::__box respects this
requirement, but the recently added partial specialization (for types
that are already semiregular) does not.

This patch fixes this issue, and additionally makes the corresponding in
place constructor explicit (as in the primary template).

libstdc++-v3/ChangeLog:

* include/std/ranges (__detail::__box): For the partial
specialization used by types that are already semiregular,
make the default constructor value-initialize the underlying
object instead of default-initializing it.  Make its in place
constructor explicit.
* testsuite/std/ranges/adaptors/detail/semiregular_box.cc:
Augment test.
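
The difference between the two forms of initialization can be sketched
with a hypothetical wrapper.  box_sketch and in_place_sketch_t are
illustrative names, not the __detail::__box implementation.

```cpp
#include <cassert>
#include <utility>

// Hypothetical tag type standing in for an in-place constructor tag.
struct in_place_sketch_t { };

// Hypothetical wrapper, loosely modelled on the idea of a box that
// stores a T directly.
template<typename T>
struct box_sketch
{
  // "= T()" value-initializes the member, so an int starts out as 0.
  // A plain "T value;" would leave an int with an indeterminate value.
  T value = T();

  box_sketch() = default;

  // The in-place constructor is explicit, so a tag plus arguments
  // cannot convert to a box implicitly.
  template<typename... Args>
  explicit box_sketch(in_place_sketch_t, Args&&... args)
    : value(std::forward<Args>(args)...) { }
};
```

A default-constructed box_sketch<int> therefore holds 0, and
box_sketch<int>(in_place_sketch_t{}, 42) holds 42.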

4 years agotestsuite: Avoid TCL errors when rootme or ASAN/TSAN/UBSAN is not avail
Tobias Burnus [Fri, 30 Oct 2020 16:11:20 +0000 (17:11 +0100)]
testsuite: Avoid TCL errors when rootme or ASAN/TSAN/UBSAN is not avail

gcc/testsuite/
* g++.dg/guality/guality.exp: Skip $rootme-based check if unset.
* gcc.dg/guality/guality.exp: Likewise.
* gfortran.dg/guality/guality.exp: Likewise.
* lib/asan-dg.exp: Don't use $asan_saved_library_path if not set.
* lib/tsan-dg.exp: Don't use $tsan_saved_library_path if not set.
* lib/ubsan-dg.exp: Don't use $ubsan_saved_library_path if not set.

4 years agoFortran: Update omp atomic for OpenMP 5
Tobias Burnus [Fri, 30 Oct 2020 14:57:46 +0000 (15:57 +0100)]
Fortran: Update omp atomic for OpenMP 5

gcc/fortran/ChangeLog:

* dump-parse-tree.c (show_omp_clauses): Handle atomic clauses.
(show_omp_node): Call it for atomic.
* gfortran.h (enum gfc_omp_atomic_op): Add GFC_OMP_ATOMIC_UNSET,
remove GFC_OMP_ATOMIC_SEQ_CST and GFC_OMP_ATOMIC_ACQ_REL.
(enum gfc_omp_memorder): Replace OMP_MEMORDER_LAST by
OMP_MEMORDER_UNSET, add OMP_MEMORDER_SEQ_CST/OMP_MEMORDER_RELAXED.
(gfc_omp_clauses): Add capture and atomic_op.
(gfc_code): Remove omp_atomic.
* openmp.c (enum omp_mask1): Add atomic, capture, memorder clauses.
(gfc_match_omp_clauses): Match them.
(OMP_ATOMIC_CLAUSES): Add.
(gfc_match_omp_flush): Update for 'last' to 'unset' change.
(gfc_match_omp_oacc_atomic): Remove and place content ...
(gfc_match_omp_atomic): ... here. Update for OpenMP 5 clauses.
(gfc_match_oacc_atomic): Match directly here.
(resolve_omp_atomic, gfc_resolve_omp_directive): Update.
* parse.c (parse_omp_oacc_atomic): Update for struct gfc_code changes.
* resolve.c (gfc_resolve_blocks): Update assert.
* st.c (gfc_free_statement): Also call for EXEC_O{ACC,MP}_ATOMIC.
* trans-openmp.c (gfc_trans_omp_atomic): Update.
(gfc_trans_omp_flush): Update for 'last' to 'unset' change.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/atomic-2.f90: New test.
* gfortran.dg/gomp/atomic.f90: New test.

4 years agoFix thunk info WRT PCH
Jan Hubicka [Fri, 30 Oct 2020 13:30:43 +0000 (14:30 +0100)]
Fix thunk info WRT PCH

PR pch/97593
* cgraph.c (cgraph_node::create_thunk): Register thunk early during
parsing.
* cgraphunit.c (analyze_functions): Call
thunk_info::process_early_thunks.
* symtab-thunks.cc (struct unprocessed_thunk): New struct.
(thunks): New static variable.
(thunk_info::register_early): New member function.
(thunk_info::process_early_thunks): New member function.
* symtab-thunks.h (thunk_info::register_early): Declare.
(thunk_info::process_early_thunks): Declare.

4 years agoDisable TBAA for array descriptors.
Jan Hubicka [Fri, 30 Oct 2020 13:28:23 +0000 (14:28 +0100)]
Disable TBAA for array descriptors.

* trans-types.c: Include alias.h.
(gfc_get_array_type_bounds): Set typeless storage.

4 years agotree-optimization/97623 - avoid excessive insert iteration for hoisting
Richard Biener [Fri, 30 Oct 2020 12:32:32 +0000 (13:32 +0100)]
tree-optimization/97623 - avoid excessive insert iteration for hoisting

This avoids requiring insert iteration for back-to-back hoisting
opportunities as seen in the added testcase.  For the PR at hand,
this halves the number of insert iterations, retaining only
the hard-to-avoid PRE / hoist insert back-to-backs.

2020-10-30  Richard Biener  <rguenther@suse.de>

PR tree-optimization/97623
* tree-ssa-pre.c (insert): First do hoist insertion in
a backward walk.

* gcc.dg/tree-ssa/ssa-hoist-7.c: New testcase.

4 years agotree-optimization/97626 - handle SCCs properly in SLP stmt analysis
Richard Biener [Fri, 30 Oct 2020 10:26:18 +0000 (11:26 +0100)]
tree-optimization/97626 - handle SCCs properly in SLP stmt analysis

This makes sure to roll back the whole SCC when stmt analysis
fails; otherwise the optimistic visited treatment breaks down
with different entries.  Rollback is easy when tracking additions
to visited in a vector, which also makes the whole thing cheaper
than the two hash-sets used before.

2020-10-30  Richard Biener  <rguenther@suse.de>

PR tree-optimization/97626
* tree-vect-slp.c (vect_slp_analyze_node_operations):
Exchange the lvisited hash-set for a vector, roll back
recursive adds to visited when analysis failed.
(vect_slp_analyze_operations): Likewise.

* gcc.dg/vect/bb-slp-pr97626.c: New testcase.
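
The rollback scheme described above can be sketched generically as
follows.  This is not the tree-vect-slp.c code: visited_tracker and the
use of int node ids are assumptions made for illustration.  Every
addition to the visited set is also recorded in a vector, so a failed
analysis can undo exactly the entries added since a saved mark.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical bookkeeping for an optimistic analysis.
struct visited_tracker
{
  std::unordered_set<int> visited;
  std::vector<int> added;   // everything inserted into 'visited', in order

  // Returns true if the node was newly added to the visited set.
  bool add(int node)
  {
    if (!visited.insert(node).second)
      return false;
    added.push_back(node);
    return true;
  }

  // Remember the current position before analyzing a group of nodes.
  std::size_t mark() const { return added.size(); }

  // Undo every addition made after the given mark.
  void rollback(std::size_t m)
  {
    while (added.size() > m)
      {
        visited.erase(added.back());
        added.pop_back();
      }
  }
};
```

A caller would take a mark before analyzing a cycle of nodes and call
rollback with that mark if any node in the cycle fails analysis.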